notes on quantization

“Compute solves a lot of problems.” If we just had enough compute, many of the problems we face today would be solved: longer contexts, smarter weights and biases, and so on. But right now we don’t have infinite compute. That’s the sad reality. So we optimize. Quantization is our attempt at just that. history Reference: https://arxiv.org/abs/2103.13630 Quantization is a form of compression: the process of mapping a large set of continuous or high-precision values onto a smaller, discrete set of values. ...
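The mapping the excerpt describes can be sketched in a few lines. Below is a toy affine quantizer; the 8-bit range, function names, and sample values are my own illustrative choices, not from the post:

```python
# Toy affine quantization sketch: map floats onto 256 discrete levels
# and back. Real quantization schemes are far more involved.

def quantize(values, num_levels=256):
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (num_levels - 1)  # step size between levels
    zero_point = lo
    q = [round((v - zero_point) / scale) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return [qi * scale + zero_point for qi in q]

vals = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, s, z = quantize(vals)
approx = dequantize(q, s, z)
# Each reconstructed value lands within half a step of the original.
```

The round trip is lossy by design: that bounded reconstruction error is exactly the compression trade-off the post is about.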

April 7, 2026 · 10 min

notes on PolyBlocks

Recently, at PyTorch Day India in Bangalore, I saw a talk on AI compilers. Here is the link: YouTube Picture from the session I didn't know there were Indian labs working on the AI compiler problem. But it turns out there are. PolyMage Labs is an IISc lab in Bangalore working on PolyBlocks. Since AI is moving fast, there is a clear need for efficient AI compilers that can lower high-level tensor programs to IR for GPUs, TPUs, and other backends. PolyBlocks minimizes dependency on external vendor libraries like cuBLAS/cuDNN while still generating highly optimized code via compiler-driven transformations and tiling. ...

March 29, 2026 · 8 min

MIME-ish implementation to share images over ssh

Some days back I was studying for my computer networks exam. I came across a few protocols that were very interesting, like SMTP (Simple Mail Transfer Protocol), Telnet, and SCP (Secure Copy Protocol), just to name a few. SMTP and a little bit of theory Simple Mail Transfer Protocol is a protocol used to transfer mail between servers. It was specified in 1981. It works on port 25 for server-to-server relay, while clients submit outgoing mail on port 587. ...
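As a taste of the MIME side of the story, here is a hedged sketch of wrapping an image in a MIME envelope with Python's standard library; the addresses and image bytes are placeholders, not anything from the post:

```python
# Sketch: pack stand-in image bytes into a MIME message, the kind of
# envelope SMTP servers relay between each other.

from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"    # placeholder addresses
msg["To"] = "bob@example.com"
msg["Subject"] = "screenshot"
msg.set_content("image attached")

fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16  # stand-in image data
msg.add_attachment(fake_png, maintype="image", subtype="png",
                   filename="shot.png")
# msg.as_bytes() is what you would hand to an SMTP client to send.
```

Attaching bytes makes the message multipart and base64-encodes the payload, which is the core MIME trick for shipping binary data over a text protocol.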

March 13, 2026 · 4 min

Python's argparse

In this article I’d like to introduce you to a rather useful Python library. It’s called argparse, and recently it has become my go-to for a couple of things. I first learned about this library while participating in a Kaggle competition. It was pretty intimidating at first because you’re not sure what’s going on, but after this article I hope you’ll know how to deal with code that uses argparse. We’ll also talk about config files and how this library can be used to write them. ...
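As a preview, here is a minimal argparse sketch of the kind you see in Kaggle training scripts; the flag names and defaults are just examples of mine:

```python
# Minimal argparse usage: declare flags once, get typed values back.

import argparse

def build_parser():
    p = argparse.ArgumentParser(description="toy training script")
    p.add_argument("--lr", type=float, default=1e-3, help="learning rate")
    p.add_argument("--epochs", type=int, default=10)
    p.add_argument("--debug", action="store_true")
    return p

# Passing a list parses it directly; with no argument it reads sys.argv.
args = build_parser().parse_args(["--lr", "0.1", "--debug"])
# args.lr is the float 0.1, args.epochs falls back to the default 10,
# and args.debug flips to True because the flag was present.
```

Note that `type=float` does the string-to-number conversion for you, which is most of what makes argparse nicer than reading `sys.argv` by hand.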

March 4, 2026 · 5 min

Thoughts on AI; updates on essays

If you didn’t know, I recently started writing more. Published a new website. Yes, the website you’re reading this at. The reason was to get good at understanding and learning. With the coming of AI, writing code has never been easier. And to be honest, I don’t think AI has any role in this; the urge came way before AI did. The main thing that drives the world imo is an idea. Ideas and implementations. Now, the way we implement things has been changing for ages. The one example I like to think about is of compilers and assembly programmers when the C language came. Pretty sure all of them were in the same position developers today are. But that’s another story. Implementations change, but the main thing that drives technology, science, math, and all the important stuff is, as I said, ideas. And to get better ideas, we don’t just need intellect. No. We need creativity; we need people who can understand deeply. Who can think. And I don’t use the word think lightly. Thinking was never easy. And in today’s world, it’s even harder. Which is why I started writing. Because believe it or not, writing is thinking. Every week I have this essay that I have to think about, learn, and write about. ...

February 27, 2026 · 2 min

Notes on torch code compilation

Before we see what torch.compile does, we should first understand PyTorch’s default mode and why we’d ever want to move away from it. PyTorch runs in eager mode by default. Think of it as PyTorch reading and executing your code op by op, as Python encounters each line. It’s immediate, flexible, and great for prototyping, but it pays a Python interpreter cost on every single operation. For production and deployment, we want to skip that cost. That’s where compilation comes in. ...
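That per-op cost can be illustrated without PyTorch at all. This is a toy analogy of my own, not real torch internals: in "eager" mode every op goes through a dispatch layer, while "compiling" fuses the ops so the overhead is paid once:

```python
# Toy analogy for eager vs compiled execution (not real PyTorch code).

def dispatch(op, x):
    # stand-in for the per-op interpreter/dispatcher overhead
    return op(x)

def eager_forward(x):
    x = dispatch(lambda v: v * 2, x)   # op 1: pays dispatch cost
    x = dispatch(lambda v: v + 3, x)   # op 2: pays dispatch cost
    x = dispatch(lambda v: v ** 2, x)  # op 3: pays dispatch cost
    return x

def compiled_forward(x):
    # the same three ops fused into a single call
    return (x * 2 + 3) ** 2

assert eager_forward(5) == compiled_forward(5)  # same math, fewer dispatches
```

The real compiler does much more (graph capture, kernel fusion, codegen), but "same result, fewer trips through the interpreter" is the core intuition.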

February 18, 2026 · 4 min

Notes on SIMD

Today we look at matrix multiplication (matmul, as we’ll call it in this essay). Since the last essay was on backprop, it was only logical to look next at the most fundamental math operation behind that algorithm: matmul. Also, the numbers in this essay are going to shock you. Like, really. So if you think I am making this up, you should check out my code for this essay. ...
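For reference, here is the naive triple-loop matmul that such measurements usually start from; this is a plain-Python sketch of mine, not the essay's benchmark code:

```python
# Naive O(n^3) matmul: no SIMD, no cache blocking, one scalar
# multiply-add at a time. This is the baseline everything beats.

def matmul(A, B):
    n, k, m = len(A), len(B), len(B[0])
    assert len(A[0]) == k, "inner dimensions must match"
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):
                s += A[i][p] * B[p][j]
            C[i][j] = s
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
# matmul(A, B) == [[19.0, 22.0], [43.0, 50.0]]
```

SIMD speedups come from replacing the innermost scalar loop with vector instructions that do several of those multiply-adds per cycle.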

February 9, 2026 · 7 min

Backpropagation: first draft

I’m assuming you understand the basic idea of neural networks. This essay focuses purely on the backpropagation algorithm itself. What is Backpropagation? Backpropagation is an algorithm that computes how much each weight and bias should change to reduce the loss. It tells us not just whether parameters should go up or down, but by how much, based on their actual impact on the loss function. We use calculus to figure that out. ...
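A worked one-neuron example of that idea, with toy numbers of my own and a squared loss, where the chain rule yields both the direction and the magnitude of each gradient:

```python
# One neuron, squared loss: y_hat = w*x + b, L = (y_hat - y)^2.
# Backprop is just the chain rule applied backwards.

x, y = 2.0, 1.0      # one training example
w, b = 0.5, 0.1      # current parameters

# forward pass
y_hat = w * x + b            # prediction: 1.1
loss = (y_hat - y) ** 2      # loss: ~0.01

# backward pass (chain rule)
dL_dyhat = 2 * (y_hat - y)   # dL/dy_hat = 2(y_hat - y) = ~0.2
dL_dw = dL_dyhat * x         # dy_hat/dw = x, so dL/dw = ~0.4
dL_db = dL_dyhat * 1.0       # dy_hat/db = 1, so dL/db = ~0.2
```

The gradients say both parameters should decrease, and that `w` matters twice as much as `b` here, precisely because its input `x = 2.0` amplifies its effect on the loss.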

January 30, 2026 · 7 min

Hashing and stuff

what is this about? Today we’re looking at hashing. We’ll get into the process, the data structures, and some applications. Through this article, we’ll see how we can implement the main dictionary operations: insert, delete, and search. what is Hashing? Hashing is a technique for identifying an object out of a group of similar objects. Analogy for hashing: imagine you took your name, tiwariji, and ran it through a complex mathematical function that produced 7a3f9c2e. You couldn’t look at 7a3f9c2e and figure out it came from tiwariji, but every time you hash tiwariji you’d get the same result. That is hashing. Ideally, no two different inputs should produce the same hash output. We’ll talk about that later when we discuss hash functions. ...
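Here is a minimal sketch of those three dictionary operations, using separate chaining for collisions; the chaining strategy and class names are my own choices, and the article may handle collisions differently:

```python
# Minimal hash table with separate chaining: each bucket is a list of
# (key, value) pairs, so colliding keys coexist in the same bucket.

class HashTable:
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        # the hash maps any key to one of `size` buckets
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:              # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def search(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None                   # not found

    def delete(self, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False

t = HashTable()
t.insert("tiwariji", 42)
# t.search("tiwariji") returns 42; after delete it returns None.
```

All three operations only ever touch one bucket, which is why they average O(1) when the hash spreads keys evenly.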

January 22, 2026 · 7 min

Stacks and Heaps

Data management during compilation When a program is compiled, its data is organized into segments. There are five types of segments: stack, heap, data, code, and BSS. We will focus on the stack and the heap in this essay. Stack Stack allocation is the process of allocating memory for local variables and function calls on the call stack. Each function gets some memory on the stack to store its variables. Since this memory is managed automatically by the system, it’s fast, but the stack is much smaller than the heap. The size required is known before execution, so the compiler can allocate it up front. ...

January 16, 2026 · 3 min