Vectors and matrices

Basically, a matrix is the bad kid, and when a vector hangs out with the bad kid, it becomes bad too. A matrix is a distorted grid, and when you multiply it with a vector, it distorts the vector in the same way the grid itself is distorted.

In a sense, a vector is a single entity: an arrow pointing in some direction. A matrix is information about the two standard units and where they ended up after the distortion. That distortion is the guide that tells us where any other vector would end up if the same distortion is applied to it. If the standard units are (1, 0) and (0, 1), and they become (3, 0) and (0, 3), what would happen to any other vector? We find out by assembling the matrix from where the standard units landed and multiplying it with our vector of interest.
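Here is a minimal sketch of that idea in plain Python, using the (3, 0) and (0, 3) example from above. The helper names `build_matrix` and `apply` are made up for illustration; a library like NumPy would normally do this for you.

```python
# Where the standard units end up after the distortion (from the example above).
new_i_hat = (3, 0)
new_j_hat = (0, 3)

def build_matrix(i_hat, j_hat):
    """Assemble a 2x2 matrix (stored as rows) whose columns are the new i_hat and j_hat."""
    return [[i_hat[0], j_hat[0]],
            [i_hat[1], j_hat[1]]]

def apply(matrix, vector):
    """Multiply a 2x2 matrix by a 2D vector."""
    x, y = vector
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)

M = build_matrix(new_i_hat, new_j_hat)

print(apply(M, (1, 0)))  # (3, 0): i_hat lands where we said it would
print(apply(M, (2, 5)))  # (6, 15): any other vector gets stretched the same way
```

The matrix never stores anything about (2, 5). It only knows where the two standard units went, and that is enough.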

In AI, words exist as vectors, and the transformer is basically a set of thousands of matrices. The vectors travel through these matrices and, by the end, are influenced by the parametric memory of the model itself and by the context they carry with each other as a group of words or sentences.

Basically, words mean something through their vectors. After the model is trained, the vectors are no longer random; they mean something. As they pass through the model weights in each other’s presence, they keep absorbing the parametric information and also the contextual information they carry themselves. By the end, the last token has enough context to represent the whole sentence, kind of, and when it is compared with the vectors of all the available words, we find the most suitable next word.
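A toy sketch of that last comparison step. Everything here is an assumption for illustration: the three-word vocabulary, the made-up 3-number vectors, and the use of a plain dot product as the "comparison". Real models use learned vectors with far more numbers and turn the scores into probabilities.

```python
# Hypothetical word vectors (made-up numbers, not from any real model).
vocab = {
    "cat":   (0.9, 0.1, 0.0),
    "car":   (0.1, 0.9, 0.0),
    "cloud": (0.0, 0.2, 0.9),
}

# Hypothetical vector for the last token after it has absorbed the context.
last_token = (0.8, 0.3, 0.1)

def dot(a, b):
    """Dot product: larger when two vectors point in similar directions."""
    return sum(x * y for x, y in zip(a, b))

# Score every word in the vocabulary against the last token's vector.
scores = {word: dot(last_token, vec) for word, vec in vocab.items()}

# The most suitable word is the one with the highest score.
best_word = max(scores, key=scores.get)
print(best_word)  # cat
```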


Vectors and matrices can also be looked at through the lens of analogy: i hat and j hat have now become something different, so what would a vector from the old world of i hat and j hat become in this new one? I can’t think of a better example right now, so imagine that in our current world your mom gives birth to you and your dad doesn’t, so your umbilical cord is tied to your mom. But because of some magic the world changes, and now your dad gives birth to you instead. In that world, by analogy, your umbilical cord will be tied to your dad.

Same thing here. In a universe where i hat and j hat are not distorted yet, the vector might be at (1, 1). But when the world is distorted and i hat and j hat change position, where does the vector end up, given that it has to maintain its relationship with i hat and j hat?

Basically, multiplying a vector by a matrix helps us maintain the analogy in the new, linearly transformed space. The matrix lets us find the spot where we should move our vector to maintain the relationship, and thus the analogy.
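A small sketch of what "maintaining the relationship" means in numbers. The vector (1, 1) is really the recipe "1 step along i hat, 1 step along j hat". The recipe stays the same; only i hat and j hat move. The shear used here (j hat moving to (1, 1)) is an assumed example, and `combine` is a made-up helper name.

```python
def combine(a, b, i_hat, j_hat):
    """Return a * i_hat + b * j_hat: 'a steps along i_hat, b steps along j_hat'."""
    return (a * i_hat[0] + b * j_hat[0],
            a * i_hat[1] + b * j_hat[1])

# The recipe for our vector: 1 step along i_hat, 1 step along j_hat.
a, b = 1, 1

# Undistorted world: the standard units are where they always are.
old_position = combine(a, b, (1, 0), (0, 1))

# Distorted world (assumed shear): i_hat stays put, j_hat leans over to (1, 1).
new_i_hat, new_j_hat = (1, 0), (1, 1)
new_position = combine(a, b, new_i_hat, new_j_hat)

print(old_position)  # (1, 1)
print(new_position)  # (2, 1): same recipe, new units
```

This is exactly what the matrix multiplication computes: the columns of the matrix are the new i hat and j hat, and the vector supplies the recipe.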

In a standard coordinate system there is basically a contract: if i hat and j hat are here, you end up at this point in space. When the coordinate system gets distorted by scaling or shearing, you still have to honor the contract you had with i hat and j hat, so you multiply your vector by the matrix and fulfill it. Kind of like this: say you took a loan from two guys in your country at some inflation rate. Now imagine the inflation rate has increased. You have to adjust your loan payment so that you still pay them the real amount you owe.

You owe something defined in the old system, and when the system changes, you have to re-express that same obligation in the new one.

In a video game, I guess it can work like this: your character is standing still right now, but on the press of a button it should jump. The shape of the character has to morph into a new shape. If the vectors are imagined as dots in space, like pixels on a screen, a matrix can transform a still-standing character into a jumping one after you press the up button. Basically, pressing the up button has changed the world. It has taken us from a world where the character was standing still to one where the paradigm is that it shall jump. It has to follow the obligation to jump. Am I making sense? I think it makes sense if pressing the up button deploys matrices that transform the vectors that are standing still into vectors that represent the act of jumping. Of course it happens gradually, so the matrix operations happen in phases.

Assuming thirty frames per second, and that the jump takes two seconds to go from standing still to completion, that is something like 30 × 2 = 60 matrix operations? Each just slightly different from the last. So a game designer is basically mapping which matrix operations to execute to each button.

A button press selects which transformation rules should be applied to the character over time.
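A rough sketch of that mapping, under loud assumptions: the names `BUTTON_ACTIONS` and `stretch_frames` are invented, the character is just four corner points, and the "jump" is faked as a vertical stretch that grows and then relaxes. A real jump also moves the character upward, and moving (translation) is not something a 2x2 matrix can do by itself, so treat the stretch as a stand-in.

```python
FPS = 30          # frames per second (from the example above)
JUMP_SECONDS = 2  # how long the action lasts

def stretch_frames(fps, seconds, peak=1.5):
    """Build one 2x2 matrix per frame; the vertical stretch goes 1 -> about peak -> 1."""
    total = fps * seconds
    frames = []
    for n in range(total):
        t = n / (total - 1)                        # 0.0 on the first frame, 1.0 on the last
        height = 1 + (peak - 1) * 4 * t * (1 - t)  # rises, then falls back to 1
        frames.append([[1.0, 0.0],
                       [0.0, height]])
    return frames

def apply(matrix, point):
    """Multiply a 2x2 matrix by a 2D point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)

# The button press selects which sequence of matrices to run.
BUTTON_ACTIONS = {
    "up": lambda: stretch_frames(FPS, JUMP_SECONDS),
}

character = [(0, 0), (1, 0), (1, 2), (0, 2)]  # a standing box, as corner points

frames = BUTTON_ACTIONS["up"]()
poses = [[apply(m, p) for p in character] for m in frames]

print(len(frames))  # 60, one matrix per frame
```

Each entry in `poses` is the character's shape for one frame, and neighbouring frames differ only slightly.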

Or even when you are animating a visualization of how a coordinate system is distorted: you take i_hat and j_hat from where they start to their distorted positions, but visually the transition happens gradually. You cover that by using all the matrices in between. So animation relies heavily on vectors and matrices, and by extension games and software do too.
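One simple way to get "all the matrices in between" is to blend the undistorted matrix and the distorted one entry by entry. This is a sketch, not the only way to do it: `lerp_matrix` is a made-up helper, the shear target is an assumed example, and this kind of blending looks wrong for some transformations (blend toward a half-turn rotation and the grid collapses to a point midway).

```python
def lerp_matrix(start, end, t):
    """Blend two 2x2 matrices entry by entry; t=0 gives start, t=1 gives end."""
    return [[(1 - t) * s + t * e for s, e in zip(start_row, end_row)]
            for start_row, end_row in zip(start, end)]

identity = [[1.0, 0.0],
            [0.0, 1.0]]   # undistorted: i_hat = (1, 0), j_hat = (0, 1)
target = [[1.0, 1.0],
          [0.0, 1.0]]     # assumed shear: j_hat ends up at (1, 1)

steps = 5
for n in range(steps):
    t = n / (steps - 1)
    m = lerp_matrix(identity, target, t)
    j_hat = (m[0][1], m[1][1])   # second column = where j_hat is at this moment
    print(t, j_hat)
```

Each printed line is one animation frame: j_hat slides from (0, 1) toward (1, 1) a little at a time.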


This is Uro from Jujutsu Kaisen. Her power also kind of reminds me of matrices. She can basically distort space. So with her powers, something that was supposed to be at point X can be made to not be at X, because she distorts the surface. Imagine there is a plane in 3D space. With her powers she can morph the plane into something else. If a point was on that plane, its position might have changed now that the surface is distorted. But her power goes beyond linear transformations. She can do non-linear transformations too, so I don’t know, but it’s a cool example from fiction.


I was talking about animation and linear algebra. Let’s imagine we densely plot a lot of vectors as dots arranged in a box. If we multiply them by a scalar matrix the box will get bigger, right? We can animate a small box getting bigger, but the problem is that with matrices you can only do linear transformations. If you want to animate a box turning into something like a pineapple, how do we do that? There must be a limit on what linear algebra can do when it comes to morphing shapes. Well, I think this is where you apply multiple matrices. Not one after another on the whole box, because chaining matrices just gives you another single linear transformation, but different matrices on different parts. Let’s assume there are many, many, many vectors arranged in a rectangle on a 2D plane. If you split them into two groups and apply a different matrix to each, they will make up the shape of a pineapple better than one matrix could. As you keep increasing the number of groups and matrices, I think you get closer to the shape of a pineapple. You just keep building patches and patches and patches, and the target vectors are different for each operation.

For example, you have a dense rectangle of vectors. You divide it into thousands of smaller parts and apply a different matrix to each part. Each part gets transformed independently by its own matrix and ends up somewhere. Now the collective rectangle has taken the shape of a pineapple. This gave me heavy calculus vibes: “Hey, since we cannot calculate the volume directly, let’s do slices. Hey, since we cannot do a pineapple, let’s slice small rectangles and arrange them.”
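A minimal sketch of the slicing idea, with assumptions stated up front: the rectangle is cut into 20 horizontal strips instead of thousands of parts, each strip gets its own squeeze matrix chosen by a made-up rule (`strip_matrix`), and the target is a plain oval rather than a real pineapple. Each strip's transformation is linear, but the overall result is not.

```python
import math

def apply(matrix, point):
    """Multiply a 2x2 matrix by a 2D point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)

def strip_matrix(y_mid, half_height):
    """Hypothetical rule: squeeze strips near the top and bottom more than the middle."""
    width = math.sqrt(max(0.0, 1 - (y_mid / half_height) ** 2))
    return [[width, 0.0],
            [0.0, 1.0]]

# A dense rectangle: x runs from -1 to 1, y runs from -2 to 2.
xs = [i / 10 for i in range(-10, 11)]
half_height = 2.0
strips = 20
strip_h = 2 * half_height / strips

shape = []    # the transformed dots
widths = []   # how much each strip was squeezed, for inspection
for s in range(strips):
    y_mid = -half_height + s * strip_h + strip_h / 2
    m = strip_matrix(y_mid, half_height)    # a different matrix for every strip
    widths.append(m[0][0])
    for x in xs:
        shape.append(apply(m, (x, y_mid)))  # each dot moves by its own strip's matrix

print(len(shape))  # 420 dots, now outlining an oval instead of a rectangle
```

Using more strips makes the outline smoother, which is the same "thinner slices, better approximation" feeling as in calculus.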

I asked Claude to write code for a demonstration and this is what it made me. Pretty cool. Not pineapple enough, but hey, close enough! We get the point.

So I got to re-learn about matrices and vectors yesterday, and in like two days I am at basic computer graphics. I didn’t know maths was this fucking scalable. Like, I am a marketing guy, and things don’t scale this well there.