When I was younger I invested a lot of time into studying geometric algebra. Geometric algebra is a system where you can add, subtract and multiply oriented linear subspaces like lines and hyperplanes (cf. Grassmannian). These things are pretty important if you’re doing geometry, so it’s worth learning many ways to work with them. Geometric algebra emphasizes exterior products as a way to parameterize these primitives (cf. Plücker coordinates). Proponents claim that it’s simpler and more efficient than using “linear algebra”, but is this really the case?
In this blog post I want to dig into the much maligned linear algebra approach to geometry. Specifically we’ll see:
Two ways to parameterize a linear subspace using matrices
How to transform subspaces
How to compute the intersection and span of any two subspaces
A subspace of a vector space is a subset of vectors which is closed under vector addition and scalar multiplication. Geometrically, subspaces are points, lines and planes (which pass through the origin, unless we use homogeneous coordinates). We can represent a k-dimensional subspace of an n-dimensional vector space in two ways:
In the first case, we can interpret the span of k vectors as the image of an n-by-k matrix M whose columns are those vectors.
In the second, the solution set of a system of n − k linear equations is another way of saying the kernel of an (n − k)-by-n matrix A.
These two forms are dual to one another in the sense that taking the matrix transpose of one representation gives a different subspace, which happens to be its orthogonal complement.
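To make this concrete, here is a small numpy sketch of the two representations and their duality (the matrices M and A are illustrative examples, not anything canonical):

```python
import numpy as np

# A plane through the origin in R^3, in span (image) form:
# the columns of a 3-by-2 matrix M.
M = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# The same plane in equation (kernel) form: {x : A @ x = 0}.
# Here the single equation is x + y - z = 0, which both columns of M satisfy.
A = np.array([[1.0, 1.0, -1.0]])

# Every basis vector of the span lies in the kernel:
assert np.allclose(A @ M, 0.0)

# Duality: the rows of A span the orthogonal complement of the image of M,
# so transposing swaps which complementary subspace each matrix describes.
assert np.allclose(M.T @ A.T, 0.0)
```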
Intersections and joins
The best parameterization depends on the application and the size of the flat under consideration. Stuff that’s easy to do in one form may be harder in the other and vice-versa. To get more specific, let’s consider the problem of intersecting and joining two subspaces.
If we have a pair of flats represented by systems of equations, then computing their intersection is trivial: just concatenate all the equations together. Similarly, we can compute the smallest enclosing subspace of a set of subspaces which are all given by spans: again, just concatenate them. And we can test if a subspace given by a span is contained in one given by equations by plugging in each of the basis vectors and checking that the result is contained in the kernel (i.e. maps to 0).
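All three operations are literally just matrix concatenation and multiplication. A quick sketch (the particular planes and lines are made-up examples):

```python
import numpy as np

# Two planes through the origin in R^3, each given by one linear equation
# (kernel form). Intersecting them = stacking the equations.
A1 = np.array([[1.0, 0.0, 0.0]])   # plane x = 0
A2 = np.array([[0.0, 1.0, 0.0]])   # plane y = 0
A = np.vstack([A1, A2])            # intersection: the z-axis, {x : A @ x = 0}

z_axis = np.array([0.0, 0.0, 1.0])
assert np.allclose(A @ z_axis, 0.0)

# Two lines given by spans (image form). Their join = concatenating bases.
M1 = np.array([[1.0], [0.0], [0.0]])  # x-axis
M2 = np.array([[0.0], [1.0], [0.0]])  # y-axis
M = np.hstack([M1, M2])               # together they span the xy-plane

# Containment test: plug each basis vector of the span into the equations
# of the plane z = 0 and check that everything maps to zero.
B = np.array([[0.0, 0.0, 1.0]])
assert np.allclose(B @ M, 0.0)
```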
Linear transformations of flats
Depending on the form we pick, flats transform differently. If we want to apply a linear transformation to a flat, then we need to consider its encoding:
If the flat is given by the image of a matrix M, then we can just multiply: applying a transformation T gives the image of T M.
And if a flat is a system of equations, i.e. the kernel of a matrix A, then the rows of A must be multiplied by the inverse transpose of T: the transformed flat is the kernel of A T⁻¹.
The well known rule that normal vectors must transform by inverse transposes is a special case of the above.
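The two transformation rules can be checked against each other numerically; if a line lies in a plane, the transformed line should lie in the transformed plane. A sketch with an arbitrary invertible map:

```python
import numpy as np

T = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 3.0]])    # an invertible linear transformation

# Image form: a line spanned by the column of M transforms by T @ M.
M = np.array([[1.0], [1.0], [0.0]])
M2 = T @ M

# Kernel form: the plane {x : A @ x = 0} transforms by the inverse transpose
# of T applied to its rows, i.e. A @ inv(T).
A = np.array([[0.0, 0.0, 1.0]])    # the plane z = 0, which contains M
A2 = A @ np.linalg.inv(T)          # same as (inv(T).T @ A.T).T

# The transformed line still lies in the transformed plane:
assert np.allclose(A2 @ M2, 0.0)
```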
Finally we can convert between these two forms, but it takes a bit of work. For example, finding the line determined by the intersection of two planes through the origin in 3D is equivalent to solving a 2×2 linear system. In the general case one can use Gaussian elimination.
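In code, converting from equation form to span form is a null-space computation, which can be done stably with the SVD (the two planes below are an illustrative choice):

```python
import numpy as np

# The line of intersection of the planes x + z = 0 and y - z = 0,
# given in equation (kernel) form as the rows of A.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, -1.0]])

# Basis for the kernel of A: the right singular vectors corresponding to
# zero singular values span the null space.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
N = Vt[rank:].T             # columns of N span the null space of A

assert np.allclose(A @ N, 0.0)
assert N.shape == (3, 1)    # a line: the kernel is one-dimensional
```

Going the other way (span to equations) is the same computation applied to the transpose, by the duality noted earlier.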
Is this less intuitive?
I don’t really know. At this point I’m too far gone to learn something else, but it’s much easier for me to keep these two ideas in my head and just grind through the same basic matrix algorithms over and over than to work with all the specialized geometric algebra terms. Converting things into exterior forms and Plücker coordinates always seems to slow me down with extra details (is this a vee product, an inner product, a circle product, etc.?), but maybe it works for some people.
WARNING: This is a somewhat rambling post about category theory. If half-baked mathematical philosophy is not your thing, please consider navigating away right now.
Anyway, the thing that I want to write about today is the difference between category theory and what I shall call, for lack of a better term, the older “set theoretic” approach. My goal is to try to articulate what I see as the main difference between these two structures, and why I think that while category theory offers many insights and new perspectives, it is probably hopeless to try to shoehorn all of mathematics into its symbolism.
If you’ve ever taken an advanced math course, you probably already know that set theory is the default “programming language” of modern mathematics. Modern set theory is built upon two basic components:
A set is any well defined unordered collection of objects, the members of which are called its elements. Sets by themselves are rather boring things, and don’t do much on their own. What makes set theory interesting and useful, is that in addition to sets we also have relations. There are many ways to define a relation, but if you already know set theory we can bootstrap the definition using the concept of a Cartesian product:
An n-ary relation R amongst the sets X₁, X₂, …, Xₙ is a subset R ⊆ X₁ × X₂ × ⋯ × Xₙ. A tuple of n elements, (x₁, x₂, …, xₙ), is related (under R) if and only if (x₁, x₂, …, xₙ) ∈ R.
This may seem a bit circular, but it is unavoidable since we need some formal language to define set theory, and without set theory we don’t really have any way to talk about formal languages! The only way out is to take at least something as a given, and for most mathematicians this is the definition of sets and relations.
The main goal of set theory is to define new sets and the relations between them. A classic example of a relation is the graph of a function, which visualizes the relationship between a pair of variables in the plane:
Relations show up all over the place in mathematics. For example, we have binary relations like =, <, >, etc. that we can use to compare quantities. It is also possible to think of arithmetic in terms of relations; for example, + can be thought of as a ternary relation on 3 numbers which holds exactly when the third is the sum of the first two.
It is possible to build up more complex relations in set theory by combining simple relations using the quantifiers there-exists and for-all. For example, using the multiplication relation we can write the statement “2 divides x” with a quantifier:

∃y : ×(2, y, x)

Where I am using somewhat non-standard notation here to write multiplication as a ternary relation:

×(a, b, c) ⟺ a · b = c
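For a computational intuition, here is a toy sketch of this kind of relational reasoning in Python, with the multiplication relation as a finite set of triples and the existential quantifier as a search (the cutoff N is an artifact of making things finite):

```python
# Multiplication as a ternary relation over a finite range of numbers:
# the triple (a, b, c) is in the relation iff a * b = c.
N = 100
times = {(a, b, a * b) for a in range(N) for b in range(N)}

def divides_by_two(x):
    # "2 divides x" as a quantified statement: exists y with (2, y, x) related
    return any((2, y, x) in times for y in range(N))

assert divides_by_two(14)
assert not divides_by_two(7)
```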
Relational thinking is also the main concept behind tensor calculus, where one replaces all the sets involved with vector spaces and the relations with multilinear forms. In fact, the whole concept of the Galerkin method in numerical analysis can be thought of as first relaxing a set theoretic problem into a tensor relation; then performing a finite expansion in some basis.
You can approach category theory in many ways, but coming from set theory the easiest way to understand it is that it is the study of functional relations above all else. The basic axioms of a category define a mathematical structure in which we can study certain abstract classes of functions. Briefly, a category has three pieces of data:
A set of objects, Ob(C)
For every pair of objects X, Y, a set of morphisms, Hom(X, Y)
A composition relation ∘ for every triple of objects X, Y, Z, with ∘ ⊆ Hom(Y, Z) × Hom(X, Y) × Hom(X, Z)
Such that the following conditions are satisfied:
∘ is a functional relation: every pair (g, f) ∈ Hom(Y, Z) × Hom(X, Y) composes to exactly one morphism g ∘ f ∈ Hom(X, Z)
For every object X there exists an identity morphism id_X ∈ Hom(X, X) such that f ∘ id_X = f and id_Y ∘ f = f for every f ∈ Hom(X, Y)
∘ is associative, that is h ∘ (g ∘ f) = (h ∘ g) ∘ f
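The prototypical example is sets and functions, where the axioms hold automatically; a quick finite sanity check in Python (the particular functions are arbitrary choices):

```python
# Checking the category axioms for functions on a small finite set,
# with function composition as the composition relation.
X = [0, 1, 2]

def compose(g, f):
    # composition is functional: (g, f) determines exactly one morphism
    return lambda x: g(f(x))

identity = lambda x: x
f = lambda x: (x + 1) % 3
g = lambda x: (2 * x) % 3
h = lambda x: (x * x) % 3

# Identity law: f ∘ id = f = id ∘ f
assert all(compose(f, identity)(x) == f(x) == compose(identity, f)(x) for x in X)

# Associativity: h ∘ (g ∘ f) = (h ∘ g) ∘ f
assert all(compose(h, compose(g, f))(x) == compose(compose(h, g), f)(x) for x in X)
```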
Categories play a central role in algebra, where they are used to express transformations between various structures. Perhaps the place where this is most obviously useful is in the study of groups and their representations. The fact that many common algebraic structures like monoids are secretly just degenerate versions of categories also highlights their central importance. Unlike relations, categories have a great deal of structure, which makes it possible to say much more about them than one can about a relation in general. It can be difficult to cast a relational problem into the language of categories, but the payoff is generally worth it. For example, one of the more successful ways to study tensor algebra today is from the perspective of algebraic geometry.
Categories vs Relations
The main difference between categories and relations is that categories focus on change, while relations express invariance. Both philosophies are equal in their overall expressive power, but they may be better suited to some problems over others.
The main power of category theory is that it lends itself to explicit calculations and so it is most useful as a language for describing transformations. This makes it an especially nice language for reasoning about algorithms and programs, and one can see this in languages like Haskell. On the other hand, relations make minimal assertions about how we might compute something and instead only describe “what” is to be computed. Languages like SQL or Prolog make heavy use of relations and are good at expressing data processing concepts.
For example, it is trivial to convert any problem in category theory into the language of relations (this is vacuously easy, since the axioms of a category are specified in terms of sets and relations!) However, going the other way is a more complicated proposition and there is no perfect solution. Perhaps the most direct way is to “emulate” relational reasoning within category theory, which can be done using the language of monoidal categories. However, simply shoehorning relational thinking into this framework loses most of the advantage of categorical reasoning. It is similar to the argument that you can take any program written in BrainFuck and port it to Haskell by simply writing an interpreter in Haskell. While it would then be strictly true that doing this translates BrainFuck to Haskell, it misses the point since you are still basically coding in BrainFuck (just with an extra layer of abstraction tacked on).
Categorification is hard
This is really why categorification (and programming in general) is hard: there is always more than one way to do it, and the more general you get the worse the result. Effectively categorifying various structures requires thinking deeply about the specific details of the situation and picking axioms which emphasize the important features of a problem while neglecting the superfluous details.
In the end, one view point isn’t any better than the other – but they are different and it is worth trying to understand both deeply to appreciate their strengths and weaknesses. Relational thinking is successful when one needs to reason about very complicated structures, and is pervasive in analysis. Categories on the other hand bring with them much more structure and precision, and are most useful in describing transformations and syntactic abstraction.
I really doubt that I am going to win any awards for graphics… or gameplay… or sound… BUT!
It is multiplayer
It has persistent state
It is HTML5
And it actually works!
How cool is that? Pretty amazing what you can do in a weekend. I’m not sure if this is the first successfully completed MMO made for Ludum Dare, but I daresay that it is not something which is frequently attempted (possibly with good reason). The tools I used to create this monstrosity were node.js, mongodb and crafty. In terms of development, here is what went right:
Node.JS is a fantastic platform for making multiplayer games. Event driven concurrency is basically the model most games use anyway, and I have yet to see any system which does this better.
MongoDB is also quite a nice tool for game programming. While in most situations I would say the lack of a structured column format is a disadvantage, it actually turns out to be a real help for gamedev and rapid prototyping. In particular, it was very convenient to make a table with player stats and just keep adding fields to it while developing without worrying about it. Of course you could probably get similar results by just making a raw string column in a SQL table, but this is a bit ugly and does not integrate as well with node.js.
Nodejitsu is a very nice host. Nuno Job graciously offered me a coupon earlier, and I have to say I was very impressed with their service. Compared to nodester, one of their best features is the automatic database set up and configuration.
Making a click based system for navigation is a cheesy way to handle latency, but it works.
Now for what went wrong:
Unfortunately, I was not very disciplined with the amount of time I spent programming. For me coding stuff is the most fun, and I tend to procrastinate on the drawing side of things. As a result, the graphics in the game look pretty terrible. The grass texture in particular turned out eye-bleedingly bad 😦
The other consequence of over indulging in coding was that I didn’t spend much time on the game itself. There is really only one type of enemy (a rat), and not much to do in the game besides kill people. The map is also very empty/boring. (Though there is a boss battle/final secret.)
This was my first time using crafty.js, which I selected using the “deliberative” method of random google search for “HTML 5 game engine”. As a result, I spent some time up front trying to figure out how to get everything working. Crafty itself seems like a fine piece of technology, but the documentation and examples really need some work. In particular the “RPG tutorial” is out of date and doesn’t work with the latest version. This cost me maybe 2-3 hours trying to get it to work, which I would say was the low point of the whole competition. In the end I just gave up with the docs and read the source code directly, which turned out to be much easier to understand. However, despite this setback, using crafty was still a net savings over doing everything from scratch, especially taking into account debugging and portability issues.
In the end, I had a lot of fun making this (although the final game is kind of terrible). Once again, if you want to give it a try here is the link:
There is a trend these days to avoid writing about technical things in programming — and in particular game development — blogs. Just go to places like r/programming or altdevblogaday and you’ll find plenty of articles giving you great advice on everything EXCEPT math and programming! What gives?
There just aren’t enough articles any more that get down to the nuts and bolts of algorithms and data structures, or at an even more basic level actually bother to try explaining some interesting theoretical concept. Once-venerable clearing houses like flipcode or gamedev.net have either shut down or become diluted into shallow social-networking zombies. Perhaps this is a symptom of our ever decreasing attention spans, or even more pessimistically a sign that indie devs have simply given up on trying to push the technological envelope out of fear of competing with big studios. It seems that trying to innovate technically is now viewed as “engine development” and derided as unproductive low-level tinkering. Wherever it comes from, I am convinced that these insubstantial discussions are making us all stupider, and that we need to start writing and paying attention to hard technical articles.
So, rather than sit back and complain, I’ve decided to do something proactive and try to update this blog more often with useful — or at least interesting and substantial — technical content. But before that, I will start by listing some of the things I am *not* going to talk about:
Coding “best practices”
As I’ve just described, there’s already an abundance of literature on these subjects, possibly because they are trivial to write about, and it is easy to have an opinion about them. As a result, I don’t really feel particularly compelled, or even much qualified as an industry-outsider-academic-type, to discuss any of those things. More pragmatically, I also find that discussing these sorts of “soft”, subjective issues leads to either pointless back patting or unproductive bickering.
Instead, I’m going to use this blog to talk about the “harder” foundational stuff. When possible, I will try to provide real code here — though I will probably switch languages a lot — but my real focus is going to be on explaining “why” things work the way they do. As a result, there will be math, and this is unavoidable. I’m also going to make an effort to provide relevant references and links when possible, or at least to the extent of my knowledge of the prior art. However, if I do miss a citation, please feel free to chime in and add something.
I don’t have a set schedule for topics that I want to cover, but I do have some general ideas. Here is a short and incomplete list of things that I would like to talk about:
Procedural generation and implicit functions
Physical modeling using Lagrangians
Mesh data structures
Non-linear deformable objects
Collision detection and Minkowski operations
Fourier transforms, spherical harmonics and representation theory
Applications of group theory in computer graphics
I’m also open to requests at this stage. If there is a topic that more people are interested in, I can spend more time focusing on that, so please leave a request in the comments.