Thursday, December 2, 2010

The influence of advanced mathematics on programming.

I'm selling many of my books on Amazon, and as I was going through them I realized that most of them were useful, but only in an indirect way. I would like to share some thoughts on how my studies in graduate-level mathematics influence my day-to-day operations of building products, managing databases, and doing everything a free electron can do in a given day at a start-up.

Topology

I think the best introductory book on topology is "Introduction to Topology" by Crump W. Baker. Topology is basically the study of connectedness and surfaces. When studying topology, you think about how things are different. Are a donut and a coffee cup the same? Well, yes they are, once you define what "same" means. There are practical programming challenges in topology, such as how one can process and select features in computer vision. But there are more mundane ways of applying topology.

For instance, a relational database is a topology in a discrete graph sense. How does this help me? Well, say I'm about to do some stupid DELETEs and UPDATEs on a very large data set. Is the data set before and after the same with regard to current business value? Did I botch it? Topology comes in with the idea of topological invariants. A topological invariant is a property that can be measured and is preserved under any homeomorphism (a continuous transformation with a continuous inverse).

If I were to write a query that measures the business value of the database (say, by the sum of the transactions, sum of paying accounts, and so forth), then I can use these to get a good sense of whether or not I botched up my changes by measuring before and after.
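Here is a minimal sketch of that before-and-after check in Python with an in-memory SQLite database. The schema (`accounts`, `transactions`, `amount_cents`, `is_paying`) and the helper `business_invariants` are made up for illustration; the real invariants depend on what "business value" means for your data.

```python
import sqlite3


def business_invariants(conn):
    """Measure quantities that a cleanup should leave unchanged (hypothetical schema)."""
    (total_cents,) = conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM transactions"
    ).fetchone()
    (paying_accounts,) = conn.execute(
        "SELECT COUNT(*) FROM accounts WHERE is_paying = 1"
    ).fetchone()
    return {"total_cents": total_cents, "paying_accounts": paying_accounts}


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, is_paying INTEGER NOT NULL);
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL,
        amount_cents INTEGER NOT NULL
    );
    INSERT INTO accounts VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO transactions VALUES (1, 1, 1000), (2, 2, 2500), (3, 3, 0);
""")

before = business_invariants(conn)

# The risky change: purge non-paying accounts and their transactions.
conn.execute(
    "DELETE FROM transactions WHERE account_id IN "
    "(SELECT id FROM accounts WHERE is_paying = 0)"
)
conn.execute("DELETE FROM accounts WHERE is_paying = 0")

after = business_invariants(conn)

if before != after:
    # An invariant moved, so undo the change instead of keeping it.
    conn.rollback()
    raise RuntimeError(f"invariants changed: {before} -> {after}")
conn.commit()
```

The rows changed, but the measurements did not, which is the sense in which the data set is "the same" before and after. If a measurement had moved, the rollback would have undone both DELETEs.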

Algebra

If you take a bunch of things and make those things operate on each other, then you have an algebra. There are a lot of properties involved in what the operation implies, and the first year is basically dedicated to defining all those properties and understanding their significance. The ultimate results you typically end up looking at in a first or second year course are the impossibility theorems (e.g. doubling the cube).

My most immediate thought on how any of this has any practical bearing on programming is MapReduce. Algebra, in my mind, plays a huge role in how to think about designing algorithms in a MapReduce environment, particularly in the reduce phase, where you think about merging. Given two or more documents, how do you reduce them to one? The algebraic properties of that merge (is it associative? commutative? does it have an identity element?) are things that one must consider, and sometimes you get them for free.
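To make the merging question concrete, here is a small Python sketch (not tied to any particular MapReduce framework) using word counts. The names `map_doc` and `merge` are hypothetical. Adding `Counter` objects is associative and commutative with the empty `Counter` as identity, so the reducer can combine partial results in any grouping or order. Naively averaging averages has no such luck.

```python
from collections import Counter
from functools import reduce


def map_doc(text):
    """Map phase: turn one document into partial word counts."""
    return Counter(text.lower().split())


def merge(a, b):
    """Reduce phase: combine two partial results into one."""
    return a + b


docs = ["the cat sat", "the dog sat", "a cat ran"]
partials = [map_doc(d) for d in docs]

# Three ways the framework might happen to combine the partial results.
left_to_right = reduce(merge, partials, Counter())
regrouped = merge(partials[0], merge(partials[1], partials[2]))
reordered = reduce(merge, reversed(partials), Counter())

# Because merge is associative and commutative, all three agree.
assert left_to_right == regrouped == reordered


def naive_avg(x, y):
    """A merge that is NOT associative: the average of two averages."""
    return (x + y) / 2


grouped_left = naive_avg(naive_avg(1, 2), 3)   # 2.25
grouped_right = naive_avg(1, naive_avg(2, 3))  # 1.75
```

The usual fix for the average is to carry a (sum, count) pair through the reduce phase and divide only at the very end, since adding pairs component-wise is associative.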

Analysis

This is my favorite branch of mathematics because it is the puzzle of inserting zeros and bounding values. I recommend that anyone check out The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities. It is an amazing little book that I go through every year to make sure I'm still smart enough to call myself a mathematician. The most obvious application of this art is numerical analysis. However, most of the time I don't need to do any numerical analysis, since I work primarily on search problems these days.

Unfortunately, most people get the short end of the stick when they study calculus and get a very boilerplate version of it. I recommend Differential and Integral Calculus.

I take analysis concepts outside of code and into management. For instance, how can I measure the code and enforce a code quality metric to prevent SQL injection hacks? How can I enable developers to converge to a right answer under QA? What does QA need to do?

Proofs

I must admit that I was a stickler when it came to proofs, since writing a proof is just as much fun to me as programming. When anyone writes a program, they are writing a constructive proof that something exists. This raises the question of whether or not that something is what you want.

Does your program need a proof? Two years ago, I would have said "absolutely". Now, I don't think so, because proofs are kind of useless. The problem is that I have to understand enough about the formalism for the proof to make any sense. Well, the source code is a formalism of its own; in fact, a very precise formalism. The proof is already written.

Are proofs useless? I think going through the years of writing proofs has helped me write very good tests. I can look at the code and know where the trouble spots are going to be. Those trouble spots are going to need tests to ensure they work as expected. For instance, reliance on third-party services always requires some kind of test to ensure that updates are working as expected. Things I don't control are things I don't have a chance at proving, so I need tests that are automated and run daily.

Problem Solving

I think the study of mathematics is probably the fastest way to build problem-solving skills, since you constantly fail and each failure costs nothing. The caveat is that you may not be solving practical problems. However, building it up as a skill makes you a more effective programmer.

Do you need advanced math?

Not really. Most math is basically a form of mental masturbation and a way of building the mental discipline and stamina to sit down and think very hard. I think it makes you better in some ways, but there is an opportunity cost. It all depends on what you want to do. If you want to ship products, then you are probably fine avoiding it. If you want to make awesome libraries and sell them to product people, then you probably need some advanced math.

2 comments:

  1. +1 for Cauchy Schwarz Master Class. That's a great book that I don't think enough people know about.

  2. Landau's book is great! I have an old copy (I see it is now back in print), but I honestly don't think calculus relates much to programming. Linear algebra, on the other hand, rocks both for training your mind to program and as useful programming knowledge. Watch Gilbert Strang's MIT lecture series and get his textbook.
