Friday, September 2, 2011

Intersecting Lines (The Art of Representing Data)

So, Atwood's Law. Yes, I'm thinking about writing a very light weight computational geometry library in JavaScript. I'm purposely going to target 2D computational geometry because (a) it is 'easier', (b) it can be more readily understood. So, I'm about to start work on the "kernel". I'm about to define a "line".

Well, how do I do that?

If I go back to what I taught college algebra students, then a line is

y = m x + b

Ah, yes, intercept-slope form. Beyond the fact that this is a very easy representation of a line, it is very fucking useless. Why? You can't express vertical lines. Let's look at another version

(y - y0) = (x - x0) m

Oh, yes, still useless. Can't express vertical lines. Ok, Ok. Back up, college algebra was apparently a waste of time since it didn't give a meaningful definition of line. Somewhere, probably graduate school, I realized a line a equivalence relationship of some sort. Aha, yes, a general form to a line.

a x + b y + c = 0

This is way better as I have a complete representation of a line. Yay! Ok, so, let's use it. I can take two of these bastards and solve the system of two unknowns. Ok, what happens when there is no solution?... uh, they are parallel and don't intersect. Ok, so, yes, I find the point at which they intersect. Is this useful? Well, it could be, but I've lost information. Let's look at this equation in vector form.

N (x - x0) = 0

This would be the normal-i-can't-remember-name-but-similiar-to-plane-equation form. This is a nice and cute representation, but it gives be a really poor way to intersect lines naturally. It has a beautiful explaination thou, basically, the line is a collection of all vectors that are orthogonal to to the normal (when translated to the origin). While this is a useless computational representation, it does tell us something useful about the general form. Namely, N = (a,b); isn't that nice.

So far, the general form has given us the most bang for our buck. However, it still sucks since this general form is a compressed representation of a line that enables me to (a) check to see if a point is in the line, (b) find more points in the line, (c) find a line that is perpendicular to it. In the platonic world, this set is awesome. However, the real world tend to suck and lines start somewhere. So, let's go back to that vector stuff and consider (briefly) an Affine linear combination.

L(k) = k * P1 + (1 - k) * P2

Is this a line? Well, it only has one variable and it has at least two points. Well, as cute and as insightful as this is; it is still useless. Too many operations.

L(t) = P0 + D t

This is what I am going to go with. I know where the line starts! I know which direction it is going! It is very easy to intersect two lines! It's easy to compute the normal of the line (use complex numbers to rotate D)! It is trivial to compute new points in the line! As an added bonus, I get rays and segments for free by trapping t in a one dimensional box. Yay!

Moral of the Story

There are many ways to represent the same or very similar data. The one you learned first was probably not the best.

Having been alive for a while, I've learned to try to think ahead of time to find the right representation for the data. That is, after-all, 90% of computing. If spend a bunch of your time writing algorithms, then you are probably doing something wrong.

Am I making the right decision? Not sure. See, now that I've decided which numbers to actually store. I now need to pick between an object, an array, or a composition of arrays. I'll benchmark that since that is part of the core, and it will affect both the performance, readability, and beauty of the code.

Final Decision: going with readability and the 3rd representation even thou it costs double the CPU cycles. Although, I am impressed with how fast JavaScript is.

No comments:

Post a Comment