Postcode Distances

Sun, 17 Aug 2008

Generally, suburbs that are near each other have postcodes that are in the same ballpark. Since postcodes are only 1 dimension, and locations are 2 dimensions, there are obviously going to be plenty of exceptions to the rule. But not so many that you can't make a graph like this:

Postcode difference versus distance

So if you have two postcodes which are 10 apart (say 2000 and 2010), then on average they should be 105km apart (assuming you are working with Australian postcodes).

The graph can be drawn so it goes further in the x-direction, but things get a bit silly then:

Postcode difference versus distance

Towards the very right, there aren't many datapoints at all (there aren't many postcodes 8000 apart) so it is actually quite accurate there, it just doens't look as cool.

I sourced my postcode data from sixfive.co.uk, processed it with python (since I'm using it at work, I figure I might as well do more with it) and plotted it with gnuplot. I also worked under the assumption that 1 degree of longitude or latitude is 111km, which I think is about right for most of Australia (though I was doing this from memory, not wikipedia). In any case, if you are using the actual numbers here for anything serious, then you are doing something wrong.

Number Complexity

Mon, 22 Oct 2007

One of the ways of measuring the amount of information in something is by finding the shortest description of that something. For example, the sequence "183736353" has more information than "111111111" by most definitions. The reason that I say "most definitions" is because the language you use to express the description matters. If this language had a really short word for "the sequence 183736353" and a medium length word for "1", then it is the other way around. So to talk about complexity we need to have a definition of the language we are allowed to use. I believe this is called Kolmogorov complexity.

So I thought that it would be good to see what the complexity of a few numbers was, to see how it grows as the numbers get bigger. The description language I used was simple arithmetic (I think most people doing Kolmogorov complexity use simple programming languages, which are much more powerful than arithmetic) and the number "1". So we have the operations plus, minus, multiply and divide. For simplicity, all divisions must have integer results (though I never reached a stage where division was useful). The operations can be done in any order (so you can have free brackets), but you can't ram two 1s together to form an 11. I wrote a program which would try lots of these expressions, in order of increasing length, and found the shortest expression for each number.

Unfortunately, when working in a garbage collected language, you can write memory leaks without realising. So after I had left the program running for a while and came back, the program was using 500MB of memory, and wanting more, so lots of things were being pushed into swap space and basically it was really inefficient. So I didn't get quite as much of the results I was hoping for, but I got enough to draw a graph (after filling in a few blanks with what seem like the right numbers):

Kolomogorov complexity of integers <= 40

(the vertical axis is just the number of 1s needed, since each operator is binary, each expression must have exactly one more '1' than it does operators, so this number easily maps to the length of the whole expression)

There is also a line of best fit in there, which I think fits the data pretty well, and it feels like the right type of function (anything to do with information content tends to have logs in it): 3.233 * log(n) - 0.8701. So if I was to estimate the number of 1s needed to write out my phone number (an 8 digit number), it would be 59. But this estimate might get a bit wrong as we get higher, since all of the expressions my program found just used addition and multiplication. Once you get high enough that subtraction becomes more useful, the curve might take a different shape. I'd be interested to see if there is a number where the shortest (or even one of the shortest) expression which yields it uses division. I suspect that there is (with the expression of the form: ((some large but 'simple' number) + (some simple number)) / (some simple number). I suspect that it would be quite a large number if it exists.

Newton's Method Fractal

Sat, 22 Sep 2007

I'm not sure if this falls under my usual definition of graph (but my usual definition of graph doesn't agree with answers.com either, so meh). So, you can decide for yourself if it is a graph or not. In any case, it is about as useful as my other graphs.

In highschool we were taught a way of finding the roots of an equation using Newton's Method. For the non-mathsy people, if you have an equation, like y = x + 5, then you can draw it as a picture, where all the points where the 'y' coordinate is 5 more than the 'x' coordinate are black and everything else is white. In this case, it would be a diagonal line. A root is a place where y is 0 (so x = -5 in this case). Newton's method is a method for taking a guess at where a root might be, and then (hopefully) improving that guess. It looks at the slope of the line at the point you have guessed, and then assumes that the line doesn't get any steeper or shallower, and finds where it would hit the point where y = 0. This is the revised guess. In the case of y = x + 5, it will find the exact root straight away (because the slope doesn't change). For more complicated functions, it will usually just get closer and closer to the actual root (as you keep on guessing, using the last revised guess as the starting point).

When you have a function that has multiple roots (a squigly line that goes through the y = 0 line multiple times, you can theoretically end up converging towards any of the roots, though you are more likely to end up at one close to where you start from.

The final piece of the puzzle you need to know is that Newton's method works for complex numbers too (for the non-mathsy people... complex numbers aren't complex as long as you don't think too hard about them. If you think of numbers as having a left (negative) and a right (positive), complex numbers give them an up and a down, so a number is a point in 2D space, rather than a point on a 1D number line).

So, if we take a function with lots of roots - cos (which has an infinite number of roots) and find where we converge to from different starting points, we can get a cool picture. We just need some way to show which point goes to which root. Since the roots are spread apart by 3 and a bit, I just chopped off all the stuff after the decimal point, and then found the remainer when dividing by some constant:

Picture of which root Newton's method resolves to for cos(z), z near 0

I'm not the first to do this. This person has done it, and made this cool picture for y = x^3 - 1, which has 3 roots, which they have coloured red, green, and blue).

One thing which I thought would be fun to do though was to zoom in on it, and make an animation, so this is that animation:

What is especially cool is the frames that have purple in them, which I have no idea about (they are a bug), the thumbnail (if you can find it) (which also has the crazy purple), and the fact that it chopped off the end of the video. Actually, the last one isn't cool. Sometimes failure can be interesting (where did this purple come from?), and sometimes it can be boring (where did the last bit of the animation go?). I don't think I know enough about making these videos to fix it, so I'll just leave it as is.

Building Numbers

Sat, 04 Aug 2007

Time for another graph. If you take the digits in a number, and use any combination of +, -, * and /, then you can build other numbers. For example, with the number 62 you can make 3 (6 / 2), 4 (6 - 2), 8 (6 + 2) and 12 (6 * 2). So if you take all the two digit numbers, then find out which ones can build other two digit numbers. An arrow from a number to another indicates that it can build the other:

Graph of 2-digit number building in base 10

Or 3 digit numbers: Graph of 3-digit number building in base 10

You can also do it in different bases, like octal:

These are the 3 (octal) digit connections: Graph of 3-digit number building in base 8

Not much else too this. I just like graphs.

Partial Anagrams

Tue, 22 May 2007

Here is a term I just made up - partial anagram. I say that a word is a partial anagram of another if you can rearrange the letters of it, plus a few other letters to make the other word. The "order" of a partial anagram is the number of letters you need to add to the word to make the anagram. So basically, it is like an anagram, but you are allowed to add more letters to a word if it is too short.

So from this definition, we can define (nearly) a partial order over all the words in the English language (or more conveniently, all the words in /usr/share/dict/words that are only made up of lower-case letters). For the non maths people, a partial order is a bit like an order (as in, "alphabetical order") except instead of being able to put everything in one long line from first to last, you can only compare certain things. For example, if we wanted to order things by "how close they are to being a cake", then we could say that "flour" is further away than "a mixture of flour and butter", but you can't say that flour is closer than butter or vice versa (well, you could but that would be annoying).

So, our partial order says that a word comes before another if you can add a single letter to it, then scramble it up and get the other word. So if you were to take the words in English and see how they connect, you might expect lots of different patterns to emerge (maybe lots of disconnected islands, maybe a huge connected thing with "a" in the middle, connecting to "an", "ad", "at" etc then "and", "tan", "ran", "man"... etc. Eventually connecting every word to another. Enough talk, here are some pictures:

Simple graph

This is a simple graph showing what we are talking about. For example, you can turn "originate" into "invigorate" by adding a "v" ("originatev") and rearranging the letters.

Words of at least 8 letters

This is the graph for the biggest connected group of words with 8 letters or more. The graph for all words takes too much computing time to generate unfortunately.

Words of at least 9 letters

This is a graph with words of 9 letters or more. It is basically the same structure as above, just with less density.

Interestingly, if we take all words, then although nearly all are connected to each other, there are still a few which form islands. The biggest island is this one:

The largest "island" of words

And this concludes the cool graph of the day.

This isn't actually a partial order, since I've messed around with a few definitions to make it easier to write about (all of the "comes before" things should be "comes before, or is equal to" or similar) and then the "is equal to" part needs a good definition. But, I don't mind, because I got a pretty graph.

I also cheated and left out words ending with 's'. It meant I could be lazier when writing the algorithms so they didn't have to be as efficient.

 

About

I'm a nerd living in Sydney. This is a place where I can write stuff about my interests and not care that no one else is reading.

I like music, maths, programming, pretty pictures, filters and other good things.

(more info)

It should be fairly obvious that this isn't connected to my employer at all.

Email me (not a catchpa)

Email policy

Subscribe

RSS Feed RSS

Get an aggregator

Liferea (Linux)

Vienna (OSX)

Feedreader (Windows)

Google Reader (Web based)

I've only used Liferea, so I can't vouch for the other ones.

About this site

This site runs a (modified) version of blosxom.

The host is GeekISP, and they seem to do an excellent job.