Partial Anagrams

Tue, 22 May 2007

Here is a term I just made up - partial anagram. I say that a word is a partial anagram of another if you can rearrange the letters of it, plus a few other letters to make the other word. The "order" of a partial anagram is the number of letters you need to add to the word to make the anagram. So basically, it is like an anagram, but you are allowed to add more letters to a word if it is too short.

So from this definition, we can define (nearly) a partial order over all the words in the English language (or more conveniently, all the words in /usr/share/dict/words that are only made up of lower-case letters). For the non maths people, a partial order is a bit like an order (as in, "alphabetical order") except instead of being able to put everything in one long line from first to last, you can only compare certain things. For example, if we wanted to order things by "how close they are to being a cake", then we could say that "flour" is further away than "a mixture of flour and butter", but you can't say that flour is closer than butter or vice versa (well, you could but that would be annoying).

So, our partial order says that a word comes before another if you can add a single letter to it, then scramble it up and get the other word. So if you were to take the words in English and see how they connect, you might expect lots of different patterns to emerge (maybe lots of disconnected islands, maybe a huge connected thing with "a" in the middle, connecting to "an", "ad", "at" etc then "and", "tan", "ran", "man"... etc. Eventually connecting every word to another. Enough talk, here are some pictures:

Simple graph

This is a simple graph showing what we are talking about. For example, you can turn "originate" into "invigorate" by adding a "v" ("originatev") and rearranging the letters.

Words of at least 8 letters

This is the graph for the biggest connected group of words with 8 letters or more. The graph for all words takes too much computing time to generate unfortunately.

Words of at least 9 letters

This is a graph with words of 9 letters or more. It is basically the same structure as above, just with less density.

Interestingly, if we take all words, then although nearly all are connected to each other, there are still a few which form islands. The biggest island is this one:

The largest "island" of words

And this concludes the cool graph of the day.

This isn't actually a partial order, since I've messed around with a few definitions to make it easier to write about (all of the "comes before" things should be "comes before, or is equal to" or similar) and then the "is equal to" part needs a good definition. But, I don't mind, because I got a pretty graph.

I also cheated and left out words ending with 's'. It meant I could be lazier when writing the algorithms so they didn't have to be as efficient.

Name & email are optional. Email will not be obfuscated.
HTML tags will be removed except hyperlinks.
 

About

I'm a nerd living in Sydney. This is a place where I can write stuff about my interests and not care that no one else is reading.

I like music, maths, programming, pretty pictures, filters and other good things.

(more info)

It should be fairly obvious that this isn't connected to my employer at all.

Email me (not a catchpa)

Email policy

Subscribe

RSS Feed RSS

Get an aggregator

Liferea (Linux)

Vienna (OSX)

Feedreader (Windows)

Google Reader (Web based)

I've only used Liferea, so I can't vouch for the other ones.

About this site

This site runs a (modified) version of blosxom.

The host is GeekISP, and they seem to do an excellent job.