Partial Anagrams
Here is a term I just made up - partial anagram. I say that a word is a partial anagram of another if you can rearrange the letters of it, plus a few other letters to make the other word. The "order" of a partial anagram is the number of letters you need to add to the word to make the anagram. So basically, it is like an anagram, but you are allowed to add more letters to a word if it is too short.
So from this definition, we can define (nearly) a partial order over all the words in the English language (or more conveniently, all the words in /usr/share/dict/words that are only made up of lower-case letters). For the non maths people, a partial order is a bit like an order (as in, "alphabetical order") except instead of being able to put everything in one long line from first to last, you can only compare certain things. For example, if we wanted to order things by "how close they are to being a cake", then we could say that "flour" is further away than "a mixture of flour and butter", but you can't say that flour is closer than butter or vice versa (well, you could but that would be annoying).
So, our partial order says that a word comes before another if you can add a single letter to it, then scramble it up and get the other word. So if you were to take the words in English and see how they connect, you might expect lots of different patterns to emerge (maybe lots of disconnected islands, maybe a huge connected thing with "a" in the middle, connecting to "an", "ad", "at" etc then "and", "tan", "ran", "man"... etc. Eventually connecting every word to another. Enough talk, here are some pictures:

This is a simple graph showing what we are talking about. For example, you can turn "originate" into "invigorate" by adding a "v" ("originatev") and rearranging the letters.

This is the graph for the biggest connected group of words with 8 letters or more. The graph for all words takes too much computing time to generate unfortunately.

This is a graph with words of 9 letters or more. It is basically the same structure as above, just with less density.
Interestingly, if we take all words, then although nearly all are connected to each other, there are still a few which form islands. The biggest island is this one:

And this concludes the cool graph of the day.
This isn't actually a partial order, since I've messed around with a few definitions to make it easier to write about (all of the "comes before" things should be "comes before, or is equal to" or similar) and then the "is equal to" part needs a good definition. But, I don't mind, because I got a pretty graph.
I also cheated and left out words ending with 's'. It meant I could be lazier when writing the algorithms so they didn't have to be as efficient.

RSS