Tetroids

Mon, 13 Oct 2008

Screenshot

This is a game I made last week. It mixes together three different video games into one. It is actually quite fun to play, which is something that surprised me. The interaction of tetris and asteroids adds a whole lot to the asteroids game (and basically destroys the tetris game). While you are out shooting and dodging asteroids, you can build a safe haven out of the tetris blocks that have fallen, so you can get yourself into a safer area when there are lots of asteroids around. But you can't just camp there the whole game, or else the tetris blocks will reach the top of the screen.

I think I'm a big fan of games where you can do some sort of construction, even if it is as basic as shooting out a hole in a stack of tetris blocks and calling it a bunker. That part of me makes me want to call myself creative, but then when I look at the people that you would typically call creative (i.e. the artsie types), I don't see many similarities.

The code is C code (GPL3), it uses SDL for graphics and input. There is even a precompiled linux (x86) binary in there (but there is also a working makefile). Theoretically it should compile and run in Windows. I tried and failed (you might think that a compiler with the year 2005 in its name would be able to compile C99 code, but you would be wrong) and don't really have enough access to Windows machines or desire to fiddle with them in order to get a Windows binary. I'd be happy to host it though if someone makes one (send me the files you used to build it too, since that sort of thing is useful).

Finally, press Escape to exit, I never got around to listening to the window manager events.

Subgrep Fix

Mon, 10 Dec 2007

As pointed out to me in the comments, I had a bug in the subgrep program last week. I'll explain it, but first detail how the whole thing works:

1) All of the search patterns are represented by a DFA, when you draw a DFA on paper it is a bunch of circles with arrows connecting them:

An example DFA

To see if a string matches a dfa, you start at a particular (fixed) node (which is usually marked, but I forgot... in this case it can be the circle marked '1'), and follow the arrow corresponding to the next letter in the string. So this DFA only works for an alphabet of "a" and "b". If when you get to the very end of the string you are at a circle marked with a double line (the '3') then the string matches. Otherwise it doesn't.

Strings that match: "b", "abaa", "aa", "bba", "baaaa".

Strings that don't: "a", "abab", "bb", "bbb", "abbbb".

2) When the computer produces the regexp, it generally makes one that is a bit inefficient, and has lots of circles which aren't needed. It turns out that for every possible regular expression, there is a minimal number of circles you need to express it, and when you have it drawn with this minimal number of circles, the structure of the thing you have drawn (that is, everything except for the label) uniquely describes that regular expression. So if two regular expressions are for the same language, then they both have the same minimal DFA. Therefore, we like to simplify the DFA so that we can tell if two regular expressions are actually telling us the same thing.

3) To simplify a regular expression, you look at each pair of circles, and check if they are equivalent. For them to be equivalent, they must both look the same (if one has a double circle and the other doesn't, then they aren't equivalent). And then you have to check if all their arrows lead to the same circles (for the same letters), if we were to look at circles "2" and "3", both of them lead to circle "3" if they read an "a", but they go to different places if you give them a "b", so they aren't the same (and "3" is a double-circle). When you find equivalent circles, you merge them together and check for more equivalent circles.

4) To check if a regular expression (P) is a subset of another (Q), you create a new regular expression called S which is the union of P and Q. The details of this won't make sense here (convert DFAs to NDFAs, create new NDFA which is union, reduce to DFA, simplify), but at the end of the process, S is another DFA (which is simplified) and you then check if it is equivalent to P, or equivalent to Q. If it is equivalent to P, then everything in Q is also in P (so it is a subset) and the same goes the other way around.

My mistake was in step 3, when I was checking if two circles were equivalent, you want to assume the circles are equivalent when doing the check. I didn't do that, so if there were two circles which both pointed only to themselves, then they should be equivalent to each other, but my program didn't notice that.

The updated source (still C code, GPL3, with makefile) fixes this (the only change is in RegExpDfa.c).

Where Do I Come From?

Mon, 10 Dec 2007

wheredoicomefrom is a (potentially useful) tool for anyone using a debian-like system (such as Ubuntu). You tell it a file, and it tells you what package (if any) was responsible for putting the file there. If it isn't installed by any packages, then it tells you so. It is a pretty simple application, it just reads through all the files in /var/lib/dpkg/info to see which package installed it. But, I can actually imagine this being useful.

Get the public domain ruby source.

Subgrep

Mon, 03 Dec 2007

I've had this program lying around for a while, and not really known what to do with it. I wrote it a while ago when I was trying to work out my coding style conventions in C. I thought that the best way to do this would be to build a mediumish project and think a little bit about lots of the decisions I made, and when I realised I stuffed up, I would try to go back and change everything. I don't know how successful it was (as in, "I don't know", not "it wasn't successful, but I'll say 'I don't know' to avoid admitting failure") and at least one thing has changed since then (I now name functions and variables using underscores not camel case, so code looks prettier when using GTK+ libraries).

Skip this if you know what regular expresions are: A regular expresion is a description of a set of strings. Like "the set of all strings with 3 vowels in a row" or "the set of all strings ending in "bacon". They are useful for computer nerds who have a big file which has a little bit of stuff in it they want, but they don't know exactly what the stuff is (or else they wouldn't need to find it). So you create a regular expression that matches things that look like what you are after (say "four numbers, then a space, then 4 more numbers, optionally preceeded by two numbers in brackets" for phone numbers) and then run a program which looks for things which match.

Anyway, the program is a tool which I have no idea what you would actually use it for, but still is more a tool than a toy. It takes a regular expression, and a text file, and it then treats each line in the text file as a regular expression and tells you if it is a subset of the given expresion, or the other way around, or if they are equivalent or otherwise. I call it subgrep, because it is like grep, but it does subsets. (For those not in the know, "grep" takes a regular expresion and a file and tells you which lines in the file match the regular expression).

Going through uni, I think I was taught regular expressions about 3 times in different subjects (one was practical, one was maths-purist, one was computer science-purist), but I don't think any of them really cared about subsets. Pretty much the only thing the purists cared about was the equivalence to DFAs and NDFAs (which this code makes use of, a regular expression is converted to an NDFA, then to a DFA). The practical people didn't care about much. So, this code gives a bit of a bigger spectrum, it has a tokeniser, a parser, a parser to NDFA converter, an NDFA to DFA converter, a DFA simplifier, and a few other operations on DFAs (like unions and equivalence, which are then combined to do subsets).

The challenge now is to find an actual use for this tool. I haven't thought of one, but maybe someone will find it useful: C source code (GPL3).

Text Similarity

Mon, 26 Nov 2007

Here is a neat use of zlib, which is a library for compressing stuff (like how you use zip files for compressing stuff). Compression works best when the stuff you are compressing is self-similar. So, it will probably compress "aaaaaaa" better than "abcdefg". So, if you had a document that was written in English, but then you tacked on a little bit of German stuff, then typically it won't compress to being quite so small as if you had tacked on the same amount of English stuff (because the two languages use different words and have different letter frequencies and so on). The difference is slight, but present.

So, you can use this to write a program which knows the difference between different languages. And because plenty of people have already done the hard work, it is a nearly trivial program:

require 'zlib'

def compare_strings(a, b)
    al = Zlib::Deflate.deflate(a).length
    bl = Zlib::Deflate.deflate(b).length
    abl = Zlib::Deflate.deflate(a + b).length
    bal = Zlib::Deflate.deflate(b + a).length

    return [abl, bal].min.to_f / (al + bl).to_f
end

This function gives a 'distance' between two strings. A smaller number means they are more similar. A larger number means they are more different.

Some examples:

compare_strings("abcdefg", "aaaaaaa") => 0.653846153846154
compare_strings("aaaaaaa", "aaaaaaaaaaaaaaaa") => 0.5
compare_strings("aaaaaaa", "aaaaaaaaaaaab") => 0.521739130434783
compare_strings("aaaaaaa", "bbbbbbb") => 0.636363636363636

So, I downloaded a few of the Leipzig Corpora and grabbed 3 chunks of the English, German and Estonian corpora (each 20 sentences long) and ran them through a bit of wrapper code to compare everything to each other, then rank the distances:

The results were it ranked all the things in the same language between 0.511 and 0.960 (low numbers were for when it compared a file with itself). And all the things in different languages between 0.980 and 0.994. If you notice that those ranges don't overlap, so every single pair of files written in the same language had a shorter distance than every single pair of files in different languages. I would expand the test to more languages, but I much prefer being able to say that it provides a perfect classifier.

What is super cool about this is that the algorithm is really dumb, which means that it will work if we add more languages, or even do it on programming languages, or potentially other data that compresses well.

If you feel like playing with it, you can download my hacky driver code too (public domain) (includes the test data I used so you can verify my results).

Particle Simulator

Mon, 08 Oct 2007

Sometimes it is useful to have a physics simulator, for doing whatever you do with a physics simulator. I was doing a little bit of work on one recently, so if you want to have a play, then this has it. There isn't very much insighful about it, it is just a simple engine:

Screenshot

  • Verlet integration (makes collision resolution lots simpler)
  • Handles spheres of different weights
  • Crashes on singularities
  • Handles a few different global constraints (all objects inside a sphere, all objects have some coordinate < k
  • Handles a few different object constraints (two objects stay a fixed distance apart, an object stays at a fixed point)
  • Not numerically stable (this is a todo)
  • Brute force algorithms in too many spots (this is a todo)
  • Becomes unusable when processing gets too hard
  • Slow and ugly view of the world

Although it supports spheres, I didn't feel like doing 3d graphics, so I just set the z coordinate of everything to 0, and because of the accuracy of a computer, all the spheres magically balance on each other.

The direction this is going is to get something where I can approximate a fluid in a reasonably complex system. To do that, it needs to be able to support more spheres (so it will need a better way of indexing objects to cut down on collision detection time), able to support triangles (so you can model lots more things), and be a bit more stable (so it is more fun to play with). This will probably also involve switching the graphical frontend to something different so it is quicker to redraw (and thus more fun to play with).

Get it here (GPL3) I've included an executable file, which I suspect will work on pretty much any Linux system with GTK+ (which should be lots), and also the source (which should be compilable on Windows with GTK+ libs). It isn't my prettiest code, but it should be reasonably ok to hack around. There are a couple of spots where I know I'm going to have to re-engineer large sections of code.

It is actually quite fun to play with. You control the large ball with your mouse, so you can bump the other balls, and try to keep them up in the air or whatever. I can easily imagine a fun game being made from it - I just need triangles so I can build a net and it is an awesome tennis game :)

Board Game AI

Mon, 01 Oct 2007

My most recent project has been a stupid little board game I've called "Siege" which really has nothing to do with seiges in the current form of the rules. I wanted to make a board game, and a computer player for it, and since I know that my computer player will be pretty sucky, I thought that by inventing a new game, I would avoid having other people's programs beat mine.

So this is what the game looks like at the moment:

The start of the game

Currently the rules are:

  • Each turn you can move one of your guys.
  • A guy can move either left/right/up/down to an empty square
  • A guy can jump over another guy or castle onto a square (which can be occupied, except by someone on that guy's team)
  • If a guy jumps onto a tree, then he chops it down and carries it with him.
  • Once he takes the wood back to the castle, somehow the wood encourages another guy to come out of the castle on the opposite side, and the wood goes into the castle.

It sounds like it would make for an interesting game, but I'm pretty sure it isn't at the moment. Possibly it is because the AI player is a bit too sucky, but more likely because these rules lead to a boring game.

The thing which I think is a little bit interesting is in the design of the interface between the game and the AI player. Since I'm planning on changing the rules lots I decided that the computer player would know as little about the game as possible. So it is allowed to ask the following questions:

How good does this position look for the current player?

What moves can I make from this position?

What would the board look like if I made this move?

From this, I think it should be possible to write a computer player that plays really well. The main problem is the first question - defining how good a position looks is tough. At the moment, I've settled for something really simple (basically a count of the number of guys each player has, plus a bonus if they have chopped down a tree). If we had infinite computing time, then we would only need a "Which (if any) player has won in this position?" question.

But, the reason I find this a bit interesting is that there is so much the computer player doesn't know about the game (it doesn't know what the moves represent, or what the board represents or anything), and I suspect that if you tried to get a person to play the game blindfolded and only let them ask these questions, then they would be really bad at it. So then what are the things that a person needs to know about the game that give them this advantage?

For example, when I play the game, it seems like a useful thing to do to go around the side, chopping my way through the trees so I can get near the other castle. But my computer player doesn't even know that there are trees on the sides. One question which I think I might allow it to ask is "Are these two moves the same?", which seems like a stupid question, but I have a suspicion that I can use it to speed up the algorithm a bit.

The computer player does have one very effective strategy though - when you are about to win, it crashes. Which means that it maintains an undefeated record.

Next time I write, I'll hopefully have it at a stage where it is actualy fun to play, and I'll give some source code (and an updated amkel that can handle gtk+ applications).

Turing Complete Trains

Mon, 03 Sep 2007

Suppose you have lots of train track, and a really fast train and you need to calculate the millionth digit of pi. You aren't one of those people who likes memorizing stuff like that so you figure that you might as well use what you have and get your train to somehow work it out. If that isn't a great introduction, then I don't know what is. Oh, and as a warning, this article is long... but not as long as it looks (if you are looking at the scroll bar thinking that I've rambled heaps more than usual).

For the non-Computer Scientists (what are you doing here?), Turing completeness is a property of a system (like a train set, or a computer, or a ball rolling down a hill, or an ant farm) that means that you can arrange it in such a way that a few moments before the universe is destroyed (or earlier, if you are lucky), it will be able to "calculate" anything that your desktop computer can calculate.

I was originally making a flash application to do all of this, but I couldn't find any useful documentation for the flash API, it was so horribly slow after adding a few pieces of track that I switched to a boring text based interface. The way it works is you 'draw' a track using normal characters, and you place a train somewhere and it goes about its calculations and stops when the train stops. If the train ever runs into a lower case letter, we print that letter. The train starts at the letter "T" (for "train") facing East and stops at the letter "H" (for "halt").

This is an example track

T-----hello-world----\
                     |
               H     |
               |     |
               \-----/

You might think this prints out "hello world", but you'd be wrong. Trains can't print out spaces, that would be silly. So it just prints "helloworld" which is much better.

So, this would be useful if you knew that the millionth digit of pi was 7, then you could write:

T-----seven----H

But we need our train to be a bit smarter, so we introduce forks:

      H
      |
      a
      |
T-----<
      |
      b
      |
      H

This program will print "a", because the < fork always starts pointing up. But, we can change the fork by sending the train backwards through it. So if we send the train in from the south, the fork will now point down:

      H
      |
      a
      |
H--b--<
      |
   T--/

This prints "b", but changes the direction the fork faces.

Pretty lame.

The next fork, looks the same, but faces the other direction. This one first points down, but after you travel through it flips to pointing up (and back down the next time). If you come into this fork from the North or the South, your train crashes and you lose.

T------------\
             |
    H-a--\   |
         |   |
         >---/
         |
    H-b--/

Prints "b" (but the next time you go through the fork it will print "a")

The last two forks we have are really boring. One is "V" the other is "^". If you go up through the "V", you always come out pointing right. If you go down through the "^", you always come out pointing left.

So that is it, with this, we can build a computer.

First, we build a single bit of memory:

flip ---------------------\   /---  finished flip          
        /-V---------------+---/                                
        | |  /---------\  |                                  
        | |  |         >--/                                   
        | |  |         |                                     
        | |  |         |                                     
        | |  \--V------+-----------H   zero
        | |     |      |                                     
read T--/ \-----<      \-V---------H   one
                |        |                             
                \--------/

We have two "inputs" on the left, a "flip" and a "read". If we go in the "flip", we change the value we are storing (from 1 to 0, or 0 to 1). If we go in the "read", then we come out at either the "zero" or the "one".

So, we can make lots of these memory things (each one is just a single bit). Then we just need to connect them up together. Here is an example of connecting them together, creating a 3-bit counter.

/----overflow--------H------------------------------------\
\-----------------------------------------\               |
                                          |               |
   /---------------------------------\    |               |
   |                                 |    |               |
/--+----------------------\   /------/    |               |
|  |    /-V---------------+---/           |               |    
|  |    | |  /---------\  |               |               |  
|  |    | |  |         >--/               |               |   
|  |    | |  |         |                  |               |  
|  |    | |  |         |                  |               |   
|  |    | |  \--V------+------------------/               |
|  |    | |     |      |                                  |   
|  \----/ \-----<      \-V-------------------------------V/
|               |        |                               \\
|               \--------/                                |
|                                                         |
|                                                         |
\-----------------------------------------\               |
                                          |               |
   /---------------------------------\    |               |
   |                                 |    |               |
/--+----------------------\   /------/    |               |
|  |    /-V---------------+---/           |               |                    
|  |    | |  /---------\  |               |               |                  
|  |    | |  |         >--/               |               |                   
|  |    | |  |         |                  |               |                  
|  |    | |  |         |                  |               |                  
|  |    | |  \--V------+------------------/               |
|  |    | |     |      |                                  |                  
|  \----/ \-----<      \-V-------------------------------V/
|               |        |                               \\
|               \--------/                                | 
|                                                         |
\-----------------------------------------\               |
                                          |               |
   /---------------------------------\    |               |
   |                                 |    |               |
/T-+----------------------\   /------/    |               |
|  |    /-V---------------+---/           |               |   
|  |    | |  /---------\  |               |               | 
|  |    | |  |         >--/               |               | 
|  |    | |  |         |                  |               | 
|  |    | |  |         |                  |               | 
|  |    | |  \--V------+------------------/               |
|  |    | |     |      |                                  | 
|  \----/ \-----<      \-V-------------------------------V/
|               |        |                               \---------m-----\                
|               \--------/                                               |
|                                                                        |
\------------------------------------------------------------------------/

This will output "mmmmmmmoverflow" (7 m's). Since it is probably quite hard to intuitively see what is going on, the train flips the first memory cell, then goes and reads the result. If the result is "0", then it goes out the bottom path and eventually to the part which prints an "m". If the result was "1", then it goes to the next memory cell, and does the same thing.

In this example, every time we read the memory cell, we do the same thing - if it is "0", we print "m" and go to the start. If it is "1", we flip the next memory cell. To make our computer useful, we need to be able to do different things after reading depending on what state we are in. In this example, we have 4 tracks ("a", "b", "c", and "d"), each of them wants to do something (say, change the value of a memory cell) and then do some other stuff:

      /V---------\
 --a--/\-------\ \-----finished-a--H
      /V-------+-\
 T-b--/\-----\ | \-----finished-b--H
      /V-----+-+-\
 --c--/\---\ | | \-----finished-c--H
      /V---+-+-+-\
 --d--/\-\ | | | \-----finished-d--H
         | | | |
      /--< | | |
      |  \-< | |
      |    \-< |
      |      \-/
      |                                     /-\
      |                                     | |
      |                                     \V/
      \--do-something-here-and-turn-around---/

With the arrangement of tracks at the top, you can extend it as much as you want, and it will go about its business, then return to the track that it started from. So in this case we print: "bdosomethinghereandturnarounddnuoranrutdnaerehgnihtemosodfinishedb", but if we had entered from the "a" track, we would get the same thing, but with an "a" at the start and end.

This is nearly all of the constructs you need to have a [turing machine] (this would be a 4 state turing machine). The next thing you need to be able to do is to have your subroutine return a value of some description so you can have the different states do different things depending on the value of a memory cell. You can do this by having an extra memory cell for each state. When you do the subroutine, you set each of these memory cells to the outcome. Then, when the subroutine returns, you can read from the state's personal memory cell to get the value. If you look at the way a memory cell works, you can set it to "1" or "0" (as opposed to flipping it) by entering through the corresponding output line.

Since this is way long enough, it can be an exercise for someone who is really bored. Also check out the Turing Train Terminal where a real model train was used to make a computer that can add two 3-bit numbers. Their layout is a lot simpler than all of mine because theirs is specialised, and they don't have the limitations that I imposed by using ascii characters to represent everything.

Get the train track interpreter here. (Ruby, public domain, very hackish - won't be much fun to work on)

Target and Sudoku

Thu, 05 Jul 2007

There is a game called target in my newspaper where you are given 9 letters and have to find anagrams of some length with one particular letter included. I wanted to beat some people at it, so I wrote a target solver. It works the obvious brute force way (use as ./target.rb ninelettr).

There is another game called Sudoku, which I don't think goes by any other names so it shouldn't need an intoduction, and I wanted to beat some people at it too, so I wrote a sudoku solver. It works by picking a blank square and assuming it is 1, then trying to solve the rest of the puzzle. If it finds a solution, then we win, if it doesn't, then it assumes that the current blank square is a 2 and tries to solve the rest. Repeat and recurse. (Use as ./sudoku.rb puzzle where puzzle is a text file containing 9 lines, each with 9 numbers. Use '0' for blanks. Output isn't pretty.)

For my purposes, I only needed to do 4 target questions and 3 sudoku puzzles, so this is reflected in the quality of the code.

Both programs get a BSD license just for a change.

Amkel

Wed, 20 Jun 2007

Here is a program for all you people who are sick of writing makefiles. If you don't know what a makefile is, then you can skip this one. Amkel is a quick and dodgy build system which can build C projects or Ruby projects. You write:

amkel my_file.c

And it will figure out all the dependencies for my_file.c, compile them (if they have changed), link them (if needed), and give you a new binary called my_file. All this without you learning another build system. Of course, building most stuff is more complicated than that, but I find that I have so many small programs where figuring out everything that needs to be done can be done just by reading the #includes.

If you are wondering what I mean by "build" when it comes to a ruby project, then it just means package into a single executable file. This is all done using the magic of tar2rubyscript and rubyscript2exe (which depsite the name, can produce ELF binaries too). Note that the executable created by amkel will include all the contents of the directory the script you give it is in (so if you had a 'hello world' program in your home directory and you ran amkel hello.rb then the output file would contain everything in your home directory. Most likely this isn't what you wanted. If you feel like writing an improved handler for Ruby (or C) then do so.

So, it isn't really meant to be a replacement for make, but if you keep your project simple enough (every .c file has a corresponding .h and nothing fancy needs to be done) then maybe you can have a big program which is build with amkel. I'm not holding my breath though. I will add stuff to it as I need stuff (e.g. currently the only C library recognised is the standard maths library, but they are easy to add), I think there are plenty of more interesting things to do than to work on build systems.

Get the source here: amkel initial release version epsilon (GPL2)

Feel free to email me patches and I'll do my best to keep a current version available here.

 

About

I'm a nerd living in Sydney. This is a place where I can write stuff about my interests and not care that no one else is reading.

I like music, maths, programming, pretty pictures, filters and other good things.

(more info)

It should be fairly obvious that this isn't connected to my employer at all.

Email me (not a catchpa)

Email policy

Subscribe

RSS Feed RSS

Get an aggregator

Liferea (Linux)

Vienna (OSX)

Feedreader (Windows)

Google Reader (Web based)

I've only used Liferea, so I can't vouch for the other ones.

About this site

This site runs a (modified) version of blosxom.

The host is GeekISP, and they seem to do an excellent job.