Frequency Tracking

Mon, 19 Nov 2007

This is a fairly simple program for doing frequency tracking. Frequency tracking is where you take a sound and try to follow the dominant pitch of that sound. For lots of sounds this dominant pitch doesn't really exist (e.g. a cymbal sound, or multiple sounds playing at once), so I chose to measure three things: the frequency, a confidence value, and whether or not a dominant pitch exists at all.

There are heaps of ways of doing this; at the moment my pet favourite is to use a filter bank. A filter bank is just a whole bunch of bandpass filters, each covering one band of the frequency spectrum. So one might cover 50Hz to 60Hz, the next 60Hz to 75Hz, etc. When I first started doing stuff in the frequency domain, I really wanted to make sure that there was no overlap between such bands. After doing lots of stuff in this area, I'm of the opinion that it isn't really possible, and even if it were it wouldn't be that good. So, while each filter tries to cover a specific area, they do overlap somewhat.

The magic numbers I've been using mean that I have 200 filters covering the spectrum from 50Hz to 2kHz (spaced in a geometric series, so they are closer together down the bottom). The filter design is pretty simple. I have 201 low pass filters, one for each band edge, which work by making each output sample a running (exponentially weighted) average of the input samples so far, i.e. y[t] = y[t-1]*alpha + x[t]*(1-alpha), where x is the input, y is the filtered output, and alpha is chosen to put the cutoff frequency in the right spot. I then apply each filter to a section of sound (512 samples long) and subtract the results of neighbouring filters to get bandpass filters. So I might subtract a 200Hz lowpass from a 210Hz lowpass to give a 200-210Hz bandpass filter. I then just measure the RMS energy in each band.

Rather than just finding the maximum energy band and declaring it the winner, I keep something resembling a probability distribution over all the bands. It starts out uniform. Then, for each section of sound, each probability is multiplied by the amount of energy present in that band, and a small constant is added (so that we don't get stuck at 0). After this we find the total of all the numbers (which in general won't add up to 1 now) and remember this total as the confidence. We then normalise the numbers so they add up to 1 again, and the largest one is considered the winner.
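The update step can be sketched roughly like this. Again, this is an illustration rather than the downloadable code: the names track_result, init_uniform and update_distribution are invented, and the value of PROB_FLOOR is a placeholder, since the post doesn't say what constant is used.

```c
/* ASSUMPTION: placeholder value for the "don't get stuck at 0" constant. */
#define PROB_FLOOR 1e-6

typedef struct {
    int    winner;      /* index of the most probable band       */
    double confidence;  /* total of the weights before normalising */
} track_result;

/* Start with every band equally likely. */
static void init_uniform(double *prob, int n)
{
    for (int i = 0; i < n; i++)
        prob[i] = 1.0 / n;
}

/* One update: multiply by band energy, add the constant, total, normalise. */
static track_result update_distribution(double *prob, const double *energy,
                                        int n)
{
    track_result r = {0, 0.0};

    for (int i = 0; i < n; i++) {
        prob[i] = prob[i] * energy[i] + PROB_FLOOR;
        r.confidence += prob[i];
    }

    /* confidence > 0 because PROB_FLOOR > 0 and energies are non-negative */
    for (int i = 0; i < n; i++) {
        prob[i] /= r.confidence;
        if (prob[i] > prob[r.winner])
            r.winner = i;
    }
    return r;
}
```

Since the distribution carries over from one section to the next, a band has to keep collecting energy to stay on top, which smooths out one-off spikes compared with just taking the loudest band each time.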

If the winner is at the high end of the spectrum, then we say the sound is unpitched. That rule seems to give the right answer in practice.

To test it, I made a program which takes a wave file and adds a sine wave at the detected pitch over the top of the original sound. I don't have many sounds with just a single instrument, so I don't have anything particularly good to test it on. Still, it seems to go ok, just not when there are multiple instruments.

You can download the pitch tracker here (C code, GPL3, builds in amkel). And you can get some wav files to test it on from youiseek.com (CC Attribution Noncommercial Share Alike) (a local band... not particularly good for testing this on, but given that my wave file loader is so picky I just went for the first thing that worked).

To build it, just compile together all the .c files with -std=c99 (or amkel pt.c). Run with the first argument being the wav file to run on.

You might also notice that I've changed my coding style. Previously I named variables and functions in camelCase; now I use under_scores (I know it is one word). It looks ugly where I interface with old code (the sound library), but I think changing/avoiding the sound library will be less work than changing/avoiding gtk. Also, it makes it very easy for me to tell which things I develop at home and which things I do at work (since I use the other convention at work).

