Smart Image Resizing

Fri, 24 Aug 2007

I recently saw a cool video demonstrating a different way to resize images. I was watching it late at night, and I didn't feel like wearing headphones so I just went without sound. At the end of the video I was suitably impressed. So I decided to make my own version of the program tonight. This is another one of those things where I do a bad, but workable job because of time restraints (this took a bit longer than the dithering algorithm, but I started at midnight and finished at about 1:30, so it is a pretty acceptable time scale).

The basic idea here is that when you resize an image, there are lots of parts which you care about, and lots that you don't. So if you have a person standing in front of a plain wall, it would look best if you concentrated all of your resizing efforts on the wall, since there isn't much detail there. This way, the resized image would have the person in perspective, and they would still look normal.

So, I wrote my image resizer to work like this:

  1. Load an image Two kids hitting a ball
  2. Find the edges (I don't know much about the maths behind what I did here - I used two 3x3 Sobel operators, one horizontal one vertical, and then took the sum of the absolute values for what they returned): Edges in two kids hitting a ball
  3. Find a path from the top of the image to the bottom, where for each step you can either go down+left, down or down+right. The path should try to go through the blackest areas in the edge picture: Cost functionPath found
  4. Chop out all the pixels along this path (or more accurately, we blend the pixels on this path with the ones just on the right, and shift the rest of the image over). This gives us an image that is one pixel narrower than the original.
  5. Repeat until it is narrow enough:Resized image

Pretty cool I think. Though I don't think my method is quite as robust as the inspiration. I might read the paper some day to see how they actually did it, rather than my speculation.

(test image from flickr, selected to make the algorithm look good, not to give general expected results)

If there is non-zero demand, I'm happy to put the source up (it is C) under GPL3 perhaps, just because it is new. But right now I want to go to bed.

I for one would like to check out that source code. It would be fun to try to hack this into Firefox's rescaler.
anonymous
2 days after story
Andy
2 days after story
Your guess of how it works is actually spot-on~ The paper discusses using other cost functions but say that in general the gradient magnitude (essentially Sobel) works.
anonymous
4 days after story
Well that was lucky :)
Andy
4 days after story
Brilliant stuff. I got it compiled first time. I noticed though in the code you initially set it to chop out 20% of the width and then you just reset it to remove 1 pixel.
4 days after story
how do you get the third image from the top? the one that looks like a ghost image or something.
4 days after story
Third image from the top represents the minimum cost of getting to a certain pixel from the top of the image. That's why there's a distinct line at the horizon and a blip in the area around the ball.
H
4 days after story
David: yeah, that was for making the example images. I probably should have set it back to 20% before putting it up, but I think it is only a useful program if you are going to play with the source code.
Andy
5 days after story
I put up some code (also GPL, but Java): http://www.semanticmetadata.net/2007/08/30/content-aware-image-resizing-gpl-implementation/
A command line resizer is integrated.

cheers,
mathias
6 days after story
Hi,

Great start -- I gotten that far on my own. However, have you tried expanding an image with this? I think if you rewatch the video you might know what's wrong with it. If not, email me and we can discuss.
35 days after story
Err, I'm not sure how that got posted like that. What I meant to say was that I couldn't have gotten that far on my own.

However, I'm sorry to say your program doesn't work like the paper. You've got a greedy algorithm written, and the problem doesn't have the greedy-choice property. Essentially, the last step in the best path might not be the cheapest last step in all paths, which means you can't pick it.

I'm using the gabeiscoding GUI tool as a source reference, in case you've already fixed this.
35 days after story
Name & email are optional. Email will not be obfuscated.
HTML tags will be removed except hyperlinks.
 

About

I'm a nerd living in Sydney. This is a place where I can write stuff about my interests and not care that no one else is reading.

I like music, maths, programming, pretty pictures, filters and other good things.

(more info)

It should be fairly obvious that this isn't connected to my employer at all.

Email me (not a catchpa)

Email policy

Subscribe

RSS Feed RSS

Get an aggregator

Liferea (Linux)

Vienna (OSX)

Feedreader (Windows)

Google Reader (Web based)

I've only used Liferea, so I can't vouch for the other ones.

About this site

This site runs a (modified) version of blosxom.

The host is GeekISP, and they seem to do an excellent job.