"GrabCut" - Interactive Foreground Extraction using Iterated Graph Cuts

Progress Reports for Matthew Marsh

Supervisors: Prof Shaun Bangay and Mrs Adele Lobb

 

 

Progress Report 1

16 March 2005

 

Previous Objectives:

  • Gain a working knowledge of the Lyx
  • Read and understand the “GrabCut” paper by Rother et al in order to lead the discussion at the VRSIG meeting.
  • Have a look at the tutorials on Gimp Plugins

 

Progress:

  • I followed the tutorial provided in Lyx through the help menu, and realised just how useful and powerful Lyx is.
  • I continued my research into the “GrabCut” approach of image segmentation, and also read the paper by Bolykov and Jolly on interactive graph cuts. “GrabCut” is based upon their research, and just extends their method of segmentation on greyscale images to colour images. Perhaps implementing a plugin to work on greyscale images first would be a good idea!
  • I managed to locate a template for a very basic Gimp plugin that can be downloaded at  ftp://ftp.gimp.org/pub/gimp/plugin-template/
  • I couldn’t get the plugin compiling, but after much help from Shaun the plugin finally appeared in Gimp! So now I have my first bit of code for the plugin.

 

 

Problems:

  • The maths in the “GrabCut” paper is fairly complex, and some of the exact details don’t appear to be explicitly written out. Most of the maths is based on the graph cut paper, and so maybe I should start off using their maths.
  • Getting the plugin to compile was a major problem – especially since I only started using Linux this year, but fortunately Shaun appears to be an expert in this regard so any further problems regarding Linux should be able to be solved. I can also easily compile and make the plugin now so in the future this won’t be a problem.

 

Objectives:

  • Update my website to include my main research project as well as my graphics project.
  • Read up on the literature related to my project for the literature review, and also to gain a broader fuller knowledge of image segmentation in general.
  • Read through articles regarding gimp plugins provided on the web.

 

 

 

Progress Report 2

7 April 2005

 

Progress:

  • My website has been updated to include a brief summary of the basics of my project.
  • I continued my study on the graph cut paper by Bolykov and Jolly, and started writing some basic code in java (as I didn’t have access to c during the vac). This code implemented the simplification suggested by Shaun at the last Meeting.
  • The Code has functions to create a histogram, the energy function and has various arrays for storing image intensity and alpha values. I used random alpha values to check weather a minimum cost did indeed correspond to a best segmentation. The algorithms seem to be correct, but it is difficult to test, and even very small images have a high complexity (number of iterations to get the best segmentation).
  • The actual gimp plugin is slowly taking shape. At the moment I can read in a “data block” of image data into memory, change it and write it back. This allows me to directly work with individual pixels.
  • After browsing through parts of the gimp development manual, it appears that a selection can easily be made by just supplying a built in function with an alpha layer. This will be very useful for my plugin as an alpha map will be the output of my plugin, and this can then easily be converted to a selection.
  • I also finished my PowerPoint slides for my first presentation.

 

 

Problems:

  • When I write a memory block to an image in gimp the rest of the image gets overwritten by seemingly random data. Somehow this must be fixed.
  • It is difficult to test the cost function

 

Objectives:

  • Convert the java code into c code and incorporate this into my plugin so that a cost function can be calculated.
  • Use random numbers to generate the opacity values, and use this alpha matte to create a selection. (Do this several times to make sure the cost function is working correctly)
  • Start looking for information on the max - flow algorithm.

 

 

Progress Report 3

13 April 2005

 

Progress:

  • I have converted my temporary java code into c code, and added this to my basic plugin.
  • The plugin can now add an alpha layer to the image layer.
  • It then creates a new transparent layer and provides the user with a foreground and background brush to draw on this layer.
  • I have almost finished converting the drawn data to alpha values on the image layer.
  • The plugin can also convert the alpha layer to a selection.
  • I read up some more on the Max Flow algorithm, and it appears as if I might be able to obtain an implementation of the Algorithm from Yuri Boykov and Vladimir Kolmogorov.

 Problems:

  • I really battled to get writing data to the alpha layer working. My first attempts including creating a drawable out of an image mask, and also out of the alpha channel. Nothing seemed to work. Finally I realised that all I had to do was use the image drawable and just write to the alpha value!!! Very simple to do actually (Just write to every second byte in the image for greyscale).

 

Objectives:

  • Finish converting the user drawn data into alpha values
  • Generate some random alpha values to make sure the cost function is working (this time with a graphical output - the selection).
  • Obtain a working version of the max-flow algorithm (hopefully!!)
  • If this is possible I will have to study the code in depth and then incorporate it into my project

 

 

Progress Report 4

20 April 2005

 

Progress:

  • A basic version of my graphical interface is working
  • I tried using random numbers for my alpha values, but this didn’t really give a good indication as to weather things were really working.
  • I managed to locate (with the help of Shaun) a c++ version of the max flow algorithm used in “GrabCut” and download it.
  • I have modified my program to so that it can create nodes and individual edge weights. These are then added to the graph provided by the MaxFlow program.
  • The MaxFlow program is very easy to work with, and did not need to be modified in any way.
  • My program used to just calculate the overall cost. It now has functions to calculate the nlink costs and terminal costs for each node. This is necessary to build up the graph.
  • A beta version of my plugin is now working.

 

 Problems:

  • Once again the build file was giving me hassles when I added the new classes for the max flow algorithm, but Shaun managed to get c++ and c compiling together.
  • The plugin appears to be working, but I know of at least one place in my code (a cost function) where there are errors. The log of 0 is calculated here and this gives infinity and must be sorted out.

 

Objectives:

  • Continue with gathering references for my literature review, and obtain the papers I already know of on ACM.
  • Tidy up my code and talk to Shaun about Triple pointers and the like, and the general layout of my program (this is very messy at the moment).
  • Consolidate on what I’ve done – maybe do some writing up and definitely comment my code.

 

  

 

 

Results

 

The white lines indicate the foreground and the black lines the background. The result is not that great at the moment, but I think that this is due to a few simple mathematical errors in my cost functions.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Progress Report 5

26 April 2005

 

Progress:

  • I have gathered about 25 references for my literature review, and used lyx to create a bib-tex file, so referencing them will be very easy when I actually write the review.
  • I have commented my code and made it neater.
  • Shaun has scheduled a meeting with me to go over my code so far and to make sure it will be easy to update to incorporate GMMS and matting. This may include taking a more object orientated approach to my program, and perhaps creating some wrapper classes for histograms etc.

Objectives:

  • Start reading the references for my literature review, and making notes on each one
  • Decide what wrapper classes are needed, and start creating them

 

Progress Report 6

03 May 2005

 

Progress:

  • I have had a meeting with Shaun, and we decided on a very basic outline of what GMMS are, and how they could be implemented. As I understand it at the moment, The colours (defined by the hard constraints) for the foreground and background are each modelled in 3d space using a colour model other than RGB. For each model, the colours are clustered using some sort of clustering algorithm into k = 5 components. These clusters are also weighted according to some function. Information such as the mean and standard deviation for each cluster is also needed. Guassians for each cluster are then created, and the overall guassian function for the foreground or background is just a weighted sum (mixture) of the five guassian functions.
  • In the meeting we also decided to standardise all my functions so that they relate to GrabCut and not to the graph cut equations. I added in a constant called gamma so that the value can easily be adjusted. The segmentation still doesn't look quite right. It must be something to do with the weighting of boundary over region costs.
  • I have started my literature review

 

Objectives:

  • Continue on my literature review
  • Change my calculation of Beta from an average to the standard deviation over the image
  • Create some nice wrapper classes for my histograms
  • Continue trying to understand GMMS

 

Progress Report 7

11 May 2005

 

Progress:

  • After some very long and extremely frustrating few days I have managed to make two nice working wrapper classes. One for my histograms and one for my neighbours.
  • This makes my main class much smaller and easier to understand which is great. The wrapper classes contain all the messy code for allocating memory in their constructors. I put the code for garbage collection into their destructors.
  • I have located a image base that the writers of GrabCut used to test their program. This is very useful as it contains images and their correct segmentation. This will be very useful for testing purposes. It is the berkely segmentation and dataset benchmark page and can be found at http://www.cs.berkeley.edu/projects/vision/grouping/segbench/
  • I had another meeting with Shaun about GMMS and he managed to locate a working 3d version of the algorithm in Java Code.
  • This can be found at http://www-cse.ucsd.edu/users/ibayrakt/java/em/docum.html

Problems:

  • While debugging my code I used printf ("%d", somedouble); THIS DOES NOT WORK!!!! one must use %f to get the correct results. This gave me endless trouble until I figured out what was going on.
  • With really big pictures my plugin crashes. Maybe something to do with allocating memory??
  • My plugin also seems to be running much slower than it used too.

 

Objectives:

  • I seem to be doing far more work on my project than on my literature review, so I really need to do some serious work on it this week.
  • I need to start changing the Java code into c code, but I will only start doing this once my literature review is well underway. This might have to be postponed to the holidays as exams are looming ahead.

 

Progress Report 8

17 May 2005

 

Progress:

  • I have managed to complete a first very basic draft of my literature review

Objectives:

  • Continue on my literature review

 

Progress Report 9

26 July 2005

 

Progress:

  • Segmentation results are much better than they used to be!
  • This is all to do with the cost functions

Objectives:

  • Read up more on the section in the “GrabCut” paper on Gaussian Mixture Models and how they can be used to represent colour.
  • Try and understand the Java GMM code.
  • Start Recoding the Java code into a class in C++ or C that can be used to represent colours for my plugin
  • Finish my presentation and the demo for it

 

Progress Report 10

02 August 2005

 

Progress:

  • I have converted the Java Class em.java in to hopefully equivalent C code
  • Testing on the class has been successful so far, and the Gaussian results seem to be the same with the class in C and in Java
  • Em.C takes in a 2d array of Doubles, and the number of Gaussians. It then calculates the means, mixing parameters and co-variances for each Gaussian – exactly what I need to implement colour
  • The Array is arranged as follows:

Pixel

Red

Green

Blue

1

123

132

153

2

234

242

034

3

132

153

109

  • The Gaussian algorithm is iterative, and uses random numbers in its calculation – I will need to decide how many iterations to perform, and the size of the random numbers – with the examples given, the input data was approximately with in the range | 0 - 3 |  range, and used random numbers in the | 0 – 1| range. My program has data values between 0 and 255, so maybe larger random numbers are necessary?
  • I have done a few tests on the cost functions for the short paper. So far changing the values either seems to have no effect at all, or totally messes up the segmentation for very large changes. Also if a plain white image with 2 black squares is used, and only one square is painted on as foreground the other square is not segmented. This is quite interesting and may indicate an error in my implementation – not quite sure yet.

Problems:

  • ALWAYS remember to initialize variables!!!!!! Spent a few hours to find this basic problem!
  • Finally have my class getting the means, mixing parameters and co-variances for input data from the image – just not sure what to do with them  yet!

Objectives:

  • I need to start working on my short paper
  • This needs to be something original, and there’s no point re-writing the Grab-Cut paper
  • Some ideas so far are to thoroughly investigate the max-flow algorithm and the graph weights to see how changing the structure of the graph affects the segmentation
  • Also investigate if the algorithm will segment two similar foreground objects if only one is used to clue the algorithm

 

Progress Report 10

10 August 2005

 

Progress:

  • I have fully converted my old plugin class to allow for colour images. I have introduced a struct called pixel that stores R,G,B values, and then I have a 2d array storing pixels.
  • There is now a class that assigns k components to all the pixels in the unknown region, using the Gaussian model of the foreground
  • The V term has been changed slightly, so that instead of a difference in intensity, the difference in colour space can be calculated.
  • The below results use the correct V term (weights between nodes based on difference in colour), but uses a U term that is always 1 (the weights between nodes and the source and sink node)
  • Even with this very limited information the max-flow algorithm still gives good results! (This is of course a fairly simple image)

 

Before

After

  • I have implemented the new U term now, by using the Gaussian parameters, but the results are not even as good as the above one.

 

Problems:

  • My plugin is running extremely slowly at the moment! This is due to the iterative nature of the Gaussian class that I am using. At the moment I perform about 10 iterations for each Gaussian, and this is very slow.
  • My code for calculating the mean and weights seems to be fine, but the determinant of a co-variance matrix does not seem to give similar results to the Gaussian model.
  • I am not sure how to set up the weights for the Sink and Source node now that I have introduced the Gaussians.

 

Objectives:

  • I need to start thinking about my poster, and some possible rough layouts
  • My plugin needs some serious fixes!!!

 

Progress Report 11

16 August 2005

 

Progress:

  • With a bit of fiddling I managed to get the algorithm to give me some nice results! For some unknown reason I had to negate the output of the Regional cost function and set the Gamma value to about 2000 from 50.
  • Just changing values to get good results is definitely not satisfactory, and I definitely to go over the regional cost function and GMM stuff with Shaun.
  • At the moment the iterations seem to improve the segmentation results, however I have coded the Plugin so that once a pixel is set as background it can never be retracted. This is fine for the initial selection, but I have a feeling that this should not be the case for pixels inside the original foreground selection, as these pixels should be allowed to change from background to foreground (and visa versa) each iteration, and so gradually approach a globally optimum segmentation.

Original Image

Selection

 

Iteration 1

Iteration 2

Iteration 3

 

  • I have changed the plugin so that it works according to the “GrabCut” paper. We decided to allow part of the background to be set as hard background that can never be retracted, and the rest of the pixels in the image can change from foreground to background and visa versa through each iteration. Each pixel in the image is assigned a component from each GMM, and this information is used to decide which GMM Model the pixel best falls.
  • I have completed a draft of my poster

 

Problems:

  • The EM class that I am using does not return the determinant of a co-variance matrix! This means that I have to somehow figure out what formula the EM class is using, so that I can calculate this value on each iteration. This has caused numerous problems and I still am not calculated the correct values.

 

Objectives:

  • Finish my poster
  • If I have enough time, try and get the covariance matrix stuff working

 

Progress Report 12

23 August 2005

 

Progress:

  • I have finished my poster
  • After much frustration with the old EM class, mainly to do with the creation of the covariance matrix, which is not in fact a co-variance matrix at all, Shaun and I have decided that it would be better to rewrite the Gaussian Mixture Model from scratch.
  • I have written most of the class, just need to do some testing and debugging to make sure it works

 

Problems:

  • THE EM CLASS!!!!!!!

 

Objectives:

  • Finish Rewriting the Gaussian Class

 

Progress Report 13

30 August 2005

 

Progress:

  • With much help from Shaun the Gaussian Class was completed
  • Just for fun I implemented a version of Grab Cut that creates colour histograms. This is very fast, and the results are pretty good – as good as the black and white results.
  • The Gaussian class wasn’t working well, so I hard coded in some starting parameters into the code, and the results were reasonable.
  • The processes of iterating through all the steps was very slow, so I improved it by buffering the inverse matrix and determinants which significantly increased performance, but the performance cost of this method of segmentation definitely does not out-weigh the simpler colour histogram approach.

 

Problems:

  • The Gaussian algorithm seems extremely dependent on starting conditions, and tends to create very large Gaussians that encompass all the data.

 

Objectives:

  • Start work on my short paper

 

Colour Histogram Input

OutPut: 2s

 

Gaussian Input

Ouput: 34s, 5 iterations

 

 

Progress Report 14

06 September 2005

 

Progress:

  • I have completed a draft of my short paper

 

Problems:

  • Images in LYX! I have tried using figure floats, but the diagrams are always placed over text?

 

Objectives:

  • Continue work on my short paper, and get figures working!

 

 

Progress Report 15

20 September 2005

 

Progress:

  • I have completed my short paper!

 

Objectives:

  • Continue working on my project. Maybe some more work on Gaussian Mixture models? I need to start deciding how much more needs to be done on my project before I start doing the write up.

 

 

Progress Report 16

27 September 2005

 

Progress:

  • After a meeting with Shaun we decided to leave Gaussian Mixture models as they are. I received an email from a post-doc student in China who has also implemented GrabCut and has experienced very similar problems with Gaussians. It appears that the Maths is not explicit enough in the paper to properly implement them, or that their reported results are incorrect.
  • We decided that the border matting feature should be implemented
  • Gimp provides a path object that will be very useful. It allows one to provide a distance along the path and will return the point on the path.
  • Unfortunately it appears that the GIMP API has no function to convert a selection to a path even though this can be done in GIMP.
  • Gimp Selections can be converted into border selections with a certain width which will also be very usefull.

 

 

Problems:

  • I can’t convert a selection into a path
  • The maths behind matting appears reasonable, however the energy function is minimized by Dynamic programming and I am not sure as to how to use DP.

 

Objectives:

  • Formulate a way of minimizing the energy function
  • Start work on the algorithm – creating a sliding window, Gaussians etc

 

 

Progress Report 17

10 October 2005

 

Progress:

  • I have implemented the first part of border matting
  • At the moment the energy function in minimized by using a genetic algorithm
  • It seems to perform very little matting, and just detect the boundaries that were previously detected
  • The first 2 chapters of my thesis have been written, but are not final

 

Problems:

  • I am not sure if the matting is performing correctly

 

Objectives:

  • Complete chapter 3 and 4
  • Continue with the matting algorithm

 

 

Progress Report 18

17 October 2005

 

Progress:

  • The sigmoid function was not correct, and Shaun has helped me correct this.
  • The correct formula for the Sigma function is: f(x) = 1 / (1 + Exp(-(x – Delta)/sigma))
  • I have almost finished the design and implementation chapters

 

Problems:

  • The matting results are not satisfactory
  • The edges of the matte do not blend in smoothly with the object

 

Matting Results:

 

10 Windows

10 Windows

30 Windows

30 WIndows

10 Windows

 

 

 

More Progress:

  • It appears as if my genetic algorithm was performing incorrectly!
  • I used to start with random values of sigma and delta for all my genes, and this resulted in very bad mattes – see above
  • These mattes have large values for delta and sigma, and so cause discontinuous boundaries between the images and the matte
  • I have fixed this by starting off with a delta value of 0 and sigma value of 1 for all genes
  • This starts the matte off at a good point, and minimization causes an optimal matte as can be seen from the images below

 

 

 

 

 

 

Objectives:

  • The programming aspect of the project is finally complete, so I can focus on the writeup
  • Complete the design and implementation chapters and hand in for assessment
  • Start writing my conclusion