"GrabCut" -
Interactive Foreground Extraction using Iterated Graph Cuts
Progress Reports for Matthew Marsh
Supervisors: Prof Shaun Bangay
and Mrs Adele Lobb
Progress Report 1
16 March 2005
Previous Objectives:
- Gain
a working knowledge of the Lyx
- Read
and understand the GrabCut paper by Rother
et al in order to lead the discussion at the VRSIG meeting.
- Have
a look at the tutorials on Gimp Plugins
Progress:
- I
followed the tutorial provided in Lyx through
the help menu, and realised just how useful and powerful Lyx is.
- I
continued my research into the GrabCut
approach of image segmentation, and also read the paper by Bolykov and Jolly on interactive graph cuts. GrabCut is based upon their research, and just
extends their method of segmentation on greyscale images to colour images.
Perhaps implementing a plugin to work on
greyscale images first would be a good idea!
- I
managed to locate a template for a very basic Gimp plugin
that can be downloaded at ftp://ftp.gimp.org/pub/gimp/plugin-template/
- I
couldnt get the plugin compiling, but after
much help from Shaun the plugin finally appeared
in Gimp! So now I have my first bit of code for the plugin.
Problems:
- The
maths in the GrabCut paper is fairly complex,
and some of the exact details dont appear to be explicitly written out.
Most of the maths is based on the graph cut paper, and so maybe I should
start off using their maths.
- Getting
the plugin to compile was a major problem
especially since I only started using Linux this year, but fortunately
Shaun appears to be an expert in this regard so any further problems
regarding Linux should be able to be solved. I can also easily compile and
make the plugin now so in the future this wont
be a problem.
Objectives:
- Update
my website to include my main research project as well as my graphics
project.
- Read
up on the literature related to my project for the literature review, and
also to gain a broader fuller knowledge of image segmentation in general.
- Read
through articles regarding gimp plugins provided
on the web.
Progress Report 2
7 April 2005
Progress:
- My
website has been updated to include a brief summary of the basics of my
project.
- I
continued my study on the graph cut paper by Bolykov
and Jolly, and started writing some basic code in java (as I didnt have
access to c during the vac). This code implemented the
simplification suggested by Shaun at the last Meeting.
- The
Code has functions to create a histogram, the energy function and has
various arrays for storing image intensity and alpha values. I used
random alpha values to check weather a minimum cost did indeed correspond
to a best segmentation. The algorithms seem to be correct, but it is
difficult to test, and even very small images have a high complexity
(number of iterations to get the best segmentation).
- The
actual gimp plugin is slowly taking shape. At
the moment I can read in a data block of image data into memory, change
it and write it back. This allows me to directly work with individual
pixels.
- After
browsing through parts of the gimp development manual, it appears that a
selection can easily be made by just supplying a built in function with an
alpha layer. This will be very useful for my plugin
as an alpha map will be the output of my plugin,
and this can then easily be converted to a selection.
- I
also finished my PowerPoint slides for my first presentation.
Problems:
- When
I write a memory block to an image in gimp the rest of the image gets
overwritten by seemingly random data. Somehow this must be fixed.
- It
is difficult to test the cost function
Objectives:
- Convert
the java code into c code and incorporate this into my plugin
so that a cost function can be calculated.
- Use
random numbers to generate the opacity values, and use this alpha matte to
create a selection. (Do this several times to make sure the cost function
is working correctly)
- Start
looking for information on the max - flow algorithm.
Progress Report 3
13 April 2005
Progress:
- I
have converted my temporary java code into c code, and added this to my
basic plugin.
- The
plugin can now add an alpha layer to the image
layer.
- It
then creates a new transparent layer and provides the user with a
foreground and background brush to draw on this layer.
- I
have almost finished converting the drawn data to alpha values on the
image layer.
- The
plugin can also convert the alpha layer to a
selection.
- I
read up some more on the Max Flow algorithm, and it appears as if I might
be able to obtain an implementation of the Algorithm from Yuri Boykov and Vladimir Kolmogorov.
Problems:
- I
really battled to get writing data to the alpha layer working. My first
attempts including creating a drawable out of an
image mask, and also out of the alpha channel. Nothing seemed to work.
Finally I realised that all I had to do was use the image drawable and just write to the alpha value!!! Very
simple to do actually (Just write to every second byte in the image for
greyscale).
Objectives:
- Finish
converting the user drawn data into alpha values
- Generate
some random alpha values to make sure the cost function is working (this
time with a graphical output - the selection).
- Obtain
a working version of the max-flow algorithm (hopefully!!)
- If
this is possible I will have to study the code in depth and then
incorporate it into my project
Progress Report 4
20 April 2005
Progress:
- A
basic version of my graphical interface is working
- I
tried using random numbers for my alpha values, but this didnt really
give a good indication as to weather things were really working.
- I
managed to locate (with the help of Shaun) a c++ version of the max flow
algorithm used in GrabCut and download it.
- I
have modified my program to so that it can create nodes and individual
edge weights. These are then added to the graph provided by the MaxFlow program.
- The
MaxFlow program is very easy to work with, and
did not need to be modified in any way.
- My
program used to just calculate the overall cost. It now has functions to
calculate the nlink costs and terminal costs for
each node. This is necessary to build up the graph.
- A
beta version of my plugin is now working.
Problems:
- Once
again the build file was giving me hassles when I added the new classes
for the max flow algorithm, but Shaun managed to get c++ and c compiling
together.
- The
plugin appears to be working, but I know of at
least one place in my code (a cost function) where there are errors. The
log of 0 is calculated here and this gives infinity and must be sorted
out.
Objectives:
- Continue
with gathering references for my literature review, and obtain the papers
I already know of on ACM.
- Tidy
up my code and talk to Shaun about Triple pointers and the like, and the
general layout of my program (this is very messy at the moment).
- Consolidate
on what Ive done maybe do some writing up and definitely comment my
code.
Results
The white lines indicate the foreground and the black lines
the background. The result is not that great at the moment, but I think that
this is due to a few simple mathematical errors in my cost functions.
Progress Report 5
26 April 2005
Progress:
- I
have gathered about 25 references for my literature review, and used lyx to create a bib-tex
file, so referencing them will be very easy when I actually write the
review.
- I
have commented my code and made it neater.
- Shaun
has scheduled a meeting with me to go over my code so far and to make sure
it will be easy to update to incorporate GMMS and matting. This may
include taking a more object orientated approach to my program, and
perhaps creating some wrapper classes for histograms etc.
Objectives:
- Start
reading the references for my literature review, and making notes on each
one
- Decide
what wrapper classes are needed, and start creating them
Progress Report 6
03 May 2005
Progress:
- I
have had a meeting with Shaun, and we decided on a very basic outline of
what GMMS are, and how they could be implemented. As I understand it at
the moment, The colours (defined by the hard
constraints) for the foreground and background are each modelled in 3d space
using a colour model other than RGB. For each model, the colours are
clustered using some sort of clustering algorithm into k = 5 components.
These clusters are also weighted according to some function. Information
such as the mean and standard deviation for each cluster is also needed. Guassians for each cluster are then created, and the
overall guassian function for the foreground or
background is just a weighted sum (mixture) of the five guassian functions.
- In
the meeting we also decided to standardise all my functions so that they
relate to GrabCut and not to the graph cut
equations. I added in a constant called gamma so that the value can easily
be adjusted. The segmentation still doesn't look quite right. It must be
something to do with the weighting of boundary over region costs.
- I
have started my literature review
Objectives:
- Continue
on my literature review
- Change
my calculation of Beta from an average to the standard deviation over the
image
- Create
some nice wrapper classes for my histograms
- Continue
trying to understand GMMS
Progress Report 7
11 May 2005
Progress:
- After
some very long and extremely frustrating few days I have managed to make
two nice working wrapper classes. One for my histograms and one for my
neighbours.
- This
makes my main class much smaller and easier to understand which is great.
The wrapper classes contain all the messy code for allocating memory in
their constructors. I put the code for garbage collection into their
destructors.
- I
have located a image base that the writers of GrabCut used to test their program. This is very
useful as it contains images and their correct segmentation. This will be
very useful for testing purposes. It is the berkely
segmentation and dataset benchmark page and can be found at http://www.cs.berkeley.edu/projects/vision/grouping/segbench/
- I
had another meeting with Shaun about GMMS and he managed to locate a
working 3d version of the algorithm in Java Code.
- This
can be found at http://www-cse.ucsd.edu/users/ibayrakt/java/em/docum.html
Problems:
- While
debugging my code I used printf ("%d",
somedouble); THIS DOES NOT WORK!!!! one must use %f to get the correct results. This gave
me endless trouble until I figured out what was going on.
- With
really big pictures my plugin crashes. Maybe
something to do with allocating memory??
- My
plugin also seems to be running much slower than
it used too.
Objectives:
- I
seem to be doing far more work on my project than on my literature review,
so I really need to do some serious work on it this week.
- I
need to start changing the Java code into c code, but I will only start
doing this once my literature review is well underway. This might have to
be postponed to the holidays as exams are looming ahead.
Progress Report 8
17 May 2005
Progress:
- I have managed to
complete a first very basic draft of my literature review
Objectives:
- Continue on my literature
review
Progress Report 9
26 July 2005
Progress:
- Segmentation
results are much better than they used to be!
- This
is all to do with the cost functions
Objectives:
- Read up more on the
section in the GrabCut paper on Gaussian
Mixture Models and how they can be used to represent colour.
- Try and understand the
Java GMM code.
- Start Recoding the Java
code into a class in C++ or C that can be used to represent colours for my
plugin
- Finish my presentation
and the demo for it


Progress Report 10
02 August 2005
Progress:
- I have converted the Java
Class em.java in to hopefully equivalent C code
- Testing on the class has
been successful so far, and the Gaussian results seem to be the same with
the class in C and in Java
- Em.C
takes in a 2d array of Doubles, and the number of Gaussians. It then
calculates the means, mixing parameters and co-variances for each Gaussian
exactly what I need to implement colour
- The Array is arranged as
follows:
|
Pixel
|
Red
|
Green
|
Blue
|
|
1
|
123
|
132
|
153
|
|
2
|
234
|
242
|
034
|
|
3
|
132
|
153
|
109
|
- The Gaussian algorithm is
iterative, and uses random numbers in its calculation I will need to
decide how many iterations to perform, and the size of the random numbers
with the examples given, the input data was approximately with in the
range | 0 - 3 | range, and used
random numbers in the | 0 1| range. My program has data values between 0
and 255, so maybe larger random numbers are necessary?
- I have done a few tests
on the cost functions for the short paper. So far changing the values
either seems to have no effect at all, or totally messes up the
segmentation for very large changes. Also if a plain white image with 2
black squares is used, and only one square is painted on as foreground the
other square is not segmented. This is quite interesting and may indicate
an error in my implementation not quite sure yet.
Problems:
- ALWAYS remember to
initialize variables!!!!!! Spent a few hours to find this basic problem!
- Finally have my class
getting the means, mixing parameters and co-variances for input data from
the image just not sure what to do with them yet!
Objectives:
- I need to start working
on my short paper
- This needs to be
something original, and theres no point re-writing the Grab-Cut paper
- Some ideas so far are to
thoroughly investigate the max-flow algorithm and the graph weights to see
how changing the structure of the graph affects the segmentation
- Also investigate if the
algorithm will segment two similar foreground objects if only one is used
to clue the algorithm
Progress Report 10
10
August 2005
Progress:
- I
have fully converted my old plugin class to
allow for colour images. I have introduced a struct
called pixel that stores R,G,B values, and then I
have a 2d array storing pixels.
- There
is now a class that assigns k components to all the pixels in the unknown
region, using the Gaussian model of the foreground
- The
V term has been changed slightly, so that instead of a difference in
intensity, the difference in colour space can be calculated.
- The
below results use the correct V term (weights between nodes based on
difference in colour), but uses a U term that is always 1 (the weights
between nodes and the source and sink node)
- Even
with this very limited information the max-flow algorithm still gives good
results! (This is of course a fairly simple image)
|
Before
|
After
|
|

|

|
- I have implemented the new
U term now, by using the Gaussian parameters, but the results are not even
as good as the above one.
Problems:
- My plugin is running
extremely slowly at the moment! This is due to the iterative nature of the
Gaussian class that I am using. At the moment I perform about 10
iterations for each Gaussian, and this is very slow.
- My code for calculating
the mean and weights seems to be fine, but the determinant of a
co-variance matrix does not seem to give similar results to the Gaussian
model.
- I am not sure how to set
up the weights for the Sink and Source node now that I have introduced the
Gaussians.
Objectives:
- I need to start thinking
about my poster, and some possible rough layouts
- My plugin needs some
serious fixes!!!
Progress Report 11
16 August 2005
Progress:
- With
a bit of fiddling I managed to get the algorithm to give me some nice
results! For some unknown reason I had to negate the output of the
Regional cost function and set the Gamma value to about 2000 from 50.
- Just
changing values to get good results is definitely not satisfactory, and I
definitely to go over the regional cost function and GMM stuff with Shaun.
- At
the moment the iterations seem to improve the segmentation results,
however I have coded the Plugin so that once a pixel is set as background
it can never be retracted. This is fine for the initial selection, but I
have a feeling that this should not be the case for pixels inside the
original foreground selection, as these pixels should be allowed to change
from background to foreground (and visa versa) each iteration, and so
gradually approach a globally optimum segmentation.
|
Original Image
|
Selection
|
|

|

|
|
Iteration 1
|
Iteration 2
|
Iteration 3
|
|

|

|

|
- I
have changed the plugin so that it works
according to the GrabCut paper. We decided to
allow part of the background to be set as hard background that can never
be retracted, and the rest of the pixels in the image can change from
foreground to background and visa versa through each iteration. Each pixel
in the image is assigned a component from each GMM, and this information
is used to decide which GMM Model the pixel best falls.
- I
have completed a draft of my poster
Problems:
- The
EM class that I am using does not return the determinant of a co-variance
matrix! This means that I have to somehow figure out what formula the EM
class is using, so that I can calculate this value on
each iteration. This has caused numerous problems and I still am
not calculated the correct values.
Objectives:
- Finish
my poster
- If
I have enough time, try and get the covariance matrix stuff working
Progress Report 12
23 August
2005
Progress:
- I
have finished my poster
- After
much frustration with the old EM class, mainly to do with the creation of
the covariance matrix, which is not in fact a co-variance matrix at all, Shaun and I have decided that it would be better
to rewrite the Gaussian Mixture Model from scratch.
- I
have written most of the class, just need to do some testing and debugging
to make sure it works
Problems:
Objectives:
- Finish
Rewriting the Gaussian Class
Progress Report 13
30 August
2005
Progress:
- With
much help from Shaun the Gaussian Class was completed
- Just
for fun I implemented a version of Grab Cut that creates colour
histograms. This is very fast, and the results are pretty good as good
as the black and white results.
- The
Gaussian class wasnt working well, so I hard coded in some starting
parameters into the code, and the results were reasonable.
- The
processes of iterating through all the steps was very slow, so I improved
it by buffering the inverse matrix and determinants which significantly
increased performance, but the performance cost of this method of
segmentation definitely does not out-weigh the simpler colour histogram
approach.
Problems:
- The
Gaussian algorithm seems extremely dependent on starting conditions, and
tends to create very large Gaussians that encompass all the data.
Objectives:
- Start
work on my short paper
|
Colour Histogram Input
|
OutPut: 2s
|
|

|

|
|
Gaussian Input
|
Ouput: 34s, 5 iterations
|
|

|

|
Progress Report 14
06 September
2005
Progress:
- I
have completed a draft of my short paper
Problems:
- Images
in LYX! I have tried using figure floats, but the diagrams are always
placed over text?
Objectives:
- Continue
work on my short paper, and get figures working!
Progress Report 15
20 September
2005
Progress:
- I
have completed my short paper!
Objectives:
- Continue
working on my project. Maybe some more work on Gaussian Mixture models? I
need to start deciding how much more needs to be done on my project before
I start doing the write up.
Progress Report 16
27 September 2005
Progress:
- After
a meeting with Shaun we decided to leave Gaussian Mixture models as they
are. I received an email from a post-doc student in China
who has also implemented GrabCut and has
experienced very similar problems with Gaussians. It appears that the
Maths is not explicit enough in the paper to properly implement them, or
that their reported results are incorrect.
- We
decided that the border matting feature should be implemented
- Gimp
provides a path object that will be very useful. It allows one to provide
a distance along the path and will return the point on the path.
- Unfortunately
it appears that the GIMP API has no function to convert a selection to a
path even though this can be done in GIMP.
- Gimp
Selections can be converted into border selections with a certain width
which will also be very usefull.
Problems:
- I
cant convert a selection into a path
- The
maths behind matting appears reasonable, however the energy function is
minimized by Dynamic programming and I am not sure as to how to use DP.
Objectives:
- Formulate
a way of minimizing the energy function
- Start
work on the algorithm creating a sliding window, Gaussians etc
Progress Report 17
10 October
2005
Progress:
- I
have implemented the first part of border matting
- At
the moment the energy function in minimized by using a genetic algorithm
- It
seems to perform very little matting, and just detect the boundaries that
were previously detected
- The
first 2 chapters of my thesis have been written, but are not final
Problems:
- I
am not sure if the matting is performing correctly
Objectives:
- Complete
chapter 3 and 4
- Continue
with the matting algorithm
Progress Report 18
17 October
2005
Progress:
- The
sigmoid function was not correct, and Shaun has helped me correct this.
- The
correct formula for the Sigma function is: f(x) = 1 / (1 + Exp(-(x
Delta)/sigma))
- I
have almost finished the design and implementation chapters
Problems:
- The
matting results are not satisfactory
- The
edges of the matte do not blend in smoothly with the object
Matting Results:
|

|

|

|
|

10 Windows
|

10 Windows
|

30 Windows
|
|

30 WIndows
|

10 Windows
|
|
More Progress:
- It
appears as if my genetic algorithm was performing incorrectly!
- I
used to start with random values of sigma and delta for all my genes, and
this resulted in very bad mattes see above
- These
mattes have large values for delta and sigma, and so cause discontinuous boundaries
between the images and the matte
- I
have fixed this by starting off with a delta value of 0 and sigma value of
1 for all genes
- This
starts the matte off at a good point, and minimization causes an optimal
matte as can be seen from the images below
Objectives:
- The
programming aspect of the project is finally complete, so I can focus on
the writeup
- Complete
the design and implementation chapters and hand in for assessment
- Start
writing my conclusion