As of SVN rev 21 I consider the project ready. Lots of fixes and improvements still have to be made for this to be really usable, but that will happen apart from the Summer of Code program (as will I ;)).
The schedule was heavily modified due to some caught exceptions: I'm leaving for the U.S. on Wednesday for a scholarship at the University of Arizona, and because of legal restrictions I had to finish SoC before then. I therefore declare it done, and anyone from the IRS reading this may note that I will do no more paid work for this year's SoC ;)
Now I'm diving into my suitcases! Have fun at GUADEC, everyone!
Monday, July 16, 2007
Sunday, July 15, 2007
July 15th, the boosting day
I checked in the code responsible for doing that :
into the project's SVN at code.google.com. I'm really excited. This is so cool. You can try it at home by checking out the code from SVN and compiling it. Remember that the hack lives mostly inside libfspot (and the UI part in f-spot itself), so you need to build it! A word of caution: do not expect this to be anywhere near OpenCV. The detection algorithm is based on classifiers that have been trained on a very limited set of input data (I'm kind of in a rush), and the training algorithm itself needs some more love. But the results are promising, I think. I could use some community input on how to handle the UI integration; I'm currently putting together something very rough, just to get basic functionality.
Thursday, July 12, 2007
Matlab week? sic!
Who said Matlab is easier to debug? Jumping gods, I was really frustrated with this one. It seems that I finally completed the AdaBoost training code, the classifiers are trained, and I'm getting some positive results. That's mostly thanks to the revised version of the Viola & Jones paper that I found yesterday evening, hiding on my hard drive actually! I trained 15 classifiers and put them into 3 stages of a cascade, and I'm getting this (!!!):
Wow, so I'm back on top of it, now rewriting it in C! :-)
Friday, July 6, 2007
Matlab day
Well, yesterday, after Larry Ewing, my mentor, helped me out with the scaling problem I had, we reached the point where the detections are located correctly in f-spot, screenshot here.
Today I'm working on the Matlab code of the algorithm, trying to glue something together, but I don't feel like I understand the whole thing properly. I'll try to pseudocode the algorithm for people to comment on.
First, a somewhat abstract description.
The algorithm uses a cascade of classifiers, where a classifier is described by:
* a rectangular feature (f)
* parity (p)
* threshold (theta)
* error (alpha)
The cascade is built of several stages, which in turn are made up of a number of classifiers. The key property of a classifier is that it has a very low false negative rate, so if it considers the tested region not to be a face, we can immediately stop processing that region and move on to the next one.
When a classifier classifies the region as a face, the region is tested by the following classifiers and stages until it reaches the end of the cascade or some classifier along the way reports that it's not a face after all. That way, computation time is greatly reduced for most regions.
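The early-rejection idea can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the names `Stage`, `stage_passes` and `run_cascade` are hypothetical and not taken from the f-spot code, and rejection is decided per stage by a weighted vote of its classifiers against a stage threshold (as in the Viola-Jones paper) rather than by each classifier on its own.

```python
from typing import Callable, List, Sequence, Tuple

# A weak classifier maps an image window to 1 (face) or 0 (not a face).
WeakClassifier = Callable[[Sequence[Sequence[int]]], int]

# Hypothetical stage layout: (alpha, classifier) pairs plus a stage threshold.
Stage = Tuple[List[Tuple[float, WeakClassifier]], float]


def stage_passes(stage: Stage, window: Sequence[Sequence[int]]) -> bool:
    """Return True if the weighted vote of the stage reaches its threshold."""
    weighted, threshold = stage
    score = sum(alpha * clf(window) for alpha, clf in weighted)
    return score >= threshold


def run_cascade(stages: List[Stage], window: Sequence[Sequence[int]]) -> bool:
    """Reject the window at the first failing stage; accept if all stages pass."""
    for stage in stages:
        if not stage_passes(stage, window):
            return False  # early exit: later stages are never evaluated
    return True
```

Because most windows in a photo are not faces, most calls to `run_cascade` would return after the first stage or two, which is where the speed-up comes from.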
Now, to be able to detect faces at various scales and locations, Viola and Jones propose a shifting window mechanism. The detector is applied at every location and every scale, with the window shifted by 1 pixel each time. It is common for the same face to be detected several times, at slightly different locations or scales; V-J propose a simple method of merging these detections.
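To make the shifting window concrete, here is a minimal sketch that only enumerates the windows a detector would have to visit. The function name `sliding_windows` is hypothetical, and the 24-pixel base window and 1.25 scale step are assumptions borrowed from the Viola-Jones paper, not values read from the project's code.

```python
from typing import Iterator, Tuple


def sliding_windows(
    img_w: int,
    img_h: int,
    base: int = 24,            # assumed base window size in pixels
    scale_step: float = 1.25,  # assumed growth factor between scales
    shift: int = 1,            # window shift in pixels, as described above
) -> Iterator[Tuple[int, int, int]]:
    """Yield (x, y, size) for every square window position at every scale."""
    scale = 1.0
    while True:
        size = int(round(base * scale))
        if size > img_w or size > img_h:
            break  # the window no longer fits into the image
        for y in range(0, img_h - size + 1, shift):
            for x in range(0, img_w - size + 1, shift):
                yield x, y, size
        scale *= scale_step
```

Even a small image produces a large number of windows this way, which is why rejecting most of them cheaply matters so much.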
The classifier itself is a simple binary function returning true if a feature computed over the examined image window exceeds a given threshold. The classifier's error has to be accounted for here as well.
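A minimal sketch of such a classifier, following the formulation in the Viola-Jones paper: the parity decides the direction of the inequality, so with parity -1 the classifier fires when the feature exceeds the threshold. The function names are hypothetical, and the way the error enters (as a vote weight alpha = log(1/beta), with beta = err / (1 - err)) is an assumption based on the paper, not on the project's code.

```python
import math


def weak_classify(feature_value: float, parity: int, theta: float) -> int:
    """Return 1 (face) if parity * feature_value < parity * theta, else 0."""
    if parity not in (1, -1):
        raise ValueError("parity must be +1 or -1")
    return 1 if parity * feature_value < parity * theta else 0


def alpha_from_error(err: float) -> float:
    """Turn a weighted training error (0 < err < 1) into a vote weight."""
    if not 0.0 < err < 1.0:
        raise ValueError("err must be strictly between 0 and 1")
    beta = err / (1.0 - err)
    return math.log(1.0 / beta)
```

A classifier with a small training error gets a large alpha and so counts for more when the votes of several classifiers are combined.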
As for the feature calculation: I found several implementations, none of them very instructive, the best one perhaps being the one in Torch. It seems that the feature is a simple sum of pixel intensities (computed using the so-called integral image), although features differ in the weights they put on different regions (e.g. 1 or -1). If the rectangular sum exceeds the given threshold, the classifier reports a face.
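The integral image trick can be sketched as follows. This is an illustrative sketch, not the Torch or f-spot implementation: the function names are hypothetical, the image is assumed to be a list of rows of grey-level intensities, and only one feature type is shown (two side-by-side rectangles weighted +1 and -1).

```python
from typing import List, Sequence


def integral_image(img: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return ii where ii[y][x] is the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], x: int, y: int, w: int, h: int) -> int:
    """Sum of the pixels in the w-by-h rectangle at (x, y), using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii: List[List[int]], x: int, y: int, w: int, h: int) -> int:
    """Left half weighted +1, right half weighted -1 (w is assumed even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

Once the integral image is built, each rectangle sum costs the same four lookups regardless of the rectangle's size, so features can be evaluated at any scale in constant time.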
I'll try to post some pseudocode later, for discussion. For now, please point out any misunderstandings if you find them.
Edit:
I posted the Matlab code here: http://wytyczak.com/andrzej/upload/MatlabVJDetector/
I'll check for Octave compliance in a minute and post the code in http://wytyczak.com/andrzej/upload/OctaveVJDetector/
Octave is not compliant because it seems not to understand Matlab's sparse arrays, which are used in the feature set I'm using (reusing, actually).
Thursday, July 5, 2007
Face detection status
On Tuesday I finished hooking up f-spot with the OpenCV library. This integrates the face detection into f-spot, and we're able to mark the face candidates returned by OpenCV (although not very precisely, due to some bug that is yet to be located). It looks kind of funny, but here it is:
This shows both the detection integrated into f-spot and the original OpenCV detection (the one blending in, with the circles).
It seems like some problem with the coords, but I don't really know. The hack is mostly located in f-facedetect.c, which is linked with libfspot. You can check it out of SVN at http://code.google.com/p/facedetect-f-spot/
Anyway, I spent the last 2 days digging through the various Viola-Jones face detector articles and code. My understanding still feels incomplete, though, so if anyone has any experience in this matter and is willing to answer a couple of questions, please let me know :)
Also in the news: I pretty much have to finish this by July 18th because of my previously unplanned trip to the US, where I'm not allowed to work on the SoC project, and where I'll be until the end of September :/ If I don't make it by the 18th (which is highly probable), I'll continue working on the project, but outside of SoC.
edit: I actually have the algorithm pretty much figured out right now; it didn't take as long as I thought it would ;) Still, I'll probably have a couple of questions, so please let me know if you know anything about the V-J algorithm!