apart's devel blog: 2007

Friday, December 14, 2007

Santa's here, late bastard.

A little trip back in time, just in time. For those of you who dare, there's a recompiled version of the TabBrowse extension for OpenOffice.org, now working with the current 2.3.0 branch! You can fetch it here (Unix version only right now, sorry). It's the same thing that came out of my SoC05 endeavour with OO.o, just compiled against the current 2.3.0 office. It won't blow your mind, but hey, more is on the way!

As for face detection - running the training on an extensive set of examples and features turned out to be more time-consuming than expected. Selecting the first (!) feature takes a brief 24h :-) Hopefully the training will conclude this year...

My master's thesis is still blank, but stuff's starting to clarify in my head as work on the code advances (and it does).

Sunday, November 11, 2007

Sunday

I've done some research for my master's thesis. It seems I'll make pretty good use of the lecture slides provided by Marc Pollefeys of UNC Chapel Hill, especially his Multiple View Geometry course (http://www.cs.unc.edu/~marc/mvg/). I'm also in the process of acquiring two books: "Multiple View Geometry in Computer Vision" by Hartley & Zisserman and "Computer Vision: A Modern Approach" by Forsyth & Ponce. Those, along with a bunch of papers, should cover most of the 2D->3D stuff I'm going to do in my thesis.

F-spot - I looked into the AdaBoost algorithm of my trainer, and it seems that it concludes the training too fast (after 4-5 stages). I remember being happy with that earlier, because I got some results and the training only took a couple of hours, but now it's time to make it work its way up to 20-30 stages, like in the original V-J classifier.

I bought a DSM-G600 NAS device a couple of months back, and today I'll be trying to put uShare onto it so it will stream my photos / music / videos over UPnP to my home network. Maybe a cool idea for f-spot - adding a DPAP plugin, so it could pop up a photo library just like Rhythmbox does with music libraries.

Wednesday, November 7, 2007

Follow-up

Wow, it's been 4 months since I last blogged. Time flies - seems to be my keyword for life. For the past couple of days I've been thinking about how to improve the results of my face detection implementation, and I think the order's gonna be (I'll stick to Viola-Jones' nomenclature, if something's not clear to you please check their paper):
1. Switch to integral images and integral-image-based features
2. Fix AdaBoost (there's something wrong with my AB implementation, it's not optimal - gotta find out why)

After that I should run the training process on a larger dataset and see how it affects the results.
----
During my stay in Tucson over the summer I did some work at the University of Arizona, namely at the Arizona Simulation Technology and Education Center - ASTEC. Because of my interest in medicine I'm very happy with the time spent in that lab. I was also able to collect some data that should form the basis of my thesis - "Image processing in computer-guided surgical training". One of the interesting things I'll be doing there is a structure-from-motion application. My website http://wytyczak.com/andrzej/upload/ should have the most up-to-date stuff on that matter.

Monday, July 16, 2007

SVN, rev 21

As of SVN rev 21 I consider the project ready. Lots of fixes and improvements have to be made for this to be really usable, but that's going to happen apart (like me ;)) from the Summer of Code program.
The schedule was heavily modified due to some caught exceptions - I'm leaving for the U.S. on Wednesday for a scholarship at the University of Arizona, and because of legal restrictions I had to finish SoC before that time. I therefore declare it done, and anyone from the IRS reading this may note that no more paid work will be done by me for this year's SoC ;)

Now I'm diving into my suitcases! Have fun at GUADEC everyone!

Sunday, July 15, 2007

July 15th, the boosting day

I checked in the code responsible for doing that :

into the project's SVN at code.google.com. I'm really excited. This is so cool. You can try it at home by checking out the code from svn and compiling it. Remember that the hack lives mostly inside libfspot (and the UI part in f-spot itself), so you need to build it! A word of caution - do not expect this to be anywhere near OpenCV. The detection algorithm is based on classifiers that have been trained on a very limited set of input data (I'm kind of in a rush), and the training algorithm itself needs some more love. But the results are promising, I think. I could use some community input on how to handle the UI integration; I'm currently swinging something very rough, just to get basic functionality.

Thursday, July 12, 2007

Matlab week? sic!

Who said Matlab is easier to debug? Jumping gods, I was really frustrated with this one. It seems that I finally completed the AdaBoost training code, the classifiers are trained, and I'm getting some positive results. That's mostly thanks to the revised version of the Viola & Jones paper that I found yesterday evening, hiding on my hard drive actually! I trained 15 classifiers and put them into 3 stages of a cascade, and I'm getting this (!!!) :

Wow, so I'm back on top of it, now rewriting it in C! :-)

Friday, July 6, 2007

Matlab day

Well, yesterday after Larry Ewing, my mentor, helped me out with the scaling problem I had, we reached a point where we have the detections located correctly in f-spot, screenshot here.

Today I'm working on the Matlab code of the algorithm, trying to glue something together, but I don't feel like I understand the whole thing properly yet. I'll try to pseudocode the algorithm for people to comment on.
First, a somewhat abstract description.
The algorithm uses a cascade of classifiers, where a classifier is described by:
* a rectangular feature (f)
* parity (p)
* threshold (theta)
* weight (alpha), derived from the classifier's training error
The cascade is built of several stages, which in turn are made of a number of classifiers. The key property of a stage is that it has a very low false negative rate, so if it considers the tested region not a face, we can immediately stop processing that region and move on to the next one.
When a stage classifies the region as a face, the region is passed on to the following stages until it reaches the end of the cascade or some stage along the way reports that it's not a face after all. That way computation time is greatly reduced for most of the regions.
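The early-rejection idea can be sketched in a few lines of Python. This is only an illustration, not the code from the project: each stage is modelled as an opaque function that says "maybe a face" or "not a face", and the names `run_cascade` and `Stage` are made up for the example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical model: a stage is any function window -> bool
# (True means "could still be a face").
Stage = Callable[[Sequence[float]], bool]


def run_cascade(stages: List[Stage], window: Sequence[float]) -> Tuple[bool, int]:
    """Run the window through the stages in order.

    Returns (is_face, number_of_stages_evaluated). The window is rejected
    by the first stage that answers False, so later stages never run.
    """
    evaluated = 0
    for stage in stages:
        evaluated += 1
        if not stage(window):
            return False, evaluated  # early exit: not a face
    return True, evaluated  # survived every stage


# Toy stages working on a tiny list of "pixel" values (assumptions for the demo).
toy_stages = [
    lambda w: sum(w) > 10,   # cheap first test
    lambda w: w[0] < w[-1],  # second test, only reached if the first passes
]
```

Most windows in a photo are background, so most of them should fall out after the first stage or two, which is where the time saving comes from.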
Now, to be able to detect faces at various scales and locations, Viola and Jones propose a sliding window mechanism. The detector is applied at every location and every scale. The window is shifted each time by 1 pixel. It is common that the same face is detected several times, at slightly different locations or scales - V-J propose a simple method of merging these into a single detection.
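A rough Python sketch of the window scan and of merging duplicate hits. The 24-pixel base window and the 1.25 scale step are the values I recall from the Viola-Jones paper and should be treated as assumptions here; the function names are invented, and the merge is a simple greedy grouping that averages overlapping boxes, not necessarily what any real implementation does.

```python
from typing import Iterator, List, Tuple

Box = Tuple[float, float, float]  # (x, y, side) of a square window


def sliding_windows(img_w: int, img_h: int, base: int = 24,
                    scale_step: float = 1.25, shift: int = 1) -> Iterator[Tuple[int, int, int]]:
    """Yield (x, y, side) for every window position at every scale."""
    size = float(base)
    while int(size) <= min(img_w, img_h):
        side = int(size)
        for y in range(0, img_h - side + 1, shift):
            for x in range(0, img_w - side + 1, shift):
                yield x, y, side
        size *= scale_step  # move on to the next, larger scale


def overlaps(a: Box, b: Box) -> bool:
    """True if the two square boxes share any area."""
    ax, ay, asz = a
    bx, by, bsz = b
    return ax < bx + bsz and bx < ax + asz and ay < by + bsz and by < ay + asz


def merge_detections(dets: List[Box]) -> List[Box]:
    """Greedily group overlapping detections and average each group.

    Simplification: a detection joins the first group it overlaps,
    and groups are never merged with each other afterwards.
    """
    groups: List[List[Box]] = []
    for d in dets:
        for g in groups:
            if any(overlaps(d, other) for other in g):
                g.append(d)
                break
        else:
            groups.append([d])
    return [tuple(sum(v) / len(g) for v in zip(*g)) for g in groups]
```

For a 30x30 image this yields 49 windows at size 24 and one at size 30, which hints at how quickly the count grows on a real photo.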
The classifier itself is a simple binary function returning true if a feature over the examined image window exceeds a given threshold (the parity decides the direction of the comparison). The classifier's error has to be accounted for here as well.
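As a sketch of how the pieces (p, theta, alpha) might fit together, here is a small Python version following the formulas as I read them in the Viola-Jones paper: a weak classifier fires when p*f < p*theta, its weight is alpha = log((1 - e) / e) for training error e, and the weighted votes are compared against half of the total weight. The class and function names are made up for the example.

```python
import math
from dataclasses import dataclass
from typing import List


def alpha_from_error(err: float) -> float:
    """Weight of a weak classifier with weighted training error err (0 < err < 1)."""
    return math.log((1.0 - err) / err)


@dataclass
class WeakClassifier:
    parity: int    # +1 or -1, flips the direction of the comparison
    theta: float   # threshold on the feature value
    alpha: float   # vote weight, larger for lower training error

    def classify(self, feature_value: float) -> bool:
        # parity = -1 gives "feature exceeds threshold",
        # parity = +1 gives "feature is below threshold".
        return self.parity * feature_value < self.parity * self.theta


def strong_classify(classifiers: List[WeakClassifier],
                    feature_values: List[float]) -> bool:
    """Weighted vote: face if the firing classifiers hold at least half the weight."""
    votes = sum(c.alpha for c, f in zip(classifiers, feature_values)
                if c.classify(f))
    return votes >= 0.5 * sum(c.alpha for c in classifiers)
```

So the error does not enter the per-feature comparison at all; it only decides how much a classifier's vote counts.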
As for the feature calculation - I found several implementations, none of them very instructive, maybe the best one being the one in Torch. It seems that the feature is a simple sum of the pixel intensities inside its rectangles (computed cheaply from the so-called integral image), with different regions carrying different weights (i.e. 1 or -1). If the weighted rectangular sum exceeds the given threshold - the classifier reports a face.
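A minimal pure-Python sketch of that calculation, assuming the image is a list of rows of grey values. The integral image is padded with a zero row and column so that any rectangle sum takes four lookups; the function names and the (x, y, w, h, weight) rectangle format are invented for the example.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[int, int, int, int, int]  # (x, y, width, height, weight)


def integral_image(img: Sequence[Sequence[int]]) -> List[List[int]]:
    """ii[y][x] = sum of all pixels above and to the left of (x, y), exclusive."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], x: int, y: int, w: int, h: int) -> int:
    """Sum of the pixels in the w-by-h rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def feature_value(ii: List[List[int]], rects: Sequence[Rect]) -> int:
    """Weighted sum of rectangle sums, e.g. weights +1 and -1 for a two-rectangle feature."""
    return sum(weight * rect_sum(ii, x, y, w, h) for x, y, w, h, weight in rects)


# Tiny demo: left half minus right half of a 4x2 image.
demo = [[1, 2, 3, 4],
        [5, 6, 7, 8]]
demo_ii = integral_image(demo)
two_rect = [(0, 0, 2, 2, +1), (2, 0, 2, 2, -1)]
```

Here `feature_value(demo_ii, two_rect)` is 14 - 22 = -8: the right half is brighter than the left, and the cost of computing it does not depend on the rectangle size.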

I'll try to post some pseudocode later, for discussion. As for now, please point out any misunderstandings if you find them.

Edit :
I posted the Matlab code here : http://wytyczak.com/andrzej/upload/MatlabVJDetector/
I'll check for Octave compliance in a minute and post the code in http://wytyczak.com/andrzej/upload/OctaveVJDetector/

Octave is not compliant because it seems not to understand Matlab's sparse arrays, which are used in the feature set I'm using (reusing actually).

Thursday, July 5, 2007

Face detection status

On Tuesday I finished hooking up f-spot with the OpenCV library. This integrates the face detection into f-spot and we're able to mark the face candidates returned by OpenCV (although not very precisely, due to some bug that is yet to be located). It looks kind of funny, but here it is :

This shows both the detection integrated into f-spot and the original OpenCV detection (the one blending in, with the circles).
It seems like some problem with the coords, but I don't really know. The hack is mostly located in f-facedetect.c, which is linked into libfspot. You can check it out of the svn at http://code.google.com/p/facedetect-f-spot/

Anyway, I spent the last 2 days digging through the various Viola-Jones face detector articles and code. I still feel like I'm missing something though, so if anyone has any experience in this matter and is willing to answer a couple of questions - please let me know :)

Also in the news is that I pretty much have to finish this by July 18th, due to my previously unplanned trip to the US, where I'm not allowed to work on the SoC project, and where I'll be until the end of September :/ If (which is highly probable) I don't make it by the 18th, I'll continue working on the project, but outside of SoC.

edit: I actually have the algorithm pretty much figured out right now, it didn't take as long as I thought it would ;) Still, I'll probably have a couple of questions, so please let me know if you know anything about the V-J algorithm!

Sunday, June 10, 2007

f-spot face detection

I'm a bit behind schedule with my project, but nothing to be worried about. I took the last couple of days to get familiar with f-spot's code, and I'm still in the process, but there are already some visible effects. I managed to register a little extension, initially in the Tools menu, but due to some problems (see below) I had to switch to the Export submenu.

The code is registered as a Mono.Addins add-in and it just passes the filename of the picture being viewed to the facedetect sample program from the OpenCV library to get the following result.

Ok, now to the details.
The wiki at http://f-spot.org/Extend_F-Spot is outdated. Stephane has already coded an extension point for the Tools menu, and an extension has to implement the ITool interface. The code for that is in trunk/src/Extensions, in the ITool.cs and ToolMenuItemNode.cs files.
I'd love to use that, but...

I have no idea how to call anything inside the main program from the addin, except for the little example in the export extension, that passes the list of currently selected images to the extension. I tried transferring a similar functionality to the ITool interface, but I have no way of knowing if it works because...

Right now my environment got screwed up by an apt-get upgrade of my Ubuntu (gutsy), and for some reason make all in the svn trunk builds an unusable binary that crashes with some null pointer exception. What's weird - the binary compiled before the system upgrade works fine.

So, while waiting on some updates of the mono libs in gutsy I'll be hacking on the face detection algorithm in Octave :) I'll probably take some parts from OpenCV, it seems to do a pretty good job :

It turns out that it wasn't a bug in gutsy's libraries, but some mess I'd created in f-spot's glade file ;-) It works fine now. Thanks to Sebastian Dröge for his help :)

Init

Hi all!
This will be the place where I'll be posting any new hacks I've done, especially the progress on my face detection feature for f-spot, that I'm doing with the Google SoC this year.

-- Edit --
Since my feeds are now syndicated within planet.gnome.org I guess I'll follow the others, and say :

Hello Planet Gnome !!

:D
It's pretty cool to be listed among all the gnome hackers, whose blog entries I've been following for ca. 2 years now :)
