Robust Hand Detection Via Computer Vision

July 02, 2024 Post a Comment

I am currently working on a system for robust hand detection. The first step is to take a photo of the hand (in HSV color space) with the hand placed in a small rectangle to determ

Solution 1:

Have you taken a look at the camshift paper by Gary Bradski? You can download it from here

I used the the skin detection algorithm a year ago for detecting skin regions for hand tracking and it is robust. It depends on how you use it.

The first problem with using color for tracking is that it is not robust to lighting variations or like you mentioned, when people have different skin tones. However this can be solved easily as mentioned in the paper by:

Convert image to HSV color space.
Throw away the V channel and consider the H and S channel and hence discount for lighting variations.
Threshold pixels with low saturation due to their instability.
Bin the selected skin region into a 2D histogram. (OpenCV"s calcHist function) This histogram now acts as a model for skin.
Compute the "backprojection" (i.e. use the histogram to compute the "probability" that each pixel in your image has the color of skin tone) using calcBackProject. Skin regions will have high values.
You can then either use meanShift to look for the mode of the 2D "probability" map generated by backproject or to detect blobs of high "probability".

Throwing away the V channel in HSV and only considering H and S channels is really enough (surprisingly) to detect different skin tones and under different lighting variations. A plus side is that its computation is fast.

These steps and the corresponding code can be found in the original OpenCV book.

As a side note, I've also used Gaussian Mixture Models (GMM) before. If you are only considering color then I would say using histograms or GMM makes not much difference. In fact the histogram would perform better (if your GMM is not constructed to account for lighting variations etc.). GMM is good if your sample vectors are more sophisticated (i.e. you consider other features) but speed-wise histogram is much faster because computing the probability map using histogram is essentially a table lookup whereas GMM requires performing a matrix computation (for vector with dimension > 1 in the formula for multi-dimension gaussian distribution) which can be time consuming for real time applications.

So in conclusion, if you are only trying to detect skin regions using color, then go with the histogram method. You can adapt it to consider local gradient as well (i.e. histogram of gradients but possibly not going to the full extent of Dalal and Trigg's human detection algo.) so that it can differentiate between skin and regions with similar color (e.g. cardboard or wooden furniture) using the local texture information. But that would require more effort.

For sample source code on how to use histogram for skin detection, you can take a look at OpenCV"s page here. But do note that it is mentioned on that webpage that they only use the hue channel and that using both hue and saturation would give better result.

For a more sophisticated approach, you can take a look at the work on "Detecting naked people" by Margaret Fleck and David Forsyth. This was one of the earlier work on detecting skin regions that considers both color and texture. The details can be found here.

A great resource for source code related to computer vision and image processing, which happens to include code for visual tracking can be found here. And not, its not OpenCV.

Hope this helps.

Solution 2:

Here is a paper on adaptive gaussian mixture model skin detection that you might find interesting.

Also, I remember reading a paper (unfortunately I can't seem to track it down) that used a very clever technique, but it required that you have the face in the field of view. The basic idea was detect the person's face, and use the skin patch detected from the face to identify the skin color automatically. Then, use a gaussian mixture model to isolate the skin pixels robustly.

Finally, Google Scholar may be a big help in searching for state of the art in skin detection. It's heavily researched in adademia right now as well as used in industry (e.g., Google Images and Facebook upload picture policies).

Solution 3:

I have worked on something similar 2 years ago. You can try with Particle Filter (Condensation), using skin color pixels as input for initialization. It is quite robust and fast. The way I applied it for my project is at this link. You have both a presentation (slides) and the survey. If you initialize the color of the hand with the real color extracted from the hand you are going to track you shouldn't have any problems with black people.

For particle filter I think you can find some code implementation samples. Good luck.

Solution 4:

It will be hard for you to find skin tone based on color only. First of all, it depends strongly on the automatic white balance algorithm. For example, in this image, any person can see that the color is skin tone. But for the computer it will be blue. enter image description here

Second, correct color calibration in digital cameras is a hard thing, and it will be rarely accurate enough for your purposes. You can see www.DPReview.com, to understand what I mean.

In conclusion, I truly believe that the color by itself can be an input, but it is not enough.

Solution 5:

Well my experience with the skin modeling are bad, because: 1) lightning can vary - skin segmentation is not robust 2) it will mark your face also (as other skin-like objects)

I would use machine learning techniques like Haar training, which, in my opinion, if far more better approach than modeling and fixing some constraints (like skin detection + thresholding...)

Python Playground