I'm currently working on a vision system for a UAV I am building. The goal of the system is to find target objects, which are rather well defined (see below), in a video stream that will be a 2-D flyover view of the ground. So far I have tried training and using a Haar-like feature-based cascade, à la Viola-Jones, to do the detection. I am training it with 5000+ images of the targets at different angles (perspective shifts) and ranges (sizes in the frame), but only 1900 "background" images. The results are poor: I cannot find a number of cascade stages that keeps both false positives and false negatives low.

I am looking for advice from anyone who has experience in this area as to whether I should: 1) ditch the cascade in favor of something more suitable for objects defined by their outline and color (which I've read the VJ cascade is not); 2) improve my training set for the cascade, by adding positives or background frames, organizing or shooting them better, etc.; or 3) take some other approach I haven't thought of.

A description of the targets:

  • Primary shapes: triangles, squares, circles, ellipses, etc.
  • Distinct, solid, primary (or close to) colors.
  • Smallest dimension between two and eight feet (big enough to be seen easily from a couple hundred feet AGL).
  • A single large alphanumeric (AN) in the center of the object, with its own distinct, solid, primary or near-primary color.

My goal is to use something very fast, such as the VJ cascade, to find possible objects and their associated bounding boxes, and then pass these on to finer processing routines to determine the properties (color of the object and AN, value of the AN, actual shape, and GPS location). Any advice you can give me towards completing this goal would be much appreciated. The source code I currently have is a little lengthy to post here, but is freely available should you like to see it for reference. Thanks in advance!

-JB

2 Answers

I would recommend ditching Haar classification, since you know a lot about your objects. In these cases you should start by checking what features you can use:

1) Overhead flight means, as you said, that you can basically treat these as fixed shapes on a 2D plane. There will be scaling, rotation and some minor affine distortion; how much depends a lot on how wide-angled your camera is. If it isn't particularly wide-angled, the distortion can probably be ignored. Also, you probably know your altitude, from which you can make a very good estimate of the target size in the image (scaling).

2) You know the colors, which also makes it quite easy to find the objects. If they really are well-defined primary colors, you can just filter the image by color and find the contours of what remains. If you want to do something a little more advanced (which doesn't seem necessary to me, though), you can use histogram backprojection, which in my experience is very effective and fast. Note: if you're creating the objects yourself, it would be better to use red, green and blue rather than the traditional primary colors (red, yellow and blue). Then you can simply split the image into its respective channels and apply a very high threshold to each.

3) You know the geometric shapes. I've never done this myself, but as far as I know, the options are using moments or using Hough transforms (although OpenCV only has Hough algorithms for lines and circles, so you'd have to write your own for other shapes). You might already get sufficiently good results without this step, though.
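
To make the scaling assumption in 1) concrete, here is a rough sketch of how altitude translates into expected target size in pixels. It assumes a simple pinhole camera pointing straight down with negligible lens distortion. The function name and the numbers (200 ft AGL, 60 degree horizontal field of view, 1280 px wide frames) are made up for illustration and are not taken from the question.

```python
import math


def expected_size_px(target_ft, altitude_ft, hfov_deg, image_width_px):
    """Approximate on-screen size (px) of a flat ground target seen from directly above.

    Assumes a nadir-pointing pinhole camera, so the ground footprint width is
    2 * altitude * tan(hfov / 2) and pixels map linearly onto that footprint.
    """
    footprint_ft = 2.0 * altitude_ft * math.tan(math.radians(hfov_deg) / 2.0)
    return target_ft * image_width_px / footprint_ft


# Hypothetical flight parameters, only to show the order of magnitude.
smallest = expected_size_px(2.0, 200.0, 60.0, 1280)  # 2 ft target
largest = expected_size_px(8.0, 200.0, 60.0, 1280)   # 8 ft target
print(f"expect targets between {smallest:.1f} and {largest:.1f} px across")
```

Candidates far outside that pixel range can then be rejected before any finer processing.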
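
The channel-split-and-threshold idea in 2) could look roughly like this. It is a NumPy-only sketch on a synthetic frame, assuming RGB channel order and hypothetical threshold values (`min_value`, `min_margin`) that would need tuning on real footage. With OpenCV, `cv2.inRange` followed by `cv2.findContours` would do the same job and would also separate multiple targets, which the single bounding box here does not.

```python
import numpy as np


def dominant_channel_mask(image_rgb, channel, min_value=200, min_margin=80):
    """Boolean mask of pixels where `channel` is bright and clearly exceeds the other two."""
    img = image_rgb.astype(np.int16)  # avoid uint8 wrap-around when subtracting
    target = img[..., channel]
    others = np.delete(img, channel, axis=-1).max(axis=-1)
    return (target >= min_value) & (target - others >= min_margin)


def bounding_box(mask):
    """Inclusive bounding box (x0, y0, x1, y1) around all True pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Synthetic test frame: brownish "ground" with one saturated red square.
frame = np.empty((60, 80, 3), dtype=np.uint8)
frame[:] = (120, 90, 50)
frame[10:20, 30:40] = (230, 20, 20)

red_mask = dominant_channel_mask(frame, channel=0)   # channel 0 = red in RGB order
blue_mask = dominant_channel_mask(frame, channel=2)  # nothing blue in this frame
print(bounding_box(red_mask), bounding_box(blue_mask))
```

Note that OpenCV loads images in BGR order, so the channel indices would be reversed there.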
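
For the moments option in 3), a minimal sketch of one rotation- and scale-invariant descriptor: the first Hu invariant, computed here directly with NumPy from a binary mask. OpenCV provides `cv2.moments` and `cv2.HuMoments` for the full set. The function name and the synthetic shapes are illustrative assumptions. For ideal continuous shapes the value is 1/(2π) ≈ 0.159 for a disk, 1/6 ≈ 0.167 for a square and √3/9 ≈ 0.192 for an equilateral triangle, so this one number already separates some of the shapes listed in the question.

```python
import numpy as np


def hu_phi1(mask):
    """First Hu invariant (eta20 + eta02) of the True pixels in a binary mask."""
    ys, xs = np.nonzero(mask)
    area = xs.size
    if area == 0:
        raise ValueError("mask contains no foreground pixels")
    dx = xs - xs.mean()  # coordinates relative to the centroid
    dy = ys - ys.mean()
    mu20 = float(np.sum(dx * dx))
    mu02 = float(np.sum(dy * dy))
    # Second-order central moments are normalized by area squared.
    return (mu20 + mu02) / (area * area)


# Synthetic shapes on a 101 x 101 grid.
yy, xx = np.mgrid[0:101, 0:101]
disk = (xx - 50) ** 2 + (yy - 50) ** 2 <= 30 ** 2
square = (np.abs(xx - 50) <= 25) & (np.abs(yy - 50) <= 25)
small_square = (np.abs(xx - 50) <= 10) & (np.abs(yy - 50) <= 10)

phi_disk = hu_phi1(disk)
phi_square = hu_phi1(square)
phi_small = hu_phi1(small_square)  # close to phi_square despite the size change
print(phi_disk, phi_square, phi_small)
```

The gaps between shapes are small, so on noisy masks it is probably worth combining this with further invariants rather than relying on it alone.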

If you want more specific recommendations, it would be very useful to upload a couple sample images. :)

3 Comments

Thanks a bunch for the great tip, lifesayko! Here's my quick response: I'm not making these; this is part of a competition where the hosts will make and 'hide' these targets out in the search area. We also don't know the colors beforehand, only that they're going to be in stark contrast with the brown/green land around them. I will be happy to upload a couple of sample images as soon as I figure out how to do that here.
meta.stackexchange.com/questions/28525/… :). A link will also suffice, if the competition has a website etc.
Yup, the points in 2) should actually be more than sufficient for you. You can approach it in different ways (filter out the known background, actively search for the object colors, etc.). Unless you need to differentiate between types of targets, I doubt you even need the shape matching in 3). If you do decide to do it, I would go for moments.
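
For the "filter out the known background" approach, here is a rough sketch. It assumes the terrain dominates each frame, so that the per-channel median approximates the background color. The function name and the distance threshold are hypothetical and would need tuning on real footage.

```python
import numpy as np


def contrast_mask(image_rgb, min_distance=90.0):
    """Flag pixels whose color is far (Euclidean RGB distance) from the frame's median color."""
    img = image_rgb.astype(np.float32)
    # Assumption: most pixels are terrain, so the median is a fair background estimate.
    background = np.median(img.reshape(-1, 3), axis=0)
    distance = np.linalg.norm(img - background, axis=-1)
    return distance >= min_distance


# Synthetic frame: brown/green ground, a patch of slightly different vegetation,
# and one small saturated blue target.
frame = np.empty((40, 60, 3), dtype=np.uint8)
frame[:] = (110, 100, 60)
frame[30:35, 5:10] = (80, 130, 50)   # vegetation: should NOT be flagged
frame[5:10, 20:25] = (20, 40, 220)   # target: should be flagged

mask = contrast_mask(frame)
print(int(mask.sum()), "candidate pixels")
```

Because it makes no assumption about the target colors, this fits the case where only the contrast with the ground is known in advance.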

This may already be solved, but I recently came across a paper, released with an open-source license, on generic object detection using normalised gradient features: http://mmcheng.net/bing/comment-page-9/

The details of the algorithm's performance under changes in illumination, rotation and scale may require a little digging. I can't remember off the top of my head where the original paper is.
