Robotics Algorithm Anyone?
Message
From: 04/01/2006 00:11:00
To:   03/01/2006 16:07:28

General information
Forum:      Visual FoxPro
Category:   Other, Miscellaneous
Thread ID:  01082862
Message ID: 01083000
Views:      12
>I am developing a mobile robot that uses a top mounted camera to capture information about its surroundings. To simplify the problem, I am using special blue square-faced obstacles which can be scattered around a room (all the same size). The idea is that the robot could calculate its distance from these objects and formulate a "birds eye view" map of its environment, in relation to itself.
>
>However, I am unsure how to arrive at an algorithm for calculating these distances. From the robot's perspective, it has an image of the known-size blue blocks to utilise. It knows the size of the blocks and the size of their appearance on its internal image from the camera. Somehow, it would be good to be able to "look" at the size of the images on screen, compared to their known physical size, and therefore be able to calculate the robot's distance from the objects and possibly their distance from each other. Make sense?
>
>Finally, any calculation would need to be able to handle a situation when perhaps the objects it sees are not of a known size.
>
>If anyone has any ideas in this connection, your comments would be appreciated. FYI, this is not driven by Fox <s> - this is a C++ API.

Thinking on this further, the most significant sentence in your post is the second-to-last, where you discuss the possibility of the objects not being of a known size. This makes the problem a lot harder; if this is an actual need, you're probably best off not even implementing a solution for the simple case of fixed-size objects, because it won't "scale" to the more stringent requirement.

Regardless of whether the objects are a fixed size or not, there are only two basic approaches to your task. As Dragan says, they are both trigonometric:

1. R-Theta
2. Triangulation

R-Theta - if you know the radius (R) and the azimuth angle (Theta, i.e. the bearing from any fixed "north" reference) to each object from a single fixed point, the object locations are fully determined. Your first case, where the objects are a fixed size, is a special case of this that makes the R calculation simpler.
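
As a minimal illustration in C++ (the names here are mine, not from any particular library), turning an R-Theta measurement into "birds eye view" map coordinates is just the polar-to-Cartesian transform:

#include <cmath>

struct MapPoint { double x, y; };

// Convert a range/bearing measurement into Cartesian map coordinates
// relative to the robot. Theta is in radians, measured from whatever
// fixed "north" reference you choose.
MapPoint fromRTheta(double r, double theta)
{
    return { r * std::cos(theta), r * std::sin(theta) };
}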

Assuming, as Dragan says, the camera has a fixed focal length, it's easy to calibrate: simply move an object through a series of known distances and image it at each one. You then end up with a table of pixel diameters (or other dimension) vs. distance; you can interpolate as required. You may have to do some experimenting with object images off the centre spot of the camera, as lens distortion may come into play.
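
A sketch of that lookup-table idea in C++ (since you mentioned a C++ API); the calibration pairs below are placeholders to be replaced with your own measurements:

#include <cstddef>
#include <vector>

// One calibration sample: apparent object size in pixels, and the
// measured distance at which that size was recorded.
struct CalPoint { double pixels; double distance; };

// Sorted by descending pixel size (closer = bigger image).
// Placeholder numbers - substitute your own calibration run.
static const std::vector<CalPoint> kTable = {
    { 400.0, 0.5 }, { 200.0, 1.0 }, { 100.0, 2.0 }, { 50.0, 4.0 }
};

// Linearly interpolate distance from an observed pixel diameter.
double distanceFromPixels(double px)
{
    if (px >= kTable.front().pixels) return kTable.front().distance;
    if (px <= kTable.back().pixels)  return kTable.back().distance;
    for (std::size_t i = 1; i < kTable.size(); ++i) {
        if (px >= kTable[i].pixels) {
            const CalPoint& a = kTable[i - 1];
            const CalPoint& b = kTable[i];
            double t = (px - a.pixels) / (b.pixels - a.pixels);
            return a.distance + t * (b.distance - a.distance);
        }
    }
    return kTable.back().distance; // unreachable given the guards above
}

For an ideal pinhole lens, distance is simply proportional to 1/(pixel size), so the table's main job is to absorb real-world lens distortion.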

This simple lookup-table approach will work only if the objects are all the same size. If they are not, determining R is more difficult, as there is no longer a fixed relationship between image size and radius. In that case you will need some other range cue, such as an autofocus mechanism applied separately to each object, to determine R. I don't know whether any current autofocus systems are capable of reasonable accuracy, especially as the distance approaches "infinity" for the given focal length of the lens.

Triangulation is the other approach. If you take two images of the same scene from camera locations a known distance apart (the "baseline"), the images will differ slightly due to parallax: the angle (left or right) to any given object will be different in each image. Since the baseline length is known, trigonometry gives you all the information you need to determine the location of the object relative to the baseline and/or either end point. The longer the baseline, the better the position accuracy.
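
Here is a minimal triangulation sketch, assuming camera A at the origin, camera B at (baseline, 0), and bearings alpha and beta measured at each camera from the baseline direction (the +x axis) toward the object:

#include <cmath>
#include <optional>

struct Point2D { double x, y; };

// Intersect the two sighting rays. From tan(alpha) = y/x and
// tan(beta) = y/(x - baseline), solving for x gives
// x = baseline * tan(beta) / (tan(beta) - tan(alpha)), y = x * tan(alpha).
std::optional<Point2D> triangulate(double baseline, double alpha, double beta)
{
    double ta = std::tan(alpha), tb = std::tan(beta);
    double denom = tb - ta;
    // Near-equal bearings mean the rays are (almost) parallel: the
    // object lies on or near the extended baseline, and its position
    // can't be recovered reliably.
    if (std::fabs(denom) < 1e-9) return std::nullopt;
    double x = baseline * tb / denom;
    return Point2D{ x, x * ta };
}

The degenerate return path is exactly the baseline limitation described below.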

One limitation of triangulation is that if an object lies on the baseline, or on an extension of it, its position can't be calculated. For objects close to the extended baseline, position errors will be high.

There are two main approaches to triangulation:

1. Two cameras mounted fairly close together on your robot, similar in concept to human vision, the Mars Rover stereo cameras etc. The upside is that the two images don't differ very much, which makes processing them simpler (see below); also, if the cameras can be swivelled, this can be combined with the R-Theta approach to map the whole room without having to move the robot. The downside is that the short baseline limits accuracy unless high precision is used (see the depth-from-disparity sketch after this list).

2. One camera on your robot: a picture is taken, the robot moves a known distance, and another picture is taken. The upside is the possibility of long baselines and relatively good accuracy. The downsides are poor or no accuracy near the extended baseline and potentially much more difficult image processing.
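
For the two-camera arrangement in particular, if the cameras are parallel and identical the trigonometry collapses into the standard depth-from-disparity formula. A minimal sketch, assuming the focal length is already expressed in pixels:

#include <limits>

// Depth of a point seen by two parallel cameras a known baseline
// apart: Z = focalLengthPx * baseline / disparityPx, where the
// disparity is the horizontal pixel shift of the point between the
// left and right images.
double depthFromDisparity(double focalLengthPx, double baseline,
                          double disparityPx)
{
    // Zero disparity means parallel rays - the object is effectively
    // at "infinity" for this baseline.
    if (disparityPx <= 0.0)
        return std::numeric_limits<double>::infinity();
    return focalLengthPx * baseline / disparityPx;
}

This also makes the short-baseline problem concrete: halve the baseline and the disparity halves too, so each pixel of measurement error costs twice as much depth accuracy.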

The real difficulty with triangulation comes when you have multiple objects, especially if they all look the same. Given two different images, you somehow have to "recognize" that objects in different locations on the two images are actually the same object - the classic correspondence problem. This may not be too hard if one is red, another green etc., but if they're all the same colour, even a human being (the gold standard in image recognition) may have a hard time. Computationally this is a non-trivial problem; a lot of grad student sweat has gone into it. You might approach your friendly neighbourhood university and chat with any department doing machine vision to see if they can help out. There may also be commercial stereo cameras available with the required image processing and range finding already done for you.
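
Just to make the correspondence problem concrete, the naive baseline is exhaustive block matching: slide a small patch from one image across the other and keep the offset with the lowest sum of squared differences. A rough sketch, assuming rectified grayscale images (so matching features lie on the same row):

#include <cstdint>
#include <limits>
#include <vector>

struct GrayImage {
    int w, h;
    std::vector<std::uint8_t> px; // row-major, w * h bytes
    std::uint8_t at(int x, int y) const { return px[y * w + x]; }
};

// Find the horizontal disparity of the patch whose top-left corner is
// (x, y) in `left` by brute-force search over [0, maxDisp] in `right`.
// Caller must ensure the patch fits inside both images.
int bestDisparity(const GrayImage& left, const GrayImage& right,
                  int x, int y, int patch, int maxDisp)
{
    long bestScore = std::numeric_limits<long>::max();
    int best = 0;
    for (int d = 0; d <= maxDisp && x - d >= 0; ++d) {
        long score = 0;
        for (int dy = 0; dy < patch; ++dy)
            for (int dx = 0; dx < patch; ++dx) {
                int diff = left.at(x + dx, y + dy)
                         - right.at(x - d + dx, y + dy);
                score += static_cast<long>(diff) * diff;
            }
        if (score < bestScore) { bestScore = score; best = d; }
    }
    return best;
}

With identical blue blocks, a naive matcher like this will cheerfully latch onto the wrong block, which is exactly why the problem has absorbed so much research effort.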

Despite all of the above issues, machine vision is becoming more and more viable. A couple of conspicuous successes:

- the above-mentioned Mars Exploration Rovers
- all the teams that completed the recent DARPA Grand Challenge (real-time machine vision, no less!) http://www.wired.com/wired/archive/14.01/stanley.html

**************

Another approach to this type of problem might be to use GPS and simply "visit" each object in the room. Current GPS capability is described at http://en.wikipedia.org/wiki/Global_Positioning_System, which suggests that advanced differential systems may achieve accuracies of 1 cm (!)
Regards, Al

"Violence is the last refuge of the incompetent." -- Isaac Asimov
"Never let your sense of morals prevent you from doing what is right." -- Isaac Asimov

Neither a despot, nor a doormat, be

Every app wants to be a database app when it grows up