I am currently using a Project Tango tablet for robotic obstacle avoidance. I want to create a matrix of z-values as they would appear on the Tango screen, so that I can use OpenCV to process the matrix. When I say z-values, I mean the distance each point is from the Tango. However, I don't know how to extract the z-values from the TangoXyzIjData and organize the values into a matrix. This is the code I have so far:
public void action(TangoPoseData poseData, TangoXyzIjData depthData) {
byte[] buffer = new byte[depthData.xyzCount * 3 * 4];
FileInputStream fileStream = new FileInputStream(
depthData.xyzParcelFileDescriptor.getFileDescriptor());
try {
fileStream.read(buffer, depthData.xyzParcelFileDescriptorOffset, buffer.length);
fileStream.close();
} catch (IOException e) {
e.printStackTrace();
}
Mat m = new Mat(depthData.ijRows, depthData.ijCols, CvType.CV_8UC1);
m.put(0, 0, buffer);
}
Does anyone know how to do this? I would really appreciate help.
The short answer is it can't be done, at least not simply. The XYZij struct in the Tango API does not work completely yet. There is no "ij" data. Your retrieval of buffer will work as you have it coded. The contents are a set of X, Y, Z values for measured depth points, roughly 10000+ each callback. Each X, Y, and Z value is of type float, so not CV_8UC1. The problem is that the points are not ordered in any way, so they do not correspond to an "image" or xy raster. They are a random list of depth points. There are ways to get them into some xy order, but it is not straightforward. I have done both of these:
render them to an image, with the depth encoded as color, and pull out the image as pixels
use the model/view/perspective from OpenGL and multiply out the locations of each point and then figure out their screen space location (like OpenGL would during rendering). Sort the points by their xy screen space. Instead of the calculated screen-space depth just keep the Z value from the original buffer.
or
wait until (if) the XYZij struct is fixed so that it returns ij values.
I too wish to use Tango for object avoidance for robotics. I've had some success by simplifying the use case to be only interested in the distance of any object located at the center view of the Tango device.
In Java:
private Double centerCoordinateMax = 0.020;
private TangoXyzIjData xyzIjData;
final FloatBuffer xyz = xyzIjData.xyz;
double cumulativeZ = 0.0;
int numberOfPoints = 0;
// xyzCount is the number of points; the buffer holds three floats (x, y, z) per point.
for (int i = 0; i < xyzIjData.xyzCount * 3; i += 3) {
float x = xyz.get(i);
float y = xyz.get(i + 1);
if (Math.abs(x) < centerCoordinateMax &&
Math.abs(y) < centerCoordinateMax) {
float z = xyz.get(i + 2);
cumulativeZ += z;
numberOfPoints++;
}
}
Double distanceInMeters;
if (numberOfPoints > 0) {
distanceInMeters = cumulativeZ / numberOfPoints;
} else {
distanceInMeters = null;
}
Put simply, this code takes the average distance of the points inside a small square centered on the origin of the x and y axes.
centerCoordinateMax = 0.020 was determined to work based on observation and testing. The square typically contains 50 points in ideal conditions and fewer when held close to the floor.
I've tested this using version 2 of my tango-caminada application and the depth measuring seems quite accurate. Standing 1/2 meter from a doorway, I slid towards the open door and the distance changed from 0.5 meters to 2.5 meters, which is the distance to the wall at the end of the hallway.
Simulating a robot being navigated, I moved the device towards a trash can in the path until there was 0.5 meters of separation, then rotated left until the distance was more than 0.5 meters and proceeded forward. An oversimplified simulation, but it is the basis for object avoidance using Tango depth perception.
You can do this by using the camera intrinsics to convert the XY coordinates to normalized values. See the post "Google Tango: Aligning Depth and Color Frames"; it talks about texture coordinates, but it is exactly the same problem.
Once normalized, move to screen space x[1280,720], and then the Z coordinate can be used to generate a pixel value for OpenCV to chew on. You'll need to decide on your own how to color pixels that don't correspond to depth points, and advisedly before you use the depth information to further colorize pixels.
The main thing to remember is that the raw coordinates returned are already using the basis vectors you want, i.e. you do not want the pose attitude or location.
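A minimal sketch of that projection, using plain Java arrays rather than an OpenCV Mat. The class name DepthRaster and the parameters fx, fy, cx, cy (pinhole focal lengths and principal point, in pixels) are assumptions for illustration; on a device they would have to come from the camera intrinsics, and lens distortion is ignored here. Cells that receive no depth point are left as NaN so they can be filled in later.

```java
import java.util.Arrays;

public class DepthRaster {
    /**
     * Projects camera-frame points (packed as x0, y0, z0, x1, y1, z1, ...)
     * onto a width x height grid with a simple pinhole model.
     * Each cell holds the z value of the nearest point that lands in it,
     * or NaN when no point lands there.
     */
    public static float[][] project(float[] xyz, int pointCount,
                                    double fx, double fy, double cx, double cy,
                                    int width, int height) {
        float[][] depth = new float[height][width];
        for (float[] row : depth) {
            Arrays.fill(row, Float.NaN);
        }
        for (int p = 0; p < pointCount; p++) {
            float x = xyz[3 * p];
            float y = xyz[3 * p + 1];
            float z = xyz[3 * p + 2];
            if (z <= 0f) {
                continue; // behind the camera or invalid
            }
            // Pinhole projection to pixel coordinates.
            int u = (int) Math.round(fx * x / z + cx);
            int v = (int) Math.round(fy * y / z + cy);
            if (u < 0 || u >= width || v < 0 || v >= height) {
                continue; // falls outside the image
            }
            // Keep the closest point when several share a pixel.
            if (Float.isNaN(depth[v][u]) || z < depth[v][u]) {
                depth[v][u] = z;
            }
        }
        return depth;
    }
}
```

The resulting grid is sparse, so some form of hole filling or downscaling is probably needed before handing it to image-processing code.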
I need the ability to verify that a user has drawn a shape correctly, starting with simple shapes like circle, triangle and more advanced shapes like the letter A.
I need to be able to calculate correctness in real time, for example if the user is supposed to draw a circle but is drawing a rectangle, my hope is to be able to detect that while the drawing takes place.
There are a few different approaches to shape recognition, unfortunately I don't have the experience or time to try them all and see what works.
Which approach would you recommend for this specific task?
Your help is appreciated.
We may define "recognition" as the ability to detect features/characteristics in elements and compare them with features of known elements seen in our experience. Objects with similar features probably are similar objects. The higher the amount and complexity of the features, the greater is our power to discriminate similar objects.
In the case of shapes, we can use their geometric properties such as number of angles, the angles values, number of sides, sides sizes and so forth. Therefore, in order to accomplish your task you should employ image processing algorithms to extract such features from the drawings.
Below I present a very simple approach that shows this concept in practice. We are going to recognize different shapes using the number of corners. As I said: "The higher the amount and complexity of the features, the greater is our power to discriminate similar objects". Since we are using just one feature, the number of corners, we can differentiate only a few kinds of shapes. Shapes with the same number of corners will not be discriminated. Therefore, in order to improve the approach, you might add new features.
UPDATE:
In order to accomplish this task in real time you might extract the features in real time. If the object to be drawn is a triangle and the user is drawing the fourth side of any other figure, you know that he or she is not drawing a triangle. About the level of correctness you might calculate the distance between the feature vector of the desired object and the drawn one.
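As a rough illustration of that last point, the sketch below computes the Euclidean distance between two feature vectors. The class name FeatureDistance and the choice of features are hypothetical; in practice the features would need to be scaled so that no single one dominates the distance.

```java
public class FeatureDistance {
    /**
     * Euclidean distance between the feature vector of the expected shape
     * and that of the shape drawn so far. 0 means identical features;
     * larger values mean the drawing is further from the target.
     */
    public static double distance(double[] expected, double[] drawn) {
        if (expected.length != drawn.length) {
            throw new IllegalArgumentException("feature vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < expected.length; i++) {
            double d = expected[i] - drawn[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

With a single feature such as the corner count, a triangle (3) compared with a drawing that already has 4 corners gives a distance of 1, which could be checked against a threshold while the user is still drawing.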
Input:
The Algorithm
Scale down the input image since the desired features can be detected in lower resolution.
Segment each object to be processed independently.
For each object, extract its features, in this case, just the number of corners.
Using the features, classify the object shape.
The Software:
The software presented below was developed in Java and using Marvin Image Processing Framework. However, you might use any programming language and tools.
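import static marvin.MarvinPluginCollection.floodfillSegmentation;
import static marvin.MarvinPluginCollection.moravec;
import static marvin.MarvinPluginCollection.scale;
import java.awt.Point;
import java.util.ArrayList;
import java.util.List;
// The Marvin classes used below (MarvinImage, MarvinImageIO, MarvinSegment,
// MarvinAttributes) must also be imported from the Marvin framework.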
public class ShapesExample {
public ShapesExample(){
// Scale down the image since the desired features can be extracted
// in a lower resolution.
MarvinImage image = MarvinImageIO.loadImage("./res/shapes.png");
scale(image.clone(), image, 269);
// segment each object
MarvinSegment[] objs = floodfillSegmentation(image);
MarvinSegment seg;
// For each object...
// Skip position 0 which is just the background
for(int i=1; i<objs.length; i++){
seg = objs[i];
MarvinImage imgSeg = image.subimage(seg.x1-5, seg.y1-5, seg.width+10, seg.height+10);
MarvinAttributes output = new MarvinAttributes();
output = moravec(imgSeg, null, 18, 1000000);
System.out.println("figure "+(i-1)+":" + getShapeName(getNumberOfCorners(output)));
}
}
public String getShapeName(int corners){
switch(corners){
case 3: return "Triangle";
case 4: return "Rectangle";
case 5: return "Pentagon";
}
return null;
}
private static int getNumberOfCorners(MarvinAttributes attr){
int[][] cornernessMap = (int[][]) attr.get("cornernessMap");
int corners=0;
List<Point> points = new ArrayList<Point>();
for(int x=0; x<cornernessMap.length; x++){
for(int y=0; y<cornernessMap[0].length; y++){
// Is it a corner?
if(cornernessMap[x][y] > 0){
// This part of the algorithm discards spurious corners
// detected at almost the same position due to noise.
Point newPoint = new Point(x,y);
if(points.size() == 0){
points.add(newPoint); corners++;
}else {
boolean valid=true;
for(Point p:points){
if(newPoint.distance(p) < 10){
valid=false;
}
}
if(valid){
points.add(newPoint); corners++;
}
}
}
}
}
return corners;
}
public static void main(String[] args) {
new ShapesExample();
}
}
The software output:
figure 0:Rectangle
figure 1:Triangle
figure 2:Pentagon
Another way is to treat this as a math problem: for each point of one shape, find the smallest distance to any point of the shape you are comparing it with, and average those distances.
First you must resize the shape to match the ones in your library of shapes, and then:
function shortestDistanceSum( subject, test_subject ) {
var sum = 0;
operate( subject, function( shape ){
var smallest_distance = Infinity;
operate( test_subject, function( test_shape ){
var distance = dist( shape.x, shape.y, test_shape.x, test_shape.y );
smallest_distance = Math.min( smallest_distance, distance );
});
sum += smallest_distance;
});
var average = sum/subject.length;
return average;
}
function operate( array, callback ) {
$.each(array, function(){
callback( this );
});
}
function dist( x, y, x1, y1 ) {
return Math.sqrt( Math.pow( x1 - x, 2) + Math.pow( y1 - y, 2) );
}
var square_shape = []; // collection of vertices ({x, y}) in a square shape
var triangle_shape = []; // collection of vertices ({x, y}) in a triangle
var unknown_shape = []; // collection of vertices ({x, y}) in the shape you're comparing
var square_sum = shortestDistanceSum( square_shape, unknown_shape );
var triangle_sum = shortestDistanceSum( triangle_shape, unknown_shape );
The shape with the lowest value is the closest match.
You have two inputs - the initial image and the user input - and you are looking for a boolean outcome.
Ideally you would convert all your input data to a comparable format. Instead, you could also parameterize both types of input and use a supervised machine learning algorithm (Nearest Neighbor comes to mind for closed shapes).
The trick is in finding the right parameters. If your input is a flat image file, this could be a binary conversion. If user input is a swiping motion or pen stroke, I'm sure there are ways to capture and map this as binary but the algorithm would probably be more robust if it used data closest to the original input.
I'm using openFrameworks and OpenCV to track blobs on a webcam. I'm getting the x value of the blob centroid and tracking it. The problem is that it jumps around a lot. I'm wondering if there is a better way: compute the average position over a certain number of frames and use that number instead. It's all being computed in the draw() function.
void testApp::draw(){
ofVec2f centroid = contourFinder.blobs[0].centroid;
int width = ofGetWidth();
float pct = (float)centroid.x / (float)width;
float totFrame = fingerMovie.getTotalNumFrames ();
float gotFrame = totFrame * pct;
}
You should create a loop for N frames, sum all the coordinates you get, then divide by N.
I am not experienced with ofx, but there must be a function to get the next frame.
After the loop ends, move the camera to the average coordinate and re-initialize the loop.
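A sliding-window variant of the same idea is sketched below: instead of resetting after every N frames, it keeps the last N samples and returns their mean each frame. This is a language-agnostic illustration written in Java rather than openFrameworks C++, and the class name MovingAverage is hypothetical. It would be fed the centroid x value once per frame.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MovingAverage {
    private final int window;
    private final Deque<Double> samples = new ArrayDeque<>();
    private double sum = 0.0;

    public MovingAverage(int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        this.window = window;
    }

    /** Adds one sample and returns the mean of the most recent samples (at most `window` of them). */
    public double add(double value) {
        samples.addLast(value);
        sum += value;
        if (samples.size() > window) {
            sum -= samples.removeFirst(); // drop the oldest sample
        }
        return sum / samples.size();
    }
}
```

A larger window gives a steadier value but makes it lag further behind the real blob position, so the window size is a trade-off to tune by eye.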
Can anyone please help.
I have a cube which I made in 3DS Max. I don't know the dimensions of the cube. Is there a way to get the vertices of each of the triangles of the faces of the cube? I am trying to get the normal to one of the faces of the cube to determine which way it's pointing. If I have 3 vertices of a face, V1, V2 and V3, ordered counterclockwise, I can obtain the direction of the normal by computing (V2 - V1) x (V3 - V1), where x is the cross product of the two vectors.
I have looked in my model's .fbx file and I can see a number of values there:
Vertices: *24 {
a: -15,-12.5,0,15,-12.5,0,-15,12.5,0,15,12.5,0,-15,-12.5,0.5,15,-12.5,0.5,-15,12.5,0.5,15,12.5,0.5}
PolygonVertexIndex: *36 {
a: 0,2,-4,3,1,-1,4,5,-8,7,6,-5,0,1,-6,5,4,-1,1,3,-8,7,5,-2,3,2,-7,6,7,-4,2,0,-5,4,6,-3}
Are these my model's vertices?
Also, I would assume that Vertices: *24 is my list of vertices, but why are there only 24? Should a cube not have 36 vertices? And finally, if the coordinates of my vertices are in PolygonVertexIndex: *36, these values just seem off to me when I imagine a cube with those dimensions.
Or alternatively, is there an automatic way to get the vertices of a cube without having to manually enter all the values for each vertex? I might have a couple of models to
Any help would be greatly appreciated
I can't figure out why you need that, because when you load a model the normals are already calculated; internally each vertex has its normal.
Anyway, it is easy to calculate.
The first three indexes define the first triangle of a face; the next three, the other triangle of that face.
You need only one triangle to calculate the normal.
So use the three indexes to access the vertex array and get three points: A, B and C.
Now your normal is the cross product of two vectors formed from those vertices.
Vector3 Normal = Vector3.Cross(B-A, C-B);
Whether the normal points backward or forward depends on the A, B, C order, which can be counterclockwise or clockwise, but every triangle of the model will be ordered the same way. So you will have to try it and fix it.
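The steps above can be sketched against the values from the question. Two things to know about the file: the 24 numbers under Vertices are 8 positions of 3 coordinates each, and the 36 entries under PolygonVertexIndex are 12 triangles of 3 indices each. FBX marks the last index of every polygon by storing it as a negative number, and the real index is recovered with a bitwise NOT (so -4 means 3). The class and method names below are hypothetical, and the sketch is in Java rather than C#/XNA.

```java
public class FbxFaceNormal {
    /** FBX stores the last index of each polygon as ~index (i.e. -index - 1). */
    public static int decodeIndex(int raw) {
        return raw < 0 ? ~raw : raw;
    }

    /** Returns the (x, y, z) position of vertex `index` from the flat coordinate array. */
    public static double[] vertex(double[] coords, int index) {
        return new double[] {coords[3 * index], coords[3 * index + 1], coords[3 * index + 2]};
    }

    /** Unit normal of triangle (a, b, c), computed as (b - a) x (c - a). */
    public static double[] normal(double[] a, double[] b, double[] c) {
        double[] ab = {b[0] - a[0], b[1] - a[1], b[2] - a[2]};
        double[] ac = {c[0] - a[0], c[1] - a[1], c[2] - a[2]};
        double nx = ab[1] * ac[2] - ab[2] * ac[1];
        double ny = ab[2] * ac[0] - ab[0] * ac[2];
        double nz = ab[0] * ac[1] - ab[1] * ac[0];
        double length = Math.sqrt(nx * nx + ny * ny + nz * nz);
        return new double[] {nx / length, ny / length, nz / length};
    }

    public static void main(String[] args) {
        // Values copied from the question's .fbx excerpt.
        double[] coords = {
            -15, -12.5, 0,   15, -12.5, 0,   -15, 12.5, 0,   15, 12.5, 0,
            -15, -12.5, 0.5, 15, -12.5, 0.5, -15, 12.5, 0.5, 15, 12.5, 0.5};
        int[] firstTriangle = {0, 2, -4}; // first three PolygonVertexIndex entries

        double[] a = vertex(coords, decodeIndex(firstTriangle[0]));
        double[] b = vertex(coords, decodeIndex(firstTriangle[1]));
        double[] c = vertex(coords, decodeIndex(firstTriangle[2]));
        double[] n = normal(a, b, c);
        System.out.println(n[0] + ", " + n[1] + ", " + n[2]);
    }
}
```

For this first triangle the normal points along negative Z, which matches the face lying at z = 0 (the model is 30 x 25 x 0.5 units, so it is a thin slab rather than a cube).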
You can write an XNA program which reads your normals without much hassle.
If you still want to calculate them, however, use this C# code, taken from FFWD, as a guide. Check the URL for a more detailed discussion on pros and cons. Personally, I'm not too happy with the result, but for the time being it works. Of course, since this code is FFWD related (implementation of Unity's API for XNA), it does not match XNA exactly, but the mathematics remain the same.
/// <summary>
/// Recalculates the normals.
/// Implementation adapted from http://devmaster.net/forums/topic/1065-calculating-normals-of-a-mesh/
/// </summary>
public void RecalculateNormals()
{
Vector3[] newNormals = new Vector3[_vertices.Length];
// _triangles is a list of vertex indices,
// with each triplet referencing the three vertices of the corresponding triangle
for (int i = 0; i < _triangles.Length; i = i + 3)
{
Vector3[] v = new Vector3[]
{
_vertices[_triangles[i]],
_vertices[_triangles[i + 1]],
_vertices[_triangles[i + 2]]
};
Vector3 normal = Vector3.Cross(v[1] - v[0], v[2] - v[0]);
for (int j = 0; j < 3; ++j)
{
Vector3 a = v[(j+1) % 3] - v[j];
Vector3 b = v[(j+2) % 3] - v[j];
float weight = (float)Math.Acos(Vector3.Dot(a, b) / (a.magnitude * b.magnitude));
newNormals[_triangles[i + j]] += weight * normal;
}
}
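// Normalize in place by index; a foreach variable would only be a copy if Vector3 is a struct.
for (int i = 0; i < newNormals.Length; ++i)
{
newNormals[i].Normalize();
}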
normals = newNormals;
}
I'm currently working on an XNA game prototype. I'm trying to achieve an isometric view of the game world (or is it orthographic? I'm not sure which is the right term for this projection - see pictures).
The world should be a tile-based world made of cubic tiles (e.g. similar to Minecraft's world), and I'm trying to render it in 2D by using sprites.
So I have a sprite sheet with the top face of the cube, the front face and the side (visible side) face. I draw the tiles using 3 separate calls to drawSprite, one for the top, one for the side, one for the front, using a source rectangle to pick the face I want to draw and a destination rectangle to set the position on the screen according to a formula to convert from 3D world coordinates to isometric (orthographic?).
(sample sprite:
)
This works well as long as I draw only the faces, but if I try to draw the fine edges of each block (as per a tile grid) I get a random rendering pattern in which some lines are overwritten by the face itself and some are not.
Please note that for my world representation, X is left to right, Y is inside screen to outside screen, and Z is up to down.
In this example I'm working only with top face-edges. Here is what I get (picture):
I don't understand why some of the lines are shown and some are not.
The rendering code I use is (note in this example I'm only drawing the topmost layers in each dimension):
/// <summary>
/// Draws the world
/// </summary>
/// <param name="spriteBatch"></param>
public void draw(SpriteBatch spriteBatch)
{
Texture2D tex = null;
// DRAW TILES
for (int z = numBlocks - 1; z >= 0; z--)
{
for (int y = 0; y < numBlocks; y++)
{
for (int x = numBlocks - 1; x >=0 ; x--)
{
myTextures.TryGetValue(myBlockManager.getBlockAt(x, y, z), out tex);
if (tex != null)
{
// TOP FACE
if (z == 0)
{
drawTop(spriteBatch, x, y, z, tex);
drawTop(spriteBatch, x, y, z, outlineTexture);
}
// FRONT FACE
if(y == numBlocks -1)
drawFront(spriteBatch, x, y, z, tex);
// SIDE FACE
if(x == 0)
drawSide(spriteBatch, x, y, z, tex);
}
}
}
}
}
private void drawTop(SpriteBatch spriteBatch, int x, int y, int z, Texture2D tex)
{
int pX = OffsetX + (int)(x * TEXTURE_TOP_X_OFFRIGHT + y * TEXTURE_SIDE_X);
int pY = OffsetY + (int)(y * TEXTURE_TOP_Y + z * TEXTURE_FRONT_Y);
topDestRect.X = pX;
topDestRect.Y = pY;
spriteBatch.Draw(tex, topDestRect, TEXTURE_TOP_RECT, Color.White);
}
I tried a different approach, creating a second three-level nested for loop after the first one, so that the top face is drawn in the first loop and the edge highlight in the second loop (I know, this is inefficient, and I should probably also avoid having a method call for each tile to draw it, but I'm just trying to get it working for now).
The results are somehow better but still not working as expected, top rows are missing, see picture:
Any idea of why I'm having this problem? In the first approach it might be a sort of z-fighting, but I'm drawing sprites in a precise order so shouldn't they overwrite what's already there?
Thanks everyone
Whoa, sorry guys I'm an idiot :) I started the batch with SpriteBatch.begin(SpriteSortMode.BackToFront) but I didn't use any z-value in the draw.
I should have used SpriteSortMode.Deferred! It's now working fine. Thanks everyone!
Try tweaking the sizes of your source and destination rectangles by 1 or 2 pixels. I have a sneaking suspicion this has something to do with the way these rectangles are handled as sort of 'outlines' of the area to be rendered and a sort of off-by-one problem. This is not expert advice, just a fellow coder's intuition.
Looks like a sub pixel precision or scaling issue. Also try to ensure your texture/tile width/height is a power of 2 (32, 64, 128, etc.) as that could make the effect less bad as well. It's really hard to tell just from those pictures.
I don't know how/if you scale everything, but you should try to avoid rounding wherever possible (especially inside your drawTop() method). Every time you round some position/coordinate chances are good you might increase the error/random offsets. Try to use double (or better: float) coordinates instead of integer.