I am currently using a Project Tango tablet for robotic obstacle avoidance. I want to create a matrix of z-values as they would appear on the Tango screen, so that I can use OpenCV to process the matrix. When I say z-values, I mean the distance each point is from the Tango. However, I don't know how to extract the z-values from the TangoXyzIjData and organize the values into a matrix. This is the code I have so far:
public void action(TangoPoseData poseData, TangoXyzIjData depthData) {
byte[] buffer = new byte[depthData.xyzCount * 3 * 4];
FileInputStream fileStream = new FileInputStream(
depthData.xyzParcelFileDescriptor.getFileDescriptor());
try {
fileStream.read(buffer, depthData.xyzParcelFileDescriptorOffset, buffer.length);
fileStream.close();
} catch (IOException e) {
e.printStackTrace();
}
Mat m = new Mat(depthData.ijRows, depthData.ijCols, CvType.CV_8UC1);
m.put(0, 0, buffer);
}
Does anyone know how to do this? I would really appreciate help.
The short answer is that it can't be done, at least not simply. The XYZij struct in the Tango API does not work completely yet: there is no "ij" data. Your retrieval of buffer will work as you have it coded. The contents are a set of X, Y, Z values for measured depth points, roughly 10,000 or more per callback. Each X, Y, and Z value is of type float, so CV_8UC1 is not the right Mat type. The problem is that the points are not ordered in any way, so they do not correspond to an "image" or xy raster. They are an unordered list of depth points. There are ways to get them into some xy order, but it is not straightforward. I have done both of these:
render them to an image, with the depth encoded as color, and pull out the image as pixels
use the model/view/perspective from OpenGL and multiply out the locations of each point and then figure out their screen space location (like OpenGL would during rendering). Sort the points by their xy screen space. Instead of the calculated screen-space depth just keep the Z value from the original buffer.
or
wait until (if) the XYZij struct is fixed so that it returns ij values.
I too wish to use Tango for object avoidance for robotics. I've had some success by simplifying the use case to be only interested in the distance of any object located at the center view of the Tango device.
In Java:
private Double centerCoordinateMax = 0.020;
private TangoXyzIjData xyzIjData;
final FloatBuffer xyz = xyzIjData.xyz;
double cumulativeZ = 0.0;
int numberOfPoints = 0;
// xyzCount is the number of points; the buffer holds three floats (x, y, z) per point.
for (int i = 0; i < xyzIjData.xyzCount * 3; i += 3) {
float x = xyz.get(i);
float y = xyz.get(i + 1);
if (Math.abs(x) < centerCoordinateMax &&
Math.abs(y) < centerCoordinateMax) {
float z = xyz.get(i + 2);
cumulativeZ += z;
numberOfPoints++;
}
}
Double distanceInMeters;
if (numberOfPoints > 0) {
distanceInMeters = cumulativeZ / numberOfPoints;
} else {
distanceInMeters = null;
}
Put simply, this code takes the average distance of the points that fall inside a small square centered on the origin of the x and y axes.
centerCoordinateMax = 0.020 was determined to work based on observation and testing. The square typically contains 50 points in ideal conditions and fewer when held close to the floor.
I've tested this using version 2 of my tango-caminada application and the depth measuring seems quite accurate. Standing half a meter from a doorway, I slid towards the open door and the distance changed from 0.5 meters to 2.5 meters, which is the wall at the end of the hallway.
Simulating a robot being navigated, I moved the device towards a trash can in the path until there was 0.5 meters of separation, then rotated left until the distance was more than 0.5 meters and proceeded forward. It is an oversimplified simulation, but it is the basis for object avoidance using Tango depth perception.
You can do this by using the camera intrinsics to convert the XY coordinates to normalized values. See the post "Google Tango: Aligning Depth and Color Frames"; it talks about texture coordinates, but it is exactly the same problem.
Once normalized, scale to screen space (1280 x 720), and then the Z coordinate can be used to generate a pixel value for OpenCV to chew on. You'll need to decide on your own how to color pixels that don't correspond to any depth point, preferably before you use the depth information to colorize further pixels.
The main thing to remember is that the raw coordinates returned are already using the basis vectors you want, i.e. you do not need the pose attitude or location.
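A minimal Java sketch of that projection, assuming a plain pinhole model with no lens distortion. The class name, the method name, and the parameters fx, fy, cx, cy (focal lengths and principal point in pixels) are illustrative assumptions and not part of the Tango API; the real values would have to come from the device's camera intrinsics. Pixels that receive no depth point are left at 0, and when several points land on the same pixel the nearest one is kept.

```java
/** Sketch: scatter unordered depth points onto a pixel grid (pinhole model, no distortion). */
final class DepthImageSketch {

    /**
     * xyz    packed points x0,y0,z0,x1,y1,z1,... in the depth camera frame (meters)
     * count  number of points (not number of floats)
     * fx, fy assumed focal lengths in pixels; cx, cy assumed principal point in pixels
     * Returns a row-major width*height array of z-values; 0 means "no measurement".
     */
    static float[] toDepthImage(float[] xyz, int count,
                                double fx, double fy, double cx, double cy,
                                int width, int height) {
        float[] depth = new float[width * height];
        for (int p = 0; p < count; p++) {
            float x = xyz[3 * p];
            float y = xyz[3 * p + 1];
            float z = xyz[3 * p + 2];
            if (z <= 0f) {
                continue; // points at or behind the camera cannot be projected
            }
            // Perspective divide, then scale and shift into pixel coordinates.
            int u = (int) Math.round(fx * x / z + cx);
            int v = (int) Math.round(fy * y / z + cy);
            if (u < 0 || u >= width || v < 0 || v >= height) {
                continue; // falls outside the image
            }
            int idx = v * width + u;
            // Keep the nearest point when two points share a pixel.
            if (depth[idx] == 0f || z < depth[idx]) {
                depth[idx] = z;
            }
        }
        return depth;
    }
}
```

The returned array is sparse, since there are far fewer depth points than pixels. If it is handed to OpenCV, a single-channel float Mat (CV_32FC1) of height rows and width columns would be the matching type rather than CV_8UC1.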
I have detected vehicles as a blob in OpenCV. Below is the blob.h file
class Blob {
public:
// member variables
std::vector<cv::Point> currentContour;
cv::Rect currentBoundingRect;
std::vector<cv::Point> centerPositions;
double dblCurrentDiagonalSize;
double dblCurrentAspectRatio;
bool blnCurrentMatchFoundOrNewBlob;
bool blnStillBeingTracked;
int intNumOfConsecutiveFramesWithoutAMatch;
cv::Point predictedNextPosition;
// function prototypes
Blob(std::vector<cv::Point> _contour);
void predictNextPosition(void);
};
What algorithm should I use to estimate the speed of the detected vehicle?
Thanks in Advance.
UPDATE
Here is the code I have tried for estimating the speed, but it doesn't draw the text, and it also crashes.
for (auto blob : blobs) {
if (blob.blnStillBeingTracked == true && blob.centerPositions.size() >= 2) {
int prevFrameIndex = (int)blob.centerPositions.size() - 2;
int currFrameIndex = (int)blob.centerPositions.size() - 1;
if (blob.centerPositions[prevFrameIndex].y > (intHorizontalLinePosition-50) && blob.centerPositions[currFrameIndex].y <= intHorizontalLinePosition) {
int distance = blob.centerPositions[currFrameIndex].y - blob.centerPositions[0].y;
int tickCount = cv::getTickCount();
int time = (tickCount - blob.firstTickCount)/cv::getTickFrequency();
int speed = distance/time;
double dblFontScale = blobs[currFrameIndex].dblCurrentDiagonalSize / 10.0;
int intFontThickness = (int)std::round(dblFontScale * 1.0);
std::cout<<"Speed: "<<speed<<std::endl;
cv::putText(img, std::to_string(speed), blobs[currFrameIndex].centerPositions.back(), CV_FONT_HERSHEY_SIMPLEX, dblFontScale, SCALAR_GREEN, intFontThickness);
}
}
}
In order to predict the vehicle's speed in a 3-dimensional space from a 2D image in the general case, you need to know the orientation of the vehicle (direction of travel) and distance from the camera.
If you know for example that the vehicle is travelling perpendicular to the direction the camera points (moving directly across the frame, not toward or away from the camera at all), you can use either
a) A known distance from the camera to the road and basic trigonometry, or
b) Markers of known distance
to calculate the velocity of the vehicle using several frames.
If you know the vehicle is travelling directly toward or directly away from the camera, you can use the change in width/height of the image outline to get a sense of the vehicle's speed. If you can also identify when the vehicle passes a landmark at a known distance from the camera, you can calculate the actual width/height of the vehicle and therefore accurately calculate the speed using that known width/height and rate of change of the size of the 2D projection of the vehicle.
Update
Given the additional information, it seems you can determine what Y position in the camera's 2D image corresponds to a particular distance down the road. If you measure two such points, you can count how long it takes for the lower bounds of currentBoundingRect to pass from the first point to the second point, e.g. in the diagram below to move from y=800 to y=200.
If it takes 2 seconds to move from y=800 to y=200, then the vehicle took 2 seconds to travel 100 m - 50 m = 50 m, so its speed is 50 m / 2 s = 25 m/s.
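The arithmetic above can be sketched in Java as follows. This is only an illustration: the class and method names are hypothetical, and it assumes you have already measured the road distance that corresponds to each of the two image rows and recorded the time (in seconds) at which the bounding rectangle crossed each row. Floating-point values are used throughout so that short time intervals do not truncate to zero.

```java
/** Sketch: speed from two calibrated image rows and the crossing times. */
final class SpeedEstimateSketch {

    /**
     * distanceAtFirstLineM / distanceAtSecondLineM: measured road distances (meters)
     * that correspond to the two image rows.
     * tFirstSeconds / tSecondSeconds: times at which the vehicle crossed each row.
     */
    static double metersPerSecond(double distanceAtFirstLineM, double distanceAtSecondLineM,
                                  double tFirstSeconds, double tSecondSeconds) {
        double dt = tSecondSeconds - tFirstSeconds;
        if (dt <= 0.0) {
            throw new IllegalArgumentException("second crossing must be later than the first");
        }
        return Math.abs(distanceAtSecondLineM - distanceAtFirstLineM) / dt;
    }

    /** Converts meters per second to kilometers per hour. */
    static double toKmPerHour(double metersPerSecond) {
        return metersPerSecond * 3.6;
    }
}
```

With the numbers from the example, metersPerSecond(50, 100, 0, 2) gives 25 m/s, which is 90 km/h.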
I'm using openFrameworks and OpenCV to track blobs on a webcam. I'm getting the x value of the blob centroid and tracking it. The problem is that it jumps around a lot. I'm wondering if there is a better way: compute the average position over a certain number of frames and use that number. It's all being computed in the draw() function.
void testApp::draw(){
ofVec2f centroid = contourFinder.blobs[0].centroid;
int width = ofGetWidth();
float pct = (float)centroid.x / (float)width;
float totFrame = fingerMovie.getTotalNumFrames ();
float gotFrame = totFrame * pct;
}
You should create a loop over N frames, sum all the coordinates you get, then divide by N.
I am not experienced with ofx, but there must be a function to get the next frame.
After the loop ends, move the camera to the average coordinate and re-initialize the loop.
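As an illustration of the averaging idea, here is a small Java sketch. Note that it is a sliding-window variant rather than the exact "sum N frames, then restart" loop described above: it keeps the last N values and returns their mean on every frame, so a smoothed position is available continuously. The class name and its API are hypothetical and are not part of openFrameworks.

```java
/** Sketch: running mean over the last N samples (sliding window). */
final class MovingAverageSketch {
    private final double[] window; // circular buffer of the most recent samples
    private int next = 0;          // slot that the next sample will overwrite
    private int filled = 0;        // number of valid samples currently stored
    private double sum = 0.0;      // sum of the valid samples

    MovingAverageSketch(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        window = new double[size];
    }

    /** Adds one sample (e.g. the centroid x of the current frame) and returns the current mean. */
    double add(double value) {
        if (filled == window.length) {
            sum -= window[next]; // drop the oldest sample once the window is full
        } else {
            filled++;
        }
        window[next] = value;
        sum += value;
        next = (next + 1) % window.length;
        return sum / filled;
    }
}
```

Calling add once per frame with the centroid's x value and using the returned mean in place of the raw value trades a little lag for less jitter; a larger window smooths more but reacts more slowly.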
Can anyone please help.
I have a cube which I have made in 3DS Max. I don't know the dimensions of the cube. Is there a way to get the vertices of each of the triangles of the faces of the cube? I am trying to get the normal to one of the faces of the cube to determine which way it's pointing. If I can determine the vertices, I can get the normal for the face: given 3 vertices V1, V2 and V3, ordered counterclockwise, I can obtain the direction of the normal by computing (V2 - V1) x (V3 - V1), where x is the cross product of the two vectors.
I have looked in my model's .fbx file and I can see a number of values there:
Vertices: *24 {
a: -15,-12.5,0,15,-12.5,0,-15,12.5,0,15,12.5,0,-15,-12.5,0.5,15,-12.5,0.5,-15,12.5,0.5,15,12.5,0.5}
PolygonVertexIndex: *36 {
a: 0,2,-4,3,1,-1,4,5,-8,7,6,-5,0,1,-6,5,4,-1,1,3,-8,7,5,-2,3,2,-7,6,7,-4,2,0,-5,4,6,-3}
Are these my model's vertices?
Also, I would assume that Vertices: *24 would be my list of vertices, but why are there only 24? Should a cube not have 36 vertices? And finally, if the coordinates for my vertices are in PolygonVertexIndex: *36, these values just seem off to me when I imagine the cube in my head with those dimensions.
Or alternatively, is there an automatic way to get the vertices of a cube without having to manually enter all the values for each vertex? I might have a couple of models to
Any help would be greatly appreciated
I can't figure out why you need that, because the normals are already calculated when you load a model: internally, each vertex has its normal.
Anyway, it is easy to calculate.
The first three indexes define the first triangle of a face, and the next three define the other triangle of that face.
You need only one triangle to calculate the normal.
So use the three indexes to access the vertex array and get three points: A, B and C.
Now your normal is the result of the cross product between two vectors formed from those vertices.
Vector3 Normal = Vector3.Cross(B-A, C-B);
Whether the normal points backward or forward depends on the order of A, B, C, which can be counterclockwise or clockwise, but every triangle of the model will be ordered the same way. So you will have to try it and fix it.
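To make this concrete for the FBX data in the question, here is a hedged Java sketch. It relies on two facts about the FBX layout: Vertices: *24 is 24 numbers, i.e. 8 vertices with 3 coordinates each, and PolygonVertexIndex: *36 is 36 indices, i.e. 12 triangles, where the last index of each polygon is stored as the bitwise NOT of the real index (so -4 means vertex 3). The class and method names are hypothetical, and the code assumes every polygon is a triangle, which holds for this file.

```java
/** Sketch: decode FBX polygon indices and compute an (unnormalized) face normal. */
final class FbxNormalSketch {

    // Data copied from the question: 8 vertices (x, y, z) and 12 triangles.
    static final double[] VERTICES = {
        -15, -12.5, 0,   15, -12.5, 0,   -15, 12.5, 0,   15, 12.5, 0,
        -15, -12.5, 0.5, 15, -12.5, 0.5, -15, 12.5, 0.5, 15, 12.5, 0.5
    };
    static final int[] POLYGON_VERTEX_INDEX = {
        0, 2, -4, 3, 1, -1, 4, 5, -8, 7, 6, -5, 0, 1, -6, 5, 4, -1,
        1, 3, -8, 7, 5, -2, 3, 2, -7, 6, 7, -4, 2, 0, -5, 4, 6, -3
    };

    /** A negative entry marks the end of a polygon; the real index is its bitwise NOT. */
    static int decodeIndex(int raw) {
        return raw < 0 ? ~raw : raw;
    }

    /** Returns vertex number index as {x, y, z}. */
    static double[] vertex(double[] coords, int index) {
        return new double[] { coords[3 * index], coords[3 * index + 1], coords[3 * index + 2] };
    }

    /** Computes (B - A) x (C - A) for the given triangle (0-based triangle number). */
    static double[] faceNormal(double[] coords, int[] indices, int triangle) {
        int base = 3 * triangle;
        double[] a = vertex(coords, decodeIndex(indices[base]));
        double[] b = vertex(coords, decodeIndex(indices[base + 1]));
        double[] c = vertex(coords, decodeIndex(indices[base + 2]));
        double ux = b[0] - a[0], uy = b[1] - a[1], uz = b[2] - a[2];
        double vx = c[0] - a[0], vy = c[1] - a[1], vz = c[2] - a[2];
        return new double[] { uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx };
    }
}
```

For the first triangle (indices 0, 2, 3) this yields a vector along negative z, i.e. the bottom of the box at z = 0; for the third triangle (indices 4, 5, 7) it points along positive z, the top at z = 0.5. Divide by the vector's length if a unit normal is needed.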
You can write an XNA program which reads your normals without much hassle.
If you still want to calculate them, however, use this C# code, taken from FFWD, as a guide. Check the URL for a more detailed discussion on pros and cons. Personally, I'm not too happy with the result, but for the time being it works. Of course, since this code is FFWD related (implementation of Unity's API for XNA), it does not match XNA exactly, but the mathematics remain the same.
/// <summary>
/// Recalculates the normals.
/// Implementation adapted from http://devmaster.net/forums/topic/1065-calculating-normals-of-a-mesh/
/// </summary>
public void RecalculateNormals()
{
Vector3[] newNormals = new Vector3[_vertices.Length];
// _triangles is a list of vertex indices,
// with each triplet referencing the three vertices of the corresponding triangle
for (int i = 0; i < _triangles.Length; i = i + 3)
{
Vector3[] v = new Vector3[]
{
_vertices[_triangles[i]],
_vertices[_triangles[i + 1]],
_vertices[_triangles[i + 2]]
};
Vector3 normal = Vector3.Cross(v[1] - v[0], v[2] - v[0]);
for (int j = 0; j < 3; ++j)
{
Vector3 a = v[(j+1) % 3] - v[j];
Vector3 b = v[(j+2) % 3] - v[j];
float weight = (float)Math.Acos(Vector3.Dot(a, b) / (a.magnitude * b.magnitude));
newNormals[_triangles[i + j]] += weight * normal;
}
}
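// Vector3 is a struct, so a foreach variable would be a copy; index the array to normalize in place.
for (int i = 0; i < newNormals.Length; ++i)
{
newNormals[i].Normalize();
}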
normals = newNormals;
}
I've recently been playing with the Away3D library and have a problem finding the center of a face. Why does Away3DLite have a face.center feature while Away3D doesn't? And what is the alternative solution?
If you want to find the center of a face, it's simply the average position of all the vertices making up that face:
function getFaceCenter(f : Face) : Vector3D
{
var vert : Vertex;
var ret : Vector3D = new Vector3D;
for each (vert in f.vertices) {
ret.x += vert.x;
ret.y += vert.y;
ret.z += vert.z;
}
ret.x /= f.vertices.length;
ret.y /= f.vertices.length;
ret.z /= f.vertices.length;
return ret;
}
The above is a very simple function to calculate an average, although on a 3D vector instead of a simple scalar number. That average is the center of all the vertices in the face.
If you need to do this a lot, optimize the method by preventing it from allocating a vector (pass in a vector to which the return values should be written) and store the vertex list length in a temporary variable instead of dereferencing it through two object references (f and vertices) as mine does, which is unnecessarily heavy.
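The optimized shape could look roughly like this, sketched in Java rather than ActionScript and using plain arrays instead of Away3D's Face and Vertex classes (both substitutions are assumptions made for illustration; the names are hypothetical). The caller supplies the output array, so nothing is allocated per call, and the vertex count is read once.

```java
/** Sketch: centroid of a face written into a caller-supplied output array. */
final class FaceCenterSketch {

    /**
     * vertices: packed x0,y0,z0,x1,y1,z1,... for one face.
     * out:      array of length 3 that receives the center; returned for convenience.
     */
    static double[] faceCenter(double[] vertices, double[] out) {
        int n = vertices.length / 3; // vertex count, computed once
        if (n == 0) {
            throw new IllegalArgumentException("face has no vertices");
        }
        double sx = 0.0, sy = 0.0, sz = 0.0;
        for (int i = 0; i < n; i++) {
            sx += vertices[3 * i];
            sy += vertices[3 * i + 1];
            sz += vertices[3 * i + 2];
        }
        out[0] = sx / n;
        out[1] = sy / n;
        out[2] = sz / n;
        return out;
    }
}
```

A caller that processes many faces can reuse one three-element array for every call.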
I'm currently working on an XNA game prototype. I'm trying to achieve an isometric view of the game world (or is it orthographic? I'm not sure which is the right term for this projection; see pictures).
The world should be a tile-based world made of cubic tiles (e.g. similar to Minecraft's world), and I'm trying to render it in 2D by using sprites.
So I have a sprite sheet with the top face of the cube, the front face and the side (visible side) face. I draw the tiles using 3 separate calls to drawSprite, one for the top, one for the side, one for the front, using a source rectangle to pick the face I want to draw and a destination rectangle to set the position on the screen according to a formula to convert from 3D world coordinates to isometric (orthographic?).
(sample sprite image)
This works well as long as I draw only the faces, but if I try to draw the fine edges of each block (as per a tile grid), I get a random rendering pattern in which some lines are overwritten by the face itself and some are not.
Please note that for my world representation, X is left to right, Y is inside screen to outside screen, and Z is up to down.
In this example I'm working only with top face-edges. Here is what I get (picture):
I don't understand why some of the lines are shown and some are not.
The rendering code I use is (note in this example I'm only drawing the topmost layers in each dimension):
/// <summary>
/// Draws the world
/// </summary>
/// <param name="spriteBatch"></param>
public void draw(SpriteBatch spriteBatch)
{
Texture2D tex = null;
// DRAW TILES
for (int z = numBlocks - 1; z >= 0; z--)
{
for (int y = 0; y < numBlocks; y++)
{
for (int x = numBlocks - 1; x >=0 ; x--)
{
myTextures.TryGetValue(myBlockManager.getBlockAt(x, y, z), out tex);
if (tex != null)
{
// TOP FACE
if (z == 0)
{
drawTop(spriteBatch, x, y, z, tex);
drawTop(spriteBatch, x, y, z, outlineTexture);
}
// FRONT FACE
if(y == numBlocks -1)
drawFront(spriteBatch, x, y, z, tex);
// SIDE FACE
if(x == 0)
drawSide(spriteBatch, x, y, z, tex);
}
}
}
}
}
private void drawTop(SpriteBatch spriteBatch, int x, int y, int z, Texture2D tex)
{
int pX = OffsetX + (int)(x * TEXTURE_TOP_X_OFFRIGHT + y * TEXTURE_SIDE_X);
int pY = OffsetY + (int)(y * TEXTURE_TOP_Y + z * TEXTURE_FRONT_Y);
topDestRect.X = pX;
topDestRect.Y = pY;
spriteBatch.Draw(tex, topDestRect, TEXTURE_TOP_RECT, Color.White);
}
I tried a different approach, creating a second three-level nested for loop after the first one, so that the top faces are drawn in the first loop and the edge highlights in the second loop (I know this is inefficient, and I should probably also avoid a method call for each tile drawn, but I'm just trying to get it working for now).
The results are somewhat better but still not as expected: the top rows are missing, see picture:
Any idea why I'm having this problem? In the first approach it might be a sort of z-fighting, but I'm drawing sprites in a precise order, so shouldn't they overwrite what's already there?
Thanks everyone
Whoa, sorry guys, I'm an idiot :) I started the batch with SpriteBatch.Begin(SpriteSortMode.BackToFront) but I didn't use any z-value in the draw.
I should have used SpriteSortMode.Deferred! It's now working fine. Thanks everyone!
Try tweaking the sizes of your source and destination rectangles by 1 or 2 pixels. I have a sneaking suspicion this has something to do with the way these rectangles are handled as sort of 'outlines' of the area to be rendered and a sort of off-by-one problem. This is not expert advice, just a fellow coder's intuition.
Looks like a sub-pixel precision or scaling issue. Also try to ensure your texture/tile width and height are powers of 2 (32, 64, 128, etc.), as that could make the effect less pronounced as well. It's really hard to tell just from those pictures.
I don't know how/if you scale everything, but you should try to avoid rounding wherever possible (especially inside your drawTop() method). Every time you round some position/coordinate chances are good you might increase the error/random offsets. Try to use double (or better: float) coordinates instead of integer.