I'm tracking the position of a quadrangle on a surface with my camera. This works fine.
After getting these coordinates (the four corners), I want to display a cube at this position
(AR-like).
How do I calculate the model transformation, or (if it's better) set up the camera position?
Thanks in advance ...
I've written a little app using CoreMotion, AV and SceneKit to make a simple panorama. When you take a picture, it maps that onto an SK rectangle and places it in front of whatever CM direction the camera is facing. This is working fine, but...
I would like the user to be able to click a "done" button and turn the entire scene into a single image. I could then map that onto a sphere for future viewing rather than re-creating the entire set of objects. I don't need to stitch or anything like that, I want the individual images to remain separate rectangles, like photos glued to the inside of a ball.
I know about snapshot and tried using that with a really wide FOV, but that results in a fisheye view that does not map back properly (unless I'm doing it wrong). I assume there is some sort of transform I need to apply? Or perhaps there is an easier way to do this?
The key is "photos glued to the inside of a ball". You have a bunch of rectangles, suspended in space. Turning that into one image suitable for projection onto a sphere is a bit of work. You'll have to project each rectangle onto the sphere, and warp the image accordingly.
If you just want to reconstruct the scene for future viewing in SceneKit, use SCNScene's built-in serialization: write(to:options:delegate:progressHandler:) to save, and SCNScene(named:) (or SCNScene(url:options:) for files outside the app bundle) to load.
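A minimal sketch, assuming scene is your SCNScene (the file name is illustrative):

import SceneKit

// Save the scene as a .scn file in the Documents directory.
let docs = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
let sceneURL = docs.appendingPathComponent("panorama.scn")
let saved = scene.write(to: sceneURL, options: nil, delegate: nil, progressHandler: nil)

// Later: SCNScene(named:) only searches the app bundle, so reload from the URL instead.
let restored = try? SCNScene(url: sceneURL, options: nil)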
To compute the mapping of images onto a sphere, you'll need some coordinate conversion. For each image, convert the coordinates of the corners into spherical coordinates, with the origin at your point of view. Change the radius of each corner's coordinate to the radius of your sphere, and you now have the projected corners' locations on the sphere.
It's tempting to repeat this process for each pixel in the input rectangular image. But that will leave empty pixels in the spherical output image. So you'll work in reverse. For each pixel in the spherical output image (within the 4 corner points), compute the ray (trivially done, in spherical coordinates) from POV to that point. Convert that ray back to Cartesian coordinates, compute its intersection with the rectangular image's plane, and sample at that point in your input image. You'll want to do some pixel weighting, since your output image and input image will have different pixel dimensions.
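A minimal sketch of the two conversions involved, assuming a y-up coordinate system with the point of view at the origin (the function names are mine):

import Foundation
import simd

// Cartesian -> spherical: (radius, polar angle from +y, azimuth). Assumes p is not the origin.
func toSpherical(_ p: simd_float3) -> simd_float3 {
    let r = simd_length(p)
    return simd_float3(r, acos(p.y / r), atan2(p.z, p.x))
}

// Spherical -> Cartesian: used when casting the per-pixel ray back out.
func toCartesian(_ s: simd_float3) -> simd_float3 {
    let (r, theta, phi) = (s.x, s.y, s.z)
    return simd_float3(r * sin(theta) * cos(phi), r * cos(theta), r * sin(theta) * sin(phi))
}

// Project a rectangle corner onto a sphere of radius R: keep the angles, replace the radius.
func projectToSphere(_ corner: simd_float3, radius: Float) -> simd_float3 {
    var s = toSpherical(corner)
    s.x = radius
    return toCartesian(s)
}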
I need to display a notification view in my scene at a given position. This notification should stay the same size, no matter what the distance is. But most importantly, it should look like a 2D object no matter what rotation the camera has. I don't know if I can actually insert a 2D object; that would be great. So far I'm experimenting with SCNNodes containing a box. I don't know how to make them always rotate towards the camera (which rotates on every axis). I tried to use
let lookAt = SCNLookAtConstraint(target: self.cameraNode)
lookAt.gimbalLockEnabled = true
notificationNode.constraints = [lookAt]
This almost works, but the nodes all end up rotated at some arbitrary angle. It looks like a UIView with a rotation applied. Can someone help me with this?
Put your 2D object(s) on an SCNPlane. Make the plane node a child of your camera node. Position and rotate the plane as you like, then leave it alone. Anytime the camera moves or rotates, the plane will move and revolve with it, always appearing the same.
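A minimal sketch, assuming cameraNode is the node holding your camera (sizes and colors are illustrative):

import SceneKit
import UIKit

// A HUD-style plane that travels with the camera.
let hud = SCNPlane(width: 0.3, height: 0.1)
hud.firstMaterial?.diffuse.contents = UIColor.red
let hudNode = SCNNode(geometry: hud)
hudNode.position = SCNVector3(0, 0, -1)  // one unit in front of the camera
cameraNode.addChildNode(hudNode)         // it now moves and rotates with the camera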
OK, I know how to do it now: create an empty node without geometry, and add a child node with the SCNLookAtConstraint. Then I can move this invisible node with an animation, and the subnode keeps looking at the camera.
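As an aside, newer SceneKit versions also have SCNBillboardConstraint, which is built for exactly this case; a sketch:

import SceneKit

// SCNBillboardConstraint keeps a node oriented towards the current camera.
let billboard = SCNBillboardConstraint()
billboard.freeAxes = .all        // use .Y to spin around the vertical axis only
notificationNode.constraints = [billboard]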
I would like to move a 3D plane in 3D space, and have the movement match the screen's pixels so I can snap the plane to the edges of the screen.
I have played around with the focal length, camera position, and camera scale, and I have managed to get a plane to match the screen pixels in terms of size; however, when moving the plane, things are no longer correct.
So basically, my current status is that I feed the plane size with values assuming that I am working with standard 2D graphics. So if I set the plane size to 128x128, it is more or less viewed as a 2D square with that exact size.
I am not using, and will not use, an orthographic projection; I am using, and will be using, a perspective projection because my application needs some perspective to it.
How can this be calculated?
Does anyone have any links to resources that I can read?
You need to grab the transformation matrices you use in the vertex shader and apply them to a point (or a few points) that represents the plane.
That will give you a set of points in the range -1,-1 to 1,1 (after dividing by w), which you then need to map to the viewport.
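A minimal sketch of that mapping, assuming you already have the combined model-view-projection matrix (in SceneKit specifically, SCNSceneRenderer's projectPoint(_:) does this for you):

import simd

// Project a world-space point to window pixels for a given viewport size.
func projectToScreen(_ p: simd_float3, mvp: simd_float4x4,
                     viewportWidth: Float, viewportHeight: Float) -> simd_float2 {
    let clip = mvp * simd_float4(p.x, p.y, p.z, 1)
    let ndc = simd_float3(clip.x, clip.y, clip.z) / clip.w  // perspective divide -> [-1, 1]
    // Map NDC to pixels; note the y flip for a top-left screen origin.
    return simd_float2((ndc.x + 1) * 0.5 * viewportWidth,
                       (1 - ndc.y) * 0.5 * viewportHeight)
}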
I am new to OpenCV, so bear with me if there are simple things that I am missing here.
I am trying to work out a camera based system that can continuously output the speed of a vehicle with the following assumptions:
1. The camera is placed horizontally and the vehicle passes within 3 to 5 feet of the camera lens.
2. The speed will not be more than 30 km/h.
I was hoping to start with the concept of an optical mouse, which detects displacement in the surface pattern. However, I am unclear as to how to handle the background when the vehicle starts to enter the frame.
There are two methods I am interested in experimenting with, but I am looking for further input:
Detect the vehicle as it enters the frame and separate from background.
Use cvGoodFeaturesToTrack to find points on the vehicle.
Track the points across to the next frame, and calculate the horizontal velocity using the pyramidal Lucas-Kanade optical flow function (a velocity sketch follows this list).
Repeat
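A minimal sketch of the velocity step, assuming you already have matched point pairs from the tracker (names and units are mine):

import simd

// Estimate horizontal velocity (px/s) from points matched between two consecutive frames.
func horizontalVelocity(prev: [simd_float2], curr: [simd_float2], fps: Float) -> Float {
    guard !prev.isEmpty, prev.count == curr.count else { return 0 }
    // The median of the per-point x displacements is more robust to bad tracks than the mean.
    let dxs = zip(prev, curr).map { pair in pair.1.x - pair.0.x }.sorted()
    return dxs[dxs.count / 2] * fps
}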
Please suggest corrections and amendments.
I also ask more experienced members to help me code this procedure efficiently, since I don't know which functions are the most appropriate to use here.
Thanks in advance.
I assume you will be using a simple camera at 20 to 30 fps, placed perpendicular to the road but away from it. Your cars then have a maximum velocity of about 8 m/s (30 km/h) in the object plane; calculate the corresponding speed in the image plane from the lens you are using:
( speed in object plane / distance of camera from road ) = ( speed in image plane / focal length )
You should get a value in pixels per second if you know how much each pixel measures.
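A worked instance of that relation, with illustrative numbers (calibrate your own lens for the real focal length):

// Pinhole relation: speedImage = speedObject * focalLength / distance.
// With the focal length expressed in pixels, the result is directly px/s.
let speedObject = 8.0        // m/s (~30 km/h)
let distance = 1.2           // m, camera to car (~4 ft)
let focalLengthPx = 700.0    // illustrative value
let speedImagePx = speedObject * focalLengthPx / distance  // ~4667 px/s, ~155 px/frame at 30 fps

That per-frame displacement is why the small-motion caveat about optical flow below matters.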
Steps:
You can use frame differencing: subtract the current frame from the previous frame and take the absolute difference, then threshold the difference. This segments your moving car from the background. Remember that this segments all moving objects, so if you want a car and not a moving person, you can use a shape characteristic such as the height-to-width ratio. Fit a rectangle to the segmented region and repeat these steps in each frame, keeping a record of the coordinates of the leading edge of the bounding box. That way, from when a car enters the view until it passes out of it, you know how long the car has persisted. Use the number of frames, the frame rate, and the coordinates of the leading edge of the bounding box to calculate the speed (a sketch of the differencing step follows below).
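A minimal sketch of the differencing step on raw 8-bit grayscale buffers; in the OpenCV C API the equivalent calls are cvAbsDiff and cvThreshold:

// Absolute frame difference plus threshold; 255 marks 'moving' pixels.
func motionMask(previous: [UInt8], current: [UInt8], threshold: UInt8) -> [UInt8] {
    precondition(previous.count == current.count)
    return zip(previous, current).map { pair -> UInt8 in
        let diff = pair.0 > pair.1 ? pair.0 - pair.1 : pair.1 - pair.0
        return diff > threshold ? 255 : 0
    }
}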
You can use goodFeaturesToTrack and optical flow from OpenCV; that way you can distinguish between fast-moving and slow-moving objects. But keep refreshing the points that goodFeaturesToTrack gives you, or else any new car coming into the camera view will not be picked up. Record the displacement of the set of points picked by goodFeaturesToTrack in each frame; that is the displacement of the moving object, and you calculate speed in the same way. The basic idea for calculating speed is to record the number of frames the object has persisted in the camera's field of view. If your camera is fixed, so is your field of view; hence what matters is in how many frames you are able to catch the object.
Remember: OpenCV's optical flow works for tracking slow-moving objects, or, more precisely, the displacement of the feature points (determined by goodFeaturesToTrack) must be small between two consecutive frames for the algorithm to work. Big displacements will produce erroneous predictions, which is why the speed in the image plane is important; you should have at least a qualitative idea of it.
NOTE: both methods are for single-object tracking. For multiple-object tracking you need some modifications, but you can start with either method; I think it will work.
I'm trying to convert the position of a point that was filmed with a freely moving camera (local space) into its position in an image of the same scene (global space). The position of the point is given in local space, and I need to calculate it in global space. I have markers distributed all over the scene to provide corresponding points in both global and local space for calculating the perspective transform.
I tried to calculate the perspective transform matrix by comparing the points of corresponding markers in global and local space with the help of JavaCV (cvGetPerspectiveTransform(localMarker, globalMarker, mmat)). Then I transform the position of the point in local space with the help of the perspective transform matrix (cvPerspectiveTransform(localFieldPoints, globalFieldPoints, mmat)).
I thought that would be enough to solve my problem, but it doesn't quite work well. I also noticed that when I calculate the perspective transform matrix from different markers in one specific image of the video, I get different perspective transform matrices. If I understood everything correctly, this shouldn't happen: the perspective is always the same here, so I should always get the same perspective transform matrix, shouldn't I?
Because I'm quite new to all of this and this was my first attempt, I just wanted to know whether the method I used is generally right or whether it should be done differently. Maybe I just missed something?
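For reference, what cvPerspectiveTransform does per point is apply a 3x3 homography followed by a perspective divide; a minimal sketch:

import simd

// Map a 2D point through a 3x3 perspective transform (homography) H.
func applyHomography(_ H: simd_float3x3, to p: simd_float2) -> simd_float2 {
    let q = H * simd_float3(p.x, p.y, 1)
    return simd_float2(q.x / q.z, q.y / q.z)  // perspective divide
}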
EDIT:
Again, I have one image of the complete scene I look at, and a video from a camera which moves freely in the scene. Now I take every image of the video and compare it with the image of the complete scene. (I used different cameras for making the image and the video, so the camera intrinsics actually aren't the same. Could that be the problem?)
[Perspective transform screenshot]
On the right side I have the image of the scene; on the left, one image of the video. The red circle in the left video image is the given point. The red square in the right image is the calculated point, obtained with the help of the perspective transform. As you can see, the calculated point isn't at the right position.
What I meant with "I get different perspective transform matrices" is that when I calculate a perspective transform matrix with the help of marker "0E3E", I get a different matrix than when using marker "0272".