DirectX shadow mapping

I have successfully implemented shadow maps in my engine, but the problem is that the shadow map doesn't cover the whole scene. If I make the shadow map larger, the shadow quality drops. So I'm trying to make my shadows move with the camera, which I can do if I can calculate the 8 world-space positions of the camera frustum's vertices.
So how can I calculate the world-space positions of the camera frustum vertices? I'm working with DirectX, if that changes how they are calculated.
Thanks.

The frustum (near plane, far plane, FOV) is in view space, so multiplying it by the inverse view matrix will move it into world space. If you use DirectXMath (which I recommend), you can use the BoundingFrustum object. Example code might look something like this:
// Build the view-space frustum from the camera's projection matrix.
DirectX::BoundingFrustum frustum;
DirectX::BoundingFrustum::CreateFromMatrix(frustum, camera.getProjectionMatrix());
// The inverse view matrix takes the frustum from view space back to world space.
DirectX::XMMATRIX inverseViewMatrix = DirectX::XMMatrixInverse(nullptr, camera.getViewMatrix());
frustum.Transform(frustum, inverseViewMatrix);
BoundingFrustum docs: https://msdn.microsoft.com/en-us/library/windows/desktop/microsoft.directx_sdk.directxmath.boundingfrustum(v=vs.85).aspx
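To actually get the 8 world-space vertices you asked about, you can then call GetCorners on the transformed frustum; a short continuation of the snippet above:
DirectX::XMFLOAT3 corners[DirectX::BoundingFrustum::CORNER_COUNT]; // CORNER_COUNT == 8
frustum.GetCorners(corners); // world-space frustum vertices after the transform above
// Fit your shadow map's projection volume around these 8 points each frame.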
If your frustum is big and you have to view a large area (e.g. an outdoor scene), then even a large moving shadow map might not be enough (or it takes a huge amount of memory). One technique that solves this is called cascaded shadow maps (CSM). In CSM, more precise shadow maps are rendered close to the camera and less precise shadow maps are rendered in the distance, where the low quality is not visible anyway. Here is a CSM tutorial in case you are interested: https://msdn.microsoft.com/en-us/library/windows/desktop/ee416307(v=vs.85).aspx
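If you do try CSM, the core of it is just splitting the camera's near-far range into a few slices and fitting one shadow map per slice. A minimal sketch of one common split-distance heuristic (blending uniform and logarithmic splits; the function and parameter names below are my own, not from the tutorial):
#include <cmath>
#include <vector>

// Sketch: split distances between nearZ and farZ for 'cascadeCount' cascades.
// 'lambda' blends a uniform split with a logarithmic one (a common heuristic).
std::vector<float> computeCascadeSplits(float nearZ, float farZ, int cascadeCount, float lambda = 0.75f)
{
    std::vector<float> splits(cascadeCount);
    for (int i = 0; i < cascadeCount; ++i)
    {
        float p = static_cast<float>(i + 1) / static_cast<float>(cascadeCount);
        float logSplit = nearZ * std::pow(farZ / nearZ, p);
        float uniSplit = nearZ + (farZ - nearZ) * p;
        splits[i] = lambda * logSplit + (1.0f - lambda) * uniSplit; // far edge of cascade i
    }
    return splits;
}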

Related

Easiest/most robust shape for OpenCV to detect, for Intersection over Union of two objects

I am trying to measure the precision of my marker tracking algorithm by post-processing a video.
My algorithm is: find a printed planar marker in a video stream and place a virtual marker at that position. I am working with AR.
Here are two frames of such a video:
Virtual Marker on top of detected marker
Virtual Marker with offset to actual marker
I want to calculate the Intersection over Union / Jaccard Index of the actual marker and the virtual marker. For the first picture it would give me ~98% and for the second ~1/5th %. This will give me the quality of my algorithm: how precise and how well it works.
I want to get the position and rotation of both markers in each frame with OpenCV and calculate the Jaccard Index. As you can see though, if I directly place a virtual marker on top of the paper marker, I will make it difficult for myself (with OpenCV) to detect them.
My idea is to not place a white marker on top of the actual marker, but place an easily detectable "thing" with a specific color or shape with an offset to the marker, let's say 10cm to the right maybe. Then I subtract the offset. So now, at the best case scenario, the position and rotation of the actual marker and the "thing" with the offset subtracted will be the same.
But what should I use as the easily detectable "thing"? I don't have enough experience with OpenCV to know what (colored?) shape I should use. The augmentation can go in front, behind, left, right... of the actual marker anytime during the video and it should do two things:
Not hinder the detection of the actual marker, like currently shown in the pictures
Be easily detectable itself
Help would be much appreciated!
Assuming you have enough white background around the visual marker:
You could use colored circles, for example in red, green, blue and black.
Use OpenCV blob detection [1] to detect all blobs and filter for circular ones.
Look up the average color values for the detected blobs and filter for the colors of the circles.
Alternatively you could filter the whole image for each color and do blob detection on the filtered images. But this is slower.
Find the centroids (~ center point) of each blob using moments of the blob contours. [2] "Center of multiple blobs in an Image".
Now you have the four pixel positions of your circles. If you know the world coordinates of your light projected circles you can use solvePnP to get a pose from this.
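A rough C++ sketch of those steps, assuming you already know the circles' 3D coordinates (the next paragraphs discuss how to get them); the file name, the colour-matching step and the 3D values are placeholders:
#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::Mat frame = cv::imread("frame.png"); // placeholder input frame

    // Detect roughly circular blobs.
    cv::SimpleBlobDetector::Params params;
    params.filterByCircularity = true;
    params.minCircularity = 0.8f;
    cv::Ptr<cv::SimpleBlobDetector> detector = cv::SimpleBlobDetector::create(params);
    std::vector<cv::KeyPoint> keypoints;
    detector->detect(frame, keypoints);

    // Keep one image point per expected circle, ordered by its colour
    // (red, green, blue, black) so the order matches the object points below.
    std::vector<cv::Point2f> imagePoints;
    for (const cv::KeyPoint& kp : keypoints)
    {
        cv::Vec3b bgr = frame.at<cv::Vec3b>(cv::Point(kp.pt));
        // ... match bgr against the four expected colours, push kp.pt in a fixed order ...
        imagePoints.push_back(kp.pt);
    }

    // Known 3D positions of the four circles (placeholder values).
    std::vector<cv::Point3f> objectPoints = {
        {0.0f, 0.0f, 0.0f}, {0.1f, 0.0f, 0.0f}, {0.1f, 0.1f, 0.0f}, {0.0f, 0.1f, 0.0f}
    };

    cv::Mat cameraMatrix, distCoeffs; // from your camera calibration
    cv::Mat rvec, tvec;
    if (imagePoints.size() == objectPoints.size())
        cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
    return 0;
}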
Knowing the correct world coordinates is tricky in your case because you project the circle with light on a surface. This involves some 3D geometry. You need to know the transformation from camera coordinate system to pattern projector coordinate system and the projection parameters of your projector.
I guess you send the projected pattern as an image to the projector. I think you can then model the projector as a camera with a certain camera matrix (basically field of view & center point). Naturally you know the pixel coordinates of the projected circles. From this you can compute rays in 3D space (in the projector coordinate system). As a starting point, see [3]. Intersecting [4] them with the correct surface plane (in the projector coordinate system) gives you the 3D coordinates of the projected circle pattern in the projector coordinate system. Transform these to the camera coordinate system using your known transformation. Now use OpenCV solvePnP to determine the pose of the projected light marker.
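To make the ray and intersection steps concrete, here is a small sketch assuming a simple pinhole model for the projector and a plane given by a normal and a point on it (all names are placeholders, and the coordinate conventions must match your setup):
#include <cmath>
#include <opencv2/core.hpp>

// Back-project a projector pixel (u, v) through assumed pinhole intrinsics
// (fx, fy, cx, cy) to get a ray direction in projector coordinates.
cv::Vec3d pixelToRay(double u, double v, double fx, double fy, double cx, double cy)
{
    return cv::normalize(cv::Vec3d((u - cx) / fx, (v - cy) / fy, 1.0));
}

// Intersect a ray from the projector origin with the plane through p0 with normal n.
bool intersectPlane(const cv::Vec3d& dir, const cv::Vec3d& n, const cv::Vec3d& p0, cv::Vec3d& hit)
{
    double denom = n.dot(dir);
    if (std::abs(denom) < 1e-9) return false; // ray (nearly) parallel to the plane
    double t = n.dot(p0) / denom;
    if (t < 0.0) return false;                // intersection behind the projector
    hit = t * dir;                            // 3D point in projector coordinates
    return true;
}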
How to get surface plane?
If your setup is static, you could run visual marker detection on all recorded images and use the mean or median of the marker pose as the surface plane. Not sure what this implies for your evaluation though.
[1] https://www.learnopencv.com/blob-detection-using-opencv-python-c/
[2] https://www.learnopencv.com/find-center-of-blob-centroid-using-opencv-cpp-python/
[3] https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
[4] https://www.cs.princeton.edu/courses/archive/fall00/cs426/lectures/raycast/sld017.htm

Turn an entire SceneKit scene into an image suitable for a texture

I've written a little app using CoreMotion, AV and SceneKit to make a simple panorama. When you take a picture, it maps that onto a SK rectangle and places it in front of whatever CM direction the camera is facing. This is working fine, but...
I would like the user to be able to click a "done" button and turn the entire scene into a single image. I could then map that onto a sphere for future viewing rather than re-creating the entire set of objects. I don't need to stitch or anything like that, I want the individual images to remain separate rectangles, like photos glued to the inside of a ball.
I know about snapshot and tried using that with a really wide FOV, but that results in a fisheye view that does not map back properly (unless I'm doing it wrong). I assume there is some sort of transform I need to apply? Or perhaps there is an easier way to do this?
The key is "photos glued to the inside of a ball". You have a bunch of rectangles, suspended in space. Turning that into one image suitable for projection onto a sphere is a bit of work. You'll have to project each rectangle onto the sphere, and warp the image accordingly.
If you just want to reconstruct the scene for future viewing in SceneKit, use SCNScene's built-in serialization: write(to:options:delegate:progressHandler:) and SCNScene(named:).
To compute the mapping of images onto a sphere, you'll need some coordinate conversion. For each image, convert the coordinates of the corners into spherical coordinates, with the origin at your point of view. Change the radius of each corner's coordinate to the radius of your sphere, and you now have the projected corners' locations on the sphere.
It's tempting to repeat this process for each pixel in the input rectangular image. But that will leave empty pixels in the spherical output image. So you'll work in reverse. For each pixel in the spherical output image (within the 4 corner points), compute the ray (trivially done, in spherical coordinates) from POV to that point. Convert that ray back to Cartesian coordinates, compute its intersection with the rectangular image's plane, and sample at that point in your input image. You'll want to do some pixel weighting, since your output image and input image will have different pixel dimensions.
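The two conversions that loop relies on are just the standard Cartesian/spherical formulas; a small sketch in C++ (the radius, angle conventions and the final plane-intersection step depend on your setup):
#include <cmath>

struct Vec3 { double x, y, z; };
struct Spherical { double r, theta, phi; }; // theta: polar angle, phi: azimuth

// Cartesian -> spherical, with the origin at the point of view.
Spherical toSpherical(const Vec3& v)
{
    double r = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { r, std::acos(v.z / r), std::atan2(v.y, v.x) };
}

// Spherical angles -> unit ray direction in Cartesian coordinates.
Vec3 toRay(double theta, double phi)
{
    return { std::sin(theta) * std::cos(phi),
             std::sin(theta) * std::sin(phi),
             std::cos(theta) };
}

// For each output pixel: build (theta, phi) from its position on the sphere,
// turn it into a ray with toRay(), intersect that ray with the rectangle's
// plane, and sample the input image at the hit point.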

How to measure ratio between lines in a photo

I'm working with OpenCV on a task of measuring the solar angle in a photo (without any camera parameters). In the photo there is a straight stick, 3 meters tall, standing in the middle of a field. The shadow it casts, however, lies obliquely on the ground (not in the same projection plane as the stick). I can obtain the pixel lengths of the stick and the shadow, but I don't know if the ratio should be calculated directly from those two numbers, since only lines within the same projection plane share the same scale.
This is more of a geometry question than an algorithm one. Can anyone shed some light on how to determine the height-to-shadow ratio?

Calculating position of object so it matches screen pixels

I would like to move a 3D plane in 3D space and have the movement match the screen's pixels so I can snap the plane to the edges of the screen.
I have played around with the focal length, camera position and camera scale, and I have managed to get a plane to match the screen pixels in terms of size; however, when moving the plane, things are not correct anymore.
So basically my current status is that I feed the plane size with values assuming that I am working with standard 2D graphics. So if I set the plane size to 128x128, it is more or less viewed as a 2D square with exactly that size.
I am not using and will not use an orthographic view; I am using and will be using a perspective projection, because my application needs some perspective to it.
How can this be calculated?
Does anyone have any links to resources that I can read?
You need to grab the transformation matrices you use in the vertex shader and apply them to the point (or a few points) that represent the plane.
That will result in a set of points in the -1,-1 to 1,1 range (after dividing by w), which you then need to map to the viewport.
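A small sketch of that mapping, assuming a row-major 4x4 model-view-projection matrix and a top-left pixel origin (both assumptions; adjust to your math library and API):
struct Vec4 { float x, y, z, w; };

// Multiply a point by a row-major 4x4 matrix (e.g. your model-view-projection).
Vec4 transform(const float m[16], const Vec4& p)
{
    return { m[0]  * p.x + m[1]  * p.y + m[2]  * p.z + m[3]  * p.w,
             m[4]  * p.x + m[5]  * p.y + m[6]  * p.z + m[7]  * p.w,
             m[8]  * p.x + m[9]  * p.y + m[10] * p.z + m[11] * p.w,
             m[12] * p.x + m[13] * p.y + m[14] * p.z + m[15] * p.w };
}

// Clip space -> normalized device coordinates (-1..1) -> viewport pixels.
void toViewport(const Vec4& clip, float width, float height, float& px, float& py)
{
    float ndcX = clip.x / clip.w;
    float ndcY = clip.y / clip.w;
    px = (ndcX * 0.5f + 0.5f) * width;
    py = (1.0f - (ndcY * 0.5f + 0.5f)) * height; // flip Y for a top-left pixel origin
}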

GLKit / OpenGL-ES 2.0 - 2D Object World Space and 2D Camera

I am new to OpenGL ES 2.0 and GLKit, and would like to ask a question.
I tried to find a good example of a 2D camera but couldn't find any, so I hope that you guys can help me :D
--
1)
Firstly, I have an object, and I store its position in a GLKVector2.
I would like to know how to draw it in world space.
2)
I have a "2D Camera" class, storing as a CGRect with its world position and size.
Its size may change depending on the "zoom" I want.
Is there any way to easily draw the objects from world space into this 2D Camera?
Is any optimization required too? such as not drawing objects outside this 2D Camera,
and clipping objects that have some parts outside of 2D Camera.
3)
If the objects are drawn into this 2D camera, how do I apply effects like clip/scale/etc so that it fits on the device screen, and draw it on the screen?
--
I have seen many things about the model, view, and projection matrices, but I don't get them. I have only done XNA/Android bitmap drawing calls, which draw onto a Bitmap and then resize the Bitmap onto the screen.
