I am using DirectShow to capture images from a USB-connected camera. My goal is to get the captured camera image into a DirectX 11 Texture2D that I can use for rendering, and I would like this to happen automatically inside the DirectShow graph, without the buffers ever being copied to user space.
I have looked at many examples and threads but could not see how to do exactly that. I find many recommendations to use Media Foundation instead, but for the current project that is not an option at this point.
There seem to be examples of playback onto a DirectX 9 texture; maybe there is a way to get a "Dx9" texture out of "Dx11" and use it in the rendering later?
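For reference, the kind of "Dx9"/"Dx11" sharing I have in mind would look roughly like the sketch below. The device pointers, dimensions, and function name are placeholders, and it assumes a D3D9Ex device, since plain D3D9 does not support shared handles.

    #include <d3d9.h>
    #include <d3d11.h>

    // Sketch only: share one render-target surface between D3D9Ex and D3D11.
    // Both devices are assumed to be created elsewhere; error handling omitted.
    ID3D11Texture2D* CreateSharedVideoTexture(IDirect3DDevice9Ex* d3d9ExDevice,
                                              ID3D11Device* d3d11Device,
                                              UINT width, UINT height)
    {
        // Passing a non-null pSharedHandle to CreateTexture on a D3D9Ex device
        // makes the underlying surface shareable across APIs.
        HANDLE sharedHandle = nullptr;
        IDirect3DTexture9* d3d9Tex = nullptr;
        d3d9ExDevice->CreateTexture(width, height, 1, D3DUSAGE_RENDERTARGET,
                                    D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT,
                                    &d3d9Tex, &sharedHandle);

        // Open the same video-memory surface on the D3D11 side.
        ID3D11Texture2D* d3d11Tex = nullptr;
        d3d11Device->OpenSharedResource(sharedHandle,
                                        __uuidof(ID3D11Texture2D),
                                        reinterpret_cast<void**>(&d3d11Tex));

        // A DirectShow video renderer driving the D3D9Ex device would draw into
        // d3d9Tex; d3d11Tex refers to the same memory and can be bound as a
        // shader resource in the D3D11 pipeline without a CPU copy.
        return d3d11Tex;
    }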
Related
I am currently working on a service to update the content of a video based on the markers present in the video. I was curious whether we can use Vuforia to achieve this by providing the pre-recorded video as the input to Vuforia instead of the live camera feed from the mobile phone.
TL;DR: This is not possible, because replacing the camera feed is not a function that either Vuforia or ARKit exposes.
Aside from not exposing the camera, both frameworks use a combination of camera input and sensor data (gyro, accelerometer, compass, altitude, etc.) to calculate the camera/phone's position (translation/rotation) relative to the marker image.
The effect you are looking for is image tracking and rendering within a video feed. You should consider OpenCV, or some other computer vision library, for the feature point tracking. With regard to rendering, there are three options: SceneKit, Metal, or OpenGL. Following Apple's lead, you could use SceneKit for the rendering, similar to how ARKit handles the sensor inputs and uses SceneKit for rendering. If you are ambitious and want to control the rendering yourself, you could use Metal or OpenGL. A rough sketch of the OpenCV side follows below.
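Purely as an illustration of the feature-point-tracking part (the file names, feature count, and choice of ORB here are my own, not anything prescribed by Vuforia or ARKit), the OpenCV side might look roughly like this:

    #include <opencv2/opencv.hpp>
    #include <iostream>

    // Sketch: match ORB features between a marker image and each frame of a
    // pre-recorded video. All file names and parameters are placeholders.
    int main()
    {
        cv::Mat marker = cv::imread("marker.png", cv::IMREAD_GRAYSCALE);
        cv::VideoCapture video("input.mp4");

        cv::Ptr<cv::ORB> orb = cv::ORB::create(1000);
        std::vector<cv::KeyPoint> markerKeypoints;
        cv::Mat markerDescriptors;
        orb->detectAndCompute(marker, cv::noArray(), markerKeypoints, markerDescriptors);

        cv::BFMatcher matcher(cv::NORM_HAMMING, /*crossCheck=*/true);

        cv::Mat frame, gray;
        while (video.read(frame))
        {
            cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);

            std::vector<cv::KeyPoint> frameKeypoints;
            cv::Mat frameDescriptors;
            orb->detectAndCompute(gray, cv::noArray(), frameKeypoints, frameDescriptors);
            if (frameDescriptors.empty())
                continue;

            std::vector<cv::DMatch> matches;
            matcher.match(markerDescriptors, frameDescriptors, matches);

            // With enough matches, cv::findHomography would give the marker's
            // placement in the frame, which the rendering layer (SceneKit,
            // Metal, or OpenGL) could then use to position the new content.
            std::cout << "matches in this frame: " << matches.size() << std::endl;
        }
        return 0;
    }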
I'm trying to mod the Oculus World Demo to show a video stream from a camera rather than a pre-set graphic; however, I'm finding it difficult to find the proper way to render a cv::IplImage or cv::Mat image onto the Oculus screen. If anyone knows how to display an image on the Oculus I would be very grateful. This is for the DK2.
Pure OpenCV isn't really well suited to rendering to the Rift, because you would need to manually implement the distortion mechanisms that are normally provided by the Oculus Rift SDK.
The best way to render an image from OpenCV onto the screen is to load the image into an OpenGL or Direct3D texture and use the 3D rendering API (GL or D3D) to place it into a rendered scene. There is an example of this in the GitHub repository for my book on Rift development.
In summary, it sets up the video capture using the OpenCV API and then launches a thread which is responsible for capturing images from the camera device. In the main thread, the draw call renders a simple 3D scene which includes the captured image. Most of the interesting Rift-related code is in the parent class, RiftApp.
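Stripped down to its structure, that capture-thread/render-thread split might look something like the sketch below (the type and function names are mine, not the ones from the repository, and the GL header is whatever loader the project already uses):

    #include <opencv2/opencv.hpp>
    #include <GL/gl.h>   // or the project's existing GL loader header
    #include <atomic>
    #include <mutex>

    // Illustrative sketch of the structure described above: one thread grabs
    // camera frames with OpenCV, the render thread uploads the latest frame
    // into a GL texture before drawing the scene.
    struct CameraFeed
    {
        cv::VideoCapture capture{0};
        cv::Mat latestFrame;           // stored as RGB
        std::mutex frameMutex;
        std::atomic<bool> running{true};

        // Runs on a background thread.
        void captureLoop()
        {
            cv::Mat frame, rgb;
            while (running && capture.read(frame))
            {
                cv::cvtColor(frame, rgb, cv::COLOR_BGR2RGB);
                std::lock_guard<std::mutex> lock(frameMutex);
                rgb.copyTo(latestFrame);
            }
        }

        // Called on the render thread, with a current GL context, each frame.
        void uploadToTexture(GLuint texture)
        {
            std::lock_guard<std::mutex> lock(frameMutex);
            if (latestFrame.empty())
                return;
            glBindTexture(GL_TEXTURE_2D, texture);
            glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
            glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB,
                         latestFrame.cols, latestFrame.rows, 0,
                         GL_RGB, GL_UNSIGNED_BYTE, latestFrame.data);
        }
    };

The render loop would then draw a textured quad with that texture as part of the scene, leaving the Rift distortion to the SDK as usual.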
I saw that someone has made an app that tracks your feet using the camera, so that you can kick a virtual football on your iPhone screen.
How could you do something like this? Does anyone know of any code examples or other information about using the iPhone camera for detecting objects and tracking them?
I just gave a talk at SecondConf where I demonstrated the use of the iPhone's camera to track a colored object using OpenGL ES 2.0 shaders. The post accompanying that talk, including my slides and sample code for all demos, can be found here.
The sample application I wrote, whose code can be downloaded from here, is based on an example produced by Apple for demonstrating Core Image at WWDC 2007. That example is described in Chapter 27 of the GPU Gems 3 book.
The basic idea is that you can use custom GLSL shaders to process images from the iPhone camera in realtime, determining which pixels match a target color within a given threshold. Those pixels then have their normalized X,Y coordinates embedded in their red and green color components, while all other pixels are marked as black. The color of the whole frame is then averaged to obtain the centroid of the colored object, which you can track as it moves across the view of the camera.
While this doesn't address the case of tracking a more complex object like a foot, it should be possible to write shaders like this that could pick out such a moving object.
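As a rough sketch of that kind of thresholding pass (the uniform names and threshold value here are illustrative, not taken from the sample application), the fragment shader could be embedded as a string like this:

    // Illustrative GLSL ES fragment shader for the pass described above,
    // embedded as a C string. Uniform names and the threshold are placeholders.
    static const char* kColorThresholdFragmentShader = R"(
        precision mediump float;

        varying vec2 textureCoordinate;       // normalized 0..1 position
        uniform sampler2D inputImageTexture;  // current camera frame
        uniform vec3 targetColor;             // color being tracked
        uniform float threshold;              // e.g. 0.3

        void main()
        {
            vec3 pixel = texture2D(inputImageTexture, textureCoordinate).rgb;

            // Matching pixels carry their own normalized X,Y in red/green;
            // everything else is black, so averaging the whole frame later
            // yields the centroid of the tracked color.
            if (distance(pixel, targetColor) < threshold) {
                gl_FragColor = vec4(textureCoordinate, 1.0, 1.0);
            } else {
                gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0);
            }
        }
    )";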
As an update to the above, in the two years since I wrote this I've now developed an open source framework that encapsulates OpenGL ES 2.0 shader processing of images and video. One of the recent additions to that is a GPUImageMotionDetector class that processes a scene and detects any kind of motion within it. It will give you back the centroid and intensity of the overall motion it detects as part of a simple callback block. Using this framework to do this should be a lot easier than rolling your own solution.
My problem is simple: I have to process each frame of a video. The processing computes a zone to crop from the original frame. For better performance, I have to downscale the original frame. Currently this is done with a dedicated library, but it is slow. We are wondering whether there is any possibility of downscaling this frame with OpenGL ES 2.0 GLSL.
David
If you're using AV Foundation to load the video from disk or to pull video from the camera, you could use my open source GPUImage framework to handle the underlying OpenGL ES processing for you.
Specifically, you can use a GPUImageCropFilter to crop out a selected region of the input video using normalized 0.0-1.0 coordinates in a CGRect. The FilterShowcase example shows how this works in practice for live video from the camera. With this, you don't need to touch any manual OpenGL ES API calls if you don't want to.
Finally, I will use a framebuffer object to render my texture. I will set the viewport to the desired size and render my texture as usual. To get the downsampled image back, I will use glReadPixels. A rough sketch of this follows below.
David
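A minimal sketch of that FBO-and-readback idea in OpenGL ES 2.0, assuming a drawFullScreenQuad() helper (a placeholder for the usual textured-quad draw with a pass-through shader) already exists:

    #include <GLES2/gl2.h>

    // Placeholder: draws a full-screen textured quad with a pass-through shader.
    void drawFullScreenQuad();

    // Sketch: render the full-size frame into a smaller FBO attachment, then
    // read the downscaled result back to the CPU. dstPixels must hold
    // dstWidth * dstHeight * 4 bytes.
    void downscaleFrame(GLuint sourceTexture, int dstWidth, int dstHeight,
                        unsigned char* dstPixels)
    {
        // Small target texture attached to a framebuffer object.
        GLuint targetTexture = 0, fbo = 0;
        glGenTextures(1, &targetTexture);
        glBindTexture(GL_TEXTURE_2D, targetTexture);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, dstWidth, dstHeight, 0,
                     GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

        glGenFramebuffers(1, &fbo);
        glBindFramebuffer(GL_FRAMEBUFFER, fbo);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                               GL_TEXTURE_2D, targetTexture, 0);

        // Render the source texture "as usual", but into the smaller viewport;
        // the GPU's linear filtering performs the downscale.
        glViewport(0, 0, dstWidth, dstHeight);
        glBindTexture(GL_TEXTURE_2D, sourceTexture);
        drawFullScreenQuad();

        // Read the downsampled image back.
        glReadPixels(0, 0, dstWidth, dstHeight, GL_RGBA, GL_UNSIGNED_BYTE, dstPixels);

        glBindFramebuffer(GL_FRAMEBUFFER, 0);
        glDeleteFramebuffers(1, &fbo);
        glDeleteTextures(1, &targetTexture);
    }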
I have a complex pre-rendered scene which I would like to use as a backdrop in a 3D iPad game which uses a static camera.
On each frame redraw, the screen will be erased to this background. The part I do not know how to do is setting the depth buffer to the one stored in this pre-rendered image, so that dynamic 3D objects will respect the depth information in that image.
Is there any way to achieve this on an iPad, using OpenGL ES 2.0?
I looked into several approaches, but could not find anything suitable so far.