My problem is simple: I have to process each frame of a video. The processing computes a zone to crop from the original frame. For better performance, I have to downscale the original frame first. Currently this is done with a dedicated library, but it is slow. We are wondering whether it is possible to downscale the frame with OpenGL ES 2.0 and GLSL instead.
David
If you're using AV Foundation to load the video from disk or to pull video from the camera, you could use my open source GPUImage framework to handle the underlying OpenGL ES processing for you.
Specifically, you can use a GPUImageCropFilter to crop out a selected region of the input video using normalized 0.0-1.0 coordinates in a CGRect. The FilterShowcase example shows how this works in practice for live video from the camera. With this, you don't need to touch any manual OpenGL ES API calls if you don't want to.
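If your crop zone is computed in pixels, it has to be turned into that normalized 0.0-1.0 form first. Here is a minimal sketch of the arithmetic in plain C; NormalizedRect and normalized_crop are hypothetical names made up for this illustration, not part of GPUImage, and the struct simply mirrors the origin-plus-size layout of a CGRect.

```c
#include <assert.h>

/* Hypothetical helper type (not a GPUImage type): a rectangle in
   normalized 0.0-1.0 coordinates, laid out like a CGRect. */
typedef struct {
    double x, y, width, height;
} NormalizedRect;

/* Convert a crop zone given in pixels on a frame_w x frame_h frame into
   normalized coordinates. The zone is clamped to the frame bounds. */
NormalizedRect normalized_crop(int px, int py, int pw, int ph,
                               int frame_w, int frame_h)
{
    assert(frame_w > 0 && frame_h > 0);

    /* Clamp the origin, shrinking the size if the zone started off-frame. */
    if (px < 0) { pw += px; px = 0; }
    if (py < 0) { ph += py; py = 0; }
    if (px > frame_w) px = frame_w;
    if (py > frame_h) py = frame_h;

    /* Clamp the size so the zone never extends past the frame. */
    if (pw < 0) pw = 0;
    if (ph < 0) ph = 0;
    if (px + pw > frame_w) pw = frame_w - px;
    if (py + ph > frame_h) ph = frame_h - py;

    NormalizedRect r = {
        (double)px / frame_w, (double)py / frame_h,
        (double)pw / frame_w, (double)ph / frame_h
    };
    return r;
}
```

For example, a 960x540 zone at (480, 270) on a 1920x1080 frame comes out as origin (0.25, 0.25) and size (0.5, 0.5). Because the result is resolution independent, a zone computed on a downscaled copy of the frame normalizes to the same rectangle.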
In the end, I will use a framebuffer object to render my texture. I will set the viewport to the desired size and render my texture as usual. To get the downsampled image back, I will use glReadPixels.
David
Related
I'm trying to mod the Oculus World Demo to show a video stream from a camera instead of a pre-set graphic. However, I'm finding it difficult to find the proper way to render a cv::IplImage or cv::Mat image onto the Oculus screen. If anyone knows how to display an image on the Oculus, I would be very grateful. This is for the DK 2.
Pure OpenCV isn't really well suited to rendering to the Rift, because you would need to manually implement the distortion mechanisms that are normally provided by the Oculus Rift SDK.
The best way to render an image from OpenCV onto the screen is to load the image into an OpenGL or Direct3D texture and use the 3D rendering API (GL or D3D) to place it into a rendered scene. There is an example of this in the GitHub repository for my book on Rift development.
In summary, it sets up the video capture using the OpenCV API and then launches a thread which is responsible for capturing images from the camera device. In the main thread, the draw call renders a simple 3D scene which includes the captured image. Most of the interesting Rift related code is in the parent class, RiftApp.
In my application I need to play video in an unusual way.
Something like an interactive player for special purposes.
Main issues here:
the video resolution can be anywhere from 200*200 px up to 1024*1024 px
I need to be able to change the speed from -60 FPS to 60 FPS (the video should play slower or faster depending on the selected speed; a negative value means the video plays backwards)
I need to draw lines and objects over the video and scale them with the image
I need to be able to zoom the image and pan it if its content is larger than the screen
I need to be able to change the brightness and contrast and invert the colors of the video
Here is what I'm doing now:
I split my video into JPG frames
I created a timer that fires N times per second (play speed control)
on each timer tick I draw a new texture (the next JPG frame) with OpenGL
for zoom and pan I use OpenGL ES transformations (translate, scale)
Everything looks fine as long as I use 320*240 px, but with 512*512 px my play rate goes down. Maybe it is a timer behaviour problem, maybe it is OpenGL. Sometimes, if I try to open big textures at a high play rate (more than 10-15 FPS), the application just crashes with memory warnings.
What is the best practice to solve this issue? In which direction should I dig? Would cocos2d or another game engine help me? Maybe JPG is not the best solution for textures and I should use PNG, PVR or something else?
Keep the video data as a video and use AVAssetReader to get the raw frames. Use kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange as the colorspace, and do YUV->RGB colorspace conversion in GLES. It will mean keeping less data in memory, and make much of your image processing somewhat simpler (since you'll be working with luma and chroma data rather than RGB values).
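To make the conversion step concrete, here is a CPU-side sketch in C of the per-pixel math that a fragment shader would perform. It assumes BT.601 coefficients, which is my assumption and not something stated above (HD sources often use BT.709 instead), and the function names are made up for illustration. Video range means luma spans 16-235 and chroma is centered on 128.

```c
#include <assert.h>
#include <stdint.h>

/* Round to nearest and clamp into the 0..255 range of an 8-bit channel. */
static uint8_t clamp_u8(double v)
{
    if (v < 0.0) return 0;
    if (v > 255.0) return 255;
    return (uint8_t)(v + 0.5);
}

/* Convert one video-range Y'CbCr sample to 8-bit RGB.
   Coefficients are the common BT.601 approximation (an assumption). */
void ycbcr_video_range_to_rgb(uint8_t y, uint8_t cb, uint8_t cr,
                              uint8_t rgb[3])
{
    assert(rgb != 0);
    double luma = 1.164 * ((double)y - 16.0);  /* stretch 16..235 to 0..255 */
    double u = (double)cb - 128.0;             /* center chroma on zero */
    double v = (double)cr - 128.0;

    rgb[0] = clamp_u8(luma + 1.596 * v);
    rgb[1] = clamp_u8(luma - 0.392 * u - 0.813 * v);
    rgb[2] = clamp_u8(luma + 2.017 * u);
}
```

In a shader the same three lines usually become a single matrix multiply after subtracting the offsets, with the luma value sampled from one texture and the Cb/Cr pair from the other.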
You don't need to bother with Cocos2D or any game engine for this. I strongly recommend doing a little bit of experimenting with OpenGL ES 2.0 and shaders. Using OpenGL for video is very simple and straightforward; adding a game engine to the mix is unnecessary overhead and abstraction.
When you upload image data to the textures, do not create a new texture every frame. Instead, create two textures: one for luma, and one for chroma data, and simply reuse those textures every frame. I suspect your memory issues are arising from using many images and new textures every frame and probably not deleting old textures.
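As a rough sketch of what those two reusable textures have to hold, the C below computes the plane dimensions for a 4:2:0 bi-planar frame. PlaneLayout and the function names are hypothetical, and the sizes assume tightly packed rows; real pixel buffers may pad each row, so treat the totals as a lower bound.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one plane of a bi-planar 4:2:0 frame. */
typedef struct {
    int width;
    int height;
    int bytes_per_texel;
} PlaneLayout;

/* Luma plane: full resolution, one byte (Y) per texel. */
PlaneLayout luma_plane(int frame_w, int frame_h)
{
    assert(frame_w > 0 && frame_h > 0);
    PlaneLayout p = { frame_w, frame_h, 1 };
    return p;
}

/* Chroma plane: half resolution in both directions,
   two interleaved bytes (Cb, Cr) per texel. */
PlaneLayout chroma_plane(int frame_w, int frame_h)
{
    assert(frame_w > 0 && frame_h > 0);
    PlaneLayout p = { (frame_w + 1) / 2, (frame_h + 1) / 2, 2 };
    return p;
}

/* Size of a plane in bytes, assuming no row padding. */
size_t plane_bytes(PlaneLayout p)
{
    return (size_t)p.width * (size_t)p.height * (size_t)p.bytes_per_texel;
}
```

For a 512x512 frame this gives 262144 bytes of luma plus 131072 bytes of chroma, 393216 bytes in total, against 1048576 bytes for the same frame stored as 4-byte RGBA. Allocating the two textures once at these sizes and only replacing their contents each frame keeps the footprint constant.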
JPEG frames will be incredibly expensive to uncompress. First step: use PNG.
But wait! There's more.
Cocos2D could help you mostly through its great support for sprite sheets.
The biggest help, however, may come from packed textures a la TexturePacker. Using PVR.CCZ compression can speed things up by insane amounts, enough for you to get better frame rates at bigger video sizes.
Vlad, the short answer is that you will likely never be able to get all of the features you have listed working at the same time. Playing 1024 x 1024 video at 60 FPS is really going to be a stretch; I highly doubt that iOS hardware is going to be able to keep up with that kind of data transfer rate. Even the h.264 hardware on the device can only do 30 FPS at 1080p. It might be possible, but layering graphics rendering over the video and also expecting to edit the brightness/contrast at the same time is just too many things at once.
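To put a number on the transfer rate, here is a back-of-the-envelope sketch in C. It assumes uncompressed frames at 4 bytes per pixel, which is my assumption since the pixel format is not stated, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Raw throughput needed to move uncompressed frames, in bytes per second. */
uint64_t raw_bytes_per_second(uint32_t width, uint32_t height,
                              uint32_t bytes_per_pixel, uint32_t fps)
{
    assert(width > 0 && height > 0 && bytes_per_pixel > 0);
    return (uint64_t)width * height * bytes_per_pixel * fps;
}
```

With these assumptions, 1024 x 1024 at 60 FPS works out to 251,658,240 bytes per second, and 1080p at 30 FPS to 248,832,000 bytes per second. The two figures land close together, so the requested playback rate alone already sits near the 1080p/30 FPS figure, before any overlay drawing or color adjustment is added.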
You should focus on what is actually possible instead of attempting to do every feature. If you want to see an example Xcode app that pushes iPad hardware right to the limits, please have a look at my Fireworks example project. This code displays multiple already decoded h.264 videos on screen at the same time. The implementation is built around CoreGraphics APIs, but the key thing is that Apple's implementation of texture uploading to OpenGL is very fast because of a zero copy optimization. With this approach, a lot of video can be streamed to the device.
I have a complex pre-rendered scene which I would like to use as a backdrop in a 3D iPad game which uses a static camera.
For each frame redraw the screen will be erased to this background. The part I do not know how to do is set the depth buffer to the one stored in this pre-rendered image, so that dynamic 3D objects will respect the depth information in said image.
Is there any way to achieve this on an iPad, using OpenGL ES 2.0?
I looked into several approaches, but could not find anything suitable so far.
I'm using a texture cache to draw video frames to the screen, just like the RosyWriter sample application from Apple.
I want to downsample an image from 1080p down to around 320x480 (for various reasons, I don't want to capture at a lower resolution) and use mipmap filtering to get rid of aliasing. However, when I try adding:
glGenerateMipmap(CVOpenGLESTextureGetTarget(inputTexture));
glTexParameteri(CVOpenGLESTextureGetTarget(inputTexture), GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
I just get a black screen, as though the mipmaps aren't being generated. I'm rendering offscreen from one texture to another. Both source and destination are mapped to pixel buffers using texture caches.
Mipmaps can only be generated for power-of-two sized textures. None of the video frame sizes returned by the iOS cameras that I can think of have power-of-two dimensions. To use the texture caches while still generating mipmaps, I think you'd have to do something like an offscreen re-render of the texture to a power-of-two FBO backed by a texture, then generate a mipmap for that.
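Picking the size of that intermediate FBO comes down to a power-of-two check and a round-up. A minimal sketch in C, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when n is an exact power of two (1, 2, 4, 8, ...). */
int is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Smallest power of two that is greater than or equal to n. */
uint32_t next_power_of_two(uint32_t n)
{
    /* Larger inputs would overflow the shift below. */
    assert(n > 0 && n <= 0x80000000u);
    uint32_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}
```

For a 1080p frame, neither 1920 nor 1080 passes the check and both round up to 2048, so a full-size intermediate target would be 2048x2048. Whether to round up or to render into a smaller power-of-two target is a judgment call that depends on how much you plan to shrink the frame afterwards.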
That said, this is probably not the best way to accomplish what you want. Mipmaps only help when making a texture smaller on the screen, not making it larger. Also, they are pretty slow to generate at runtime, so this would drag your entire video processing down.
What kind of aliasing are you seeing when you zoom in? The normal hardware texture filtering should produce a reasonably smooth image when zoomed in on a video frame. As an example of this, grab and run the FilterShowcase sample from my GPUImage framework and look at the Crop filter. Zooming in on a section of the video that way seems to smooth things out pretty nicely, just using hardware filtering.
I do employ mipmaps for smooth downsampling of large images in the framework (see the GPUImagePicture when smoothlyScaleOutput is set to YES), but again that's for shrinking an image, not zooming in on it.
I know it is possible, and a lot faster than using GDI+. However, I haven't found any good example of using DirectX to resize an image and save it to disk. I have implemented this over and over in GDI+; that's not difficult. However, GDI+ does not use any hardware acceleration, and I was hoping to get better performance by tapping into the graphics card.
You can load the image as a texture, texture-map it onto a quad and draw that quad in any size on the screen. That will do the scaling. Afterwards you can grab the pixel-data from the screen, store it in a file or process it further.
It's easy. The basic texturing DirectX examples that come with the SDK can be adjusted to do just this.
However, it is slow. Not the rendering itself, but the transfer of pixel data from the screen to a memory buffer.
IMHO it would be much simpler and faster to just write a little code that resizes an image using bilinear scaling from one buffer to another.
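A minimal sketch of such a routine in C, assuming tightly packed 8-bit pixels with interleaved channels (for example 4 for RGBA). The function name and the buffer layout are my assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Resize src (sw x sh) into dst (dw x dh) with bilinear interpolation.
   Both buffers are tightly packed with `channels` bytes per pixel. */
void resize_bilinear(const uint8_t *src, int sw, int sh,
                     uint8_t *dst, int dw, int dh, int channels)
{
    assert(src && dst && sw > 0 && sh > 0 && dw > 0 && dh > 0 && channels > 0);
    const double sx = (double)sw / dw;
    const double sy = (double)sh / dh;

    for (int y = 0; y < dh; ++y) {
        /* Map the destination pixel center back into source space. */
        double fy = (y + 0.5) * sy - 0.5;
        if (fy < 0.0) fy = 0.0;
        if (fy > sh - 1) fy = sh - 1;
        int y0 = (int)fy;
        int y1 = (y0 + 1 < sh) ? y0 + 1 : y0;
        double wy = fy - y0;

        for (int x = 0; x < dw; ++x) {
            double fx = (x + 0.5) * sx - 0.5;
            if (fx < 0.0) fx = 0.0;
            if (fx > sw - 1) fx = sw - 1;
            int x0 = (int)fx;
            int x1 = (x0 + 1 < sw) ? x0 + 1 : x0;
            double wx = fx - x0;

            for (int c = 0; c < channels; ++c) {
                /* Blend the four neighbors: along x first, then along y. */
                double top = src[((size_t)y0 * sw + x0) * channels + c] * (1.0 - wx)
                           + src[((size_t)y0 * sw + x1) * channels + c] * wx;
                double bot = src[((size_t)y1 * sw + x0) * channels + c] * (1.0 - wx)
                           + src[((size_t)y1 * sw + x1) * channels + c] * wx;
                dst[((size_t)y * dw + x) * channels + c] =
                    (uint8_t)(top * (1.0 - wy) + bot * wy + 0.5);
            }
        }
    }
}
```

Each output pixel reads only four source pixels, so when shrinking by a large factor a single bilinear pass skips most of the source and can alias. Halving repeatedly or averaging a box of source pixels is the usual remedy in that case.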
Do you really need to use DirectX? GDI+ does the job well for resizing images. In DirectX, you don't really need to resize images, as most likely you'll be displaying your images as textures. Since textures can only be applied to 3D objects (triangles/polygons/meshes), the size of the 3D object and the viewport determine the actual image size displayed. If you need to scale your texture within the 3D object, just adjust the texture coordinates or the texture matrix.
To manipulate the texture, you can use alpha blending, masking and all sorts of texture manipulation techniques, if that's what you're looking for. To manipulate individual pixels as in GDI+, I still think GDI+ is the way to go. DirectX was never meant to do image manipulation.