I'm using Metal to display camera frames. I convert the output sample buffers into id<MTLTexture>s using CVMetalTextureCacheCreateTextureFromImage, and that works great... except that the frames arrive rotated 90 degrees clockwise.
How can I rotate the id<MTLTexture> 90 degrees counterclockwise?
I considered doing this when I display the texture (in the MTKView), but then it will still be rotated the wrong way when I record the video.
You have at least a couple of different options here.
The easiest is probably requesting "physically" rotated frames from AVFoundation. Assuming you're using an AVCaptureConnection, you can use the setVideoOrientation API (when available) to specify that you want frames to be rotated before they're delivered. Then, displaying and recording can be handled in a uniform way with no further work on your part.
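For example, a minimal Swift sketch (it assumes `videoOutput` is the AVCaptureVideoDataOutput you already added to your capture session):

```swift
import AVFoundation

// Assumption: `videoOutput` is your already-configured AVCaptureVideoDataOutput.
if let connection = videoOutput.connection(with: .video),
   connection.isVideoOrientationSupported {
    // Ask AVFoundation to rotate the buffers to portrait before delivery.
    connection.videoOrientation = .portrait
}
```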
Alternatively, you could apply a rotation transform both when drawing the frame into the Metal view and when recording a movie. It sounds like you already know how to do the former. The latter just amounts to setting the transform property on your AVAssetWriterInput, assuming that's what you're using. Similar APIs exist on lower-level AVFoundation classes such as AVMutableComposition.
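For instance, a hedged sketch (it assumes `writerInput` is your video AVAssetWriterInput; the exact angle and sign depend on how your frames actually arrive, so adjust if the recording comes out the wrong way):

```swift
import AVFoundation
import CoreGraphics

// Assumption: `writerInput` is the AVAssetWriterInput you append video samples to.
// The transform is metadata only: players apply the rotation at display time.
writerInput.transform = CGAffineTransform(rotationAngle: .pi / 2)  // 90 degrees
```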
Related
I only know a little about OpenGL ES 2.0, like drawing two triangles to make a rect, or a cube, and a little about vertex and fragment shaders, not much. I have no idea how to handle this.
I shot a 360-degree video. How am I supposed to play it on iOS? The features should be: you can move your phone or drag from one direction to another, so you can watch the video from different viewpoints.
The effect should be like Kolor Eyes.
I think the steps are:
get each frame from the video (the original looks like a sphere)
process the frames one by one so that each can be viewed as a panorama.
Hope somebody can help me out. Thanks a lot.
The problem isn't tied to iOS or any other specific platform; it is first of all an algorithmic question: how do you convert the pixels from the spherical source view into a panoramic view? My best guess is something like a transfer function that takes pixel a at position A in the source image and maps it to a corresponding pixel b at position B in the destination image.
Maybe you should check the basics of texture mapping, which is a common technique for mapping an image onto an arbitrary surface.
Just as an idea: the source is a radial view ranging from 0° to 360°, so what you need is to transfer this into a view where the angle increases horizontally from 0° to 360°. Each source pixel would need an angle and a distance. Given these two properties, you could write a function that maps it into the other view.
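A very rough sketch of that mapping (a hypothetical helper; it assumes the source is a circular, radial image centered at cx, cy with the given radius, which may not match your footage exactly):

```swift
import Foundation

// For every destination pixel (u, v) of the panorama, find the source pixel
// in the circular (radial) image: the horizontal position becomes the angle,
// the vertical position becomes the distance from the circle's center.
func sourcePixel(forPanoramaX u: Int, y v: Int,
                 panoWidth: Int, panoHeight: Int,
                 centerX cx: Double, centerY cy: Double,
                 radius: Double) -> (x: Double, y: Double) {
    let angle = 2.0 * Double.pi * Double(u) / Double(panoWidth)   // 0°...360°
    let distance = radius * Double(v) / Double(panoHeight)        // 0...radius
    return (x: cx + distance * cos(angle),
            y: cy + distance * sin(angle))
}
```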
I'm doing some work with a camera and video stabilization with OpenCV.
Let's suppose I know exactly (in meters) how much my camera has moved from one frame to another and I want to use this to return the second frame where it should be.
I'm sure I have to do some math with this number before I build the translation matrix, but I'm a little lost there... Any help?
Thanks.
EDIT: OK, I'll try to explain it better:
I want to remove the camera movement (shaking) from a video, and I know how much the camera has moved (and in which direction) from one frame to the next.
So what I want to do is move the second frame back to where it should be, using that information.
I have to build a translation matrix for each pair of frames and apply it to the second frame.
But here is where I'm unsure: the information I have is in meters and describes the movement of the camera, while now I'm working with an image and pixels, so I think I have to do some extra operations for the translation to be correct, but I'm not sure exactly what they are.
Knowing how much the camera has moved is not enough for creating a synthesized frame. For that you'll need the 3D model of the world as well, which I assume you don't have.
To demonstrate that, assume the camera movement is pure translation and you are looking at two objects, one very far away - a few kilometers - and the other very close - a few centimeters. The very far object will hardly move in the new frame, while the very close one can move dramatically or even disappear from the second frame's field of view. You need to know how much the viewing angle has changed for each point, and for that you need the 3D model.
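To make that concrete with a simple pinhole-camera approximation (all numbers below are made up for illustration): the pixel shift produced by a pure sideways translation depends on each point's depth, which is exactly what a single translation value cannot tell you.

```swift
// Pinhole approximation: a sideways camera translation t (meters) shifts the
// projection of a point at depth Z (meters) by roughly f * t / Z pixels,
// where f is the focal length expressed in pixels.
func pixelShift(translationMeters t: Double, depthMeters Z: Double, focalLengthPixels f: Double) -> Double {
    return f * t / Z
}

// A 5 cm camera move (f = 1400 px, made-up value):
let nearShift = pixelShift(translationMeters: 0.05, depthMeters: 2, focalLengthPixels: 1400)   // ≈ 35 px
let farShift  = pixelShift(translationMeters: 0.05, depthMeters: 200, focalLengthPixels: 1400) // ≈ 0.35 px
```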
Having sensor information may help in the case of rotation but it is not as useful for translations.
In my application I need to play video in an unusual way.
Something like an interactive player for special purposes.
The main requirements are:
video resolution can range from 200×200 px up to 1024×1024 px
I need the ability to change the speed from -60 FPS to 60 FPS (the video should play slower or faster depending on the selected speed; a negative value means the video should play backwards)
I need to draw lines and objects over the video and scale them together with the image
I need the ability to zoom the image and pan it if its content is larger than the screen
I need the ability to change the brightness and contrast of the video and to invert its colors
Here is what I'm doing now:
I split my video into JPG frames
created a timer that fires N times per second (playback speed control)
on each timer tick I draw a new texture (the next JPG frame) with OpenGL
for zoom and pan I use OpenGL ES transformations (translate, scale)
Everything looks fine as long as I use 320×240 px, but with 512×512 px the play rate drops. Maybe it's a timer behaviour problem, maybe OpenGL. Sometimes, if I try to open big textures at a high play rate (more than 10-15 FPS), the application just crashes with memory warnings.
What is the best practice to solve this issue? What direction should I dig in? Would Cocos2D or another game engine help me? Maybe JPG is not the best format for textures and I should use PNG or PVR or something else?
Keep the video data as a video and use AVAssetReader to get the raw frames. Use kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange as the pixel format, and do the YUV->RGB colorspace conversion in GLES. This means keeping less data in memory, and it makes much of your image processing somewhat simpler (since you'll be working with luma and chroma data rather than RGB values).
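A minimal sketch of that reading loop (Swift; the GL upload itself is left to you):

```swift
import AVFoundation

// Read decoded frames as biplanar YUV pixel buffers (one luma plane, one chroma plane).
func readFrames(from videoURL: URL, handle: (CVPixelBuffer) -> Void) throws {
    let asset = AVURLAsset(url: videoURL)
    let track = asset.tracks(withMediaType: .video)[0]
    let reader = try AVAssetReader(asset: asset)

    let settings: [String: Any] = [
        kCVPixelBufferPixelFormatTypeKey as String:
            kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange
    ]
    let output = AVAssetReaderTrackOutput(track: track, outputSettings: settings)
    reader.add(output)
    reader.startReading()

    while let sample = output.copyNextSampleBuffer(),
          let pixelBuffer = CMSampleBufferGetImageBuffer(sample) {
        // Plane 0 is luma (Y), plane 1 is interleaved chroma (CbCr);
        // upload each to its own texture and convert YUV -> RGB in the shader.
        handle(pixelBuffer)
    }
}
```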
You don't need to bother with Cocos 2d or any game engine for this. I strongly recommend doing a little bit of experimenting with OpenGL ES 2.0 and shaders. Using OpenGL for video is very simple and straightforward, adding a game engine to the mix is unnecessary overhead and abstraction.
When you upload image data to the textures, do not create a new texture every frame. Instead, create two textures: one for luma, and one for chroma data, and simply reuse those textures every frame. I suspect your memory issues are arising from using many images and new textures every frame and probably not deleting old textures.
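A rough illustration of that reuse pattern for the luma texture (Swift calling the OpenGL ES C API; the chroma texture would be the same with GL_LUMINANCE_ALPHA for the interleaved CbCr plane):

```swift
import OpenGLES

// Create the luma texture once, sized to the video's dimensions.
func makeLumaTexture(width: GLsizei, height: GLsizei) -> GLuint {
    var texture: GLuint = 0
    glGenTextures(1, &texture)
    glBindTexture(GLenum(GL_TEXTURE_2D), texture)
    glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_MIN_FILTER), GL_LINEAR)
    glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_MAG_FILTER), GL_LINEAR)
    // Allocate storage once; no pixel data yet.
    glTexImage2D(GLenum(GL_TEXTURE_2D), 0, GL_LUMINANCE, width, height, 0,
                 GLenum(GL_LUMINANCE), GLenum(GL_UNSIGNED_BYTE), nil)
    return texture
}

// Every frame, only replace the contents of the existing texture.
func uploadLumaPlane(_ bytes: UnsafeRawPointer, to texture: GLuint,
                     width: GLsizei, height: GLsizei) {
    glBindTexture(GLenum(GL_TEXTURE_2D), texture)
    glTexSubImage2D(GLenum(GL_TEXTURE_2D), 0, 0, 0, width, height,
                    GLenum(GL_LUMINANCE), GLenum(GL_UNSIGNED_BYTE), bytes)
}
```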
JPEG frames will be incredibly expensive to uncompress. First step: use PNG.
But wait! There's more.
Cocos2D could help you mostly through its great support for sprite sheets.
The biggest help, however, may come from packed textures a la TexturePacker. Using PVR.CCZ compression can speed things up by insane amounts, enough for you to get better frame rates at bigger video sizes.
Vlad, the short answer is that you will likely never be able to get all of the features you have listed working at the same time. Playing 1024×1024 video at 60 FPS is really going to be a stretch; I highly doubt that iOS hardware will be able to keep up with that kind of data transfer rate at 60 FPS. Even the H.264 hardware on the device can only do 30 FPS at 1080p. It might be possible, but layering graphics rendering over the video and also expecting to be able to edit the brightness/contrast at the same time is just too many things at once.
You should focus on what is actually possible instead of attempting to do every feature. If you want to see an example Xcode app that pushes iPad hardware right to the limits, have a look at my Fireworks example project. This code displays multiple already-decoded H.264 videos on screen at the same time. The implementation is built around CoreGraphics APIs, but the key thing is that Apple's implementation of texture uploading to OpenGL is very fast because of a zero-copy optimization. With this approach, a lot of video can be streamed to the device.
My problem is simple: I have to process each frame of a video. The processing computes a zone to crop in the original frame. To get better performance, I have to downscale the original frame. Currently this is done with a dedicated library, but it is slow. We are wondering whether there is any possibility of downscaling the frame with OpenGL ES 2.0 GLSL instead.
David
If you're using AV Foundation to load the video from disk or to pull video from the camera, you could use my open source GPUImage framework to handle the underlying OpenGL ES processing for you.
Specifically, you can use a GPUImageCropFilter to crop out a selected region of the input video using normalized 0.0-1.0 coordinates in a CGRect. The FilterShowcase example shows how this works in practice for live video from the camera. With this, you don't need to touch any manual OpenGL ES API calls if you don't want to.
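A rough Swift sketch of that pipeline (assuming the Objective-C GPUImage framework is bridged into Swift; the names follow its headers, but treat the exact initializers as approximate and check FilterShowcase for the authoritative version):

```swift
import GPUImage
import AVFoundation
import UIKit

// Camera -> crop -> on-screen preview. The crop region is in normalized 0.0-1.0 coordinates.
let camera = GPUImageVideoCamera(sessionPreset: AVCaptureSession.Preset.vga640x480.rawValue,
                                 cameraPosition: .back)
let crop = GPUImageCropFilter(cropRegion: CGRect(x: 0.25, y: 0.25, width: 0.5, height: 0.5))
let preview = GPUImageView(frame: CGRect(x: 0, y: 0, width: 320, height: 240))

camera.addTarget(crop)
crop.addTarget(preview)
camera.startCameraCapture()
```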
Finally, I will use a framebuffer object to render my texture. I will set the viewport to the desired size and render my texture as usual. To get the downsampled image back, I will use glReadPixels.
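A rough sketch of what I mean (GLES 2.0 from Swift; the actual draw of the full-screen quad with the usual passthrough shader is omitted):

```swift
import OpenGLES

// Render the full-size texture into a smaller offscreen target, then read it back.
func downscale(texture sourceTexture: GLuint,
               toWidth w: GLsizei, height h: GLsizei,
               into pixels: UnsafeMutableRawPointer) {
    // Color attachment at the reduced size.
    var smallTexture: GLuint = 0
    glGenTextures(1, &smallTexture)
    glBindTexture(GLenum(GL_TEXTURE_2D), smallTexture)
    glTexImage2D(GLenum(GL_TEXTURE_2D), 0, GL_RGBA, w, h, 0,
                 GLenum(GL_RGBA), GLenum(GL_UNSIGNED_BYTE), nil)

    // Framebuffer object that renders into that texture.
    var fbo: GLuint = 0
    glGenFramebuffers(1, &fbo)
    glBindFramebuffer(GLenum(GL_FRAMEBUFFER), fbo)
    glFramebufferTexture2D(GLenum(GL_FRAMEBUFFER), GLenum(GL_COLOR_ATTACHMENT0),
                           GLenum(GL_TEXTURE_2D), smallTexture, 0)

    // Viewport at the target size; draw the source texture as a full-screen quad here.
    glViewport(0, 0, w, h)
    glBindTexture(GLenum(GL_TEXTURE_2D), sourceTexture)
    // ... draw the quad with the usual passthrough shader ...

    // Read the downsampled image back to CPU memory.
    glReadPixels(0, 0, w, h, GLenum(GL_RGBA), GLenum(GL_UNSIGNED_BYTE), pixels)

    glDeleteFramebuffers(1, &fbo)
    glDeleteTextures(1, &smallTexture)
}
```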
David
I have a complex pre-rendered scene which I would like to use as a backdrop in a 3D iPad game which uses a static camera.
On each frame redraw, the screen will be cleared to this background. The part I do not know how to do is setting the depth buffer to the one stored in this pre-rendered image, so that dynamic 3D objects will respect the depth information in said image.
Is there any way to achieve this on an iPad, using OpenGL ES 2.0?
I looked into several approaches, but could not find anything suitable so far.