I saw that a few video players (e.g. AVPlayerHD) are doing hardware-accelerated playback on iOS for unsupported containers like MKV. How do you think they're achieving that?
I'm thinking of reading packets with ffmpeg and decoding with Core Video. Does that make sense? I'm trying to achieve the same.
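For concreteness, here is a minimal libavformat demuxing sketch of that idea. It assumes the MKV already carries an H.264 track (the common case, so only the container is unsupported) and leaves the hardware-decode side out; on iOS that side would be Video Toolbox (a VTDecompressionSession) fed with the compressed packets, with Core Video supplying the output pixel buffers.

    // Demux an MKV with libavformat and locate its H.264 track (sketch only,
    // most error handling omitted). On very old ffmpeg versions you would also
    // call av_register_all() first.
    extern "C" {
    #include <libavformat/avformat.h>
    }

    int main(int argc, char **argv) {
        AVFormatContext *fmt = nullptr;
        if (avformat_open_input(&fmt, argv[1], nullptr, nullptr) < 0)
            return 1;
        avformat_find_stream_info(fmt, nullptr);

        // Find the video stream and check that it is H.264.
        int video_index = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, nullptr, 0);
        if (video_index < 0 ||
            fmt->streams[video_index]->codecpar->codec_id != AV_CODEC_ID_H264)
            return 2;  // would need software decoding or transcoding instead

        AVPacket *pkt = av_packet_alloc();
        while (av_read_frame(fmt, pkt) >= 0) {
            if (pkt->stream_index == video_index) {
                // pkt->data / pkt->size hold a compressed H.264 access unit.
                // On iOS this is where you would wrap it in a CMSampleBuffer
                // and submit it to a VTDecompressionSession (hardware decode).
            }
            av_packet_unref(pkt);
        }
        av_packet_free(&pkt);
        avformat_close_input(&fmt);
        return 0;
    }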
Thanks!
I think the HW accelerators for video rendering (decoding) support a fixed set of formats, due to hard-wired logic. I don't know of any HW accelerator that can transcode from MKV.
Another way to accelerate video playback would be to use OpenCL and make use of the integrated GPU on your device. This method enables HW acceleration for a wider range of applications.
The problem with this approach is that unless you are lucky enough to find a framework that uses OpenCL for GPU-accelerated transcoding/decoding, you will probably need to implement it yourself.
Added info
To implement a fully HW-accelerated solution, you first need to transcode the MKV into H.264 plus subtitles, and from there you can use the HW decoder to render the H.264 component.
For the HW-accelerated transcode operations you could use the GPU (via OpenCL) and/or multithreading.
I found https://handbrake.fr/, which might have some OpenCL transcoding features.
Cheers!
I'm trying to figure out whether there is a way to configure OpenCV 4.5.4, which uses the FFMPEG back-end, to write a video file (via VideoWriter) with the h264_v4l2m2m codec instead of h264. The difference between these two codecs on the ffmpeg side is that h264_v4l2m2m uses hardware support to encode frames into the video file.
When using the ffmpeg tool directly from the command line (Linux), the codec can be chosen with the -vcodec argument; however, I don't see a way to accomplish the same in OpenCV, and it seems to me that it just uses h264.
I notice this from CPU usage: the h264 codec uses all CPU cores, while h264_v4l2m2m takes very little CPU because it offloads the encoding work to hardware.
Thus, ffmpeg by itself works fine. The question is: how do I achieve the same via OpenCV?
EDIT (Feb 2022): At this point in time this is not supported/tested on the RPi 4, as stated by the dev team in this comment.
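One workaround, if building OpenCV with the GStreamer back-end is an option, is to name the hardware encoder element explicitly in the pipeline string passed to VideoWriter. This is a sketch only, not a confirmed way to select h264_v4l2m2m through the FFMPEG back-end; it assumes a GStreamer-enabled OpenCV build and that the platform exposes its encoder as the v4l2h264enc element (as the Raspberry Pi does).

    #include <opencv2/opencv.hpp>

    int main() {
        // Assumption: OpenCV was built with GStreamer support and the board's
        // hardware H.264 encoder is reachable through v4l2h264enc.
        std::string pipeline =
            "appsrc ! videoconvert ! v4l2h264enc ! h264parse ! matroskamux "
            "! filesink location=out.mkv";

        cv::Size frame_size(1280, 720);
        double fps = 30.0;

        cv::VideoWriter writer(pipeline, cv::CAP_GSTREAMER, 0 /*fourcc unused*/,
                               fps, frame_size, true /*isColor*/);
        if (!writer.isOpened())
            return 1;

        cv::VideoCapture cap(0);                 // any frame source will do
        cv::Mat frame;
        while (cap.read(frame)) {
            cv::resize(frame, frame, frame_size);
            writer.write(frame);                 // frames go to the HW encoder
        }
        return 0;
    }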
I would like to stream an .avi container without using any codec in the encoding process; that is, I do not want it to encode to H264 or H265, just upload the video without encoding it. I am using the Azure Media Services SDK in .NET.
The presets that Azure Media Services provides in its SDK all use H264 or H265 to encode and return an MP4. I just want to upload the .avi, see whether it is possible to apply no compression, and then download the .avi.
Thanks!
Adding the answer here. It looks like you want to do a lossless, or near-lossless, encoding pass using CRF (constant rate factor) encoding. There is currently no support for setting CRF encoding in the standard encoder in AMS, but there is work under way to add CRF encoding settings to the SDK in the near future.
For now, you are limited to the settings available in the Transform preset in the H264 or H265 Layers.
You can see all of the available encoding settings most easily in the REST API:
https://github.com/Azure/azure-rest-api-specs/blob/main/specification/mediaservices/resource-manager/Microsoft.Media/stable/2021-06-01/Encoding.json
Or look at the Transform object in your favorite SDK. Look at the H264Video and H264Layer classes in the model, as well as the H265 equivalents, for the settings you can control in your code.
https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure.management.media.models.h264video?view=azure-dotnet
https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure.management.media.models.h264layer?view=azure-dotnet
UPDATE: The SDK for .NET is now available with an exposed RateControlMode for H264 encoding, enabling two new rate-control modes: CBR (Constant Bit Rate) and CRF (Constant Rate Factor).
See https://www.nuget.org/packages/Microsoft.Azure.Management.Media
I am working on my final project for my Software Engineering B.Sc. Our project involves tracking the ball in a foosball game. Given the size of the foosball table, I will need at least an HD 1080p (1920x1080 pixels) camera, and because of the high ball speed I will also need 60 fps.
I will use the open-source OpenCV library to write code in C/C++ and detect the ball in each received frame.
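As a rough sketch of what that per-frame detection could look like (the colour range and blob filtering below are placeholders that would have to be tuned for the actual table, ball, and lighting):

    #include <opencv2/opencv.hpp>

    int main() {
        cv::VideoCapture cap(0);
        cap.set(cv::CAP_PROP_FRAME_WIDTH, 1920);
        cap.set(cv::CAP_PROP_FRAME_HEIGHT, 1080);
        cap.set(cv::CAP_PROP_FPS, 60);            // only honored if the camera supports it

        cv::Mat frame, hsv, mask;
        while (cap.read(frame)) {
            // Placeholder colour range for a bright/white ball; tune for your lighting.
            cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);
            cv::inRange(hsv, cv::Scalar(0, 0, 200), cv::Scalar(180, 40, 255), mask);

            // Take the largest blob as the ball candidate.
            std::vector<std::vector<cv::Point>> contours;
            cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
            double best_area = 0;
            cv::Point2f center;
            float radius = 0;
            for (const auto &c : contours) {
                double a = cv::contourArea(c);
                if (a > best_area) {
                    best_area = a;
                    cv::minEnclosingCircle(c, center, radius);
                }
            }
            if (best_area > 0)
                cv::circle(frame, center, (int)radius, cv::Scalar(0, 255, 0), 2);

            cv::imshow("tracking", frame);
            if (cv::waitKey(1) == 27) break;      // Esc to quit
        }
        return 0;
    }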
So here is my issue: I need to get a stream from an HD camera at 60 fps, wide-angled.
I can't use a webcam, because it will not give me HD at 60 fps (webcams can't do this, even expensive Logitech or Microsoft models, whatever is written on the package; in practice they mean low resolution at 60 fps OR HD at 30 fps). They are also not wide-angled.
On the other hand, I would like to use a web camera because it is easy to get a stream out of it.
The preferred solution is to use an action camera (something like a GoPro but a cheaper version; I have an AEE S70, about $120). I can use the HDMI output of this camera to stream data to the PC, but I can't use USB, because the camera is then recognized as a mass storage device. It has a micro-HDMI output, but I have no HDMI input on my PC.
The question is whether it is possible to find a cheap capture device (HDMI -> USB 3.0 / PCI Express) that can stream 1080p 60 fps frames from this action camera to the PC via HDMI. What device should I use? Or can you suggest another camera or a better solution?
Thanks
I've been looking into this for a sports application (Kinovea). It is virtually impossible to find 1080p @ 60 fps due to the limits of USB 2.0 bandwidth. Actually, even at lower bandwidths the camera needs to perform on-board compression.
The closest camera I found is the ELP-USBFHD01M; it's from a Chinese manufacturer and can do 720p @ 60 fps on the MJPEG stream. I've written a full review in the following blog post.
The nice thing about this camera for computer vision is that it has a removable M12 lens, so you can use a wide-angle lens if you want. They sell various versions of the board with pre-mounted lenses of 140°, 180°, etc.
The MJPEG format means you'll have to decompress on the fly if you want to process each image, though.
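If the frames are grabbed with OpenCV (as in the question), that decompression happens transparently once you request the MJPG mode from the capture back-end. A sketch, assuming the camera exposes its 720p @ 60 fps MJPG mode through V4L2:

    #include <opencv2/opencv.hpp>

    int main() {
        cv::VideoCapture cap(0, cv::CAP_V4L2);   // assumption: Linux/V4L2; use the default back-end elsewhere
        // Ask the driver for the compressed MJPG stream; OpenCV then decodes
        // each JPEG frame to a BGR cv::Mat on the fly as you read it.
        cap.set(cv::CAP_PROP_FOURCC, cv::VideoWriter::fourcc('M', 'J', 'P', 'G'));
        cap.set(cv::CAP_PROP_FRAME_WIDTH, 1280);
        cap.set(cv::CAP_PROP_FRAME_HEIGHT, 720);
        cap.set(cv::CAP_PROP_FPS, 60);

        cv::Mat frame;
        while (cap.read(frame)) {
            // frame is already decompressed here; hand it to the tracking code.
        }
        return 0;
    }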
Other solutions we have explored are USB 3.0 cameras, but as you mention they aren't cheap, and for me the fact that they don't do on-board compression was a drawback for fast recording to disk.
Another option I haven't had time to fully investigate is HD capture cards for gamers, such as AVerMedia's. These cards supposedly capture HD at high speed and can stream it to main memory.
Do you really need real-time processing? If you could perform the tracking on video files recorded by other means, you could even use 120 fps files from the GoPro and get better results.
Your choice of 1080p at 60 fps is good for the tracking application, and as you said, most webcams don't support such high resolution/frame-rate combinations. Instead of going for an HDMI -> USB 3.0 / PCI Express converter for your AEE S70 (which will add latency, cost, and time to your solution), you can check the See3CAM_CU30, which streams 1080p60 uncompressed data over USB 3.0 off the shelf. It also costs about the same as your AEE S70.
Is anybody successfully using OpenGL ES 2.0 shaders (GLSL) for audio synthesis?
I already use vDSP to accelerate audio in my iOS app; it provides a simple set of vector operations callable from C code. The main problem with vDSP is that you have to write what amounts to vector-oriented assembly language, because the main per-sample loop gets pushed down into each primitive operation (vector add, vector multiply). Compiling expressions into these sequences is the essence of what shader languages automate for you. OpenCL is not public on iOS. It is also interesting that GLSL is compiled at runtime, which means that if most of the sound engine could live in GLSL, users could make non-trivial patch contributions.
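To illustrate the vector-oriented style: even a trivial per-sample expression such as out = a*b + c has to be spelled out as whole-buffer primitives, one vDSP call per operation. A minimal sketch:

    #include <Accelerate/Accelerate.h>

    // Compute out[i] = a[i] * b[i] + c[i] for a whole audio buffer.
    // Each arithmetic step is its own vDSP call over the full vector, which is
    // why larger expressions end up reading like hand-written vector assembly.
    void mul_add(const float *a, const float *b, const float *c,
                 float *out, vDSP_Length n) {
        vDSP_vmul(a, 1, b, 1, out, 1, n);   // out = a * b
        vDSP_vadd(out, 1, c, 1, out, 1, n); // out = out + c
    }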
Although iOS GPU shaders can be relatively "fast", the paths to load data into and recover data (textures, processed pixels, etc.) from the GPU are slow enough to more than offset any current computational efficiencies from using GLSL.
For real-time synthesis, the latencies of the GPU pixel readback path are much larger than the best possible audio response latency using pure CPU synthesis to feed RemoteIO. For example, display frame rates (to which the GPU pipeline is locked) are slower than optimal RemoteIO callback rates: a 60 Hz display frame is about 16.7 ms, whereas a 256-sample RemoteIO buffer at 44.1 kHz is only about 5.8 ms. There's also just not enough parallelism to exploit within such short audio buffers.
I am currently working on a webcam streaming server project that requires dynamically adjusting the stream's bitrate according to the client's capabilities (screen size, processing power, ...) or the network bandwidth. The encoder is ffmpeg, since it's free and open source, and the codec is MPEG-4 Part 2. We use live555 for the server part.
How can I encode MBR MPEG-4 videos using ffmpeg to achieve this?
The multi-bitrate video you are describing is called "Scalable Video Coding" (SVC). See this wiki link for a basic understanding.
Basically, in a scalable video codec, the base-layer stream is itself completely decodable; additional information is then represented in the form of one or more enhancement streams. There are a couple of techniques for doing this, including scaling the resolution, the frame rate, or the quantization. The following papers explain scalable video coding for MPEG-4 and H.264 in detail, respectively. Here is another good paper that explains what you intend to do.
Unfortunately, this is broadly a research topic, and to date the open-source encoders (ffmpeg and xvid) do not support such multi-layer encoding. I suspect commercial encoders don't support it either; it is significantly complex. You could check whether the reference encoder for H.264 supports it.
The alternative (but CPU-expensive) way would be to transcode in real time while transmitting the packets. In this case, you should start off with reasonably good quality. If you are using FFMPEG via its API, this should not be a problem. Switching resolutions on the fly can still be messy, but you can keep changing the target encoding rate.
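If you take that route with the ffmpeg libraries, the target rate is just a field on the encoder context. The sketch below opens ffmpeg's MPEG-4 Part 2 encoder (named "mpeg4") at a given bitrate; adapting the rate mid-stream would, in the simplest approach, mean freeing this context and opening a new one with a different value.

    extern "C" {
    #include <libavcodec/avcodec.h>
    }

    // Open ffmpeg's MPEG-4 Part 2 encoder at a given target bitrate (sketch).
    static AVCodecContext *open_mpeg4_encoder(int width, int height,
                                               int fps, int64_t bit_rate) {
        const AVCodec *codec = avcodec_find_encoder_by_name("mpeg4");
        if (!codec) return nullptr;

        AVCodecContext *ctx = avcodec_alloc_context3(codec);
        ctx->width     = width;
        ctx->height    = height;
        ctx->time_base = AVRational{1, fps};
        ctx->pix_fmt   = AV_PIX_FMT_YUV420P;
        ctx->bit_rate  = bit_rate;              // the knob you keep adjusting
        ctx->gop_size  = fps;                   // one keyframe per second eases rate switches

        if (avcodec_open2(ctx, codec, nullptr) < 0) {
            avcodec_free_context(&ctx);
            return nullptr;
        }
        return ctx;
    }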