Bitrate quality level for a specific device - adaptive-bitrate

I'm looking at the MediaConvert service from AWS to transcode videos. The value I'm trying to set right now is the quality level (QL) for QVBR; according to this, it can depend on the platform: for example, for 720p/1080p resolution it proposes QL = 8/9 for TV, QL = 7 for tablet, and QL = 6 for smartphone.
In fact, the app has a version for each of the three device types, so my question is: do I need to keep three versions of the same video? I want to save some money on streaming; my app has a similar number of users on each platform, and I want to save bandwidth while still providing good-quality video.

Higher QVBR quality levels (QL) correspond to higher bitrates in the output.
For a large display such as a TV, a higher QVBR QL is recommended to help improve the viewer experience. But when viewing the same content on a smaller display such as a phone, you may not need all of those extra bits to have a good experience.
In general, it's recommended to create an output targeted for each of the various devices or resolutions content will be viewed on. This will help save bandwidth for the smaller devices while still delivering high quality for the larger ones.
This concept is referred to as Adaptive Bitrate (ABR) Streaming, and is a common feature of streaming formats such as HLS and DASH (among others). The MediaConvert documentation has a section on how to create ABR outputs as well: https://docs.aws.amazon.com/mediaconvert/latest/ug/video-abr-streaming-outputs.html
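In practice this means one MediaConvert job with several outputs, each with its own QVBR quality level and bitrate cap. A minimal, abridged sketch of the per-output settings, following the QL values from the question (the resolutions and MaxBitrate values here are illustrative, not recommendations):

"Outputs": [
  { "NameModifier": "_tv", "VideoDescription": { "Width": 1920, "Height": 1080,
      "CodecSettings": { "Codec": "H_264", "H264Settings": {
        "RateControlMode": "QVBR", "QvbrSettings": { "QvbrQualityLevel": 9 },
        "MaxBitrate": 6000000 } } } },
  { "NameModifier": "_tablet", "VideoDescription": { "Width": 1280, "Height": 720,
      "CodecSettings": { "Codec": "H_264", "H264Settings": {
        "RateControlMode": "QVBR", "QvbrSettings": { "QvbrQualityLevel": 7 },
        "MaxBitrate": 3000000 } } } },
  { "NameModifier": "_phone", "VideoDescription": { "Width": 1280, "Height": 720,
      "CodecSettings": { "Codec": "H_264", "H264Settings": {
        "RateControlMode": "QVBR", "QvbrSettings": { "QvbrQualityLevel": 6 },
        "MaxBitrate": 1500000 } } } }
]

Packaged as a single HLS or DASH output group, these renditions form one ABR ladder, so each device pulls the variant that fits it rather than you maintaining three unrelated copies of the video.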

Related

The sound quality of slow playback using AVPlayer is not good enough even when using AVAudioTimePitchAlgorithmSpectral

In iOS, playback rate can be changed by setting AVPlayer.rate.
When AVPlayer.rate is set to 0.5, playback becomes slow.
By default, the sound quality of playback at a 0.5 rate is terrible.
To increase the quality, you need to set AVPlayerItem.audioTimePitchAlgorithm.
According to the API documentation, setting AVPlayerItem.audioTimePitchAlgorithm to AVAudioTimePitchAlgorithmSpectral gives the highest quality.
The Swift code, on an AVPlayerItem instance, is:
playerItem.audioTimePitchAlgorithm = AVAudioTimePitchAlgorithm.spectral // AVAudioTimePitchAlgorithmSpectral
AVAudioTimePitchAlgorithmSpectral improves the quality over the default.
But the sound quality with AVAudioTimePitchAlgorithmSpectral is still not good enough.
The sound still echoes, and it is stressful to listen to.
In Apple's Podcasts app, when I set the playback speed to 1/2, playback becomes slow and the sound quality is very high, with no echo at all.
I want my app to provide the same quality as Apple's Podcasts app.
Are there iOS APIs that give much higher sound quality than AVAudioTimePitchAlgorithmSpectral?
If not, why doesn't Apple provide one, even though they achieve it in their own Podcasts app?
Or should I use a third-party library?
Are there good libraries, free or low-priced and widely used, for changing playback speed?
For the last three weeks I've been searching, trying to learn AudioKit and Audio Units, and even considering purchasing a third-party time-stretch audio processing library to fix the quality of slow playback.
Now I've finally found a super easy solution.
AVPlayer can slow down audio with very good quality by setting AVPlayerItem.audioTimePitchAlgorithm
to AVAudioTimePitchAlgorithm.timeDomain instead of AVAudioTimePitchAlgorithm.spectral.
The documentation says:
timeDomain is a modest quality pitch algorithm that is less computationally intensive. Suitable for voice.
This means spectral is suited to music, while timeDomain is suited to voice.
That's why the voice files my app uses were echoing.
And that's why the slowed-down audio in Apple's Podcasts app sounds so good.
It must also use this time-domain algorithm.
And that's why AudioKit, which seems to be developed for music, plays voice audio with poor quality.
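A minimal sketch of the fix (the asset URL is a placeholder):

import AVFoundation

// Hypothetical spoken-word asset; any voice file shows the difference.
let item = AVPlayerItem(url: URL(string: "https://example.com/episode.mp3")!)
item.audioTimePitchAlgorithm = .timeDomain // suits voice; .spectral suits music
let player = AVPlayer(playerItem: item)
player.rate = 0.5 // setting a non-zero rate starts half-speed playback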
I've encountered the same issues with increasing/decreasing speed while maintaining some level of quality. I couldn't get it to work well using Apple's APIs.
In the end I found that it's worth taking a look at this excellent 3rd party framework:
https://github.com/AudioKit/AudioKit
which allows you to do that and much more, in a straightforward manner.
Hope this helps

How to amplify a voice recorded from a far distance

When a person speaks far away from a mobile phone, the recorded voice is quiet; when a person speaks near the phone, the recorded voice is loud. What I want is to play back the human voice at an equal volume no matter how far away (within reason) the speaker was from the phone when the voice was recorded.
What I have already tried:
Adjusting the volume based on the dB level (e.g. via AVAudioPlayer). But the problem is that the dB level includes all the environmental sound, so it only works when the human voice varies heavily.
Then I thought I should find a way to sample the intensity of the human voice in the recording, which led me to voice recognition. But this is a huge topic, and I cannot narrow down which areas could solve my problem.
Voice recorded from a distance suffers from significant corruption. One problem is noise; another is echo. To amplify the voice you need to clean it of echo and noise. Ideally you would do that with a better microphone, but if only a single microphone is available, you have to apply signal processing. The signal-processing algorithms you are interested in are:
Noise cancellation. You can find many examples on Google, from simple to very advanced ones.
Echo cancellation. Again, you can find many implementations.
There is no ready-made library to do the above; you will have to implement a large part yourself. You can look at the WebRTC code, which has both noise and echo cancellation, as described in this question:
Is it possible to reduce background noise while streaming audio on the iPhone?
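For the amplification step itself, here is a minimal sketch of per-buffer gain normalization (a crude automatic gain control) using an AVAudioEngine input tap. The target level and gain clamp are illustrative tuning values, and note that this naive approach boosts environmental noise along with the voice, which is exactly the limitation described above:

import AVFoundation

// Crude AGC: scale each captured buffer toward a target RMS level.
let engine = AVAudioEngine()
let input = engine.inputNode
let format = input.outputFormat(forBus: 0)
let targetRMS: Float = 0.1 // illustrative loudness target

input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
    guard let samples = buffer.floatChannelData?[0] else { return }
    let n = Int(buffer.frameLength)
    guard n > 0 else { return }
    // Measure the buffer's RMS level.
    var sum: Float = 0
    for i in 0..<n { sum += samples[i] * samples[i] }
    let rms = sqrt(sum / Float(n))
    // Scale toward the target, clamped so near-silence doesn't explode the gain.
    let gain = min(targetRMS / max(rms, 1e-6), 20)
    for i in 0..<n { samples[i] *= gain }
    // ...write the adjusted copy out, e.g. to an AVAudioFile...
}
try? engine.start()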

How to equalize all iOS audio streams?

I've decided to write an equalizer for iOS that would allow changing the level of different audio frequencies to improve audibility for people with hearing problems. For example, my left ear has reduced audibility of high frequencies, and I would like to be able to boost the high frequencies in all applications (Skype, YouTube, etc.), including voice calls over the cellular connection. How could this be implemented? Sorry for my bad English.

Streaming Video to PC in HD 60 fps

I am working on my final project for a Software Engineering B.Sc. Our project involves tracking the ball in a foosball game. Given the size of the foosball table, I will need at least an HD 1080p (1920x1080) camera, and because of the ball's high speed I will also need 60 fps.
I will use the open-source OpenCV library, writing code in C/C++, to detect the ball in each received frame.
So here is my issue: I need to get a wide-angle stream from an HD camera at 60 fps.
I can't use a webcam, because webcams don't deliver HD at 60 fps, even expensive Logitech or Microsoft ones. When it's written on the package, they actually mean low resolution at 60 fps OR HD at 30 fps. They are also not wide-angle.
On the other hand, I would like to use a webcam because it is easy to get a stream out of one.
The preferred solution is to use an action camera (something like a GoPro but cheaper; I have an AEE S70, about $120). I can use this camera's HDMI output to stream data to the PC, but I can't use USB, since the camera is then recognized as a mass storage device. It has a micro-HDMI output, but I have no HDMI input on my PC.
The question is whether it is possible to find a cheap capture device (HDMI -> USB 3.0/PCI Express) that can stream 1080p 60 fps frames from this action camera to the PC via HDMI. What device should I use? Or can you suggest another camera or a better solution?
Thanks
I've been looking into this for a sports application (Kinovea). It is virtually impossible to find 1080p @ 60 fps due to the limits of USB 2.0 bandwidth. Actually, even at lower bandwidths the camera needs to perform compression on-board.
The closest camera I found is the ELP-USBFHD01M, from a Chinese manufacturer; it can do 720p @ 60 fps on its MJPEG stream. I've written a full review in the following blog post.
The nice thing about this camera for computer vision is that it has a removable M12 lens, so you can use a wide angle if you want. They sell various versions of the board with pre-mounted lenses of 140°, 180°, etc.
MJPEG format means that you'll have to decompress on the fly if you want to process each image though.
Other solutions we explored were USB 3.0 cameras, but as you mention they aren't cheap, and for me the fact that they don't do on-board compression was a drawback for fast recording to disk.
Another option I haven't had time to fully investigate is the HD capture cards for gamers like AVerMedia. These cards supposedly capture HD at high speed and can stream it to central memory.
Do you really need real-time processing? If you could perform the tracking on video files recorded by other means, you could even use 120 fps files from a GoPro and get better results.
Your choice of 1080p at 60 fps is good for the tracking application, and as you said, most webcams don't support such high resolution/frame-rate combinations. Instead of going for an HDMI -> USB 3.0/PCI Express converter for your AEE S70 (which will increase latency, cost, and the time it takes you to find a solution), you can check the See3CAM_CU30, which streams uncompressed 1080p60 over USB 3.0 off the shelf. It also costs about the same as your AEE S70.

Can I use ffmpeg to create multi-bitrate (MBR) MPEG-4 videos?

I am currently working on a webcam streaming server project that requires dynamically adjusting the stream's bitrate according to the client's settings (screen size, processing power...) or the network bandwidth. The encoder is ffmpeg, since it's free and open source, and the codec is MPEG-4 Part 2. We use live555 for the server part.
How can I encode MBR MPEG-4 videos using ffmpeg to achieve this?
The multi-bitrate video you are describing is called "Scalable Video Coding" (SVC). See this wiki link for a basic understanding.
Basically, in a scalable video codec, the base-layer stream is completely decodable on its own, while additional information is carried in (one or more) enhancement streams. There are a couple of techniques for achieving this, including scaling resolution, frame rate, and quantization. The following papers explain Scalable Video Coding for MPEG-4 and H.264 respectively in detail. Here is another good paper that explains what you intend to do.
Unfortunately, this is broadly a research topic, and to date no open-source encoder (ffmpeg or Xvid) supports such multi-layer encoding. I suspect even commercial encoders don't support it; it is significantly complex. You could check whether the reference encoder for H.264 supports it.
The alternative (but CPU-expensive) way is to transcode in real time while transmitting the packets. In this case, you should start with a source of reasonably good quality. If you are using FFmpeg as an API, this should not be a problem. Handling multiple resolutions can still be messy, but you can keep changing the target encoding bitrate.
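If you instead take the simpler multi-rendition route that most streaming systems use (encode the same source at several fixed bitrates and let the server pick a stream per client), a sketch with ffmpeg looks like this; the file names, sizes, and bitrates are illustrative:

ffmpeg -i input.avi -c:v mpeg4 -b:v 2000k -s 1280x720 out_2000k.mp4
ffmpeg -i input.avi -c:v mpeg4 -b:v 800k -s 640x360 out_800k.mp4
ffmpeg -i input.avi -c:v mpeg4 -b:v 300k -s 320x180 out_300k.mp4

Your live555 server can then switch between renditions on request; this is effectively what ABR formats such as HLS and DASH standardize.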
