How to download 1080p + best audio using youtube-dl?

I'm trying to download 1920x1080 video with 192kb/s audio but I'm unable to extract both.
For example:
format code extension resolution note
249 webm audio only DASH audio 51k , opus # 50k, 600.56KiB
250 webm audio only DASH audio 67k , opus # 70k, 788.31KiB
251 webm audio only DASH audio 132k , opus #160k, 1.53MiB
140 m4a audio only DASH audio 134k , m4a_dash container, mp4a.40.2#128k, 1.52MiB
171 webm audio only DASH audio 135k , vorbis#128k, 1.53MiB
160 mp4 256x144 144p 110k , avc1.42c00c, 12fps, video only, 1.30MiB
278 webm 256x144 144p 134k , webm container, vp9, 12fps, video only, 1.18MiB
133 mp4 426x240 240p 248k , avc1.4d4015, 24fps, video only, 2.91MiB
242 webm 426x240 240p 254k , vp9, 24fps, video only, 2.30MiB
243 webm 640x360 360p 475k , vp9, 24fps, video only, 4.17MiB
134 mp4 640x360 360p 617k , avc1.4d401e, 24fps, video only, 5.00MiB
244 webm 854x480 480p 861k , vp9, 24fps, video only, 7.31MiB
135 mp4 854x480 480p 1110k , avc1.4d401e, 24fps, video only, 9.33MiB
247 webm 1280x720 720p 1691k , vp9, 24fps, video only, 13.90MiB
136 mp4 1280x720 720p 2220k , avc1.4d401f, 24fps, video only, 17.41MiB
248 webm 1920x1080 1080p 3044k , vp9, 24fps, video only, 24.43MiB
137 mp4 1920x1080 1080p 4160k , avc1.640028, 24fps, video only, 31.42MiB
17 3gp 176x144 small , mp4v.20.3, mp4a.40.2# 24k
36 3gp 320x180 small , mp4v.20.3, mp4a.40.2
43 webm 640x360 medium , vp8.0, vorbis#128k
18 mp4 640x360 medium , avc1.42001E, mp4a.40.2# 96k
22 mp4 1280x720 hd720 , avc1.64001F, mp4a.40.2#192k (best)
I'm trying to merge the 1920x1080 video from 137 and 192kb/s audio from 22 (as it's the best available).
-f 'bestvideo[height<=1080]+bestaudio/best[height<=1080]' --merge-output-format mp4
But the audio bitrate of the output file was only 125kb/s.
How can I download the required specifications?
EDIT 1:
Link for the example video: https://www.youtube.com/watch?v=i-c-K3pNtj4
NOTE: I don't know much about audio codecs, but I want to select the best one. If possible, please explain which facts identify the best audio stream, even when it isn't simply the one with the highest bitrate.

First of all, a word about audio quality: As described on our sister site, MP3 becomes indistinguishable from lossless CD quality (transparent) at about 192kb/s with a constant bitrate. However, any modern encoder uses a variable bitrate (VBR), spending more bits on some sections than on others. With VBR, the cutoff is likely a bit lower; with professional ears and equipment, it may be a little higher.
AAC and Vorbis are a generation ahead of MP3. This seems to be the most comprehensive quality test - at least the one I could find. AAC and Vorbis have been claimed to be transparent at 128kb/s, although I'd guess 160kb/s is a more realistic threshold.
Opus is yet another significant improvement, being reasonably good for music at 64kb/s and probably transparent at 128kb/s.
When youtube-dl lists the format quality for YouTube, the quality it lists is hardcoded. Some other supported websites report quality information in advance, but for YouTube we would have to download at least the headers of each file.
I have bad news for your claim to be able to hear the difference between 192kb/s and 128kb/s on this video: All the audio formats offered for it (namely 251, 140, 171 and 22) are encoded at 128kb/s VBR. You can verify this by downloading them (for 22, you need to split off the audio) and comparing file sizes: They're all 1.6MB = 12.8Mbit (conveniently, the video is 100 seconds long).
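The file-size check above is just arithmetic; a quick shell sketch with the numbers from this video:

```shell
# Sanity check: 1.6 MB of audio over a 100-second clip works out to 128 kb/s.
size_bytes=1600000   # ~1.6 MB per audio stream
duration_s=100       # clip length in seconds
kbps=$(( size_bytes * 8 / duration_s / 1000 ))
echo "${kbps} kb/s"
```

The same division applied to any of the four audio downloads lands on the same figure, which is why none of them can actually be 192kb/s.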
In particular, the codecs are Opus (251), Vorbis (171) and AAC (140 and 22). Of these, Opus definitely offers the highest quality. So why does youtube-dl pick Vorbis with bestaudio? The way I originally designed the youtube-dl format selection, it would indeed have picked Opus. But there was significant user feedback that some formats, while of worse quality, are more broadly supported.
Even today, lots of applications are unable to handle Opus, or even Vorbis or AAC and their containers. A high-quality music player such as VLC will support everything, but out of the box, many laptops will be limited; smartphones more so, and smartwatches or smart headphones even more so. This is why most podcasts still serve mp3 files - being unable to play a file at all is a much worse user experience than a slight degradation in audio quality and/or file size. In addition, some of these formats are free while others are not, which brings further problems on systems configured to use only free software.
If you value audio quality above all, you should pick format 251 here. Store your preferences in a configuration file to make them permanent.
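A concrete sketch of both options. Note that Opus does not sit comfortably in an MP4 container, so MKV is the safer merge target here; the URL is the example video from the question, and the config path is youtube-dl's default location on Linux:

```shell
# One-off: best video up to 1080p plus format 251 (Opus),
# falling back to bestaudio if 251 is unavailable.
youtube-dl -f 'bestvideo[height<=1080]+251/bestvideo[height<=1080]+bestaudio' \
  --merge-output-format mkv 'https://www.youtube.com/watch?v=i-c-K3pNtj4'

# Permanent: store the same preference in the configuration file.
echo '-f bestvideo+251/bestvideo+bestaudio' >> ~/.config/youtube-dl/config
```

Format 251 is specific to this video's format list; for a generic preference, ordering codecs in the -f fallback chain is the more portable approach.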
Note that all of this discussion presumes that the original audio source is high-fidelity, ideally lossless. Since the uploader of that video is called MikeTheAnimeRunnerX2, I would not presume expert audio recording skills - although the original singer is credited, so he may have received a high-quality file privately. If the audio uploaded to YouTube was in a lossy format (especially one at the edge of transparency or lower), all YouTube's reencoding can do is minimize additional artifacts.
Note that to non-experts, worse compression can sometimes sound better, especially when the original source is not that good, noisy, or has been degraded by lossy reencodings. This is because worse compression will remove some inaccuracies and may make the sound more "smooth".
Fortunately, youtube-dl gives you the option to test multiple formats. Just download all candidates (e.g. with youtube-dl -f 251 i-c-K3pNtj4, or -f bestvideo+251 to get a video file) and pick the one you like most.

Related

FFMPEG eats all the memory

Probably one of the most cliche questions, but here is the problem: I have an Ubuntu server running as an isolated machine only to handle FFMPEG jobs. It has 4 vCPUs, 4GB RAM and 80GB storage. I am currently using this script to convert a video into an HLS playlist: https://gist.github.com/maitrungduc1410/9c640c61a7871390843af00ae1d8758e This works fine for all videos, including 4K recorded on an iPhone. However, I wanted to add a watermark, so I changed line 106 of the script
from:
cmd+=" ${static_params} -vf scale=w=${widthParam}:h=${heightParam}"
to:
cmd+=" ${static_params} -filter_complex [1]colorchannelmixer=aa=0.5,scale=iw*0.1:-1[wm];[0][wm]overlay=W-w-5:H-h-5[out];[out]scale=w=${widthParam}:h=${heightParam}[final] -map [final]"
Now this works flawlessly for videos from YouTube and other sources, but as soon as I try a 4K video from an iPhone, the RAM usage grows from 250MB to 3.8GB in less than a minute and crashes the entire process. So I looked for some similar questions:
FFmpeg Concat Filter High Memory Usage
https://github.com/jitsi/jibri/issues/269
https://superuser.com/questions/1509906/reduce-ffmpeg-memory-usage-when-joining-videos
ffmpeg amerge Error while filtering: Cannot allocate memory
I understand that FFMPEG requires a high amount of memory, but I am unsure of the exact way to process video without holding the whole stream in memory, releasing memory allocations in real time instead. Even if we work without the watermark, it still hangs around 1.8GB RAM to process a 5-second 4K video, and this creates the risk that if a user uploads a rather long video, it will eventually crash the server. I have thought about ulimit, but that seems like restricting FFMPEG instead of writing an improved command. Let me know how I can tackle this problem. Thanks
Okay, I found a solution. The problem is that the 4K video has a much higher bitrate, and processing the filter_complex loads it into RAM until the process is eventually killed. To tackle this, the first thing I did was transcode the input video to H264 (you can set a custom bitrate if you want, but I left that out).
So I added this new command after line 58 of this script https://gist.github.com/maitrungduc1410/9c640c61a7871390843af00ae1d8758e
ffmpeg -i SOURCE.MOV -c:a aac -ar 48000 -c:v libx264 -profile:v main -crf 19 -preset ultrafast /home/myusername/myfolder/out.mp4
Now that we have a newly processed out.mp4, we go to line 121 of the script and remove it. The reason for doing this is to stop FFMPEG from running all the commands at once. Then we remove lines 107 to 109 and do this:
# Quotes are required here - unquoted, the semicolons would end the assignment
filters="[1]colorchannelmixer=aa=0.5,scale=iw*0.1:-1[wm];[0][wm]overlay=W-w-5:H-h-5[out];[out]scale=w=${widthParam}:h=${heightParam}[final]"
cmd=""
cmd+=" ${static_params} -filter_complex ${filters} -map [final]"
cmd+=" -b:v ${bitrate} -maxrate ${maxrate%.*}k -bufsize ${bufsize%.*}k -b:a ${audiorate}"
cmd+=" -hls_segment_filename ${target}/${name}_%03d.ts ${target}/${name}.m3u8"
ffmpeg ${misc_params} -i /home/myusername/myfolder/out.mp4 -i mylogo.png ${cmd}
So now we run FFMPEG inside a loop, producing output on a per-resolution basis. This eliminates loading all the filters into memory at once. You might even want to remove line 53, depending on your use case.
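The per-resolution assembly above can be sketched end to end. The variable values here are hypothetical stand-ins for what the gist computes per rendition; echoing the string instead of invoking ffmpeg shows what one loop iteration would execute:

```shell
# Hypothetical stand-ins for the values the gist derives per resolution.
widthParam=1920; heightParam=1080
bitrate=5000k; maxrate=5350; bufsize=7500; audiorate=192k
name=1080p; target=out

filters="[1]colorchannelmixer=aa=0.5,scale=iw*0.1:-1[wm];[0][wm]overlay=W-w-5:H-h-5[out];[out]scale=w=${widthParam}:h=${heightParam}[final]"

# One self-contained command per resolution, instead of one giant
# filter graph covering every rendition at once.
cmd="-filter_complex ${filters} -map [final]"
cmd="${cmd} -b:v ${bitrate} -maxrate ${maxrate%.*}k -bufsize ${bufsize%.*}k -b:a ${audiorate}"
cmd="${cmd} -hls_segment_filename ${target}/${name}_%03d.ts ${target}/${name}.m3u8"

# The loop body would then run:
#   ffmpeg -i out.mp4 -i mylogo.png ${cmd}
echo "$cmd"
```

Only one rendition's decode/filter/encode pipeline is alive at a time, which is where the memory saving comes from.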
Test
4K HEVC iPhone video, 1.2 minutes long (453MB)
transcoding to H264 - Memory Usage stayed at 750MB
HLS + watermark - Memory Usage stayed between 430MB to 1.1GB
4K HEVC LG HDR video, 1.13 minutes long (448MB)
transcoding to H264 - Memory Usage stayed at 800MB
HLS + watermark - Memory Usage stayed between 380MB to 850MB
My final thoughts
FFMPEG is a power eater. The number of cores and amount of memory required depends mostly on how much video you want to process. In our case we only wanted to support videos up to 500MB, so our 4K test fits the need; if you have larger videos, you will have to test with more RAM and CPU cores at hand.
It's never a good idea to run FFMPEG instances in parallel. Processing videos batch-wise ensures optimum use of the available resources and less chance of breaking your system in the middle of the night.
Always run FFMPEG on an isolated machine, away from your webserver, database, mail server, etc.
Increasing resources is not always the answer. We tend to conclude that more resources === more stability, but that is not always right. I've read enough threads about how even 64GB RAM with 32 cores fails to keep up with FFMPEG, so your best bet is to first improve your command, or to split it into smaller commands that use resources as effectively as possible.
I'm not an expert in FFMPEG, but I think this information will help someone with a similar question.

Streaming audio distorted when played in mobile safari in iOS

We are hosting mp3 files on AWS S3. We have built a web app (in React) that plays back the mp3s. However, playback sometimes becomes distorted in Safari on iOS. The strange thing is that this does not happen all the time.
Here is the original file (sometimes distorted): https://sayyit-prod-static-assets.s3.amazonaws.com/static/audio/Darrin+M.+McMahon.original.mp3
Here is how the file sounds when distorted: https://sayyit-prod-static-assets.s3.amazonaws.com/static/audio/WhatsApp+Video+2019-09-26+at+11.06.49+AM.mp4
Now, this distortion only happens when playing it through our app. When we provide a direct link to s3 (like I did above), it works. The distortion also happens when linking directly to s3 in our app.
Here are some ideas:
The mp3 file is broken
When going directly to the S3 link, it downloads entirely, which seems to allow the mp3 file to play perfectly
Any help would be greatly appreciated.
The sample rate on this MP3 file is 16 kHz. That's very low (not abnormal for voice), but also uncharacteristically low for a 128k MP3. I suspect that there's a bug with the resampler (as the iPhone hardware is locked to 48 kHz anyway), or that you're hitting an edge case bug with the decoder.
I'd recommend that you stop using MP3 and solve a few things at once. While MP3 is of acceptable quality, its quality for a given bitrate isn't as good as the alternatives. These days, you should consider using Opus. It's supported on iOS if muxed into a CAF file, and is extremely efficient. You could drop the bitrate down to 48k for voice and still have excellent quality. And you'll bypass whatever resampling or decoding issue you're hitting now, all in one go.
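A sketch of that conversion with ffmpeg, assuming a build with libopus and Opus support in the CAF muxer; the file names are placeholders:

```shell
# Voice content: 48 kb/s Opus in a CAF container for iOS playback.
# libopus operates at 48 kHz internally, so the low-rate 16 kHz source
# is resampled exactly once here rather than at playback time.
ffmpeg -i input.mp3 -c:a libopus -b:a 48k output.caf
```

Serving the resulting .caf with the audio/x-caf content type keeps Safari happy about the container.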

AVPlayer / AVFoundation play specified HLS bandwidth and resolution

Is there any good solution with iOS AVPlayer to let the user choose a specific HLS video resolution / bandwidth?
So the question splits in two:
1. Get the resolution / bandwidth list from the m3u8
2. Specify the stream resolution and bandwidth
For 1., a workaround is to read the indicatedBitrate of AVPlayerItemAccessLogEvent (Get bandwidth of stream from m3u stream).
The other possible solution is to download and parse m3u8, apart from the AVPlayer interface.
For 2., a workaround to change the default adaptive behavior of AVPlayer is to set preferredPeakBitRate or preferredMaximumResolution. But video quality might still drop if the network gets slower. (Change HLS bandwidth manually?)
Thank you.

iOS compatible live mp4 stream

I have a video source which gives me a raw H264 stream. I need to re-stream this live input in a way that is cross-compatible and playable without any plugin. I've tried using ffmpeg + ffserver to produce a fragmented mp4, but unfortunately my iPhone won't play it.
Is there a way to make it (raw H264 in an mp4 container) playable in iOS Safari, or maybe another cross-platform container?
PS: I'm using a Raspberry Pi 3 to host the ffmpeg processes, so I'm avoiding re-encoding tasks; instead I'm just trying to fit my raw H264 into an "iOS-compatible" container and make it accessible through a media server.
For live streams you must use HTTP Live Streaming (HLS), with either the traditional MPEG-TS segments or fMP4 for newer iOS versions (see the Apple HLS examples).
With FFmpeg you can use the hls muxer. The hls_segment_type option is used to choose between mpegts and fmp4.
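A sketch of such a packaging command, assuming the raw stream is reachable as a file or pipe; the paths and segment timing are placeholders:

```shell
# Copy the already-encoded H.264 into fMP4 HLS segments - no re-encoding,
# which keeps the Raspberry Pi's CPU load low. -re paces reading at the
# input's native frame rate for live-style delivery.
ffmpeg -re -i input.h264 -c:v copy \
  -f hls -hls_segment_type fmp4 -hls_time 4 -hls_list_size 6 \
  /var/www/html/stream.m3u8
```

Point any web server at the directory holding the playlist and segments, and iOS Safari can play the m3u8 URL natively.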

Is transmitting PES(Packetised Elementary Stream) better or MPEG-TS in Live Streaming from mobile (IOS) to server

I am developing a live streaming application where the mobile app (iOS/Android) records video and audio, encodes the raw frames to H.264 and AAC using VideoToolbox and AudioToolbox, and converts the encoded streams to PES (Packetized Elementary Stream) separately (video & audio). Now we are stuck on what to send to the server, PES or MPEG-TS: which one gives the minimum latency and packet loss, as in Periscope, Meerkat, Qik, UStream and other live streaming applications?
Which network protocol is best suited for transmission, TCP or UDP?
And what is required at the server to receive these packets? I know FFMPEG will transcode and generate the segment files (.ts) and the .m3u8 file for HLS streaming, but do we need any pipe before the FFMPEG layer?
Please give me some ideas about which is best, and the pros and cons of each.
Thanks
Shiva.
