Probably one of the most cliché questions, but here is the problem: I have an Ubuntu server running as an isolated machine that only handles FFmpeg jobs. It has 4 vCPUs, 4 GB RAM, and 80 GB storage. I am currently using this script to convert a video into an HLS playlist: https://gist.github.com/maitrungduc1410/9c640c61a7871390843af00ae1d8758e. It works fine for all videos, including 4K recorded on an iPhone. However, I am trying to add a watermark, so I changed line 106 of the script
from:
cmd+=" ${static_params} -vf scale=w=${widthParam}:h=${heightParam}"
to:
cmd+=" ${static_params} -filter_complex [1]colorchannelmixer=aa=0.5,scale=iw*0.1:-1[wm];[0][wm]overlay=W-w-5:H-h-5[out];[out]scale=w=${widthParam}:h=${heightParam}[final] -map [final]"
Now this works flawlessly for videos from YouTube and other sources, but as soon as I try it with 4K videos from an iPhone, the RAM usage grows from 250 MB to 3.8 GB in less than a minute and crashes the entire process. So I looked at some similar questions:
FFmpeg Concat Filter High Memory Usage
https://github.com/jitsi/jibri/issues/269
https://superuser.com/questions/1509906/reduce-ffmpeg-memory-usage-when-joining-videos
ffmpeg amerge Error while filtering: Cannot allocate memory
I understand that FFmpeg can require a lot of memory, but I am unsure how to process video without holding the stream in memory, releasing allocations in real time instead. Even without the watermark, it still sits at around 1.8 GB of RAM while processing a 5-second 4K video, and that creates a risk: if a user uploads a longer video, it will eventually crash the server. I have thought about ulimit, but that seems like restricting FFmpeg rather than writing a better command. Let me know how I can tackle this problem. Thanks.
Okay, I found a solution. The problem is that the 4K video has a much higher bitrate, and processing the filter_complex loads it into RAM, which eventually kills the process. To tackle this, the first thing I did was transcode the input video to H.264 (you can set a custom bitrate if you want; I left that out).
So I added this new command after line 58 of this script https://gist.github.com/maitrungduc1410/9c640c61a7871390843af00ae1d8758e
ffmpeg -i SOURCE.MOV -c:a aac -ar 48000 -c:v libx264 -profile:v main -crf 19 -preset ultrafast /home/myusername/myfolder/out.mp4
Now we have a newly processed out.mp4. Next, go down to line 121 of the script and remove it. The reason for doing this is to stop FFmpeg from running all the commands at once. Then remove lines 107 to 109 and do this:
# Quote the graph so the shell doesn't interpret the ";" and "[ ]" characters
filters="[1]colorchannelmixer=aa=0.5,scale=iw*0.1:-1[wm];[0][wm]overlay=W-w-5:H-h-5[out];[out]scale=w=${widthParam}:h=${heightParam}[final]"
cmd=""
cmd+=" ${static_params} -filter_complex ${filters} -map [final]"
cmd+=" -b:v ${bitrate} -maxrate ${maxrate%.*}k -bufsize ${bufsize%.*}k -b:a ${audiorate}"
cmd+=" -hls_segment_filename ${target}/${name}_%03d.ts ${target}/${name}.m3u8"
ffmpeg ${misc_params} -i /home/myusername/myfolder/out.mp4 -i mylogo.png ${cmd}
So now we are running FFmpeg inside a loop, producing one output per resolution (a minimal sketch of that loop is below). This avoids loading all the filters into memory at once. You might even want to remove line 53, depending on your use case.
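For reference, here is a minimal, self-contained sketch of what that loop looks like. The rendition list, paths and bitrates are placeholder assumptions, not the exact values from the gist; adapt them to your own script:

#!/usr/bin/env bash
# Hypothetical sketch: one ffmpeg invocation per rendition instead of one big command.
source="/home/myusername/myfolder/out.mp4"   # the H.264 intermediate from the step above
logo="mylogo.png"
target="./hls"
mkdir -p "${target}"

# name:width:height:video-bitrate:audio-bitrate (example values)
renditions=(
  "360p:640:360:800k:96k"
  "720p:1280:720:2800k:128k"
  "1080p:1920:1080:5000k:192k"
)

for r in "${renditions[@]}"; do
  IFS=: read -r name w h vbr abr <<< "${r}"
  filters="[1]colorchannelmixer=aa=0.5,scale=iw*0.1:-1[wm];[0][wm]overlay=W-w-5:H-h-5[out];[out]scale=w=${w}:h=${h}[final]"
  ffmpeg -y -i "${source}" -i "${logo}" \
    -filter_complex "${filters}" -map "[final]" -map 0:a? \
    -c:v libx264 -b:v "${vbr}" -c:a aac -b:a "${abr}" \
    -hls_time 4 -hls_playlist_type vod \
    -hls_segment_filename "${target}/${name}_%03d.ts" "${target}/${name}.m3u8"
done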
Test
4K HEVC iPhone video, 1.2 minutes long (453 MB)
Transcoding to H.264: memory usage stayed at around 750 MB
HLS + watermark: memory usage stayed between 430 MB and 1.1 GB
4K HEVC LG HDR video, 1.13 minutes long (448 MB)
Transcoding to H.264: memory usage stayed at around 800 MB
HLS + watermark: memory usage stayed between 380 MB and 850 MB
My final thoughts
FFmpeg is a resource eater. The number of cores and the amount of memory you need depends mostly on how much video you want to process. In our case, we only wanted to support videos up to 500 MB, so our 4K processing test fits the need; if you have larger videos to handle, you will have to test with more RAM and CPU cores at hand.
It's never a good idea to run parallel FFmpeg instances. Processing videos sequentially, in batches, ensures optimal use of the available resources and lowers the chance of your system breaking in the middle of the night (see the sketch after this list).
Always run FFmpeg on an isolated machine, away from your web server, database, mail server, etc.
Increasing resources is not always the answer. We tend to conclude that more resources === more stability, but that is not always right. I've read enough threads about how even 64 GB of RAM with 32 cores fails to keep up with FFmpeg, so your best bet is to first improve your command, or split it into smaller commands, to use the resources as effectively as possible.
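Regarding the point above about not running FFmpeg in parallel, a sequential batch runner can be as simple as this sketch (create-hls.sh and the uploads directory are placeholders for your own script and paths):

# Process uploads one at a time so only a single ffmpeg instance runs at once.
for f in /home/myusername/uploads/*.mp4; do
  ./create-hls.sh "$f" || echo "failed: $f" >&2
done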
I'm not an expert in FFmpeg, but I think this information will help someone with a similar question.
Related
I want to transform videos into JPEG and WAV output. I wrote the program myself with the FFmpeg API. The video (.webm, for example) is decoded, and the video and audio frames are encoded to JPEG and WAV. As the program runs and more and more videos are converted, more and more RSS is used (per the top command on Linux).
I have also used Valgrind to test whether there is a memory leak, and there is none.
The RSS increase is not linear; it looks like this (just an example):
1 video: 30m
2 video: 150m
3 video: 200m
4 video: 220m
5 video: 220m
6 video: 230m
The program runs in a Docker container on Kubernetes with a memory limit (2 GB). After a few days (about 30,000 videos processed, for example), the pod gets force-restarted.
Is it a real memory leak? Or is this memory kept by the FFmpeg libraries as a memory pool or something similar?
Thanks
We are using the Web Audio API to play and manipulate audio in a web app.
When trying to decode large MP3 files (around 5 MB), the memory usage spikes in Safari on the iPad, and if we load another file of similar size it simply crashes.
It seems like the Web Audio API is not really usable on the iPad unless we use small files.
Note that the same code works well in desktop Chrome; Safari complains about high memory usage.
Does anybody know how to get around this issue, or what the memory limit is for playing audio files using Web Audio on an iPad?
Thanks!
Decoded audio files weigh a lot more in RAM than on disk. A single sample uses 4 bytes (32-bit float). This translates to 230 MB of RAM for 10 minutes of stereo audio at a 48 000 Hz sample rate. One hour of audio at the same sample rate, in stereo, will take ~1.3 GB of RAM!
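For reference, the arithmetic behind those figures:
48 000 samples/s × 2 channels × 4 bytes × 600 s ≈ 230 MB
48 000 samples/s × 2 channels × 4 bytes × 3 600 s ≈ 1.3 GB (1 382 400 000 bytes)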
So, if you decode a lot of files, you can consume large amounts of RAM. My suggestion is to "undecode" files that you don't need (just "forget" unneeded audio buffers so the garbage collector can free the memory).
You can also use mono audio files instead of stereo; that should cut memory usage in half.
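If you can preprocess your files ahead of time (ffmpeg is one option, assuming it is available in your pipeline), downmixing to mono is a one-liner:

# Downmix a stereo MP3 to mono; -ac 1 sets the output channel count to 1.
ffmpeg -i input.mp3 -ac 1 mono.mp3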
Note that decoded audio is always resampled to the device's sample rate. This means that using audio with a low sample rate won't help with memory usage.
I'm using ffmpeg to read an H.264 RTSP stream from a Cisco 3050 IP camera and re-encode it to disk as H.264 (there are reasons why I'm not just using -codec:copy).
The ffmpeg version is as follows:
ffmpeg version 3.2.6 Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 6.3.0 (Alpine 6.3.0)
I've also tried with ffmpeg 2.8.14-0ubuntu0.16.04.1 and the latest ffmpeg built from source (I used this commit) and see the same behaviour as below.
The command I'm running is:
ffmpeg -rtsp_transport udp -i 'rtsp://<user>:<pw>#<ip>:554/StreamingSetting?version=1.0&action=getRTSPStream&ChannelID=1&ChannelName=Channel1' -r 10 -c:v h264 -crf 23 -x264-params keyint=60:min-keyint=60 -an -f ssegment -segment_time 60 -strftime 1 /output/%Y%m%d_%H%M%S.ts -abort_on empty_output
I get a variety of errors at a fairly steady rate of at least one per second. Here's a sample:
[rtsp # 0x7f268c5e9220] max delay reached. need to consume packet
[rtsp # 0x7f268c5e9220] RTP: missed 40 packets
[h264 # 0x55b1e115d400] left block unavailable for requested intra mode
[h264 # 0x55b1e115d400] error while decoding MB 0 12, bytestream 114567
[h264 # 0x55b1e115d400] concealing 3889 DC, 3889 AC, 3889 MV errors in I frame
The most common one is 'error while decoding MB x x, bytestream x'. This corresponds to severe corruption in the video file when played back.
I see many references to that error message on stackoverflow and elsewhere, but I've yet to find a satisfying explanation or workaround. It comes from this line which appears to correspond to missing data at the end of the stream. 'left block unavailable' comes from here and also looks like missing data.
Others have suggested using -rtsp_transport tcp instead (1, 2, 3) which in my case just gives a slightly different mix of errors, and still video corruption:
[h264 # 0x557923191b00] left block unavailable for requested intra4x4 mode -1
[h264 # 0x557923191b00] error while decoding MB 0 28, bytestream 31068
[h264 # 0x557923191b00] concealing 2609 DC, 2609 AC, 2609 MV errors in I frame
[rtsp # 0x7f88e817b220] CSeq 5 expected, 0 received.
Using Wireshark I confirmed that in both UDP and TCP mode, all of the packets are making it from the camera to the PC (sequential RTP sequence numbers without any missing) which makes me think the data is being lost after it arrives at ffmpeg.
I also see similar behaviour when running the same command against a Panasonic WV-SFV110 camera, but with less frequent errors overall. Switching from UDP to TCP on the Panasonic camera reduces but does not completely eliminate the errors/corruption.
I also tried a similar command with VLC and got similar errors (cvlc rtsp://<user>:<pw>#<ip>/MediaInput/h264 :sout='#transcode{vcodec=h264}:std{access=file, mux=ts, dst="output.ts"}') -- presumably the code hasn't diverged much since libav forked from ffmpeg.
The camera is plugged directly into a PoE port on the PC, so network congestion can't be a problem. Given that the PC has enough CPU to keep up with encoding the live stream, it seems to me to be a problem with ffmpeg that it still drops data from the TCP stream.
Qualitatively, there are several factors which seem to make the problem worse:
Higher video resolution
Higher system load on the machine running ffmpeg (e.g. transcoding to a low res .avi file produces fewer errors than transcoding to h264 VBR; using -codec:copy eliminates all errors except a couple while ffmpeg is starting up)
Greater motion within the camera view
What does the error mean? And what can I do about it?
Looking at the initial error message:
[rtsp # 0x7f268c5e9220] max delay reached. need to consume packet
[rtsp # 0x7f268c5e9220] RTP: missed 40 packets
I guess that you are losing UDP packets. The rest of the H.264 error messages are caused by receiving an incomplete bitstream.
Now the key is to isolate the issue. Is your network dropping packets? Or is your server too slow or overloaded while receiving the UDP (RTP) stream?
First I'd check the UDP buffer size of your OS. https://access.redhat.com/documentation/en-US/JBoss_Enterprise_Web_Platform/5/html/Administration_And_Configuration_Guide/jgroups-perf-udpbuffer.html
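On Linux that usually means the net.core.rmem_* sysctls; something along these lines (the 16 MB value is just an example, tune it to your stream):

# Check the current socket receive buffer limits (values are in bytes).
sysctl net.core.rmem_default net.core.rmem_max
# Raise the maximum receive buffer to 16 MB (non-persistent, lasts until reboot).
sudo sysctl -w net.core.rmem_max=16777216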
If increasing the UDP buffer size doesn't help, use ffmpeg with -codec:copy to lower the CPU load. Do you still get errors?
Since you want to re-encode, consider using Intel Quick Sync (-vcodec h264_qsv) or some other hardware encoder to lower your CPU load.
The question is not so much whether the PC has enough CPU, but more about identifying the bottleneck in the processing pipeline. Your H.264 encoder (x264) may oversubscribe your CPU, so you get momentary peak loads that result in packet drops. Try limiting the number of threads for x264 and/or lowering the quality preset to 'fast' or 'faster'.
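For example, based on the command in the question (the thread count and preset here are illustrative values, not a recommendation, and the camera URL is a placeholder):

ffmpeg -rtsp_transport tcp -i 'rtsp://<camera-url>' \
  -c:v libx264 -preset faster -threads 2 -crf 23 \
  -an -f ssegment -segment_time 60 -strftime 1 /output/%Y%m%d_%H%M%S.ts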
It does sound like packet loss is an issue. Higher video resolution and greater motion both increase the bitrate of the encoded video stream which will increase your packet loss. Depending on which packet is lost, you will see varying errors in the decoding process as you indicated in your post.
The fact that higher system load while running ffmpeg makes things worse also indicates that packets might be dropped, e.g. when ffmpeg takes too long to read them because it is busy transcoding the video.
First question is what is your network topology? Streaming over the public Internet is a lot harder than streaming over your LAN. What kind of switches/routers are in the network?
Next question: what bitrate is your camera streaming at? Try reducing it and check the results. Be systematic in your approach, i.e.:
Don't transcode at first.
Just receive the video.
Write it to a file (see the sketch after this list).
Check for packet loss/video artifacts.
Start at a lower bitrate, e.g. 100 kbps, and increase it if no loss is evident.
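A minimal "receive only" baseline might look like this; the camera URL is a placeholder and -t 60 just limits the test to one minute:

# No transcoding: copy the incoming stream straight to disk and inspect it afterwards.
ffmpeg -rtsp_transport tcp -i 'rtsp://<camera-url>' -c copy -an -t 60 baseline_test.ts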
The next thing I would try is to increase the size of the receiver buffers. While I am not that familiar with ffmpeg, it looks like you can set it via recv_buffer_size as indicated here. You then need to work out a reasonably large size based on your camera configuration, enough to store e.g. a couple (5?) of seconds of video data. Check whether there are fewer artifacts, or longer periods without artifacts, as you increase the receiver buffer size.
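A hedged sketch of what that might look like on the command line; the exact option name varies (recv_buffer_size on the tcp protocol, buffer_size on the rtsp demuxer), so verify it against your build before relying on it. The 5 MB value is only an example:

# Assumed option name; check `ffmpeg -h full | grep -i buffer` on your version first.
ffmpeg -rtsp_transport tcp -buffer_size 5000000 -i 'rtsp://<camera-url>' -c copy -t 60 out.ts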
Of course, if your processor is too slow to transcode the video in real time, you will run out of buffer space sooner or later. In that case, you might have to transcode to a lower resolution/bitrate, use less intensive encoder settings, etc., or run the transcoding on a faster machine.
Also, note that adjusting the receiver buffer size will not compensate for packet loss occurring on the public Internet, so the above will help assuming you're streaming on a local network that supports the bitrate of the camera. If you exceed the bandwidth of the network, you can expect packet loss. In that case, streaming over TCP could help somewhat (at least until the receiver buffer eventually overruns).
More things you can try if the above does not help or solve the problem completely:
Sniff the incoming traffic with Wireshark or tcpdump (see the example after these steps).
Have a look at the traces. Filter the trace using "RTSP".
You should be able to see the RTP traffic, where consecutive RTP packets have increasing sequence numbers, e.g. 20, 21, 22, 23, etc. If you see missing sequence numbers, then you've got packet loss; try streaming over TCP and repeat the trace. Also, remember to increase the receiver buffer size when streaming over TCP as well.
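For example, a short tcpdump capture that you can open in Wireshark afterwards (the interface name and camera IP are placeholders):

# Capture 30 seconds of traffic to/from the camera, then filter on "rtsp"/"rtp" in Wireshark.
sudo timeout 30 tcpdump -i eth0 -w camera_capture.pcap host 192.168.1.100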
In summary, you have a pipeline architecture and you need to determine where in the pipeline the loss is occurring:
camera -> network -> receiver buffer (OS) -> application (ffmpeg)
I'm using a Raspberry Pi 2 to route Wi-Fi to Ethernet connections. So on the Ethernet side I have a computer that connects to the internet through the Pi's Wi-Fi connection. On the Raspberry Pi I started htop to monitor the CPU load, then on the computer I started Chrome and played a 20-minute 1080p video. The CPU load didn't seem to go beyond 5%. After that I closed the YouTube tab and started downloading a 5 GB binary file from the first row here (https://testdebit.info/). Well, I noticed that the CPU load was much higher, around 10%!
Any explanation for such a difference?
It has to do with compression and how video is encoded. A normal file can be compressed, but nothing like a video stream can.
A video stream can achieve very high compression ratios due to the predictable characteristics of video, e.g. the picture doesn't change much from one frame to the next. As such, video codecs send a whole frame (I-frame) and then update it with just the changes (P-frames). It's even possible to do backward prediction (B-frames). Here's a Wikipedia reference.
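If you want to see this frame structure for yourself, ffprobe can list the picture type of every frame (input.mp4 is a placeholder for any local video file):

# Prints one line per video frame showing its picture type (I, P or B).
ffprobe -v error -select_streams v:0 -show_entries frame=pict_type -of csv input.mp4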
Yes, I hear your next unspoken question: Doesn't more compression mean more CPU time to uncompress? That's true for a lot of types of compression, such as that used by zip files. But since raw video is not very information dense over time, you have compression techniques that in essence reduce the amount of data you send with very little CPU usage.
I hope this helps.
I have used the VLC plugin (VLC web plugin 2.1.3.0) in Firefox to display live streams received from my server in the browser, and I need to display 16 channels on one web page. But when I play more than 10 channels at the same time, I see that the processor is at 100% and some breaks appear in the video. I have checked the plugin's memory in the running tasks and saw that around 45 MB of memory is dedicated to each video (so for 10 channels: 10 * 45 = 450 MB).
Kindly, do you have any method to reduce the consumption of the VLC plugin to allow displaying 16 channels at the same time?
Best regards,
There is no way to do that correctly. You could probably save a few megabytes by disabling audio decoding if there are audio tracks in your 16 streams that you don't need. Apart from that, 45 MB per stream is quite reasonable for VLC playback, and you won't be able to go much below that unless you reduce the video dimensions.
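If you control the source streams, one way to reduce the per-stream footprint is to down-scale them before they reach the plugin, e.g. with ffmpeg (the URLs, codec and size below are placeholder assumptions, not VLC plugin settings):

# Re-stream at a smaller resolution; scale=640:-2 keeps the aspect ratio with an even height.
ffmpeg -i 'rtsp://<camera-url>' -vf scale=640:-2 -c:v libx264 -preset veryfast -an \
  -f mpegts 'udp://<client-ip>:1234'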
Additionally, your problem is probably not the use of half a gigabyte of memory (Chrome and Firefox easily use that much by themselves if you open a few tabs), but that VLC exceeds your CPU capacity. Make sure not to use windowless playback, since it is less efficient than the normal windowed mode.
VLC 2.2 will improve the performance of the web plugins on Windows by adding the hardware acceleration known from the standalone application.