Parse stats in ffprobe

I'm gathering metadata from audio files using ffprobe. However, because I'm unfamiliar with the tool, I'm getting extra information that isn't necessary. This is the command I'm running:
ffprobe -f lavfi -i amovie=<audio_file>,astats=metadata=1:reset=4400 -hide_banner
This is a short sample output of what I'm getting:
Input #0, lavfi, from 'amovie=<audio_file>,astats=metadata=1:reset=4400':
Duration: N/A, start: 0.000000, bitrate: 3072 kb/s
Stream #0:0: Audio: pcm_f32le, 48000 Hz, stereo, flt, 3072 kb/s
[Parsed_astats_1 @ 0x7fcfd4d01140] Channel: 1
[Parsed_astats_1 @ 0x7fcfd4d01140] DC offset: -0.032707
[Parsed_astats_1 @ 0x7fcfd4d01140] Min level: -0.041852
...
Is there a combination of flags that will produce nice JSON or CSV output, hiding the Input #0 ... and [Parsed_astats_1 @ 0x7fcfd4d01140] lines, like this:
{
"Channel": 1,
"DC offset": -0.032707,
"Min level": -0.041852
...
}

The nicest solution I can come up with is to use the ametadata filter and write the stats to a file.
$ ffmpeg -f lavfi -i sine -t 1s -af 'astats=metadata=1:reset=4400,ametadata=mode=print:file=stats' -f null -
$ cat stats
...
lavfi.astats.1.Bit_depth=16.000000
lavfi.astats.1.Bit_depth2=16.000000
lavfi.astats.1.Dynamic_range=78.265678
lavfi.astats.1.Zero_crossings=920.000000
lavfi.astats.1.Zero_crossings_rate=0.019965
lavfi.astats.Overall.DC_offset=0.000043
lavfi.astats.Overall.Min_level=-4095.000000
lavfi.astats.Overall.Max_level=4095.000000
lavfi.astats.Overall.Min_difference=0.000000
lavfi.astats.Overall.Max_difference=257.000000
lavfi.astats.Overall.Mean_difference=163.407865
lavfi.astats.Overall.RMS_difference=181.500114
lavfi.astats.Overall.Peak_level=-18.063656
lavfi.astats.Overall.RMS_level=-21.073770
lavfi.astats.Overall.RMS_peak=-21.058020
lavfi.astats.Overall.RMS_trough=-21.118775
lavfi.astats.Overall.Flat_factor=0.000000
lavfi.astats.Overall.Peak_count=502.000000
lavfi.astats.Overall.Bit_depth=16.000000
lavfi.astats.Overall.Bit_depth2=16.000000
lavfi.astats.Overall.Number_of_samples=46080.000000
I guess you're interested in the last frame's lavfi.astats.Overall.* values.
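That said, ffprobe itself can produce JSON or CSV directly: the per-frame tags that astats injects can be selected with -show_entries and printed by any of ffprobe's writers. A sketch, with <audio_file> as a placeholder as above:
ffprobe -v error -f lavfi -i amovie=<audio_file>,astats=metadata=1:reset=4400 -show_entries frame_tags -of json
Or, to pull a single metric as CSV:
ffprobe -v error -f lavfi -i amovie=<audio_file>,astats=metadata=1:reset=4400 -show_entries frame_tags=lavfi.astats.Overall.RMS_level -of csv=p=0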

Related

v4l2loopback SMPTE color bars in Genymotion

I'm running v4l2loopback on an Ubuntu 18.04 machine with v4l2-ctl and VirtualBox installed.
I use the command below to initialize a loopback camera:
sudo modprobe v4l2loopback video_nr=2 card_label="Hello world" exclusive_caps=1 devices=1
v4l2-ctl --device=/dev/video2 --all
and the output from the second command above is:
Driver Info (not using libv4l2):
Driver name : v4l2 loopback
Card type : Hello world
Bus info : platform:v4l2loopback-000
Driver version: 5.3.18
Capabilities : 0x85208000
Video Memory-to-Memory
Read/Write
Streaming
Extended Pix Format
Device Capabilities
Device Caps : 0x05208000
Video Memory-to-Memory
Read/Write
Streaming
Extended Pix Format
Priority: 0
Format Video Output:
Width/Height : 416/720
Pixel Format : 'YU12'
Field : None
Bytes per Line : 416
Size Image : 449280
Colorspace : sRGB
Transfer Function : Default (maps to sRGB)
YCbCr/HSV Encoding: Default (maps to ITU-R 601)
Quantization : Default (maps to Limited Range)
Flags :
Streaming Parameters Video Capture:
Frames per second: 30.000 (30/1)
Read buffers : 2
Streaming Parameters Video Output:
Frames per second: 30.000 (30/1)
Write buffers : 2
User Controls
keep_format 0x0098f900 (bool) : default=0 value=0
sustain_framerate 0x0098f901 (bool) : default=0 value=0
timeout 0x0098f902 (int) : min=0 max=100000 step=1 default=0 value=0
timeout_image_io 0x0098f903 (bool) : default=0 value=0
Now I can feed it input from my desktop:
sudo ffmpeg -f x11grab -r 25 -s 416x768 -i :0.0+0,0 -vcodec rawvideo -pix_fmt yuv420p -threads 0 -f v4l2 /dev/video2
or my OBS stream:
ffmpeg -f flv -listen 1 -i rtmp://localhost:1935/live/app -f v4l2 /dev/video2
Both work perfectly; I can view the output via WebRTC in Chrome and Firefox, and with ffplay:
ffplay /dev/video2
My machine also has a webcam on /dev/video0 which works perfectly with Genymotion.
But when I choose my "Hello world" device, Genymotion outputs noise (SMPTE color bars) instead.
What's wrong with my Genymotion setup? I found that there are differences between UVC output and v4l2loopback.
Can you provide the logs of the Genymotion emulator, located at ~/.Genymobile/Genymotion/deployed/<yourdevice>/genymotion-player.log? There might be interesting insights in there.
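In the meantime, one difference worth ruling out is pixel-format negotiation: v4l2loopback replays whatever format the producer set (YU12 here), while UVC webcams typically offer formats like YUYV or MJPEG. You can compare what each device advertises with:
v4l2-ctl --device=/dev/video2 --list-formats-ext
v4l2-ctl --device=/dev/video0 --list-formats-ext
If Genymotion only accepts one of the formats your real webcam offers, re-feeding the loopback device with an explicit -pix_fmt (e.g. yuyv422 instead of yuv420p) may be worth a try.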

Use ffmpeg to encode AUDIO+IMAGE into a VIDEO for YouTube

I need to generate a video that shows a single image for the whole duration of the audio coming from an audio file. The video should be compatible with the parameters supported by YouTube.
I'm using ffmpeg.
I've tried various configurations explained here and in other forums, but not all have worked well.
I'm currently using these settings:
ffmpeg -i a.mp3 -loop 1 -i a.jpg -vcodec libx264 -preset slow -crf 20 -threads 0 -acodec copy -shortest a.mkv
where a.mp3 contains the audio, a.jpg contains the image, and a.mkv is the name of the resulting video.
Using these parameters, a.mkv works well on YouTube and can be played with Media Player Classic; but KMPlayer only recognizes the audio, showing a blank image as background.
I have two questions:
1 - Is there something wrong that causes KMPlayer to fail?
2 - Is there any configuration that can produce the video faster, at the cost of some compression?
Thanks a lot!
Try this:
ffmpeg -i a.mp3 -loop 1 -r 1 -i a.jpg -vcodec libx264 -preset ultrafast -crf 20 -threads 0 -acodec copy -shortest -r 2 a.mkv
Notable changes:
added -r 1
changed -preset slow to -preset ultrafast
added -r 2
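If encoding speed is still the bottleneck, x264's stillimage tune may also be worth trying on top of the command above (an untested variant):
ffmpeg -i a.mp3 -loop 1 -r 1 -i a.jpg -vcodec libx264 -preset ultrafast -tune stillimage -crf 20 -threads 0 -acodec copy -shortest -r 2 a.mkv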

Capturing/segmenting video on iOS and rejoining via HLS results in audio dropouts

I'm attempting to capture video on an iPhone 5 for realtime upload and HLS streaming. I'm at the stage where I'm generating the video on the device (not yet uploading to the server). Like these links on SO suggest, I've hacked together some code that switches out AssetWriters every five seconds.
Upload live streaming video from iPhone like Ustream or Qik
streaming video FROM an iPhone
Data corruption when reading realtime H.264 output from AVAssetWriter
Right now during dev, I'm just saving the files to the device locally and pulling them out via the Xcode Organizer. I then run Apple's mediafilesegmenter to simply convert them to MPEG2-TS (they're below 10 seconds already, so there's no actual segmenting happening; I assume they're just being converted to TS). I build the m3u8 by editing together the various index files created during this process (also manually at the moment).
When I put the assets on a server for testing, they're mostly streamed correctly, but I can tell when there's a segment switch because the audio briefly drops (possibly the video too, but I can't tell for sure - it looks ok). This obviously doesn't happen for typical HLS streams segmented from one single input file. I'm at a loss as to what's causing this.
You can open my HLS stream on your iPhone here (you can hear the audio drop after 5 seconds and again around 10)
http://cdn.inv3ntion.com/ms/stitch/stitch.html
Could there be something happening in my creation process (either on the device or in post-processing) that's causing the brief audio drops? I don't think I'm dropping any sample buffers during the AssetWriter switch-outs (see code).
- (void)writeSampleBuffer:(CMSampleBufferRef)sampleBuffer ofType:(NSString *)mediaType
{
    if (!self.isStarted) {
        return;
    }
    @synchronized(self) {
        if (mediaType == AVMediaTypeVideo && !assetWriterVideoIn) {
            videoFormat = CMSampleBufferGetFormatDescription(sampleBuffer);
            CFRetain(videoFormat);
            assetWriterVideoIn = [self addAssetWriterVideoInput:assetWriter withFormatDesc:videoFormat];
            [tracks addObject:AVMediaTypeVideo];
            return;
        }
        if (mediaType == AVMediaTypeAudio && !assetWriterAudioIn) {
            audioFormat = CMSampleBufferGetFormatDescription(sampleBuffer);
            CFRetain(audioFormat);
            assetWriterAudioIn = [self addAssetWriterAudioInput:assetWriter withFormatDesc:audioFormat];
            [tracks addObject:AVMediaTypeAudio];
            return;
        }
        if (assetWriterAudioIn && assetWriterVideoIn) {
            recording = YES;
            if (assetWriter.status == AVAssetWriterStatusUnknown) {
                if ([assetWriter startWriting]) {
                    [assetWriter startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp(sampleBuffer)];
                    if (segmentationTimer) {
                        [self setupQueuedAssetWriter];
                        [self startSegmentationTimer];
                    }
                } else {
                    [self showError:[assetWriter error]];
                }
            }
            if (assetWriter.status == AVAssetWriterStatusWriting) {
                if (mediaType == AVMediaTypeVideo) {
                    if (assetWriterVideoIn.readyForMoreMediaData) {
                        if (![assetWriterVideoIn appendSampleBuffer:sampleBuffer]) {
                            [self showError:[assetWriter error]];
                        }
                    }
                }
                else if (mediaType == AVMediaTypeAudio) {
                    if (assetWriterAudioIn.readyForMoreMediaData) {
                        if (![assetWriterAudioIn appendSampleBuffer:sampleBuffer]) {
                            [self showError:[assetWriter error]];
                        }
                    }
                }
            }
        }
    }
}
- (void)setupQueuedAssetWriter
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0), ^{
        NSLog(@"Setting up queued asset writer...");
        queuedFileURL = [self nextFileURL];
        queuedAssetWriter = [[AVAssetWriter alloc] initWithURL:queuedFileURL fileType:AVFileTypeMPEG4 error:nil];
        if ([tracks objectAtIndex:0] == AVMediaTypeVideo) {
            queuedAssetWriterVideoIn = [self addAssetWriterVideoInput:queuedAssetWriter withFormatDesc:videoFormat];
            queuedAssetWriterAudioIn = [self addAssetWriterAudioInput:queuedAssetWriter withFormatDesc:audioFormat];
        } else {
            queuedAssetWriterAudioIn = [self addAssetWriterAudioInput:queuedAssetWriter withFormatDesc:audioFormat];
            queuedAssetWriterVideoIn = [self addAssetWriterVideoInput:queuedAssetWriter withFormatDesc:videoFormat];
        }
    });
}
- (void)doSegmentation
{
    NSLog(@"Segmenting...");
    AVAssetWriter *writer = assetWriter;
    AVAssetWriterInput *audioIn = assetWriterAudioIn;
    AVAssetWriterInput *videoIn = assetWriterVideoIn;
    NSURL *fileURL = currentFileURL;
    //[avCaptureSession beginConfiguration];
    @synchronized(self) {
        assetWriter = queuedAssetWriter;
        assetWriterAudioIn = queuedAssetWriterAudioIn;
        assetWriterVideoIn = queuedAssetWriterVideoIn;
    }
    //[avCaptureSession commitConfiguration];
    currentFileURL = queuedFileURL;
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0), ^{
        [audioIn markAsFinished];
        [videoIn markAsFinished];
        [writer finishWritingWithCompletionHandler:^{
            if (writer.status == AVAssetWriterStatusCompleted) {
                [fileURLs addObject:fileURL];
            } else {
                NSLog(@"...WARNING: could not close segment");
            }
        }];
    });
}
You can try inserting a #EXT-X-DISCONTINUITY between every segment in the m3u8, but I doubt this will work. There are a lot of things that could be going wrong here.
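For illustration, in case you do try it, the tag goes on its own line between the two segments it separates (a hypothetical, shortened playlist):
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXTINF:5.0,
segment0.ts
#EXT-X-DISCONTINUITY
#EXTINF:5.0,
segment1.ts
#EXT-X-ENDLIST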
Assuming you are sampling audio at 44100 Hz, there is a new audio sample roughly every 23 microseconds (1/44100 s ≈ 22.7 µs). During the time you are closing and reopening the file, you are definitely losing samples. If you concatenate the final waveform, it will play back slightly faster than real time due to this loss. In reality, this is probably not an issue.
As @vipw said, you will also have timestamp issues. Every time you start a new mp4, you are starting from timestamp zero. So the player gets confused, because the timestamps keep getting reset.
Then there is the transport stream format. TS encapsulates each frame into 'streams'. HLS typically has 4 (PAT, PMT, audio and video); each stream is split into 188-byte packets with a 4-byte header. The headers have a per-stream 4-bit continuity counter that wraps around on overflow. So by running mediafilesegmenter on every mp4, you are breaking the stream at every segment boundary by resetting the continuity counter back to zero.
You need a tool that will accept mp4 and create a streaming output that maintains/rewrites timestamps (PTS, DTS, CTS) as well as continuity counters.
Shifting Packets
We had trouble getting older versions of ffmpeg to shift packet timestamps. The more recent ffmpeg 1.x and 2.x releases support time shifts for mpegts.
Here's an example of an ffmpeg call; note -t for the duration and -initial_offset for the shift at the end of the command (keep on scrolling right...)
Here's a segment with a 10 second shift
/opt/ffmpeg -i /tmp/cameo/58527/6fc2fa1a7418bf9d4aa90aa384d0eef2244631e8 -threads 0 -ss 10 -i /tmp/cameo/58527/79e684d793e209ebc9b12a5ad82298cb5e94cb54 -codec:v libx264 -pix_fmt yuv420p -preset veryfast -strict -2 -bsf:v h264_mp4toannexb -flags -global_header -crf 28 -profile:v baseline -x264opts level=3:keyint_min=24:keyint=24:scenecut=0 -b:v 100000 -bt 100000 -bufsize 100000 -maxrate 100000 -r 12 -s 320x180 -map 0:0 -map 1:0 -codec:a aac -strict -2 -b:a 64k -ab 64k -ac 2 -ar 44100 -t 9.958333333333334 -segment_time 10.958333333333334 -f segment -initial_offset 10 -segment_format mpegts -y /tmp/cameo/58527/100K%01d.ts -codec:v libx264 -pix_fmt yuv420p -preset veryfast -strict -2 -bsf:v h264_mp4toannexb -flags -global_header -crf 28 -profile:v baseline -x264opts level=3:keyint_min=24:keyint=24:scenecut=0 -b:v 200000 -bt 200000 -bufsize 200000 -maxrate 200000 -r 12 -s 320x180 -map 0:0 -map 1:0 -codec:a aac -strict -2 -b:a 64k -ab 64k -ac 2 -ar 44100 -t 9.958333333333334 -segment_time 10.958333333333334 -f segment -initial_offset 10 -segment_format mpegts -y /tmp/cameo/58527/200K%01d.ts -codec:v libx264 -pix_fmt yuv420p -preset veryfast -strict -2 -bsf:v h264_mp4toannexb -flags -global_header -crf 28 -profile:v baseline -x264opts level=3:keyint_min=24:keyint=24:scenecut=0 -b:v 364000 -bt 364000 -bufsize 364000 -maxrate 364000 -r 24 -s 320x180 -map 0:0 -map 1:0 -codec:a aac -strict -2 -b:a 64k -ab 64k -ac 2 -ar 44100 -t 9.958333333333334 -segment_time 10.958333333333334 -f segment -initial_offset 10 -segment_format mpegts -y /tmp/cameo/58527/364K%01d.ts -codec:v libx264 -pix_fmt yuv420p -preset veryfast -strict -2 -bsf:v h264_mp4toannexb -flags -global_header -crf 28 -profile:v baseline -x264opts level=3:keyint_min=24:keyint=24:scenecut=0 -b:v 664000 -bt 664000 -bufsize 664000 -maxrate 664000 -r 24 -s 480x270 -map 0:0 -map 1:0 -codec:a aac -strict -2 -b:a 64k -ab 64k -ac 2 -ar 44100 -t 9.958333333333334 -segment_time 10.958333333333334 -f segment -initial_offset 10 -segment_format mpegts -y /tmp/cameo/58527/664K%01d.ts -codec:v libx264 -pix_fmt yuv420p -preset veryfast -strict -2 -bsf:v h264_mp4toannexb -flags -global_header -crf 23 -profile:v baseline -x264opts level=3.1:keyint_min=24:keyint=24:scenecut=0 -b:v 1264000 -bt 1264000 -bufsize 1264000 -maxrate 1264000 -r 24 -s 640x360 -map 0:0 -map 1:0 -codec:a aac -strict -2 -b:a 64k -ab 64k -ac 2 -ar 44100 -t 9.958333333333334 -segment_time 10.958333333333334 -f segment -initial_offset 10 -segment_format mpegts -y /tmp/cameo/58527/1264K%01d.ts
There's also the adaptation of the C++ segmenter that I've updated on GitHub, but it's only been reasonably tested for video-only mpegts. AV still causes it some issues (I wasn't confident which packet should be shifted to the new value, the first video or the first audio packet; I opted for the first video packet). Also, as you bumped into, it can have problems with certain media, as you noted in your issue.
If I had more time on my hands, I'd like to debug your specific case and improve the C++ shifter. I hope the above ffmpeg example helps get your HTTP Live Streaming example working; we've gone through our share of streaming trouble. We're currently working around an audio pop that occurs at shifted segments. The fix is to gather all source media before splitting into segmented streams (which we can do when we finalize a video, but it would slow us down during iterative builds).
I think that your ts files aren't going to be created on the same timeline. Within ts files are the presentation timestamps of the packets, and if a new ts is being created on each segment, there's probably a discontinuity.
What might work is for you to concatenate the recorded segments together so that the new part is timestamped in the same timeline. Then segmenting should work properly and the segment transitions should be smooth in the generated stream.
I think you need a process that always keeps the last part of the previous segment so that the timestamps are always synchronized.
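A sketch of that approach using ffmpeg's concat demuxer (file names are placeholders); the joined file can then be segmented in one pass so all timestamps share a single timeline:
printf "file 'seg0.mp4'\nfile 'seg1.mp4'\n" > list.txt
ffmpeg -f concat -safe 0 -i list.txt -c copy joined.mp4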

Join images and audio into a resulting video

I have a lot of images with different sizes (i.e. 1024x768 and 900x942) and an audio file (audio.mp3) of 30 seconds and I need to create a video from them.
I'm trying it now with: result%d.png (1 to 4) and audio.mp3
ffmpeg -y -i result%d.png -i audio.mp3 -r 30 -b 2500k -vframes 900 -acodec libvo_aacenc -ab 160k video.mp4
The video video.mp4 is 30 seconds long, but the first 3 images are shown very quickly, while the last image remains until the end of the audio.
Each image needs to be shown for an equal time until the end of the audio. Does anyone know how to do it?
The number of images will vary.
FFMPEG version: 3.2.1-1
UBUNTU 16.04.1
Imagine you have an mp3 audio file named wow.mp3. In that case, the following command will get the duration of the mp3 in seconds.
ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 wow.mp3
Once you have the duration in seconds (imagine I got 11.36 seconds): since I have 3 images, I want to show each image for 11.36/3 ≈ 3.79 seconds, so use the following:
ffmpeg -y -framerate 1/3.79 -start_number 1 -i ./swissGenevaLake_%d.jpg -i ./akib.mp3 -c:v libx264 -r 25 -pix_fmt yuv420p -c:a aac -strict experimental -shortest output.mp4
Here the images are ./swissGenevaLake_1.jpg, ./swissGenevaLake_2.jpg , and ./swissGenevaLake_3.jpg.
-framerate 1/3.79 means each image runs for 3.79 seconds.
-start_number 1 means start with image number one, i.e. ./swissGenevaLake_1.jpg
-c:v libx264: video codec H.264
-r 25: output video framerate of 25
-pix_fmt yuv420p: output video pixel format
-c:a aac: encode the audio using AAC
-shortest: end the video as soon as the audio is done
output.mp4: output file name
Disclaimer: I have not tested merging images of multiple sizes.
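The two steps (probe the duration, divide by the image count, encode) can be glued together in a small shell sketch, assuming bash and bc are available; file names and the image count are placeholders:
dur=$(ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 audio.mp3)
n=3
per=$(echo "$dur / $n" | bc -l)
ffmpeg -y -framerate "1/$per" -start_number 1 -i img_%d.jpg -i audio.mp3 -c:v libx264 -r 25 -pix_fmt yuv420p -c:a aac -shortest output.mp4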
References:
https://trac.ffmpeg.org/wiki/Create%20a%20video%20slideshow%20from%20images
https://trac.ffmpeg.org/wiki/Encode/AAC
http://trac.ffmpeg.org/wiki/FFprobeTips
For creating a video from n images + audio:
Step 1)
Create a video from the images:
Process proc = Runtime.getRuntime().exec(ffmpeg + " -y -r "+duration +" -i " + imagePath + " -c:v libx264 -r 15 -pix_fmt yuv420p -vf fps=90 " + imageVideoPath);
InputStream stderr = proc.getErrorStream();
InputStreamReader isr = new InputStreamReader(stderr);
BufferedReader br = new BufferedReader(isr);
String line = null;
while ((line = br.readLine()) != null) {
    //System.out.println(line);
}
int exitVal = proc.waitFor();
proc.destroy();
where duration = number of images / duration of audio, i.e. how many images you want to show per second.
Step 2)
Process proc4VideoAudio = Runtime.getRuntime().exec(ffmpeg +" -i " + imageVideoPath + " -i "+ audioPath + " -map 0:0 -map 1:0 " + videoPath);
InputStream stderr1 = proc4VideoAudio.getErrorStream();
InputStreamReader isr1 = new InputStreamReader(stderr1);
BufferedReader br1 = new BufferedReader(isr1);
String line1 = null;
while ((line1 = br1.readLine()) != null) {
    //System.out.println(line1);
}
int exitVal1 = proc4VideoAudio.waitFor();
proc4VideoAudio.destroy();
Steps 1 and 2 can now be run in sequence. If you want to do it manually, only the Runtime.getRuntime().exec(..) calls are needed; the stream-reading code after each call just keeps the steps synchronized.
** Also note that asking FFmpeg to create the video in one step from images and audio gives you the same problem you mentioned, and otherwise the solution would be static for a fixed number of images per audio file.
imagePath, imageVideoPath, audioPath, and videoPath are all Strings.
This should help: http://ffmpeg.org/trac/ffmpeg/wiki/Create%20a%20video%20slideshow%20from%20images
Using -r with a fractional rate, you can set how many seconds each image appears.
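For example, following the pattern on that wiki page (file names hypothetical), this shows each image for 5 seconds:
ffmpeg -framerate 1/5 -i img%03d.png -c:v libx264 -r 30 -pix_fmt yuv420p out.mp4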

How can I extract audio from video with ffmpeg? [closed]

I tried the following command to extract audio from video:
ffmpeg -i Sample.avi -vn -ar 44100 -ac 2 -ab 192k -f mp3 Sample.mp3
but I get the following output
libavutil 50.15. 1 / 50.15. 1
libavcodec 52.72. 2 / 52.72. 2
libavformat 52.64. 2 / 52.64. 2
libavdevice 52. 2. 0 / 52. 2. 0
libavfilter 1.19. 0 / 1.19. 0
libswscale 0.11. 0 / 0.11. 0
libpostproc 51. 2. 0 / 51. 2. 0
SamplE.avi: Invalid data found when processing input
Can anyone help, please?
To extract the audio stream without re-encoding:
ffmpeg -i input-video.avi -vn -acodec copy output-audio.aac
-vn is no video.
-acodec copy says use the same audio stream that's already in there.
Read the output to see what codec it is, to set the right filename extension.
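If in doubt, ffprobe can print the audio codec name directly, so you can pick a matching extension:
ffprobe -v error -select_streams a:0 -show_entries stream=codec_name -of default=noprint_wrappers=1:nokey=1 input-video.avi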
To encode high-quality MP3 or MP4 audio from a movie file (e.g. AVI, MP4, MOV) or an audio file (e.g. WAV), I find it's best to use -q:a 0 for variable bit rate, and it's good practice to specify -map a to exclude video/subtitles and grab only audio:
ffmpeg -i sample.avi -q:a 0 -map a sample.mp3
If you want to extract a portion of audio from a video use the -ss option to specify the starting timestamp, and the -t option to specify the encoding duration, eg from 3 minutes and 5 seconds in for 45 seconds:
ffmpeg -i sample.avi -ss 00:03:05 -t 00:00:45.0 -q:a 0 -map a sample.mp3
The timestamps need to be in HH:MM:SS.xxx format or in seconds.
If you don't specify the -t option it will go to the end.
You can use the -to option instead of -t if you want to specify the end time, e.g. for the same 45 seconds: 00:03:05 + 45s = 00:03:50.
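For example, the same 45-second excerpt expressed with -to:
ffmpeg -i sample.avi -ss 00:03:05 -to 00:03:50 -q:a 0 -map a sample.mp3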
Working example:
Download ffmpeg
Open a Command Prompt (Start > Run > CMD), or on Mac/Linux open a Terminal
cd (the change directory command) to the directory containing the ffmpeg executable.
Issue your command and wait for the output file (or troubleshoot any errors)
Extract all audio tracks / streams
This puts all audio into one file:
ffmpeg -i input.mov -map 0:a -c copy output.mov
-map 0:a selects all audio streams only. Video and subtitles will be excluded.
-c copy enables stream copy mode. This copies the audio and does not re-encode it. Remove -c copy if you want the audio to be re-encoded.
Choose an output format that supports your audio format. See comparison of container formats.
Extract a specific audio track / stream
Example to extract audio stream #4:
ffmpeg -i input.mkv -map 0:a:3 -c copy output.m4a
-map 0:a:3 selects audio stream #4 only (ffmpeg starts counting from 0).
-c copy enables stream copy mode. This copies the audio and does not re-encode it. Remove -c copy if you want the audio to be re-encoded.
Choose an output format that supports your audio format. See comparison of container formats.
Extract and re-encode audio / change format
Similar to the examples above, but without -c copy. Various examples:
ffmpeg -i input.mp4 -map 0:a output.mp3
ffmpeg -i input.mkv -map 0:a output.m4a
ffmpeg -i input.avi -map 0:a -c:a aac output.mka
ffmpeg -i input.mp4 output.wav
Extract all audio streams individually
The input in this example has 4 audio streams. Each audio stream will be output as a single, individual file.
ffmpeg -i input.mov -map 0:a:0 output0.wav -map 0:a:1 output1.wav -map 0:a:2 output2.wav -map 0:a:3 output3.wav
Optionally add -c copy before each output file name to enable stream copy mode.
Extract a certain channel
Use the channelsplit filter. Example to get the Front Right (FR) channel from a stereo input:
ffmpeg -i stereo.wav -filter_complex "[0:a]channelsplit=channel_layout=stereo:channels=FR[right]" -map "[right]" front_right.wav
channel_layout is the channel layout of the input. It is not automatically detected so you must provide the layout name.
channels lists the channel(s) you want to extract.
See ffmpeg -layouts for audio channel layout names (for channel_layout) and channel names (for channels).
Stream copy mode (-c copy) cannot be used when filtering, so the audio must be re-encoded.
See FFmpeg Wiki: Audio Channels for more examples.
What's the difference between -map and -vn?
ffmpeg has a default stream selection behavior that will select 1 stream per stream type (1 video, 1 audio, 1 subtitle, 1 data).
-vn is an old, legacy option. It excludes video from the default stream selection behavior. So audio, subtitles, and data are still automatically selected unless told not to with -an, -sn, or -dn.
-map is more complicated but more flexible and useful. -map disables the default stream selection behavior and ffmpeg will only include what you tell it to with -map option(s). -map can also be used to exclude certain streams or stream types. For example, -map 0 -map -0:v would include all streams except all video.
See FFmpeg Wiki: Map for more examples.
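As a concrete command built from that example (file names hypothetical), this keeps audio, subtitles, and data but drops all video:
ffmpeg -i input.mkv -map 0 -map -0:v -c copy output.mkv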
Errors
Invalid audio stream. Exactly one MP3 audio stream is required.
MP3 only supports 1 audio stream. The error means you are trying to put more than 1 audio stream into MP3. It can also mean you are trying to put non-MP3 audio into MP3.
WAVE files have exactly one stream
Similar to above.
Could not find tag for codec in stream #0, codec not currently supported in container
You are trying to put an audio format into an output that does not support it, such as PCM (WAV) into MP4.
Remove -c copy, choose a different output format (change the file name extension), or manually choose the encoder (such as -c:a aac).
See comparison of container formats.
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
This is a useless, generic error. The actual, informative error should immediately precede this generic error message.
Seems like you're extracting audio from a video file and downmixing to stereo.
To just extract audio (without re-encoding):
ffmpeg.exe -i in.mp4 -vn -c:a copy out.m4a
To extract audio and downmix to stereo (downmixing changes the samples, so the stream can't simply be copied; drop -c:a copy and let ffmpeg re-encode):
ffmpeg.exe -i in.mp4 -vn -ac 2 out.m4a
To generate an mp3 file, you'd re-encode audio:
ffmpeg.exe -i in.mp4 -vn -ac 2 out.mp3
-c (select codecs) & -map (select streams) options:
-c:a -> select best supported audio (transcoded)
-c:a copy -> best supported audio (copied)
-map 0:a -> all audio from 1st (audio) input file (transcoded)
-map 0:0 -> 1st stream from 1st input file (transcoded)
-map 1:a:0 -> 1st audio stream from 2nd (audio) input file (transcoded)
-map 1:a:1 -c:a copy -> 2nd audio stream from 2nd (audio)input file (copied)
ffmpeg -i sample.avi will give you the audio/video format info for your file. Make sure you have the proper libraries configured to parse the input streams. Also, make sure that the file isn't corrupt.
The command line is correct and works on a valid video file. I would make sure that you have installed the correct library to work with mp3 (install LAME), or try another audio codec.
Usually
ffmpeg -formats
or
ffmpeg -codecs
would give sufficient information so that you know more.
To encode mp3 audio ffmpeg.org shows the following example:
ffmpeg -i input.wav -codec:a libmp3lame -qscale:a 2 output.mp3
I extracted the audio from a video just by replacing input.wav with the video filename. The 2 means roughly 190 kb/s (VBR). You can see the other quality levels in the libmp3lame documentation on ffmpeg.org.
For people looking for the simpler way to extract audio from a video file while retaining the original video file's parameters, you can use:
ffmpeg -i <video_file_name.extension> <audio_file_name.extension>
For example, running:
ffmpeg -i screencap.mov screencap.mp3
extracts an mp3 audio file from a mov video file.
Here's what I just used:
ffmpeg -i my.mkv -map 0:3 -vn -b:a 320k my.mp3
Options explanation:
my.mkv is a source video file, you can use other formats as well
-map 0:3 means I want stream number 3 from the video file (ffmpeg counts from 0). Put your N there; video files often have multiple audio streams. You can omit it to let ffmpeg pick the default audio stream, or use -map 0:a to select audio only. Run ffprobe my.mkv to see which streams the video file has.
my.mp3 is the target audio filename, and ffmpeg figures out I want an MP3 from its extension. In my case the source audio stream is AC3/DTS and just copying wasn't what I wanted
320k is a desired target bitrate
-vn means I don't want video in target file
Creating an audio book from several video clips
First, extract the audio (as .m4a) from a bunch of H.264 .mp4 files:
for f in *.mp4; do ffmpeg -i "$f" -vn -c:a copy "$(basename "$f" .mp4).m4a"; done
the -vn output option disables video output (automatic selection or mapping of any video stream). For full manual control see the -map option.
Optional
If there's an intro of, say, 40 seconds, you can skip it with the -ss parameter:
for f in *.m4a; do ffmpeg -i "$f" -ss 00:00:40 -c copy crop/"$f"; done
To combine all files in one:
ffmpeg -f concat -safe 0 -i <(for f in ./*.m4a; do echo "file '$PWD/$f'"; done) -c copy output.m4a
If the audio wrapped inside the AVI is not in MP3 format to start with, you may need to specify -acodec mp3 as an additional parameter, or whatever your MP3 codec is (on Linux systems it's probably -acodec libmp3lame). You may also get the same effect, platform-agnostic, by instead specifying -f mp3 to "force" the format to mp3, although not all versions of ffmpeg still support that switch. Your mileage may vary.
To extract without conversion I use a context menu entry - as file manager custom action in Linux - to run the following (after having checked what audio type the video contains; example for video containing ogg audio):
bash -c 'ffmpeg -i "$0" -map 0:a -c:a copy "${0%%.*}".ogg' %f
which is based on the ffmpeg command ffmpeg -i INPUT -map 0:a -c:a copy OUTPUT.
I have used -map 0:1 in that without problems, but, as said in a comment by @LordNeckbeard, "Stream 0:1 is not guaranteed to always be audio. Using -map 0:a instead of -map 0:1 will avoid ambiguity."
Use -b:a instead of -ab, as -ab is outdated now. Also make sure your input file path is correct.
To extract audio from a video I have used the command below, and it's working fine.
String[] complexCommand = {"-y", "-i", inputFileAbsolutePath, "-vn", "-ar", "44100", "-ac", "2", "-b:a", "256k", "-f", "mp3", outputFileAbsolutePath};
Here,
-y - Overwrite output files without asking.
-i - FFmpeg reads from an arbitrary number of input “files” specified by the -i option
-vn - Disable video recording
-ar - sets the sampling rate for audio streams if encoded
-ac - Set the number of audio channels.
-b:a - Set the audio bitrate
-f - format
Check out this for my complete sample FFmpeg android project on GitHub.
