I want to implement a Python project that takes an .mp4 file as input and outputs the transcript or subtitles of the video. The constraint is that it must use OpenVINO. How can I do that?
MP4 is a container format. I believe the current OpenVINO speech demos/samples take WAV files as input, since that is what the model was trained on.
If you use a conversion tool to turn your mp3, or the audio track from the MP4 container, into WAV format, that may work.
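For the conversion step, one option is to call ffmpeg from Python. The following is only a sketch: it assumes ffmpeg is installed and on your PATH, and the function names and the 16 kHz mono 16-bit PCM settings are my own assumptions (a common choice for speech models), so check what your particular model expects.

```python
import subprocess


def build_wav_extract_command(video_path, wav_path, sample_rate=16000):
    """Build an ffmpeg command that extracts the audio track as PCM WAV.

    The 16 kHz / mono defaults are assumptions, not OpenVINO requirements.
    """
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-i", video_path,        # input container (.mp4)
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # uncompressed 16-bit little-endian PCM
        "-ar", str(sample_rate), # resample to the requested rate
        "-ac", "1",              # downmix to mono
        wav_path,
    ]


def extract_wav(video_path, wav_path, sample_rate=16000):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(
        build_wav_extract_command(video_path, wav_path, sample_rate),
        check=True,
    )
```

Calling extract_wav("input.mp4", "audio.wav") should then give you a WAV file to feed to the speech demo.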
speech rec demo
Related
I am working with Tensorflow_TTS. I am generating audio using fastspeech and melgan. Now this audio is an eager tensor, more precisely: <class 'tensorflow.python.framework.ops.EagerTensor'> . I want to play this audio tensor in the script without converting it to an audio file and then playing it. Is there any way to do this?
Noob that I am, I just figured it out: using the sounddevice library in Python, I was able to play the audio without saving it, by converting the EagerTensor to a NumPy array first.
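Roughly what that looks like, as a sketch only. The helper names are made up, and the 22050 Hz default sample rate is an assumption that you should replace with whatever rate your vocoder was trained at. sounddevice is imported inside the play function so the conversion helper can be used without an audio device.

```python
import numpy as np


def to_playable(audio):
    """Convert an EagerTensor (or anything array-like) to a 1-D float32 array."""
    # EagerTensor exposes .numpy(); other inputs are handed to NumPy directly.
    if hasattr(audio, "numpy"):
        audio = audio.numpy()
    # squeeze() drops singleton batch/channel axes, e.g. (1, N, 1) -> (N,)
    return np.asarray(audio, dtype=np.float32).squeeze()


def play_audio(audio, sample_rate=22050):
    """Play the audio through the default output device and block until done."""
    import sounddevice as sd  # imported lazily; needs a working audio backend

    sd.play(to_playable(audio), samplerate=sample_rate)
    sd.wait()
```

So after generating the waveform you would call play_audio(audio_tensor) with no intermediate file.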
I have an .mp4 file and I want to convert it to .webm at the same quality using the avconv tool,
and vice versa, from .webm to .mp4 at the same quality.
The documentation is unclear to me.
That's not possible. WebM cannot contain MPEG media (video or audio), so you can't repack ("remux") the media from MP4 to WebM. You would need to re-encode ("transcode") it, with VP8 or VP9 as the video format and Vorbis or Opus as the audio format. That means you can't keep exactly the same quality, since transcoding from one lossy format to another always loses some quality.
So your only option is to transcode.
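If you want to script the transcode, here is a sketch that builds the command line from Python. The option names are the ones ffmpeg uses. avconv takes similar options, but whether your build includes libvpx-vp9, libvorbis or libx264 is something you need to check, so treat that as an assumption. The function name and the CRF values are just illustrative starting points, not recommended settings.

```python
def build_transcode_command(src, dst, tool="ffmpeg"):
    """Return a command that re-encodes src into the container implied by dst."""
    target = dst.lower()
    if target.endswith(".webm"):
        # VP9 video in constant-quality mode (-b:v 0 with -crf), Vorbis audio
        codecs = ["-c:v", "libvpx-vp9", "-crf", "30", "-b:v", "0",
                  "-c:a", "libvorbis"]
    elif target.endswith(".mp4"):
        # H.264 video and AAC audio for the MP4 direction
        codecs = ["-c:v", "libx264", "-crf", "23", "-c:a", "aac"]
    else:
        raise ValueError("unsupported output container: " + dst)
    return [tool, "-y", "-i", src, *codecs, dst]
```

You can pass the resulting list to subprocess.run(..., check=True). Lower CRF values give higher quality and larger files, but either direction is still a lossy re-encode.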
I have a case where I need to get the audio out of a video file. Is this possible on iOS?
I only need the output as an audio file, in any format. The video file is in my documents directory; I recorded it earlier in the application.
1. Convert Video Reverse
2. Extract audio from a video file
3. Add Audio & Video Together
Download Code From Here
It may be helpful to you.
Yes, it can be done. See Extract audio from video file for ideas.
Extracting is probably the more accurate term.
I have several audio files, all of them in .mp3 format. But playing .mp3 is not the most efficient way to play audio on iOS. From what I have read, .wav or .caf is the way to go for short loops or sound effects used in a game. So I need to convert these .mp3 files into those formats.
Currently I use the following command to convert .mp3 to .caf.
afconvert -f caff -d LEI16 kick_sfx.mp3
However, I see that .caf is a container format, so I have a bad feeling that it's still not that efficient. Because of that, I think I need to convert those .mp3 files to .wav first in order to maximize playback efficiency.
Do I really need to convert those .mp3 files to .wav first before converting them again to .caf? Or will that command alone do the job for me?
Any additional info is welcome.
In the app I'm working on, there's an AVIRecord class that manually writes AVI headers and JPEG frames into a video file. The results are .avi files with the MJPEG codec, according to my media player (using the K-Lite codec pack).
My question is: is this AVI compressed or uncompressed? I ask because the file size is basically the sum of all the JPEG frames.
Can I write similar code to produce a .mov file (QuickTime format)? By similar I mean: writing the headers to the file and putting each frame into the file manually.
The app I am working on is supposed to take the JPEG stream from an IP cam and save it in QuickTime format.
Container formats such as AVI and MOV do not themselves compress the video and audio bitstreams they hold. They are used to store decodable video and audio units along with associated metadata such as timestamps. So when you add JPEGs to an AVI file, they are not compressed any further. The frames are already JPEG-compressed, which is why the file size is roughly the sum of the frame sizes.
You can create a MOV file with MJPEG video in much the same way you created the AVI file with MJPEG video. However, you would need a writer for MOV, similar to the one you have for AVI.
The MOV file format is specified by Apple. A version of the specification is available at http://developer.apple.com/standards/qtff-2001.pdf
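To give an idea of the header structure you would be writing: a QuickTime file is a sequence of atoms, each starting with a 4-byte big-endian size followed by a 4-byte type code. A size of 1 means a 64-bit extended size follows, and a size of 0 means the atom runs to the end of the file. Below is a small Python sketch that lists the top-level atoms of a stream. The function name is hypothetical, and this only reads the layout, it is not a MOV writer.

```python
import io
import struct


def iter_top_level_atoms(stream):
    """Yield (type, offset, size) for each top-level atom in a seekable stream."""
    while True:
        offset = stream.tell()
        header = stream.read(8)
        if len(header) < 8:
            return  # end of stream (or trailing bytes too short for a header)
        size, kind = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # 64-bit extended size stored right after the type code
            extended = stream.read(8)
            if len(extended) < 8:
                return
            (size,) = struct.unpack(">Q", extended)
            header_len = 16
        elif size == 0:
            # atom extends to the end of the file
            size = stream.seek(0, io.SEEK_END) - offset
        if size < header_len:
            raise ValueError("corrupt atom size at offset %d" % offset)
        yield kind.decode("latin-1"), offset, size
        stream.seek(offset + size)  # skip the payload, move to the next atom
```

Opening an existing .mov in binary mode and printing list(iter_top_level_atoms(f)) is a quick way to see which atoms a working file contains before you try to write them yourself.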