I am looking for a basic example of how to encode a PCM buffer (16 kHz, 16-bit, fixed point) to an Opus buffer and pack it into an Ogg container. I found only Vorbis-based examples, none for Opus.
Thanks for any ideas!
This example should produce Opus files, and it looks basic enough:
https://gitlab.xiph.org/xiph/libopusenc/-/blob/master/examples/opusenc_example.c
Let's say I have an array of Y values for a sine wave. (Assume X is time)
In Python you can just write it to a Wav file:
wav.write("file.wav", <sample rate>, <waveform>)
Is it possible to do this in Swift using AVFoundation? If so how? If not, what library should I be using? (I'm trying to avoid AudioKit for now.)
Thanks,
Charles
In AVFoundation there is AVAudioFile, but you'll have to provide the data as AVAudioPCMBuffers, which keep the data in an AudioBufferList, which in turn consists of AudioBuffers. These are IMHO all rather complicated, since their design goal apparently was to handle every conceivable audio format (including compressed, VBR, etc.). So AVAudioFile is probably overkill for just writing some synthetic samples to a WAV file.
Alternatively, there is the Audio File Services C-API. It provides AudioFileCreateWithURL, AudioFileWriteBytes and AudioFileClose, which will probably do the trick for your task.
The most complicated part may be the AudioStreamBasicDescription required by AudioFileCreateWithURL. To help with this a utility function exists: FillOutASBDForLPCM.
Are there any good examples that show how to render the IMFSample output from the H.264 decoder? My scenario uses a 4K resolution H.264 stream and the PC that I am targeting will only accept 1080p using the DXGI buffers. But the H.264 decoder will handle 4K so I need to find a way to feed that NV12 IMFSample directly to the DirectX 11 renderer. I have already tried using the DX11VideoRenderer sample but it fails due to this particular IMFSample not having an IMFDXGIBuffer interface.
It looks like in the DX11VideoRenderer the input IMFDXGIBuffer is NV12 type and that can be rendered successfully in hardware. So it seems logical that a non-DXGI buffer of NV12 type should be acceptable too?
Perhaps I need to create an ID3D11Texture2D texture or resource with an NV12 type? I found examples of how to create a texture from a file but none of how to create a texture from a sample, which would seem to be even more useful. And if I can create an NV12 texture, how do I figure out the SysMemPitch and SysMemSlicePitch values in the D3D11_SUBRESOURCE_DATA structure for NV12?
Any help would be really appreciated! Thank you.
I was able to find a complete example that renders an NV12 sample to the screen. Although there are some simple stride calculation errors in how it renders its own example image, the actual rendering code does work correctly. It appears to be an old Microsoft sample that I cannot find any other information about.
D3D11NV12Rendering
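On the SysMemPitch question, a minimal sketch of the NV12 memory layout may help. It is written in Python only to show the arithmetic; `nv12_layout` is a hypothetical helper, not part of any Direct3D or Media Foundation API. It assumes the sample's buffer is contiguous, with the full-resolution Y plane followed by the interleaved UV plane at the same row pitch and half as many rows.

```python
def nv12_layout(width, height, pitch=None):
    """Describe where the two NV12 planes sit in one contiguous buffer.

    Assumption: the Y plane comes first, and the interleaved UV plane
    follows it directly, using the same row pitch and height // 2 rows.
    """
    if width % 2 or height % 2:
        raise ValueError("NV12 needs an even width and height")
    if pitch is None:
        pitch = width  # tightly packed: one byte per luma sample
    if pitch < width:
        raise ValueError("pitch cannot be smaller than the width")

    y_size = pitch * height
    uv_size = pitch * (height // 2)
    return {
        "pitch": pitch,          # bytes per row in both planes
        "y_offset": 0,
        "y_size": y_size,
        "uv_offset": y_size,     # UV plane starts right after the Y plane
        "uv_size": uv_size,
        "total_size": y_size + uv_size,
    }


if __name__ == "__main__":
    print(nv12_layout(3840, 2160))
```

Under these assumptions the pitch is the value to try for SysMemPitch, and SysMemSlicePitch should not matter for a 2D texture. If the decoder pads its rows, pass the real stride as `pitch` instead of relying on the default.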
I have been using the Superpowered iOS library to analyse audio and extract BPM, loudness, and pitch data. I'm working on an iOS Swift 3.0 project and have been able to get the C classes to work with Swift using the bridging headers for ObjC.
The problem I am running into is that whilst I can create a decoder object, extract audio from the Music Library and store it as a .WAV - I am unable to create a decoder object for just snippets of the extracted audio and get the analyser class to return data.
My approach has been to create a decoder object as follows:
var decodeAttempt = decoder!.open(self.originalFilePath, metaOnly: false, offset: offsetBytes, length: lengthBytes, stemsIndex: 0)
'offsetBytes' and 'lengthBytes' are, I think, the position and size of the snippet within the audio file. As I have already decompressed the audio, stored it as WAV, and am now providing it to the decoder here, I am calculating the offset and length using the PCM WAV formula of 44100 x 2 x 16 / 8 = 176400 bytes per second, then using this to specify a start point and length in bytes. I'm not sure that this is the correct way to do it, as the decoder returns 'Unknown file format'.
Any ideas or even alternative suggestions of how to achieve the title of this question? Thanks in advance!
The offset and length parameters of the SuperpoweredDecoder are there because of the Android APK file format, where bundled audio files are simply concatenated to the package.
Although a WAV file is as "uncompressed" as it can be, there is a header at the beginning, so offset and length are not a good fit for this purpose. The header is present only at the beginning of the file, and without it decoding is not possible.
You mention that you can extract audio to PCM (and save to WAV). Then you have the answer in your hand: just submit different extracted portions to different instances of the SuperpoweredOfflineAnalyzer.
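To illustrate why the header matters, here is a minimal sketch, in Python rather than Swift, that copies a time range out of a PCM WAV file into a new WAV file that gets its own header. It uses only the standard `wave` module; `extract_snippet` and its parameters are hypothetical names for illustration and have nothing to do with the Superpowered API.

```python
import wave


def extract_snippet(src_path, dst_path, start_sec, length_sec):
    """Copy a time range from a PCM WAV file into a new, self-contained WAV.

    Returns the number of audio bytes written (header not included).
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Work in frames, not bytes: the wave module skips the header for us.
        start_frame = min(int(start_sec * rate), src.getnframes())
        src.setpos(start_frame)
        frames = src.readframes(int(length_sec * rate))

    with wave.open(dst_path, "wb") as dst:
        # Reuse channels, sample width and rate; the frame count in the
        # new header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return len(frames)
```

For 44.1 kHz, 16-bit stereo input, a 0.5 second snippet comes out as 88200 bytes of audio, which matches the 176400 bytes per second figure from the question. A file produced this way can be opened by a decoder on its own, whereas a raw byte range cut from the middle of the original cannot.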
For a specific purpose I am trying to convert an AVI video to a kind of Moving JPEG format using OpenCV. In order to do so I read images from the source video, convert them to JPEG using imEncode, and write these JPEG images to the target video.
After several hundreds of frames suddenly the size of the resulting JPEG image nearly doubles. Here's a list of sizes:
68045
68145
68139
67885
67521
67461
67537
67420
67578
67573
67577
67635
67700
67751
127800
127899
127508
127302
126990
126904
Anybody got a clue what's going on here?
By the way: I'm using OpenCV.Net as a wrapper for OpenCV.
Thanks a lot in advance,
Paul
I found the solution. If I explicitly enter the third parameter to imEncode (for JPEG encoding this indicates the quality of the encoding, ranging from 0 to 100) instead of using the default (95) the problem disappears. It's likely this is a bug in OpenCV.Net, but it could also be a bug in OpenCV itself.
I'm using the BASS.dll library, and all I want to do is "redirect" part of an MP3 I'm playing (using, for example, BASS_StreamCreateFile) to another file (which may be MP3 or WAV). I don't know how to start. I've tried using the help to find an answer, but still nothing. I can play this stream and read some data I need. Now I need to copy part of it, for example from 2:00 to 2:10 (or by position).
Any ideas how should I start?
Regards,
J.K.
Well, I don't know BASS specifically, but I know a little about music playing and compressed data formats in general, and copying the data around properly involves an intermediate decoding step. Here's what you'll need to do:
1. Open the file and find the correct position.
2. Decode the audio into an in-memory buffer. The size of your buffer should be (LengthInSeconds * SamplesPerSecond * Channels * BytesPerSample) bytes. So if it's 10 seconds of CD quality audio, that's 10 * 44100 * 2 (stereo) * 2 (16-bit audio) = 1764000 bytes.
3. Take this buffer of decoded data and feed it into an MP3 encoding function, and save the resulting MP3 to a file.
If BASS has functions for decoding to an external buffer and for encoding a buffer to MP3, you're good; all you have to do is figure out which ones to use. If not, you'll have to find another library for MP3 encoding and decoding.
Also, watch out for generational loss. MP3 uses lossy compression, so if you decompress and recompress the data multiple times, it'll hurt the sound quality.
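The buffer arithmetic can be sketched as follows. This is plain Python and does not call BASS at all; `pcm_buffer_size` and `pcm_byte_range` are hypothetical helper names, and the defaults assume CD quality audio (44100 Hz, stereo, 16-bit).

```python
def pcm_buffer_size(seconds, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Bytes needed to hold `seconds` of decoded PCM audio."""
    return int(seconds * sample_rate) * channels * bytes_per_sample


def pcm_byte_range(start_sec, end_sec, sample_rate=44100, channels=2,
                   bytes_per_sample=2):
    """Return (offset, length) in bytes of a time range in decoded PCM data.

    Both values are multiples of the frame size, so a cut never lands
    in the middle of a sample.
    """
    if end_sec < start_sec:
        raise ValueError("end_sec must not be before start_sec")
    frame_size = channels * bytes_per_sample
    start = int(start_sec * sample_rate) * frame_size
    end = int(end_sec * sample_rate) * frame_size
    return start, end - start


if __name__ == "__main__":
    print(pcm_buffer_size(10))        # 10 seconds of CD quality audio
    print(pcm_byte_range(120, 130))   # the 2:00 to 2:10 range
```

For the 2:00 to 2:10 range from the question, this gives an offset of 21168000 bytes into the decoded stream and a length of 1764000 bytes. These offsets apply only to decoded PCM data, not to positions in the compressed MP3 file.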