I want to process the bytes read from the microphone using Swift 3 on iOS. I currently use AVAudioEngine.
print(inputNode.inputFormat(forBus: bus).settings)
print(inputNode.inputFormat(forBus: bus).formatDescription)
This gives me the following output:
["AVNumberOfChannelsKey": 1, "AVLinearPCMBitDepthKey": 32, "AVSampleRateKey": 16000, "AVLinearPCMIsNonInterleaved": 1, "AVLinearPCMIsBigEndianKey": 0, "AVFormatIDKey": 1819304813, "AVLinearPCMIsFloatKey": 1]
<CMAudioFormatDescription 0x14d5bbb0 [0x3a5fb7d8]> {
mediaType:'soun'
mediaSubType:'lpcm'
mediaSpecific: {
ASBD: {
mSampleRate: 16000.000000
mFormatID: 'lpcm'
mFormatFlags: 0x29
mBytesPerPacket: 4
mFramesPerPacket: 1
mBytesPerFrame: 4
mChannelsPerFrame: 1
mBitsPerChannel: 32 }
cookie: {(null)}
ACL: {(null)}
FormatList Array: {(null)}
}
extensions: {(null)}
}
The problem is that the server I want to send the data to does not expect 32-bit floats but 16-bit unsigned ints. I think I have to change the mFormatFlags. Does anybody know how I can do this and what the right value would be?
The resulting byte stream should be equivalent to the one I get on Android using:
AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC, SAMPLES_PER_SECOND,
        AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT,
        recordSegmentSizeBytes);
I tried this:
let cfmt = AVAudioCommonFormat.pcmFormatInt16
inputNode.inputFormat(forBus: bus) = AVAudioFormat(commonFormat: cfmt, sampleRate: 16000.0, channels: 1, interleaved: false)
but got this error:
Cannot assign to value: function call returns immutable value
Any ideas?
Oh my god, I think I got it. I was too blind to see that you can specify the format for the installTap callback. This seems to work:
let audioEngine = AVAudioEngine()

func startRecording() {
    let inputNode = audioEngine.inputNode!
    let bus = 0
    // Request 16-bit integer samples from the tap instead of the input's native Float32.
    let format = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: 16000.0, channels: 1, interleaved: false)
    inputNode.installTap(onBus: bus, bufferSize: 2048, format: format) { (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
        // int16ChannelData is non-nil because the tap format is pcmFormatInt16.
        let values = UnsafeBufferPointer(start: buffer.int16ChannelData![0], count: Int(buffer.frameLength))
        let arr = Array(values)
        print(arr)
    }
    audioEngine.prepare()
    do {
        try audioEngine.start()
    } catch {
        print("Error info: \(error)")
    }
}
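Since the end goal is to ship these samples to a server, it may help to pack each tapped buffer into a Data value. A minimal sketch, assuming the mono Int16 tap format installed above and a server that wants raw little-endian samples (the helper name is made up):

func data(from buffer: AVAudioPCMBuffer) -> Data {
    // Copy the first channel's Int16 samples; safe because the tap vends pcmFormatInt16.
    let samples = UnsafeBufferPointer(start: buffer.int16ChannelData![0],
                                      count: Int(buffer.frameLength))
    return Data(buffer: samples) // host byte order, i.e. little-endian on iOS
}

This should match the Android byte stream, since ENCODING_PCM_16BIT read as bytes is also little-endian 16-bit PCM on current devices.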
I'm trying to implement Bluetooth FTMS (Fitness Machine Service).
guard let characteristicData = characteristic.value else { return -1 }
let byteArray = [UInt8](characteristicData)
let nsdataStr = NSData.init(data: (characteristic.value)!)
print("pwrFTMS 2ACC Feature Array:[\(byteArray.count)]\(byteArray) Hex:\(nsdataStr)")
Here is what's returned from the bleno server:
PwrFTMS 2ACC Feature Array:[8][2, 64, 0, 0, 8, 32, 0, 0] Hex:{length = 8, bytes = 0x0240000008200000}
Based on the spec, the returned data contains 2 fields, each of them 4 octets long.
I'm having trouble splitting out the 4 octets so I can convert them to binary and get the relevant bits for decoding.
Part of the problem is that Swift will remove the leading zeros. Hence, instead of getting 00 00 64 02, I'm getting 642. I tried the below to pad it with leading zeros, but since it's formatted as a string, I can't convert it to binary using radix: 2
let FTMSFeature = String(format: "%02x", byteArray[3]) + String(format: "%02x", byteArray[2]) + String(format: "%02x", byteArray[1]) + String(format: "%02x", byteArray[0])
I've been banging my head on this for an entire day and went through multiple SO posts and Google searches to no avail.
How can I convert:
From - [HEX] 00 00 40 02
To - [DEC] 16386
To - [BIN] 0100 0000 0000 0010
then I can get to Bit1 = 1 and Bit14 = 1
How can I convert:
From - [HEX] 00 00 40 02
To - [DEC] 16386
To - [BIN] 0100 0000 0000 0010
You can simply use the ContiguousBytes method withUnsafeBytes to load your bytes as a UInt32. Note that it will use only as many bytes as are needed to create the resulting type (4 bytes):
let byteArray: [UInt8] = [2, 64, 0, 0, 8, 32, 0, 0]
let decimal = byteArray.withUnsafeBytes { $0.load(as: UInt32.self) }
decimal // 16386
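Note that load(as:) gives you the bytes in host order; since iOS devices and GATT characteristic values are both little-endian this works, but you can make the byte order explicit. A defensive sketch:

let value = byteArray.withUnsafeBytes { UInt32(littleEndian: $0.load(as: UInt32.self)) }
value // 16386 regardless of host byte order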
To convert from bytes to binary you just need to left-pad your resulting binary string with zeros. Note that your expected binary string has only 2 bytes when a 32-bit unsigned integer should have 4:
extension FixedWidthInteger {
    var binary: String {
        (0 ..< Self.bitWidth / 8).map {
            let byte = UInt8(truncatingIfNeeded: self >> ($0 * 8))
            let string = String(byte, radix: 2)
            return String(repeating: "0", count: 8 - string.count) + string
        }.reversed().joined(separator: " ")
    }
}
let binary = decimal.binary // "00000000 00000000 01000000 00000010"
To know whether a specific bit is on or off you can do as follows:
extension UnsignedInteger {
    func bit<B: BinaryInteger>(at pos: B) -> Bool {
        precondition(0..<B(bitWidth) ~= pos, "invalid bit position")
        return (self & 1 << pos) > 0
    }
}
decimal.bit(at: 0) // false
decimal.bit(at: 1) // true
decimal.bit(at: 2) // false
decimal.bit(at: 3) // false
decimal.bit(at: 14) // true
If you need to get a value at a specific byte position you can check this post
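Putting it together, a sketch of splitting the original 8-byte value into its two 4-octet fields before testing bits (the variable names are assumptions; check the FTMS spec for what each field and bit means):

let byteArray: [UInt8] = [2, 64, 0, 0, 8, 32, 0, 0]
// Offsets 0 and 4 are both 4-byte aligned, so load(as:) is safe here.
let machineFeatures = byteArray.prefix(4).withUnsafeBytes { $0.load(as: UInt32.self) }        // 16386
let targetSettingFeatures = byteArray.dropFirst(4).withUnsafeBytes { $0.load(as: UInt32.self) } // 8200
machineFeatures.bit(at: 1)  // true
machineFeatures.bit(at: 14) // true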
I am trying to communicate with a Bluetooth laser tag gun that takes data in 20-byte chunks, which are broken down into 16-, 8-, or 4-bit words. To do this, I made a UInt8 array and changed the values in there. The problem happens when I try to send the UInt8 array.
var bytes = [UInt8](repeating: 0, count: 20)
bytes[0] = commandID
if commandID == 240 {
    commandID = 0
}
commandID += commandIDIncrement
print(commandID)
bytes[2] = 128
bytes[4] = UInt8(gunIDSlider.value)
print("Response: \(laserTagGun.writeValue(bytes, for: gunCControl, type: CBCharacteristicWriteType.withResponse))")
commandID is just a UInt8. This gives me the error, Cannot convert value of type '[UInt8]' to expected argument type 'Data', which I tried to solve by doing this:
var bytes = [UInt8](repeating: 0, count: 20)
bytes[0] = commandID
if commandID == 240 {
    commandID = 0
}
commandID += commandIDIncrement
print(commandID)
bytes[2] = 128
bytes[4] = UInt8(gunIDSlider.value)
print("bytes: \(bytes)")
assert(bytes.count * MemoryLayout<UInt8>.stride >= MemoryLayout<Data>.size)
let data1 = UnsafeRawPointer(bytes).assumingMemoryBound(to: Data.self).pointee
print("data1: \(data1)")
print("Response: \(laserTagGun.writeValue(data1, for: gunCControl, type: CBCharacteristicWriteType.withResponse))")
With this, data1 just prints 0 bytes, and I can see that laserTagGun.writeValue isn't actually doing anything by reading data from the other characteristics. How can I convert my UInt8 array to Data in Swift? Also, please let me know if there is a better way to handle 20 bytes of data than a UInt8 array. Thank you for your help!
It looks like you're really trying to avoid a copy of the bytes. If not, then just init a new Data with your bytes array:
let data2 = Data(bytes)
print("data2: \(data2)")
If you really want to avoid the copy, what about something like this?
let data1 = Data(bytesNoCopy: UnsafeMutableRawPointer(mutating: bytes), count: bytes.count, deallocator: .none)
print("data1: \(data1)")
I'm using AVAssetWriter to write audio CMSampleBuffers to an mp4 file, but when I later read that file using AVAssetReader, it seems to be missing the initial chunk of data.
Here's the debug description of the first CMSampleBuffer passed to the writer input append method (notice the priming duration attachment of 1024/44_100):
CMSampleBuffer 0x102ea5b60 retainCount: 7 allocator: 0x1c061f840
invalid = NO
dataReady = YES
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
buffer-level attachments:
TrimDurationAtStart = {
epoch = 0;
flags = 1;
timescale = 44100;
value = 1024;
}
formatDescription = <CMAudioFormatDescription 0x281fd9720 [0x1c061f840]> {
mediaType:'soun'
mediaSubType:'aac '
mediaSpecific: {
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x2
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }
cookie: {<CFData 0x2805f50a0 [0x1c061f840]>{length = 39, capacity = 39, bytes = 0x03808080220000000480808014401400 ... 1210068080800102}}
ACL: {(null)}
FormatList Array: {
Index: 0
ChannelLayoutTag: 0x650002
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x0
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }}
}
extensions: {(null)}
}
sbufToTrackReadiness = 0x0
numSamples = 1
outputPTS = {6683542167/44100 = 151554.244, rounded}(based on cachedOutputPresentationTimeStamp)
sampleTimingArray[1] = {
{PTS = {6683541143/44100 = 151554.221, rounded}, DTS = {6683541143/44100 = 151554.221, rounded}, duration = {1024/44100 = 0.023}},
}
sampleSizeArray[1] = {
sampleSize = 163,
}
dataBuffer = 0x281cc7a80
Here's the debug description of the second CMSampleBuffer (notice the priming duration attachment of 1088/44_100, which combined with the previous trim duration yields the standard value of 2112):
CMSampleBuffer 0x102e584f0 retainCount: 7 allocator: 0x1c061f840
invalid = NO
dataReady = YES
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
buffer-level attachments:
TrimDurationAtStart = {
epoch = 0;
flags = 1;
timescale = 44100;
value = 1088;
}
formatDescription = <CMAudioFormatDescription 0x281fd9720 [0x1c061f840]> {
mediaType:'soun'
mediaSubType:'aac '
mediaSpecific: {
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x2
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }
cookie: {<CFData 0x2805f50a0 [0x1c061f840]>{length = 39, capacity = 39, bytes = 0x03808080220000000480808014401400 ... 1210068080800102}}
ACL: {(null)}
FormatList Array: {
Index: 0
ChannelLayoutTag: 0x650002
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x0
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }}
}
extensions: {(null)}
}
sbufToTrackReadiness = 0x0
numSamples = 1
outputPTS = {6683543255/44100 = 151554.269, rounded}(based on cachedOutputPresentationTimeStamp)
sampleTimingArray[1] = {
{PTS = {6683542167/44100 = 151554.244, rounded}, DTS = {6683542167/44100 = 151554.244, rounded}, duration = {1024/44100 = 0.023}},
}
sampleSizeArray[1] = {
sampleSize = 179,
}
dataBuffer = 0x281cc4750
Now, when I read the audio track using AVAssetReader, the first CMSampleBuffer I get is:
CMSampleBuffer 0x102ed7b20 retainCount: 7 allocator: 0x1c061f840
invalid = NO
dataReady = YES
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
buffer-level attachments:
EmptyMedia(P) = true
formatDescription = (null)
sbufToTrackReadiness = 0x0
numSamples = 0
outputPTS = {0/1 = 0.000}(based on outputPresentationTimeStamp)
sampleTimingArray[1] = {
{PTS = {0/1 = 0.000}, DTS = {INVALID}, duration = {0/1 = 0.000}},
}
dataBuffer = 0x0
and the next one contains priming info of 1088/44_100:
CMSampleBuffer 0x10318bc00 retainCount: 7 allocator: 0x1c061f840
invalid = NO
dataReady = YES
makeDataReadyCallback = 0x0
makeDataReadyRefcon = 0x0
buffer-level attachments:
FillDiscontinuitiesWithSilence(P) = true
GradualDecoderRefresh(P) = 1
TrimDurationAtStart(P) = {
epoch = 0;
flags = 1;
timescale = 44100;
value = 1088;
}
IsGradualDecoderRefreshAuthoritative(P) = false
formatDescription = <CMAudioFormatDescription 0x281fdcaa0 [0x1c061f840]> {
mediaType:'soun'
mediaSubType:'aac '
mediaSpecific: {
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x0
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }
cookie: {<CFData 0x2805f3800 [0x1c061f840]>{length = 39, capacity = 39, bytes = 0x03808080220000000480808014401400 ... 1210068080800102}}
ACL: {Stereo (L R)}
FormatList Array: {
Index: 0
ChannelLayoutTag: 0x650002
ASBD: {
mSampleRate: 44100.000000
mFormatID: 'aac '
mFormatFlags: 0x0
mBytesPerPacket: 0
mFramesPerPacket: 1024
mBytesPerFrame: 0
mChannelsPerFrame: 2
mBitsPerChannel: 0 }}
}
extensions: {{
VerbatimISOSampleEntry = {length = 87, bytes = 0x00000057 6d703461 00000000 00000001 ... 12100680 80800102 };
}}
}
sbufToTrackReadiness = 0x0
numSamples = 43
outputPTS = {83/600 = 0.138}(based on outputPresentationTimeStamp)
sampleTimingArray[1] = {
{PTS = {1024/44100 = 0.023}, DTS = {1024/44100 = 0.023}, duration = {1024/44100 = 0.023}},
}
sampleSizeArray[43] = {
sampleSize = 179,
sampleSize = 173,
sampleSize = 178,
sampleSize = 172,
sampleSize = 172,
sampleSize = 159,
sampleSize = 180,
sampleSize = 200,
sampleSize = 187,
sampleSize = 189,
sampleSize = 206,
sampleSize = 192,
sampleSize = 195,
sampleSize = 186,
sampleSize = 183,
sampleSize = 189,
sampleSize = 211,
sampleSize = 198,
sampleSize = 204,
sampleSize = 211,
sampleSize = 204,
sampleSize = 202,
sampleSize = 218,
sampleSize = 210,
sampleSize = 206,
sampleSize = 207,
sampleSize = 221,
sampleSize = 219,
sampleSize = 236,
sampleSize = 219,
sampleSize = 227,
sampleSize = 225,
sampleSize = 225,
sampleSize = 229,
sampleSize = 225,
sampleSize = 236,
sampleSize = 233,
sampleSize = 231,
sampleSize = 249,
sampleSize = 234,
sampleSize = 250,
sampleSize = 249,
sampleSize = 259,
}
dataBuffer = 0x281cde370
The input append method keeps returning true, which in principle means that all sample buffers got appended, but the reader for some reason skips the first chunk of data. Is there anything I'm doing wrong here?
I'm using the following code to read the file:
let asset = AVAsset(url: fileURL)
guard let assetReader = try? AVAssetReader(asset: asset) else {
    return
}
asset.loadValuesAsynchronously(forKeys: ["tracks"]) {
    guard let audioTrack = asset.tracks(withMediaType: .audio).first else { return }
    let audioOutput = AVAssetReaderTrackOutput(track: audioTrack, outputSettings: nil)
    assetReader.add(audioOutput) // the output must be added before startReading()
    assetReader.startReading()
    while assetReader.status == .reading {
        if let sampleBuffer = audioOutput.copyNextSampleBuffer() {
            // do something
        }
    }
}
First some pedantry: you haven't lost your first sample buffer, but rather the first packet within your first sample buffer.
The behaviour of AVAssetReader with nil outputSettings when reading AAC packet data has changed on iOS 13 and macOS 10.15 (Catalina).
Previously you would get the first AAC packet, that packet's presentation timestamp (zero) and a trim attachment instructing you to discard the usual first 2112 frames of decoded audio.
Now [iOS 13, macOS 10.15] AVAssetReader seems to discard the first packet, leaving you with the second packet, whose presentation timestamp is 1024, and you need only discard 2112 - 1024 = 1088 of the decoded frames.
Something that might not be immediately obvious in the above situation is that AVAssetReader is talking about TWO timelines, not one. The packet timestamps refer to one, the untrimmed timeline, and the trim instruction implies the existence of another: the trimmed timeline.
The transformation from untrimmed to trimmed timestamps is very simple: it's usually trimmed = untrimmed - 2112.
So is the new behaviour a bug? The fact that if you decode to LPCM and correctly follow the trim instructions, then you should still get the same audio, leads me to believe the change was intentional (NB: I haven't yet personally confirmed the LPCM samples are the same).
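(For illustration, decoding to LPCM just means passing LPCM output settings instead of nil — a sketch using standard AVFoundation keys, not a drop-in for the code above:)

let lpcmSettings: [String: Any] = [
    AVFormatIDKey: kAudioFormatLinearPCM,
    AVLinearPCMBitDepthKey: 32,
    AVLinearPCMIsFloatKey: true,
    AVLinearPCMIsNonInterleaved: false
]
let audioOutput = AVAssetReaderTrackOutput(track: audioTrack, outputSettings: lpcmSettings)
// The reader should now vend decoded LPCM with the trim instructions already applied.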
However, the documentation says:
A value of nil for outputSettings configures the output to vend samples in their original format as stored by the specified track.
I don't think you can both discard packets [even the first one, which is basically a constant] and claim to be vending samples in their "original format", so from this point of view I think the change has a bug-like quality.
I also think it's an unfortunate change as I used to consider nil outputSettings AVAssetReader to be a sort of "raw" mode, but now it assumes your only use case is decoding to LPCM.
There's only one thing that could downgrade "unfortunate" to "serious bug", and that's if this new "let's pretend the first AAC packet doesn't exist" approach extends to files created with AVAssetWriter because that would break interoperability with non-AVAssetReader code, where the number of frames to trim has congealed to a constant 2112 frames. I also haven't personally confirmed this. Do you have a file created with the above sample buffers that you can share?
p.s. I don't think your input sample buffers are relevant here; I think you'd lose the first packet reading from any AAC file. However, your input sample buffers seem slightly unusual in that they have hosttime [capture session?] style timestamps, yet are AAC, and only have one packet per sample buffer, which isn't very many and seems like a lot of overhead for 23ms of audio. Are you creating them yourself in an AVCaptureSession -> AVAudioConverter chain?
I have a rusoto_core::ByteStream which implements futures' Stream trait:
let chunks = vec![b"1234".to_vec(), b"5678".to_vec()];
let stream = ByteStream::new(stream::iter_ok(chunks));
I'd like to pass it to actix_web's HttpResponseBuilder::streaming method.
use actix_web::dev::HttpResponseBuilder; // 0.7.18
use rusoto_core::ByteStream; // 0.36.0
fn example(stream: ByteStream, builder: HttpResponseBuilder) {
    builder.streaming(stream);
}
When I try to do it I receive the following error:
error[E0271]: type mismatch resolving `<rusoto_core::stream::ByteStream as futures::stream::Stream>::Item == bytes::bytes::Bytes`
--> src/main.rs:5:13
|
5 | builder.streaming(stream);
| ^^^^^^^^^ expected struct `std::vec::Vec`, found struct `bytes::bytes::Bytes`
|
= note: expected type `std::vec::Vec<u8>`
found type `bytes::bytes::Bytes`
I believe the reason is that streaming() expects an S: Stream<Item = Bytes, Error = E> (i.e., Item = Bytes) but my ByteStream has Item = Vec<u8>. How can I fix it?
I think the solution is to flatmap my ByteStream somehow but I couldn't find such a method for streams.
Here's an example of how streaming() can be used:
let text = "123";
let (tx, rx_body) = mpsc::unbounded();
let _ = tx.unbounded_send(Bytes::from(text.as_bytes()));
HttpResponse::Ok()
    .streaming(rx_body.map_err(|e| error::ErrorBadRequest("bad request")))
How can I flatmap streams in Rust?
A flat map converts an iterator of iterators into a single iterator (or, here, a stream of streams into a single stream).
Futures 0.3
Futures 0.3 doesn't have a direct flat map, but it does have StreamExt::flatten, which can be used after a StreamExt::map.
use futures::{stream, Stream, StreamExt}; // 0.3.1

fn into_many(i: i32) -> impl Stream<Item = i32> {
    stream::iter(0..i)
}

fn nested() -> impl Stream<Item = i32> {
    let stream_of_number = into_many(5);
    let stream_of_stream_of_number = stream_of_number.map(into_many);
    let flat_stream_of_number = stream_of_stream_of_number.flatten();
    // Returns: 0, 0, 1, 0, 1, 2, 0, 1, 2, 3
    flat_stream_of_number
}
Futures 0.1
Futures 0.1 doesn't have a direct flat map, but it does have Stream::flatten, which can be used after a Stream::map.
use futures::{stream, Stream}; // 0.1.25

fn into_many(i: i32) -> impl Stream<Item = i32, Error = ()> {
    stream::iter_ok(0..i)
}

fn nested() -> impl Stream<Item = i32, Error = ()> {
    let stream_of_number = into_many(5);
    let stream_of_stream_of_number = stream_of_number.map(into_many);
    let flat_stream_of_number = stream_of_stream_of_number.flatten();
    // Returns: 0, 0, 1, 0, 1, 2, 0, 1, 2, 3
    flat_stream_of_number
}
However, this doesn't solve your problem.
streaming() expects an S: Stream<Item = Bytes, Error = E> (i.e., Item = Bytes) but my ByteStream has Item = Vec<u8>
Yes, this is the problem. Use Bytes::from via Stream::map to convert your stream's Item from one type to another:
use bytes::Bytes; // 0.4.11
use futures::Stream; // 0.1.25
fn example(stream: ByteStream, mut builder: HttpResponseBuilder) {
    builder.streaming(stream.map(Bytes::from));
}
I am using ExtAudioFileCreateWithURL and consistently get a runtime kAudioFileUnsupportedDataFormatError when creating a stereo LPCM Float32 WAVE file. I stress that the same procedure works fine with a mono (single-channel) file. Any hints?
Here's the code snippet:
let audioType: AudioFileTypeID = kAudioFileWAVEType
var recordingFormatStream = CAStreamBasicDescription(sampleRate: sampleRate, numChannels: 2, pcmf: .Float32, isInterleaved: false)!
err = ExtAudioFileCreateWithURL(audioFileRecordingURL,
                                audioType,
                                &recordingFormatStream,
                                nil,
                                AudioFileFlags.EraseFile.rawValue,
                                &audioRecordingAudioFile)
noting that audioFileRecordingURL and audioRecordingAudioFile are correctly typed and set.
For the record, recordingFormatStream contains:
mFormatFlags = kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsPacked | kAudioFormatFlagIsFloat | kAudioFormatFlagIsNonInterleaved
mFormatID = kAudioFormatLinearPCM
mSampleRate: 44100.0
mBytesPerPacket: 4, mFramesPerPacket: 1, mBytesPerFrame: 4, mChannelsPerFrame: 2, mBitsPerChannel: 32, mReserved: 0
I stress that if I change numChannels to 1, everything is fine! Using the iOS 9.3 SDK.
After much struggle: the ExtAudioFile functions in the SDK do not accept non-interleaved audio as the file data format. I believe that this is somehow new!
Thanks to this post: Using ExtAudioFileWriteAsync() in callback function. Can't get to run
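Based on that finding, a minimal sketch of the stereo setup — the only change from the snippet above is isInterleaved: true, following the conclusion that the file data format must be interleaved:

var recordingFormatStream = CAStreamBasicDescription(sampleRate: sampleRate,
                                                     numChannels: 2,
                                                     pcmf: .Float32,
                                                     isInterleaved: true)! // interleaved is the key change
err = ExtAudioFileCreateWithURL(audioFileRecordingURL,
                                audioType,
                                &recordingFormatStream,
                                nil,
                                AudioFileFlags.EraseFile.rawValue,
                                &audioRecordingAudioFile)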