Find bit-depth of an AVAsset with audio encoded in linear PCM - ios

Given an AVAsset representing a movie that has at least one audio track one can determine various properties of this audio track by obtaining an AudioStreamBasicDescription instance corresponding to it:
AVAssetTrack *audio_track = [[asset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
CMFormatDescriptionRef formatDescriptionRef = (__bridge CMFormatDescriptionRef)[audio_track.formatDescriptions objectAtIndex:0];
const AudioStreamBasicDescription *ASBD = CMAudioFormatDescriptionGetStreamBasicDescription(formatDescriptionRef);
This instance (ASBD) can then be examined, for example:
ASBD->mFormatID == kAudioFormatLinearPCM // True if the track is PCM
ASBD->mFormatFlags & kAudioFormatFlagIsBigEndian // nonzero if the format is big endian
However, I cannot seem to find a way to determine the bit depth of the samples. This is necessary because it will be supplied as the value for the key AVLinearPCMBitDepthKey in a dictionary that gets passed as output settings to +[AVAssetWriterInput assetWriterInputWithMediaType:outputSettings:].
How can this information be extracted from an AVAsset or an AVAssetTrack?
(The context is re-encoding the video in an AVAsset, but leaving the audio as-is)

The bit depth is stored in ASBD->mBitsPerChannel. For linear PCM this is the number of bits per sample in each channel (for example 16 or 24).
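As a rough Swift sketch of how that field could feed the output settings mentioned in the question: the helper name linearPCMSettings(from:) is hypothetical, and the set of keys copied over is an assumption about what a writer input would want, not a confirmed recipe.

```swift
import AVFoundation

/// Hypothetical helper: builds linear PCM output settings that mirror
/// an existing stream description. Returns nil for non-PCM audio.
func linearPCMSettings(from asbd: AudioStreamBasicDescription) -> [String: Any]? {
    guard asbd.mFormatID == kAudioFormatLinearPCM else { return nil }
    let flags = asbd.mFormatFlags
    return [
        AVFormatIDKey: kAudioFormatLinearPCM,
        AVSampleRateKey: asbd.mSampleRate,
        AVNumberOfChannelsKey: Int(asbd.mChannelsPerFrame),
        // The bit depth comes straight from mBitsPerChannel.
        AVLinearPCMBitDepthKey: Int(asbd.mBitsPerChannel),
        AVLinearPCMIsBigEndianKey: flags & kAudioFormatFlagIsBigEndian != 0,
        AVLinearPCMIsFloatKey: flags & kAudioFormatFlagIsFloat != 0,
        AVLinearPCMIsNonInterleaved: flags & kAudioFormatFlagIsNonInterleaved != 0,
    ]
}
```

The resulting dictionary would be the candidate for the outputSettings argument; whether the writer accepts every combination depends on the source format.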


How can I write XMP metadata to a QuickTime video on iOS?

I’m trying to attach some XMP metadata to a QuickTime video I'm exporting using AVAssetExportSession.
AVFoundation does support writing metadata (AVMetadataItem) and I’ve managed to export simple values which can subsequently be examined using exiftool:
AVMutableMetadataItem *item = [AVMutableMetadataItem metadataItem];
item.identifier = [AVMetadataItem identifierForKey:AVMetadataCommonKeyTitle keySpace:AVMetadataKeySpaceCommon];
item.value = @"My Title";
exportSession.metadata = @[item];
But I’m having trouble configuring my AVMetadataItem’s to correctly encode XMP. According to the Adobe XMP spec, XMP in QuickTime videos should be under the moov / udta / XMP_ atoms but I can’t see a way to make hierarchical metadata using the AVFoundation API, or any key space that corresponds to this part of the metadata.
I also need to write XMP metadata to images, and Image I/O does have direct support for this (CGImageMetadataCreateFromXMPData), but I can't find anything equivalent in AVFoundation.
If it's not possible using AVFoundation (or similar), I'll probably look at integrating XMP-Toolkit-SDK but this feels like a clunky solution when AVFoundation almost seems to do what I need.
I finally managed to figure this out after trying lots of variations of keys, key spaces, and other attributes of AVMetadataItem:
Use a custom XMP_ key in the AVMetadataKeySpaceQuickTimeUserData key space
Set the value not as an NSString but as an NSData containing UTF-8 data for the payload
Set the dataType to raw data
This results in XMP attributes that can be read by exiftool as expected.
NSString *payload =
@"<x:xmpmeta xmlns:x=\"adobe:ns:meta/\" x:xmptk=\"MyAppXMPLibrary\">"
"<rdf:RDF xmlns:rdf=\"\">"
"<rdf:Description rdf:about=\"\" xmlns:xmp=\"\">"
"<xmp:CreatorTool>My App</xmp:CreatorTool>"
"</rdf:Description>"
"</rdf:RDF>"
"</x:xmpmeta>";
// Encode the XMP packet as UTF-8 bytes.
NSData *data = [payload dataUsingEncoding:NSUTF8StringEncoding];
AVMutableMetadataItem *item = [AVMutableMetadataItem metadataItem];
// Custom XMP_ key in the QuickTime user data key space.
item.identifier = [AVMetadataItem identifierForKey:@"XMP_" keySpace:AVMetadataKeySpaceQuickTimeUserData];
item.dataType = (__bridge NSString *)kCMMetadataBaseDataType_RawData;
item.value = data;
exportSession.metadata = @[item];
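For comparison, a minimal Swift sketch of the same three steps. The function name makeXMPMetadataItem(creatorTool:) is hypothetical, and the two namespace URIs are the standard RDF and XMP ones, filled in here as an assumption because the snippet above shows them empty. The creatorTool string is not XML-escaped in this sketch.

```swift
import AVFoundation

// Standard namespace URIs (an assumption; they appear empty in the snippet above).
let rdfNamespace = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
let xmpNamespace = "http://ns.adobe.com/xap/1.0/"

/// Hypothetical helper: wraps an XMP packet in a raw-data metadata item
/// stored under the XMP_ key of the QuickTime user data key space.
func makeXMPMetadataItem(creatorTool: String) -> AVMutableMetadataItem {
    let payload =
        "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\" x:xmptk=\"MyAppXMPLibrary\">" +
        "<rdf:RDF xmlns:rdf=\"\(rdfNamespace)\">" +
        "<rdf:Description rdf:about=\"\" xmlns:xmp=\"\(xmpNamespace)\">" +
        "<xmp:CreatorTool>\(creatorTool)</xmp:CreatorTool>" +
        "</rdf:Description></rdf:RDF></x:xmpmeta>"

    let item = AVMutableMetadataItem()
    item.identifier = AVMetadataItem.identifier(forKey: "XMP_", keySpace: .quickTimeUserData)
    item.dataType = kCMMetadataBaseDataType_RawData as String
    item.value = Data(payload.utf8) as NSData   // UTF-8 bytes, not a string
    return item
}
```

The returned item would then be assigned to the export session, for example exportSession.metadata = [makeXMPMetadataItem(creatorTool: "My App")].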

Detect current Keyframe interval in AVAsset

I am working on an application that plays back video and allows the user to scrub forwards and backwards in the video. The scrubbing has to happen smoothly, so we always re-write the video with SDAVAssetExportSession using the video compression property AVVideoMaxKeyFrameIntervalKey:@1, so that each frame is a keyframe, which allows smooth reverse scrubbing. This works great and provides smooth playback.
The application uses video from a variety of sources: it can be recorded on Android or iOS devices, or even downloaded from the web and added to the application. We therefore end up with quite different encodings, some of which are already suited for scrubbing (each frame is a keyframe). Is there a way to detect the keyframe interval of a video file so I can avoid needless video processing? I have been through much of AVFoundation's docs and don't see an obvious way to get this information. Thanks for any help on this.
You can quickly parse the file without decoding the images by creating an AVAssetReaderTrackOutput with nil outputSettings. The frame sample buffers you encounter have an attachment array containing a dictionary with useful information, including whether the frame depends on other frames and whether other frames depend on it. I would interpret a frame that does not depend on others as a keyframe, although this gives me a surprisingly low number (4% keyframes in one file?). Anyway, the code:
let asset = AVAsset(url: inputUrl)
let reader = try! AVAssetReader(asset: asset)
let videoTrack = asset.tracks(withMediaType: AVMediaTypeVideo)[0]
// nil outputSettings: samples are returned in their stored (compressed) form
let trackReaderOutput = AVAssetReaderTrackOutput(track: videoTrack, outputSettings: nil)
reader.add(trackReaderOutput)
reader.startReading()
var numFrames = 0
var keyFrames = 0
while true {
    if let sampleBuffer = trackReaderOutput.copyNextSampleBuffer() {
        // NB: not every sample buffer corresponds to a frame!
        if CMSampleBufferGetNumSamples(sampleBuffer) > 0 {
            numFrames += 1
            if let attachmentArray = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false) as? NSArray {
                let attachment = attachmentArray[0] as! NSDictionary
                // print("attach on frame \(frame): \(attachment)")
                if let depends = attachment[kCMSampleAttachmentKey_DependsOnOthers] as? NSNumber {
                    if !depends.boolValue {
                        keyFrames += 1
                    }
                }
            }
        }
    } else {
        // No more sample buffers: the track has been read to the end.
        break
    }
}
print("\(keyFrames) on \(numFrames)")
N.B. This only works for local file assets.
p.s. you don't say how you're scrubbing or playing. An AVPlayerViewController and an AVPlayer?
Here is the Objective-C version of the same answer. After implementing this and using it, videos that should consist entirely of keyframes return about 96% keyframes from this code. I'm not sure why, so I use that number as the deciding threshold even though I would like it to be more accurate. I also only look through the first 600 frames or until the end of the video (whichever comes first), since I don't need to read through a whole 20-minute video to make this determination.
+ (BOOL)videoNeedsProcessingForSlomo:(NSURL *)fileUrl {
    BOOL needsProcessing = YES;
    AVAsset *anAsset = [AVAsset assetWithURL:fileUrl];
    NSError *error = nil;
    AVAssetReader *assetReader = [AVAssetReader assetReaderWithAsset:anAsset error:&error];
    if (error) {
        DLog(@"Error:%@", error.localizedDescription);
        return YES;
    }
    AVAssetTrack *videoTrack = [[anAsset tracksWithMediaType:AVMediaTypeVideo] objectAtIndex:0];
    AVAssetReaderTrackOutput *trackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:videoTrack outputSettings:nil];
    [assetReader addOutput:trackOutput];
    [assetReader startReading];
    float numFrames = 0;
    float keyFrames = 0;
    while (numFrames < 600) { // If the video is long - only parse through 20 seconds worth.
        CMSampleBufferRef sampleBuffer = [trackOutput copyNextSampleBuffer];
        if (sampleBuffer) {
            // NB: not every sample buffer corresponds to a frame!
            if (CMSampleBufferGetNumSamples(sampleBuffer) > 0) {
                numFrames += 1;
                NSArray *attachmentArray = (__bridge NSArray *)CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false);
                if (attachmentArray) {
                    NSDictionary *attachment = attachmentArray[0];
                    NSNumber *depends = attachment[(__bridge NSString *)kCMSampleAttachmentKey_DependsOnOthers];
                    if (depends) {
                        // A frame that does not depend on others is counted as a keyframe.
                        if (!depends.boolValue) {
                            keyFrames += 1;
                        }
                    }
                }
            }
            CFRelease(sampleBuffer); // copyNextSampleBuffer returns an owned reference
        }
        else {
            break; // end of the track
        }
    }
    needsProcessing = keyFrames / numFrames < 0.95f; // If more than 95% of the frames are keyframes - don't decompress.
    return needsProcessing;
}
Using kCMSampleAttachmentKey_DependsOnOthers was giving me 0 key frames in some cases, when ffprobe would return key frames.
To get the same number of key frames as ffprobe shows, I used:
if attachment[CMSampleBuffer.PerSampleAttachmentsDictionary.Key.notSync] == nil {
    keyFrames += 1
}
In the CoreMedia header it says:
/// Boolean (absence of this key implies Sync)
public static let notSync: CMSampleBuffer.PerSampleAttachmentsDictionary.Key
For the dependsOnOthers key it says:
/// `true` (e.g., non-I-frame), `false` (e.g. I-frame), or absent if
/// unknown
public static let dependsOnOthers: CMSampleBuffer.PerSampleAttachmentsDictionary.Key
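Once each frame has been classified as sync or not (for instance with the notSync check), the keyframe interval the question asks about can be derived from the resulting flags. A minimal sketch in plain Swift, where KeyframeStats and keyframeStats(_:) are hypothetical names and the flags array is assumed to have been collected in the reader loop:

```swift
/// Hypothetical summary of per-frame sync flags (true = keyframe).
struct KeyframeStats {
    let frameCount: Int
    let keyframeCount: Int
    /// Longest run of frames that starts at a keyframe (or at the start
    /// of the track) and ends just before the next keyframe.
    let longestGroup: Int
}

func keyframeStats(_ isKeyframe: [Bool]) -> KeyframeStats {
    var keyframes = 0
    var longestRun = 0
    var currentRun = 0
    for flag in isKeyframe {
        if flag {
            keyframes += 1
            currentRun = 1      // a keyframe starts a new group
        } else {
            currentRun += 1     // a dependent frame extends the current group
        }
        longestRun = max(longestRun, currentRun)
    }
    return KeyframeStats(frameCount: isKeyframe.count,
                         keyframeCount: keyframes,
                         longestGroup: longestRun)
}
```

With this definition, a file in which every frame is a keyframe yields longestGroup == 1, which could serve as a stricter test than a percentage threshold.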

I'm trying to use AVQueuePlayer to create a seamless audio loop, however, I don't know why there is a small silent pause between loops?

I have a simple audio file in .wav format (the audio file is cut perfectly to loop). I've tried different methods to loop it. My first attempt was simply using AVPlayer and NSNotification to detect when audioItem ended to seek time at zero and play again. However, there was clearly a gap.
I've been looking at different solutions online, and found people using AVQueuePlayer to do a switching:
Looping AVPlayer seamlessly
However, when implemented, this still produces a gap.
Here's my current notification code:
weak var weakSelf = self
NSNotificationCenter.defaultCenter().addObserverForName(AVPlayerItemDidPlayToEndTimeNotification, object: nil, queue: nil, usingBlock: {(note: NSNotification) -> Void in
    if weakSelf?.currentQueuePlayer.currentItem == weakSelf?.currentAudioItemOne {
        weakSelf?.currentQueuePlayer.insertItem((weakSelf?.currentAudioItemTwo)!, afterItem: nil)
    } else {
        weakSelf?.currentQueuePlayer.insertItem((weakSelf?.currentAudioItemOne)!, afterItem: nil)
    }
})
Here's my code to set up the current QueuePlayer.
let audioPlayerItem = AVPlayerItem(URL: url)
currentAudioItemOne = audioPlayerItem
currentAudioItemTwo = audioPlayerItem
currentQueuePlayer = AVQueuePlayer()
currentQueuePlayer.insertItem(currentAudioItemOne, afterItem: nil)
I've been working at this problem for several days now. Any leads or new things to try would be appreciated. The only thing I haven't tried so far is lower-quality audio files. These .wav files are all over 1 MB, and I have been suspecting that the file size could be affecting the seamless looping.
Using AVPlayerLooper to create the 'Treadmill' effect:
let url = URL(fileURLWithPath: path)
let audioPlayerItem = AVPlayerItem(url: url)
currentAudioItemOne = audioPlayerItem
currentQueuePlayer = AVQueuePlayer()
currentAudioPlayerLayer = AVPlayerLayer(player: currentQueuePlayer)
currentAudioLooper = AVPlayerLooper(player: currentQueuePlayer, templateItem: currentAudioItemOne)
afinfo on one of my wav files:
Num Tracks: 1
Data format: 2 ch, 44100 Hz, 'lpcm' (0x0000000C) 16-bit little-endian signed integer
no channel layout.
estimated duration: 11.302336 sec
audio bytes: 1993732
audio packets: 498433
bit rate: 1411200 bits per second
packet size upper bound: 4
maximum packet size: 4
audio data file offset: 44
not optimized
source bit depth: I16
You are inserting the item too late in your current solution. You need to queue up more than one initial item, so there's always a primed AVPlayerItem ready to go.
This is called the AVQueuePlayer "treadmill pattern", described in more detail in this WWDC 2016 session. If you're targeting iOS 10 or later, you can use the new AVPlayerLooper class, which does it for you (also described in the same link). Apple has also provided a sample project with an example of both strategies.
Lower level solutions include queuing up the audio buffers to an AVAudioEngine instance or using an AudioQueue or mashing the buffers together yourself with an AudioUnit.

How to get the current captured timestamp of Camera data from CMSampleBufferRef in iOS

I developed an iOS application that saves captured camera data to a file. I used
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
to capture each CMSampleBufferRef, which is encoded into H.264 format, and the frames are saved to a file using AVAssetWriter.
I followed the sample source code to create this app:
Now I want to get the timestamps of the saved video frames to create a new movie file. For this, I have done the following:
Locate the file and create an AVAssetReader to read the file
CMSampleBufferRef sample = [asset_reader_output copyNextSampleBuffer];
CMSampleBufferRef buffer;
while ([assestReader status] == AVAssetReaderStatusReading) {
    buffer = [asset_reader_output copyNextSampleBuffer];
    if (buffer == NULL) {
        break; // no more samples; avoids dividing by a zero timescale below
    }
    // CMSampleBufferGetPresentationTimeStamp(buffer);
    CMTime presentationTimeStamp = CMSampleBufferGetPresentationTimeStamp(buffer);
    UInt32 timeStamp = (1000 * presentationTimeStamp.value) / presentationTimeStamp.timescale;
    NSLog(@"timestamp %u", (unsigned int) timeStamp);
    // CFRelease(buffer);
}
The printed value gives me the wrong timestamp; I need to get each frame's capture time.
Is there any way to get the frame's capture timestamp?
I've read an answer about getting the timestamp, but it does not properly address my question above.
I read the sample timestamp before writing to the file, and it gave me xxxxx value (33333.23232). When I read the file back, it gave me a different value. Is there a specific reason for this?
The file timestamps are different to the capture timestamps because they are relative to the beginning of the file. This means they are the capture timestamps you want, minus the timestamp of the very first frame captured:
presentationTimeStamp = fileFramePresentationTime + firstFrameCaptureTime
So when reading from the file, this should calculate the capture timestamp you want:
CMTime firstCaptureFrameTimeStamp = // the first capture timestamp you see
CMTime presentationTimeStamp = CMTimeAdd(CMSampleBufferGetPresentationTimeStamp(buffer), firstCaptureFrameTimeStamp);
If you do this calculation between launches of your app, you'll need to serialise and deserialise the first frame capture time, which you can do with CMTimeCopyAsDictionary and CMTimeMakeFromDictionary.
You could store this in the output file, via AVAssetWriter's metadata property.
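The offset calculation can be sketched in Swift as follows. The function names captureTime(fileTime:firstCaptureTime:) and milliseconds(_:) are hypothetical, and the numbers in the usage note are made up for illustration.

```swift
import CoreMedia

/// Hypothetical helper: recovers the capture timestamp of a frame read
/// back from the file by adding the first frame's capture time.
func captureTime(fileTime: CMTime, firstCaptureTime: CMTime) -> CMTime {
    return CMTimeAdd(fileTime, firstCaptureTime)
}

/// Converts a CMTime to whole milliseconds, or nil for invalid or
/// indefinite times (which have no meaningful numeric value).
func milliseconds(_ time: CMTime) -> Int64? {
    guard time.isNumeric else { return nil }
    return Int64((CMTimeGetSeconds(time) * 1000).rounded())
}
```

For example, a frame at 300/600 s in the file with a first capture time of 6000/600 s maps to a capture time of 10.5 s. Using Int64 for the millisecond value avoids truncating large host-clock timestamps.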

iOS: How to get Audio Sample Rate from AVAsset or AVAssetTrack

After loading an AVAsset like this:
AVAsset *asset = [AVAsset assetWithURL:url];
I want to know what the Sampling Rate is of the Audio track. Currently, I am getting the Audio Track like this:
AVAssetTrack *audioTrack = [[asset tracksWithMediaCharacteristic:AVMediaCharacteristicAudible] objectAtIndex:0];
Which works. But I can't seem to find any kind of property, not even after using Google ;-), that gives me the sampling rate. How does this normally work? Is it even possible? (I'm starting to doubt it more and more, because Googling is not giving me a lot of information...)
let asset = AVAsset(url: URL(fileURLWithPath: "asset/p2a2.aif"))
let track = asset.tracks[0]
let desc = track.formatDescriptions[0] as! CMAudioFormatDescription
let basic = CMAudioFormatDescriptionGetStreamBasicDescription(desc)
let sampleRate = basic?.pointee.mSampleRate // sample rate in Hz, or nil if unavailable
I'm using Swift so it looks a bit different but it should still work with Obj-C.
Also seems to give the correct answer but I'm a bit apprehensive because of the name.
Using Swift and AVFoundation :
let url = Bundle.main.url(forResource: "audio", withExtension: "m4a")!
let asset = AVAsset(url: url)
if let firstTrack = asset.tracks.first {
    print("bitrate: \(firstTrack.estimatedDataRate)")
}
To find more information in your metadata, you can also consult:
Found it. I was using the MTAudioProcessingTap, so in the prepare() function I could just use:
void prepare(MTAudioProcessingTapRef tap, CMItemCount maxFrames, const AudioStreamBasicDescription *processingFormat)
{
    sampleRate = processingFormat->mSampleRate;
    NSLog(@"Preparing the Audio Tap Processor");
}
