Swift - get kCMSampleBufferAttachmentKey_DroppedFrameReason from CMSampleBuffer - ios

I'm trying to understand why my AVCaptureOutput is dropping frames. In the captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) delegate method, I get a CMSampleBuffer that should contain an attachment explaining why the frame was dropped (doc).
The reason is expected to be one of these CFString constants:
kCMSampleBufferDroppedFrameReason_FrameWasLate // "FrameWasLate"
kCMSampleBufferDroppedFrameReason_OutOfBuffers // "OutOfBuffers"
kCMSampleBufferDroppedFrameReason_Discontinuity // "Discontinuity"
From the docs it's really not clear how to get this value. I've tried using CMGetAttachment but this returns a CMAttachmentMode aka UInt32:
func captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    var reason: CMAttachmentMode = 0
    CMGetAttachment(sampleBuffer, kCMSampleBufferAttachmentKey_DroppedFrameReason, &reason)
    print("reason \(reason)") // 1
}
and I don't really know how to match this UInt32 to the CFString constants.

I was looking at the wrong output. CMGetAttachment returns the attachment itself; the CMAttachmentMode is only written to the out-parameter:
var mode: CMAttachmentMode = 0
let reason = CMGetAttachment(sampleBuffer, kCMSampleBufferAttachmentKey_DroppedFrameReason, &mode)
print("reason \(String(describing: reason))") // Optional(OutOfBuffers)
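If you want to branch on the reason instead of just printing it, here is a minimal sketch. It assumes the value returned by CMGetAttachment can be cast to a String, and the enum and function names are made up for illustration. The raw values are the three strings listed in the question.

```swift
/// Hypothetical enum mirroring the three reason strings listed in the question.
enum DroppedFrameReason: String {
    case frameWasLate = "FrameWasLate"
    case outOfBuffers = "OutOfBuffers"
    case discontinuity = "Discontinuity"
}

/// Turns the value returned by CMGetAttachment into a typed reason.
/// Returns nil if there is no attachment, it is not a string,
/// or the string is not one of the three cases above.
func droppedFrameReason(from attachment: Any?) -> DroppedFrameReason? {
    guard let raw = attachment as? String else { return nil }
    return DroppedFrameReason(rawValue: raw)
}

// Inside captureOutput(_:didDrop:from:), where `reason` is the value
// returned by CMGetAttachment:
//   if let typed = droppedFrameReason(from: reason) { print("dropped: \(typed)") }
```

Comparing against the kCMSampleBufferDroppedFrameReason_* constants directly (bridged with `as String`) should work too; the enum just makes a switch statement exhaustive.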


How to perform a method in a continuous method from a delegate every few seconds?

I want to perform something using the AVCaptureVideoDataOutputSampleBufferDelegate protocol. But since it captures every frame at (I think) 30 frames per second, the method runs 30 times per second, and I don't want that. I want the work to run only about once per second. So far my code looks like this:
func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    // ... perform something
    // ... wait for a second
    // ... perform it again
    // ... wait for another second
    // ... and so on
}
How can I manage to do this?
You can add a counter and run your code only every n calls. For example, to run it every 30th time the function is called:
var counter: Int = 0
func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    if counter % 30 == 0 {
        // perform your code
    }
    counter += 1
}
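The counter approach only gives one run per second if the camera really delivers 30 frames per second. If that is not guaranteed, you could gate on the frame's timestamp instead. This is a minimal sketch, not a definitive implementation: FrameThrottle is a made-up helper, and it assumes you pass in the presentation time in seconds.

```swift
/// Hypothetical helper: lets a frame through only if at least `minInterval`
/// seconds have passed since the last frame it let through.
struct FrameThrottle {
    let minInterval: Double
    private var lastAccepted: Double? = nil

    init(minInterval: Double) {
        self.minInterval = minInterval
    }

    /// `timestamp` is the frame time in seconds, for example
    /// CMSampleBufferGetPresentationTimeStamp(sampleBuffer).seconds
    mutating func shouldProcess(at timestamp: Double) -> Bool {
        if let last = lastAccepted, timestamp - last < minInterval {
            return false   // too soon after the last processed frame
        }
        lastAccepted = timestamp
        return true
    }
}
```

Keep one FrameThrottle(minInterval: 1) as a property of the delegate and return early from captureOutput whenever shouldProcess(at:) is false.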
You can use Timer for that.
Timer.scheduledTimer(withTimeInterval: 1, repeats: true) { timer in
    let randomNumber = Int.random(in: 1...20)
    print("Number: \(randomNumber)")
    if randomNumber == 10 {
        timer.invalidate() // stop the repeating timer
    }
}

Why does AVCaptureSession.stopRunning not stop?

I have recently been onboarded onto an existing project at work which is written in Swift.
The metadataOutput(_:didOutput:from:) delegate function looks like this:
func metadataOutput(_ output: AVCaptureMetadataOutput,
                    didOutput metadataObjects: [AVMetadataObject],
                    from connection: AVCaptureConnection) {
    if let metadataObject = metadataObjects.first,
       let readableObject = metadataObject as? AVMetadataMachineReadableCodeObject {
        ... // other code
    }
}
In theory, the print call should run only once; however, in the console I can see the output printed several times.
Is there something I am missing here? I created a bare-bones standalone scanning app, and there it prints only once.

How can I extract image buffer from captureOutput (didDrop samplebuffer)?

My application gets video frames from the camera, runs CoreML on them, and displays the result in a view. To do this, I have an AVCaptureSession hooked up to a video output. The main CoreML processing is done in captureOutput(didOutput sampleBuffer). Processing each frame takes about 0.15 seconds, so I will inevitably have some dropped frames. After the CoreML processing is done, an AVAssetWriter appends all these frames together and saves the result to the phone's storage.
The main problem, however, is that my use case also requires the original video to be saved at a high frame rate. Since I can only get image frames in captureOutput(didOutput), the saved video will be choppy.
I have tried the following:
I am using AVAssetWriter because, according to https://forums.developer.apple.com/thread/98113#300885, it is not possible to use both AVCaptureVideoDataOutput and AVCaptureMovieFileOutput.
I have also tried extracting the image buffer in captureOutput(didDrop) using guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return } but it gives me nil. This is because the sampleBuffer contains only metadata and no image buffer, as explained here: https://developer.apple.com/documentation/avfoundation/avcapturevideodataoutputsamplebufferdelegate/1388468-captureoutput.
Here is some of my code:
func captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) { /* How can I extract an image buffer from the dropped frames here? */ }
func captureOutput(_ captureOutput: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    let inputVideoImage = UIImage(pixelBuffer: pixelBuffer)
    if self.isRecording {
        let sourceTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
        if let camera = self.videoWriterVideoInput, camera.isReadyForMoreMediaData {
            videoWriterQueue.async() {
                self.videoWriterInputPixelBufferAdaptor.append(pixelBuffer, withPresentationTime: sourceTime)
            }
        } else {
            print("AVAssetInput is not ready for more media data ... ")
        }
    }
}
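Since the dropped sample buffers carry no image data, one possible direction is to avoid dropping frames in the first place: keep the didOutput callback short, hand every frame to the writer, and start the slow CoreML step only when the previous one has finished. The following is a rough sketch under that assumption, not a tested solution for this app. InferenceGate, appendToWriter, runModel and inferenceQueue are hypothetical names.

```swift
import Foundation

/// Hypothetical gate: at most one slow job (e.g. a CoreML prediction) runs at a time.
/// Frames that arrive while it is busy are simply not sent to the model.
final class InferenceGate {
    private let lock = NSLock()
    private var busy = false

    /// Returns true if the caller may start a job.
    /// The caller must call finish() when the job is done.
    func tryBegin() -> Bool {
        lock.lock()
        defer { lock.unlock() }
        if busy { return false }
        busy = true
        return true
    }

    /// Marks the current job as done so the next frame can be processed.
    func finish() {
        lock.lock()
        busy = false
        lock.unlock()
    }
}

// Usage inside captureOutput(_:didOutput:from:), with hypothetical names:
//   appendToWriter(sampleBuffer)          // every frame goes to the AVAssetWriter
//   if gate.tryBegin() {
//       inferenceQueue.async {
//           runModel(on: pixelBuffer)     // the ~0.15 s CoreML step
//           gate.finish()
//       }
//   }
```

With this split, the saved video keeps the camera's frame rate while the model sees only the frames it has time for.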

How to process only 1/10 frames in func captureOutput

I am doing some processing on frames in the following method:
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {}
But I do not want to do it on every frame; say I want to process only 1 frame in 15 or 1 in 10. How can I achieve this? Is there any built-in logic for it in Swift?
You can add a counter to your class and then increment it in captureOutput:
var i = 0 // counter stored as a property of your class
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    i += 1
    if i % 15 == 0 { // every 15 frames
        // process your frame
    }
}
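If you need this in more than one place, the same counter idea can be wrapped in a small value type. This is only a sketch; FrameSampler is a made-up name, not something provided by Swift or AVFoundation.

```swift
/// Hypothetical helper: returns true for one call out of every `stride` calls.
struct FrameSampler {
    let stride: Int
    private var count = 0

    init(every stride: Int) {
        precondition(stride > 0, "stride must be positive")
        self.stride = stride
    }

    /// Call once per delivered frame; true means "process this one".
    mutating func shouldProcess() -> Bool {
        count += 1
        if count == stride {
            count = 0   // reset so the counter never grows without bound
            return true
        }
        return false
    }
}
```

For example, with a property `var sampler = FrameSampler(every: 10)`, start captureOutput with `guard sampler.shouldProcess() else { return }` to process 1 frame in 10.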

Real time face tracking with camera in swift 4

I want to be able to track a user's face from the camera feed. I have looked at this SO post. I used the code given in the answer, but it did not seem to do anything. I have heard that
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!)
has been changed to something else in Swift 4. Could this be the problem with the code?
While face tracking I want to also monitor face landmarks with CIFaceFeature. How would I do this?
I have found a starting point here: https://github.com/jeffreybergier/Blog-Getting-Started-with-Vision.
Basically, you can instantiate a video capture session by declaring a lazy variable like this:
private lazy var captureSession: AVCaptureSession = {
    let session = AVCaptureSession()
    session.sessionPreset = AVCaptureSession.Preset.photo
    guard
        let frontCamera = AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: .front),
        let input = try? AVCaptureDeviceInput(device: frontCamera)
    else { return session }
    session.addInput(input) // attach the camera to the session
    return session
}()
Then inside viewDidLoad you start the session.
And finally you can perform your requests inside captureOutput(_:didOutput:from:), for example:
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    // make sure the pixel buffer can be converted
    guard let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
    else { return }
    let faceRequest = VNDetectFaceRectanglesRequest(completionHandler: self.faceDetectedRequestUpdate)
    // perform the request
    do {
        try self.visionSequenceHandler.perform([faceRequest], on: pixelBuffer)
    } catch {
        print("Throws: \(error)")
    }
}
And then you define your faceDetectedRequestUpdate function.
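The handler itself is not shown here. One piece it usually needs is a coordinate conversion, because a VNFaceObservation reports its boundingBox in normalized coordinates with the origin at the lower left. Here is a minimal sketch of that step; viewRect(forNormalized:in:) and previewSize are hypothetical names, and mirroring and orientation are deliberately ignored.

```swift
import Foundation

/// Converts a Vision-style normalized rect (origin at lower left, values 0...1)
/// into a rect in a top-left-origin coordinate space of the given size.
/// Ignores mirroring and orientation, which a front-camera preview may also need.
func viewRect(forNormalized box: CGRect, in size: CGSize) -> CGRect {
    return CGRect(
        x: box.minX * size.width,
        y: (1 - box.maxY) * size.height,   // flip the vertical axis
        width: box.width * size.width,
        height: box.height * size.height
    )
}

// Inside faceDetectedRequestUpdate(_:error:), hypothetically:
//   let faces = request.results as? [VNFaceObservation] ?? []
//   let rects = faces.map { viewRect(forNormalized: $0.boundingBox, in: previewSize) }
```

Remember to hop to the main queue before using the resulting rects to update any views.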
Anyway I have to say that I haven't been able to figure out how to create a working example from here. The best working example I have found is in Apple's documentation: https://developer.apple.com/documentation/vision/tracking_the_user_s_face_in_real_time
