How to apply MPSImageGuidedFilter in Swift - iOS

I applied OpenCV's guided filter in my project in Python successfully, and now I have to bring this function to my iOS application. I searched the Apple developer website and found a filter called MPSImageGuidedFilter. I suppose it works in a similar way to OpenCV's guided filter. However, my limited knowledge of iOS programming doesn't let me figure out how to use it, and unfortunately I could not find any sample code on the web either. Has anyone used this filter before? A sample using any filter under MPS would also help me figure it out. I really appreciate the help.

After a few days of hard work, I managed to get MPSImageGuidedFilter working. Here is the code for anyone who wants to work with it:
import UIKit
import MetalPerformanceShaders
import MetalKit

class ViewController: UIViewController {

    public var texIn: MTLTexture!        // guidance image
    public var coefficient: MTLTexture!  // pre-computed guided-filter coefficients
    public var context: CIContext!
    let device = MTLCreateSystemDefaultDevice()!
    var queue: MTLCommandQueue!
    var metalView: MTKView { return view as! MTKView }

    override func viewDidLoad() {
        super.viewDidLoad()

        // Configure the MTKView that receives the filter output.
        metalView.drawableSize.height = 412
        metalView.drawableSize.width = 326
        metalView.framebufferOnly = false   // the drawable is written by a compute kernel
        metalView.device = device
        metalView.delegate = self
        queue = device.makeCommandQueue()

        // Load the coefficients and guidance textures from the app bundle.
        let textureLoader = MTKTextureLoader(device: device)
        let urlCoeff = Bundle.main.url(forResource: "mask", withExtension: "png")
        do {
            coefficient = try textureLoader.newTexture(URL: urlCoeff!, options: [:])
        } catch {
            fatalError("coefficient texture could not be loaded")
        }
        let url = Bundle.main.url(forResource: "guide", withExtension: "png")
        do {
            texIn = try textureLoader.newTexture(URL: url!, options: [:])
        } catch {
            fatalError("guidance texture could not be loaded")
        }
    }
}

extension ViewController: MTKViewDelegate {

    func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {}

    func draw(in view: MTKView) {
        guard let commandBuffer = queue.makeCommandBuffer(),
              let drawable = view.currentDrawable else {
            return
        }
        let shader = MPSImageGuidedFilter(device: device, kernelDiameter: 5)
        shader.epsilon = 0.001

        // Reconstruct the filtered image straight into the view's drawable texture.
        let textOut = drawable.texture
        shader.encodeReconstruction(to: commandBuffer,
                                    guidanceTexture: texIn,
                                    coefficientsTexture: coefficient,
                                    destinationTexture: textOut)
        commandBuffer.present(drawable)
        commandBuffer.commit()
    }
}
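Note that the code above feeds encodeReconstruction a pre-baked coefficients texture loaded from "mask.png". In the general case the coefficients come from the filter's own regression pass (encodeRegression), which fits the guided-filter model from a source image and the guidance image. Below is a minimal sketch of the full two-pass pipeline; the function, the srcIn parameter, and the texture descriptor settings are my own assumptions, not part of the answer above.

func encodeGuidedFilter(into commandBuffer: MTLCommandBuffer,
                        source srcIn: MTLTexture,
                        guidance: MTLTexture,
                        destination: MTLTexture,
                        device: MTLDevice) {
    let filter = MPSImageGuidedFilter(device: device, kernelDiameter: 5)
    filter.epsilon = 0.001

    // The coefficients texture should be a writable four-channel float texture.
    let desc = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba16Float,
                                                        width: srcIn.width,
                                                        height: srcIn.height,
                                                        mipmapped: false)
    desc.usage = [.shaderRead, .shaderWrite]
    guard let coefficients = device.makeTexture(descriptor: desc) else { return }

    // Pass 1: fit the local linear model from the source and guidance images.
    filter.encodeRegression(to: commandBuffer,
                            sourceTexture: srcIn,
                            guidanceTexture: guidance,
                            weightsTexture: nil,
                            destinationCoefficientsTexture: coefficients)

    // Pass 2: apply the model to the guidance image to produce the filtered output.
    filter.encodeReconstruction(to: commandBuffer,
                                guidanceTexture: guidance,
                                coefficientsTexture: coefficients,
                                destinationTexture: destination)
}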

Related

ARKit: Tracking a VisionCoreML detected object

I'm new to iOS and I am currently refactoring code I got from a tutorial on VisionCoreML and ARKit that adds a node to the detected object.
Currently, if I move the object, the node does not move and follow it. I can see from Apple's sample code for Recognizing Objects in Live Capture that they use layers and reposition them each time Vision detects the object at a new position, which is what I was hoping to replicate with an ARObject.
Is there a way I can achieve this with ARKit?
Any help around this would be greatly appreciated.
Thanks.
EDIT: Working code with solution
@IBOutlet var sceneView: ARSCNView!
private var viewportSize: CGSize!
private var previousAnchor: ARAnchor?
private var trackingNode: SCNNode!
lazy var objectDetectionRequest: VNCoreMLRequest = {
do {
let model = try VNCoreMLModel(for: yolov5s(configuration: MLModelConfiguration()).model)
let request = VNCoreMLRequest(model: model) { [weak self] request, error in
self?.processDetections(for: request, error: error)
}
return request
} catch {
fatalError("Failed to load Vision ML model.")
}
}()
func renderer(_ renderer: SCNSceneRenderer, willRenderScene scene: SCNScene, atTime time: TimeInterval) {
guard let capturedImage = sceneView.session.currentFrame?.capturedImage
else { return }
let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: capturedImage, orientation: .leftMirrored, options: [:])
do {
try imageRequestHandler.perform([objectDetectionRequest])
} catch {
print("Failed to perform image request.")
}
}
func processDetections(for request: VNRequest, error: Error?) {
guard error == nil else {
print("Object detection error: \(error!.localizedDescription)")
return
}
guard let results = request.results else { return }
for observation in results where observation is VNRecognizedObjectObservation {
let objectObservation = observation as! VNRecognizedObjectObservation
let topLabelObservation = objectObservation.labels.first
print(topLabelObservation!.identifier + " " + "\(Int(topLabelObservation!.confidence * 100))%")
guard recognisedObject(topLabelObservation!.identifier) && topLabelObservation!.confidence > 0.9
else { continue }
let rect = VNImageRectForNormalizedRect(
objectObservation.boundingBox,
Int(self.sceneView.bounds.width),
Int(self.sceneView.bounds.height))
let midPoint = CGPoint(x: rect.midX, y: rect.midY)
let raycastQuery = self.sceneView.raycastQuery(from: midPoint,
allowing: .estimatedPlane,
alignment: .any)
let raycastArray = self.sceneView.session.raycast(raycastQuery!)
guard let raycastResult = raycastArray.first else { return }
let position = SCNVector3(raycastResult.worldTransform.columns.3.x,
raycastResult.worldTransform.columns.3.y,
raycastResult.worldTransform.columns.3.z)
if let _ = trackingNode {
trackingNode!.worldPosition = position
} else {
trackingNode = createNode()
trackingNode!.worldPosition = position
self.sceneView.scene.rootNode.addChildNode(trackingNode!)
}
}
}
private func recognisedObject(_ identifier: String) -> Bool {
return identifier == "remote" || identifier == "mouse"
}
private func createNode() -> SCNNode {
let sphereNode = SCNNode(geometry: SCNSphere(radius: 0.01))
sphereNode.geometry?.firstMaterial?.diffuse.contents = UIColor.purple
return sphereNode
}
private func loadSession() {
let configuration = ARWorldTrackingConfiguration()
configuration.planeDetection = []
sceneView.session.run(configuration)
}
override func viewDidLoad() {
super.viewDidLoad()
sceneView.delegate = self
viewportSize = sceneView.frame.size
}
override func viewWillAppear(_ animated: Bool) {
super.viewWillAppear(animated)
loadSession()
}
override func viewWillDisappear(_ animated: Bool) {
super.viewWillDisappear(animated)
sceneView.session.pause()
}
To be honest, the technologies you're using here cannot do that out of the box. YOLO (and any other object detection model you might swap in for it) has no built-in concept of tracking the same object across a video. It looks for objects in a 2D bitmap and returns 2D bounding boxes for them. As either the camera or the object moves and you pass in the next capturedImage buffer, it will give you a new bounding box in the correct position, but it has no way of knowing whether it's the same instance of the object detected in a previous frame.
To make this work, you'll need to do some post-processing of those Vision results to determine whether it's the same object, and if so, manually move the anchor/mesh to match the new position. If you're confident there should only be one object in view at any given time, then it's pretty straightforward. If there will be multiple objects, you're venturing into complex (but still achievable) territory.
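For the multi-object case, that post-processing can be as simple as matching each new raycast hit to the nearest node you're already tracking. A rough sketch (the trackedNodes array and the 0.15 m threshold are arbitrary choices of mine, not tested against your model):

var trackedNodes = [SCNNode]()

func handleDetection(at position: SCNVector3) {
    // Distance between two world positions.
    func dist(_ a: SCNVector3, _ b: SCNVector3) -> Float {
        simd_length(SIMD3<Float>(a.x - b.x, a.y - b.y, a.z - b.z))
    }
    if let nearest = trackedNodes.min(by: { dist($0.worldPosition, position) < dist($1.worldPosition, position) }),
       dist(nearest.worldPosition, position) < 0.15 {
        // Close enough to an existing node: treat it as the same object and move it.
        nearest.worldPosition = position
    } else {
        // Otherwise treat it as a newly detected object.
        let node = createNode()
        node.worldPosition = position
        sceneView.scene.rootNode.addChildNode(node)
        trackedNodes.append(node)
    }
}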
You could try to incorporate Vision tracking, which might work, though it will depend on the nature and behavior of the tracked object.
Also, sceneView.hitTest() is deprecated; you should probably port that over to ARSession.raycast().
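If you do go the Vision tracking route, the rough shape is to seed a VNTrackObjectRequest from the detection you already have and update it with a VNSequenceRequestHandler on subsequent frames; only the bounding box is tracked, so you still raycast its midpoint as in processDetections above. A minimal, untested sketch (property names are mine):

private let sequenceHandler = VNSequenceRequestHandler()
private var trackingRequest: VNTrackObjectRequest?

func startTracking(_ observation: VNRecognizedObjectObservation) {
    // Seed the tracker with the observation returned by the object detector.
    trackingRequest = VNTrackObjectRequest(detectedObjectObservation: observation)
}

func updateTracking(on pixelBuffer: CVPixelBuffer) {
    guard let request = trackingRequest else { return }
    try? sequenceHandler.perform([request], on: pixelBuffer, orientation: .leftMirrored)
    if let updated = request.results?.first as? VNDetectedObjectObservation,
       updated.confidence > 0.3 {
        // Feed the result back in so the next frame continues from here,
        // then convert updated.boundingBox and raycast its midpoint as before.
        request.inputObservation = updated
    }
}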

Unwanted "smoothing" in AVDepthData on iPhone 13 (not evident in iPhone 12)

We are writing an app which analyzes real-world 3D data using the TrueDepth camera on the front of an iPhone and an AVCaptureSession configured to produce AVDepthData along with image data. This worked great on the iPhone 12, but the same code on the iPhone 13 produces an unwanted "smoothing" effect which makes the scene impossible to process and breaks our app. We are unable to find any information on this effect, from Apple or otherwise, much less how to avoid it, so we are asking you experts.
At the bottom of this post (Figure 3) is our code which configures the capture session, using an AVCaptureDataOutputSynchronizer, to produce frames of 640x480 image and depth data. I boiled it down as much as possible, sorry it's so long. The two main parts are the configure function, which sets up our capture session, and the dataOutputSynchronizer function, near the bottom, which fires when a synced set of data is available. In the latter function I've included my code which extracts the information from the AVDepthData object, including looping through all 640x480 depth data points (in meters). I've excluded further processing for brevity (believe it or not :)).
On an iPhone 12 device, the PNG data and the depth data merge nicely. The front view and side view of the merged point cloud are below (Figure 1). The angles visible in the side view are due to the application of the focal length, which "de-perspectives" the data and places the points in their proper positions in xyz space.
The same code on an iPhone 13 produces depth maps that result in the point cloud further below (Figure 2 -- straight-on view, angled view, and side view). There is no longer any clear distinction between objects and the background because the depth data appears to be "smoothed" between the mannequin and the background -- i.e., there are seven or eight points between the subject and the background that are not realistic and make it impossible to do any meaningful processing, such as segmenting the scene.
Has anyone else encountered this issue, or have any insight into how we might change our code to avoid it? Any help or ideas are MUCH appreciated, since this is a definite showstopper (we can't tell people to only run our App on older phones :)). Thank you!
Figure 1 -- Merged depth data and image into point cloud, from iPhone 12
Figure 2 -- Merged depth data and image into point cloud, from iPhone 13; unwanted smoothing effect visible
Figure 3 -- Our configuration code and capture handler; edited to remove downstream processing of captured data (which was basically formatting it into an XML file and uploading to the cloud)
import Foundation
import Combine
import AVFoundation
import Photos
import UIKit
import FirebaseStorage
public struct AlertError {
public var title: String = ""
public var message: String = ""
public var primaryButtonTitle = "Accept"
public var secondaryButtonTitle: String?
public var primaryAction: (() -> ())?
public var secondaryAction: (() -> ())?
public init(title: String = "", message: String = "", primaryButtonTitle: String = "Accept", secondaryButtonTitle: String? = nil, primaryAction: (() -> ())? = nil, secondaryAction: (() -> ())? = nil) {
self.title = title
self.message = message
self.primaryAction = primaryAction
self.primaryButtonTitle = primaryButtonTitle
self.secondaryAction = secondaryAction
}
}
///////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////
//
//
// this is the CameraService class, which configures and runs a capture session
// which acquires syncronized image and depth data
// using an AVCaptureDataOutputSynchronizer
//
//
///////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////
public class CameraService: NSObject,
AVCaptureVideoDataOutputSampleBufferDelegate,
AVCaptureDepthDataOutputDelegate,
AVCaptureDataOutputSynchronizerDelegate,
MyFirebaseProtocol,
ObservableObject{
@Published public var shouldShowAlertView = false
@Published public var shouldShowSpinner = false
public var labelStatus: String = "Ready"
var images: [UIImage?] = []
public var alertError: AlertError = AlertError()
public let session = AVCaptureSession()
var isSessionRunning = false
var isConfigured = false
var setupResult: SessionSetupResult = .success
private let sessionQueue = DispatchQueue(label: "session queue") // Communicate with the session and other session objects on this queue.
@objc dynamic var videoDeviceInput: AVCaptureDeviceInput!
private let videoDeviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInTrueDepthCamera], mediaType: .video, position: .front)
var videoCaptureDevice : AVCaptureDevice? = nil
let videoDataOutput: AVCaptureVideoDataOutput = AVCaptureVideoDataOutput() // Define frame output.
let depthDataOutput = AVCaptureDepthDataOutput()
var outputSynchronizer: AVCaptureDataOutputSynchronizer? = nil
let dataOutputQueue = DispatchQueue(label: "video data queue", qos: .userInitiated, attributes: [], autoreleaseFrequency: .workItem)
var scanStateCounter: Int = 0
var m_DepthDatasetsToUpload = [AVCaptureSynchronizedDepthData]()
var m_FrameBufferToUpload = [AVCaptureSynchronizedSampleBufferData]()
var firebaseDepthDatasetsArray: [String] = []
@Published var firebaseImageUploadCount = 0
@Published var firebaseTextFileUploadCount = 0
public func configure() {
/*
Setup the capture session.
In general, it's not safe to mutate an AVCaptureSession or any of its
inputs, outputs, or connections from multiple threads at the same time.
Don't perform these tasks on the main queue because
AVCaptureSession.startRunning() is a blocking call, which can
take a long time. Dispatch session setup to the sessionQueue, so
that the main queue isn't blocked, which keeps the UI responsive.
*/
sessionQueue.async {
self.configureSession()
}
}
// MARK: Checks for user's permisions
public func checkForPermissions() {
switch AVCaptureDevice.authorizationStatus(for: .video) {
case .authorized:
// The user has previously granted access to the camera.
break
case .notDetermined:
/*
The user has not yet been presented with the option to grant
video access. Suspend the session queue to delay session
setup until the access request has completed.
*/
sessionQueue.suspend()
AVCaptureDevice.requestAccess(for: .video, completionHandler: { granted in
if !granted {
self.setupResult = .notAuthorized
}
self.sessionQueue.resume()
})
default:
// The user has previously denied access.
setupResult = .notAuthorized
DispatchQueue.main.async {
self.alertError = AlertError(title: "Camera Access", message: "SwiftCamera doesn't have access to use your camera, please update your privacy settings.", primaryButtonTitle: "Settings", secondaryButtonTitle: nil, primaryAction: {
UIApplication.shared.open(URL(string: UIApplication.openSettingsURLString)!,
options: [:], completionHandler: nil)
}, secondaryAction: nil)
self.shouldShowAlertView = true
}
}
}
// MARK: Session Management
// Call this on the session queue.
/// - Tag: ConfigureSession
private func configureSession() {
if setupResult != .success {
return
}
session.beginConfiguration()
session.sessionPreset = AVCaptureSession.Preset.vga640x480
// Add video input.
do {
var defaultVideoDevice: AVCaptureDevice?
let frontCameraDevice = AVCaptureDevice.default(.builtInTrueDepthCamera, for: .video, position: .front)
// Use the front TrueDepth camera as the video device.
defaultVideoDevice = frontCameraDevice
videoCaptureDevice = defaultVideoDevice
guard let videoDevice = defaultVideoDevice else {
print("Default video device is unavailable.")
setupResult = .configurationFailed
session.commitConfiguration()
return
}
let videoDeviceInput = try AVCaptureDeviceInput(device: videoDevice)
if session.canAddInput(videoDeviceInput) {
session.addInput(videoDeviceInput)
self.videoDeviceInput = videoDeviceInput
} else if session.inputs.isEmpty == false {
self.videoDeviceInput = videoDeviceInput
} else {
print("Couldn't add video device input to the session.")
setupResult = .configurationFailed
session.commitConfiguration()
return
}
} catch {
print("Couldn't create video device input: \(error)")
setupResult = .configurationFailed
session.commitConfiguration()
return
}
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
// MARK: add video output to session
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
videoDataOutput.videoSettings = [(kCVPixelBufferPixelFormatTypeKey as NSString) : NSNumber(value: kCVPixelFormatType_32BGRA)] as [String : Any]
videoDataOutput.alwaysDiscardsLateVideoFrames = true
videoDataOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "camera_frame_processing_queue"))
if session.canAddOutput(self.videoDataOutput) {
session.addOutput(self.videoDataOutput)
} else if session.outputs.contains(videoDataOutput) {
} else {
print("Couldn't create video device output")
setupResult = .configurationFailed
session.commitConfiguration()
return
}
guard let connection = self.videoDataOutput.connection(with: AVMediaType.video),
connection.isVideoOrientationSupported else { return }
connection.videoOrientation = .portrait
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
// MARK: add depth output to session
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Add a depth data output
if session.canAddOutput(depthDataOutput) {
session.addOutput(depthDataOutput)
depthDataOutput.isFilteringEnabled = false
//depthDataOutput.setDelegate(T##delegate: AVCaptureDepthDataOutputDelegate?##AVCaptureDepthDataOutputDelegate?, callbackQueue: <#T##DispatchQueue?#>)
depthDataOutput.setDelegate(self, callbackQueue: DispatchQueue(label: "depth_frame_processing_queue"))
if let connection = depthDataOutput.connection(with: .depthData) {
connection.isEnabled = true
} else {
print("No AVCaptureConnection")
}
} else if session.outputs.contains(depthDataOutput){
} else {
print("Could not add depth data output to the session")
session.commitConfiguration()
return
}
// Search for highest resolution with half-point depth values
let depthFormats = videoCaptureDevice!.activeFormat.supportedDepthDataFormats
let filtered = depthFormats.filter({
CMFormatDescriptionGetMediaSubType($0.formatDescription) == kCVPixelFormatType_DepthFloat16
})
let selectedFormat = filtered.max(by: {
first, second in CMVideoFormatDescriptionGetDimensions(first.formatDescription).width < CMVideoFormatDescriptionGetDimensions(second.formatDescription).width
})
do {
try videoCaptureDevice!.lockForConfiguration()
videoCaptureDevice!.activeDepthDataFormat = selectedFormat
videoCaptureDevice!.unlockForConfiguration()
} catch {
print("Could not lock device for configuration: \(error)")
session.commitConfiguration()
return
}
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
// Use an AVCaptureDataOutputSynchronizer to synchronize the video data and depth data outputs.
// The first output in the dataOutputs array, in this case the AVCaptureVideoDataOutput, is the "master" output.
//////////////////////////////////////////////////////////////////////////////////////////////////////////////
outputSynchronizer = AVCaptureDataOutputSynchronizer(dataOutputs: [videoDataOutput, depthDataOutput])
outputSynchronizer!.setDelegate(self, queue: dataOutputQueue)
session.commitConfiguration()
self.isConfigured = true
//self.start()
}
// MARK: Device Configuration
/// - Tag: Stop capture session
public func stop(completion: (() -> ())? = nil) {
sessionQueue.async {
//print("entered stop")
if self.isSessionRunning {
//print(self.setupResult)
if self.setupResult == .success {
//print("entered success")
DispatchQueue.main.async{
self.session.stopRunning()
self.isSessionRunning = self.session.isRunning
if !self.session.isRunning {
DispatchQueue.main.async {
completion?()
}
}
}
}
}
}
}
/// - Tag: Start capture session
public func start() {
// We use our capture session queue to ensure our UI runs smoothly on the main thread.
sessionQueue.async {
if !self.isSessionRunning && self.isConfigured {
switch self.setupResult {
case .success:
self.session.startRunning()
self.isSessionRunning = self.session.isRunning
if self.session.isRunning {
}
case .configurationFailed, .notAuthorized:
print("Application not authorized to use camera")
DispatchQueue.main.async {
self.alertError = AlertError(title: "Camera Error", message: "Camera configuration failed. Either your device camera is not available or its missing permissions", primaryButtonTitle: "Accept", secondaryButtonTitle: nil, primaryAction: nil, secondaryAction: nil)
self.shouldShowAlertView = true
}
}
}
}
}
// ------------------------------------------------------------------------
// MARK: CAPTURE HANDLERS
// ------------------------------------------------------------------------
public func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer, didOutput synchronizedDataCollection: AVCaptureSynchronizedDataCollection) {
//printWithTime("Capture")
guard let syncedDepthData: AVCaptureSynchronizedDepthData =
synchronizedDataCollection.synchronizedData(for: depthDataOutput) as? AVCaptureSynchronizedDepthData else {
return
}
guard let syncedVideoData: AVCaptureSynchronizedSampleBufferData =
synchronizedDataCollection.synchronizedData(for: videoDataOutput) as? AVCaptureSynchronizedSampleBufferData else {
return
}
///////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////
//
//
// Below is the code that extracts the information from depth data
// The depth data is 640x480, which matches the size of the synchronized image
// I save this info to a file, upload it to the cloud, and merge it with the image
// on a PC to create a pointcloud
//
//
///////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////
let depth_data : AVDepthData = syncedDepthData.depthData
let cvpixelbuffer : CVPixelBuffer = depth_data.depthDataMap
let height : Int = CVPixelBufferGetHeight(cvpixelbuffer)
let width : Int = CVPixelBufferGetWidth(cvpixelbuffer)
let quality : AVDepthData.Quality = depth_data.depthDataQuality
let accuracy : AVDepthData.Accuracy = depth_data.depthDataAccuracy
let pixelsize : Float = depth_data.cameraCalibrationData!.pixelSize
let camcaldata : AVCameraCalibrationData = depth_data.cameraCalibrationData!
let intmat : matrix_float3x3 = camcaldata.intrinsicMatrix
let cal_lensdistort_x : CGFloat = camcaldata.lensDistortionCenter.x
let cal_lensdistort_y : CGFloat = camcaldata.lensDistortionCenter.y
let cal_matrix_width : CGFloat = camcaldata.intrinsicMatrixReferenceDimensions.width
let cal_matrix_height : CGFloat = camcaldata.intrinsicMatrixReferenceDimensions.height
let intrinsics_fx : Float = camcaldata.intrinsicMatrix.columns.0.x
let intrinsics_fy : Float = camcaldata.intrinsicMatrix.columns.1.y
let intrinsics_ox : Float = camcaldata.intrinsicMatrix.columns.2.x
let intrinsics_oy : Float = camcaldata.intrinsicMatrix.columns.2.y
let pixelformattype : OSType = CVPixelBufferGetPixelFormatType(cvpixelbuffer)
CVPixelBufferLockBaseAddress(cvpixelbuffer, CVPixelBufferLockFlags(rawValue: 0))
let int16Buffer = unsafeBitCast(CVPixelBufferGetBaseAddress(cvpixelbuffer), to: UnsafeMutablePointer<Float16>.self)
let int16PerRow = CVPixelBufferGetBytesPerRow(cvpixelbuffer) / 2
for x in 0...height-1
{
for y in 0...width-1
{
let luma = int16Buffer[x * int16PerRow + y]
/////////////////////////
// SAVE DEPTH VALUE 'luma' to FILE FOR PROCESSING
}
}
CVPixelBufferUnlockBaseAddress(cvpixelbuffer, CVPixelBufferLockFlags(rawValue: 0))
}
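One extra check that may help when comparing the two devices: AVDepthData reports whether the map has been filtered, and it can be converted to a fixed depth format before the pixel loop so the raw values are directly comparable. A small sketch of what could be logged at the top of dataOutputSynchronizer (the logging and the Float32 conversion are my additions, not from the post above):

let depthData = syncedDepthData.depthData
print("filtered: \(depthData.isDepthDataFiltered), quality: \(depthData.depthDataQuality), accuracy: \(depthData.depthDataAccuracy)")

// Convert to 32-bit float depth so the read-out doesn't depend on the native format.
let converted = depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
let buffer = converted.depthDataMap
CVPixelBufferLockBaseAddress(buffer, .readOnly)
let floatsPerRow = CVPixelBufferGetBytesPerRow(buffer) / MemoryLayout<Float32>.stride
let base = unsafeBitCast(CVPixelBufferGetBaseAddress(buffer), to: UnsafeMutablePointer<Float32>.self)
let centerDepth = base[(CVPixelBufferGetHeight(buffer) / 2) * floatsPerRow + CVPixelBufferGetWidth(buffer) / 2]
print("depth at center: \(centerDepth) m")
CVPixelBufferUnlockBaseAddress(buffer, .readOnly)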

How to project shadows on meshed environment in RealityKit?

I'm using RealityKit in combination with a USDZ model and scene understanding (LiDAR iPhone), and simply want my model to project some nice shadows onto the real-world environment. I can easily get a custom spotlight on my model, but fail to project any shadows from it.
I'm stuck with what appears to be the default top-down lighting.
What am I doing wrong?
My code:
override func viewDidLoad() {
super.viewDidLoad()
arView.session.delegate = self
arView.environment.sceneUnderstanding.options = []
arView.environment.sceneUnderstanding.options.insert(.receivesLighting)
arView.automaticallyConfigureSession = false
let configuration = ARWorldTrackingConfiguration()
configuration.planeDetection = [.horizontal, .vertical]
configuration.sceneReconstruction = .mesh
configuration.environmentTexturing = .automatic
arView.session.run(configuration) }
Here's the function I call when adding a model
@IBAction func addMyModel(_ sender: Any) {
do {
let mymodel = try ModelEntity.load(named: "mymodel")
// Lights
let spotLight = CustomSpotLight()
let anchor = AnchorEntity(plane: .horizontal, minimumBounds: [0.30, 1.00])
anchor.children.append(mymodel)
anchor.addChild(spotLight)
self.arView.scene.anchors.append(anchor)
} catch {
fatalError("Failed to load asset.")
}
}
and finally here's my custom spotlight code:
import ARKit
import RealityKit
class CustomSpotLight: Entity, HasSpotLight {
required init() {
super.init()
self.light = SpotLightComponent(color: .blue,
intensity: 500000,
innerAngleInDegrees: 45,
outerAngleInDegrees: 169,
attenuationRadius: 9.0)
self.shadow = SpotLightComponent.Shadow()
self.position.y = 5.0
self.orientation = simd_quatf(angle: -.pi/1.5,
axis: [1,0,0])
}
}

Slow frame rate when rendering a CIFilter-ed CIImage in an MTKView while using face detection (Vision and CIDetector)

I have an app which does real-time filtering on the camera feed. I get each frame from the camera, do some filtering using CIFilter, and then pass the final frame (CIImage) to an MTKView to be shown in my SwiftUI view. It works fine, but when I add real-time face/body detection on the camera feed, the frame rate drops to 8 frames per second and it gets super laggy.
I tried everything I could find on the internet, using Vision, CIDetector, and CoreML, and the result is always the same. I do run the detection on a global queue, which keeps the UI responsive, but the feed I'm showing in the main view is still laggy, while things like scroll views work fine.
So I tried to change the view from MTKView to UIImageView. Xcode shows it rendering at 120 FPS (which I don't understand; it's 30 FPS when not using any face detection), but the feed is still laggy and somehow cannot keep up with the output frame rate. I'm new to this and I don't understand why it is like that.
I also tried just passing the incoming image to the MTKView (without any filtering in between, with face detection), with the same laggy result; without face detection it goes to 30 FPS (why not 120?).
This is the code I'm using for converting the sampleBuffer to a CIImage:
extension CICameraCapture: AVCaptureVideoDataOutputSampleBufferDelegate {
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
var ciImage = CIImage(cvImageBuffer: imageBuffer)
if self.cameraPosition == AVCaptureDevice.Position.front {
ciImage = ciImage.oriented(.downMirrored)
}
ciImage = ciImage.transformed(by: CGAffineTransform(rotationAngle: 3 * .pi / 2))
ciImage = ciImage.transformToOrigin(withSize: ciImage.extent.size)
detectFace(image: ciImage) // real-time face detection; I have tried Vision
// and also CIDetector - CIDetector is a little faster when set to low accuracy,
// but still not the desired result (frame rate)
DispatchQueue.main.async {
self.callback(ciImage)
}
}
}
And this is the MTKView code, which is a very simple and basic implementation:
import MetalKit
import CoreImage
class MetalRenderView: MTKView {
//var textureCache: CVMetalTextureCache?
override init(frame frameRect: CGRect, device: MTLDevice?) {
super.init(frame: frameRect, device: device)
if super.device == nil {
fatalError("No support for Metal. Sorry")
}
framebufferOnly = false
preferredFramesPerSecond = 120
sampleCount = 2
}
required init(coder: NSCoder) {
fatalError("init(coder:) has not been implemented")
}
private lazy var commandQueue: MTLCommandQueue? = {
[unowned self] in
return self.device!.makeCommandQueue()
}()
private lazy var ciContext: CIContext = {
[unowned self] in
return CIContext(mtlDevice: self.device!)
}()
var image: CIImage? {
didSet {
renderImage()
}
}
private func renderImage() {
guard var image = image else { return }
image = image.transformToOrigin(withSize: drawableSize) // this is an extension to resize
//the image to the render size so i dont get the render error while rendering a frame
let commandBuffer = commandQueue?.makeCommandBuffer()
let destination = CIRenderDestination(width: Int(drawableSize.width),
height: Int(drawableSize.height),
pixelFormat: .bgra8Unorm,
commandBuffer: commandBuffer) { () -> MTLTexture in
return self.currentDrawable!.texture
}
try! ciContext.startTask(toRender: image, to: destination)
commandBuffer?.present(currentDrawable!)
commandBuffer?.commit()
draw()
}
}
and here is the code for face detection using CIDetector:
func detectFace (image: CIImage){
//DispatchQueue.global().async {
let options = [CIDetectorAccuracy: CIDetectorAccuracyHigh,
CIDetectorSmile: true, CIDetectorTypeFace: true] as [String : Any]
let faceDetector = CIDetector(ofType: CIDetectorTypeFace, context: nil,
options: options)!
let faces = faceDetector.features(in: image)
if let face = faces.first as? CIFaceFeature {
AppState.shared.mouth = face.mouthPosition
AppState.shared.leftEye = face.leftEyePosition
AppState.shared.rightEye = face.rightEyePosition
}
//}
}
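One thing that stands out in the detectFace above, independent of the frame-rate question: the CIDetector is created on every frame, and Apple's documentation recommends reusing a single instance because creating one is expensive. A sketch of hoisting it into a stored property (the property name is mine; whether this alone recovers the frame rate is a separate matter):

// Create the detector once and reuse it on every frame.
private lazy var faceDetector: CIDetector? = CIDetector(
    ofType: CIDetectorTypeFace,
    context: nil,
    options: [CIDetectorAccuracy: CIDetectorAccuracyLow,
              CIDetectorTracking: true] as [String: Any])

func detectFace(image: CIImage) {
    guard let face = faceDetector?.features(in: image).first as? CIFaceFeature else { return }
    AppState.shared.mouth = face.mouthPosition
    AppState.shared.leftEye = face.leftEyePosition
    AppState.shared.rightEye = face.rightEyePosition
}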
What I have tried:
1) Different face detection methods, using Vision, CIDetector, and also CoreML (this one not very deeply, as I don't have experience with it).
I would get the detection info, but the frame rate is 8 or, in the best case, 15 (which would be a delayed detection).
2) I've read somewhere that it might be a result of the image colorspace, so I have tried different video settings and different rendering colorspaces; still no change in the frame rate.
3) I'm fairly sure it might be related to the pixel buffer release time, so I deep-copied the imageBuffer and passed it to the detection; besides some memory issues, it went up to 15 FPS, but still not a minimum of 30 FPS. Here I also tried converting the imageBuffer to a CIImage, then rendering the CIImage to a CGImage and back to a CIImage just to release the buffer, but I could not get more than 15 FPS (on average; it sometimes goes to 17 or 19, but it's still laggy).
I'm new at this and still trying to figure it out. I would appreciate any suggestions, samples, or tips that could point me toward a better way of solving this.
Update
This is the camera capture setup code:
class CICameraCapture: NSObject {
typealias Callback = (CIImage?) -> ()
private var cameraPosition = AVCaptureDevice.Position.front
var ciContext: CIContext?
let callback: Callback
private let session = AVCaptureSession()
private let sampleBufferQueue = DispatchQueue(label: "buffer", qos: .userInitiated)//, attributes: [], autoreleaseFrequency: .workItem)
// face detection
//private var sequenceHandler = VNSequenceRequestHandler()
//var request: VNCoreMLRequest!
//var visionModel: VNCoreMLModel!
//let detectionQ = DispatchQueue(label: "detectionQ", qos: .background)//, attributes: [], autoreleaseFrequency: .workItem)
init(callback: @escaping Callback) {
self.callback = callback
super.init()
prepareSession()
ciContext = CIContext(mtlDevice: MTLCreateSystemDefaultDevice()!)
}
func start() {
session.startRunning()
}
func stop() {
session.stopRunning()
}
private func prepareSession() {
session.sessionPreset = .high //.hd1920x1080
let cameraDiscovery = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInDualCamera, .builtInWideAngleCamera], mediaType: .video, position: cameraPosition)
guard let camera = cameraDiscovery.devices.first else { fatalError("Can't get hold of the camera") }
//try! camera.lockForConfiguration()
//camera.activeVideoMinFrameDuration = camera.formats[0].videoSupportedFrameRateRanges[0].minFrameDuration
//camera.activeVideoMaxFrameDuration = camera.formats[0].videoSupportedFrameRateRanges[0].maxFrameDuration
//camera.unlockForConfiguration()
guard let input = try? AVCaptureDeviceInput(device: camera) else { fatalError("Can't get hold of the camera") }
session.addInput(input)
let output = AVCaptureVideoDataOutput()
output.videoSettings = [:]
//print(output.videoSettings.description)
//[875704438, 875704422, 1111970369]
//output.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String : Int(kCVPixelFormatType_32BGRA)]
output.setSampleBufferDelegate(self, queue: sampleBufferQueue)
session.addOutput(output)
session.commitConfiguration()
}
}

How do I use Mapbox's new MGLOfflinePackDelegate correctly?

I'm creating an app which needs an offline map. I'm testing with Mapbox, which added support for offline maps today (yay!). The code I have now seems to work for downloading the map, but the delegate that should report progress never triggers, and I don't have a clue why.
I have this class for my mapView:
import UIKit
import Mapbox
class MapController: UIViewController, MGLMapViewDelegate, UIPopoverPresentationControllerDelegate {
@IBOutlet var mapView: MGLMapView!
override func viewDidLoad() {
super.viewDidLoad()
downloadIfNeeded()
mapView.maximumZoomLevel = 18
}
override func didReceiveMemoryWarning() {
super.didReceiveMemoryWarning()
}
func downloadIfNeeded() {
MGLOfflineStorage.sharedOfflineStorage().getPacksWithCompletionHandler { (packs, error) in guard error == nil else {
return
}
for pack in packs {
let userInfo = NSKeyedUnarchiver.unarchiveObjectWithData(pack.context) as! [String: String]
if userInfo["name"] == "London" {
// already downloaded
return
}
}
// define the download region
let sw = CLLocationCoordinate2DMake(51.212120, 4.415906)
let ne = CLLocationCoordinate2DMake(51.223781, 4.442401)
let bounds = MGLCoordinateBounds(sw: sw, ne: ne)
let region = MGLTilePyramidOfflineRegion(styleURL: MGLStyle.streetsStyleURL(), bounds: bounds, fromZoomLevel: 10, toZoomLevel: 12)
let userInfo = ["name": "London"]
let context = NSKeyedArchiver.archivedDataWithRootObject(userInfo)
MGLOfflineStorage.sharedOfflineStorage().addPackForRegion(region, withContext: context) { (pack, error) in
guard error == nil else {
return
}
// create popup window with delegate
let storyboard : UIStoryboard = UIStoryboard(name: "Main", bundle: nil)
let downloadProgress: MapDownloadController = storyboard.instantiateViewControllerWithIdentifier("MapDownloadController") as! MapDownloadController
downloadProgress.modalPresentationStyle = .Popover
downloadProgress.preferredContentSize = CGSizeMake(300, 150)
let popoverMapDownloadController = downloadProgress.popoverPresentationController
popoverMapDownloadController?.permittedArrowDirections = .Any
popoverMapDownloadController?.delegate = self
popoverMapDownloadController?.sourceView = self.mapView
popoverMapDownloadController?.sourceRect = CGRect(x: self.mapView.frame.midX, y: self.mapView.frame.midY, width: 1, height: 1)
self.presentViewController(downloadProgress, animated: true, completion: nil)
// set popup as delegate <----
pack!.delegate = downloadProgress
// start downloading
pack!.resume()
}
}
}
}
And the MapDownloadController is a View which is displayed as popup (see code above) and has the MGLOfflinePackDelegate:
import UIKit
import Mapbox
class MapDownloadController: UIViewController, MGLOfflinePackDelegate {
@IBOutlet var progress: UIProgressView!
@IBOutlet var progressText: UILabel!
override func viewDidLoad() {
super.viewDidLoad()
}
override func didReceiveMemoryWarning() {
super.didReceiveMemoryWarning()
}
func offlinePack(pack: MGLOfflinePack, progressDidChange progress: MGLOfflinePackProgress) {
// this function is never called, but why? <----
let completed = progress.countOfResourcesCompleted
let expected = progress.countOfResourcesExpected
let bytes = progress.countOfBytesCompleted
let MB = bytes / 1024
let str: String = "\(completed)/\(expected) voltooid (\(MB)MB)"
progressText.text = str
self.progress.setProgress(Float(completed) / Float(expected), animated: true)
}
func offlinePack(pack: MGLOfflinePack, didReceiveError error: NSError) {
// neither is this one... <----
let userInfo = NSKeyedUnarchiver.unarchiveObjectWithData(pack.context) as! [String: String]
let strError = error.localizedFailureReason
}
func offlinePack(pack: MGLOfflinePack, didReceiveMaximumAllowedMapboxTiles maximumCount: UInt64) {
// .. or this one <----
let userInfo = NSKeyedUnarchiver.unarchiveObjectWithData(pack.context) as! [String: String]
}
}
This is all pretty much taken from the documentation, so why are the delegate's functions (func offlinePack) never called? I tested with breakpoints, so I am sure they are not. Still, the popup is shown and the region gets downloaded. (Checked by observing network traffic and with other code which lists the offline packs.)
Here’s an extremely simple implementation of Minh’s answer, using the current v3.2.0b1 example code. Expect this answer to become outdated quickly, as we’re still working on the v3.2.0 release.
import UIKit
import Mapbox
class ViewController: UIViewController, UIPopoverPresentationControllerDelegate, MGLOfflinePackDelegate {
@IBOutlet var mapView: MGLMapView!
// Array of offline packs for the delegate work around (and your UI, potentially)
var offlinePacks = [MGLOfflinePack]()
override func viewDidLoad() {
super.viewDidLoad()
mapView.maximumZoomLevel = 2
downloadOffline()
}
func downloadOffline() {
// Create a region that includes the current viewport and any tiles needed to view it when zoomed further in.
let region = MGLTilePyramidOfflineRegion(styleURL: mapView.styleURL, bounds: mapView.visibleCoordinateBounds, fromZoomLevel: mapView.zoomLevel, toZoomLevel: mapView.maximumZoomLevel)
// Store some data for identification purposes alongside the downloaded resources.
let userInfo = ["name": "My Offline Pack"]
let context = NSKeyedArchiver.archivedDataWithRootObject(userInfo)
// Create and register an offline pack with the shared offline storage object.
MGLOfflineStorage.sharedOfflineStorage().addPackForRegion(region, withContext: context) { (pack, error) in
guard error == nil else {
print("The pack couldn’t be created for some reason.")
return
}
// Set the pack’s delegate (assuming self conforms to the MGLOfflinePackDelegate protocol).
pack!.delegate = self
// Start downloading.
pack!.resume()
// Retain reference to pack to work around it being lost and not sending delegate messages
self.offlinePacks.append(pack!)
}
}
func offlinePack(pack: MGLOfflinePack, progressDidChange progress: MGLOfflinePackProgress) {
let userInfo = NSKeyedUnarchiver.unarchiveObjectWithData(pack.context) as! [String: String]
let completed = progress.countOfResourcesCompleted
let expected = progress.countOfResourcesExpected
print("Offline pack “\(userInfo["name"])” has downloaded \(completed) of \(expected) resources.")
}
func offlinePack(pack: MGLOfflinePack, didReceiveError error: NSError) {
let userInfo = NSKeyedUnarchiver.unarchiveObjectWithData(pack.context) as! [String: String]
print("Offline pack “\(userInfo["name"])” received error: \(error.localizedFailureReason)")
}
func offlinePack(pack: MGLOfflinePack, didReceiveMaximumAllowedMapboxTiles maximumCount: UInt64) {
let userInfo = NSKeyedUnarchiver.unarchiveObjectWithData(pack.context) as! [String: String]
print("Offline pack “\(userInfo["name"])” reached limit of \(maximumCount) tiles.")
}
}
(Cross-posted from this GitHub issue.)
This is a bug in the SDK. The workaround is for the completion handler to assign the passed-in MGLOfflinePack object to an ivar or other strong reference in the surrounding MapDownloadController class (example).
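Applied to the asker's code, that workaround is just a strong property on the surrounding controller plus one extra line in the addPackForRegion completion handler, along these lines:

// In MapController: keep a strong reference so the pack keeps messaging its delegate.
var offlinePacks = [MGLOfflinePack]()

// ...inside the addPackForRegion completion handler, before resuming:
pack!.delegate = downloadProgress
self.offlinePacks.append(pack!) // work around the dropped reference
pack!.resume()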
