I'm writing a MapKit-based app that draws an overlay on top of the map. Much of the overlay drawing is dynamic, so the tiles that get drawn change frequently. To handle that, I've implemented a custom MKTileOverlay and a custom MKTileOverlayRenderer: the first handles the URL scheme for where the tile images are stored, and the second handles the custom drawMapRect implementation.
The issue I'm running into is that the same tile image appears to be drawn in multiple locations. Here's a screenshot to help visualize what I mean (I know the tiles are upside down and mirrored; I can fix that):
iOS Simulator Screenshot
I've changed certain tile images so that they're a different color and show their tile path. You'll notice that many of the tile images are repeated over different areas.
I've been trying to figure out why, so I followed my code path. The starting point is pretty standard: the view controller calls addOverlay(), which triggers the delegate's mapView(rendererForOverlay:), which returns my custom MKTileOverlayRenderer subclass, which MapKit then asks to draw via drawMapRect(mapRect:zoomScale:inContext:). That method takes the given mapRect, calculates which tile it belongs to, calls the custom MKTileOverlay's loadTileAtPath(), and draws the resulting tile image data. As far as I can tell that's exactly what my code does, so I'm not sure where I'm going wrong. Everything works fine if I skip the custom drawing and use a default MKTileOverlayRenderer, but the custom drawing is the crux of the app, so that's not a viable solution.
For reference, here's the relevant code from my custom classes:
My custom MKTileOverlay class:
class ExploredTileOverlay: MKTileOverlay {
    var base_path: String
    //var tile_path: String?
    let cache: NSCache = NSCache()
    var point_buffer: ExploredSegment
    var last_tile_path: MKTileOverlayPath?
    var tile_buffer: ExploredTiles
    init(URLTemplate: String?, startingLocation location: CLLocation, city: City) {
        let paths = NSSearchPathForDirectoriesInDomains(NSSearchPathDirectory.DocumentDirectory, NSSearchPathDomainMask.UserDomainMask, true)
        let documentsDirectory: AnyObject = paths[0]
        self.base_path = documentsDirectory.stringByAppendingPathComponent("/" + city.name + "_tiles")
        if (!NSFileManager.defaultManager().fileExistsAtPath(base_path)) {
            try! NSFileManager.defaultManager().createDirectoryAtPath(base_path, withIntermediateDirectories: false, attributes: nil)
        }
        let new_point = MKMapPointForCoordinate(location.coordinate)
        self.point_buffer = ExploredSegment(fromPoint: new_point, inCity: city)
        self.tile_buffer = ExploredTiles(startingPoint: ExploredPoint(mapPoint: new_point, r: 50))
        self.last_tile_path = Array(tile_buffer.edited_tiles.values).last!.path
        super.init(URLTemplate: URLTemplate)
    }
    override func URLForTilePath(path: MKTileOverlayPath) -> NSURL {
        let filled_template = String(format: "%d_%d_%d.png", path.z, path.x, path.y)
        let tile_path = base_path + "/" + filled_template
        //print("fetching tile " + filled_template)
        if !NSFileManager.defaultManager().fileExistsAtPath(tile_path) {
            return NSURL(fileURLWithPath: "")
        }
        return NSURL(fileURLWithPath: tile_path)
    }
    override func loadTileAtPath(path: MKTileOverlayPath, result: (NSData?, NSError?) -> Void) {
        let url = URLForTilePath(path)
        let filled_template = String(format: "%d_%d_%d.png", path.z, path.x, path.y)
        let tile_path = base_path + "/" + filled_template
        if (url != NSURL(fileURLWithPath: tile_path)) {
            print("creating tile at " + String(path))
            let img_data: NSData = UIImagePNGRepresentation(UIImage(named: "small")!)!
            let filled_template = String(format: "%d_%d_%d.png", path.z, path.x, path.y)
            let tile_path = base_path + "/" + filled_template
            img_data.writeToFile(tile_path, atomically: true)
            cache.setObject(img_data, forKey: url)
            result(img_data, nil)
        } else if let cachedData = cache.objectForKey(url) as? NSData {
            print("using cache for " + String(path))
            result(cachedData, nil)
        } else {
            print("loading " + String(path) + " from directory")
            let img_data: NSData = UIImagePNGRepresentation(UIImage(contentsOfFile: tile_path)!)!
            cache.setObject(img_data, forKey: url)
            result(img_data, nil)
        }
    }
}
My custom MKTileOverlayRenderer class:
class ExploredTileRenderer: MKTileOverlayRenderer {
let tile_overlay: ExploredTileOverlay
var zoom_scale: MKZoomScale?
let cache: NSCache = NSCache()
override init(overlay: MKOverlay) {
self.tile_overlay = overlay as! ExploredTileOverlay
super.init(overlay: overlay)
NSNotificationCenter.defaultCenter().addObserver(self, selector: #selector(saveEditedTiles), name: "com.Coder.Wander.reachedMaxPoints", object: nil)
// There's some weird cache-ing thing that requires me to recall it
// whenever I re-draw over the tile, I don't really get it but it works
override func canDrawMapRect(mapRect: MKMapRect, zoomScale: MKZoomScale) -> Bool {
self.setNeedsDisplayInMapRect(mapRect, zoomScale: zoomScale)
return true
override func drawMapRect(mapRect: MKMapRect, zoomScale: MKZoomScale, inContext context: CGContext) {
zoom_scale = zoomScale
let tile_path = self.tilePathForMapRect(mapRect, andZoomScale: zoomScale)
let tile_path_string = stringForTilePath(tile_path)
//print("redrawing tile: " + tile_path_string)
self.tile_overlay.loadTileAtPath(tile_path, result: {
data, error in
if error == nil && data != nil {
if let image = UIImage(data: data!) {
let draw_rect = self.rectForMapRect(mapRect)
CGContextDrawImage(context, draw_rect, image.CGImage)
var path: [(CGMutablePath, CGFloat)]? = nil
self.tile_overlay.point_buffer.readPointsWithBlockAndWait({ points in
let total = self.getPathForPoints(points, zoomScale: zoomScale, offset: MKMapPointMake(0.0, 0.0))
path = total.0
//print("number of points: " + String(path!.count))
if ((path != nil) && (path!.count > 0)) {
//print("drawing path")
for segment in path! {
CGContextAddPath(context, segment.0)
CGContextSetBlendMode(context, .Clear)
CGContextSetLineJoin(context, CGLineJoin.Round)
CGContextSetLineCap(context, CGLineCap.Round)
CGContextSetLineWidth(context, segment.1)
And my helper functions that handle converting between zoomScale, zoomLevel, tile path, and tile coordinates:
func tilePathForMapRect(mapRect: MKMapRect, andZoomScale zoom: MKZoomScale) -> MKTileOverlayPath {
    let zoom_level = self.zoomLevelForZoomScale(zoom)
    let mercatorPoint = self.mercatorTileOriginForMapRect(mapRect)
    //print("mercPt: " + String(mercatorPoint))
    let tilex = Int(floor(Double(mercatorPoint.x) * self.worldTileWidthForZoomLevel(zoom_level)))
    let tiley = Int(floor(Double(mercatorPoint.y) * self.worldTileWidthForZoomLevel(zoom_level)))
    return MKTileOverlayPath(x: tilex, y: tiley, z: zoom_level, contentScaleFactor: UIScreen.mainScreen().scale)
}
func stringForTilePath(path: MKTileOverlayPath) -> String {
    return String(format: "%d_%d_%d", path.z, path.x, path.y)
}
func zoomLevelForZoomScale(zoomScale: MKZoomScale) -> Int {
    let real_scale = zoomScale / UIScreen.mainScreen().scale
    var z = Int((log2(Double(real_scale))+20.0))
    z += (Int(UIScreen.mainScreen().scale) - 1)
    return z
}
func worldTileWidthForZoomLevel(zoomLevel: Int) -> Double {
    return pow(2, Double(zoomLevel))
}
func mercatorTileOriginForMapRect(mapRect: MKMapRect) -> CGPoint {
    let map_region: MKCoordinateRegion = MKCoordinateRegionForMapRect(mapRect)
    var x : Double = map_region.center.longitude * (M_PI/180.0)
    var y : Double = map_region.center.latitude * (M_PI/180.0)
    y = log10(tan(y) + 1.0/cos(y))
    x = (1.0 + (x/M_PI)) / 2.0
    y = (1.0 - (y/M_PI)) / 2.0
    return CGPointMake(CGFloat(x), CGFloat(y))
}
This seems like a pretty obscure error, so I haven't had much luck finding other people facing similar issues. Any help would be appreciated!
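One detail in the helpers that may be worth checking: mercatorTileOriginForMapRect uses log10, whereas the standard Web Mercator projection is defined with the natural logarithm. The sketch below is not the asker's code; it is a minimal, Foundation-only illustration of the usual latitude/longitude to tile-index conversion. The function name tileIndices and the latitude clamp are assumptions made for the example, and whether this is the cause of the repeated tiles is not confirmed by the question.

```swift
import Foundation

/// Hypothetical helper (not from the original post): converts a coordinate
/// to Web Mercator tile indices at the given zoom level.
func tileIndices(latitude: Double, longitude: Double, zoomLevel: Int) -> (x: Int, y: Int) {
    let tilesPerSide = pow(2.0, Double(zoomLevel))

    // Web Mercator is only defined up to about ±85.0511 degrees of latitude.
    let clampedLatitude = min(max(latitude, -85.0511), 85.0511)
    let latitudeRadians = clampedLatitude * Double.pi / 180.0

    // Normalized x in [0, 1]: 0 at longitude -180, 1 at longitude +180.
    let normalizedX = (longitude + 180.0) / 360.0

    // Natural log (log), not log10, in the Mercator y formula.
    let mercatorY = log(tan(latitudeRadians) + 1.0 / cos(latitudeRadians))
    let normalizedY = (1.0 - mercatorY / Double.pi) / 2.0

    // Clamp so that coordinates on the far edge still map to a valid tile.
    let maxIndex = Int(tilesPerSide) - 1
    let x = min(max(Int(floor(normalizedX * tilesPerSide)), 0), maxIndex)
    let y = min(max(Int(floor(normalizedY * tilesPerSide)), 0), maxIndex)
    return (x, y)
}

let centerTile = tileIndices(latitude: 0, longitude: 0, zoomLevel: 1)
print(centerTile.x, centerTile.y)
```

Swapping log for log10 compresses the y value by a factor of about 2.3, which shifts the computed tile row away from the row MapKit is actually asking for.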


ARKit: Tracking a Vision/Core ML detected object

I'm new to iOS, and I'm currently refactoring code from a tutorial on Vision/Core ML and ARKit that adds a node to a detected object.
Currently, if I move the object, the node does not move with it. Apple's sample code for Recognizing Objects in Live Capture uses layers and repositions them each time Vision detects the object at a new position, which is what I was hoping to replicate with an ARObject.
Is there a way I can achieve this with ARKit?
Any help would be greatly appreciated.
EDIT: Working code with solution
@IBOutlet var sceneView: ARSCNView!
private var viewportSize: CGSize!
private var previousAnchor: ARAnchor?
private var trackingNode: SCNNode!
lazy var objectDetectionRequest: VNCoreMLRequest = {
do {
let model = try VNCoreMLModel(for: yolov5s(configuration: MLModelConfiguration()).model)
let request = VNCoreMLRequest(model: model) { [weak self] request, error in
self?.processDetections(for: request, error: error)
return request
} catch {
fatalError("Failed to load Vision ML model.")
func renderer(_ renderer: SCNSceneRenderer, willRenderScene scene: SCNScene, atTime time: TimeInterval) {
guard let capturedImage = sceneView.session.currentFrame?.capturedImage
else { return }
let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: capturedImage, orientation: .leftMirrored, options: [:])
do {
try imageRequestHandler.perform([objectDetectionRequest])
} catch {
print("Failed to perform image request.")
func processDetections(for request: VNRequest, error: Error?) {
guard error == nil else {
print("Object detection error: \(error!.localizedDescription)")
guard let results = request.results else { return }
for observation in results where observation is VNRecognizedObjectObservation {
let objectObservation = observation as! VNRecognizedObjectObservation
let topLabelObservation = objectObservation.labels.first
print(topLabelObservation!.identifier + " " + "\(Int(topLabelObservation!.confidence * 100))%")
guard recognisedObject(topLabelObservation!.identifier) && topLabelObservation!.confidence > 0.9
else { continue }
let rect = VNImageRectForNormalizedRect(
let midPoint = CGPoint(x: rect.midX, y: rect.midY)
let raycastQuery = self.sceneView.raycastQuery(from: midPoint,
allowing: .estimatedPlane,
alignment: .any)
let raycastArray = self.sceneView.session.raycast(raycastQuery!)
guard let raycastResult = raycastArray.first else { return }
let position = SCNVector3(raycastResult.worldTransform.columns.3.x,
if let _ = trackingNode {
trackingNode!.worldPosition = position
} else {
trackingNode = createNode()
trackingNode!.worldPosition = position
private func recognisedObject(_ identifier: String) -> Bool {
return identifier == "remote" || identifier == "mouse"
private func createNode() -> SCNNode {
let sphereNode = SCNNode(geometry: SCNSphere(radius: 0.01))
sphereNode.geometry?.firstMaterial?.diffuse.contents = UIColor.purple
return sphereNode
private func loadSession() {
let configuration = ARWorldTrackingConfiguration()
configuration.planeDetection = []
override func viewDidLoad() {
sceneView.delegate = self
viewportSize = sceneView.frame.size
override func viewWillAppear(_ animated: Bool) {
override func viewWillDisappear(_ animated: Bool) {
To be honest, the technologies you're using here can't do that out of the box. YOLO (or any other object detection model you swap in for it) has no built-in concept of tracking the same object across video frames. It looks for objects in a 2D bitmap and returns 2D bounding boxes for them. As either the camera or the object moves and you pass in the next capturedImage buffer, it will give you a new bounding box in the correct position, but it has no way of knowing whether that is the same instance of the object detected in a previous frame.
To make this work, you'll need to post-process those Vision results to determine whether or not it's the same object and, if so, manually move the anchor/mesh to match the new position. If you're confident there will only be one object in view at any given time, that's pretty straightforward. If there will be multiple objects, you're venturing into complex (but still achievable) territory.
You could try incorporating Vision's tracking requests, which might work, though that would depend on the nature and behavior of the tracked object.
Also, sceneView.hitTest() is deprecated. You should probably port that over to use ARSession.raycast().
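As a rough illustration of the post-processing step described above, one common heuristic is to compare each new bounding box with the previous one using intersection over union (IoU) and treat the best-overlapping detection as the same object. This is only a sketch under stated assumptions: the Box type, the function names, and the 0.3 threshold are hypothetical and not part of Vision or ARKit. In a real app the boxes would come from VNRecognizedObjectObservation.boundingBox.

```swift
import Foundation

/// Hypothetical axis-aligned box in normalized image coordinates.
struct Box {
    var x: Double
    var y: Double
    var width: Double
    var height: Double
}

/// Intersection over union: 1.0 for identical boxes, 0.0 for disjoint ones.
func intersectionOverUnion(_ a: Box, _ b: Box) -> Double {
    let overlapWidth = max(0.0, min(a.x + a.width, b.x + b.width) - max(a.x, b.x))
    let overlapHeight = max(0.0, min(a.y + a.height, b.y + b.height) - max(a.y, b.y))
    let intersection = overlapWidth * overlapHeight
    let union = a.width * a.height + b.width * b.height - intersection
    return union > 0 ? intersection / union : 0
}

/// Index of the detection that best overlaps the previously tracked box,
/// or nil if nothing overlaps by at least `threshold` (an assumed value).
func bestMatch(for previous: Box, in detections: [Box], threshold: Double = 0.3) -> Int? {
    var bestIndex: Int? = nil
    var bestScore = 0.0
    for (index, candidate) in detections.enumerated() {
        let score = intersectionOverUnion(previous, candidate)
        if score >= threshold && score > bestScore {
            bestIndex = index
            bestScore = score
        }
    }
    return bestIndex
}

let lastFrameBox = Box(x: 0.40, y: 0.40, width: 0.20, height: 0.20)
let thisFrameBoxes = [
    Box(x: 0.05, y: 0.05, width: 0.10, height: 0.10),  // unrelated detection
    Box(x: 0.42, y: 0.41, width: 0.20, height: 0.20)   // same object, slightly moved
]
if let match = bestMatch(for: lastFrameBox, in: thisFrameBoxes) {
    print("Move the existing node to detection \(match)")
}
```

When a match is found, the existing node would be repositioned from the matched box's midpoint; when none is found, a new node could be created or the old one hidden. With several objects in view, a simple best-match like this can swap identities, which is where the more complex territory mentioned above begins.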

Why is my programmatic screenshot capturing out of date view?

I have a map view that lets users draw perimeter lines, and I need to capture a screenshot when they are done, for record-keeping purposes. For some reason, my code captures the view state from before the new overlay is added, even though I add the overlay before attempting the screenshot.
I use a separate view to capture the gesture, then convert the points to an overlay and add it to the map here, where points are the points gathered from the gesture tracking:
func convertFragments() {
var coordinates: [CLLocationCoordinate2D] = []
for point in points {
let coordinate = mapView.convert(point, toCoordinateFrom: drawingView)
let polyline = MKPolyline(coordinates: coordinates, count: coordinates.count)
polyline.title = selectedTool.name
removeLines(for: selectedTool)
points = []
incidentManager.update(for: selectedTool, value: polyline)
Then incidentManager.update(for:value:) makes a network call to save the information and calls a second method, log(eventType:), to capture the screenshot and update a log:
func update(for tool: MapTool, value: Any) {
func saveLine(_ line: MKPolyline, for key: String) {
key: IncidentManager.convertCLPoints(line.coordinates)
], merge: true)
switch tool {
case .hotZone, .innerPerimeter, .outerPerimeter:
let line = value as! MKPolyline
saveLine(line, for: tool.rawValue)
case .commandPost, .stagingArea:
let point = value as! CLLocationCoordinate2D
let geoPoint = GeoPoint(latitude: point.latitude, longitude: point.longitude)
tool.rawValue: geoPoint
], merge: true)
case .poi:
let annotation = value as! PerimeterMapAnnotation
let point = annotation.coordinate
let geoPoint = GeoPoint(latitude: point.latitude, longitude: point.longitude)
"pointsOfInterest": [annotation.title: geoPoint]
], merge: true)
updateAddress(defaultValue: value)
let eventType = tool.eventType(didSet: true)
log(eventType: eventType)
I have an extension on UIView that captures the view, but for some reason it doesn't capture the most up-to-date version of the map view.
private func log(eventType: LogEventType) {
guard let mapImage = mapView.asImage() else { return }
let storageRef = Storage.storage().reference()
let imageRef = storageRef.child("\(incident.id)/\(UUID().uuidString).jpg")
let eventTime = Date()
let eventTimeString = eventTime.apiDateString
"logEventType": eventType.rawValue,
"imageReference": imageRef.fullPath,
"date": Timestamp(date: eventTime)
upload(image: mapImage, at: imageRef)
if shouldUpdateCover {
shouldUpdateCover = false
"coverPhotoRef": imageRef.fullPath
], merge: true)
extension UIView {
    func asImage() -> UIImage? {
        UIGraphicsBeginImageContextWithOptions(self.bounds.size, self.isOpaque, 0.0)
        defer { UIGraphicsEndImageContext() }
        if let context = UIGraphicsGetCurrentContext() {
            self.layer.render(in: context)
            let image = UIGraphicsGetImageFromCurrentImageContext()
            return image
        }
        return nil
    }
}
I have tried many variations of this "screenshot" method that I found online, with no luck. When I debug, I can inspect the map and its overlays and see that they are updated beforehand.
Any ideas why the captured image does not include the changes?

How to draw CLLocationCoordinate2Ds on MKMapSnapshotter (drawing on mapView printed image)

I have a mapView with an array of CLLocationCoordinate2D. I use these locations to draw lines on my mapView using MKPolyline. Now I want to store the result as a UIImage. I found the MKMapSnapshotter class, but unfortunately I can't draw overlays on it: "Snapshotter objects do not capture the visual representations of any overlays or annotations that your app creates." So I only get a blank map image. Is there any way to get an image with my overlays?
private func generateImageFromMap() {
let mapSnapshotterOptions = MKMapSnapshotter.Options()
guard let region = mapRegion() else { return }
mapSnapshotterOptions.region = region
mapSnapshotterOptions.size = CGSize(width: 200, height: 200)
mapSnapshotterOptions.showsBuildings = false
mapSnapshotterOptions.showsPointsOfInterest = false
let snapShotter = MKMapSnapshotter(options: mapSnapshotterOptions)
snapShotter.start() { snapshot, error in
guard let snapshot = snapshot else {
//do something with image ....
let mapImage = snapshot...
How can I put overlays on this image? Or is there another way to solve this problem?
Unfortunately, you have to draw them yourself. Fortunately, the snapshot object (MKMapSnapshotter.Snapshot) has a convenient point(for:) method to convert a CLLocationCoordinate2D into a CGPoint within the snapshot.
For example, assume you had an array of CLLocationCoordinate2D:
private var coordinates: [CLLocationCoordinate2D]?
private func generateImageFromMap() {
    guard let region = mapRegion() else { return }
    let options = MKMapSnapshotter.Options()
    options.region = region
    options.size = CGSize(width: 200, height: 200)
    options.showsBuildings = false
    options.showsPointsOfInterest = false
    MKMapSnapshotter(options: options).start() { snapshot, error in
        guard let snapshot = snapshot else { return }
        let mapImage = snapshot.image
        let finalImage = UIGraphicsImageRenderer(size: mapImage.size).image { _ in
            // draw the map image
            mapImage.draw(at: .zero)
            // only bother with the following if we have a path with two or more coordinates
            guard let coordinates = self.coordinates, coordinates.count > 1 else { return }
            // convert the `[CLLocationCoordinate2D]` into a `[CGPoint]`
            let points = coordinates.map { coordinate in
                snapshot.point(for: coordinate)
            }
            // build a bezier path using that `[CGPoint]`
            let path = UIBezierPath()
            path.move(to: points[0])
            for point in points.dropFirst() {
                path.addLine(to: point)
            }
            // stroke it (with the context's current stroke color)
            path.lineWidth = 1
            path.stroke()
        }
        // do something with finalImage
    }
}
So, given the following map view (with the coordinates rendered as an MKPolyline by mapView(_:rendererFor:), as usual):
The above code will create this finalImage:

Reduce memory consumption while loading gif images in UIImageView

I want to show a GIF in a UIImageView, and with the code below (source: https://iosdevcenters.blogspot.com/2016/08/load-gif-image-in-swift_22.html; I did not understand all of the code) I am able to display GIF images. However, the memory consumption seems high (tested on a real device). Is there any way to modify the code below to reduce the memory consumption?
@IBOutlet weak var imageView: UIImageView!
override func viewDidLoad() {
let url = "https://cdn-images-1.medium.com/max/800/1*oDqXedYUMyhWzN48pUjHyw.gif"
let gifImage = UIImage.gifImageWithURL(url)
imageView.image = gifImage
override func didReceiveMemoryWarning() {
// Dispose of any resources that can be recreated.
fileprivate func < <T : Comparable>(lhs: T?, rhs: T?) -> Bool {
switch (lhs, rhs) {
case let (l?, r?):
return l < r
case (nil, _?):
return true
return false
extension UIImage {
public class func gifImageWithData(_ data: Data) -> UIImage? {
guard let source = CGImageSourceCreateWithData(data as CFData, nil) else {
print("image doesn't exist")
return nil
return UIImage.animatedImageWithSource(source)
public class func gifImageWithURL(_ gifUrl:String) -> UIImage? {
guard let bundleURL:URL? = URL(string: gifUrl) else {
return nil
guard let imageData = try? Data(contentsOf: bundleURL!) else {
return nil
return gifImageWithData(imageData)
public class func gifImageWithName(_ name: String) -> UIImage? {
guard let bundleURL = Bundle.main
.url(forResource: name, withExtension: "gif") else {
return nil
guard let imageData = try? Data(contentsOf: bundleURL) else {
return nil
return gifImageWithData(imageData)
class func delayForImageAtIndex(_ index: Int, source: CGImageSource!) -> Double {
var delay = 0.1
let cfProperties = CGImageSourceCopyPropertiesAtIndex(source, index, nil)
let gifProperties: CFDictionary = unsafeBitCast(
to: CFDictionary.self)
var delayObject: AnyObject = unsafeBitCast(
to: AnyObject.self)
if delayObject.doubleValue == 0 {
delayObject = unsafeBitCast(CFDictionaryGetValue(gifProperties,
Unmanaged.passUnretained(kCGImagePropertyGIFDelayTime).toOpaque()), to: AnyObject.self)
delay = delayObject as! Double
if delay < 0.1 {
delay = 0.1
return delay
class func gcdForPair(_ a: Int?, _ b: Int?) -> Int {
var a = a
var b = b
if b == nil || a == nil {
if b != nil {
return b!
} else if a != nil {
return a!
} else {
return 0
if a < b {
let c = a
a = b
b = c
var rest: Int
while true {
rest = a! % b!
if rest == 0 {
return b!
} else {
a = b
b = rest
class func gcdForArray(_ array: Array<Int>) -> Int {
if array.isEmpty {
return 1
var gcd = array[0]
for val in array {
gcd = UIImage.gcdForPair(val, gcd)
return gcd
class func animatedImageWithSource(_ source: CGImageSource) -> UIImage? {
let count = CGImageSourceGetCount(source)
var images = [CGImage]()
var delays = [Int]()
for i in 0..<count {
if let image = CGImageSourceCreateImageAtIndex(source, i, nil) {
let delaySeconds = UIImage.delayForImageAtIndex(Int(i),
source: source)
delays.append(Int(delaySeconds * 1000.0)) // Seconds to ms
let duration: Int = {
var sum = 0
for val: Int in delays {
sum += val
return sum
let gcd = gcdForArray(delays)
var frames = [UIImage]()
var frame: UIImage
var frameCount: Int
for i in 0..<count {
frame = UIImage(cgImage: images[Int(i)])
frameCount = Int(delays[Int(i)] / gcd)
for _ in 0..<frameCount {
let animation = UIImage.animatedImage(with: frames,
duration: Double(duration) / 1000.0)
return animation
When I render the image as normal png image, the consumption is around 10MB.
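For readers puzzled by the gcd helpers in the code above, this is roughly what they are for: each frame's delay is converted to milliseconds, the greatest common divisor of all delays becomes the length of one animation slot, and each frame is repeated for as many slots as its delay requires. The sketch below is a simplified, Foundation-only restatement of that idea; the function names are hypothetical and it is not the code from the linked post.

```swift
import Foundation

/// Greatest common divisor of two non-negative integers (Euclid's algorithm).
func gcd(_ a: Int, _ b: Int) -> Int {
    var x = a
    var y = b
    while y != 0 {
        let remainder = x % y
        x = y
        y = remainder
    }
    return x
}

/// Number of equal-length slots each frame must occupy so that a constant
/// slot duration (the GCD of all delays) reproduces the original timing.
func repeatCounts(forDelaysInMilliseconds delays: [Int]) -> [Int] {
    let slot = delays.reduce(0, gcd)
    guard slot > 0 else { return delays.map { _ in 1 } }
    return delays.map { $0 / slot }
}

let counts = repeatCounts(forDelaysInMilliseconds: [100, 200, 50])
print(counts)  // [2, 4, 1]
```

With delays of 100, 200 and 50 ms, the slot length is 50 ms, so the frames array handed to the animated image ends up with 2 + 4 + 1 = 7 entries for 3 distinct frames.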
The GIF in question has a resolution of 480×288 and contains 10 frames.
Considering that UIImageView stores frames as 4-byte RGBA, this GIF occupies 4 × 10 × 480 × 288 = 5 529 600 bytes in RAM, which is more than 5 megabytes.
There are numerous ways to mitigate that, but only one of them puts no additional strain on the CPU; the others are mere CPU-to-RAM trade-offs.
The method I'm talking about is subclassing UIImageView and loading your GIFs by hand, preserving their internal representation (indexed image + palette). That would cut the memory usage fourfold.
N.B.: even though a GIF may store a full image for each frame (which is the case for the GIF in question), many do not. Instead, most frames contain only the pixels that have changed since the previous one. Thus, in general, the internal GIF representation only allows frames to be displayed in sequential order.
Other methods of saving RAM include e.g. re-reading every frame from disk prior to displaying it, which is certainly not good for battery life.
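The arithmetic above can be checked with a few lines of Swift. This is just a worked calculation, assuming 4 bytes per pixel for decoded RGBA frames and 1 byte per pixel for an 8-bit indexed frame (the palette itself, at most a few hundred bytes, is ignored); the function name is made up for the example.

```swift
import Foundation

/// Bytes needed to keep every frame of an animation decoded in memory.
func decodedSize(width: Int, height: Int, frames: Int, bytesPerPixel: Int) -> Int {
    return width * height * frames * bytesPerPixel
}

// The GIF in question: 480 x 288 pixels, 10 frames.
let rgbaBytes = decodedSize(width: 480, height: 288, frames: 10, bytesPerPixel: 4)
let indexedBytes = decodedSize(width: 480, height: 288, frames: 10, bytesPerPixel: 1)

print(rgbaBytes)                 // 5529600
print(indexedBytes)              // 1382400
print(rgbaBytes / indexedBytes)  // 4
```

The factor of four is simply the ratio between 4-byte RGBA pixels and 1-byte palette indices.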
To display GIFs with less memory consumption, try BBWebImage.
BBWebImage will decide how many image frames are decoded and cached depending on current memory usage. If free memory is not enough, only part of image frames are decoded and cached.
For Swift 4:
// BBAnimatedImageView (subclass UIImageView) displays animated image
imageView = BBAnimatedImageView(frame: frame)
// Load and display gif
imageView.bb_setImage(with: url,
placeholder: UIImage(named: "placeholder"))
{ (image: UIImage?, data: Data?, error: Error?, cacheType: BBImageCacheType) in
// Do something when finish loading

Toggle flash in ios swift

I am building an image classifier app. On the camera screen I have a switch that I want to use to toggle the flash, so that the user can turn it on in low light.
Here is my code:
import UIKit
import AVFoundation
import Vision
// controlling the pace of the machine vision analysis
var lastAnalysis: TimeInterval = 0
var pace: TimeInterval = 0.33 // in seconds, classification will not repeat faster than this value
// performance tracking
let trackPerformance = false // use "true" for performance logging
var frameCount = 0
let framesPerSample = 10
var startDate = NSDate.timeIntervalSinceReferenceDate
var flash=0
class ImageDetectionViewController: UIViewController {
var callBackImageDetection :(State)->Void = { state in
@IBOutlet weak var previewView: UIView!
@IBOutlet weak var stackView: UIStackView!
@IBOutlet weak var lowerView: UIView!
@IBAction func swithch(_ sender: UISwitch) {
if(sender.isOn == true)
let captureSession=AVCaptureSession()
let captureDevice: AVCaptureDevice?
setupCamera(flash: 1)
var previewLayer: AVCaptureVideoPreviewLayer!
let bubbleLayer = BubbleLayer(string: "")
let queue = DispatchQueue(label: "videoQueue")
var captureSession = AVCaptureSession()
var captureDevice: AVCaptureDevice?
let videoOutput = AVCaptureVideoDataOutput()
var unknownCounter = 0 // used to track how many unclassified images in a row
let confidence: Float = 0.8
// MARK: Load the Model
let targetImageSize = CGSize(width: 227, height: 227) // must match model data input
lazy var classificationRequest: [VNRequest] = {
do {
// Load the Custom Vision model.
// To add a new model, drag it to the Xcode project browser making sure that the "Target Membership" is checked.
// Then update the following line with the name of your new model.
// let model = try VNCoreMLModel(for: Fruit().model)
let model = try VNCoreMLModel(for: CodigocubeAI().model)
let classificationRequest = VNCoreMLRequest(model: model, completionHandler: self.handleClassification)
return [ classificationRequest ]
} catch {
fatalError("Can't load Vision ML model: \(error)")
// MARK: Handle image classification results
func handleClassification(request: VNRequest, error: Error?) {
guard let observations = request.results as? [VNClassificationObservation]
else { fatalError("unexpected result type from VNCoreMLRequest") }
guard let best = observations.first else {
fatalError("classification didn't return any results")
// Use results to update user interface (includes basic filtering)
print("\(best.identifier): \(best.confidence)")
if best.identifier.starts(with: "Unknown") || best.confidence < confidence {
if self.unknownCounter < 3 { // a bit of a low-pass filter to avoid flickering
self.unknownCounter += 1
} else {
self.unknownCounter = 0
DispatchQueue.main.async {
self.bubbleLayer.string = nil
} else {
self.unknownCounter = 0
DispatchQueue.main.async {[weak self] in
guard let strongSelf = self
// Trimming labels because they sometimes have unexpected line endings which show up in the GUI
let identifierString = best.identifier.trimmingCharacters(in: CharacterSet.whitespacesAndNewlines)
strongSelf.bubbleLayer.string = identifierString
let state : State = strongSelf.getState(identifierStr: identifierString)
strongSelf.navigationController?.popViewController(animated: true)
func getState(identifierStr:String)->State
var state :State = .none
if identifierStr == "entertainment"
state = .entertainment
else if identifierStr == "geography"
state = .geography
else if identifierStr == "history"
state = .history
else if identifierStr == "knowledge"
state = .education
else if identifierStr == "science"
state = .science
else if identifierStr == "sports"
state = .sports
state = .none
return state
// MARK: Lifecycle
override func viewDidLoad() {
previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
override func viewDidAppear(_ animated: Bool) {
self.edgesForExtendedLayout = UIRectEdge.init(rawValue: 0)
bubbleLayer.opacity = 0.0
bubbleLayer.position.x = self.view.frame.width / 2.0
bubbleLayer.position.y = lowerView.frame.height / 2
override func viewDidLayoutSubviews() {
previewLayer.frame = previewView.bounds;
// MARK: Camera handling
func setupCamera(flash :Int) {
let deviceDiscovery = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInWideAngleCamera], mediaType: .video, position: .back)
if let device = deviceDiscovery.devices.last {
if(flash == 1)
if (device.hasTorch) {
do {
try device.lockForConfiguration()
if (device.isTorchAvailable) {
do {
try device.setTorchModeOn(level:0.2 )
captureDevice = device
func beginSession() {
do {
videoOutput.videoSettings = [((kCVPixelBufferPixelFormatTypeKey as NSString) as String) : (NSNumber(value: kCVPixelFormatType_32BGRA) as! UInt32)]
videoOutput.alwaysDiscardsLateVideoFrames = true
videoOutput.setSampleBufferDelegate(self, queue: queue)
captureSession.sessionPreset = .hd1920x1080
let input = try AVCaptureDeviceInput(device: captureDevice!)
} catch {
print("error connecting to capture device")
func stopActiveSession()
if captureSession.isRunning == true
override func viewWillDisappear(_ animated: Bool) {
deinit {
print("deinit called")
// MARK: Video Data Delegate
extension ImageDetectionViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
// called for each frame of video
func captureOutput(_ captureOutput: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
let currentDate = NSDate.timeIntervalSinceReferenceDate
// control the pace of the machine vision to protect battery life
if currentDate - lastAnalysis >= pace {
lastAnalysis = currentDate
} else {
return // don't run the classifier more often than we need
// keep track of performance and log the frame rate
if trackPerformance {
frameCount = frameCount + 1
if frameCount % framesPerSample == 0 {
let diff = currentDate - startDate
if (diff > 0) {
if pace > 0.0 {
print("WARNING: Frame rate of image classification is being limited by \"pace\" setting. Set to 0.0 for fastest possible rate.")
print("\(String.localizedStringWithFormat("%0.2f", (diff/Double(framesPerSample))))s per frame (average)")
startDate = currentDate
// Crop and resize the image data.
// Note, this uses a Core Image pipeline that could be appended with other pre-processing.
// If we don't want to do anything custom, we can remove this step and let the Vision framework handle
// crop and resize as long as we are careful to pass the orientation properly.
guard let croppedBuffer = croppedSampleBuffer(sampleBuffer, targetSize: targetImageSize) else {
do {
let classifierRequestHandler = VNImageRequestHandler(cvPixelBuffer: croppedBuffer, options: [:])
try classifierRequestHandler.perform(classificationRequest)
} catch {
let context = CIContext()
var rotateTransform: CGAffineTransform?
var scaleTransform: CGAffineTransform?
var cropTransform: CGAffineTransform?
var resultBuffer: CVPixelBuffer?
func croppedSampleBuffer(_ sampleBuffer: CMSampleBuffer, targetSize: CGSize) -> CVPixelBuffer? {
guard let imageBuffer: CVImageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
fatalError("Can't convert to CVImageBuffer.")
// Only doing these calculations once for efficiency.
// If the incoming images could change orientation or size during a session, this would need to be reset when that happens.
if rotateTransform == nil {
let imageSize = CVImageBufferGetEncodedSize(imageBuffer)
let rotatedSize = CGSize(width: imageSize.height, height: imageSize.width)
guard targetSize.width < rotatedSize.width, targetSize.height < rotatedSize.height else {
fatalError("Captured image is smaller than image size for model.")
let shorterSize = (rotatedSize.width < rotatedSize.height) ? rotatedSize.width : rotatedSize.height
rotateTransform = CGAffineTransform(translationX: imageSize.width / 2.0, y: imageSize.height / 2.0).rotated(by: -CGFloat.pi / 2.0).translatedBy(x: -imageSize.height / 2.0, y: -imageSize.width / 2.0)
let scale = targetSize.width / shorterSize
scaleTransform = CGAffineTransform(scaleX: scale, y: scale)
// Crop input image to output size
let xDiff = rotatedSize.width * scale - targetSize.width
let yDiff = rotatedSize.height * scale - targetSize.height
cropTransform = CGAffineTransform(translationX: xDiff/2.0, y: yDiff/2.0)
// Convert to CIImage because it is easier to manipulate
let ciImage = CIImage(cvImageBuffer: imageBuffer)
let rotated = ciImage.transformed(by: rotateTransform!)
let scaled = rotated.transformed(by: scaleTransform!)
let cropped = scaled.transformed(by: cropTransform!)
// Note that the above pipeline could be easily appended with other image manipulations.
// For example, to change the image contrast. It would be most efficient to handle all of
// the image manipulation in a single Core Image pipeline because it can be hardware optimized.
// Only need to create this buffer one time and then we can reuse it for every frame
if resultBuffer == nil {
let result = CVPixelBufferCreate(kCFAllocatorDefault, Int(targetSize.width), Int(targetSize.height), kCVPixelFormatType_32BGRA, nil, &resultBuffer)
guard result == kCVReturnSuccess else {
fatalError("Can't allocate pixel buffer.")
// Render the Core Image pipeline to the buffer
context.render(cropped, to: resultBuffer!)
// For debugging
// let image = imageBufferToUIImage(resultBuffer!)
// print(image.size) // set breakpoint to see image being provided to CoreML
return resultBuffer
// Only used for debugging.
// Turns an image buffer into a UIImage that is easier to display in the UI or debugger.
func imageBufferToUIImage(_ imageBuffer: CVImageBuffer) -> UIImage {
CVPixelBufferLockBaseAddress(imageBuffer, CVPixelBufferLockFlags(rawValue: 0))
let baseAddress = CVPixelBufferGetBaseAddress(imageBuffer)
let bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer)
let width = CVPixelBufferGetWidth(imageBuffer)
let height = CVPixelBufferGetHeight(imageBuffer)
let colorSpace = CGColorSpaceCreateDeviceRGB()
let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.noneSkipFirst.rawValue | CGBitmapInfo.byteOrder32Little.rawValue)
let context = CGContext(data: baseAddress, width: width, height: height, bitsPerComponent: 8, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo.rawValue)
let quartzImage = context!.makeImage()
CVPixelBufferUnlockBaseAddress(imageBuffer, CVPixelBufferLockFlags(rawValue: 0))
let image = UIImage(cgImage: quartzImage!, scale: 1.0, orientation: .right)
return image
I am getting the error 'An AVCaptureOutput instance may not be added to more than one session'.
I want to let the user toggle the flash. How do I tear down the active camera session and open a new one with the flash on?
Can anyone help, or suggest another way to achieve this?
