GPUImage crop to CGRect and rotate - ios

Given a CGRect, I want to use GPUImage to crop a video. For example, if the rect is (0, 0, 50, 50), the video would be cropped at (0,0) with a length of 50 on each side.
What's throwing me is that GPUImageCropFilter doesn't take a rectangle, but rather a normalized crop region with values ranging from 0 to 1. My intuition was to do this:
let assetSize = CGSizeApplyAffineTransform(videoTrack.naturalSize, videoTrack.preferredTransform)
let cropRect = CGRect(x: frame.minX / assetSize.width,
                      y: frame.minY / assetSize.height,
                      width: frame.width / assetSize.width,
                      height: frame.height / assetSize.height)
to calculate the crop region based on the size of the incoming asset. Then:
// Filter
let cropFilter = GPUImageCropFilter(cropRegion: cropRect)
let url = NSURL(fileURLWithPath: "\(NSTemporaryDirectory())\(String.random()).mp4")
let movieWriter = GPUImageMovieWriter(movieURL: url, size: assetSize)
movieWriter.encodingLiveVideo = false
movieWriter.shouldPassthroughAudio = false
// add targets
movieFile.addTarget(cropFilter)
cropFilter.addTarget(movieWriter)
cropFilter.forceProcessingAtSize(frame.size)
cropFilter.setInputRotation(kGPUImageRotateRight, atIndex: 0)
What should the movie writer size be? Shouldn't it be the size of the frame I want to crop with? And should I be using forceProcessingAtSize with the size value of my crop frame?
A complete code example would be great; I've been trying for hours and I can't seem to get the section of the video that I want.
FINAL:
if let videoTrack = self.asset.tracks.first {
    let movieFile = GPUImageMovie(asset: self.asset)
    let transformedRegion = CGRectApplyAffineTransform(region, videoTrack.preferredTransform)

    // Filters
    let cropFilter = GPUImageCropFilter(cropRegion: transformedRegion)
    let url = NSURL(fileURLWithPath: "\(NSTemporaryDirectory())\(String.random()).mp4")
    let renderSize = CGSizeApplyAffineTransform(videoTrack.naturalSize, CGAffineTransformMakeScale(transformedRegion.width, transformedRegion.height))

    let movieWriter = GPUImageMovieWriter(movieURL: url, size: renderSize)
    movieWriter.transform = videoTrack.preferredTransform
    movieWriter.encodingLiveVideo = false
    movieWriter.shouldPassthroughAudio = false

    // add targets
    // http://stackoverflow.com/questions/37041231/gpuimage-crop-to-cgrect-and-rotate
    movieFile.addTarget(cropFilter)
    cropFilter.addTarget(movieWriter)

    movieWriter.completionBlock = {
        observer.sendNext(url)
        observer.sendCompleted()
    }
    movieWriter.failureBlock = { _ in
        observer.sendFailed(.VideoCropFailed)
    }

    disposable.addDisposable {
        cropFilter.removeTarget(movieWriter)
        movieWriter.finishRecording()
    }

    movieWriter.startRecording()
    movieFile.startProcessing()
}

As you note, the GPUImageCropFilter takes in a rectangle in normalized coordinates. You're on the right track, in that you just need to convert your CGRect in pixels to normalized coordinates by dividing the X components (origin.x and size.width) by the width of the image and the Y components by the height.
You don't need to use forceProcessingAtSize(), because the crop will automatically output an image of the appropriate cropped size. The movie writer's size should be matched to this cropped size, which you should know from your original CGRect.
The one complication you introduce is the rotation. If you need to apply a rotation in addition to your crop, you might want to check and make sure that you don't need to swap your X and Y for your crop region. This should be apparent in the output if the two need to be swapped.
There were some bugs with applying rotation at the same time as a crop a while ago, and I can't remember if I fixed all those. If I didn't, you could insert a dummy filter (gamma or brightness set to default values) before or after the crop and apply the rotation at that stage.
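A minimal sketch of that pipeline, assuming a pixel-space crop rectangle named cropFrame and an assetSize computed as in the question (both hypothetical names here), with movieFile and url reused from the question's code, and using the dummy-filter workaround described above for the rotation:
let cropRegion = CGRect(x: cropFrame.minX / assetSize.width,
                        y: cropFrame.minY / assetSize.height,
                        width: cropFrame.width / assetSize.width,
                        height: cropFrame.height / assetSize.height)
let cropFilter = GPUImageCropFilter(cropRegion: cropRegion)

// Identity gamma filter used only to carry the rotation, separate from the crop.
let rotationFilter = GPUImageGammaFilter()
rotationFilter.setInputRotation(kGPUImageRotateRight, atIndex: 0)

// The writer's size matches the cropped output; width and height swap because of the 90-degree rotation.
let movieWriter = GPUImageMovieWriter(movieURL: url,
                                      size: CGSize(width: cropFrame.height, height: cropFrame.width))

movieFile.addTarget(cropFilter)
cropFilter.addTarget(rotationFilter)
rotationFilter.addTarget(movieWriter)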

Related

How to convert VNRectangleObservation item to UIImage in SwiftUI

I was able to identify squares in an image using VNDetectRectanglesRequest. Now I want to store those rectangles as separate images (UIImage or CGImage). Below is what I tried.
let rectanglesDetection = VNDetectRectanglesRequest { request, error in
    rectangles = request.results as! [VNRectangleObservation]
    rectangles.sort { $0.boundingBox.origin.y > $1.boundingBox.origin.y }
    for rectangle in rectangles {
        let rect = rectangle.boundingBox
        let imageRef = cgImage.cropping(to: rect)
        let image = UIImage(cgImage: imageRef!, scale: image!.scale, orientation: image!.imageOrientation)
        checkBoxImages.append(image)
    }
}
Can anybody point out what's wrong or what should be the best approach?
Update 1
At this stage, I'm testing with an image that I added to the assets.
With this image, I get 7 rectangles as observations: one for each cell and one for the table margin.
My task is to identify the text inside each rectangle, and my approach is to send a VNRecognizeTextRequest for each rectangle that has been identified. My real scenario is a little more complicated than this, but I want to at least achieve this before going forward.
Update 2
for rectangle in rectangles {
    let trueX = rectangle.boundingBox.minX * image!.size.width
    let trueY = rectangle.boundingBox.minY * image!.size.height
    let width = rectangle.boundingBox.width * image!.size.width
    let height = rectangle.boundingBox.height * image!.size.height
    print("x = ", trueX, " y = ", trueY, " width = ", width, " height = ", height)
    let cropZone = CGRect(x: trueX, y: trueY, width: width, height: height)
    guard let cutImageRef: CGImage = image?.cgImage?.cropping(to: cropZone) else {
        return
    }
    let croppedImage: UIImage = UIImage(cgImage: cutImageRef)
    croppedImages.append(croppedImage)
}
My image's width and height are:
width = 406.0 height = 368.0
I've included my debug interface here so you can get a proper understanding.
As @Lasse mentioned, this is my actual issue, with screenshots.
This is just a guess since you didn't state what the actual problem is, but probably you're getting a zero-sized image for each VNRectangleObservation.
The reason is: Vision uses a normalized coordinate space from 0.0 to 1.0 with lower left origin.
So in order to get the correct rectangle of your original image, you need to convert the rect from normalized space to image space. Luckily there is VNImageRectForNormalizedRect(_:_:_:) to do just that.
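A minimal sketch of that conversion, assuming Vision and UIKit are imported and that cgImage is the CGImage the request was run against; note that CGImage.cropping(to:) expects a top-left origin, so the Y coordinate is flipped after the conversion:
let imageWidth = cgImage.width
let imageHeight = cgImage.height

for rectangle in rectangles {
    // Convert from Vision's normalized, lower-left-origin space to pixel coordinates.
    var rect = VNImageRectForNormalizedRect(rectangle.boundingBox, imageWidth, imageHeight)

    // CGImage cropping uses a top-left origin, so flip the Y axis.
    rect.origin.y = CGFloat(imageHeight) - rect.origin.y - rect.height

    if let croppedRef = cgImage.cropping(to: rect) {
        checkBoxImages.append(UIImage(cgImage: croppedRef))
    }
}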

Rotating CIImage by angle & Core Image coordinate system

I have some doubts about the Core Image coordinate system, the way transforms are applied, and how the extent is determined. I couldn't find much in the documentation or on the internet, so I tried the following code to rotate a CIImage and display it in a UIImageView.
override func viewDidLoad() {
    super.viewDidLoad()
    // Do any additional setup after loading the view.
    imageView.contentMode = .scaleAspectFit
    let uiImage = UIImage(contentsOfFile: imagePath)
    ciImage = CIImage(cgImage: (uiImage?.cgImage)!)
    imageView.image = uiImage
}

private var currentAngle = CGFloat(0)
private var ciImage: CIImage!
private var ciContext = CIContext()

@IBAction func rotateImage() {
    let extent = ciImage.extent
    let translate = CGAffineTransform(translationX: extent.midX, y: extent.midY)
    let uiImage = UIImage(contentsOfFile: imagePath)
    currentAngle = currentAngle + CGFloat.pi/10
    let rotate = CGAffineTransform(rotationAngle: currentAngle)
    let translateBack = CGAffineTransform(translationX: -extent.midX, y: -extent.midY)
    let transform = translateBack.concatenating(rotate.concatenating(translate))
    ciImage = CIImage(cgImage: (uiImage?.cgImage)!)
    ciImage = ciImage.transformed(by: transform)
    NSLog("Extent \(ciImage.extent), Angle \(currentAngle)")
    let cgImage = ciContext.createCGImage(ciImage, from: ciImage.extent)
    imageView.image = UIImage(cgImage: cgImage!)
}
So each push of the button rotates the image by a further pi/10. But I see the image shrinking in the UIImageView. The NSLogs show the extent growing for some rotations, with the origin x and y becoming negative.
2021-09-24 14:43:29.280393+0400 CoreImagePrototypes[65817:5175194] Metal API Validation Enabled
2021-09-24 14:43:31.094877+0400 CoreImagePrototypes[65817:5175194] Extent (-105.0, -105.0, 1010.0, 1010.0), Angle 0.3141592653589793
2021-09-24 14:43:41.426371+0400 CoreImagePrototypes[65817:5175194] Extent (-159.0, -159.0, 1118.0, 1118.0), Angle 0.6283185307179586
2021-09-24 14:43:42.244703+0400 CoreImagePrototypes[65817:5175194] Extent (-159.0, -159.0, 1118.0, 1118.0), Angle 0.9424777960769379
2021-09-24 14:43:42.644446+0400 CoreImagePrototypes[65817:5175194] Extent (-105.0, -105.0, 1010.0, 1010.0), Angle 1.2566370614359172
2021-09-24 14:43:43.037312+0400 CoreImagePrototypes[65817:5175194] Extent (0.0, 0.0, 800.0, 800.0), Angle 1.5707963267948966
2021-09-24 14:43:43.478774+0400 CoreImagePrototypes[65817:5175194] Extent (-105.0, -105.0, 1010.0, 1010.0), Angle 1.8849555921538759
2021-09-24 14:43:44.045811+0400 CoreImagePrototypes[65817:5175194] Extent (-159.0, -159.0, 1118.0, 1118.0), Angle 2.199114857512855
My questions:
How exactly do I determine the scale factor to rescale the image so that the extent does not cross the original image rectangle?
What exactly does a negative extent origin mean? Relative to what is it negative? I understand that the coordinate system in Core Image assumes the bottom-left corner of the image to be (0,0), not a point in some superview as in UIKit.
It's unclear what the question is, but what you seem to be focused on is the meaning of the extent. This is like the frame, and, just like the frame, it loses its meaning once you have applied a transform to the CIImage. After a rotation, the extent is based on the bounding box of the transformed image. So if you have a horizontally wider image and you rotate it a little bit counterclockwise, the extent becomes taller and wider and its origin goes negative.
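To the first question, a minimal sketch (my own, under the assumption that you simply want the rotated image scaled down so its bounding box fits back inside the original rectangle): compute a uniform scale from the two extents and apply it about the same center the rotation used.
// originalExtent: extent of the un-rotated image; ciImage has already been rotated as above.
let rotatedExtent = ciImage.extent
let scale = min(originalExtent.width / rotatedExtent.width,
                originalExtent.height / rotatedExtent.height)

// Scale about the original center so the result stays centered on the original rectangle.
let center = CGPoint(x: originalExtent.midX, y: originalExtent.midY)
let scaleTransform = CGAffineTransform(translationX: center.x, y: center.y)
    .scaledBy(x: scale, y: scale)
    .translatedBy(x: -center.x, y: -center.y)
ciImage = ciImage.transformed(by: scaleTransform)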

Incorrect frame of boundingBox with VNRecognizedObjectObservation

I'm having an issue with displaying a bounding box around a recognized object using Core ML & Vision.
The horizontal detection seems to be working correctly; vertically, however, the box is too tall, goes over the top edge of the video, doesn't go all the way to the bottom of the video, and doesn't follow the motion of the camera correctly. Here you can see the issue: https://imgur.com/Sppww8T
This is how video data output is initialized:
let videoDataOutput = AVCaptureVideoDataOutput()
videoDataOutput.alwaysDiscardsLateVideoFrames = true
videoDataOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)]
videoDataOutput.setSampleBufferDelegate(self, queue: dataOutputQueue!)
self.videoDataOutput = videoDataOutput
session.addOutput(videoDataOutput)
let c = videoDataOutput.connection(with: .video)
c?.videoOrientation = .portrait
I've also tried other video orientations, without much success.
Performing the vision request:
let handler = VNImageRequestHandler(cvPixelBuffer: image, options: [:])
try? handler.perform(vnRequests)
And finally, once the request is processed (viewRect is set to the size of the video view, 812x375; I know the video layer itself is a bit shorter, but that's not the issue here):
let observationRect = VNImageRectForNormalizedRect(observation.boundingBox, Int(viewRect.width), Int(viewRect.height))
I've also tried doing something like (with more issues):
var observationRect = observation.boundingBox
observationRect.origin.y = 1.0 - observationRect.origin.y
observationRect = videoPreviewLayer.layerRectConverted(fromMetadataOutputRect: observationRect)
I've tried to cut out as much of what I deemed to be irrelevant code as possible.
I've actually come across a similar issue using Apple's sample code, when the bounding box wouldn't vertically go around objects as expected: https://developer.apple.com/documentation/vision/recognizing_objects_in_live_capture Maybe that means that there is some issue with the API?
I use something like this:
let width = view.bounds.width
let height = width * 16 / 9
let offsetY = (view.bounds.height - height) / 2
let scale = CGAffineTransform.identity.scaledBy(x: width, y: height)
let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -height - offsetY)
let rect = prediction.boundingBox.applying(scale).applying(transform)
This assumes portrait orientation and a 16:9 aspect ratio. It also assumes .imageCropAndScaleOption = .scaleFill.
Credits: The transform code was taken from this repo: https://github.com/Willjay90/AppleFaceDetection
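As a small usage sketch (my own addition, assuming a hypothetical boundingBoxLayer CALayer that was added as a sublayer of the video view's layer), the converted rect can then be applied directly as a frame on the main thread:
DispatchQueue.main.async {
    // rect is already in the video view's UIKit coordinate space after the transform above.
    self.boundingBoxLayer.frame = rect
}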

Rotate my SceneKit material

I'm taking images with AVCapturePhotoOutput and then using their JPEG representation as the texture on a SceneKit SCNPlane that is the same aspect ratio as the image:
let image = UIImage(data: dataImage!)
let rectangle = SCNPlane(width:9, height:12)
let rectmaterial = SCNMaterial()
rectmaterial.diffuse.contents = image
rectmaterial.isDoubleSided = true
rectangle.materials = [rectmaterial]
let rectnode = SCNNode(geometry: rectangle)
let pos = sceneSpacePosition(inFrontOf: self.pictCamera, atDistance: 16.5) // 16.5 is arbitrary, but makes the rectangle the same size as the camera
rectnode.position = pos
rectnode.orientation = self.pictCamera.orientation
pictView.scene?.rootNode.addChildNode(rectnode)
sceneSpacePosition is a bit of code that can be found here on SO that maps CoreMotion into SceneKit orientation. It is used to place the rectangle, which does indeed appear at the right location with the right size. All very cool.
The problem is that the image is rotated 90 degrees to the rectangle. So I did the obvious:
rectmaterial.diffuse.contentsTransform = SCNMatrix4MakeRotation(Float.pi / 2, 0, 0, 1)
This does not work properly; the resulting image is unrecognizable. It appears that one small part of the image has been stretched to a huge size. I thought it might be the axis, but I tried all three with the same result.
Any ideas?
You are rotating on the upper left corner as suggested by Alain T.
If you move your image down, you may get the rotation you were expecting.
Try this:
let translation = SCNMatrix4MakeTranslation(0, -1, 0)
let rotation = SCNMatrix4MakeRotation(Float.pi / 2, 0, 0, 1)
let transform = SCNMatrix4Mult(translation, rotation)
rectmaterial.diffuse.contentsTransform = transform

Crop UIImage to square portion

I have a UIScrollView which contains a UIImageView. On top of that is a box; the user can move the image so that the portion under the box is cropped.
This screenshot explains it better:
So they can scroll the image around until the portion they want is inside that box.
I then want to be able to crop the scrollView/UIImage to exactly that size and store the cropped image.
It shouldn't be very hard, but I've spent ages trying screenshots, UIGraphicsContext, etc., and I can't seem to get anything to work.
Thanks for the help.
I finally figured out how to get it to work. Here is the code:
func croppedImage() -> UIImage {
    let cropSize = CGSize(width: 280, height: 280)
    let scale = (imageView.image?.size.height)! / imageView.frame.height
    let cropSizeScaled = CGSize(width: cropSize.width * scale, height: cropSize.height * scale)
    if #available(iOS 10.0, *) {
        let r = UIGraphicsImageRenderer(size: cropSizeScaled)
        let x = -scrollView.contentOffset.x * scale
        let y = -scrollView.contentOffset.y * scale
        return r.image { _ in
            imageView.image!.draw(at: CGPoint(x: x, y: y))
        }
    } else {
        return UIImage()
    }
}
So it first calculates the scale factor between the imageView and the actual image.
Then it creates a CGSize of that crop box as shown in the photo. However, the width and height must be scaled by the scale factor. (e.g. 280 * 6.5)
You must check that the device is running iOS 10.0 or later for UIGraphicsImageRenderer; if not, it won't work.
Initialise this with the crop box size.
The image must then be offset, and this is calculated by getting the scrollView's content offset, negating it, and multiplying by the scale factor.
Then return the image drawn at that point!
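If you also need to support devices earlier than iOS 10, a minimal sketch of the same drawing using the older UIGraphicsBeginImageContextWithOptions API could replace the empty-image fallback (this is my own addition, not part of the original answer):
// Pre-iOS 10 fallback: same drawing, using the older context-based API.
UIGraphicsBeginImageContextWithOptions(cropSizeScaled, false, imageView.image!.scale)
defer { UIGraphicsEndImageContext() }

let x = -scrollView.contentOffset.x * scale
let y = -scrollView.contentOffset.y * scale
imageView.image!.draw(at: CGPoint(x: x, y: y))

return UIGraphicsGetImageFromCurrentImageContext() ?? UIImage()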
