I'm working with a CIImage, and while I understand it's not a linear image, it does hold some data.
My question is whether or not a CIImage's extent property returns pixels or points? According to the documentation, which says very little, it's working space coordinates. Does this mean there's no way to get the pixels / points from a CIImage and I must convert to a UIImage to use the .size property to get the points?
I have a UIImage with a certain size, and when I create a CIImage using the UIImage, the extent is shown in points. But if I run a CIImage through a CIFilter that scales it, I sometimes get the extent returned in pixel values.
I'll answer the best I can.
If your source is a UIImage, its size will be the same as the extent. But note that this isn't a UIImageView (whose size is in points); we're just talking about the source image.
Running something through a CIFilter means you are manipulating things. If all you are doing is manipulating color, its size/extent shouldn't change (the same as creating your own CIColorKernel - it works pixel-by-pixel).
But, depending on the CIFilter, you may well be changing the size/extent. Certain filters create a mask, or tile. These may actually have an extent that is infinite! Others (blurs are a great example) sample surrounding pixels so their extent actually increases because they sample "pixels" beyond the source image's size. (Custom-wise these are a CIWarpKernel.)
Yes, quite a bit. Taking this to a bottom line:
What is the filter doing? Does it need to simply check a pixel's RGB and do something? Then the UIImage size should be the output CIImage extent.
Does the filter produce something that depends on the pixel's surrounding pixels? Then the output CIImage extent is slightly larger. How much may depend on the filter.
There are filters that produce something with no regard to an input. Most of these may have no true extent, as they can be infinite.
Points are what UIKit and CoreGraphics always work with. Pixels? At some level CoreImage does work with them, but that level is low enough that (unless you want to write your own kernel) you shouldn't care. Extents can usually (but keep in mind the above) be equated to a UIImage size.
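To make the three cases above concrete, here is a minimal sketch that prints the extent after a color-only filter, after a blur, and for a generated image. The 640x400 source rectangle, the sigma value, and the choice of CIColorInvert are arbitrary assumptions for illustration, not taken from the question.

```swift
import CoreImage

// A solid-color image has an infinite extent, so crop it to get a
// finite 640x400 "source image" (the size is an arbitrary assumption).
let generated = CIImage(color: CIColor(red: 1, green: 0, blue: 0))
let source = generated.cropped(to: CGRect(x: 0, y: 0, width: 640, height: 400))

// Case 1: a color-only filter works pixel-by-pixel, so the extent is unchanged.
let inverted = source.applyingFilter("CIColorInvert", parameters: [:])

// Case 2: a blur samples neighboring pixels, so the extent grows.
let blurred = source.applyingGaussianBlur(sigma: 10)

print("source:   ", source.extent)
print("inverted: ", inverted.extent)
print("blurred:  ", blurred.extent)
// Case 3: a generator has no true extent.
print("generated is infinite:", generated.extent.isInfinite)
```

If a blurred result needs to match the source size again, cropping it back with cropped(to: source.extent) is one common option.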
EDIT
Many images (particularly RAW ones) can have so large a size as to affect performance. I have an extension for UIImage that resizes an image to a specific rectangle to help maintain consistent CI performance.
extension UIImage {
    public func resizeToBoundingSquare(_ boundingSquareSideLength: CGFloat) -> UIImage {
        let imgScale = self.size.width > self.size.height ? boundingSquareSideLength / self.size.width : boundingSquareSideLength / self.size.height
        let newWidth = self.size.width * imgScale
        let newHeight = self.size.height * imgScale
        let newSize = CGSize(width: newWidth, height: newHeight)
        UIGraphicsBeginImageContext(newSize)
        self.draw(in: CGRect(x: 0, y: 0, width: newWidth, height: newHeight))
        let resizedImage = UIGraphicsGetImageFromCurrentImageContext()
        UIGraphicsEndImageContext()
        return resizedImage!
    }
}
Usage:
image = image.resizeToBoundingSquare(640)
In this example, an image size of 3200x2000 would be reduced to 640x400. Or an image size of 320x200 would be enlarged to 640x400. I do this to an image before rendering it and before creating a CIImage to use in a CIFilter.
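The arithmetic behind those two numbers can be checked without any drawing. The sketch below isolates the scale calculation from the extension; the function name boundingSquareSize is hypothetical and only used for illustration.

```swift
import Foundation

/// Returns the size an image would have after being scaled so that its
/// longer side equals `side`. Hypothetical helper mirroring the scale
/// logic of resizeToBoundingSquare.
func boundingSquareSize(for size: CGSize, side: CGFloat) -> CGSize {
    // Dividing by the longer side is equivalent to the ternary in the extension.
    let scale = side / max(size.width, size.height)
    return CGSize(width: size.width * scale, height: size.height * scale)
}

let reduced = boundingSquareSize(for: CGSize(width: 3200, height: 2000), side: 640)
let enlarged = boundingSquareSize(for: CGSize(width: 320, height: 200), side: 640)
print(reduced, enlarged)
```

Both calls produce a 640x400 result: the first scales by 0.2, the second by 2.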
I suggest you think of them as points. There is no scale and no screen (a CIImage is not something that is drawn), so there are no pixels.
A UIImage backed by a CGImage is the basis for drawing, and in addition to the CGImage it has a scale; together with the screen resolution, that gives us our translation from points to pixels.
I have a SceneKit view that fills my screen. My goal is to let the user take snapshots of that scene, but the snapshots are not the whole screen, but an inset portion in a UIImageView which is slightly smaller than the screen. Ideally, the user should not notice, the image on top should be identical to the scene behind it.
I have coded this up using snapshot and cropped, but as you can see in the image, the scale ends up way off - see the width of the yellow line, and the size of the windows? It's also not positioned correctly, it's somewhat down and to the left from where it should be - the upper left should be below the line of windows, but you can see it is at the roofline above them. I can't see the original snapshot because the debugger QuickLook refuses to show it.
There's not much code to it. Does anyone see the problem?
let background = sceneView.snapshot().cgImage!
let cropped = background.cropping(to: overlayView.frame)
UIGraphicsBeginImageContextWithOptions(overlayView.frame.size, false, 1.0)
let context = UIGraphicsGetCurrentContext()
context!.setAlpha(0.50)
context!.draw(cropped!, in: overlayView.bounds)
let transparent = context!.makeImage();
UIGraphicsEndImageContext()
overlayView.image = UIImage.init(cgImage: transparent!, scale: 1.0, orientation: .downMirrored)
I have tried various scales and rects to no avail. I assume this is something very easy.
UPDATE: after several tries I was able to get QuickLook to work. The snapshot is indeed the entire background, as I would expect. But it is also much larger than I would expect: it's 640 x 998, while the cropped version is 228 x 304. That explains the "zooming". This leads me to believe that the frame size of the inset view does NOT map directly to the image size. Does that ring any bells? Is there some other rect I should be using rather than overlayView.frame?
So I assume the problem is that the frame coordinates are in one set of units and the image coordinates are in another. I was able to solve the problem this way:
let croprect = CGRect(x: overlayView.frame.origin.x * 2, y: overlayView.frame.origin.y * 2 - 45, width: overlayView.frame.width * 2, height: overlayView.frame.height * 2)
let drawrect = CGRect(x: 0, y: 0, width: overlayView.frame.width * 2, height: overlayView.frame.height * 2)
let background = sceneView.snapshot()
let cropped = background.cgImage!.cropping(to: croprect)
UIGraphicsBeginImageContextWithOptions(drawrect.size, false, 0.0)
let context = UIGraphicsGetCurrentContext()
context!.setAlpha(0.50)
context!.draw(cropped!, in: drawrect)
let transparent = context!.makeImage();
UIGraphicsEndImageContext()
I'm extremely curious why I had to adjust the Y starting point to get them to line up, anyone have an idea?
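The factor of 2 above is consistent with a view frame measured in points and a CGImage measured in pixels on a 2x screen. A minimal sketch of that conversion, assuming the factor should come from the snapshot's scale rather than a hard-coded 2 (the function name pixelRect is hypothetical):

```swift
import Foundation

/// Converts a rect expressed in points into pixel coordinates.
/// Hypothetical helper; `scale` is assumed to be the UIImage's scale
/// (for example 2.0 on a 2x screen).
func pixelRect(forPoints rect: CGRect, scale: CGFloat) -> CGRect {
    return CGRect(x: rect.origin.x * scale,
                  y: rect.origin.y * scale,
                  width: rect.width * scale,
                  height: rect.height * scale)
}

// Example values only: a 114x152 point frame on a 2x screen.
let frameInPoints = CGRect(x: 20, y: 40, width: 114, height: 152)
let cropInPixels = pixelRect(forPoints: frameInPoints, scale: 2)
print(cropInPixels)
```

With UIKit available, the crop would then look like background.cgImage!.cropping(to: pixelRect(forPoints: overlayView.frame, scale: background.scale)). This does not explain the extra Y adjustment; one unverified guess is that overlayView.frame is expressed in a different coordinate space than sceneView, in which case converting it with overlayView.convert(overlayView.bounds, to: sceneView) before scaling might be worth trying.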
When cropping a CGImage in Swift 3 (using the .cropping method), the original CGImage is referenced by the cropped version, both according to the documentation and according to what the Allocations instrument shows me.
I am placing the cropped CGImage objects on an undo stack, so having the original versions retained 'costs' me about 21 MB of memory per undo element.
Since there is no obvious way to 'compact' a cropped CGImage and have it made independent from the original, I have currently done something similar to the following (without all the force unwrapping):
let croppedImage = original.cropping(to: rect)!
let data = UIImagePNGRepresentation(UIImage(cgImage: croppedImage))!
let compactedCroppedImage = UIImage(data: data)!.cgImage!
This works perfectly, and now each undo snapshot takes up only the amount of memory that it is supposed to.
My question is: Is there a better / faster way to achieve this?
Your code involves a PNG compression and decompression. This can be avoided. Just create an offscreen bitmap of the target size, draw the original image into it and use it as an image:
UIGraphicsBeginImageContext(rect.size)
let targetRect = CGRect(x: -rect.origin.x, y: -rect.origin.y, width: original.size.width, height: original.size.height)
original.draw(in: targetRect)
let croppedImage = UIGraphicsGetImageFromCurrentImageContext()
UIGraphicsEndImageContext()
Note: The result is slightly different if you don't have integral coordinates.
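Since the question holds CGImage objects rather than UIImage objects, the same idea can also be sketched at the Core Graphics level: crop, then draw the crop into a fresh bitmap context so the result has its own backing store. This is a variant under assumptions, not the code from the answer above; the function name independentCrop and the 8-bit premultiplied RGBA format are choices made for this illustration.

```swift
import CoreGraphics

/// Returns a copy of `rect` (in pixels) from `image` that is drawn into
/// its own bitmap, so it should no longer keep the original image alive.
/// Hypothetical helper; the pixel format is an assumption.
func independentCrop(of image: CGImage, to rect: CGRect) -> CGImage? {
    guard let cropped = image.cropping(to: rect) else { return nil }
    guard let context = CGContext(data: nil,                 // let Core Graphics allocate the buffer
                                  width: cropped.width,
                                  height: cropped.height,
                                  bitsPerComponent: 8,
                                  bytesPerRow: 0,            // 0 = let Core Graphics choose the stride
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)
    else { return nil }
    // Draw the cropped pixels 1:1 into the new bitmap.
    context.draw(cropped, in: CGRect(x: 0, y: 0, width: cropped.width, height: cropped.height))
    return context.makeImage()
}
```

Whether this is actually faster than the PNG round trip, and whether the original is released as expected, is worth confirming in the Allocations instrument for your own images.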
Per the Getting the Best Performance Page,
Use Core Graphics or Image I/O functions to crop or downsample, such as the functions CGImageCreateWithImageInRect or CGImageSourceCreateThumbnailAtIndex.
However, I'm wondering how true this is if you're working solely in Core Image for image processing. If I have an image that needs to be downsampled and then filtered, along with other things, wouldn't it be less efficient to convert to CGImage, downsample, then convert back to CIImage for other uses?
I'm wondering if it would simply be better to stay within the Core Image framework if downsampling is a part of the image-processing algorithm you're performing. Certainly, if the above is faster I'd like to give it a try, but I'm not sure there's any other way to downsample something as fast as possible.
No, unfortunately CILanczosScaleTransform is horribly slow. I wish Core Image had a faster built-in way to scale images besides this.
I'm using the code below, which I found here:
http://flexmonkey.blogspot.com/2014/12/scaling-resizing-and-orienting-images.html
extension UIImage {
    public func resizeToBoundingSquare(_ boundingSquareSideLength: CGFloat) -> UIImage {
        let imgScale = self.size.width > self.size.height ? boundingSquareSideLength / self.size.width : boundingSquareSideLength / self.size.height
        let newWidth = self.size.width * imgScale
        let newHeight = self.size.height * imgScale
        let newSize = CGSize(width: newWidth, height: newHeight)
        UIGraphicsBeginImageContext(newSize)
        self.draw(in: CGRect(x: 0, y: 0, width: newWidth, height: newHeight))
        let resizedImage = UIGraphicsGetImageFromCurrentImageContext()
        UIGraphicsEndImageContext()
        return resizedImage!
    }
}
After downsizing things and/or making various pixel sizes consistent, I then use CI filters, both custom and chaining. I'm not seeing any performance or memory issues.
I think I have seen the fastest performance by applying a CGAffineTransform(scaleX:y:) to the CIImage itself. I tried the CGImageSource thumbnail method, and the overhead far outstripped any benefit, perhaps because of the extra source conversions. Note that I am starting with a CMSampleBuffer, so my chain looks like:
CMSampleBuffer -> CIImage -> downsample with CGAffineTransform -> CIFilter chain -> input directly to a VNRequest. I am able to get 60 Hz direct-from-camera real-time processing using this method, although I would like to optimize it further, which is why I am searching for an even faster option!
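A minimal sketch of the first two stages of that chain, assuming the camera frames arrive as sample buffers that carry a pixel buffer. The function names ciImage(from:) and downsample(_:by:) are hypothetical, and the filter chain and Vision request are left out.

```swift
import CoreImage
import CoreMedia

/// Wraps the pixel buffer of a camera sample buffer in a CIImage.
/// Returns nil if the sample buffer carries no image data.
func ciImage(from sampleBuffer: CMSampleBuffer) -> CIImage? {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return nil }
    return CIImage(cvPixelBuffer: pixelBuffer)
}

/// Scales a CIImage uniformly with an affine transform.
/// A factor of 0.5 halves both width and height.
func downsample(_ image: CIImage, by factor: CGFloat) -> CIImage {
    return image.transformed(by: CGAffineTransform(scaleX: factor, y: factor))
}

// Stand-in for a camera frame: a solid color cropped to 640x480.
let frameImage = CIImage(color: CIColor(red: 0, green: 0, blue: 1))
    .cropped(to: CGRect(x: 0, y: 0, width: 640, height: 480))
let small = downsample(frameImage, by: 0.5)
print(small.extent)
```

Because Core Image evaluates lazily, the transform only becomes real work when the result is rendered, which may be part of why it composes cheaply with a following filter chain.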
There are many questions on OpenGL font rendering, many of them are satisfied by texture atlases (fast, but wrong), or string-textures (fixed-text only).
However, those approaches are poor and appear to be years out of date (what about using shaders to do this better/faster?). For OpenGL 4.1 there's this excellent question looking at "what should you use today?":
What is state-of-the-art for text rendering in OpenGL as of version 4.1?
So, what should we be using on iOS GL ES 2 today?
I'm disappointed that there appears to be no open-source (or even commercial) solution. I know a lot of teams suck it down and spend weeks of dev time re-inventing this wheel, gradually learning how to kern and space etc (ugh) - but there must be a better way than re-writing the whole of "fonts" from scratch?
As far as I can see, there are two parts to this:
1. How do we render text using a font?
2. How do we display the output?
For 1 (how to render), Apple provides MANY ways to get the "correct" rendered output - but the "easy" ones don't support OpenGL (maybe some of the others do - e.g. is there a simple way to map CoreText output to OpenGL?).
For 2 (how to display), we have shaders, we have VBOs, we have glyph-textures, we have lookup-textures, and other techniques (e.g. the OpenGL 4.1 stuff linked above?)
Here are the two common OpenGL approaches I know of:
Texture atlas (render all glyphs once, then render 1 x textured quad per character, from the shared texture)
This is wrong, unless you're using a 1980s era "bitmap font" (and even then: texture atlas requires more work than it may seem, if you need it correct for non-trivial fonts)
(fonts aren't "a collection of glyphs" there's a vast amount of positioning, layout, wrapping, spacing, kerning, styling, colouring, weighting, etc. Texture atlases fail)
Fixed string (use any Apple class to render correctly, then screenshot the backing image-data, and upload as a texture)
In human terms, this is fast. In frame-rendering, this is very, very slow. If you do this with a lot of changing text, your frame rate goes through the floor
Technically, it's mostly correct (not entirely: you lose some information this way) but hugely inefficient
I've also seen, but heard both good and bad things about:
Imagination/PowerVR "Print3D" (link broken) (from the guys that manufacture the GPU! But their site has moved/removed the text rendering page)
FreeType (requires pre-processing, interpretation, lots of code, extra libraries?)
...and/or FTGL http://sourceforge.net/projects/ftgl/ (rumors: slow? buggy? not updated in a long time?)
Font-Stash http://digestingduck.blogspot.co.uk/2009/08/font-stash.html (high quality, but very slow?)
1.
Within Apple's own OS / standard libraries, I know of several sources of text rendering. NB: I have used most of these in detail on 2D rendering projects, my statements about them outputting different rendering are based on direct experience
CoreGraphics with NSString
Simplest of all: render "into a CGRect"
Seem to be a slightly faster version of the "fixed string" approach people recommend (even though you'd expect it to be much the same)
UILabel and UITextView with plain text
NB: they are NOT the same! Slight differences in how they render the same text
NSAttributedString, rendered to one of the above
Again: renders differently (the differences I know of are fairly subtle and classified as "bugs", various SO questions about this)
CATextLayer
A hybrid between iOS fonts and old C rendering. Uses the "not fully" toll-free-bridged CFFont / UIFont, which reveals some more rendering differences / strangeness
CoreText
... the ultimate solution? But a beast of its own...
I did some more experimenting, and it seems that CoreText might make for a perfect solution when combined with a texture atlas and Valve's signed-distance textures (which can turn a bitmap glyph into a resolution-independent hi-res texture).
...but I don't have it working yet, still experimenting.
UPDATE: Apple's docs say they give you access to everything except the final detail: which glyph + glyph layout to render (you can get the line layout, and the number of glyphs, but not the glyph itself, according to docs). For no apparent reason, this core piece of info is apparently missing from CoreText (if so, that makes CT almost worthless. I'm still hunting to see if I can find a way to get the actual glyphs + per-glyph data)
UPDATE2: I now have this working properly with Apple's CT (but no signed-distance textures), but it ends up as 3 class files, 10 data structures, about 300 lines of code, plus the OpenGL code to render it. Too much for an SO answer :(.
The short answer is: yes, you can do it, and it works, if you:
Create CTFrameSetter
Create CTFrame for a theoretical 2D frame
Create a CGContext that you'll convert to a GL texture
Go through glyph-by-glyph, allowing Apple to render to the CGContext
Each time Apple renders a glyph, calculate the boundingbox (this is HARD), and save it somewhere
And save the unique glyph-ID (this will be different for e.g. "o", "f", and "of" (one glyph!))
Finally, send your CGContext up to GL as a texture
When you render, use the list of glyph-IDs that Apple created, and for each one use the saved info, and the texture, to render quads with texture-co-ords that pull individual glyphs out of the texture you uploaded.
This works, it's fast, it works with all fonts, it gets all font layout and kerning correct, etc.
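The glyph-by-glyph part of the steps above can be sketched roughly as follows. This is not the 300-line implementation mentioned earlier: it only collects glyph IDs, positions, and bounding boxes from a CTFrame, and it leans on CTFontGetBoundingRectsForGlyphs for the bounding boxes, which is an assumption about how that step could be done. The GlyphRecord type and the glyphRecords function are hypothetical names.

```swift
import CoreText
import Foundation

/// One entry per laid-out glyph. Hypothetical type for illustration.
struct GlyphRecord {
    let glyph: CGGlyph     // glyph ID (a ligature such as "fi" may be a single glyph)
    let position: CGPoint  // glyph origin relative to its line's origin
    let bounds: CGRect     // bounding box in the glyph's own coordinate space
}

func glyphRecords(in frame: CTFrame) -> [GlyphRecord] {
    var records: [GlyphRecord] = []
    let lines = CTFrameGetLines(frame) as! [CTLine]
    for line in lines {
        let runs = CTLineGetGlyphRuns(line) as! [CTRun]
        for run in runs {
            let count = CTRunGetGlyphCount(run)
            guard count > 0 else { continue }

            // A range length of 0 means "copy everything from the start location".
            var glyphs = [CGGlyph](repeating: 0, count: count)
            var positions = [CGPoint](repeating: .zero, count: count)
            CTRunGetGlyphs(run, CFRange(location: 0, length: 0), &glyphs)
            CTRunGetPositions(run, CFRange(location: 0, length: 0), &positions)

            // Each run carries the font it was laid out with.
            let attributes = CTRunGetAttributes(run) as NSDictionary
            guard let fontValue = attributes[kCTFontAttributeName as String] else { continue }
            let font = fontValue as! CTFont

            var rects = [CGRect](repeating: .zero, count: count)
            CTFontGetBoundingRectsForGlyphs(font, .default, glyphs, &rects, count)

            for i in 0..<count {
                records.append(GlyphRecord(glyph: glyphs[i], position: positions[i], bounds: rects[i]))
            }
        }
    }
    return records
}
```

To place glyphs in the texture, the per-line origins from CTFrameGetLineOrigins would still need to be added to each position; that part, and packing the glyphs into an atlas, is omitted here.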
1.
Create any string by NSMutableAttributedString.
let mabstring = NSMutableAttributedString(string: "This is a test of characterAttribute.")
mabstring.beginEditing()
var matrix = CGAffineTransform(rotationAngle: CGFloat(GLKMathDegreesToRadians(0)))
let font = CTFontCreateWithName("Georgia" as CFString?, 40, &matrix)
mabstring.addAttribute(kCTFontAttributeName as String, value: font, range: NSRange(location: 0, length: 4))
var number: Int8 = 2
let kdl = CFNumberCreate(kCFAllocatorDefault, .sInt8Type, &number)!
mabstring.addAttribute(kCTStrokeWidthAttributeName as String, value: kdl, range: NSRange(location: 0, length: mabstring.length))
mabstring.endEditing()
2.
Create a CTFrame. The rect is calculated from mabstring using CTFramesetterSuggestFrameSizeWithConstraints.
let framesetter = CTFramesetterCreateWithAttributedString(mabstring)
let path = CGMutablePath()
path.addRect(rect)
let frame = CTFramesetterCreateFrame(framesetter, CFRangeMake(0, 0), path, nil)
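The snippet above uses rect without showing how it is obtained. A minimal sketch of one way to compute it with CTFramesetterSuggestFrameSizeWithConstraints, assuming a maximum line width is known up front; the function name suggestedFrameRect and the maxWidth parameter are hypothetical.

```swift
import CoreText
import Foundation

/// Asks Core Text how much space the whole string needs when wrapped
/// at `maxWidth`, and returns it as a rect at the origin.
func suggestedFrameRect(for attributedString: NSAttributedString, maxWidth: CGFloat) -> CGRect {
    let framesetter = CTFramesetterCreateWithAttributedString(attributedString)
    // Unlimited height: let the text take as many lines as it needs.
    let constraints = CGSize(width: maxWidth, height: .greatestFiniteMagnitude)
    let size = CTFramesetterSuggestFrameSizeWithConstraints(
        framesetter,
        CFRange(location: 0, length: 0),  // length 0 = the entire string
        nil,                              // no extra frame attributes
        constraints,
        nil)                              // fit range not needed here
    // Round up so the bitmap created later has whole-pixel dimensions.
    return CGRect(x: 0, y: 0, width: ceil(size.width), height: ceil(size.height))
}
```

Usage would be along the lines of let rect = suggestedFrameRect(for: mabstring, maxWidth: 300), where 300 is just an example value.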
3.
Create bitmap context.
let imageWidth = Int(rect.width)
let imageHeight = Int(rect.height)
var rawData = [UInt8](repeating: 0, count: Int(imageWidth * imageHeight * 4))
let bitmapInfo = CGBitmapInfo(rawValue: CGBitmapInfo.byteOrder32Big.rawValue | CGImageAlphaInfo.premultipliedLast.rawValue)
let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
let bitsPerComponent = 8
let bytesPerRow = Int(rect.width) * 4
let context = CGContext(data: &rawData, width: imageWidth, height: imageHeight, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: rgbColorSpace, bitmapInfo: bitmapInfo.rawValue)!
4.
Draw CTFrame in bitmap context.
CTFrameDraw(frame, context)
Now we have the raw pixel data in rawData. It can be used to create an OpenGL texture, an MTLTexture, or a UIImage.
Examples:
To an OpenGL texture (see "Convert an UIImage in a texture"):
Set-up your texture:
GLuint textureID;
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
glGenTextures(1, &textureID);
glBindTexture(GL_TEXTURE_2D, textureID);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, textureData);
// To MTLTexture
let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm, width: Int(imageWidth), height: Int(imageHeight), mipmapped: true)
let device = MTLCreateSystemDefaultDevice()!
let texture = device.makeTexture(descriptor: textureDescriptor)
let region = MTLRegionMake2D(0, 0, Int(imageWidth), Int(imageHeight))
// bytesPerRow is the row stride computed for the bitmap context in step 3
texture.replace(region: region, mipmapLevel: 0, withBytes: &rawData, bytesPerRow: bytesPerRow)
// To UIImage
let providerRef = CGDataProvider(data: NSData(bytes: &rawData, length: rawData.count * MemoryLayout.size(ofValue: UInt8(0))))
let renderingIntent = CGColorRenderingIntent.defaultIntent
let imageRef = CGImage(width: imageWidth, height: imageHeight, bitsPerComponent: 8, bitsPerPixel: 32, bytesPerRow: bytesPerRow, space: rgbColorSpace, bitmapInfo: bitmapInfo, provider: providerRef!, decode: nil, shouldInterpolate: false, intent: renderingIntent)!
let image = UIImage.init(cgImage: imageRef)
I know this post is old, but I came across it while trying to do this exactly in my application. In my search, I came across this sample project
http://metalbyexample.com/rendering-text-in-metal-with-signed-distance-fields/
It is an excellent implementation of CoreText-based text rendering (targeting Metal rather than OpenGL, as the URL indicates) using the techniques of texture atlasing and signed distance fields. It has greatly helped me achieve the results I wanted. Hope this helps someone else.