I'm new to SpriteKit, and my question is how to load sprite sheets from a web API.
Currently, I have an API that returns a big PNG image containing all the sprite frames, plus a JSON file describing each individual frame (the file and JSON are generated by TexturePacker). The API looks like this:
The format is just like a .atlasc folder, which contains a big image and a plist (XML) file.
I was thinking about downloading the image and plist file and saving them to disk to load from. However, SKTextureAtlas.init(named: String) can only load from the app bundle.
In short, I want to load a sprite animation from the web at runtime.
I have control of the API, so I can update the API to accomplish my goal.
The way I've figured out is to download the image and create a source texture from it, like: let sourceTexture = SKTexture(image: image)
Then use the frame information in the JSON to create individual textures with the method init(rect rect: CGRect, inTexture texture: SKTexture).
Sample code is:
var textures: [SKTexture] = []
let sourceTexture = SKTexture(image: image)
for frame in spriteSheet.frames {
    // SKTexture rects are normalized (0...1) with a lower-left origin, while the
    // sprite sheet's frame data is in pixels with a top-left origin, so divide by
    // the sheet size and flip the y coordinate.
    let rect = CGRect(
        x: frame.frame.origin.x / spriteSheet.size.width,
        y: 1.0 - (frame.frame.size.height / spriteSheet.size.height) - (frame.frame.origin.y / spriteSheet.size.height),
        width: frame.frame.size.width / spriteSheet.size.width,
        height: frame.frame.size.height / spriteSheet.size.height
    )
    let texture = SKTexture(rect: rect, inTexture: sourceTexture)
    textures.append(texture)
}
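As a usage note (not from the original post): once you have the textures array, running the animation on a node is just a matter of an SKAction, for example:

// Animate the downloaded frames on a sprite node (a minimal sketch; the frame rate is an assumption).
let spriteNode = SKSpriteNode(texture: textures.first)
let animation = SKAction.repeatForever(SKAction.animate(with: textures, timePerFrame: 1.0 / 30.0))
spriteNode.run(animation)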
This is basically the same approach as #Honghao Zhang's answer, but I was a little confused about the whole structure at first glance, so I'm sharing my code snippet for later readers.
Happy coding :)
func getSpriteTextures() -> [SKTexture]? {
    guard let spriteSheet = loadSpriteJson(name: "sprite_json_file", codable: SpriteJson.self) else { return nil }

    let sourceImage = UIImage(named: "sprite_img_file.png")!
    let sourceTexture = SKTexture(image: sourceImage)

    var textures: [SKTexture] = []
    let sourceWidth = spriteSheet.meta.size.w
    let sourceHeight = spriteSheet.meta.size.h
    let orderedFrameImgNames = spriteSheet.frames.keys.sorted()

    for frameImgName in orderedFrameImgNames {
        let frameMeta = spriteSheet.frames[frameImgName]!
        let rect = CGRect(x: frameMeta.frame.x / sourceWidth,
                          y: 1.0
                              - (frameMeta.sourceSize.h / sourceHeight)
                              - (frameMeta.frame.y / sourceHeight),
                          width: frameMeta.frame.w / sourceWidth,
                          height: frameMeta.frame.h / sourceHeight)
        let texture = SKTexture(rect: rect, in: sourceTexture)
        textures.append(texture)
    }
    return textures
}
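For completeness, here's a rough sketch of what the SpriteJson Codable models and the loadSpriteJson helper used above could look like for TexturePacker's JSON (hash) output. The exact field names depend on the exporter settings, so treat them as assumptions rather than the poster's actual code:

import UIKit

// Hypothetical Codable models matching TexturePacker's JSON hash format (field names are assumptions).
struct SpriteJson: Codable {
    struct Rect: Codable { let x, y, w, h: CGFloat }
    struct Size: Codable { let w, h: CGFloat }
    struct FrameMeta: Codable {
        let frame: Rect
        let sourceSize: Size
    }
    struct Meta: Codable {
        let size: Size
    }
    let frames: [String: FrameMeta]
    let meta: Meta
}

// Decodes a bundled JSON file; when loading from the web, swap the bundle lookup
// for the URL of the downloaded file.
func loadSpriteJson<T: Codable>(name: String, codable: T.Type) -> T? {
    guard let url = Bundle.main.url(forResource: name, withExtension: "json"),
          let data = try? Data(contentsOf: url) else { return nil }
    return try? JSONDecoder().decode(T.self, from: data)
}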
I was able to identify squares in an image using VNDetectRectanglesRequest. Now I want to store those rectangles as separate images (UIImage or CGImage). Below is what I tried.
let rectanglesDetection = VNDetectRectanglesRequest { request, error in
    rectangles = request.results as! [VNRectangleObservation]
    rectangles.sort { $0.boundingBox.origin.y > $1.boundingBox.origin.y }

    for rectangle in rectangles {
        let rect = rectangle.boundingBox
        let imageRef = cgImage.cropping(to: rect)
        let image = UIImage(cgImage: imageRef!, scale: image!.scale, orientation: image!.imageOrientation)
        checkBoxImages.append(image)
    }
}
Can anybody point out what's wrong or what should be the best approach?
Update 1
At this stage, I'm testing with an image that I added to the assets.
With this image I get 7 rectangles as observations: one for each cell and one for the table margin.
My task is to identify the text inside each rectangle, and my approach is to send a VNRecognizeTextRequest for each rectangle that has been identified. My real scenario is a little more complicated than this, but I want to at least achieve this before going forward.
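For reference, the per-rectangle recognition I have in mind looks roughly like this (a sketch only, not code from my project; croppedCGImage stands in for one of the cropped cells):

import Vision

// Rough sketch: run text recognition on a single cropped cell image.
let textRequest = VNRecognizeTextRequest { request, _ in
    let observations = request.results as? [VNRecognizedTextObservation] ?? []
    let lines = observations.compactMap { $0.topCandidates(1).first?.string }
    print(lines.joined(separator: "\n"))
}
textRequest.recognitionLevel = .accurate

let handler = VNImageRequestHandler(cgImage: croppedCGImage, options: [:])
try? handler.perform([textRequest])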
Update 2
for rectangle in rectangles {
    let trueX = rectangle.boundingBox.minX * image!.size.width
    let trueY = rectangle.boundingBox.minY * image!.size.height
    let width = rectangle.boundingBox.width * image!.size.width
    let height = rectangle.boundingBox.height * image!.size.height
    print("x = ", trueX, " y = ", trueY, " width = ", width, " height = ", height)
    let cropZone = CGRect(x: trueX, y: trueY, width: width, height: height)
    guard let cutImageRef: CGImage = image?.cgImage?.cropping(to: cropZone) else {
        return
    }
    let croppedImage: UIImage = UIImage(cgImage: cutImageRef)
    croppedImages.append(croppedImage)
}
My image width and height are
width = 406.0 height = 368.0
I've included my debug interface so you can get a proper understanding.
As #Lasse mentioned, this is my actual issue with screenshots.
This is just a guess since you didn't state what the actual problem is, but probably you're getting a zero-sized image for each VNRectangleObservation.
The reason is: Vision uses a normalized coordinate space from 0.0 to 1.0 with a lower-left origin.
So in order to get the correct rectangle of your original image, you need to convert the rect from normalized space to image space. Luckily there is VNImageRectForNormalizedRect(_:_:_:) to do just that.
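For illustration, a rough sketch of that conversion applied to the question's Update 2 loop (assuming image has a scale of 1 so points and pixels match; note that CGImage.cropping(to:) uses a top-left origin, so the y coordinate is flipped after the conversion):

import UIKit
import Vision

let imageWidth = Int(image!.size.width)
let imageHeight = Int(image!.size.height)

for rectangle in rectangles {
    // Convert the normalized bounding box (lower-left origin) to image coordinates.
    var rect = VNImageRectForNormalizedRect(rectangle.boundingBox, imageWidth, imageHeight)
    // CGImage.cropping(to:) expects a top-left origin, so flip the y axis.
    rect.origin.y = CGFloat(imageHeight) - rect.origin.y - rect.height
    if let croppedCG = image?.cgImage?.cropping(to: rect) {
        croppedImages.append(UIImage(cgImage: croppedCG))
    }
}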
I am working on an MTKView-backed paint program which can replay painting history via an array of MTLTextures that store keyframes. I am having an issue in which sometimes the content of these MTLTextures is scrambled.
As an example, say I want to store a section of the drawing below as a keyframe:
During playback, sometimes the drawing will display exactly as intended, but sometimes, it will display like this:
Note the distorted portion of the picture. (The undistorted portion constitutes a static background image that's not part of the keyframe in question)
I describe the way I create individual MTLTextures from the MTKView's currentDrawable below. Because of color-depth issues I won't go into, the process may seem a little roundabout.
I first get a CGImage of the subsection of the screen that constitutes a keyframe.
I use that CGImage to create an MTLTexture tied to the MTKView's device.
I store that MTLTexture in an MTLTextureStructure that stores the MTLTexture and the keyframe's bounding box (which I'll need later).
Lastly, I store it in an array of MTLTextureStructures (keyframeMetalArray). During playback, when I hit a keyframe, I get it from this keyframeMetalArray.
The associated code is outlined below.
let keyframeCGImage = weakSelf!.canvasMetalViewPainting.mtlTextureToCGImage(bbox: keyframeBbox, copyMode: copyTextureMode.textureKeyframe) // convert from MetalTexture to CGImage
let keyframeMTLTexture = weakSelf!.canvasMetalViewPainting.CGImageToMTLTexture(cgImage: keyframeCGImage)
let keyframeMTLTextureStruc = mtlTextureStructure(texture: keyframeMTLTexture, bbox: keyframeBbox, strokeType: brushTypeMode.brush)
weakSelf!.keyframeMetalArray.append(keyframeMTLTextureStruc)
Without providing specifics about how each conversion is happening, I wonder if, from an architectural design point of view, I'm overlooking something that is corrupting my data stored in the keyframeMetalArray. It may be unwise to try to store these MTLTextures in volatile arrays, but I don't know that for a fact. I just figured using MTLTextures would be the quickest way to update content.
By the way, when I swap the arrays of keyframes for arrays of UIImage.pngData, I have no display issues, but it's a lot slower. On the plus side, it tells me that the initial capture from currentDrawable to keyframeCGImage is working just fine.
Any thoughts would be appreciated.
p.s. adding a bit of detail based on the feedback:
mtlTextureToCGImage:
func mtlTextureToCGImage(bbox: CGRect, copyMode: copyTextureMode) -> CGImage {
    let kciOptions = [convertFromCIContextOption(CIContextOption.outputPremultiplied): true,
                      convertFromCIContextOption(CIContextOption.useSoftwareRenderer): false] as [String: Any]
    let bboxStrokeScaledFlippedY = CGRect(x: (bbox.origin.x * self.viewContentScaleFactor),
                                          y: ((self.viewBounds.height - bbox.origin.y - bbox.height) * self.viewContentScaleFactor),
                                          width: (bbox.width * self.viewContentScaleFactor),
                                          height: (bbox.height * self.viewContentScaleFactor))
    let strokeCIImage = CIImage(mtlTexture: metalDrawableTextureKeyframe,
                                options: convertToOptionalCIImageOptionDictionary(kciOptions))!.oriented(CGImagePropertyOrientation.downMirrored)
    let imageCropCG = cicontext.createCGImage(strokeCIImage, from: bboxStrokeScaledFlippedY, format: CIFormat.RGBA8, colorSpace: colorSpaceGenericRGBLinear)
    cicontext.clearCaches()
    return imageCropCG!
} // end of func mtlTextureToCGImage(bbox: CGRect)
CGImageToMTLTexture:
func CGImageToMTLTexture(cgImage: CGImage) -> MTLTexture {
    // Note that we forego the more direct method of creating stampTexture:
    //let stampTexture = try! MTKTextureLoader(device: self.device!).newTexture(cgImage: strokeUIImage.cgImage!, options: nil)
    // because MTKTextureLoader seems to be doing additional processing which messes with the resulting texture/colorspace
    let width = Int(cgImage.width)
    let height = Int(cgImage.height)
    let bytesPerPixel = 4
    let rowBytes = width * bytesPerPixel

    let texDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm,
                                                                 width: width,
                                                                 height: height,
                                                                 mipmapped: false)
    texDescriptor.usage = MTLTextureUsage(rawValue: MTLTextureUsage.shaderRead.rawValue)
    texDescriptor.storageMode = .shared

    guard let stampTexture = device!.makeTexture(descriptor: texDescriptor) else {
        return brushTextureSquare // return SOMETHING
    }

    let dstData: CFData = (cgImage.dataProvider!.data)!
    let pixelData = CFDataGetBytePtr(dstData)
    let region = MTLRegionMake2D(0, 0, width, height)
    print("[MetalViewPainting]: w= \(width) | h= \(height) region = \(region.size)")
    stampTexture.replace(region: region, mipmapLevel: 0, withBytes: pixelData!, bytesPerRow: Int(rowBytes))
    return stampTexture
} // end of func CGImageToMTLTexture (cgImage: CGImage)
The type of distortion looks like a bytes-per-row alignment issue between CGImage and MTLTexture. You're probably only seeing this issue when your image is a certain size that falls outside of the bytes-per-row alignment requirement of your MTLDevice. If you really need to store the texture as a CGImage, ensure that you are using the bytesPerRow value of the CGImage when copying back to the texture.
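As a hedged sketch of what that fix could look like inside CGImageToMTLTexture, using the CGImage's own bytesPerRow rather than the computed width * bytesPerPixel (the rest of the function is assumed to stay as posted):

// Use the CGImage's actual row stride; CGImage rows are often padded for alignment,
// so width * bytesPerPixel can be smaller than the real bytesPerRow.
let dstData: CFData = (cgImage.dataProvider!.data)!
let pixelData = CFDataGetBytePtr(dstData)
let region = MTLRegionMake2D(0, 0, cgImage.width, cgImage.height)
stampTexture.replace(region: region,
                     mipmapLevel: 0,
                     withBytes: pixelData!,
                     bytesPerRow: cgImage.bytesPerRow)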
I'm trying to get a high-quality image of each PDF page. I'm using the code below, run in a for loop up to the page count, and it works.
guard let document = CGPDFDocument(pdfurl as CFURL) else { return }
guard let page = document.page(at: i) else { return }
let dpi: CGFloat = 300.0 / 72.0
let pagerect = page.getBoxRect(.mediaBox)
print(pagebounds)
print(pagerect)
let render = UIGraphicsImageRenderer(size: CGSize(width: pagerect.size.width * dpi, height: pagerect.size.height * dpi))
let imagedata = render.jpegData(withCompressionQuality: 0.5, actions: { cnv in
    UIColor.white.set()
    cnv.fill(pagerect)
    cnv.cgContext.translateBy(x: 0.0, y: pagerect.size.height * dpi)
    cnv.cgContext.scaleBy(x: dpi, y: -dpi)
    cnv.cgContext.drawPDFPage(page)
})
let image = UIImage(data: imagedata)
I'm getting the following issues with this:
sometimes the image is nil.
When this runs, the usage of memory is very high.
As the page count (number of pages) grows, memory usage gets very, very high, sometimes reaching 1.4 GB, and then the app suddenly crashes with the warning: Terminated due to memory warning. I then tried to run the above code inside an autoreleasepool. It did help, but when memory usage gets high again (close to the RAM size), the app crashes with the same warning.
How can I avoid this memory warning and still get a quality image from each PDF page? Any help is appreciated. Have a nice day.
If you are facing this issue then try this:
autoreleasepool {
    guard let page = document.page(at: i) else { return }

    // Fetch the page rect for the page we want to render.
    let pageRect = page.getBoxRect(.mediaBox)

    var dpi: CGFloat = 1.0
    if pageRect.size.width > pageRect.size.height {
        dpi = 3508.0 / pageRect.size.width
    } else {
        dpi = 3508.0 / pageRect.size.height
    }
    //dpi = 300

    let format = UIGraphicsImageRendererFormat()
    format.scale = 1

    let renderer = UIGraphicsImageRenderer(size: CGSize(width: pageRect.size.width * dpi, height: pageRect.size.height * dpi), format: format)
    let imagedata = renderer.jpegData(withCompressionQuality: 1.0, actions: { cnv in
        UIColor.white.set()
        cnv.fill(pageRect)
        cnv.cgContext.translateBy(x: 0.0, y: pageRect.size.height * dpi)
        cnv.cgContext.scaleBy(x: dpi, y: -dpi)
        cnv.cgContext.drawPDFPage(page)
    })
    let image = UIImage(data: imagedata)
}
autoreleasepool - for regularly clearing temporary memory after each iteration
scale = 1 - so that images are not created at the device scale, which would increase their resolution by 2 or 3 times
changed the way dpi is calculated, since the page can initially be more or less than 72 dpi
1) sometimes the image is nil.
Is there a reason you are generating JPEG data and then converting it to a UIImage, vs. directly creating a UIImage (using func image(actions: (UIGraphicsImageRendererContext) -> Void) -> UIImage)?
If you really need to use the JPEG method, then don't instantiate directly from UIImage(data:); use CGImage's init?(jpegDataProviderSource:decode:shouldInterpolate:intent:) and then UIImage(cgImage:) to get your UIImage instance.
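For example, a rough sketch of the direct route, reusing page, pageRect, dpi, and format from the snippet in the question/answer above:

// Render the page straight to a UIImage; no intermediate JPEG data or UIImage(data:) step.
let renderSize = CGSize(width: pageRect.size.width * dpi, height: pageRect.size.height * dpi)
let renderer = UIGraphicsImageRenderer(size: renderSize, format: format)
let pageImage = renderer.image { cnv in
    UIColor.white.set()
    cnv.fill(CGRect(origin: .zero, size: renderSize))
    cnv.cgContext.translateBy(x: 0.0, y: renderSize.height)
    cnv.cgContext.scaleBy(x: dpi, y: -dpi)
    cnv.cgContext.drawPDFPage(page)
}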
2) When this runs, the usage of memory is very high
Are you storing all the images created for each page? If so, then with a PDF with a high number of pages you will consume peak memory at some point because of the accumulated images. Why not write each image to disk as it is created and release it afterwards, so you don't accumulate all the pages in memory?
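As a rough illustration (the renderPageJPEGData helper and file names are assumptions, not from your code), the per-page loop could look like this:

// Write each page's JPEG to disk inside the autoreleasepool so it can be freed immediately.
let docsURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
for i in 1...document.numberOfPages {
    autoreleasepool {
        guard let page = document.page(at: i) else { return }
        let jpegData = renderPageJPEGData(page)   // hypothetical helper wrapping the renderer code above
        let fileURL = docsURL.appendingPathComponent("page_\(i).jpg")
        try? jpegData.write(to: fileURL)
        // Nothing is kept in memory; load the file back later only when needed.
    }
}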
Sharing the loop (assuming there is a loop) surrounding this snippet would help in solving your issue further.
I have a class that takes a UIImage and initializes a CIImage with it like so:
workingImage = CIImage.init(image: baseImage!)
Then the image is used to cut out 9 neighbouring squares in a 3x3 pattern, in a loop:
for x in 0..<3
{
    for y in 0..<3
    {
        croppingRect = CGRect(x: CGFloat(Double(x) * sideLength + startPointX),
                              y: CGFloat(Double(y) * sideLength + startPointY),
                              width: CGFloat(sideLength),
                              height: CGFloat(sideLength))
        let tmpImg = (workingImage?.cropping(to: croppingRect))!
    }
}
Those tmpImgs are inserted into a table and used later, but that's beside the point.
This code works on iOS 9 and on iOS 10 simulators, but not on an actual iOS 10 device. The images produced are either all empty, or one of them is about half of what it's supposed to be, with the rest being, again, empty.
Is this not how it's supposed to be done in iOS 10?
The heart of the matter is that passing through CIImage is not the way to crop a UIImage. For one thing, coming back from CIImage to UIImage is a complicated business. For another, the whole round-trip is unnecessary.
How To Crop
To crop an image, make an image graphics context of the desired cropped size and call draw(at:) on the UIImage to draw it at the desired point relative to the graphics context, so that the desired portion of the image falls into the context. Now extract the resulting new image and close the context.
To demonstrate, I'll crop to one of the thirds you are trying to crop to, namely the lower right third:
let sz = baseImage.size
UIGraphicsBeginImageContextWithOptions(
CGSize(width:sz.width/3.0, height:sz.height/3.0),
false, 0)
baseImage.draw(at:CGPoint(x: -sz.width/3.0*2.0, y: -sz.height/3.0*2.0))
let tmpImg = UIGraphicsGetImageFromCurrentImageContext()
UIGraphicsEndImageContext()
Original image (baseImage):
Cropped image (tmpImg):
The other sections are completely parallel.
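If it helps, here's a sketch generalizing the same draw(at:) idea to all nine thirds (the loop bounds and the thirds array are my additions, not part of the answer above):

// Crop every cell of the 3x3 grid by shifting the full image so the desired
// cell lands at the context origin.
var thirds: [UIImage] = []
let sz = baseImage.size
let cell = CGSize(width: sz.width / 3.0, height: sz.height / 3.0)
for row in 0..<3 {
    for col in 0..<3 {
        UIGraphicsBeginImageContextWithOptions(cell, false, 0)
        baseImage.draw(at: CGPoint(x: -cell.width * CGFloat(col),
                                   y: -cell.height * CGFloat(row)))
        if let img = UIGraphicsGetImageFromCurrentImageContext() {
            thirds.append(img)
        }
        UIGraphicsEndImageContext()
    }
}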
Core Image's coordinate system doesn't match UIKit's, so the rect needs to be mirrored.
So in your specific case, you want:
var ciRect = croppingRect
ciRect.origin.y = workingImage!.extent.height - ciRect.origin.y - ciRect.height
let tmpImg = workingImage!.cropped(to: ciRect)
This definitely works for iOS 10+.
In a more general case, we would make a UIImage extension that handles both possible backings (CGImage or CIImage) and their coordinate systems, and the CGImage path is way faster than draw(at:):
extension UIImage {
    /// Return a new image cropped to a rectangle.
    /// - parameter rect:
    ///   The rectangle to crop.
    open func cropped(to rect: CGRect) -> UIImage {
        // a UIImage is either initialized using a CGImage, a CIImage, or nothing
        if let cgImage = self.cgImage {
            // CGImage.cropping(to:) is magnitudes faster than UIImage.draw(at:)
            if let cgCroppedImage = cgImage.cropping(to: rect) {
                return UIImage(cgImage: cgCroppedImage)
            } else {
                return UIImage()
            }
        }
        if let ciImage = self.ciImage {
            // Core Image's coordinate system mismatch with UIKit, so rect needs to be mirrored.
            var ciRect = rect
            ciRect.origin.y = ciImage.extent.height - ciRect.origin.y - ciRect.height
            let ciCroppedImage = ciImage.cropped(to: ciRect)
            return UIImage(ciImage: ciCroppedImage)
        }
        return self
    }
}
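Example usage, assuming baseImage is the question's original image and its scale is 1 so points and pixels line up:

// Crop the lower-right ninth of the image using the extension above.
let sz = baseImage.size
let lowerRightRect = CGRect(x: sz.width * 2 / 3, y: sz.height * 2 / 3,
                            width: sz.width / 3, height: sz.height / 3)
let lowerRightThird = baseImage.cropped(to: lowerRightRect)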
I've made a pod for it, so the source code is at https://github.com/Coeur/ImageEffects/blob/master/SwiftImageEffects/ImageEffects%2Bextensions.swift
I'm trying to use the Tesseract OCR library in my iOS application. I downloaded the tesseract-ios library from GitHub, and when I tried to recognize a simple text image I got garbage instead. Here is an image of what I tried to recognize:
I got unreadable text:
T0I1101T0W KIR1 H1I1101T0W KIR1 H1I1101T0W CIBEPS H1 ES PBHY P306
EHH11 133I R1 11335 11I1H1 19 13S SYIL 3B19 M H300H1911 H1113 AIR1
J1 OIII 3I9SH5H133IS 13V9 I1 Q1H211 E015 19 W331 H1 111SW
Why can't Tesseract recognise even a simple image? Here is the code I used to instantiate Tesseract:
Tesseract* tesseractObject = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
[tesseractObject setVariableValue:@"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ" forKey:@"tessedit_char_whitelist"];
[tesseractObject setImage:image];
[tesseractObject recognize];
NSLog(@"RECOGNISED= %@", [tesseractObject recognizedText]);
Here is my project structure:
I added the English tessdata folder by reference. So what am I doing wrong? How can I fix this?
You are using the option tessedit_char_whitelist with the value "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ", which limits character recognition to this list only. However, the image you want to process contains lower-case characters; if you want to use this option you will have to include lower-case characters too.
[tesseractObject setVariableValue:@"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" forKey:@"tessedit_char_whitelist"];
Make sure you have the latest tessdata file from Google code
http://code.google.com/p/tesseract-ocr/downloads/list
This will provide you with a list of tessdata files that you need to download and include in your app if you haven't already. In your case you will need tesseract-ocr-3.02.eng.tar.gz as you are looking for the English language files
The following article will show you where you need to install it. I read through this tutorial when I built my first Tesseract project and found it really useful
http://lois.di-qual.net/blog/install-and-use-tesseract-on-ios-with-tesseract-ios/
Like Adam said, if you want good results, you'll have to do some image processing and configure some settings (white-listing certain characters, etc).
For anyone else stumbling upon this question, I've put together a sample project here that does some white-listing and image processing: https://github.com/mstrchrstphr/OCR-iOS-Example
And my output is:
Solution:
tesseract.language = @"eng+fra";
tesseract.pageSegmentationMode = G8PageSegmentationModeAuto;
tesseract.engineMode = G8OCREngineModeTesseractCubeCombined;
tesseract.image = [image.image g8_blackAndWhite];
tesseract.maximumRecognitionTime = 60.0;
[tesseract recognize];
NSLog(@"%@", tesseract.recognizedText);
reco_area.text = [tesseract recognizedText];
For tessdata, click here.
Whatever #Adam Richardson explained is correct. Along with that, add the following: 1) a scaleImage method to increase the size of the image (increase its dimensions):
func scaleImage(image: UIImage, maxDimension: CGFloat) -> UIImage {
    var scaledSize = CGSize(width: maxDimension, height: maxDimension)
    var scaleFactor: CGFloat

    if image.size.width > image.size.height {
        scaleFactor = image.size.height / image.size.width
        scaledSize.width = maxDimension
        scaledSize.height = scaledSize.width * scaleFactor
    } else {
        scaleFactor = image.size.width / image.size.height
        scaledSize.height = maxDimension
        scaledSize.width = scaledSize.height * scaleFactor
    }

    UIGraphicsBeginImageContext(scaledSize)
    image.draw(in: CGRect(x: 0, y: 0, width: scaledSize.width, height: scaledSize.height))
    let scaledImage = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()

    return scaledImage!
}
2) Store the eng.traineddata language file via FileManager:
func storeLanguageFile() throws {
    let fileManager = FileManager.default
    let nsDocumentDirectory = FileManager.SearchPathDirectory.documentDirectory
    let nsUserDomainMask = FileManager.SearchPathDomainMask.userDomainMask
    let docDirectory = NSSearchPathForDirectoriesInDomains(nsDocumentDirectory, nsUserDomainMask, true)[0] as NSString
    let path: String = docDirectory.appendingPathComponent("/tessdata/eng.traineddata")

    // Copy the bundled traineddata into Documents/tessdata only if it isn't already there.
    if !fileManager.fileExists(atPath: path) {
        let bundlePath = (Bundle.main.resourcePath! as NSString).appendingPathComponent("/tessdata/eng.traineddata")
        let data = try Data(contentsOf: URL(fileURLWithPath: bundlePath))
        try fileManager.createDirectory(atPath: docDirectory.appendingPathComponent("/tessdata"), withIntermediateDirectories: true, attributes: nil)
        try data.write(to: URL(fileURLWithPath: path), options: .atomic)
    }
}
3) After that you can use https://github.com/BradLarson/GPUImage to increase the clarity of the image.
You can use this:
func preprocessedImage(for tesseract: G8Tesseract!, sourceImage: UIImage!) -> UIImage! {
    let stillImageFilter = GPUImageAdaptiveThresholdFilter()
    stillImageFilter.blurRadiusInPixels = 4.0
    let filterImage: UIImage = stillImageFilter.image(byFilteringImage: sourceImage)
    return filterImage
}
These 3 steps will help you increase the accuracy of Tesseract up to 60~70%.