I implemented Live Text on images with the following code:
let config = ImageAnalyzer.Configuration([.text, .machineReadableCode])
Task {
do {
let analysis = try await analyzer.analyze(image, configuration: config)
interaction.analysis = analysis
interaction.preferredInteractionTypes = .automatic
} catch {
}
}
I am able to select text, but I'm not able to do much after that. I would like to be able to get the selected text along with the position of the selected text relative to the image. How would I do that?
Related
I work on an iOS app that displays images that often contain text, and I'm adding support for ImageAnalysisInteraction as described in this WWDC 2022 session. I have gotten as far as making the interaction show up and being able to select text and get the system selection menu, and even add my own action to the menu via the buildMenuWithBuilder API. But what I really want to do with my custom action is get the selected text and do a custom lookup-like thing to check the text against other content in my app.
So how do I get the selected text from an ImageAnalysisInteraction on a UIImageView? The docs show methods to check if there is selected text, but I want to know what the text is.
I was trying to solve the same problem. However, there doesn't currently seem to be any straightforward way to get selected text from ImageAnalysisInteraction. The closest thing seems to be the ImageAnalysis.transcript property, but it contains all the OCR text, not just what the user selected.
My solution was to capture the text whenever the user taps on the copy button on the selection menu. You can do this by observing clipboard changes, which allows you to copy the selected text from the clipboard whenever a change is detected.
See:
Get notified on clipboard change in swift
How to copy text to clipboard/pasteboard with Swift
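A minimal sketch of the clipboard approach described above, assuming the copy happens inside your own app while it is in the foreground. `CopiedTextObserver` and its `onCopy` callback are hypothetical names, not part of VisionKit; the sketch relies only on `UIPasteboard.changedNotification`.

```swift
import UIKit

/// Hypothetical helper: calls `onCopy` whenever the general pasteboard changes
/// and currently holds a string (for example, after the user taps "Copy").
final class CopiedTextObserver {
    private var token: NSObjectProtocol?

    init(onCopy: @escaping (String) -> Void) {
        token = NotificationCenter.default.addObserver(
            forName: UIPasteboard.changedNotification,
            object: nil,
            queue: .main
        ) { _ in
            // Read whatever string the pasteboard holds after the change.
            if let text = UIPasteboard.general.string {
                onCopy(text)
            }
        }
    }

    deinit {
        if let token = token {
            NotificationCenter.default.removeObserver(token)
        }
    }
}
```

Keep the observer alive for as long as the image view is on screen, e.g. `observer = CopiedTextObserver { text in print(text) }`. Note that this reacts to any copy made in the app, not only copies from the image, and it gives you the text but not its position.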
Hope this helps.
// Step 1: import the Vision framework
import Vision
// Step 2: get a CGImage from the image view
guard let cgImage = imageWithText.image?.cgImage else { return }
// Step 3: create a request handler for the CGImage
let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
// Step 4: create the text-recognition request
let request = VNRecognizeTextRequest { request, error in
    guard let observations = request.results as? [VNRecognizedTextObservation],
          error == nil else { return }
    let text = observations.compactMap {
        $0.topCandidates(1).first?.string
    }.joined(separator: ", ")
    print(text) // all text recognized in the image
}
// Step 5: choose a recognition level and run the request
request.recognitionLevel = .accurate
do {
    try handler.perform([request])
} catch {
    print(error)
}
For Reference and more details
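Since the question also asks for positions: each `VNRecognizedTextObservation` exposes a `boundingBox` in normalized coordinates (0 to 1) with the origin at the lower-left corner. The sketch below converts such a box into pixel coordinates with a top-left origin. `imageRect(fromNormalized:imageSize:)` is a hypothetical helper name, and this yields the position of each recognized line, not of the user's selection.

```swift
import Foundation

/// Hypothetical helper: converts a normalized, lower-left-origin rectangle
/// (as reported by Vision) into image coordinates with a top-left origin.
func imageRect(fromNormalized box: CGRect, imageSize: CGSize) -> CGRect {
    let width = box.size.width * imageSize.width
    let height = box.size.height * imageSize.height
    let x = box.origin.x * imageSize.width
    // Flip the y axis: Vision measures from the bottom, UIKit from the top.
    let y = (1 - (box.origin.y + box.size.height)) * imageSize.height
    return CGRect(x: x, y: y, width: width, height: height)
}
```

Inside the request's completion handler you could call it as `imageRect(fromNormalized: observation.boundingBox, imageSize: CGSize(width: cgImage.width, height: cgImage.height))`.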
I'm using PDFKit framework, and want to highlight all the hyperlinks with blue color in all the pages in the PDF. How do I go ahead? I have searched but could not get enough relevant post.
If you want to extract all links from the PDF, run NSDataDetector over the document's text and collect the links in an array, like this:
let text = pdfView.document?.string ?? ""
do {
    let detector = try NSDataDetector(types: NSTextCheckingResult.CheckingType.link.rawValue)
    // NSRange counts UTF-16 code units, so use utf16.count rather than count
    let range = NSRange(location: 0, length: text.utf16.count)
    let matches = detector.matches(in: text, options: [], range: range)
    let linksArray: [URL] = matches.compactMap { $0.url }
    print("List of available links: \(linksArray)")
} catch {
    print(error.localizedDescription)
}
But if you just want the links to be detected and tappable, PDFView has an enableDataDetectors property for that. You only have to enable it.
As per the Apple documentation:
Turns on or off data detection. If enabled, page text will be scanned for URL's as the page becomes visible. Where URL's are found, Link annotations are created in place. These are temporary annotations and are not saved.
You can use it as:
let pdfView = PDFView(frame: self.view.bounds)
pdfView.enableDataDetectors = true
If you need to handle taps on these links, set pdfView.delegate and conform to PDFViewDelegate; the view will then call this delegate method:
func pdfViewWillClick(onLink sender: PDFView, with url: URL) {
    // Handle the tapped URL here
}
I had the same question, and this is what I implemented to accomplish the task.
When initializing the PDF view, enable enableDataDetectors. This turns on data detection, which adds annotations for the URLs detected in a page.
PDFView *pdfView = [[PDFView alloc] initWithFrame:containerView.bounds];
pdfView.document = pdfDocument;
pdfView.enableDataDetectors = YES;
Then, using the following method, we can extract the page hyperlinks. I converted those hyperlinks to PDFDestination objects for easy navigation. Hope this helps someone!
- (NSArray *)getHyperlinksDestinationsListFrom:(PDFDocument *)pdfDocument {
    NSMutableArray *list = [NSMutableArray new];
    if (pdfDocument) {
        for (NSUInteger i = 0; i < pdfDocument.pageCount; i++) {
            PDFPage *page = [pdfDocument pageAtIndex:i];
            // After enabling 'enableDataDetectors', all detectable hyperlinks are returned as 'Link' annotations
            for (PDFAnnotation *anno in page.annotations) {
                if ([anno.type isEqualToString:@"Link"]) {
                    PDFDestination *dest = [[PDFDestination alloc] initWithPage:page atPoint:anno.bounds.origin];
                    [list addObject:dest];
                }
            }
        }
    }
    return list;
}
The highlighting part is straightforward once you have detected the links.
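One possible way to do the highlighting, sketched in Swift under a few assumptions: this targets iOS (hence `UIColor`), `highlightLinks(in:color:)` is a hypothetical helper name, and the link annotations already exist on the pages. As the documentation quoted earlier says, data-detector annotations are only created as pages become visible, so pages that have not been shown yet may have none.

```swift
import PDFKit
import UIKit

/// Hypothetical helper: overlays a translucent highlight annotation
/// on every "Link" annotation found in the document.
func highlightLinks(in document: PDFDocument,
                    color: UIColor = UIColor.blue.withAlphaComponent(0.3)) {
    for index in 0..<document.pageCount {
        guard let page = document.page(at: index) else { continue }
        // Snapshot the link annotations first, since we add to the page below.
        let links = page.annotations.filter { $0.type == "Link" }
        for link in links {
            let highlight = PDFAnnotation(bounds: link.bounds,
                                          forType: .highlight,
                                          withProperties: nil)
            highlight.color = color
            page.addAnnotation(highlight)
        }
    }
}
```

Call it with `pdfView.document` once the pages of interest have been displayed. Unlike the temporary data-detector annotations, the highlight annotations added here are ordinary annotations, so remove them again if they should not be saved.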
How can I get the console logs with all the print/Nslog contents and display it on a textview? Thank you very much for your answer.
To accomplish this I modified the OutputListener Class described in this article titled "Intercepting stdout in Swift" by phatblat:
// outputPipe (a Pipe) and outputText (a text view) are properties of the enclosing class
func captureStandardOutputAndRouteToTextView() {
    outputPipe = Pipe()
    // Intercept STDOUT with outputPipe
    dup2(self.outputPipe.fileHandleForWriting.fileDescriptor, FileHandle.standardOutput.fileDescriptor)
    outputPipe.fileHandleForReading.waitForDataInBackgroundAndNotify()
    NotificationCenter.default.addObserver(forName: NSNotification.Name.NSFileHandleDataAvailable, object: outputPipe.fileHandleForReading, queue: nil) { notification in
        let output = self.outputPipe.fileHandleForReading.availableData
        let outputString = String(data: output, encoding: String.Encoding.utf8) ?? ""
        DispatchQueue.main.async {
            let nextOutput = self.outputText.string + outputString
            self.outputText.string = nextOutput
            // NSRange is measured in UTF-16 code units, so use utf16.count
            let range = NSRange(location: nextOutput.utf16.count, length: 0)
            self.outputText.scrollRangeToVisible(range)
        }
        // Re-arm the notification for the next chunk of output
        self.outputPipe.fileHandleForReading.waitForDataInBackgroundAndNotify()
    }
}
If you do not want to change existing code, you can:
1 - Redirect the output of print to a known file.
See the instructions here: How to redirect the nslog output to file instead of console (answer 4, redirecting)
2 - Monitor the file for changes and read them in to display in your textView.
You cannot do that.
You can use a logger that allows you to add a custom log destination.
You will have to change all print/NSLog calls to, e.g., Log.verbose(message).
I'm using SwiftyBeaver. It lets you define a custom destination, which you can later read and present in a text field.
You can totally do that! Check this out: https://stackoverflow.com/a/13303081/1491675
Basically you create an output file and pipe the stderr output to that file. Then to display in your textView, just read the file and populate your textView.
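A minimal sketch of the file-based approach, assuming a log file in the temporary directory is acceptable. `redirectStandardError(toFileNamed:)` and `readLog(at:)` are hypothetical helper names. The sketch redirects stderr, which is where NSLog writes; to capture `print` output as well, the same `freopen` call would be needed for `stdout`.

```swift
import Foundation

/// Hypothetical helper: reopens stderr so that everything written to it
/// is appended to a file in the temporary directory. Returns the file URL.
@discardableResult
func redirectStandardError(toFileNamed name: String) -> URL {
    let url = FileManager.default.temporaryDirectory.appendingPathComponent(name)
    _ = freopen(url.path, "a+", stderr)
    return url
}

/// Hypothetical helper: flushes stderr and returns the file's current contents.
func readLog(at url: URL) -> String {
    fflush(stderr)
    return (try? String(contentsOf: url, encoding: .utf8)) ?? ""
}
```

After calling `let logURL = redirectStandardError(toFileNamed: "console.log")` at launch, the text view can be filled with `textView.text = readLog(at: logURL)` whenever it needs refreshing. Once stderr is redirected, that output no longer appears in the Xcode console.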
So I am using the CZWeatherKit library to grab weather data from forecast.io.
When I get results, it sends a climacon UInt8 char, which should match to an icon if the climacon font is installed. I did that but it only shows the char, not the actual icon. Here is the code, it prints a quote i.e. " which is the correct mapping to ClimaconCloudSun, but the icon doesn't show. I followed these instructions to install the climacons.ttf font
request.sendWithCompletion { (data, error) -> Void in
if let error = error {
print(error)
} else if let weather = data {
let forecast = weather.dailyForecasts.first as! CZWeatherForecastCondition
dispatch_async(dispatch_get_main_queue(), { () -> Void in
// I get back good results, this part works
let avgTempFloat = (forecast.highTemperature.f + forecast.lowTemperature.f) / 2
let avgTemp = NSDecimalNumber(float: avgTempFloat).decimalNumberByRoundingAccordingToBehavior(rounder)
self.temperatureLabel.text = String(avgTemp)
self.weatherLabel.text = forecast.summary
// this part does not work, it has the right char, but does not display icon
// I tried setting self.climaconLabel.font = UIFont(name: "Climacons-Font", size: 30) both in IB and programmatically
let climaChar = forecast.climacon.rawValue
let climaString = NSString(format: "%c", climaChar)
self.climaconLabel.text = String(climaString)
})
}
}
I solved the exact same issue; the problem was the font file. Replace your current font with the one provided here: https://github.com/comyar/Sol/blob/master/Sol/Sol/Resources/Fonts/Climacons.ttf
You've probably moved on from this problem by now, but I'll leave this here for future use.
You need to call setNeedsLayout on the label after you change the title text to the desired value, and the label will change to the corresponding icon.
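For the character conversion itself, here is a small sketch that avoids `NSString(format:)`. `climaconString(from:)` is a hypothetical helper name; it simply turns the raw `UInt8` value into a one-character string, which the Climacons font then renders as an icon.

```swift
/// Hypothetical helper: converts a climacon raw value (a single byte)
/// into a one-character String.
func climaconString(from rawValue: UInt8) -> String {
    return String(Character(UnicodeScalar(rawValue)))
}
```

A raw value of 34 yields the double-quote character mentioned in the question. Following the answer above, usage could look like `climaconLabel.text = climaconString(from: forecast.climacon.rawValue)` followed by `climaconLabel.setNeedsLayout()`.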
The documentation is not really clear to me. So far I reckon I need to set up a CGPDFOperatorTable and then create a CGPDFContentStreamCreateWithPage and CGPDFScannerCreate per PDF page.
The documentation refers to setting up callbacks, but it's unclear to me how. How do I actually obtain the content from a page?
This is my code so far.
let pdfURL = NSBundle.mainBundle().URLForResource("titleofdocument", withExtension: "pdf")
// Create pdf document
let pdfDoc = CGPDFDocumentCreateWithURL(pdfURL)
// Number of pages in this PDF
let numberOfPages = CGPDFDocumentGetNumberOfPages(pdfDoc) as Int
if numberOfPages <= 0 {
// The number of pages is zero
return
}
let myTable = CGPDFOperatorTableCreate()
// lets go through every page
for pageNr in 1...numberOfPages {
let thisPage = CGPDFDocumentGetPage(pdfDoc, pageNr)
let myContentStream = CGPDFContentStreamCreateWithPage(thisPage)
let myScanner = CGPDFScannerCreate(myContentStream, myTable, nil)
CGPDFScannerScan(myScanner)
// Search for Content here?
// ??
CGPDFScannerRelease(myScanner)
CGPDFContentStreamRelease(myContentStream)
}
// Release Table
CGPDFOperatorTableRelease(myTable)
It's a similar question to: PDF Parsing with SWIFT but has no answers yet.
Here is an example of the callbacks implemented in Swift:
let operatorTableRef = CGPDFOperatorTableCreate()
CGPDFOperatorTableSetCallback(operatorTableRef, "BT") { (scanner, info) in
print("Begin text object")
}
CGPDFOperatorTableSetCallback(operatorTableRef, "ET") { (scanner, info) in
print("End text object")
}
CGPDFOperatorTableSetCallback(operatorTableRef, "Tf") { (scanner, info) in
print("Select font")
}
CGPDFOperatorTableSetCallback(operatorTableRef, "Tj") { (scanner, info) in
print("Show text")
}
CGPDFOperatorTableSetCallback(operatorTableRef, "TJ") { (scanner, info) in
print("Show text, allowing individual glyph positioning")
}
let numPages = CGPDFDocumentGetNumberOfPages(pdfDocument)
for pageNum in 1...numPages {
let page = CGPDFDocumentGetPage(pdfDocument, pageNum)
let stream = CGPDFContentStreamCreateWithPage(page)
let scanner = CGPDFScannerCreate(stream, operatorTableRef, nil)
CGPDFScannerScan(scanner)
CGPDFScannerRelease(scanner)
CGPDFContentStreamRelease(stream)
}
You've actually specified exactly how to do it, all you need to do is put it together and try until it works.
First of all, you need to set up a table with callbacks, as you state yourself at the beginning of your question (all code is in Objective-C, not Swift):
CGPDFOperatorTableRef operatorTable = CGPDFOperatorTableCreate();
CGPDFOperatorTableSetCallback(operatorTable, "q", &op_q);
CGPDFOperatorTableSetCallback(operatorTable, "Q", &op_Q);
This table contains a list of the PDF operators you want to get called for and associates a callback with them. Those callbacks are simply functions you define elsewhere:
static void op_q(CGPDFScannerRef s, void *info) {
// Do whatever you have to do in here
// info is whatever you passed to CGPDFScannerCreate
}
static void op_Q(CGPDFScannerRef s, void *info) {
// Do whatever you have to do in here
// info is whatever you passed to CGPDFScannerCreate
}
And then you create the scanner and get it going, while passing it the information you just defined.
// Passing "self" is just an example, you can pass whatever you want and it will be provided to your callback whenever it is called by the scanner.
CGPDFScannerRef contentStreamScanner = CGPDFScannerCreate(contentStream, operatorTable, self);
CGPDFScannerScan(contentStreamScanner);
If you want to see a complete example with sourcecode on how to find and process images, check this website.
To understand why a parser works this way, you need to read the PDF specification a bit better. A PDF file contains something close to printing instructions. Such as "move to this coordinate, print this character, move there, change the color, print the character number 23 from the font #23", etc.
The parser gives you a callback for each instruction, with the possibility of retrieving the instruction's parameters. That's all.
So, in order to get the content from a file, you need to rebuild its state manually. Which means, recompute the frames for all characters, and try to reverse-engineer the page layout. This is clearly not an easy task, and that's why people have created libraries to do so.
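To illustrate what "rebuilding the state" means, here is a toy sketch in plain Swift. It does not use the Core Graphics scanner at all: `Instruction`, `PlacedText`, and `placeText(_:)` are hypothetical, heavily simplified stand-ins for PDF operators such as Td, Tf, and Tj. Real PDFs also involve transformation matrices, glyph widths, and font encodings, which this ignores.

```swift
/// Hypothetical, simplified stand-ins for PDF content-stream operators.
enum Instruction {
    case moveText(dx: Double, dy: Double)        // loosely like "Td"
    case selectFont(name: String, size: Double)  // loosely like "Tf"
    case showText(String)                        // loosely like "Tj"
}

/// A piece of text together with the state that was current when it was shown.
struct PlacedText: Equatable {
    let string: String
    let x: Double
    let y: Double
    let fontSize: Double
}

/// Replays the instructions in order, tracking the current position and
/// font size, and records where each string would be drawn.
func placeText(_ instructions: [Instruction]) -> [PlacedText] {
    var x = 0.0
    var y = 0.0
    var fontSize = 0.0
    var placed: [PlacedText] = []
    for instruction in instructions {
        switch instruction {
        case let .moveText(dx, dy):
            x += dx
            y += dy
        case let .selectFont(_, size):
            fontSize = size
        case let .showText(string):
            placed.append(PlacedText(string: string, x: x, y: y, fontSize: fontSize))
        }
    }
    return placed
}
```

Each scanner callback would play the role of one `case` here: it updates the tracked state or records text at the current position. Grouping the recorded pieces into lines and paragraphs is the layout reverse-engineering step that remains.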
You may want to have a look at PDFKitten, or PDFParser, which is a Swift port with some improvements that I made.