My basic task is to capture a part of an image and then use it. To give you an overview, I am creating an app based on OCR in which I allow the user to take a picture using the camera. However, rather than processing the entire image, I only want a selected part of it (preferably a rectangle) to be sent for processing. So, to sum it up, I want to provide an overlay, and I want the image inside that overlay to be used further rather than the entire captured image.
Now, as I understand it, AVFoundation is the tool for capturing the image, but in my app I have used UIImagePicker. I am totally confused since I am a newbie and not sure how to proceed. I appreciate any help. Thanks again.
There is a fine open source library for OCR on iOS:
https://github.com/nolanbrown/Tesseract-iPhone-Demo
This will work best if the image resolution is 150 * 150. For more information on other libraries you can also refer to the SO question: OCR Lib
Related
I'm trying to write an app that recognizes a logo saved in the app bundle and read as a UIImage. I searched before asking this question, and the only free solution seems to be OpenCV. I tried it in a demo I downloaded from toptal_logo_detector. The demo works and I can find my logo wherever I place it. However, the camera is very slow, too slow to use in a real app. Maybe there's a way to optimize it, but my question is a different one.
I have to recognize a vector logo (always the same logo) centered on a white background, something like this wifi logo:
Is the complex OpenCV my only solution? Is there a free and simpler way to get a yes/no result: yes, your logo is here, or no, it isn't?
I found this tutorial (with project download) that does what you want using OpenCV
I am currently working on an iOS app that can take a picture programmatically using AVFoundation libraries like AVCaptureDevice through a custom button.
The new requirement is that the camera should automatically take a picture when the camera session detects something specific. For example, if the camera is open, and I line up an apple to fill a certain circle part of the capture screen, it should take the picture automatically. We can see this auto capture feature in some banking apps when you submit a mobile check deposit.
Does anyone know of existing libraries (open-source or proprietary) that can analyze images in real time while a user is taking a picture?
The first thing you are going to need to do is decide how you want to detect the apple. You can do this using shape detection, image recognition, or various other methods. This is important because you need to know the approach you want to take before you can identify the best way to implement it.
Once you know how you are going to identify the apple, the easiest way to do real-time image processing like this would be to use an existing augmented reality SDK. For example:
http://www.wikitude.com/products/wikitude-sdk/
http://artoolkit.org/
https://developer.vuforia.com/
If you are feeling really adventurous you could roll your own using AForge or a similar library. I have taken this approach in the past for basic shape detection projects.
Edit
The reason I suggest using an existing AR SDK is because generally they provide a lot of the glue between the camera feed and their API for you and it takes a lot of leg work out of the equation. Even though you won't be using any of the actual "augmentation" part of their SDKs, you can still take advantage of the detection part.
No matter what approach you take, you can think about it in the simplest terms: looking at a picture and figuring out whether the item you want is in that picture. How do you decide? In most cases you look for a specific shape or pattern.
I am trying to make an image recognition app with OpenCV. I want to implement something like this, but I don't know how to go about it. Can anyone tell me where I should begin? I have downloaded OpenCV for iOS from here.
I have a hard copy of an image as an example, which I want to scan through the camera, and I have imported the images (markers) into the project. When I scan the image through the camera, it should overlay the markers on the image, and when I tap/select a marker it should show the info for that marker.
Here is my image :
It's just an example I have chosen (square, circle and triangle as markers).
So when the image is scanned, the markers should come up as an overlay, and on tapping a marker I should get its name (if the overlay image over the circle named "Air" is tapped, it should show "Air" in an alert; if the square named "Tiger" is tapped, it should say "Tiger").
My problem is that the images follow much the same pattern, but the result is different for every part, so I don't know how I should approach this.
Can anyone help me out by suggesting an idea? If anyone has done something like this, please tell me how I should implement it.
I have to start from scratch, so any help is appreciated.
Can this be achieved using OpenCV, or do I have to use another SDK such as Vuforia or Layar?
Maybe you should search a little bit before asking for help...
Anyway, the shapes you want to find do not seem to change (scale, rotation), so you can look at the template matching methods implemented in OpenCV (see Tutorial OpenCV).
If the shapes are changing, you should look at more powerful methods such as SIFT or SURF. Both are already implemented in OpenCV (the link from aishack is a tutorial to re-implement SIFT, you can find in the same website a tutorial to use the OpenCV method).
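To give an idea of what template matching does, here is a minimal pure-Python sketch using the sum of squared differences on small grayscale grids. It is only an illustration of the principle, not OpenCV's implementation; the function name `match_template` and the nested-list image format are assumptions made for this example. In OpenCV itself the equivalent would be its `matchTemplate` function with the `TM_SQDIFF` method.

```python
def match_template(image, template):
    """Slide the template over the image and return ((row, col), score).

    The score is the sum of squared differences, so lower is better and
    0 means a pixel-exact match. Both arguments are lists of rows of numbers.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    if th > ih or tw > iw:
        raise ValueError("template must not be larger than the image")

    best_pos, best_score = None, None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = sum(
                (image[r + dr][c + dc] - template[dr][dc]) ** 2
                for dr in range(th)
                for dc in range(tw)
            )
            if best_score is None or score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score
```

Because the template is compared pixel by pixel at a fixed size and orientation, this only works while the shapes keep the same scale and rotation, which is exactly why feature-based methods are suggested for the changing case.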
Excuse me, my English is poor, but I will try to describe my question clearly.
First, I want to work with (read, zoom, move, zoom with a rectangle) images in formats like jpg, tiff and img.
I have tried to do this with GDAL, using rasterio to zoom and move, but the result is quite strange: it is slower than doing it with GDI+. I have asked other people, and the explanation I got was that rasterio may read the image directly from the hard disk, whereas GDI+ works in RAM. It may also be because the images I work with are small, smaller than 4000*3000.
So now I work with the images in GDI+, but I wonder whether I can do the same things in DirectX.
I mean using DirectX instead of GDI+, because I think it would be faster.
Since I can only use C#, I hope someone can give me some suggestions for Managed DirectX or XNA.
thx~~~
There is already a fast image viewer called TuiView that is simple to install and use.
Documentation is here: http://tuiview.org
If I understood your question, you are trying to build a simple image viewer.
If so you can easily do it with XNA and it will work very fast.
All you need to do is to load the image and display it to the screen, and the pan and zoom are also very simple.
Read this tutorial :
http://rbwhitaker.wikidot.com/spritebatch-basics
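The pan and zoom part is mostly coordinate arithmetic, independent of the drawing API. Below is a minimal sketch in Python, assuming the viewer keeps one `offset` (screen position of the image's top-left corner) and one uniform `scale`; all function names here are hypothetical and not part of XNA or any other framework.

```python
def image_to_screen(point, offset, scale):
    """Where an image pixel lands on screen: screen = image * scale + offset."""
    return (point[0] * scale + offset[0], point[1] * scale + offset[1])


def screen_to_image(point, offset, scale):
    """Inverse mapping, e.g. to find which pixel is under the mouse."""
    return ((point[0] - offset[0]) / scale, (point[1] - offset[1]) / scale)


def pan(offset, dx, dy):
    """Move the image by a drag distance given in screen pixels."""
    return (offset[0] + dx, offset[1] + dy)


def zoom_about(offset, scale, anchor, factor):
    """Zoom by `factor` while keeping the screen point `anchor` fixed."""
    new_offset = (
        anchor[0] - (anchor[0] - offset[0]) * factor,
        anchor[1] - (anchor[1] - offset[1]) * factor,
    )
    return new_offset, scale * factor


def zoom_to_rect(rect, view_size):
    """Fit an image-space rectangle (x, y, w, h) into the view, centered."""
    x, y, w, h = rect
    vw, vh = view_size
    scale = min(vw / w, vh / h)
    offset = ((vw - w * scale) / 2 - x * scale, (vh - h * scale) / 2 - y * scale)
    return offset, scale
```

The resulting offset and scale would then be handed to whatever draw call the framework provides each frame. Keeping the state down to these two values makes "zoom with rectangle" just another way of computing them.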
Is there any software or SDK out there that can take a picture of a driver's license and recognize it? I'm thinking of something like how bar code scanners work, where you place the bar code in front of the camera within a specified box and it takes a picture of it.
I don't know of any framework that integrates this whole process, but have you considered doing it manually in a few steps? The whole process may look like this:
Add box overlay to UIImagePicker camera view
Capture the picture
Use some OCR tool on it (for example see this: Getting text from image on ios (image processing))
Take license number, decode it and do whatever you want
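The step between capturing the picture and running OCR is cutting out the part that was inside the box. Here is a minimal, language-neutral sketch of that arithmetic in Python. It assumes the preview shows the whole photo stretched to the view (no aspect-fill cropping) and ignores image orientation, both of which you would need to check in a real app; the function names and the nested-list pixel format are made up for this example.

```python
def overlay_to_image_rect(overlay, view_size, image_size):
    """Convert an overlay box from view coordinates to image pixel coordinates.

    overlay is (x, y, width, height) in view points; view_size and image_size
    are (width, height). Assumes the preview shows the full image.
    """
    ox, oy, ow, oh = overlay
    sx = image_size[0] / view_size[0]  # horizontal pixels per view point
    sy = image_size[1] / view_size[1]  # vertical pixels per view point

    # Clamp to the image bounds so the crop never reads outside the photo.
    left = max(0, round(ox * sx))
    top = max(0, round(oy * sy))
    right = min(image_size[0], round((ox + ow) * sx))
    bottom = min(image_size[1], round((oy + oh) * sy))
    return left, top, right - left, bottom - top


def crop(pixels, rect):
    """Cut rect (x, y, width, height) out of an image stored as a list of rows."""
    x, y, w, h = rect
    return [row[x:x + w] for row in pixels[y:y + h]]
```

For example, a 240 x 80 box at (40, 100) in a 320 x 480 preview maps to a 480 x 160 region at (80, 200) in a 640 x 960 photo. Only that region would then be passed to the OCR tool.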