Unity3D - OCR Number Recognition - OpenCV

Our initial use case called for writing an application in Unity3D (written solely in C# and deployed to both iOS and Android simultaneously) that allowed a mobile phone user to hold their camera up to the title of a magazine article, use OCR to read the title, and then process that title on the backend to fetch related stories. Vuforia was far and away the best fit for this use case because of its fast native character recognition.
After the initial application was demoed a bit, more potential uses came up. Any use case that needed only A-Z characters recognized was easy in Vuforia, but the second it called for number recognition we had to look elsewhere, because Vuforia does not support number recognition (now or in the near future).
Attempted Workarounds:
Google Cloud Vision - works great, but it is not native, and camera images are sometimes quite large, so it is not nearly as fast as we require. We even thought about using the OpenCV Unity asset to identify the numbers and then send multiple much smaller API calls, but that is still not native and adds an extra step.
Following instructions from SO to use a .NET wrapper for Tesseract - this would probably work great, but after building and trying to bring the external DLLs into Unity I receive a ".NET Assembly Not Found" error (most likely a mismatch with the .NET version the DLLs were compiled against).
Installing Tesseract from source on a server and then creating our own API - honestly, it is unclear why we tried this when Google's works so well and is actively maintained.
Has anyone run into this same problem in Unity and ultimately found a good solution?

Vuforia by itself doesn't provide any system to detect numbers, just letters. To solve this problem I followed the strategy below (it only works for numbers located near a known target image):
1. Recognize the image.
2. Capture a screenshot just after the target image is recognized (this screenshot must contain the numbers).
3. Send the screenshot to an OCR web service and get the response.
4. Extract the numbers from the response.
5. Use these numbers to do whatever you need and show the AR info.
This approach solves the problem, but it doesn't work like a charm. Its success depends on the quality of the screenshot and on the OCR service.
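The number-extraction step above is mostly string work once the OCR response comes back. A minimal, language-neutral sketch in Python (a Unity app would do the same in C#; the sample OCR text is made up for illustration):

```python
import re

def extract_numbers(ocr_text):
    """Pull all integer/decimal tokens out of raw OCR output."""
    # OCR output is noisy free text; match digit runs with an
    # optional decimal part and convert each token to a number.
    return [float(tok) if "." in tok else int(tok)
            for tok in re.findall(r"\d+(?:\.\d+)?", ocr_text)]

print(extract_numbers("Lot No. 42\nPrice: 19.95 USD"))  # → [42, 19.95]
```

In practice you would also validate the extracted values (expected count, plausible ranges) before driving any AR content from them, since OCR confusions like O/0 and l/1 are common.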

Related

How to write a library for multiple devices with similar versions of an API

I am trying to develop a library of shared code for my company.
We are developing on a technology by SICK called AppSpace, which is designed for machine vision. AppSpace is a stand-alone ecosystem covering a variety of programmable SICK devices (e.g. programmable cameras, LiDAR sensors) and an IDE with which these can be programmed. Programs are written in Lua, using HTML/CSS for the front end.
AppSpace provides a Lua API for these devices.
In my company, a few of us write applications, and it is therefore important that we create a library of shared code to avoid redundant, rewritten code.
However, each firmware version of each device has a corresponding API version. That is to say, on a given device the API can change between firmware versions, and API versions also differ across devices. Two devices will have two different sets of API functions available to them, and functions they share in common may have slightly different implementations.
I am at a loss as to how such a situation can be properly managed.
I suppose the most "manual" route would be for each device to have its own partial copy of the library, and to manually update each device's library to the same behavior each time a change is made, ensuring that each device conforms to its API. This seems like bad practice, as it is very error-prone - the libraries would inevitably fall out of sync.
Another option might be to have a master library and to scrape the API documentation for each device, then build a library manager which parses the Lua code from the library and identifies missing functions for each device. This seems impractical and probably just as error-prone.
What would be the best way to develop and maintain a library of shared code which can be run on multiple devices, if it is even possible?
I would like to answer this and review some of the topics discussed.
First and foremost: functions that devices share in common are implemented differently in the compiled code on the respective device (i.e. PC, 2D camera, 3D camera, LiDAR, etc.) while the functionality stays the same across all of them. This way the code can be readily ported from one device to another. That is the principle of the SICK AppEngine that runs on all SICK AppSpace devices, as well as on 3rd-party hardware with AppEngine installed.
The APIs embedded in the devices are called CROWN (Common Reusable Objects Wired by Name) components and can be tested against nil to determine whether they are exposed. Here's an example with a CROWN called 'IMAGE'; the guarded code runs only when it exists:
if IMAGE then
  -- the IMAGE API is exposed on this device; image code goes here
end
SICK also has an AppPool that you can upload your source code to; it will test for all the required CROWNs and return a list of all SICK devices that can run it properly.
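The nil test above is plain runtime feature detection, and the pattern generalizes beyond Lua: probe for the capability, then branch, rather than keeping per-device copies of the library. A hedged Python analogue (the device object and API names here are invented purely for illustration):

```python
def has_api(namespace, name):
    """Return True if the named API object is exposed in this environment."""
    # Mirrors the Lua idiom `if IMAGE then ... end`: absent APIs are nil/None.
    return getattr(namespace, name, None) is not None

class FakeDevice:
    """Stand-in for a device environment: IMAGE exists, LIDAR does not."""
    IMAGE = object()

device = FakeDevice()
if has_api(device, "IMAGE"):
    print("IMAGE API available; run image-specific code")
if not has_api(device, "LIDAR"):
    print("LIDAR API missing; skip LiDAR-specific code")
```

The shared library then ships as one codebase, with device-specific branches guarded by these probes instead of being split into per-device forks.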

Bellus3D is being end-of-lifed, is there any replacement iOS Solution for 3D face scanning?

I work on an application for custom fit eyewear, and we've been using Bellus3D's iOS SDK for getting facial geometry, including landmarks like pupils.
Bellus3D has decided to wind its business down by the end of 2022, and I'm looking for a suitable replacement framework for our application. Bellus was great because it produced reliable results in exchange for a pretty simple user experience.
I've found a few apps that also use or used Bellus, but I haven't gotten any word about what alternatives they've found that would suitably replace it.
Scandy doesn't seem to be accepting new SDK registrations.
Standard Cyborg took some tweaks but works great, and their API tokens work; however, I can't find any information about their pricing and they're not responding.
Topology Eyewear seems to have a solution, but there aren't a lot of details and they aren't responding either.
I've reached out to a few app developers that incorporated Bellus 3D, but so far all I've heard is that they're in the same situation.
Does anyone know of a working, maintained solution for 3D face scanning with cell phones (or 3D scanning in general), or of an approach to get something with decent fidelity out of ARKit?

Image recognition for text in React Native

This may be a crazy question, but I've seen it done with apps. Is there any kind of API that can be used to recognize the text within an image (the way Chase recognizes the numbers on a check)? Or, is there an API that can be used to search (let's say on Google) for information based off an image? An example would be if I took a picture of a business logo, Google would search for a business listing that fits that logo.
I know it's a crazy question, but I want to know if it can even be done. If it can, can it be used with React Native? Thanks!
The React Native Tesseract package only supports Android. iOS support is pending, but there is no timeline for when it will be done.
Tesseract.js, the pure JavaScript implementation of Tesseract, would offer cross-platform support in React Native:
http://tesseract.projectnaptha.com/

How to read a video file and split it into frames on Android

My goal is as follows: I have to read in a video that is stored on the SD card, process it frame by frame (doing image processing on each frame), and then store the result in a new file on the SD card.
At first I wanted to use OpenCV for Android, but I did not seem to be able to read the video, as described here.
I am guessing you already know that doing this on a mobile device, or any compute-limited device, is not ideal, simply because video manipulation is very compute-intensive, which translates to slow execution and heavy battery usage on many devices. If you have the option to do the processing on the server side, it is definitely worth considering.
Assuming that for your use case you need to do it on the mobile device, OpenCV on Android will now allow you to read in a video and access each frame - @StephenG mentions this in his answer to the question you refer to above.
In the past, functionality like this did not get ported to the Android OpenCV build, as the guidance was to use ffmpeg for frame grabbing on Android devices.
According to more recent documentation, however, this should be available for Android now using the VideoCapture class (note I have not used this myself...):
http://docs.opencv.org/java/2.4.11/org/opencv/highgui/VideoCapture.html
It is worth noting that the OpenCV Android examples are all currently based around Eclipse, and if you want to use Android Studio, getting things up and running initially can be quite tricky. The following worked for me recently, but as both Studio and OpenCV change over time, you may find you have to do some forum hunting if it does not work for you:
https://stackoverflow.com/a/35135495/334402
Taking a different approach, you can use ffmpeg itself, in a wrapper in Android, for tasks like this.
The advantage of the wrapper approach is that you can use all the usual command line syntax and there is a lot of info on the web to help you get the right parameters.
The disadvantage is that ffmpeg was not really designed to be wrapped in this way, so you do sometimes see issues. Having said that, it is a common approach now, and as long as you choose a well-used wrapper library you should at least have a good community to discuss any issues you come across. I have used this approach in a hand-crafted way in the past, but if I were doing it again I would use one of the popular examples, such as:
https://github.com/WritingMinds/ffmpeg-android-java
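Whichever wrapper library you choose, the actual work reduces to building the familiar ffmpeg argument list. A small sketch of the frame-extraction invocation (paths, frame rate, and output pattern are placeholders; on Android the wrapper library executes the equivalent command for you):

```python
import subprocess

def frame_extract_cmd(video_path, out_pattern, fps=10):
    """Build an ffmpeg argument list that dumps frames as numbered images."""
    return ["ffmpeg",
            "-i", video_path,     # input video, e.g. from the SD card
            "-vf", f"fps={fps}",  # sample frames at the requested rate
            out_pattern]          # numbered output, e.g. frame_%04d.png

cmd = frame_extract_cmd("/sdcard/input.mp4", "/sdcard/frames/frame_%04d.png")
print(" ".join(cmd))
# Run it wherever an ffmpeg binary is on the PATH:
# subprocess.run(cmd, check=True)
```

After processing the individual frames, a second ffmpeg invocation (image sequence in, video out) reassembles them into the new file.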

Fingerprint matching in mobile devices (coding in OpenCV, deployment in Android)

I intend to do my mainstream project in the image processing domain. My project's aim is to enable users with fingerprint security on mobile devices. It involves:
Reading the fingerprint from the user through the mobile phone.
Matching the fingerprint image obtained with those available in the database.
I am interested in coding my project in OpenCV and deploying to Android. I need clarification on a few of the following things:
Is this project feasible?
Is OpenCV apt for this project? (I considered MATLAB, but it cannot be ported to Android.)
Eclipse or Visual Studio - which will be more suitable (considering the deployment to Android)?
I am a beginner and need to learn OpenCV, so please guide me on how to start my project (what are the tools, reference books, IDEs, and SDKs to be used?).
Yes, for sure, but not easy.
OpenCV. Lots of stuff: http://opencv.org/
Eclipse, taking into account that you will be deploying it on Android.
Good luck!
You can easily work with OpenCV. It's not hard.
Learning the Android portion is, to me, the challenging part. If you want to use finger recognition, you could do this by capturing an image from your front camera, or by using your back camera as well (assuming you put a finger on the camera lens while holding the phone).
Now, using this feature to unlock your phone is, to me, a tough job. It involves more Android work.
To start with, I would suggest you build an app with a finger recognition algorithm. It should recognize your finger and maybe take an action, like displaying your name or something like that.
If you can do this, the rest is all Android, and how you play with Android to get this to work.
I hope this helps and gives you a very high level answer.
