I need to recognize a specific logo painted on a wall using the iPhone camera. I'd like to have a kind of database of N logos that the app should be able to recognize. Could you suggest some useful libraries (paid or free) designed for this?
I would suggest the Watson Visual Recognition API for this task. You could train a custom classifier for each of the logos you want to recognize. There is a demo here: https://visual-recognition-demo.mybluemix.net/train and docs here: https://www.ibm.com/watson/developercloud/doc/visual-recognition/
Pricing info is here - https://www.ibm.com/watson/developercloud/visual-recognition.html#pricing-block (there is a free plan)
First, you should know that I'm a beginner in this subject. I'm an embedded systems developer by background, but I have never worked with image recognition.
Let me explain my main goal:
I would like to create my own database of logos and be able to
recognize them in a larger image. A typical application would be, for
example, to build a database of Pepsi and Coca-Cola logos so that,
when I take a photo of a soda bottle, it tells me whether it is one
of them or something else.
So, here is my problem:
I first wanted to use Google's Auto ML Kit. I gave it my
datasets so it could train itself on them. My first attempt was to
take photos of entire bottles and then compare. It was OK but not
very effective. I then tried giving it only logos, but after
training, it couldn't recognize anything in the full image of a
bottle.
I think I didn't provide enough images in the first case. Still, I'd prefer the second approach (providing only logos), so that the model searches for something similar within the image.
Finally, my questions:
If you've worked with ML Kit from Google, were you able to train a
model by giving images that should be recognized in a larger image?
If yes, do you have any hints to give me?
Do you know any reliable software that could help me run tests of this kind? I thought about Azure Machine Learning Studio from
Microsoft (since I develop in Visual Studio).
To start with, I'd like to write as little code as possible, just for testing. Maybe later I could try to build my own machine learning system, but I think that would be a big challenge.
I also thought about splitting my image into smaller images and sending each of them to the model, but that would be time-consuming and I need a fast response (under 2 seconds).
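The splitting idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `tile_boxes`, the tile size, and the stride are hypothetical names and values, and the call that sends each crop to a model is left out because it depends on the service used.

```python
from typing import Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, tile: int, stride: int) -> Iterator[Box]:
    """Yield crop boxes that cover a width x height image with square tiles.

    A stride smaller than the tile size makes neighbouring tiles overlap,
    so a logo cut by one tile border can still fit inside another tile.
    """
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")

    def starts(length: int) -> List[int]:
        # Image smaller than one tile: a single crop starting at 0.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile + 1, stride))
        # Add a final tile flush with the edge if the stride left a gap.
        if positions[-1] != length - tile:
            positions.append(length - tile)
        return positions

    for top in starts(height):
        for left in starts(width):
            yield (left, top, min(left + tile, width), min(top + tile, height))
```

Each box is one model call, so the number of boxes drives the latency: a 1920x1080 photo with 512-pixel tiles and a 256-pixel stride yields 28 crops, which is the cost the 2-second budget has to absorb.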
Thanks in advance for your answers. I don't need a complete answer with a full tutorial (Stack Overflow is not intended for that anyway ^^); some advice would already help.
Have a good day!
Azure’s Custom Vision is great for this: https://www.customvision.ai
Let’s say you want to detect a Pepsi logo. Upload 70 images of products with the logo on them. Use Custom Vision to draw a box around the logo in each photo. Click “train”, and you get a TensorFlow model with code.
Look up any tutorial for it; it’s pretty incredible and really easy to use.
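Once a detector like this is trained, its output still has to be turned into something the app can use. Here is a minimal Python sketch of that step. It assumes each prediction is a dict with `tagName`, `probability`, and a `boundingBox` whose `left`, `top`, `width`, and `height` are fractions of the image size; that shape is an assumption to check against the Custom Vision documentation, and `logo_boxes` is a hypothetical helper, not part of any SDK.

```python
from typing import Dict, List, Tuple

# (tag, left, top, right, bottom) with coordinates in pixels
PixelBox = Tuple[str, int, int, int, int]


def logo_boxes(predictions: List[Dict], image_width: int, image_height: int,
               min_probability: float = 0.5) -> List[PixelBox]:
    """Keep confident detections and convert normalized boxes to pixels.

    The prediction layout (tagName / probability / boundingBox) is assumed,
    not taken from the original post.
    """
    boxes = []
    for prediction in predictions:
        # Drop low-confidence detections; the threshold is a tunable guess.
        if prediction["probability"] < min_probability:
            continue
        box = prediction["boundingBox"]
        left = round(box["left"] * image_width)
        top = round(box["top"] * image_height)
        right = round((box["left"] + box["width"]) * image_width)
        bottom = round((box["top"] + box["height"]) * image_height)
        boxes.append((prediction["tagName"], left, top, right, bottom))
    return boxes
```

The 0.5 threshold is arbitrary; in practice you would tune it on a few held-out photos.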
I plan to detect an image in a newspaper and play the video relevant to it. I have seen several newspaper-reading AR apps that include this feature, but I couldn't find out how to do it. How can I do it?
I don't expect any code, but I'd like to know what steps I should follow. Thank you.
You need to browse through the available marker-based AR SDKs. Such SDKs let you define in advance the database of images you would like to detect and respond to; once any of these images is detected at runtime, you get some kind of event with data on the detected image.
Vuforia is considered a good one and it has good samples, so it is supposed to be easier to start with. You should also check out Kudan, and there are more.
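As a rough illustration of the flow described above (images registered in advance, then an event when one is detected), here is a minimal Python sketch. It does not use Vuforia's or Kudan's API: `MarkerRegistry`, `register`, `on_detected`, and `handle_detection` are hypothetical names, and the detection itself would come from whichever SDK you pick.

```python
from typing import Callable, Dict, List


class MarkerRegistry:
    """Maps marker image IDs to video URLs and notifies listeners on a hit."""

    def __init__(self) -> None:
        self._videos: Dict[str, str] = {}
        self._listeners: List[Callable[[str], None]] = []

    def register(self, marker_id: str, video_url: str) -> None:
        # Done ahead of time, once per newspaper image you want to recognize.
        self._videos[marker_id] = video_url

    def on_detected(self, listener: Callable[[str], None]) -> None:
        # The listener receives the video URL, e.g. to start playback.
        self._listeners.append(listener)

    def handle_detection(self, marker_id: str) -> bool:
        """Call this from the SDK's detection event; returns True on a match."""
        video_url = self._videos.get(marker_id)
        if video_url is None:
            return False
        for listener in self._listeners:
            listener(video_url)
        return True
```

In an app, the SDK's "image detected" callback would call `handle_detection` with the ID of the detected image, and the listener would start the video player.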
I'm developing a project in which I'm trying to create a distributed version of TensorFlow (the current open-source version is single-node), where the cluster is composed entirely of mobile devices (e.g. smartphones).
In your opinion, what is a possible application or use case where this could be useful? Can you give me some examples, please?
I know that this is not a "standard" Stack Overflow question, but I didn't know where to post it (if you know a better place to post it, please let me know). Thanks so much for your help!
http://www.google.com.hk/search?q=teonsoflow+android
TensorFlow can be used for image identification and there is an example using the camera for Android.
There could be many distributed uses for this, for example face recognition or 3D space reconstruction from 2D images.
TensorFlow can be used for a chat bot. I am working towards using it for a personal assistant. The AI on one phone could communicate with the AI on other phones.
It could use vision and GPS to 'reserve' a lane for you on the road. Intelligent, crowd-planned roads and intersections would be safer.
I am also interested in using it for distributed mobile. Please contact me with my user name at gmail or Skype.
https://boinc.berkeley.edu
I think all my answers could run on individual phones with communication between them. If you want them to act like a cluster, as @Yaroslav pointed out, there are SETI@home and other projects running in the BOINC client.
TensorFlow could be combined with a game engine. You could have a procedurally generated, AI-driven, learning augmented reality game that generates the story as multiple players interact with it. I have seen research papers for each of these components.
I want to do some research on augmented reality technology. In particular, I would like to match a 2D image with a 3D model, so that I see the 3D model when I scan the 2D image. I know that there are a lot of SDKs (like Metaio and Wikitude) and software that can do this in a mobile app. However, what I want is to do this on a website. I hope that the people who use this won't need to download a particular mobile app, but can just open a website and then scan a picture.
So, for now, I'd like to know, as the title asks: can AR be realized on a website? If yes, how can I do it, or is there any software like Metaio Creator for this? If no, why not?
Thank you to anyone willing to answer my naive question.
May I recommend our completely web-based AR & VR tool holobuilder.com by bitstars.com?
It supports 360-degree photospheres that can be enhanced with custom 3D models and then embedded directly into your website as an iframe. It also has native support for a stereoscopic view mode, and much more.
For your use case you could have a look at the lower part of this blog post where you find information and an embedded example presentation with photosphere imagery containing 3D elements:
http://heyholo.com/google-pushes-vr-great-for-tools-like-holobuilder/
If you want to start creating, I recommend the beginner's guide:
https://medium.com/@maxspeicher/the-definite-guide-to-holobuilder-3b62a54d303e
The CV feature tracking you asked about cannot yet be done in the browser alone, without an app. What you can do is render 3D elements into the camera image with correct perspective and move them using the device sensors. This should perform as well as it does in the player app.
We hope that it can somehow help you in pushing your research and we would love to read your feedback. In case of any questions please do not hesitate to ask, here or on any other contact channel!
Is it possible to use an image search engine from within an app? I have an idea to combine image search engines with pictures taken with the camera, and have the app return information about whatever is recognized in the picture.
Google Goggles, Like.com (formerly Riya, now acquired by Google), and Tineye.com are some sites that offer visual search. I'm not sure whether they offer an API.
If you want to whip one up, it is, as you would expect, no trivial task. AFAIK, there are no OOTB solutions available, especially considering your use case of taking an image and getting related information (known in the trade parlance as RST-invariant template matching, i.e. invariant to rotation, scale, and translation). You would need to plan for a significant investment of time and $.
We offer an image search engine for mobile app cameras - www.iqengines.com.