Recognize Logo in a full image - machine-learning

First, you should know that I'm a beginner in this subject. I'm an embedded systems developer by background, but I have never worked with image recognition.
Let me explain my main goal:
I would like to create my own database of logos and be able to
recognize them in a larger image. A typical application would be, for
example, to build a database of Pepsi and Coca-Cola logos so that
when I take a photo of a soda bottle, it tells me whether it is one of
them or something else.
So, here is my problem:
I first wanted to use Google's Auto ML Kit. I gave it my
datasets so it could train on them. My first attempt was to
take photos of entire bottles and then compare. It was OK but not
very effective. I then tried giving it only logos, but after
training, it couldn't recognize anything in a full image of a
bottle.
I think I didn't provide enough images in the first case. But I'd prefer the second approach (providing only the logo), so that the model searches for something similar in the image.
Finally, my questions:
If you've worked with ML Kit from Google, were you able to train a
model by giving images that should be recognized in a larger image?
If yes, do you have any hints to give me?
Do you know reliable software that could help me to perform tests of this kind? I thought about Azure Machine Learning Studio from
Microsoft (since I develop on Visual Studio).
At first, I'd like to write as little code as I can, just for testing. Maybe later I could try to code my own machine learning system, but I think that's a big challenge.
I also thought about splitting my image into smaller images and sending each of them to the model, but that would be time-consuming and I need a fast response (under 2 seconds).
Thanks in advance for your answers. I don't need a complete answer with a full tutorial (Stack Overflow is not intended for that anyway ^^); some advice would already be helpful.
Have a good day!

Azure’s Custom Vision is great for this: https://www.customvision.ai
Let’s say you want to detect a Pepsi logo. Upload 70 images of products with the logo on them. Use Custom Vision to draw a box around the logo in each photo. Click “train”, and you get a TensorFlow model with code.
Look up any tutorial for it, it’s pretty incredible and really easy to use.
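For comparison, the tiling idea from the question (splitting the image into smaller images and classifying each one) can be sketched roughly as below. This is only an illustration: `tile_boxes` and its parameters are hypothetical names, not part of Custom Vision or ML Kit. A detector trained on boxed logos, as described above, should make this step unnecessary.

```python
def tile_boxes(width, height, tile, stride):
    """Return (left, top, right, bottom) boxes that cover a width x height image.

    tile is the tile edge length in pixels and stride is the step between
    tiles; a stride smaller than tile gives overlapping tiles.
    """
    def starts(size):
        # Regular grid of start offsets along one axis.
        positions = list(range(0, max(size - tile, 0) + 1, stride))
        # Add one extra tile flush with the far edge if the grid stops short.
        if positions[-1] + tile < size:
            positions.append(size - tile)
        return positions

    boxes = []
    for top in starts(height):
        for left in starts(width):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes
```

Each box would then be cropped and sent to the classifier, so the number of boxes multiplies the prediction time. That is the cost the question is worried about.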

Related

Detecting which way a car, in an image, is pointing (what is this type of problem called)

I have about 2,000 images of cars, most pointing right, but some pointing left.
I'd like to find a way of automatically tagging a car with its direction (new images will be coming in continually).
I'm struggling to get started and wondered if this kind of image detection problem has a name that may help my searches. Is object orientation detection a thing?
I'm a software developer (not doing much ML or Image stuff) and have a ton of azure and gcc resources available, but I can't find anything to solve this. Azure Cognitive Service can tell us it's a car in the picture, but doesn't tell us the direction.
Could just do with a good starting point to get going.
I should add that the images are quite clean, on white backgrounds.
Thanks to Venkata for commenting: a bad dataset was causing our issues (too many right-facing images versus left-facing ones).
Here's what we did to get it all working:
We set up a training instance and a prediction instance in Azure (using Custom Vision cognitive services in our portal).
We then used https://www.customvision.ai/ to set everything up and train the model (it's super simple).
We didn't actually need any left-facing images in the end. We took all the right-facing images we had (about 500 in the final instance) and uploaded them with the tag "Right". We then mirrored all the images with a Photoshop script and uploaded them again with a "Left" tag. It trained for about 15 minutes and we ended up with a 100% prediction score. We tested it with a load of images that weren't in the training set to confirm it was all working.
We then did the same for a ton of van/truck images. These were taken from a different angle (the cars were all side-profile shots, the vans were all front three-quarter views), so we weren't sure we'd have the same success.
Again, we flipped the images ourselves to create the left-facing images, so we only needed to source right-facing vans to create the whole model.
We ended up with a 99.8% score, which is totally acceptable for our use case. We can now detect the direction of all cars and vans, and it even handles cars in front three-quarter view and vans in profile (even though we only trained on cars in profile and vans in three-quarter view).
The Custom Vision portal gives you an API endpoint and a key. Now, when we detect a new image in our system, it goes via the API (using the custom image SDK/NuGet package in our .NET site) and we check the tags to see if it needs flipping. If it does, we flip it and save it back to disk; it's then cached so it doesn't keep hitting the API.
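The "check the tag once, then don't hit the API again" flow can be sketched as follows. This is a simplified illustration, not our production code: `predict_tag` is a hypothetical stand-in for the call to the prediction endpoint, and the in-memory dictionary is an assumption (we cache by saving the flipped file to disk).

```python
import hashlib


def cache_key(image_bytes):
    """Identify an image by the SHA-256 hash of its contents."""
    return hashlib.sha256(image_bytes).hexdigest()


def needs_flip(image_bytes, predict_tag, cache):
    """Return True if the image is tagged "Left".

    predict_tag is a hypothetical callable that wraps the prediction API and
    returns "Left" or "Right"; it is called at most once per distinct image
    because the result is stored in cache.
    """
    key = cache_key(image_bytes)
    if key not in cache:
        cache[key] = predict_tag(image_bytes)
    return cache[key] == "Left"
```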
It's pretty amazing: it took us just two days to research the options, pick a provider and implement the solution in a production platform. It's probably a simple use case for ML, but 10 years ago (or even 5) we couldn't have dreamed that things would have come along so far.
tldr; If you need to detect if an object in an image is pointing left or right, just grab a lot of right facing examples and then flip them yourself to create a well balanced model. Obviously, this relies on the object looking the same from one side to the other.
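As a rough sketch of the flipping step: we used a Photoshop script, so the code below is only an assumed equivalent, with plain nested lists standing in for real pixel data and hypothetical function names.

```python
def mirror_horizontally(image):
    """Return a left-right mirrored copy of an image given as a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def build_balanced_dataset(right_facing_images):
    """Pair every right-facing image with a mirrored copy tagged "Left".

    The result has exactly as many "Left" examples as "Right" examples.
    """
    dataset = []
    for image in right_facing_images:
        dataset.append((image, "Right"))
        dataset.append((mirror_horizontally(image), "Left"))
    return dataset
```

With a real image library you would use its own horizontal-flip function; the point is only that every "Right" example produces exactly one "Left" example.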

Photo editing app iOS

I am trying to make a photo editing application for iOS, but am not sure where to start looking. I have attached an image made in Word... that hopefully simply depicts what I am trying to achieve. It will involve manipulating individual pixels of a shape/image and masking/clipping. How should I start, and what resources are available to me other than the developer docs?
Cheers
If you are not new to programming, I would suggest a trial-and-error kind of approach. If it were me, I would follow an approach like this:
1. Figuring out what to do / what not to do
   - Do I need to develop the tech I want from scratch, or can I use some pods?
   - What are the good reads and example apps? (Try this)
2. Development approach
   - Build a photo gallery to pick images from
   - Build an edit-mode screen
   - Get a set of template overlay images
   - Figure out how to overlay them on top of each other
   - Export the final picture as one picture
The developer documentation is essential when it comes to learning new APIs, but sometimes it can be a little overwhelming. You can try reading raywenderlich.com tutorials on Core Image first to get an idea (link here) or find a book on computer graphics. It is essential to understand at least the underlying techniques to efficiently program image processing code. In many cases you'll find there is a more elegant technique than just looping on pixels and modifying one-by-one.
Then you can continue with reading on image compositing using core image for example.

How can I go forward with this image processing task?

I am totally new to image processing. I have just started my Master's thesis in the area of computer vision and machine learning. My background is informatics. My first task is to register images of fish (image registration) as they come out of the water. I have a stream of images, and I want to build a model of the fish by aligning images of a fish taken at different times. As far as I understand, I will first of all have to remove the background and water from the images so that I work on just the fish. Am I right?
Can anyone give me a brief idea of how I should proceed, or what I should read first to understand the topic? For example, should I read up on the basics of image processing, feature detection, image segmentation...? And regarding the programming language, which one would give me good libraries, forums, and other help?
I would be really grateful, if anyone can help. Thanks.

Applying various effects to images

I know how to apply two effects to images -- blurring and making them grayscale. However, I would like to expand my knowledge further and learn more things of this nature.
I decided to Google them but found out that I do not even know what they are called.
I would like to ask: How do I progress further into image processing?
Image processing is a very big area with many applications.
These applications range from medical imaging and data compression to
commercial applications like the ones you find in Photoshop.
Without knowing where you are going to apply image processing, I assume
that you want to learn for the sake of curiosity :).
Today we have lots of online courses that make learning easier.
I took an image processing course by Guillermo Sapiro on the Coursera
website that helped a lot: https://www.coursera.org/course/images .
The course has already ended, but the video lectures are also available
on YouTube: http://www.youtube.com/watch?v=GWCB3pKi2ko (this one is about histogram equalization;
you can find others in the related videos).
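To give a taste of what such a technique looks like in code, here is a minimal sketch of histogram equalization, the topic of the video linked above. It assumes integer gray values in a flat list and uses the common CDF-based remapping; `equalize` is a hypothetical name, and real libraries offer their own optimized versions.

```python
def equalize(pixels, levels=256):
    """Histogram-equalize a flat list of integer gray values in [0, levels)."""
    # Count how often each gray level occurs.
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    # Cumulative distribution: number of pixels at or below each level.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next((c for c in cdf if c > 0), 0)
    if total == cdf_min:
        # Constant (or empty) image: there is nothing to spread out.
        return list(pixels)

    # Map each level so the used levels stretch over the full output range.
    scale = (levels - 1) / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [lookup[value] for value in pixels]
```

The effect is to stretch a narrow band of gray values over the whole range, which increases contrast.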
Another source is the amazing book by Rafael Gonzalez called Digital Image Processing.
If you're looking for a website solution, this is a good guide to using the CSS filter effect: http://www.html5rocks.com/en/tutorials/filters/understanding-css/
If you're looking for something else, I think more detail on your application is needed.

multiple choice test mark reader - where to start?

I was assigned a project (in school) for automated multiple choice test scoring and I do not know where to start.
I think this is a fairly common kind of program and you probably already know about it: it takes a scanned image of the answer sheet as input and returns the results.
Everything I know about computer vision is a few examples of photo editing with OpenCV. I hope you can give me a few keywords related to the problem or maybe a couple of blog articles, documents and related libraries.
Are there any free open source programs that I can refer to?
Thanks!
Edit: Added 2 examples of the answer sheet (sorry that I cannot find a sheet in English):
I think there are basically two steps to the problem:
1. Bring the form into a normalized position.
2. Now you know where the boxes are and can inspect them by thresholding the gray values in each region.
Which methods to use for step 1 depends on your actual images and how much they vary. Do you have some example images you can upload?
Also I think it is a good idea, especially if you are a beginner, to start with some simple examples and work your way up from there by adding more and more variation.
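The thresholding step (looking at the gray values inside each known box) might be sketched like this. It assumes the form has already been normalized, that the image is a list of rows of gray values (0 = black, 255 = white), and that the box coordinates come from your own sheet layout. The function names and the threshold of 128 are illustrative assumptions.

```python
def mean_gray(image, box):
    """Average gray value inside box = (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = box
    values = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    return sum(values) / len(values)


def read_marks(image, boxes, threshold=128):
    """Return the labels of all boxes that are darker than threshold on average.

    boxes maps a label (for example "1A") to its (top, left, bottom, right) region.
    """
    return [label for label, box in boxes.items() if mean_gray(image, box) < threshold]
```

For example, with `boxes = {"A": (0, 0, 4, 4), "B": (0, 4, 4, 8)}` and only the second region filled in, `read_marks` would return `["B"]`. On real scans you would probably need to tune the threshold, or compare each box against the others in the same row.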

Resources