I am researching Mask R-CNN. I want to know how to pretrain on my own images (knife, sofa, baby, ...) using ResNet-50 in Mask R-CNN. I have searched GitHub for this but couldn't find it. Can anybody who knows how to handle this please help me?
Try this implementation of Mask RCNN on github here.
You can follow the Mask_RCNN GitHub repo. It has both ResNet-50 and ResNet-101 backbones. It is a beautiful implementation, I would say. The base model is from FAIR (Facebook AI Research). There is a demo file which you can check before starting your work.
If it works well, see my answer, which will help you train the model with your custom data. The answer is a bit long, but it lists all the steps.
What I personally like about this implementation:
It is easy to set up. The dependencies won't bother you much. Having a Python virtual environment works wonders.
It switches automatically between the CPU and GPU versions.
It has good support from its developers and receives commits frequently.
The code is very customisable, so if you want to make some changes, it's pretty easy. Flip some booleans, adjust some numbers up or down, and you are done!
First, you need to know that I'm a beginner in this subject. I'm an embedded systems developer by background, but I have never worked with image recognition.
Let me explain my main goal:
I would like to create my own database of logos and be able to
recognize them in a larger image. A typical application would be, for
example, to build a database of Pepsi logos and Coca-Cola logos so that,
when I take a photo of a soda bottle, it tells me whether it is one of
them or something else.
So, here is my problem:
I first wanted to use the Auto ML Kit from Google. I gave it my
datasets so it could train on them. My first attempt was to
take photos of whole bottles and then compare. It was OK but not
very effective. I then tried giving it only logos, but after
training, it couldn't recognize anything in a full image of a
bottle.
I think I didn't provide enough images in the first case. But I'd prefer the second approach (providing only logos), so that the model would search for something similar in the image.
Finally, my questions:
If you've worked with ML Kit from Google, were you able to train a
model by giving images that should be recognized in a larger image?
If yes, do you have any hints to give me?
Do you know reliable software that could help me to perform tests of this kind? I thought about Azure Machine Learning Studio from
Microsoft (since I develop on Visual Studio).
At first, I'd like to write as little code as I can, just for testing. Maybe later I could try to code my own machine learning system, but I think that's a big challenge.
I also thought that I would need to split my image into smaller images and then send each of them to the model, but that would be time-consuming and I need a fast response (under 2 seconds).
Thanks in advance for your answers. I don't need a complete answer with a full tutorial (Stack Overflow is not intended for that anyway ^^); just some advice would already be good.
Have a good day!
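The tiling idea mentioned in the question can be sketched roughly as follows. This is only an illustration of how the windows might be laid out, not part of any ML toolkit: the names `Tile` and `makeTiles` are hypothetical, and the tile size and stride are assumptions you would tune. Since each tile means one more model call, the tile count directly multiplies the response time, which is why this approach struggles with a 2-second budget.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One rectangular window inside the full image (pixel coordinates).
struct Tile {
    int x, y, width, height;
};

// Covers an imageW x imageH image with square windows of side `tile`,
// stepping by `stride` pixels. Windows that would run past the right or
// bottom edge are shifted back so that they end exactly at the edge.
std::vector<Tile> makeTiles(int imageW, int imageH, int tile, int stride) {
    assert(imageW > 0 && imageH > 0 && tile > 0 && stride > 0);
    const int w = std::min(tile, imageW);
    const int h = std::min(tile, imageH);
    std::vector<Tile> tiles;
    for (int y = 0; ; y += stride) {
        const int ty = std::min(y, imageH - h);
        for (int x = 0; ; x += stride) {
            const int tx = std::min(x, imageW - w);
            tiles.push_back({tx, ty, w, h});
            if (tx == imageW - w) break;  // reached the right edge
        }
        if (ty == imageH - h) break;      // reached the bottom edge
    }
    return tiles;
}
```

Choosing a stride smaller than the tile size makes the windows overlap, so a logo that straddles a tile border still appears whole in at least one window, at the cost of more tiles to classify.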
Azure’s Custom Vision is great for this: https://www.customvision.ai
Let’s say you want to detect a Pepsi logo. Upload 70 images of products with the logo on them. Use Custom Vision to draw a box around the logo in each photo. Click “train”, and you get a TensorFlow model with code.
Look up any tutorial for it, it’s pretty incredible and really easy to use.
I have been looking into image recognition a little, but haven't found anything useful for me yet.
My wife is a dentist and has to write her thesis, so I need to make an app that recognizes the shapes of all the teeth in a picture taken under standard conditions.
I need to find the best match against a set of predefined tooth patterns, to categorize the teeth and see which pattern fits best. I know this is a big problem without a simple solution.
Does anyone know of image recognition software that lets me supply a number of patterns and then, given an image, see which pattern fits best? Or maybe just some pointers on where to start searching and working on this problem.
Thanks!
OpenCV would be the way to go here but let me give you the facts before you start ripping your hair out.
I don't know your development experience, but although OpenCV has an iOS wrapper, you will be working with low-level C libraries. If that makes you uncomfortable, then turn back now. Furthermore, you will be writing the majority of the recognition/detection algorithms yourself, and it takes a lot of time and patience to get these to the point where they work to an extent. Additionally, don't expect the end product to be all that reliable; professional image recognition/manipulation tools take years of development by teams of experts in computer vision. No disrespect, but something that has been hacked together over a few weeks by one person will be sub-par and lacking.
Nonetheless if you want to go ahead, you can download OpenCV for iOS here:
http://docs.opencv.org/2.4/doc/tutorials/introduction/ios_install/ios_install.html
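To give a feel for what "writing the recognition yourself" means, here is a deliberately naive sketch of picking the best-fitting pattern by sum of squared differences. It assumes the photo and every pattern are already grayscale, cropped, aligned, and of identical size, which is the hard part in practice. The names `Gray`, `ssd`, and `bestPattern` are hypothetical and not OpenCV API; OpenCV's `matchTemplate` with the squared-difference method computes a comparable score.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// A grayscale image stored row by row; all compared images must share one size.
using Gray = std::vector<unsigned char>;

// Sum of squared pixel differences: 0 means identical, larger means less alike.
long long ssd(const Gray& a, const Gray& b) {
    assert(a.size() == b.size());
    long long total = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const long long d = static_cast<long long>(a[i]) - b[i];
        total += d * d;
    }
    return total;
}

// Returns the index of the pattern closest to `photo`, or -1 if there are none.
int bestPattern(const Gray& photo, const std::vector<Gray>& patterns) {
    int best = -1;
    long long bestScore = std::numeric_limits<long long>::max();
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        const long long score = ssd(photo, patterns[i]);
        if (score < bestScore) {
            bestScore = score;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

A raw pixel comparison like this is sensitive to lighting, scale, and small shifts, which is one reason the reliability warning above applies.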
I have tried to implement the ellipse fitting algorithm described in the following paper: “ElliFit: An unconstrained, non-iterative, least squares based geometric ellipse fitting method”, by Prasad, Leung, Quek. A free version can be downloaded online from http://azadproject.ir/wp-content/uploads/2014/07/2013-ElliFit-A-non-constrainednon-iterative-least-squares-based-geometric-Ellipse-Fitting-method.pdf
The authors did not provide any publicly available implementation.
I have implemented the algorithm in Mathematica. I believe I have implemented it correctly, yet it fails to find the correct fit parameters. The PDF of the experiment can be downloaded here: http://zvrba.net/downloads/ElliFit-fail-example.pdf
Did somebody else try to implement this particular algorithm and, if yes, what is the key to get it working? Is there a "bug" in the paper? Can somebody take another look at my implementation and see whether there's a bug there?
I know it's been almost a year since this question, but it seems that the authors have now provided public source code for ElliFit, both a MATLAB version and an OpenCV version.
Both are available on the author's homepage. In case the homepage goes offline for some reason, both source codes are shared on Google and are available here (MATLAB) and here (OpenCV).
At the time of writing, I have not personally tested their code, but am planning to use them for a project. I will post any updates here in the next few days.
EDIT:
I got around to testing the code sooner than I expected. I gave the OpenCV code a try. It works pretty well, as demonstrated by the image below (ignore the "almost-closed ellipses"; they are an artifact caused by something else in my code).
As you can see, it works pretty well most of the time. There are some failure cases too (the small ellipse on the spray bottle next to the cup).
I want to do a project involving computer vision, mostly object detection/identification. After some research, I keep coming back to OpenCV. But all of the tutorials are from 2008 (I guess it was big for a bit then). Apparently its Python bindings don't compile on the Mac. I'm using the C++ framework right out of Xcode, but none of the tutorials work because they're outdated, and the documentation sucks from what I can parse.
Is there a better solution for what I'm doing, and does anyone have any suggestions for learning how to use OpenCV?
Thanks
I have had similar problems getting started with OpenCV and from my experience this is actually the biggest hurdle to learning it. Here is what worked for me:
This book: "OpenCV 2 Computer Vision Application Programming Cookbook." It's the most up-to-date book and has examples on how to solve different Computer Vision problems (You can see the table of contents on Amazon with "Look Inside!"). It really helped ease me into OpenCV and get comfortable with how the library works.
Like others have said, the samples are very helpful. For things that the book skips or covers only briefly, you can usually find more detailed examples by looking through the samples. You can also find different ways of solving the same problem between the book and the samples. For example, for finding keypoints/features, the book shows an example using FAST features:
// OpenCV 2.x API: 40 is the FAST intensity threshold.
vector<KeyPoint> keypoints;
FastFeatureDetector fast(40);
fast.detect(image, keypoints);
But in the samples you will find a much more flexible way (if you want to have the option of choosing which keypoint detection algorithm to use):
// OpenCV 2.x API: the detector is chosen by name at run time.
vector<KeyPoint> keypoints;
Ptr<FeatureDetector> featureDetector = FeatureDetector::create("FAST");
featureDetector->detect(image, keypoints);
In my experience, things eventually start to click, and for more specific questions you start finding up-to-date information on blogs or right here on Stack Overflow.
Let me add a couple of things. First, I can assure you that the Python bindings to OpenCV work on a Mac. I use them every day.
Many people like OpenCV for many reasons:
The license is good, friendly to integration into commercial products, etc.
It is quite good from a technical standpoint. It gives you a reference implementation of state-of-the-art algorithms.
It tends to be quite fast compared to the alternatives (Matlab I'm looking at you).
Like everything in life, it is not perfect:
It is a good example of a software library that is a moving target.
I have a 300-line Python program that uses OpenCV, and every few
months, when a new version of OpenCV is released, I have to change it
to adapt to the new function names, calling conventions, etc. The
library does advance a lot; however, it is a pain to have to change
the same program three times per year.
It has a learning curve. Like computer vision itself, it is quite
technical and not easy to learn.
There are alternatives (with other pros and cons); MATLAB with the Image Processing Toolbox is one such example.
The simplest answer that comes to mind is to read the example code with a bit of understanding and to try out whether your ideas work. The API does change, and most of the tutorials were written for the first versions of OpenCV; it looks like nobody bothered to rewrite them. Nevertheless, the core ideas behind it are not changing. So if you find a tutorial that answers your questions but is written for the old API, just look in the documentation for the modern replacements of the functions it uses. It's not easy or quick, but it seems to work. If you use the newest version (currently 2.3), I suggest using both the 2.1 documentation and the 2.3 docs + tutorials. You should also look into the samples, which should have been installed alongside the library. They contain lots of hints about how to use certain structures, and tricks that aren't mentioned in the documentation. Finally, don't be afraid to look inside the code of the library itself (if you compiled it on your own). Unfortunately, that's the only source I know of for checking, for example, which code corresponds to which type of Mat object.
Is it really the case that no one ever needs a histogram in Delphi?
Google gave me a bunch of half-baked code snippets. But that means that each time you need one, you have to reinvent yet another ad hoc wheel.
Torry mostly pointed me to some very expensive closed-source math, statistics, or financial packages that have histograms as a by-product. But they are very expensive, and since you have no source code, each time you install an update to the IDE/RTL/VCL you're probably screwed until the vendor releases updated packages (soon? ever?), assuming the vendor still exists.
S.O. told me nothing, nil.
As for what I found...
Mitov.com provides some histograms in PlotLab, which is said to be free for non-commercial use. Alas, it is again closed-source, and if the histogram (quite fancy, let's admit) is the only thing I need from it, why pay the whole price?
One more example http://DSpatial.sf.net
Just a few years ago I used it in Delphi 5, but even then I felt the author was losing interest in the project. I made a few enhancements and fixed some bugs; he merged them, and that was all. The component was not very useful and lacked features, yet it was better than nothing. Now the project seems to be completely dead. Good old days, etc. But I do not want them back :-)
And Stack Overflow seemingly carries not a single question about it. But maybe no one bothered to create a topic after a search found nothing? I mean, Delphi was created for database access, histograms are one of the basic ways to visualize data, and no one comes across them? Something with a nice style and a rich mouse tooltip, like the HTML/CSS/JS one on http://www.moskva.fm/stations/FM_95.2 ?
Or is this too domain-specific to ever allow a good abstraction?
TChart is a control that ships with most versions of Delphi. TChart can be used to make histograms (bar charts) in style. The following give you some ideas about how to use it: http://www.digitalcoding.com/tutorials/delphi/Simple-steps-to-create-Delphi-chart.html and http://delphi.about.com/od/adptips2006/qt/chart_selectbar.htm .
If you need something with code, look at the pages at delphiforfun.org/programs/oscilloscope.htm . These are not controls. The oscilloscope article has a histogram with source. Some of the other projects on the site have other histogram graphs with source: not elegant, but useful and free. Use them as a template to make your own control.
The link at http://delphiforfun.org/programs/Math_Topics/probability_distributions.htm shows how to make your own statistics displays with "histograms." This example makes use of TChart.
Here is some more stuff to try I found looking at my resource file:
http://wiki.lazarus.freepascal.org/TAChart, http://members.home.nl/mvanwesten/en_lazarus.html , http://www.martinole.org/TAChart.html ... Some of these are GPL components that supposedly work with some versions of Delphi. Perhaps this is your lucky day, as there is some source code. The first and third listed will probably work reasonably well for histograms. You may have to write your own statistics algorithms.
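If you do end up writing the statistics part yourself, the binning step behind a histogram is small. The following is a minimal sketch, written in C++ like the other snippets on this page rather than Delphi, and not taken from any of the components above. The function name `histogram` and its parameters are hypothetical, and it assumes equal-width bins over a range you choose. The resulting counts are what you would feed into a bar series of a charting control.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many values fall into each of `bins` equal-width intervals
// covering [lo, hi]. Values outside the range are ignored; a value equal
// to `hi` is counted in the last bin.
std::vector<int> histogram(const std::vector<double>& values,
                           double lo, double hi, int bins) {
    assert(bins > 0 && hi > lo);
    std::vector<int> counts(static_cast<std::size_t>(bins), 0);
    const double width = (hi - lo) / bins;
    for (double v : values) {
        if (v < lo || v > hi) continue;
        int i = static_cast<int>((v - lo) / width);
        if (i >= bins) i = bins - 1;  // v == hi, or rounding at the top edge
        ++counts[static_cast<std::size_t>(i)];
    }
    return counts;
}
```

For example, calling it with the range 0 to 10 and 5 bins gives bins that are each 2 units wide; the same loop translates almost line for line into Object Pascal.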
Found this thread while doing some searching. The ImageEn component suite has a THistogramBox component. It's NOT the prettiest thing in the world, but it's the only one I've found so far.
http://www.imageen.com
I came across a histogram example in a GDI+ package available for download from CodeCentral. I don't know if it will do what you need, but when I saw it I remembered your SO question.
HTH.
If you were using FireMonkey, you could just create a row of TRectangles. They can be made unclickable by turning HitTest off. Or is that too easy and straightforward?