OpenCV and Computer Vision, where do we stand now? - opencv

I want to do a project involving computer vision, mostly object detection/identification. After some research, I keep coming back to OpenCV, but all of the tutorials are from 2008 (I guess it was big for a while back then). Apparently the Python bindings don't compile on the Mac. I'm using the C++ framework right out of Xcode, but none of the tutorials work because they're outdated, and the documentation sucks from what I can parse.
Is there a better solution for what I'm doing, and does anyone have any suggestions for learning how to use OpenCV?
Thanks

I have had similar problems getting started with OpenCV and from my experience this is actually the biggest hurdle to learning it. Here is what worked for me:
This book: "OpenCV 2 Computer Vision Application Programming Cookbook." It's the most up-to-date book and has examples on how to solve different Computer Vision problems (You can see the table of contents on Amazon with "Look Inside!"). It really helped ease me into OpenCV and get comfortable with how the library works.
Like others have said, the samples are very helpful. For things that the book skips or covers only briefly, you can usually find more detailed examples by looking through the samples. You can also find different ways of solving the same problem between the book and the samples. For example, for finding keypoints/features, the book shows an example using FAST features:
vector<KeyPoint> keypoints;
// 40 is the intensity threshold used by the FAST corner test
FastFeatureDetector fast(40);
fast.detect(image, keypoints);
But in the samples you will find a much more flexible way (if you want to have the option of choosing which keypoint detection algorithm to use):
vector<KeyPoint> keypoints;
// Select the detector by name at run time
Ptr<FeatureDetector> featureDetector = FeatureDetector::create("FAST");
featureDetector->detect(image, keypoints);
In my experience, things eventually start to click, and for more specific questions you start finding up-to-date information on blogs or right here on StackOverflow.
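The second snippet above picks the detector by name at run time. For anyone unfamiliar with that pattern, here is a minimal sketch of a name-keyed factory in plain C++. Every type and name in it (Detector, BrightDetector, DarkDetector, createDetector, the "BRIGHT"/"DARK" keys) is hypothetical and only illustrates the idea; it is not how OpenCV implements FeatureDetector::create.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical types for illustration only; these are not OpenCV classes.
struct Point { int x, y; };
using Image = std::vector<std::vector<int>>;  // rows of grayscale values

// Common interface: callers depend on this, not on a concrete detector.
struct Detector {
    virtual ~Detector() = default;
    virtual std::vector<Point> detect(const Image& image) const = 0;
};

// Shared helper: collect every pixel for which `keep` returns true.
static std::vector<Point> collect(const Image& image,
                                  const std::function<bool(int)>& keep) {
    std::vector<Point> out;
    for (std::size_t y = 0; y < image.size(); ++y)
        for (std::size_t x = 0; x < image[y].size(); ++x)
            if (keep(image[y][x]))
                out.push_back({static_cast<int>(x), static_cast<int>(y)});
    return out;
}

// Toy detector: reports pixels brighter than 200.
struct BrightDetector : Detector {
    std::vector<Point> detect(const Image& image) const override {
        return collect(image, [](int v) { return v > 200; });
    }
};

// Toy detector: reports pixels darker than 50.
struct DarkDetector : Detector {
    std::vector<Point> detect(const Image& image) const override {
        return collect(image, [](int v) { return v < 50; });
    }
};

// Factory: maps a name to a constructor, so the algorithm can be chosen
// at run time (from a config file or a command-line flag, for example).
std::unique_ptr<Detector> createDetector(const std::string& name) {
    using Maker = std::function<std::unique_ptr<Detector>()>;
    static const std::map<std::string, Maker> registry = {
        {"BRIGHT", [] { return std::unique_ptr<Detector>(new BrightDetector); }},
        {"DARK", [] { return std::unique_ptr<Detector>(new DarkDetector); }},
    };
    const auto it = registry.find(name);
    if (it == registry.end())
        throw std::invalid_argument("unknown detector: " + name);
    return it->second();
}
```

Calling code would look like createDetector("BRIGHT")->detect(image); swapping the algorithm then means changing one string rather than the type of a variable.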

Let me add a couple of things. First, I can assure you that the Python bindings to OpenCV work on a Mac. I use them every day.
Many people like OpenCV for many reasons:
The license is good, friendly to integration into commercial products, etc.
It is quite good from a technical standpoint. It gives you a reference implementation of state-of-the-art algorithms.
It tends to be quite fast compared to the alternatives (MATLAB, I'm looking at you).
Like everything in life, it is not perfect:
It is a good example of a software library that is a moving target. I have a 300-line Python program that uses OpenCV, and every few months, when a new version of OpenCV is released, I have to change it to adapt to the new function names, calling conventions, etc. The library does advance, a lot, but it is a pain to have to change the same program three times per year.
It has a learning curve. Like computer vision itself, it is quite technical and not easy to learn.
There are alternatives, with other pros and cons. MATLAB with the Image Processing Toolbox is one such example.

The simplest answer that comes to mind is to read the example code until you understand it, and then to try out whether your own ideas work. The API does change, most of the tutorials were written for the first versions of OpenCV, and it looks like nobody bothered to rewrite them. Nevertheless, the core ideas behind it do not change. So if you find a tutorial that answers your question but is written against the old API, just look in the documentation for the modern replacements of the functions it uses. It's not easy or quick, but it seems to work. If you use the newest version (currently 2.3), I suggest using both the 2.1 documentation and the 2.3 docs and tutorials. You should also look at the samples, which should have been installed alongside the library. They contain lots of hints about how to use certain structures, and tricks that aren't mentioned in the documentation. Finally, don't be afraid to look inside the code of the library itself (if you compiled it on your own). Unfortunately, that's the only source I know of for checking, for example, which code corresponds to which type of Mat object.
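On the last point, the integer returned by Mat::type() can be decoded by hand. The sketch below assumes the encoding used by the 2.x headers as I understand it (the low 3 bits hold the depth, the remaining bits hold the channel count minus one); matTypeName is a hypothetical helper, not part of OpenCV, and the layout may differ in other major versions.

```cpp
#include <cassert>
#include <string>

// Decode an OpenCV Mat type code (the integer returned by Mat::type())
// into a name such as "CV_8UC3".
// Assumption: depth = type & 7, channels = (type >> 3) + 1.
std::string matTypeName(int type) {
    // Depth codes 0..6 as defined in the OpenCV 2.x headers. Code 7 is
    // CV_USRTYPE1 there; I believe later releases reuse it for CV_16F.
    static const char* const depthNames[] = {
        "8U", "8S", "16U", "16S", "32S", "32F", "64F", "USRTYPE1"};
    assert(type >= 0);
    const int depth = type & 7;
    const int channels = (type >> 3) + 1;
    return std::string("CV_") + depthNames[depth] + "C" +
           std::to_string(channels);
}
```

For example, a type code of 16 has depth 0 and three channels, so it decodes to CV_8UC3, the usual type of a color image loaded from disk.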

Related

how to pretrain my image using resnet50 in mask-rcnn

I am researching Mask R-CNN. I want to know how to train it on my own images (knife, sofa, baby, ...) using a ResNet-50 backbone. I have tried to find this on GitHub, but I can't. Please help if you know how to do it.
Try this implementation of Mask R-CNN on GitHub here.
You can follow the Mask_RCNN GitHub repo. It supports both ResNet-50 and ResNet-101. It is a beautiful implementation, I would say. The base model is from FAIR (Facebook AI Research). There is a demo file which you can check before starting your work.
If the demo works well, see my answer on training the model with your custom data. The answer is a bit long, but it lists all the steps.
Some things I personally like about this implementation:
It is easy to set up and won't bother you much about dependencies. A Python virtual environment works wonders.
It switches automatically between the CPU and the GPU, depending on what is available.
It is well supported by its developers and gets commits frequently.
The code is very customisable, so if you want to make some changes, it's pretty easy. Change a few booleans and numbers up or down and you are done.

iOS Image Recognition, Categorizing and Matching Patterns

I have been investigating image recognition a little, but I haven't found anything useful for my case yet.
My wife is a dentist and has to write her thesis. For her, I need to make an app that recognizes the shape of every tooth in a picture taken under standard conditions.
I need to compare each tooth against a set of predefined tooth patterns and find which one matches best. I know this is a big problem without a simple solution.
Does anyone know of image recognition software that lets me supply a number of patterns and then, given an image, tells me which pattern fits best? Or maybe just some pointers on where to start searching and working on this problem.
Thanks!
OpenCV would be the way to go here but let me give you the facts before you start ripping your hair out.
I don't know your development experience, but although OpenCV has an iOS wrapper, you will be working with low-level C libraries. If that makes you uncomfortable, then turn back now. Furthermore, you will be writing the majority of the recognition/detection algorithms yourself, and it takes a lot of time and patience to get these to the point where they work to any extent. Additionally, don't expect the end product to be all that reliable; professional image recognition/manipulation tools take years of development by teams of experts in computer vision. No disrespect, but something that has been hacked together over a few weeks by one person will be sub-par and lacking.
Nonetheless if you want to go ahead, you can download OpenCV for iOS here:
http://docs.opencv.org/2.4/doc/tutorials/introduction/ios_install/ios_install.html
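To give a feel for the "which pattern fits best" step, here is a minimal sketch in plain C++ that scores an image against each pattern with zero-mean normalized cross-correlation and picks the highest score. It assumes the image and all patterns have already been cropped, aligned, and resized to the same dimensions, which is the hard part in practice. Gray, ncc, and bestMatch are hypothetical names, not OpenCV functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical representation: a grayscale image flattened row by row.
using Gray = std::vector<double>;

// Zero-mean normalized cross-correlation of two same-sized images.
// The result lies in [-1, 1]; 1 means the images are identical up to a
// change in brightness and contrast.
double ncc(const Gray& a, const Gray& b) {
    assert(a.size() == b.size() && !a.empty());
    const double n = static_cast<double>(a.size());
    double meanA = 0.0, meanB = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        meanA += a[i];
        meanB += b[i];
    }
    meanA /= n;
    meanB /= n;

    double num = 0.0, varA = 0.0, varB = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double da = a[i] - meanA;
        const double db = b[i] - meanB;
        num += da * db;
        varA += da * da;
        varB += db * db;
    }
    const double denom = std::sqrt(varA * varB);
    return denom > 0.0 ? num / denom : 0.0;  // a flat image has no pattern
}

// Index of the pattern most similar to the image, or -1 if there are none.
int bestMatch(const Gray& image, const std::vector<Gray>& patterns) {
    int best = -1;
    double bestScore = -2.0;  // below the minimum possible NCC value
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        const double score = ncc(image, patterns[i]);
        if (score > bestScore) {
            bestScore = score;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

Because the score ignores overall brightness and contrast, it tolerates small lighting differences, but it does not handle rotation, scale, or shifted teeth; those would need to be normalized before this step.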

Robotics library in Forth?

I have read the documentation for the RoboForth environment from ST Robotics and found that it is a nice way of programming a robot. What I missed is a sophisticated software library with predefined motion primitives, for example for picking up an object, for regrasping, or for changing a tool.
In other programming languages like Python or C++, a library is a convenient way of programming repetitive tasks and of storing expert knowledge in machine-readable files. A library is also a good way for less experienced programmers to get access to higher-level functions. In my opinion, Forth is the perfect language for implementing such an API, but I couldn't find any information about it. Where should I search? Are there any examples out there?
I am the author of RoboForth, and you make a good point. I have approached the problem of getting new users started with videos on YouTube; see How to... (a playlist with 6 items, e.g. "ST Robotics How-to number 1 - getting started"), which covers the basics and also tool changing.
I never wrote any starter programs, because the physical positions (coordinates) would be different from one user to the next. However, I think it can be done, and I will do it. Thanks for the heads up.

Getting ElliFit ellipse fitting algorithm to work

I have tried to implement the ellipse fitting algorithm described in the following paper: “ElliFit: An unconstrained, non-iterative, least squares based geometric ellipse fitting method”, by Prasad, Leung, Quek. A free version can be downloaded online from http://azadproject.ir/wp-content/uploads/2014/07/2013-ElliFit-A-non-constrainednon-iterative-least-squares-based-geometric-Ellipse-Fitting-method.pdf
The authors did not provide any publicly available implementation.
I have implemented the algorithm in Mathematica. I believe I have implemented it correctly, yet it fails to find the correct fit parameters. The PDF of the experiment can be downloaded here: http://zvrba.net/downloads/ElliFit-fail-example.pdf
Has anybody else tried to implement this particular algorithm and, if so, what is the key to getting it working? Is there a "bug" in the paper? Can somebody take another look at my implementation and see whether there's a bug there?
I know it's been almost a year since this question, but it seems that the authors have now provided public source code for ElliFit, both a MATLAB version and an OpenCV version.
Both are available on the author's homepage. In case the homepage goes offline for some reason, both source codes are shared on Google and are available here (MATLAB) and here (OpenCV).
At the time of writing, I have not personally tested their code, but I am planning to use it for a project. I will post any updates here in the next few days.
EDIT:
I got around to testing the code sooner than I expected. I gave the OpenCV code a try. It works pretty well, as demonstrated by the image below (ignore the "almost-closed ellipses"; they are an artifact caused by something else in my code).
As you can see, it works pretty well most of the time. There are some failure cases too (the small ellipse on the spray bottle next to the cup).
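For anyone debugging their own implementation, it can help to compare against a simple baseline on synthetic points from a known ellipse. The sketch below is not ElliFit. It is a plain algebraic least-squares conic fit, assuming the curve does not pass through the origin so that the constant term can be normalized. Pt, fitConic, and conicCenter are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

struct Pt { double x, y; };

// Fit A*x^2 + B*x*y + C*y^2 + D*x + E*y = 1 by linear least squares.
// Returns {A, B, C, D, E}. This is a generic algebraic conic fit, not
// ElliFit, and it does not handle degenerate inputs (e.g. collinear points).
std::array<double, 5> fitConic(const std::vector<Pt>& pts) {
    assert(pts.size() >= 5);
    double m[5][6] = {};  // augmented normal equations [M^T M | M^T 1]
    for (const Pt& p : pts) {
        const double row[5] = {p.x * p.x, p.x * p.y, p.y * p.y, p.x, p.y};
        for (int i = 0; i < 5; ++i) {
            for (int j = 0; j < 5; ++j) m[i][j] += row[i] * row[j];
            m[i][5] += row[i];
        }
    }
    // Gauss-Jordan elimination with partial pivoting.
    for (int col = 0; col < 5; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 5; ++r)
            if (std::fabs(m[r][col]) > std::fabs(m[pivot][col])) pivot = r;
        for (int j = 0; j < 6; ++j) std::swap(m[col][j], m[pivot][j]);
        for (int r = 0; r < 5; ++r) {
            if (r == col) continue;
            const double f = m[r][col] / m[col][col];
            for (int j = col; j < 6; ++j) m[r][j] -= f * m[col][j];
        }
    }
    std::array<double, 5> c;
    for (int i = 0; i < 5; ++i) c[i] = m[i][5] / m[i][i];
    return c;
}

// Center of the fitted conic: the point where the gradient of the
// quadratic form vanishes, i.e. 2A*x + B*y + D = 0 and B*x + 2C*y + E = 0.
Pt conicCenter(const std::array<double, 5>& c) {
    const double det = 4.0 * c[0] * c[2] - c[1] * c[1];  // > 0 for an ellipse
    return {(c[1] * c[4] - 2.0 * c[2] * c[3]) / det,
            (c[1] * c[3] - 2.0 * c[0] * c[4]) / det};
}
```

Sampling points from an ellipse with a known center, feeding them to both this baseline and your own implementation, and comparing the recovered centers is a quick way to tell whether a failure comes from the data or from the code. The normal-equations approach is numerically crude, so it is only meant for such sanity checks.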

How to compare a webcam image/video to a specific image/video?

I am basically just starting out in computer programming; I am mostly fluent in basic Java. I have the idea of creating an ASL (American Sign Language) to English translator, and my initial problem is how to identify hand movements from a webcam and then compare them to signs that are already stored as an image or another video. If the problem is a bit too advanced for me, then please list any major concepts that I can learn. Please and thank you.
You clearly have a challenging problem ^^. Explaining everything you need to solve it would be very hard, mainly because there are many ways to do this. I advise you to read a good book about image processing (Gonzalez's book is a nice choice) and the OpenCV documentation (note that OpenCV is implemented in C and C++ and has Python bindings, not Java, although it is a library that implements a lot of image processing techniques). Maybe you should focus your study on feature detection, motion analysis, and object tracking. Since sign language uses not just hand shapes (static state) but also hand movements (dynamic state) to express something, object tracking may be a good way to describe the signs. I hope this information helps you, at least a little -^.^- Bye bye.
Look at OpenCV. They have a lot of libraries that you might find handy.
http://opencv.willowgarage.com/wiki/
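As a first taste of the "motion analysis" concept mentioned above, here is a minimal frame-differencing sketch in plain C++. It assumes two grayscale frames of the same size, stored as flat byte arrays, and only measures how much of the picture changed; it does not recognize signs. motionFraction is a hypothetical name, not an OpenCV function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Fraction of pixels (0.0 to 1.0) whose intensity changed by more than
// `threshold` between two same-sized grayscale frames.
double motionFraction(const std::vector<unsigned char>& prev,
                      const std::vector<unsigned char>& curr,
                      int threshold) {
    assert(prev.size() == curr.size() && !prev.empty());
    std::size_t changed = 0;
    for (std::size_t i = 0; i < prev.size(); ++i) {
        // Widen to int before subtracting so the difference can be negative.
        const int diff = static_cast<int>(curr[i]) - static_cast<int>(prev[i]);
        if (std::abs(diff) > threshold) ++changed;
    }
    return static_cast<double>(changed) / static_cast<double>(prev.size());
}
```

A program could call this on consecutive webcam frames and treat a value above some small cutoff as "the hand is moving". The threshold absorbs camera noise, and picking it is a matter of experiment.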

Resources