I am working with SIFT features and I'm using a visual bag-of-words approach: build a vocabulary first, then do the matching. I've found similar questions but no suitable answer.
The same question is asked at the link below, but it has no satisfactory answer. Can anyone help me? Thank you in advance.
https://stackoverflow.com/questions/29366944/finding-top-similar-images-from-a-database-using-sift-surf
SIFT and SURF are both implemented in the LIRE project and ready to use. The code is very simple, and if you know the bag-of-visual-words approach you can also modify it.
https://github.com/dermotte/LIRE contains a complete bag-of-visual-words implementation. There is also a LIRE demo site.
You can look at the OpenCV library for the details and implementation of the feature-extraction methods. Once you have the visual words, you should use the information-retrieval approaches used in search engines. By the way, LIRE also includes an information-retrieval library called Lucene. You may want to stick to the LIRE way until you get the whole idea.
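To make the "visual words plus information retrieval" idea concrete, here is a minimal sketch in plain Python. It is not LIRE's or Lucene's code: the function names, the tiny 2-D "descriptors" and the three-word vocabulary are all made up for illustration (real SIFT descriptors are 128-dimensional and the vocabulary usually comes from k-means clustering). It only shows the flow: quantize descriptors to visual words, build a histogram per image, weight it with tf-idf, and rank by cosine similarity.

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary centre closest to the descriptor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocabulary)),
               key=lambda i: sq_dist(descriptor, vocabulary[i]))


def bow_histogram(descriptors, vocabulary):
    """Count how often each visual word occurs in one image."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    return [counts.get(i, 0) for i in range(len(vocabulary))]


def idf_weights(histograms):
    """Inverse document frequency per visual word (smoothed)."""
    n_images = len(histograms)
    n_words = len(histograms[0])
    doc_freq = [sum(1 for h in histograms if h[w] > 0) for w in range(n_words)]
    return [math.log((1 + n_images) / (1 + df)) + 1.0 for df in doc_freq]


def tfidf(histogram, idf):
    """Normalise counts to term frequencies and apply the idf weights."""
    total = sum(histogram) or 1
    return [(count / total) * idf[w] for w, count in enumerate(histogram)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_hist, database_hists):
    """Return database indices sorted from most to least similar."""
    idf = idf_weights(database_hists)
    query_vec = tfidf(query_hist, idf)
    scores = [cosine(query_vec, tfidf(h, idf)) for h in database_hists]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical toy data: three visual words and three database images.
vocabulary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
database = [
    [(0.5, 0.2), (0.1, 0.3), (9.5, 0.1)],
    [(0.2, 9.8), (0.1, 10.2), (0.3, 9.9)],
    [(9.9, 0.2), (10.1, 0.3), (0.2, 9.7)],
]
query = [(0.1, 0.1), (0.3, 0.2), (10.2, 0.1)]

database_hists = [bow_histogram(d, vocabulary) for d in database]
query_hist = bow_histogram(query, vocabulary)
ranking = rank(query_hist, database_hists)
print(ranking)
```

With a real database you would replace the linear scan in `rank` with an inverted index, which is the part a library such as Lucene provides.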
I am researching Mask R-CNN. I want to know how to pretrain on my own images (knife, sofa, baby, ...) using ResNet-50 in Mask R-CNN. I have tried to find this on GitHub but couldn't. Could anybody who knows how to handle it please help me?
Try this implementation of Mask RCNN on github here.
You can follow the Mask_RCNN GitHub repo. It supports both resnet50 and resnet101 backbones. It is a beautiful implementation, I would say. The base model is from FAIR (Facebook AI Research). There is a demo file which you can check before starting your work.
If the demo works well, you can see my answer, which will help you train the model with your custom data. The answer is a bit long, but it lists all the steps.
Some things I personally like about this implementation:
It is easy to set up and won't trouble you much with dependencies. A Python virtual environment works wonders.
It switches automatically between the CPU and GPU versions.
It has good support from its developers and gets commits frequently.
The code is very customisable, so if you want to make changes, it's pretty easy: change a few booleans and numbers and you are done.
I was assigned a project (in school) for automated multiple choice test scoring and I do not know where to start.
I think this is a fairly popular kind of program and you probably already know about it: you enter a scanned image of the answer sheet and it returns the results.
Everything I know about computer vision is a few examples of photo editing with OpenCV. I hope you can give me a few keywords related to the problem or maybe a couple of blog articles, documents and related libraries.
Are there any free, open-source programs that I can refer to?
Thanks!
Edit: Added 2 examples of the answer sheet (sorry that I could not find a sheet in English):
I think there are basically two steps to the problem:
1. Bring the form into a normalized position.
2. Now that you know where the boxes are, look at them by thresholding the gray values in each region.
Which methods to use for step 1 depends on your actual images and how much they vary. Do you have some example images you can upload?
Also I think it is a good idea, especially if you are a beginner, to start with some simple examples and work your way up from there by adding more and more variation.
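As a starting point for the second step (thresholding the gray values inside each box), here is a minimal sketch in plain Python. It assumes the sheet has already been normalized, that the image is a list of rows of grayscale values (0 = black, 255 = white), and that the box coordinates are known in advance. The names `dark_fraction`, `read_answers` and `layout`, the threshold of 128 and the 50% fill rule are all illustrative assumptions, not part of any library.

```python
def dark_fraction(image, box, threshold=128):
    """Fraction of pixels inside box that are darker than threshold.

    box is (top, left, bottom, right) with bottom/right exclusive.
    """
    top, left, bottom, right = box
    pixels = [image[r][c] for r in range(top, bottom)
              for c in range(left, right)]
    return sum(1 for p in pixels if p < threshold) / len(pixels)


def read_answers(image, layout, threshold=128, min_fill=0.5):
    """Pick the darkest box per question, or None if nothing is filled in."""
    answers = {}
    for question, boxes in layout.items():
        fills = {choice: dark_fraction(image, box, threshold)
                 for choice, box in boxes.items()}
        best = max(fills, key=fills.get)
        answers[question] = best if fills[best] >= min_fill else None
    return answers


def score(answers, answer_key):
    """Count the questions whose detected answer matches the key."""
    return sum(1 for q, correct in answer_key.items()
               if answers.get(q) == correct)


# Hypothetical 4x6 "scan": white page with choice B of question 1 filled in.
image = [[255] * 6 for _ in range(4)]
for r in range(0, 2):
    for c in range(2, 4):
        image[r][c] = 20

layout = {
    1: {"A": (0, 0, 2, 2), "B": (0, 2, 2, 4), "C": (0, 4, 2, 6)},
    2: {"A": (2, 0, 4, 2), "B": (2, 2, 4, 4), "C": (2, 4, 4, 6)},
}

answers = read_answers(image, layout)
points = score(answers, {1: "B", 2: "C"})
print(answers, points)
```

On real scans the fixed threshold would probably need to be replaced by something adaptive, and you would also want to flag questions where two boxes are filled.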
I have been going back and forth through OpenCV 2.4.3 trying to figure out the extra functions and parameters that can be used with CvBGCodeBookModel-based background subtraction. The documentation is not very helpful. Does anyone know a resource or tutorial that explains CvBGCodeBookModel as implemented in OpenCV, along with some of its functions?
Guidance much appreciated
There is a sample in the OpenCV source (samples/c/bgfg_codebook.cpp) that uses CvBGCodeBookModel; it might be a good place to look.
It says the code is adapted from the book "Learning OpenCV" published by O'Reilly, so that would be another resource.
There is also this paper that describes the theory, though I'm not sure whether it would be helpful to you.
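To give a feel for the theory before reading the sample, here is a heavily simplified sketch of the codebook idea in plain Python. It is not the OpenCV implementation and does not use its API: the class `PixelCodebook`, the parameters `learn_bound` and `tolerance`, and the grayscale-only model are assumptions made for illustration. Each pixel keeps a small list of intensity ranges ("codewords") seen during training; later, a pixel whose value falls outside all of its ranges is marked as foreground.

```python
class PixelCodebook:
    """Intensity ranges observed at one pixel during background training."""

    def __init__(self, learn_bound=10):
        self.learn_bound = learn_bound
        self.codewords = []  # each codeword is [low, high]

    def update(self, value):
        """Grow a nearby codeword to cover value, or start a new one."""
        for cw in self.codewords:
            if cw[0] - self.learn_bound <= value <= cw[1] + self.learn_bound:
                cw[0] = min(cw[0], value)
                cw[1] = max(cw[1], value)
                return
        self.codewords.append([value, value])

    def is_foreground(self, value, tolerance=5):
        """True if value is not explained by any learned codeword."""
        return not any(cw[0] - tolerance <= value <= cw[1] + tolerance
                       for cw in self.codewords)


def train(frames, learn_bound=10):
    """Build one codebook per pixel from a list of background frames."""
    height, width = len(frames[0]), len(frames[0][0])
    model = [[PixelCodebook(learn_bound) for _ in range(width)]
             for _ in range(height)]
    for frame in frames:
        for r in range(height):
            for c in range(width):
                model[r][c].update(frame[r][c])
    return model


def foreground_mask(model, frame, tolerance=5):
    """Return a 0/1 mask with 1 where the frame differs from the background."""
    return [[1 if model[r][c].is_foreground(frame[r][c], tolerance) else 0
             for c in range(len(frame[0]))]
            for r in range(len(frame))]


# Hypothetical 2x3 background: the top-left pixel flickers between two values.
background = [
    [[50, 100, 100], [100, 100, 100]],
    [[200, 102, 98], [101, 99, 100]],
    [[52, 101, 100], [100, 100, 103]],
]
model = train(background)

new_frame = [[199, 100, 30], [100, 100, 100]]
mask = foreground_mask(model, new_frame)
print(mask)
```

The point of keeping several codewords per pixel is that a background pixel that legitimately alternates between values (the flickering top-left pixel here) is still treated as background, which a single mean-and-threshold model would get wrong.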
I am basically just starting out in computer programming and am mostly fluent in basic Java. I have an idea for creating an ASL (American Sign Language) to English translator, and my initial problem is how to identify hand movements from a webcam and then compare them to signs that are already stored as an image or another video. If the problem is a bit too advanced for me, then please list any major concepts that I can learn. Please and thank you.
You clearly have a challenging problem. Explaining everything you need to solve it would be very hard, mainly because there are many ways to do it. I advise you to read a good book about image processing (Gonzalez's book is a nice choice) and the OpenCV documentation (note that OpenCV is implemented in C and C++ and has Python bindings, but it is a library that implements a lot of image-processing techniques). Maybe you should focus your study on feature detection, motion analysis and object tracking. As sign language uses not just hand signs (static state) but also hand movements (dynamic state) to express something, object tracking may be a good way to describe the signs. I hope this information helps you, at least a little.
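As a first taste of motion analysis, here is a minimal sketch of frame differencing in plain Python. It is only an illustration of one of the simplest techniques, not a sign recognizer: frames are assumed to be lists of rows of grayscale values, and the function names and the threshold of 25 are made up for this example. It marks the pixels that changed between consecutive frames and follows the centre of the changed region over time.

```python
def motion_mask(previous, current, threshold=25):
    """Mark pixels whose gray value changed by more than threshold."""
    return [[1 if abs(c - p) > threshold else 0
             for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(previous, current)]


def centroid(mask):
    """Return the (row, column) centre of the marked pixels, or None."""
    points = [(r, c) for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    if not points:
        return None
    return (sum(r for r, _ in points) / len(points),
            sum(c for _, c in points) / len(points))


def track(frames, threshold=25):
    """Centroid of the changed region for each pair of consecutive frames."""
    return [centroid(motion_mask(a, b, threshold))
            for a, b in zip(frames, frames[1:])]


def blob_frame(column, height=3, width=5):
    """Hypothetical test frame: dark image with one bright pixel in row 1."""
    frame = [[0] * width for _ in range(height)]
    frame[1][column] = 255
    return frame


# A bright "hand" moving one pixel to the right in each frame.
frames = [blob_frame(0), blob_frame(1), blob_frame(2)]
path = track(frames)
print(path)
```

The resulting path drifts to the right, which is the kind of trajectory you could later compare against stored sign movements. In practice you would get the frames from a webcam through a library such as OpenCV and use a more robust tracker.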
Look at OpenCV. They have a lot of libraries that you might find handy.
http://opencv.willowgarage.com/wiki/
I want to do a project involving Computer Vision. Mostly object detection/identification. After some research, I keep coming back to OpenCV. But all of the tutorials are from 2008 (I guess it was big for a bit then). It doesn't compile in Python on the mac apparently. I'm using the C++ framework right out of Xcode, but none of the tutorials work as they're outdated and the documentation sucks from what I can parse.
Is there a better solution for what I'm doing, and does anyone have any suggestions for learning how to use OpenCV?
Thanks
I have had similar problems getting started with OpenCV and from my experience this is actually the biggest hurdle to learning it. Here is what worked for me:
This book: "OpenCV 2 Computer Vision Application Programming Cookbook." It's the most up-to-date book and has examples on how to solve different Computer Vision problems (You can see the table of contents on Amazon with "Look Inside!"). It really helped ease me into OpenCV and get comfortable with how the library works.
Like others have said, the samples are very helpful. For things that the book skips or covers only briefly, you can usually find more detailed examples by looking through the samples. You can also find different ways of solving the same problem between the book and the samples. For example, for finding keypoints/features, the book shows an example using FAST features:
vector<KeyPoint> keypoints;
FastFeatureDetector fast(40);
fast.detect(image, keypoints);
But in the samples you will find a much more flexible way (if you want to have the option of choosing which keypoint detection algorithm to use):
vector<KeyPoint> keypoints;
Ptr<FeatureDetector> featureDetector = FeatureDetector::create("FAST");
featureDetector->detect(image, keypoints);
From my experience things eventually start to click and for more specific questions you start finding up-to-date information on blogs or right here on StackOverflow.
Let me add a couple of things. First, I can assure you that the Python bindings to OpenCV work on a Mac. I use them every day.
Many people like OpenCV for many reasons:
The license is good, friendly to integration into commercial products, etc.
It is quite good from a technical standpoint. It gives you a reference implementation of state-of-the-art algorithms.
It tends to be quite fast compared to the alternatives (Matlab I'm looking at you).
Like everything in life, it is not perfect:
It is a good example of a software library that is a moving target. I have a 300 line python program that uses OpenCV and every few months when a new version of OpenCV is released I have to change it to adapt to the new function names/calling conventions, etc. The library does advance, a lot, however it is a pain to have to change the same program 3 times per year.
It has a learning curve, like computer vision itself, it is quite technical and not easy to learn.
There are alternatives (with other pros and cons); MATLAB with the Image Processing Toolbox is one such example.
The simplest answer that comes to mind is to read the example code with a bit of understanding, and to try out whether your ideas work. The API does change, most of the tutorials were written for the first versions of OpenCV, and it looks like nobody bothered to rewrite them. Nevertheless, the core ideas behind it are not changing. So if you find a tutorial that answers your questions but is written against the old API, just look in the documentation for modern replacements of the functions it uses. It's not easy or quick, but it seems to work. If you use the newest version (currently 2.3), I suggest using both the 2.1 documentation and the 2.3 docs and tutorials. You should also look into the samples, which should have been installed alongside the library. There are lots of hints about how to use certain structures, and tricks that weren't mentioned in the documentation. Finally, don't be afraid to look inside the code of the library itself (if you compiled it on your own). Unfortunately, that's the only source I know of for checking, for example, what code corresponds to which type of Mat object.