I have a known camera pose (rotation + position) and intrinsics (distortion coefficients and camera matrix) for 2 cameras pointing at the same scene from slightly different angles.
Is there a way to use bundle adjustment to refine the camera pose? Preferably with some already existing API or function that doesn't require too much mathematical knowledge to use.
You should use PBA (Multicore Bundle Adjustment) from Changchang Wu. It is a really nice library written in C++. Furthermore, it features multicore CPU computation and even GPU computation, with a speedup of about 20 times.
It is clearly structured and easy to use.
So, instead of using SBA from Lourakis or using SSBA from Christopher Zach you should use PBA.
You may want to check out SSBA at http://www.inf.ethz.ch/personal/chzach/opensource.html but it will still require some mathematical insight to be able to use it properly.
You could try the implementation right inside OpenCV. It's in the contrib module, but I haven't gotten it to work properly yet. :/
article about it
Try the Ceres solver. An example implementation is available here. Again, you will need an understanding of the mathematical principles of bundle adjustment. But that is unavoidable.
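To make the "mathematical principles" a bit more concrete: bundle adjustment adjusts the camera poses (and usually the 3D points) so that the sum of squared reprojection errors over all observations is as small as possible. Below is a minimal sketch of that per-observation error in plain C++. It is not Ceres, PBA or SSBA code. The names Pose, Intrinsics, project and squaredReprojectionError are made up for illustration, lens distortion is ignored, and the pose is assumed to map world to camera coordinates as x_cam = R * X + t.

```cpp
#include <array>
#include <cassert>

// Hypothetical pose type: 3x3 rotation (row-major) and translation,
// assumed to map world to camera coordinates as x_cam = R * X + t.
struct Pose {
    std::array<double, 9> R;
    std::array<double, 3> t;
};

// Pinhole intrinsics; lens distortion is left out to keep the sketch short.
struct Intrinsics {
    double fx, fy, cx, cy;
};

// Project a 3D world point into pixel coordinates.
std::array<double, 2> project(const Pose& pose, const Intrinsics& K,
                              const std::array<double, 3>& X) {
    std::array<double, 3> p{};
    for (int r = 0; r < 3; ++r) {
        p[r] = pose.R[3 * r] * X[0] + pose.R[3 * r + 1] * X[1] +
               pose.R[3 * r + 2] * X[2] + pose.t[r];
    }
    assert(p[2] > 0.0 && "point must lie in front of the camera");
    return {K.fx * p[0] / p[2] + K.cx, K.fy * p[1] / p[2] + K.cy};
}

// Squared distance between the predicted and the observed pixel position.
// Bundle adjustment minimizes the sum of this value over all observations.
double squaredReprojectionError(const Pose& pose, const Intrinsics& K,
                                const std::array<double, 3>& X,
                                const std::array<double, 2>& observed) {
    const std::array<double, 2> predicted = project(pose, K, X);
    const double dx = predicted[0] - observed[0];
    const double dy = predicted[1] - observed[1];
    return dx * dx + dy * dy;
}
```

A solver then repeatedly perturbs R and t for each camera to push the sum of these errors down. The libraries mentioned above handle that optimization loop for you.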
I am researching Mask R-CNN. I want to know how to pretrain on my own images (knife, sofa, baby, ...) using ResNet-50 in Mask R-CNN. I have tried to find this on GitHub, but I can't. Can anybody who knows how to handle this please help me?
Try this implementation of Mask RCNN on github here.
You can follow the Mask_RCNN GitHub repo. It has both ResNet-50 and ResNet-101 (I might be wrong here). It is a beautiful implementation, I would say. The base model is from FAIR (Facebook AI Research). There is a demo file which you can check before starting your work.
If it works well, you can see my answer, which will help you train the model with your custom data. The answer is a bit long, but it lists all the steps.
Something which I personally like about this implementation is:
It is easy to set up. It won't bother you much about the dependencies. Having a Python virtual environment works wonders.
It falls back automatically from a CPU version to GPU and vice versa.
It has good support from its developers and gets commits frequently.
The code is very customisable, so if you want to make some changes, it's pretty easy. Change a few booleans and numbers up or down and you are done!
I would like to use Smalltalk (Pharo) to better refactor my image processing and computer vision code/algorithms, written in other languages. I have not found a lot of examples online where Smalltalk is used for processing images (or video frames). I would like to know whether
i) there is an opencv/image/computer vision library available for Smalltalk that is easily installed or
ii) someone could give an example of how to access the pixel data in an image and threshold it using Smalltalk.
For the first question, you can maybe write your own interface using FFI to the OpenCV C-API.
For the second question, I think it's easy: use ImageReadWriter formFromFileNamed: to load the image, then use pixelValueAt: to read each value, threshold it, and write it back with pixelValueAt:put:.
There is a recent binding to OpenCV (for Pharo 7 at the moment) at https://github.com/feenkcom/gt4opencv
I want to do a project involving Computer Vision. Mostly object detection/identification. After some research, I keep coming back to OpenCV. But all of the tutorials are from 2008 (I guess it was big for a bit then). It doesn't compile in Python on the mac apparently. I'm using the C++ framework right out of Xcode, but none of the tutorials work as they're outdated and the documentation sucks from what I can parse.
Is there a better solution for what I'm doing, and does anyone have any suggestions on learning how to use OpenCV?
Thanks
I have had similar problems getting started with OpenCV and from my experience this is actually the biggest hurdle to learning it. Here is what worked for me:
This book: "OpenCV 2 Computer Vision Application Programming Cookbook." It's the most up-to-date book and has examples on how to solve different Computer Vision problems (You can see the table of contents on Amazon with "Look Inside!"). It really helped ease me into OpenCV and get comfortable with how the library works.
Like others have said, the samples are very helpful. For things that the book skips or covers only briefly, you can usually find more detailed examples by looking through the samples. You can also find different ways of solving the same problem between the book and the samples. For example, for finding keypoints/features, the book shows an example using FAST features:
// OpenCV 2.x API
vector<KeyPoint> keypoints;
FastFeatureDetector fast(40);  // 40 is the detection threshold
fast.detect(image, keypoints);
But in the samples you will find a much more flexible way (if you want to have the option of choosing which keypoint detection algorithm to use):
// OpenCV 2.x API: the detector is chosen by name at run time
vector<KeyPoint> keypoints;
Ptr<FeatureDetector> featureDetector = FeatureDetector::create("FAST");
featureDetector->detect(image, keypoints);
From my experience things eventually start to click and for more specific questions you start finding up-to-date information on blogs or right here on StackOverflow.
Let me add a couple of things. First, I can assure you that the Python bindings to OpenCV work on a Mac. I use them every day.
Many people like OpenCV for many reasons:
The license is good, friendly to integration into commercial products, etc.
It is quite good from a technical standpoint. It gives you a reference implementation of state-of-the-art algorithms.
It tends to be quite fast compared to the alternatives (Matlab I'm looking at you).
Like everything in life, it is not perfect:
It is a good example of a software library that is a moving target.
I have a 300-line Python program that uses OpenCV, and every few months, when a new version of OpenCV is released, I have to change it to adapt to the new function names, calling conventions, etc. The library does advance, a lot; however, it is a pain to have to change the same program 3 times per year.
It has a learning curve. Like computer vision itself, it is quite technical and not easy to learn.
There are alternatives (with other pros and cons); MATLAB with the Image Processing Toolbox is one such example.
The simplest answer that comes to mind is to read the example code with a bit of understanding and to try out whether your ideas work. The API does change, and most of the tutorials were written for the first versions of OpenCV; it looks like nobody bothered to rewrite them. Nevertheless, the core ideas behind it are not changing. So if you find a tutorial that answers your questions but is written for the old API, just look in the documentation for modern replacements of the functions it uses. It's not easy or quick, but it seems to work. If you use the newest version (currently 2.3), I suggest using both the 2.1 documentation and the 2.3 docs + tutorials. You should also look into the samples, which should have been installed alongside the library. They contain lots of hints about how to use certain structures, and tricks that weren't mentioned in the documentation. Finally, don't be afraid to look inside the code of the library itself (if you compiled it on your own). Unfortunately, that's the only source I know of for checking, for example, which code corresponds to which type of Mat object.
Best as in reliable, maintainable and fast.
Considering Processing, VVVV or OpenFrameworks?
I know Processing doesn't handle big video frames very well.
VVVV (Nodes use OpenCV) is just for Windows.
OpenFrameworks (OpenCV) is more complicated than the above.
You can try to implement your app in Processing and see if it fits your needs and is fast enough. It should be a little easier and faster to write Java than C++.
Here you can find how to set it up with Processing, with examples: http://ubaa.net/shared/processing/opencv/
If you don't want to code anything, you can try VVVV. It should be a little faster, but it runs only on Windows, as you mentioned.
If your Processing app is running too slow, you can try openFrameworks.
Download the new OF 007 from http://www.openframeworks.cc/ and check out the setup guide.
Once it is installed, you can play around with the OpenCV examples in
<your-OF-folder>/apps/addonsExamples/opencvExample
<your-OF-folder>/apps/addonsExamples/opencvHaarFinderExample/
Personally I prefer OF because you can do any custom thing with the best performance, but it's good to build your prototype in Processing first to see if it works and then implement it again in OF.
As far as I can see from your question, VVVV and OF are the options you're looking at. You prefer VVVV's node-based programming over OF, but aren't happy that VVVV is Windows-only.
Have you considered other alternatives like MaxMSP/Jitter or PureData?
Both are similar to VVVV or the other way around :)
MaxMSP has a package for 'optimized matrix operations'(3D/video) called Jitter.
For Jitter there is cv.jit, a free collection of external objects, and the samples/tutorials are great.
Similarly PureData has an add-on called Gem, which is similar to Max's Jitter package.
I haven't tried with PureData, but there are OpenCV bindings for it, through Gem.
cv.jit
pdp OpenCV PureData Bindings - via Piksel.no
MaxMSP uses QuickTime on OS X and can use DirectX on Windows, but it's commercial.
PureData runs on Windows/OS X/Linux; it's free and open source.
HTH
We're looking for a package to help identify and automatically rotate faxed TIFF images based on a watermark or logo.
We currently use libtiff for rotation, but don't know of any libraries or packages we can use for detecting this logo and determining how to rotate the images.
I have done some basic work with OpenCV but I'm not sure that it is the right tool for this job. I would prefer to use C/C++ but Java, Perl or PHP would be acceptable too.
You are in the right place using OpenCV; it is an excellent utility. For example, this guy used it for template matching, which is fairly similar to what you need to do. Also, the link Roddy specified looks similar to what you want to do.
I feel that OpenCV is the best library out there for this kind of development.
@Brian, OpenCV and the Intel IPP are closely linked and very similar (both Intel libs). As far as I know, if OpenCV finds the Intel IPP on your computer, it will automatically use it under the hood for improved speed.
The Intel Performance Primitives (IPP) library has a lot of very efficient algorithms that help with this kind of a task. The library is callable from C/C++ and we have found it to be very fast. I should also note that it's not limited to just Intel hardware.
That's quite a complex and specialized algorithm that you need.
Have a look at http://en.wikipedia.org/wiki/Template_matching. There's also a demo program (but no source) at http://www.lps.usp.br/~hae/software/cirateg/index.html
Obviously these require you to know the logo you are looking for in advance...
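To show what template matching boils down to for this use case, here is a minimal sketch in plain C++. It assumes the logo is known in advance as a small grayscale image and that the fax is off only by multiples of 90 degrees. It tries all four orientations and keeps the one where the logo matches best, scored by the sum of squared differences. The Gray type and the function names are made up for illustration; this is not OpenCV, IPP or libtiff code, and a real solution would also need to cope with noise, scale and skew.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical grayscale image type, pixels stored row by row.
struct Gray {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
    std::uint8_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Rotate an image by 90 degrees clockwise.
Gray rotate90(const Gray& src) {
    Gray dst{src.height, src.width,
             std::vector<std::uint8_t>(src.pixels.size())};
    for (int y = 0; y < src.height; ++y)
        for (int x = 0; x < src.width; ++x) {
            const int dx = src.height - 1 - y;  // new column
            const int dy = x;                   // new row
            dst.pixels[static_cast<std::size_t>(dy) * dst.width + dx] =
                src.at(x, y);
        }
    return dst;
}

// Smallest sum of squared differences between the template and any
// same-sized window of the image (brute force, no normalization).
long long bestSsd(const Gray& image, const Gray& tmpl) {
    long long best = std::numeric_limits<long long>::max();
    for (int oy = 0; oy + tmpl.height <= image.height; ++oy)
        for (int ox = 0; ox + tmpl.width <= image.width; ++ox) {
            long long ssd = 0;
            for (int y = 0; y < tmpl.height; ++y)
                for (int x = 0; x < tmpl.width; ++x) {
                    const int d = image.at(ox + x, oy + y) - tmpl.at(x, y);
                    ssd += d * d;
                }
            if (ssd < best) best = ssd;
        }
    return best;
}

// Number of clockwise quarter turns (0 to 3) after which the logo
// matches the page best.
int quarterTurnsToUpright(const Gray& page, const Gray& logo) {
    assert(logo.width > 0 && logo.height > 0);
    Gray rotated = page;
    int bestTurns = 0;
    long long bestScore = std::numeric_limits<long long>::max();
    for (int turns = 0; turns < 4; ++turns) {
        const long long score = bestSsd(rotated, logo);
        if (score < bestScore) {
            bestScore = score;
            bestTurns = turns;
        }
        rotated = rotate90(rotated);
    }
    return bestTurns;
}
```

The returned number of quarter turns could then be handed to whatever rotation routine you already use. The brute-force search is slow on full-size fax pages, so in practice you would downscale first or use an optimized matcher.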