I have been searching online and trying to understand the literature I have, but something is eluding me.
Given a SOM, when visualized with a U-Matrix, does the U-Matrix represent the distance between a given node and every other node, or the distance between a node and its direct neighbours?
Regards,
Jack Hunt
EDIT:- Suggestions for alternative visualization techniques are welcome.
Generally the color of a node in the U-matrix is based on the distance between neighboring nodes in the lattice (link). There are other ways to color a U-matrix, but that is the most common.
Other visualization techniques depend entirely upon your data and what you are looking for.
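To make that concrete, here is a minimal NumPy sketch of the usual computation (the (rows, cols, dim) codebook layout is an assumption, not tied to any particular SOM library): each U-matrix cell is the average distance between a node's weight vector and the weights of its direct lattice neighbours only, not of every other node.

    import numpy as np

    def u_matrix(weights):
        """weights: (rows, cols, dim) array of SOM codebook vectors."""
        rows, cols, _ = weights.shape
        umat = np.zeros((rows, cols))
        for r in range(rows):
            for c in range(cols):
                dists = []
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # direct neighbours only
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        dists.append(np.linalg.norm(weights[r, c] - weights[rr, cc]))
                umat[r, c] = np.mean(dists) if dists else 0.0
        return umat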
Related
I came across top-down and bottom-up approaches while reading this paper on image processing: https://arxiv.org/abs/1611.08050
I got a vague idea about the top-down approach from this paragraph:
"top-down approach: apply a separately trained human detector (based on object detection techniques such as the ones we discussed before), find each person, and then run pose estimation on every detection."
But I couldn't understand the bottom-up approach from this:
"bottom-up approaches recognize human poses from pixel-level image evidence directly. They can solve both problems above: when you have information from the entire picture you can distinguish between the people, and you can also decouple the runtime from the number of people on the frameā¦ at least theoretically."
Please help me understand these concepts. Thank you.
Both paragraphs are from this blog: "https://medium.com/neuromation-blog/neuronuggets-understanding-human-poses-in-real-time-b73cb74b3818"
Suppose there are two people in the picture, and each person has 15 joints (keypoints).
Top-down approach
find two bounding boxes, one for each person
estimate the human joints (15 keypoints) within each bounding box
In this example, the top-down approach needs to run pose estimation twice.
Bottom-up approach
estimate all human joints (30 keypoints) in the picture
group the joints into sets of 15 keypoints belonging to the same person
In this example, the pose estimator doesn't care how many people are in the picture; it only has to assign each joint to the right person.
In general, the top-down approach takes much more time than the bottom-up approach, because it has to run pose estimation N times, once for each detection returned by the person detector.
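To make the contrast concrete, here is a rough Python sketch of the two pipelines; detect_people, estimate_pose_in_box, estimate_all_joints, and group_joints_by_person are hypothetical placeholders, not functions from the paper:

    # Hypothetical placeholders: in practice these would be an object detection
    # network and a pose-estimation network.

    def top_down(image, detect_people, estimate_pose_in_box):
        poses = []
        for box in detect_people(image):                    # one detection per person
            poses.append(estimate_pose_in_box(image, box))  # pose estimation runs N times
        return poses

    def bottom_up(image, estimate_all_joints, group_joints_by_person):
        joints = estimate_all_joints(image)       # all joints found in one pass over the image
        return group_joints_by_person(joints)     # then joints are assigned to individual people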
I am trying to create a spatial representation of features. Basically, an image is subdivided into grids, e.g. 4 grids, and features are detected for each grid. The features are clustered into visual words, a histogram is built for each grid, and then the corresponding grids can be matched with histogram intersection. Here is the paper I am working from: http://www.vision.caltech.edu/Image_Datasets/Caltech101/cvpr06b_lana.pdf
First of all, how can I subdivide an image and detect features per grid? I found GridAdaptedFeatureDetector in OpenCV, but I do not know how to get the features for a particular grid. I could define a region of interest for each grid, detect features separately, and add them to the histogram, but this sounds complicated and time-consuming. Maybe there is an easier way. Any ideas are appreciated. Thanks in advance.
Your question is basically how one could implement her paper. The good news is that Prof. Lazebnik has shared the source code of her Spatial Pyramid here:
http://web.engr.illinois.edu/~slazebni/research/SpatialPyramid.zip
Nevertheless, it is a MATLAB implementation that you would have to port to OpenCV if you want.
You can also take a look at her slides, and the dataset used is here.
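If you do implement the feature-gathering step yourself in OpenCV, one simple route (a sketch only, assuming you just need the descriptors split per grid cell; ORB is used here as a stand-in detector) is to detect keypoints once on the whole image and bin them by cell, instead of running the detector on each ROI:

    import cv2

    def grid_descriptors(gray, grid=(2, 2)):
        """Detect ORB features once, then split the descriptors by grid cell."""
        orb = cv2.ORB_create()
        keypoints, descriptors = orb.detectAndCompute(gray, None)
        h, w = gray.shape
        cells = [[[] for _ in range(grid[1])] for _ in range(grid[0])]
        if descriptors is None:                           # no features found
            return cells
        for kp, desc in zip(keypoints, descriptors):
            x, y = kp.pt
            row = min(int(y * grid[0] / h), grid[0] - 1)  # grid cell the keypoint falls in
            col = min(int(x * grid[1] / w), grid[1] - 1)
            cells[row][col].append(desc)
        return cells  # next step: quantize each cell's descriptors into visual words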
I'm searching for algorithms/methods that are used to classify or differentiate between two outdoor environments. Given an image with vehicles, I need to be able to detect whether the vehicles are in a natural desert landscape, or whether they're in the city.
I've searched but can't seem to find relevant work on this. Perhaps because I'm new at computer vision, I'm using the wrong search terms.
Any ideas? Is there any work (or related) available in this direction?
I'd suggest reading Prince's Computer Vision: Models, Learning, and Inference (free PDF available). It covers image classification, as well as many other areas of CV. I was fortunate enough to take the Machine Vision course at UCL which the book was designed for and it's an excellent reference.
Addressing your problem specifically, a simple MAP or MLE model on pixel colours will probably provide a reasonable benchmark. From there you could look at more involved models and feature engineering.
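As one reading of that baseline, here is a minimal sketch (assuming you have labelled training images for each class): fit a Gaussian to the pixel colours of each class and classify a new image by which class gives its pixels the higher mean log-likelihood.

    import numpy as np
    from scipy.stats import multivariate_normal

    def fit_color_model(images):
        """images: list of HxWx3 arrays from one class; returns a Gaussian over colour."""
        pixels = np.concatenate([im.reshape(-1, 3) for im in images]).astype(float)
        return multivariate_normal(mean=pixels.mean(axis=0), cov=np.cov(pixels.T))

    def classify(image, models):
        """models: dict of class name -> fitted Gaussian; pick the maximum-likelihood class."""
        pixels = image.reshape(-1, 3).astype(float)
        scores = {name: m.logpdf(pixels).mean() for name, m in models.items()}
        return max(scores, key=scores.get)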
Seemingly complex classifications such as "civilization" vs. "nature" can often be solved with a few simple heuristics combined with colour-based classification. As Gilevi said, city scenes tend to contain many straight lines and right angles, while desert scenes are dominated by rolling dunes and so on.
To address this directly, you could use OpenCV's Hough lines algorithm on the images (tuned for this problem, of course) and look at the following (a short sketch follows the list):
a) how many lines are fit to the image at a given threshold
b) of the lines that are fit, what is the distribution of angles between pairs of them; if the angles are roughly uniformly distributed then chances are it's nature, but if the angles clump around multiples of pi/2 (more right angles and straight lines) then it is more likely a cityscape.
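Here is a minimal OpenCV sketch of those two measurements; the Canny and Hough thresholds are placeholders you would have to tune:

    import cv2
    import numpy as np

    def line_features(gray, canny_lo=50, canny_hi=150, hough_thresh=100):
        """Return the number of detected lines and the pairwise angle differences."""
        edges = cv2.Canny(gray, canny_lo, canny_hi)
        lines = cv2.HoughLines(edges, 1, np.pi / 180, hough_thresh)
        if lines is None:
            return 0, np.array([])
        thetas = lines[:, 0, 1]                  # line angles in radians
        n_lines = len(thetas)
        # pairwise angle differences folded into [0, pi/2]; clumping near 0 and pi/2
        # suggests many parallel lines and right angles, i.e. a man-made scene
        diffs = np.abs(thetas[:, None] - thetas[None, :])[np.triu_indices(n_lines, 1)]
        diffs = np.minimum(diffs % np.pi, np.pi - diffs % np.pi)
        return n_lines, diffs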
Color components, textures, and the degree of smoothness (variation or gradient of the image) may differentiate desert and city backgrounds. You may also try the Hough transform, which is used for line detection; lines can be viewed as a city feature (buildings, roads, bridges, cars, etc.).
I would recommend this research, which is very similar to your project. The article presents a comparison of different classification techniques for building a scene classifier (urban, highway, and rural) from images.
See my answer here: How to match texture similarity in images?
You can use the same method; I have solved problems like the one you describe with it in the past.
The problem you are describing is that of scene categorization. Search for works that use the SUN database.
However, you are only working with two relatively different categories, so I don't think you need to kill yourself implementing state-of-the-art algorithms. I think taking GIST features + color features and training a non-linear SVM would do the trick.
Urban environments are usually characterized by a lot of horizontal and vertical lines, and GIST captures that information.
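GIST itself needs an external implementation, but to illustrate the training side, here is a sketch that uses plain colour-histogram features with an RBF (non-linear) SVM in scikit-learn; a GIST descriptor would simply be concatenated onto the same feature vector:

    import cv2
    import numpy as np
    from sklearn.svm import SVC

    def color_hist_feature(image, bins=8):
        """3D colour histogram, normalised and flattened into a feature vector."""
        hist = cv2.calcHist([image], [0, 1, 2], None, [bins] * 3,
                            [0, 256, 0, 256, 0, 256])
        return cv2.normalize(hist, hist).flatten()

    def train(images, labels):
        """images: list of BGR arrays; labels: e.g. 0 = desert, 1 = city."""
        X = np.array([color_hist_feature(im) for im in images])
        clf = SVC(kernel="rbf", gamma="scale")   # non-linear SVM
        clf.fit(X, labels)
        return clf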
Is it possible to compare two intensity histograms (derived from gray-scale images) and obtain a likeness factor? In other words, I'm trying to detect the presence or absence of a soccer ball in an image. I've tried feature detection algorithms (such as SIFT/SURF) but they are not reliable enough for my application. I need something very reliable and robust.
Many thanks for your thoughts everyone.
This answer (Comparing two histograms) might help you. Generally, intensity comparisons are quite sensitive; for example, white during the day looks different from white at night.
I think you should be able to derive something from compareHist() in OpenCV (http://docs.opencv.org/doc/tutorials/imgproc/histograms/histogram_comparison/histogram_comparison.html) to suit your needs, if compareHist() fits your purpose.
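For reference, here is a minimal compareHist() sketch in Python; the bin count and comparison metric are arbitrary choices you would adjust:

    import cv2

    def histogram_similarity(gray_a, gray_b, bins=64):
        """Compare intensity histograms of two grayscale images (1.0 means identical)."""
        ha = cv2.calcHist([gray_a], [0], None, [bins], [0, 256])
        hb = cv2.calcHist([gray_b], [0], None, [bins], [0, 256])
        cv2.normalize(ha, ha)
        cv2.normalize(hb, hb)
        return cv2.compareHist(ha, hb, cv2.HISTCMP_CORREL)  # correlation in [-1, 1]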
If compareHist() does not do the job, this paper http://www.researchgate.net/publication/222417618_Tracking_the_soccer_ball_using_multiple_fixed_cameras/file/32bfe512f7e5c13133.pdf
tracks the ball from multiple cameras and you might get some more ideas from that even though you might not be using multiple cameras.
As kkuilla mentioned, there is an available method to compare histograms, such as compareHist() in OpenCV.
But I am not certain it is really applicable to your program. I think you will want to use the Hough transform to detect circles.
More details can be seen in this paper:
https://files.nyu.edu/jb4457/public/files/research/bristol/hough-report.pdf
Look at the part about coins for circle detection in the paper. I recall reading somewhere about doing ball detection with the Hough transform too; I can't find it now, but it should be similar for your soccer ball.
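Here is a minimal HoughCircles sketch with OpenCV; the parameters are placeholders and would need tuning for your ball size and image resolution:

    import cv2

    def find_ball(gray):
        """Return detected circles as (x, y, radius) triples, or None if nothing is found."""
        blurred = cv2.medianBlur(gray, 5)   # reduce noise before the transform
        circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
                                   param1=100, param2=30, minRadius=10, maxRadius=80)
        return circles[0] if circles is not None else None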
This method should work. Hope this helps. Good luck(:
I am new to image processing and I am using k-means for clustering for my assignment. I am having an issue: my friend told me that to use k-means in OpenCV, we need to pass only the color of the object of interest and not the whole image.
This has confused me, as I am not sure how to obtain the color composition before applying k-means. Sorry about my English; I will give an example. I have a picture with several colors, and let's say I want to obtain the blue cluster, which is a car. So does that mean I need to pass only the color blue to k-means?
Maybe I am totally wrong about this, since I am unsure and have been struggling for several days now. I think I need a thorough explanation from an expert, which I hope to get here.
Thank you for your time.
To point you in the right direction, below are some hints (a short sketch follows the list):
you will pass all the pixels to k-means along with the desired number of clusters (or groups) to find
k-means will process your data and cluster it into the specified number of clusters
you will take the pixels in the blue cluster (for example) and do what you want with them.
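Here is a minimal sketch of that workflow with cv2.kmeans; the value of k and the rule for picking the "blue" cluster are assumptions you would adapt to your own image:

    import cv2
    import numpy as np

    img = cv2.imread("car.png")                     # BGR image
    pixels = img.reshape(-1, 3).astype(np.float32)  # all pixels, one row per pixel

    k = 4                                           # desired number of clusters
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, centers = cv2.kmeans(pixels, k, None, criteria, 10,
                                    cv2.KMEANS_RANDOM_CENTERS)

    # pick the cluster whose centre is "most blue" (high B relative to G and R in BGR)
    blue_idx = np.argmax(centers[:, 0] - centers[:, 1:].mean(axis=1))
    mask = (labels.ravel() == blue_idx).reshape(img.shape[:2])  # pixels of the blue cluster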
I hope this will help ;)