I am new to image processing and I am using k-means for clustering for my assignment. I am having an issue where my friend told me that to use k-mean in opencv, we need to only pass the color of the object of interest and not the whole image.
This has confused me as I am not sure how to obtain the color composition before apply kmeans. Sorry about my English, I would give an example. I have a picture with several colors and lets say I want to obtain the blue cluster which is a car. So does it means that I need to pass only the color blue to the kmeans.
Maybe I am totally wrong in this since I am unsure and i have been struggling for several days now. I think I need thorough explanation from some expert whom i think i will get it here.
Thank you for your time.
In order to put you in the right direction, below are some hints:
you will pass all the pixels to k-means alongwith the desired number of clusters (or groups) to find
k-means will process you data and will cluster you data in the specified number of clusters
you will take the pixels in blue cluster (e.g.) to do what you want to do.
I hope this will help ;)
Related
This might be a very broad question so I'm sorry in advance. I'd like to also point out I'm new in the CV field, so my insight in this field is minimum.
I am trying to find correspondences between points from a FLIR image and a VIS image. I'm currently building 40x40 pixels regions around keypoints, over which I'm applying the LoG. I'm trying to compare them to find the most similar regions.
For example, I have these data sets:
Where the columns represent, in this order:
the image for which I'm trying to find a correspondent
the candidate images
the LoG of the first column
the LoG of the second column
It is very clear, for the human eye, that the third image is the best match for the first set, while the first image is the best image for the second set.
I have tried various ways of expressing a similarity/disimilarity between these images, such as SSD, Cross Correlation, or Mutual Information, but they all fail to be consistent (they only work in some cases).
Now, my actual question is:
What should I use to express the similarity between images in a more semantic way, such that shapes would be more important in deciding the best match, rather than actual intensities of the pixels? Do you know of any technique that would aid me in my quest of finding these matches?
Thank you!
Note: I'm using OpenCV with Python right now, but the programming language and library is not important.
Im trying implement a real time object classification program using SVM classification and BoW clustering algorithms. My questions is what are the good practices for selecting positive and negative training images?
Positive image sets
Should the background be empty? Meaning, should the image only contain the object of interest? When implementing this algorithm in real time, the test image will not contain only the object of interest, it will definitely have some information from the background as well. So instead of using isolated image collection, should I choose images which look more similar to the test images?
Negative image sets
Can these be any image set without the object of interest? Or should they be from the environment where this algorithm is going to be tested without object of interest?. For example, if I'm going to classify phones in my living room environment, should negatives be the background image set of my living room environment without the phone in the foreground? or can it be any image set? (like kitchen, living room, bedroom or outdoor images) Im asking this because, I don't want the system to be environment-specific. Must be robust at any environment (indoors and outdoors)
Thank you. Any help or advice is much appreciated.
Positive image sets
Yes you should definitely choose images which look more similar to the test images.
Negative image sets
It can be any image set however, it is better to include images from the environment where this algorithm is going to be tested without object of interest.
Generally
Please read my answer to some other SO question, it would be useful. Discussion continued in comments, so that might be useful as well.
I have some color images needing segmentation. They are images of slides that are stained with hematoxylin and eosin ("H&E").
I found this method for color deconvolution by Ruifrok
http://europepmc.org/abstract/med/11531144
that separates out the images by color.
However it seems that you can do something similar just by using K-means clustering:
http://www.mathworks.com/help/images/examples/color-based-segmentation-using-k-means-clustering.html
I am curious what the difference is. Any insight would be welcome. Thanks.
I can't seem to find a copy of the article (well without paying) but they are not exactly the same.
K means seeks to cluster data. So if you just want to find dominant colors in an image, or do some sorting based on colors, this is the way to go. As a side note: Kmeans can be used on any vector. Its not confined to color, so you can use it for many other applications.
Color Deconvolution is trying to remove the effects of chemical dyes commonly used for microscopy. (If I understood the abstract properly). Based on the specific dye used, the algorithm tries to reverse its effects and give you the original color image back (before the dye was added). I found this website that shows some output. This is deconvolving the dye contribution to the RGB spectrum. It doesn't do any clustering/grouping (other than finding the dye)
Hope that helps
EDIT
If you didn't know, convolution is most often associated with signals/image processing. Basically you take a filter and run it over a signal. The output is a modified version of the original input. In this case, the original image is filtered by a dye with known RGB values. IF we know the full characteristics of the dye/filter we can invert it. Then by running the convolution again using the inverse filter we can hopefully de -convolve the effect. In principle it sounds simple enough, but in many cases this isn't possible.
I am looking for an algorithm or, even better, some library that covers background substraction from a single static image (no background model available). What whould be possible though is some kind of user input like for example https://clippingmagic.com does it.
Sadly my google fu is bad here as i cant find any papers on that topic with my limited amount of keywords.
That webpage is really impressive. If I were to try and implement something similar I would probably use k-means clustering using the CIELAB colorspace. The reason for changing the colorspace is so that colors can be represented by two points rather than 3 as a regular RGB image. This should speed up clustering. Additionally, the CIELAB color space was build for this purpose, finding "distances" (similarities) between colors and accounts for the way humans perceive color. Not just looking at the raw binary data the computer has.
But a quick overview of kmeans. For this example we will say k=2 (meaning only two clusters)
Initialize each cluster with a mean.
Go through every pixel in your image and decide which mean it is closer to, cluster 1 or 2?
Compute the new mean for your clusters after you've processed all the pixels
using the newly computed means repeat steps 2-4 until convergence (meaning the means don't change very much)
Now that would work well when the foreground image is notably different than the background. Say a red ball in a blue background, but if the colors are similar it would be more problematic. I would still stick to kmeans but have a larger number of clusters. So on that web page you can make multiple red or green selections. I would make each of these strokes a cluster, and intialize my cluster to the mean. So say I drew 3 red strokes, and 2 green ones. That means I'd have 5 groups. But somehow internally I add an extra attribute as foreground/background. So that each cluster will have a small variance, but in the end, I would only display that attribute, foreground or background. I hope that made sense.
Maybe now you have some search terms to start off with. There may be many other methods but this is the first I thought of, good luck.
EDIT
After playing with the website a bit more I see it uses spatial proximity to cluster. So say I had 2 identical red blobs on opposite sides of the image. If I only annotate the left side of the image the blob on the right side might not get detected. Kmeans wouldn't replicate this behavior since the method I described only uses the color to cluster pixels completely oblivious to their location in the image.
I don't know what tools you have at your disposal but here is a nice matlab example/tutorial on color based kmeans
I need to automatically align an image B on top of another image A in such a way, that the contents of the image match as good as possible.
The images can be shifted in x/y directions and rotated up to 5 degrees on z, but they won't be distorted (i.e. scaled or keystoned).
Maybe someone can recommend some good links or books on this topic, or share some thoughts how such an alignment of images could be done.
If there wasn't the rotation problem, then I could simply try to compare rows of pixels with a brute-force method until I find a match, and then I know the offset and can align the image.
Do I need AI for this?
I'm having a hard time finding resources on image processing which go into detail how these alignment-algorithms work.
So what people often do in this case is first find points in the images that match then compute the best transformation matrix with least squares. The point matching is not particularly simple and often times you just use human input for this task, you have to do it all the time for calibrating cameras. Anyway, if you want to fully automate this process you can use feature extraction techniques to find matching points, there are volumes of research papers written on this topic and any standard computer vision text will have a chapter on this. Once you have N matching points, solving for the least squares transformation matrix is pretty straightforward and, again, can be found in any computer vision text, so I'll assume you got that covered.
If you don't want to find point correspondences you could directly optimize the rotation and translation using steepest descent, trouble is this is non-convex so there are no guarantees you will find the correct transformation. You could do random restarts or simulated annealing or any other global optimization tricks on top of this, that would most likely work. I can't find any references to this problem, but it's basically a digital image stabilization algorithm I had to implement it when I took computer vision but that was many years ago, here are the relevant slides though, look at "stabilization revisited". Yes, I know those slides are terrible, I didn't make them :) However, the method for determining the gradient is quite an elegant one, since finite difference is clearly intractable.
Edit: I finally found the paper that went over how to do this here, it's a really great paper and it explains the Lucas-Kanade algorithm very nicely. Also, this site has a whole lot of material and source code on image alignment that will probably be useful.
for aligning the 2 images together you have to carry out image registration technique.
In matlab, write functions for image registration and select your desirable features for reference called 'feature points' using 'control point selection tool' to register images.
Read more about image registration in the matlab help window to understand properly.