Counting number of bright spots in image (python) - opencv

I'm trying to develop a way to count the number of bright spots in an image. The spots should be Gaussian point sources, but there is a lot of noise. There are probably on the order of 10-20 actual point sources in this image. My first thought was to use a Gaussian convolution with sigma = 15, which seems to do a good job.
First, is there a better way to isolate these bright spots?
Second, how can I 'detect' the bright spots, i.e. count them? I haven't had any luck with circular Hough transforms from OpenCV.
Edit: Here is the original without gridlines, here is the convolved image without gridlines.

I am working with thermal infrared images, which are subject to a lot of noise.
I found that low-rank approaches, such as those based on Singular Value Decomposition (SVD) or Weighted Nuclear Norm Minimization (WNNM), give very good results in terms of reducing the noise while preserving the structure of the information.
Their main drawback is that they are quite slow to compute (several minutes per image).
Here is some literature:
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7067415
https://arxiv.org/abs/1705.09912
The second paper has some MATLAB code available. There are quite a lot of files, but the translation to Python should not be that complex.
OpenCV also provides (and it is available in Python) a very efficient implementation of the Non-Local Means denoising algorithm:
https://docs.opencv.org/master/d5/d69/tutorial_py_non_local_means.html
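For the counting part of the question, one simple option is to smooth the image, threshold it, and count the connected blobs that remain. The sketch below does this with NumPy only, on a synthetic test image. The function names, the sigma of 2, the "mean plus 3 standard deviations" threshold, and the synthetic spots are all assumptions for illustration and would need tuning on real data.

```python
from collections import deque
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian blur using only NumPy (zero padding at the borders)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    rows = np.apply_along_axis(np.convolve, 1, image, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

def count_blobs(mask):
    """Count 4-connected regions of True pixels with a breadth-first flood fill."""
    seen = np.zeros(mask.shape, dtype=bool)
    count = 0
    for start in zip(*np.nonzero(mask)):
        if seen[start]:
            continue
        count += 1
        seen[start] = True
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= nr < mask.shape[0] and 0 <= nc < mask.shape[1]
                if inside and mask[nr, nc] and not seen[nr, nc]:
                    seen[nr, nc] = True
                    queue.append((nr, nc))
    return count

def count_spots(image, sigma=2.0, n_std=3.0):
    """Blur, keep pixels well above the background level, count the blobs."""
    blurred = gaussian_blur(image, sigma)
    threshold = blurred.mean() + n_std * blurred.std()  # assumed threshold rule
    return count_blobs(blurred > threshold)

# Synthetic test image: three Gaussian spots on a noisy background.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
image = rng.normal(0.0, 0.3, size=(64, 64))
for cy, cx in [(15, 15), (40, 20), (30, 50)]:
    image += 5.0 * np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * 2.0**2))

n_spots = count_spots(image)
print(n_spots)
```

On real images the denoising step could be swapped for Non-Local Means or a low-rank method, with the thresholding and blob counting left unchanged.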

Related

Calculate similarity of picture and its sketch

I'm trying to develop an algorithm that returns a similarity score for two given black-and-white images: an original and its sketch, drawn by a human:
All the original images have the same style, but they do not come from any limited set. Their content could be totally different.
I've tried a few approaches, but none of them has been successful yet:
OpenCV template matching
OpenCV's matchTemplate is not able to calculate a similarity score for the images. It can only tell me the count of matched pixels, and this value is usually quite low because of the imperfect proportions of a human's sketch.
OpenCV feature matching
I failed with this method because I couldn't find good algorithms for extracting significant features from a human's sketch. The algorithms from OpenCV's tutorials are good at extracting corners and blobs as features. But here, in sketches, we have a lot of strokes; each of them produces a lot of insignificant, junk features and leads to fuzzy results.
Neural Network Classification
I also took a look at neural networks. They are good at image classification, but they need training sets for each class, which is impossible here because we have an unlimited set of possible images.
Which methods and algorithms would you use for this kind of task?
METHOD 1
For non-negative vectors such as pixel arrays, cosine similarity gives a score between 0 and 1.
I first converted the images to grayscale and binarized them. I cropped the original image to half the size and excluded the text as shown below:
I then converted the image arrays to 1D arrays using flatten(). I used the following to compute the cosine measure:
from scipy import spatial
# note: spatial.distance.cosine returns the cosine distance, i.e. 1 - cosine similarity
result = spatial.distance.cosine(im2, im1)
print(result)
The result I obtained was 0.999999988431, which I read as the images being similar to each other. Be aware, though, that spatial.distance.cosine returns the cosine distance, so the similarity itself is 1 - result.
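To make the difference between cosine similarity and cosine distance concrete, here is a minimal NumPy sketch. The two 3x3 arrays are made-up stand-ins for the binarized original and sketch, and `cosine_similarity` is a helper written for this example, not a SciPy function.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two images, treated as flat vectors."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical binarized images (1 = stroke pixel, 0 = background).
original = np.array([[1, 1, 0],
                     [0, 1, 0],
                     [0, 1, 1]])
sketch = np.array([[1, 1, 0],
                   [0, 1, 0],
                   [0, 1, 0]])

similarity = cosine_similarity(original, sketch)
distance = 1.0 - similarity  # the quantity scipy.spatial.distance.cosine reports
print(similarity, distance)
```

Identical images give a similarity of 1 (distance 0), so a value close to 1 only means "similar" if it is the similarity and not the distance.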
EDIT
METHOD 2
I had the time to check out another solution. I figured out that OpenCV's cv2.matchTemplate() function performs the same job.
If you check out THIS DOCUMENTATION PAGE you will come across the different parameters used.
I used the cv2.TM_SQDIFF_NORMED parameter (which gives the normalized squared difference between the two images).
import cv2
# th1 and th2 are the two binarized images (assumed to be the same size, so res holds a single value)
res = cv2.matchTemplate(th1, th2, cv2.TM_SQDIFF_NORMED)
print(1 - res)
For the given images I obtained a similarity score of: 0.89689457
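For two images of the same size, the normalized squared difference reduces to a single number. As far as I understand the OpenCV documentation for TM_SQDIFF_NORMED, it is the sum of squared differences divided by the square root of the product of the two images' sums of squares. The NumPy sketch below computes that quantity directly; the arrays are made-up examples and `sqdiff_normed` is a hypothetical helper, not an OpenCV call.

```python
import numpy as np

def sqdiff_normed(a, b):
    """Normalized squared difference between two same-size images (0 = identical)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    numerator = np.sum((a - b) ** 2)
    denominator = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    return float(numerator / denominator)

# Hypothetical binarized images (1 = stroke pixel, 0 = background).
th1 = np.array([[1, 1, 0],
                [0, 1, 0],
                [0, 1, 1]])
th2 = np.array([[1, 1, 0],
                [0, 1, 0],
                [0, 1, 0]])

score = 1.0 - sqdiff_normed(th1, th2)  # same "1 - res" convention as above
print(score)
```

The denominator is zero if either image is entirely black, so that case needs separate handling.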

Image segmentation with maxflow

I have to do a foreground/background segmentation using the maxflow algorithm in C++ (http://wiki.icub.org/iCub/contrib/dox/html/poeticon_2src_2objSeg_2src_2maxflow-v3_802_2maxflow_8cpp_source.html). I get an array of pixels from a PNG file according to their RGB values, but what are the next steps? How could I use this algorithm for my problem?
I recognize that source very well. That's the Boykov-Kolmogorov Graph Cuts library. What I would recommend you do first is read their paper.
Graph Cuts is an interactive image segmentation algorithm. You mark pixels in your image on what you believe belong to the object (a.k.a. foreground) and what don't belong to the object (a.k.a the background). That's what you need first. Once you do this, the Graph Cuts algorithm best guesses what the labels of the other pixels are in the image. It basically goes through each of the other pixels that are not labeled and figures out whether or not they belong to foreground and background.
The whole premise behind Graph Cuts is that image segmentation is akin to energy minimization. Image segmentation can be formulated as a cost function with a summation of two terms:
Self-Penalty: This is the cost of assigning each pixel as either foreground or background. This is also known as a data cost.
Neighbouring Penalties: This enforces that neighbouring pixels more or less should share the same classification label. This is also known as a smoothness cost.
This kind of formulation is well known as the Maximum A Posteriori Markov Random Field classification problem (MAP-MRF). The goal is to minimize that cost function so that you achieve the best image segmentation possible. In general this is an NP-hard problem, and whether such problems can be solved efficiently (P versus NP) is one of the Millennium Prize Problems that the Clay Mathematics Institute has put money on. The two-label foreground/background case is an exception: with suitable smoothness costs it can be minimized exactly.
Boykov and Kolmogorov theoretically proved that the MAP-MRF problem can be translated into graph theory, and solving the MAP-MRF problem is akin to taking your image and forming it into a graph with source and sink links, as well as links that connect neighbouring pixels together. To solve the MAP-MRF, you perform the maximum-flow/minimum-cut algorithm. There are many ways to do this, but Boykov and Kolmogorov found a way that is much faster on typical vision problems than more established algorithms such as Push-Relabel, Ford-Fulkerson, etc.
The self penalties are what are known as t links, while the neighbouring penalties are what are known as n links. You should read up the paper to figure out how these are computed, but the t links describe the classification penalty. Basically, how much it would cost to classify each pixel as belonging to the foreground or the background. These are usually based on the negative log probability distributions of the image. What you do is you create a histogram of the distribution of what was classified as foreground and a histogram of what was classified as background.
Usually, a uniform quantization of each colour channel for both foreground and background suffices. You then turn these into PDFs by dividing by the total number of elements in each histogram. Then, when you calculate the t-links for each pixel, you access the colour, see where it lies in the histogram, and take the negative log. This will tell you how much it will cost to classify that pixel as either foreground or background.
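As a rough illustration of the t-link computation described above, here is a NumPy sketch. The 8 bins per channel, the seed colours, the clipping value `eps`, and the helper names are assumptions made for this example; they are not taken from the paper or from the maxflow library.

```python
import numpy as np

BINS = 8  # assumed number of uniform bins per colour channel

def bin_index(pixels):
    """Map (n, 3) RGB pixels with values 0..255 to a flat histogram bin index."""
    q = (np.asarray(pixels, dtype=np.int64) * BINS) // 256
    return (q[:, 0] * BINS + q[:, 1]) * BINS + q[:, 2]

def colour_pdf(seed_pixels):
    """Histogram of the seed pixels, divided by their count to give a PDF."""
    counts = np.bincount(bin_index(seed_pixels), minlength=BINS ** 3)
    return counts.astype(np.float64) / counts.sum()

def t_link_costs(pixels, fg_pdf, bg_pdf, eps=1e-6):
    """Negative log probability of each pixel under the two colour models."""
    idx = bin_index(pixels)
    cost_fg = -np.log(np.maximum(fg_pdf[idx], eps))  # eps avoids log(0)
    cost_bg = -np.log(np.maximum(bg_pdf[idx], eps))
    return cost_fg, cost_bg

# Hypothetical user-marked seeds: reddish foreground, bluish background.
fg_seeds = np.array([[200, 20, 20], [220, 10, 25], [210, 25, 15]])
bg_seeds = np.array([[20, 25, 200], [10, 15, 220], [25, 20, 210]])
fg_pdf, bg_pdf = colour_pdf(fg_seeds), colour_pdf(bg_seeds)

# Two unlabeled pixels: one reddish, one bluish.
unlabeled = np.array([[215, 15, 20], [15, 20, 205]])
cost_fg, cost_bg = t_link_costs(unlabeled, fg_pdf, bg_pdf)
print(cost_fg, cost_bg)
```

A low `cost_fg` means the pixel is cheap to label as foreground, so the reddish pixel gets a low foreground cost and a high background cost, and the bluish pixel the reverse.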
The neighbouring pixel costs are more intuitive. People usually just take the Euclidean distance between one pixel and a neighbouring pixel and apply this distance to a Gaussian. To make things simple, a 4 pixel neighbourhood is what is usually used (North, South, East and West).
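A minimal NumPy sketch of these neighbour weights follows, assuming an RGB image, a 4-pixel neighbourhood, and a Gaussian of the form exp(-d^2 / (2 sigma^2)). The value of `sigma` and the function name are assumptions for illustration.

```python
import numpy as np

def n_link_weights(image, sigma=10.0):
    """Gaussian weights between 4-connected neighbours of an (H, W, 3) image.

    Returns (east, south): east[r, c] links pixel (r, c) to (r, c + 1), and
    south[r, c] links pixel (r, c) to (r + 1, c). Each undirected link is
    stored once, so West and North are covered by the same arrays.
    """
    img = np.asarray(image, dtype=np.float64)
    d2_east = np.sum((img[:, 1:] - img[:, :-1]) ** 2, axis=-1)   # squared Euclidean distance
    d2_south = np.sum((img[1:, :] - img[:-1, :]) ** 2, axis=-1)
    east = np.exp(-d2_east / (2.0 * sigma ** 2))
    south = np.exp(-d2_south / (2.0 * sigma ** 2))
    return east, south

# Tiny 2x3 image: two black columns followed by one white column.
image = np.zeros((2, 3, 3))
image[:, 2] = 255.0
east, south = n_link_weights(image)
print(east)
print(south)
```

Similar neighbours get a weight near 1 (expensive to separate), while neighbours across a strong edge get a weight near 0 (cheap to cut).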
Once you figure out how to compute the costs, you follow this procedure:
1. Mark pixels as foreground or background.
2. Create a graph structure using their library.
3. Compute the histograms of the foreground and background pixels.
4. Calculate t-links and add them to the graph.
5. Calculate n-links and add them to the graph.
6. Invoke the maxflow routine on the graph to segment the image.
7. Go through each pixel and figure out whether the pixel belongs to the foreground or the background.
8. Create a binary map that reflects this, then copy over the image pixels where the binary map is true and leave them out where it's false.
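The procedure above can be sketched end to end on a toy problem. The example below is not the Boykov-Kolmogorov library: it swaps in a plain Edmonds-Karp max-flow written in pure Python, and it uses a 6-pixel, one-dimensional grayscale "image" so the result is easy to check by eye. The bin count, smoothing constant, sigma, and seed choices are all assumptions.

```python
from collections import defaultdict, deque
import math

def min_cut_source_side(capacity, source, sink):
    """Edmonds-Karp max-flow; returns the nodes still reachable from the source."""
    residual = defaultdict(dict)
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual[u][v] = residual[u].get(v, 0.0) + cap
            residual[v].setdefault(u, 0.0)  # reverse edge for undoing flow
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue:  # breadth-first search for an augmenting path
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return set(parent)  # no path left: this is the source side of the cut
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(residual[a][b] for a, b in path)
        for a, b in path:
            residual[a][b] -= flow
            residual[b][a] += flow

pixels = [10, 20, 15, 200, 210, 205]           # toy grayscale "image"
fg_seeds, bg_seeds = {5}, {0}                   # step 1: user marks
BINS, EPS, SIGMA, HARD = 4, 1e-3, 30.0, 1e9     # assumed parameters

def pdf(values):
    """Step 3: smoothed histogram, so no bin has probability exactly 0."""
    counts = [EPS] * BINS
    for v in values:
        counts[v * BINS // 256] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

fg_pdf = pdf([pixels[i] for i in fg_seeds])
bg_pdf = pdf([pixels[i] for i in bg_seeds])

n = len(pixels)
SOURCE, SINK = n, n + 1                         # source = foreground terminal
capacity = defaultdict(dict)                    # step 2: graph structure
for i, v in enumerate(pixels):                  # step 4: t-links
    b = v * BINS // 256
    cost_fg, cost_bg = -math.log(fg_pdf[b]), -math.log(bg_pdf[b])
    if i in fg_seeds:
        cost_fg, cost_bg = 0.0, HARD
    if i in bg_seeds:
        cost_fg, cost_bg = HARD, 0.0
    capacity[SOURCE][i] = cost_bg               # paid if pixel i ends up background
    capacity[i][SINK] = cost_fg                 # paid if pixel i ends up foreground
for i in range(n - 1):                          # step 5: n-links between neighbours
    w = math.exp(-((pixels[i] - pixels[i + 1]) ** 2) / (2.0 * SIGMA ** 2))
    capacity[i][i + 1] = w
    capacity[i + 1][i] = w

source_side = min_cut_source_side(capacity, SOURCE, SINK)   # step 6
labels = [1 if i in source_side else 0 for i in range(n)]   # steps 7-8: binary map
print(labels)
```

With the real library, the hand-written `min_cut_source_side` would be replaced by its graph class and maxflow call; the way the t-links and n-links are built stays conceptually the same.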
The original source of maxflow can be found here: http://pub.ist.ac.at/~vnk/software/maxflow-v3.03.src.zip
It also has a README so you can see how the library is supposed to work given some example images.
You have a lot to digest, but Graph Cuts is one of the most powerful interactive segmentation tools out there.
Good luck!

Water Edge Detection

Is there a robust way to detect the water line, like the edge of a river in this image, in OpenCV?
(source: pequannockriver.org)
This task is challenging because a combination of techniques must be used. Furthermore, for each technique, the numerical parameters may only work correctly for a very narrow range. This means either a human expert must tune them by trial-and-error for each image, or that the technique must be executed many times with many different parameters, in order for the correct result to be selected.
The following outline is highly-specific to this sample image. It might not work with any other images.
One bit of advice: As usual, any multi-step image analysis should always begin with the most reliable step, and then proceed down to the less reliable steps. Whenever possible, the less reliable step should make use of the result of more-reliable steps to augment its own accuracy.
Detection of sky
Convert image to HSV colorspace, and find the cyan located at the upper-half of the image.
Keep this HSV image, because it could be handy for the next few steps as well.
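A minimal sketch of the sky step, using the standard-library `colorsys` module and NumPy instead of OpenCV so that it runs without extra dependencies. The hue range for "cyan", the saturation and value limits, and the tiny test image are assumptions; on a real photo these thresholds would need tuning, and a vectorized conversion (for example OpenCV's colour conversion) would be much faster than the per-pixel loop.

```python
import colorsys
import numpy as np

def sky_mask(rgb, hue_range=(0.45, 0.70), min_sat=0.2, min_val=0.5):
    """Mark cyan/blue pixels in the upper half of an (H, W, 3) uint8 RGB image."""
    height, width, _ = rgb.shape
    mask = np.zeros((height, width), dtype=bool)
    for r in range(height // 2):              # upper half only
        for c in range(width):
            hue, sat, val = colorsys.rgb_to_hsv(*(rgb[r, c] / 255.0))
            in_hue = hue_range[0] <= hue <= hue_range[1]
            mask[r, c] = in_hue and sat >= min_sat and val >= min_val
    return mask

# Tiny test image: two sky-blue rows, one green row, one sky-blue row
# (standing in for water that reflects the sky).
sky_blue, green = (135, 206, 235), (34, 139, 34)
image = np.array([[sky_blue] * 2, [sky_blue] * 2, [green] * 2, [sky_blue] * 2],
                 dtype=np.uint8)
mask = sky_mask(image)
print(mask)
```

The bottom row has the same colour as the sky but is not marked, because only the upper half of the image is searched.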
Detection of shrubs
Run Canny edge detection on the grayscale version of image, with suitably chosen sigma and thresholds. This will pick up the branches on the shrubs, which would look like a bunch of noise. Meanwhile, the water surface would be relatively smooth.
Grayscale is used in this technique in order to reduce the influence of reflections on the water surface (the green and yellow reflections from the shrubs). There might be other colorspaces (or preprocessing techniques) more capable of removing that reflection.
Detection of water ripples from a lower elevation angle viewpoint
Firstly, mark off any image parts that are already classified as shrubs or sky. Since shrub detection would be more reliable than water detection, shrub detection's result should be used to inform the less-reliable water detection.
Observation
Because of the low elevation angle viewpoint, the water ripples appear horizontally elongated. In fact, every image feature appears stretched horizontally. This is called Anisotropy. We could make use of this tendency to detect them.
Note: I am not experienced in anisotropy detection. Perhaps you can get better ideas from other people.
Idea 1:
Use maximally-stable extremal regions (MSER) as a blob detector.
The Wikipedia introduction appears intimidating, but it is really related to connected-component algorithms. A naive implementation can be done similar to Dijkstra's algorithm.
Idea 2:
Since the image features are horizontally stretched, a simpler approach is to sum up the absolute values of the horizontal gradients and compare that to the sum of the absolute values of the vertical gradients.
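Idea 2 is easy to try out. The sketch below uses simple finite differences with NumPy on a synthetic striped patch; the function name and the test pattern are assumptions. Horizontally stretched features change slowly from left to right, so the horizontal gradient sum should come out much smaller than the vertical one.

```python
import numpy as np

def gradient_sums(gray):
    """Return (sum of |horizontal gradient|, sum of |vertical gradient|)."""
    gray = np.asarray(gray, dtype=np.float64)
    horizontal = np.abs(np.diff(gray, axis=1)).sum()  # change from left to right
    vertical = np.abs(np.diff(gray, axis=0)).sum()    # change from top to bottom
    return horizontal, vertical

# Synthetic "ripples": rows alternate between dark and bright.
ripples = np.tile(np.array([[0.0], [1.0]]), (4, 8))   # shape (8, 8)
horizontal, vertical = gradient_sums(ripples)
print(horizontal, vertical)
```

In practice this would be evaluated per block or per region, and a high vertical-to-horizontal ratio would count as evidence for water.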

Fast and quick pixel matching algorithm

I am stuck on a pixel matching algorithm for finding symbols in an image. I have two images of symbols that I intend to find in a large, high-resolution image.
Instead of a pixel-by-pixel matching algorithm, is there a fast algorithm that gives the same result? The result should be similar to: (number of pixels matched) divided by (total pixels).
My problem is that I wish to find certain symbols in a 1-bit image. The symbols appear almost exactly in the target image: about 95% of the pixels match the target block in the image. But it takes hours to do the iterations. The image is 10k x 10k and the symbol size is 20 x 20, so it takes on the order of 10^10 pixel comparisons, which is too much to handle. Is there any filter/NN combination or other algorithm that can give the same results as pixel matching in a few minutes?
The point here is that the pixels are almost the same, but the problem is that the size is very large. I do not want complex features for handling noise, edges, fuzziness, etc.; just a simple algorithm to do pixel matching quickly, with a result similar to: (number of pixels matched) divided by (total pixels).
Object recognition is tricky in that any simple algorithm is generally going to be way too slow, as you've apparently realized.
Luckily, if you have a rather large collection of these images on hand that are already correctly labeled, then I have a very simple solution for you.
Simply make a 3-layer feedforward network with one input unit per pixel, all of which connect to a much smaller hidden layer, which in turn connects to 1 output unit (representing which symbol is present in the image). Then just run the backpropagation algorithm on your dataset until the network learns to identify the symbols.
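A minimal NumPy sketch of such a network, trained with plain full-batch backpropagation on two made-up 5x5 symbols. The layer size, learning rate, epoch count, noise level, and the symbols themselves are assumptions for illustration; a real setup would use labeled 20x20 patches cut from the actual images.

```python
import numpy as np

def train_mlp(X, y, hidden=8, lr=0.5, epochs=500, seed=0):
    """Input layer -> tanh hidden layer -> one sigmoid output, trained by backprop."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1, b1 = rng.normal(0.0, 0.1, (d, hidden)), np.zeros(hidden)
    W2, b2 = rng.normal(0.0, 0.1, (hidden, 1)), np.zeros(1)

    def forward(inputs):
        hid = np.tanh(inputs @ W1 + b1)
        out = 1.0 / (1.0 + np.exp(-(hid @ W2 + b2)))
        return hid, out

    losses = []
    for _ in range(epochs):
        hid, out = forward(X)
        loss = -np.mean(y * np.log(out + 1e-12) + (1 - y) * np.log(1 - out + 1e-12))
        losses.append(float(loss))
        d_out = (out - y) / n                         # gradient at the output unit
        d_hid = (d_out @ W2.T) * (1.0 - hid ** 2)     # backpropagated through tanh
        W2 -= lr * (hid.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)

    return (lambda inputs: forward(inputs)[1]), losses

# Two hypothetical 5x5 symbols: a plus sign (label 0) and a square outline (label 1).
plus = np.zeros((5, 5)); plus[2, :] = 1; plus[:, 2] = 1
square = np.zeros((5, 5)); square[[0, -1], :] = 1; square[:, [0, -1]] = 1

rng = np.random.default_rng(1)

def noisy_copies(symbol, count, flip_prob=0.05):
    """Training samples: the symbol with a few randomly flipped pixels."""
    flat = symbol.ravel()
    flips = rng.random((count, flat.size)) < flip_prob
    return np.where(flips, 1.0 - flat, flat)

X = np.vstack([noisy_copies(plus, 20), noisy_copies(square, 20)])
y = np.concatenate([np.zeros(20), np.ones(20)]).reshape(-1, 1)

predict, losses = train_mlp(X, y)
p_plus = float(predict(plus.reshape(1, -1))[0, 0])
p_square = float(predict(square.reshape(1, -1))[0, 0])
print(losses[0], losses[-1], p_plus, p_square)
```

The single output is read as "probability of the second symbol", so values below 0.5 mean the first symbol and values above 0.5 the second.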
Unfortunately, this doesn't scale very well, so you might have to look into convolutional NNs for better performance.
Additionally, if you don't have any training data (i.e. labeled examples), then your best bet is probably to decompose your symbols into features and then sweep the image for those. If you can decompose them into lines, then a Hough transform can do this quite rapidly.
Maybe an Adaptive Resonance Theory (ART-1) network could help.
The algorithm can also be written so that all prototypes are checked in parallel at the same time, and it can be blazing fast because it essentially uses binary math a lot.
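For reference, the score from the question, (number of pixels matched) divided by (total pixels), can be computed for every offset at once without any training, because for binary images the match count is a sum of two cross-correlations. This is a different technique from the ones suggested above, sketched here with NumPy's FFT on a small random image; the function name and the test data are assumptions.

```python
import numpy as np

def match_fraction(image, template):
    """Fraction of matching pixels at every valid offset of a 0/1 template in a 0/1 image.

    matches = corr(image, template) + corr(1 - image, 1 - template),
    i.e. positions where both are 1 plus positions where both are 0.
    """
    image = np.asarray(image, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    H, W = image.shape
    h, w = template.shape

    def correlate(a, b):
        fa = np.fft.rfft2(a)
        fb = np.fft.rfft2(b, s=a.shape)              # zero-pad b to the image size
        full = np.fft.irfft2(fa * np.conj(fb), s=a.shape)
        return full[:H - h + 1, :W - w + 1]          # keep offsets without wrap-around

    matches = correlate(image, template) + correlate(1.0 - image, 1.0 - template)
    return np.rint(matches) / (h * w)                # rounding removes FFT float error

# Small random binary image with the template cut out of it at row 5, column 7.
rng = np.random.default_rng(0)
image = rng.integers(0, 2, size=(32, 32))
template = image[5:9, 7:11].copy()

scores = match_fraction(image, template)
print(scores[5, 7], scores.shape)
```

Offsets whose score exceeds a chosen threshold (for example the 95% mentioned in the question) are the candidate symbol locations. For a 10k x 10k image the FFTs need a few large floating-point arrays, so processing the image in overlapping tiles may be necessary.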

Face Recognition Logic

I want to develop an application in which the user inputs an image (of a person) and the system is able to identify the face in the image. The system should also work if there is more than one person in the image.
I need the logic; I don't have any idea how to work on image pixel data in such a manner that it identifies people's faces.
Eigenface might be a good algorithm to start with if you're looking to build a system for educational purposes, since it's relatively simple and serves as the starting point for a lot of other algorithms in the field. Basically what you do is take a bunch of face images (training data), switch them to grayscale if they're RGB, resize them so that every image has the same dimensions, make the images into vectors by stacking the columns of the images (which are now 2D matrices) on top of each other, compute the mean of each pixel across all the images (the mean face), and subtract it from every image vector so that the data is centered on the origin. Once that's done, you compute the covariance matrix of the result, solve for its eigenvalues and eigenvectors, and find the principal components. These components will serve as the basis for a vector space, and together describe the most significant ways in which face images differ from one another.
Once you've done that, you can compute a similarity score for a new face image by converting it into a face vector, projecting into the new vector space, and computing the linear distance between it and other projected face vectors.
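The eigenface steps above can be sketched in a few lines of NumPy. Random arrays stand in for real face images here, the principal components are obtained from an SVD of the centered data (which gives the eigenvectors of the covariance matrix without forming it explicitly), and "linear distance" is taken to mean Euclidean distance. The gallery size, image size, and number of components are assumptions.

```python
import numpy as np

def fit_eigenfaces(faces, n_components):
    """faces: (n_images, n_pixels) array, one flattened grayscale image per row."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # Rows of vt are the eigenvectors of the covariance matrix of the centered
    # data, ordered from the largest eigenvalue to the smallest.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:n_components]

def project(face, mean_face, components):
    """Coordinates of one face vector in the eigenface space."""
    return components @ (face - mean_face)

def best_match(probe, gallery, mean_face, components):
    """Index of the gallery face closest to the probe in eigenface space."""
    gallery_weights = (gallery - mean_face) @ components.T
    probe_weights = project(probe, mean_face, components)
    distances = np.linalg.norm(gallery_weights - probe_weights, axis=1)
    return int(np.argmin(distances)), distances

# Stand-in data: 8 random 8x8 "faces", flattened to 64-element vectors.
rng = np.random.default_rng(0)
gallery = rng.random((8, 8, 8)).reshape(8, -1)
mean_face, components = fit_eigenfaces(gallery, n_components=5)

# Probe: a slightly perturbed copy of gallery image 2.
probe = gallery[2] + rng.normal(0.0, 0.01, size=64)
index, distances = best_match(probe, gallery, mean_face, components)
print(index)
```

With real faces, the rows of `components` can be reshaped back to image size and viewed as the eigenfaces themselves.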
If you decide to go this route, be careful to choose face images that were taken under an appropriate range of lighting conditions and pose angles. Those two factors play a huge role in how well your system will perform when presented with new faces. If the training gallery doesn't account for the properties of a probe image, you're going to get nonsense results. (I once trained an eigenface system on random pictures pulled down from the internet, and it gave me Bill Clinton as the strongest match for a picture of Elizabeth II, even though there was another picture of the Queen in the gallery. They both had white hair, were facing in the same direction, and were photographed under similar lighting conditions, and that was good enough for the computer.)
If you want to pull faces from multiple people in the same image, you're going to need a full system to detect faces, pull them into separate files, and preprocess them so that they're comparable with other faces drawn from other pictures. Those are all huge subjects in their own right. I've seen some good work done by people using skin color and texture-based methods to cut out image components that aren't faces, but these are also highly subject to variations in training data. Color casting is particularly hard to control, which is why grayscale conversion and/or wavelet representations of images are popular.
Machine learning is the keystone of many important processes in an FR system, so I can't stress the importance of good training data enough. There are a bunch of learning algorithms out there, but the most important one in my view is the naive Bayes classifier; the other methods converge on Bayes as the size of the training dataset increases, so you only need to get fancy if you plan to work with smaller datasets. Just remember that the quality of your training data will make or break the system as a whole, and as long as it's solid, you can pick whatever trees you like from the forest of algorithms that have been written to support the enterprise.
EDIT: A good sanity check for your training data is to compute average faces for your probe and gallery images. (This is exactly what it sounds like; after controlling for image size, take the sum of the RGB channels for every image and divide each pixel by the number of images.) The better your preprocessing, the more human the average faces will look. If the two average faces look like different people -- different gender, ethnicity, hair color, whatever -- that's a warning sign that your training data may not be appropriate for what you have in mind.
Have a look at the Face Recognition Homepage - there are algorithms, papers, and even some source code.
There are many, many different algorithms out there. Basically what you are looking for is "computer vision". We did a project at university based around facial recognition and detection. What you need to do is google extensively and try to understand all this stuff. There is a bit of mathematics involved, so be prepared. First go to Wikipedia. Then you will want to search for PDF publications of specific algorithms.
You can go the hard way: write an implementation of all the algorithms yourself. Or the easy way: use a computer vision library like OpenCV or OpenVIDIA.
And actually it is not that hard to make something that will work. So be brave. It is a lot harder to make software that will work under different and constantly varying conditions. And that is where Google won't help you. But I suppose you don't want to go that deep.
