Let's say we have a number of color images that are examples of some textured pattern. There is a rare occurrence where this texture is "disrupted" by some foreign object. What would be the best way to detect these rare anomalies?
I thought about training a CNN, but the number of good examples vastly outnumbers the bad examples, so I have my doubts. I started looking into grey-level co-occurrence matrices (GLCM) and local binary patterns (LBP), but I think color information could play an important part in detecting a disruption. Could I estimate the distribution of these extracted features (from either GLCM or LBP) and calculate the probability that a new image belongs to that distribution?
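Concretely, the kind of thing I have in mind (a rough sketch, assuming GLCM contrast/homogeneity/energy/correlation as features and a Gaussian model fitted on the good images only; good_images is a placeholder list, and graycomatrix is the skimage >= 0.19 spelling):

import numpy as np
from skimage.color import rgb2gray
from skimage.util import img_as_ubyte
from skimage.feature import graycomatrix, graycoprops

def glcm_features(img):
    # a small GLCM feature vector for one image (8-bit greyscale)
    g = img_as_ubyte(rgb2gray(img))
    glcm = graycomatrix(g, distances=[1, 3], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ('contrast', 'homogeneity', 'energy', 'correlation')
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

# fit a simple Gaussian model on the "good" images only
good_feats = np.array([glcm_features(img) for img in good_images])
mu = good_feats.mean(axis=0)
cov_inv = np.linalg.pinv(np.cov(good_feats, rowvar=False))

def anomaly_score(img):
    # squared Mahalanobis distance to the "good" distribution; large = likely disrupted
    d = glcm_features(img) - mu
    return float(d @ cov_inv @ d)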
Thanks for your help!
It is difficult to figure out your problem without seeing some sample images. In principle there is a wide variety of approaches you could use to detect texture disruption, such as GLCM features, LBPs, Laws' masks, vector quantization, etc. Measuring the local entropy is one possible way to go. Consider the image below, in which we can clearly distinguish two types of texture:
The following snippet reads the image, computes the local entropy of each pixel over a circular neighbourhood of a given radius (25 pixels here) and displays the result:
from skimage import io
from skimage.filters.rank import entropy
from skimage.morphology import disk

# read the example image (already greyscale; convert with rgb2gray/img_as_ubyte if yours is not)
img = io.imread('https://i.stack.imgur.com/Wv74a.png')

# local entropy over a circular neighbourhood of radius 25 pixels
R = 25
filtered = entropy(img, disk(R))

io.imshow(filtered)
io.show()
The resulting entropy map clearly shows that the local entropy values could be used to detect the texture disruption.
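For instance, a simple threshold on the entropy map already separates the two regions (a rough sketch; Otsu's threshold is just one possible choice):

from skimage import io
from skimage.filters import threshold_otsu

# `filtered` is the entropy map computed in the snippet above
mask = filtered > threshold_otsu(filtered)   # high-entropy regions
io.imshow(mask)
io.show()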
Related
I'm trying to develop a way to count the number of bright spots in an image. The spots should be Gaussian point sources, but there is a lot of noise. There are probably on the order of 10-20 actual point sources in this image. My first thought was to use a Gaussian convolution with sigma = 15, which seems to do a good job.
First, is there a better way to isolate these bright spots?
Second, how can I 'detect' the bright spots, i.e. count them? I haven't had any luck with circular Hough transforms from OpenCV.
Edit: Here is the original without gridlines, here is the convolved image without gridlines.
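(Sketch of the smoothing step described above, plus one possible way to count the peaks via local maxima; the filename, min_distance and threshold values below are guesses.)

import numpy as np
from scipy.ndimage import gaussian_filter
from skimage import io
from skimage.feature import peak_local_max

img = io.imread('spots.png', as_gray=True).astype(float)   # placeholder filename

smoothed = gaussian_filter(img, sigma=15)                   # suppress the noise

# local maxima above a threshold are counted as point sources
peaks = peak_local_max(smoothed, min_distance=30,
                       threshold_abs=smoothed.mean() + 2 * smoothed.std())
print(len(peaks), 'bright spots found')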
I am working with thermal infrared images, which are subject to a significant amount of noise.
I found that low-rank approaches, such as those based on Singular Value Decomposition (SVD) or Weighted Nuclear Norm Minimization (WNNM), give very good results in terms of reducing the noise while preserving the structure of the information.
Their main drawback is that they are quite slow to compute (several minutes per image).
Here is some literature:
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7067415
https://arxiv.org/abs/1705.09912
The second paper has some MATLAB code available; there are quite a lot of files, but the translation to Python should not be that complex.
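As a very rough illustration of the low-rank idea (a plain truncated SVD of the whole image matrix, not the WNNM method from the papers; the filename and the rank are placeholders):

import numpy as np
from skimage import io

img = io.imread('thermal.png', as_gray=True).astype(float)   # placeholder filename

# keep only the k largest singular values: a crude low-rank denoiser
U, s, Vt = np.linalg.svd(img, full_matrices=False)
k = 30                                                        # rank to keep, tune per image
denoised = (U[:, :k] * s[:k]) @ Vt[:k, :]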
OpenCV also implements (and it is available in Python) a very efficient version of the Non-Local Means algorithm:
https://docs.opencv.org/master/d5/d69/tutorial_py_non_local_means.html
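A minimal usage sketch (the filter strength h and the window sizes below are just the tutorial's defaults; the filename is a placeholder):

import cv2

img = cv2.imread('thermal.png', cv2.IMREAD_GRAYSCALE)

# Non-Local Means denoising of a single grayscale image
denoised = cv2.fastNlMeansDenoising(img, None, h=10,
                                    templateWindowSize=7, searchWindowSize=21)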
I am using ORB to identify the heads of sprites in a still from a video game. When I first read about the algorithm, it seemed like a great fit for my purpose. However, it doesn't seem to be performing as I expected, and I don't have the intuition or experience to know whether this is a poorly chosen algorithm or whether it's not working because of my implementation.
Here's my reference image:
And here is the image I'm searching within:
Here's the code I'm using (top image is img1, second image is img2):
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img1 = cv.imread('6.jpg',0)
img2 = cv.imread('0.jpg',0)
# Initiate ORB detector
orb = cv.ORB_create(nfeatures=1000, WTA_K=3, scoreType=cv.ORB_FAST_SCORE, patchSize=10, edgeThreshold=50)
# find the keypoints with ORB
kp1, des1 = orb.detectAndCompute(img1,None)
kp2, des2 = orb.detectAndCompute(img2,None)
# create BFMatcher object
bf = cv.BFMatcher(cv.NORM_HAMMING2, crossCheck=True)
# Match descriptors.
matches = bf.match(des1,des2)
# Sort them in the order of their distance.
matches = sorted(matches, key = lambda x:x.distance)
# Draw first x matches.
print(len(matches))
img3 = cv.drawMatches(img1,kp1,img2,kp2,matches[:200],None, flags=2)
plt.imshow(img3),plt.show()
Now I understand that ORB won't work well for the sprites that are facing away from the camera, since the eyes/mouth won't be present (once I get this working I plan on running multiple reference images from different positions around the head), but I can't work out why it isn't matching some of the heads. Here is a plot of all 74 matches it finds:
Although a few of the lines connect to the head of one of the sprites, they don't actually track from one feature of the original image to the same feature of the sprite (e.g. they go from the eye in img1 to the top of the head in img2). What can I do to improve the matches here?
I'm going to point out the problems I can see first, then provide possible solutions and alternatives.
Problems
Scale Invariance: Although ORB uses image pyramids for scale invariance, it is not very robust to large scale differences. As you can see, your 'train' image is much bigger than the heads you're trying to detect.
Background Noise: As you can see from your output, a lot of the features (keypoints) detected in the background are matching with your image. This causes your code to return many inaccurate matches.
Alternate Cases: In the test image shown, the faces of the characters are, to a certain degree, rotated versions of your 'train' image. There can be many situations in which that is not so. ORB is rotation invariant, but only for shapes that do not change much with rotation. If a character were facing the camera directly, ORB and most keypoint detection techniques would fail.
Possible Solutions
Use a scale-invariant algorithm: SIFT is one option, provided you're using it for non-commercial purposes, as it is patent-protected in the USA. It is scale-invariant and, in most cases, more accurate than ORB; the only trade-off is speed. Assuming you're using KNN to match points, reducing the distance ratio will also remove many outliers. Getting rid of the text will help too (see the sketch after this list).
Determining regions of interest: To remove background noise, you can use contouring techniques to extract a box around each character's body and then apply your algorithm only inside it, so as to avoid matching against keypoints of the background. Another technique is an outlier-removal algorithm called RANSAC (Random Sample Consensus), also shown in the sketch below.
Get enough images of characters facing in different directions and use each of them as a reference. This would be computationally expensive, because you'll have to run your code multiple times on the same test image using different training 'heads'.
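A minimal sketch of the SIFT + ratio test + RANSAC idea (the 0.7 ratio is an assumption, and SIFT_create requires OpenCV >= 4.4 or an opencv-contrib build):

import numpy as np
import cv2 as cv

img1 = cv.imread('6.jpg', 0)   # reference head
img2 = cv.imread('0.jpg', 0)   # game still

sift = cv.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# KNN matching with Lowe's ratio test to discard ambiguous matches
bf = cv.BFMatcher(cv.NORM_L2)
good = [m for m, n in bf.knnMatch(des1, des2, k=2) if m.distance < 0.7 * n.distance]

# RANSAC on the matched coordinates removes the remaining outliers
if len(good) >= 4:
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv.findHomography(src, dst, cv.RANSAC, 5.0)
    if mask is not None:
        good = [m for m, keep in zip(good, mask.ravel()) if keep]

img3 = cv.drawMatches(img1, kp1, img2, kp2, good, None, flags=2)
cv.imwrite('matches.png', img3)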
Alternatives
Since you're looking for a very specific object, try training a CNN model. There is plenty of readily available code online that lets you use your own training data. You could also try template matching; multi-scale template matching (MSTM) would be an example.
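A rough multi-scale template matching sketch (the scale range and number of steps are assumptions):

import numpy as np
import cv2 as cv

scene = cv.imread('0.jpg', 0)      # game still
template = cv.imread('6.jpg', 0)   # reference head

best = (-1.0, None, None)          # (score, location, scale)
for scale in np.linspace(0.3, 1.5, 25):
    t = cv.resize(template, None, fx=scale, fy=scale)
    if t.shape[0] > scene.shape[0] or t.shape[1] > scene.shape[1]:
        continue
    res = cv.matchTemplate(scene, t, cv.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv.minMaxLoc(res)
    if max_val > best[0]:
        best = (max_val, max_loc, scale)

print('best score %.2f at %s (scale %.2f)' % best)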
I'm trying to develop an algorithm which returns a similarity score for two given black-and-white images: the original one and a sketch of it drawn by a human:
All original images have the same style, but there is no limited, predefined set of them; their content can be totally different.
I've tried a few approaches, but none of them has been successful yet:
OpenCV template matching
OpenCV's matchTemplate is not able to calculate a similarity score between images. It can only tell me the count of matched pixels, and this value is usually quite low because the proportions of the human's sketch are not exact.
OpenCV feature matching
I failed with this method because I couldn't find good algorithms for extracting significant features from a human's sketch. The algorithms from OpenCV's tutorials are good at extracting corners and blobs as features, but here, in sketches, we have a lot of strokes, each of which produces a lot of insignificant, junk features and leads to fuzzy results.
Neural Network Classification
I also took a look at neural networks. They are good at image classification, but they need a training set for each class, and that is impossible here because we have an unlimited set of possible images.
Which methods and algorithms would you use for this kind of task?
METHOD 1
Cosine similarity gives a similarity score ranging between 0 and 1.
I first converted the images to grayscale and binarized them. I cropped the original image to half its size and excluded the text, as shown below:
I then converted the image arrays to 1-D arrays using flatten() and used the following to compute the cosine similarity:
from scipy import spatial

# im1, im2 are the binarized images flattened to 1-D arrays;
# scipy's cosine() returns a distance, so subtract it from 1 to get the similarity
result = 1 - spatial.distance.cosine(im2, im1)
print(result)
The result I obtained was 0.999999988431, meaning the images are similar to each other by this score.
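For completeness, the preprocessing described above looks roughly like this (the filenames, target size and threshold value are assumptions):

import cv2

def load_binary(path, size=(300, 300)):
    # grayscale -> common size -> binary; both images must end up with the same shape
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, size)
    _, binary = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)
    return binary

im1 = load_binary('original_cropped.png').flatten()   # original, cropped to exclude the text
im2 = load_binary('sketch.png').flatten()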
EDIT
METHOD 2
I had the time to check out another solution. I figured out that OpenCV's cv2.matchTemplate() function performs the same job.
If you check out THIS DOCUMENTATION PAGE you will come across the different parameters that can be used.
I used the cv2.TM_SQDIFF_NORMED parameter (which gives the normalized square difference between the two images).
import cv2
# th1, th2 are the binarized images (same shape); TM_SQDIFF_NORMED returns a value in [0, 1]
res = cv2.matchTemplate(th1, th2, cv2.TM_SQDIFF_NORMED)
print(1 - res)
For the given images I obtained a similarity score of: 0.89689457
My problem is as follows:
I have 6 types of images, or 6 classes. For example, cat, dog, bird, etc.
For every type of image, I have many variations of that image. For example, brown cat, black dog, etc.
I'm currently using a Support Vector Machine (SVM) to classify the images using one-versus-rest classification. I'm unfolding each image into a single pixel vector and using that as the feature vector for a given image. I'm getting decent classification accuracy, but I want to try something different.
I want to use image descriptors, particularly SURF features, as the feature vector for each image. The issue is that I can only have a single feature vector per image, and the feature extraction process gives me a variable number of SURF features. For example, one picture of a cat may give me 40 SURF features, while one picture of a dog gives me 68. I could pick the n strongest features, but I have no way of guaranteeing that the chosen SURF features actually describe my image (for example, they could focus on the background). There is also no guarantee that ANY SURF features are found.
So, my problem is: how can I take many observations (each being a SURF feature vector) and "fold" them into a single feature vector which describes the raw image and can be fed to an SVM for training?
Thanks for your help!
Typically the SURF descriptors are quantized using a k-means dictionary (a "bag of visual words") and aggregated into one L1-normalized histogram, so your inputs to the SVM algorithm become fixed in size.
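A minimal sketch of this (k = 100 is an assumption; any local descriptor works the same way):

import numpy as np
from sklearn.cluster import KMeans

# descriptors_per_image: a list with one (n_i, 64) array of SURF descriptors per image,
# where n_i varies from image to image
def build_bow_features(descriptors_per_image, k=100):
    all_desc = np.vstack(descriptors_per_image)
    kmeans = KMeans(n_clusters=k, n_init=10).fit(all_desc)    # the visual dictionary
    features = []
    for desc in descriptors_per_image:
        words = kmeans.predict(desc)                          # quantize each descriptor
        hist, _ = np.histogram(words, bins=np.arange(k + 1))
        features.append(hist / max(hist.sum(), 1))            # L1-normalize
    return np.array(features), kmeans

Each row of the returned array is a fixed-length vector that can be fed to the SVM; keep the fitted kmeans to encode test images the same way.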
What are the ways to quantify the texture of a portion of an image? I'm trying to detect areas of an image that are similar in texture, i.e. some measure of "how closely similar are they?"
So the question is what information about the image (edge, pixel value, gradient etc.) can be taken as containing its texture information.
Please note that this is not based on template matching.
Wikipedia doesn't give much detail on actually implementing any of the texture analyses.
Do you want to find two distinct areas in the same image that look the same (same texture), or match a texture in one image to a texture in another image?
The second is harder due to different radiometry.
Here is a basic scheme of how to measure similarity of areas.
1. Write a function which takes an area of the image as input and calculates a scalar value, e.g. the average brightness. This scalar is called a feature.
2. Write more such functions to obtain about 8-30 features, which together form a vector that encodes information about the area of the image.
3. Calculate this vector for both areas that you want to compare.
4. Define a similarity function which takes two vectors and outputs how alike they are.
You need to focus on steps 2 and 4.
Step 2: Use features such as the standard deviation of the brightness, some kind of corner detector response, an entropy filter, a histogram of edge orientations, and a histogram of FFT frequencies (in the x and y directions). Use color information if available.
Step 4: You can use cosine similarity, min-max, or weighted cosine similarity.
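A minimal sketch of steps 1-4 with a few of the suggested features (the exact feature set, bin counts and weights are up to you):

import numpy as np
from scipy.stats import entropy as shannon_entropy

def texture_features(patch):
    # step 2: a small feature vector for an 8-bit grayscale patch
    hist, _ = np.histogram(patch, bins=32, range=(0, 255), density=True)
    gy, gx = np.gradient(patch.astype(float))
    fft_mag = np.abs(np.fft.rfft2(patch))
    return np.array([
        patch.std(),                      # spread of the brightness
        shannon_entropy(hist + 1e-12),    # entropy of the intensity histogram
        np.abs(gx).mean(),                # edge energy in x
        np.abs(gy).mean(),                # edge energy in y
        fft_mag[1:, 1:].mean(),           # non-DC frequency content
    ])

def similarity(v1, v2):
    # step 4: cosine similarity between two feature vectors
    return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-12)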
After you implement about 4-6 such features and a similarity function, start running tests. Look at the results and try to understand why or where it doesn't work, then add a specific feature to cover that case.
For example, if you see that a texture with big blobs is regarded as similar to a texture with tiny blobs, then add a morphological feature that calculates the density of objects larger than 20 square pixels.
Iterate this identify-a-problem, design-a-feature process about 5 times and you will start to get very good results.
I'd suggest using wavelet analysis. Wavelets are localized in both space and frequency and, through multiresolution analysis, give a better signal representation than the Fourier transform does.
There is a paper explaining a wavelet approach to texture description, as well as a comparison method.
You might need to slightly modify an algorithm to process images of arbitrary shape.
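A minimal sketch of wavelet-based texture features (PyWavelets; the wavelet name and the number of levels are assumptions):

import numpy as np
import pywt  # PyWavelets

def wavelet_texture_features(patch, wavelet='db2', level=3):
    # energy of the detail sub-bands at each decomposition level
    coeffs = pywt.wavedec2(patch.astype(float), wavelet, level=level)
    feats = []
    for details in coeffs[1:]:           # skip the approximation coefficients
        for band in details:             # horizontal, vertical, diagonal details
            feats.append(np.mean(band ** 2))
    return np.array(feats)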
An interesting approach for this is to use Local Binary Patterns (LBP).
Here is a basic example and some explanations: http://hanzratech.in/2015/05/30/local-binary-patterns.html
See this method as one of the many different ways to get features from your pictures; it corresponds to the 2nd step of DanielHsH's answer.
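A minimal sketch of the idea (not the tutorial's code; P, R and the 'uniform' method are common defaults):

import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(patch, P=8, R=1):
    # histogram of uniform LBP codes as a texture feature vector
    lbp = local_binary_pattern(patch, P, R, method='uniform')
    n_bins = P + 2                       # number of distinct uniform patterns
    hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
    return hist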