Detecting a shape in noisy set of images, obtained by scanning - opencv

I have images of rocks of varied shape and texture, obtained by laser scanning (there are 300 images obtained by rotation). This means that the rock is effectively scanned in 360 degrees with slight rotation recorded by each in each image. There are 6 marks on the rock, circles, crosses, T, X's, and their combinations (an example is in the picture below). I want to automatically detect these shapes given a set of images (on at least 1/300 of them).
I've tried template matching with OpenCV, but I haven't had success. Do you know any methods or libraries that could solve this problem? I'm thinking about training a neural network for this.Example

Related

Point Cloud Stitching and Processing

I have pointclouds from two realsense D415 cameras, which is mounted in a way to have overlap. I am able to stitch the output pointclouds in realtime to get a single larger FOV pointcloud by using PCL ICP to find the transforms and transforming one pointcloud to match with the other. Now I would like to detect planes on this output pointcloud and further detect people with the help of the detected planes and ground plane. I have implemented the same with PCL libraries once again.
Now issues arise in two cases:
(1) The final stitched output is unordered leading to a lot of PCL functions not being able to use the pointcloud. To overcome this , say I resize my pointcloud to match the final dimensions with height not being 1, I run into (2).
(2) Upon passing the resized pointcloud I get an error from the normal algorithm I am using as Input not from a projective device, using only XXXX points (XXXX is a very small fraction of the total number of points in the pointcloud). Any other available normal algorithm performs really slow and is not capable of being used in real time applications.
Any ideas how to proceed with this? Happy to provide more information.

Homography when camera translation (for stitching)

I have a camera which I take with it 2 captures. I want do make a reconstitution with the 2 images in one image.
I only do a translation with the camera an take images of a plane TV screen. I heard homography only works when the camera does a rotation.
What should i do when I only have a translation?
Because you are imaging a planar surface (in your case a TV screen), all images of it with a perspective camera will be related by homographies. This is the same if your camera is translating and/or rotating. Therefore to stitch different images of the surface, you don't need to do any 3D geometry processing (essential matrix computation/triangulation etc.).
To solve your problem you need to do the following:
You determine the homographies between your images. Because you only have two images you can select the first one as the 'source' and the second one as the 'target', and compute the homography from target to source. This is classically done with feature detection and robust homography fitting. Let's denote this homography by the 3x3 matrix H.
You warp your target image to your source using H. You can do this in openCV with the warpPerspective method.
Merge your source and warped target using a blending function.
An open source project for doing exactly these steps is here.
If your TV lacks distinct features or there is lots of background clutter, the method for estimating H might not be highly robust. If this is the case you could manually click four or more correspondences on the TV in the target and source images, and compute H using OpenCV's findHomography method. Note that your correspondences cannot be completely arbitrary. Specifically, there should not be three correspondences that are colinear (in which case H cannot be computed). They should also be clicked as accurately as possible because errors will affect the final stitch and cause ghosting artefacts.
An important caveat is if your camera has significant lens distortion. In this case your images will not be related by homographies. You can deal with this by performing a calibration of your camera using OpenCV, and then you need to pre-process your images to undo the lens distortion (using OpenCV's undistort method).

Undistorting/rectify images with OpenCV

I took the example of code for calibrating a camera and undistorting images from this book: shop.oreilly.com/product/9780596516130.do
As far as I understood the usual camera calibration methods of OpenCV work perfectly for "normal" cameras.
When it comes to Fisheye-Lenses though we have to use a vector of 8 calibration parameters instead of 5 and also the flag CV_CALIB_RATIONAL_MODEL in the method cvCalibrateCamera2.
At least, that's what it says in the OpenCV documentary
So, when I use this on an array of images like this (Sample images from OCamCalib) I get the following results using cvInitUndistortMap: abload.de/img/rastere4u2w.jpg
Since the resulting images are cut out of the whole undistorted image, I went ahead and used cvInitUndistortRectifyMap (like it's described here stackoverflow.com/questions/8837478/opencv-cvremap-cropping-image). So I got the following results: abload.de/img/rasterxisps.jpg
And now my question is: Why is not the whole image undistorted? In some pics of my later results you can recognize that the laptop for example is still totally distorted. How can I acomplish even better results using the standard OpenCV methods?
I'm new to stackoverflow and I'm new to OpenCV as well, so please excuse any of my shortcomings when it comes to expressing my problems.
All chessboard corners should be visible to be found. The algorithm expect a certain size of chessboard such as 4x3 or 7x6 (for example). The white border around a chess board should be visible too or dark squares may not be defined precisely.
You still have high distortions at the image periphery after undistort() since distortions are radial (that is they increase with the radius) and your found coefficients are wrong. The latter are wrong since a calibration process minimizes the sum of squared errors in pixel coordinates and you did not represent the periphery with enough samples.
TODO: You have to have 20-40 chess board pattern images if you use 8 distCoeff. Slant your boards at different angles, put them at different distances and spread them around, especially at the periphery. Remember, the success of calibration depends on sampling and also on seeing vanishing points clearly from your chess board (hence slanting and tilting).

Calculating distance to plane and square's corners from image

I'm taking camera images of white paper with black square that has specified size (e.g. 10 cm). Image is taken with different distance to paper plane and with different camera angle.
Now I need to deduce from those images camera rotation, camera translation and distance to paper plane as well distance to squares corners.
I'm quite new to image processing so maybe somebody can direct me to some keywords, algorithms or basic math to look for or even OpenCV functions to investigate. On the paper there will be always some primitive objects like squares so I don't need some algorithms that will work any arbitary image but I will definitely need a fast algorithm.
To calculate camera rotation and translation you need to follow sevral steps that are always the same in this kind of problems:
Run a detector on a sample of the image (FAST)
Run a detector on all images you want to process, could be a frame captured from video.
Generate descriptors of points detected (SIFT).
Match descriptors with a matcher (flannMatcher)
Find homography form matched pairs (findHomography())
Find camera pose from homography.
You have some links to the methods in this tutorial.

Detection of Blur in Images/Video sequences

I had asked this on photo stackexchange but thought it might be relevant here as well, since I want to implement this programatically in my implementation.
I am trying to implement a blur detection algorithm for my imaging pipeline. The blur that I want to detect is both -
1) Camera Shake: Pictures captured using hand which moves/shakes when shutter speed is less.
2) Lens focussing errors - (Depth of Field) issues, like focussing on a incorrect object causing some blur.
3) Motion blur: Fast moving objects in the scene, captured using a not high enough shutter speed. E.g. A moving car a night might show a trail of its headlight/tail light in the image as a blur.
How can one detect this blur and quantify it in some way to make some decision based on that computed 'blur metric'?
What is the theory behind blur detection?
I am looking of good reading material using which I can implement some algorithm for this in C/Matlab.
thank you.
-AD.
Motion blur and camera shake are kind of the same thing when you think about the cause: relative motion of the camera and the object. You mention slow shutter speed -- it is a culprit in both cases.
Focus misses are subjective as they depend on the intent on the photographer. Without knowing what the photographer wanted to focus on, it's impossible to achieve this. And even if you do know what you wanted to focus on, it still wouldn't be trivial.
With that dose of realism aside, let me reassure you that blur detection is actually a very active research field, and there are already a few metrics that you can try out on your images. Here are some that I've used recently:
Edge width. Basically, perform edge detection on your image (using Canny or otherwise) and then measure the width of the edges. Blurry images will have wider edges that are more spread out. Sharper images will have thinner edges. Google for "A no-reference perceptual blur metric" by Marziliano -- it's a famous paper that describes this approach well enough for a full implementation. If you're dealing with motion blur, then the edges will be blurred (wide) in the direction of the motion.
Presence of fine detail. Have a look at my answer to this question (the edited part).
Frequency domain approaches. Taking the histogram of the DCT coefficients of the image (assuming you're working with JPEG) would give you an idea of how much fine detail the image has. This is how you grab the DCT coefficients from a JPEG file directly. If the count for the non-DC terms is low, it is likely that the image is blurry. This is the simplest way -- there are more sophisticated approaches in the frequency domain.
There are more, but I feel that that should be enough to get you started. If you require further info on either of those points, fire up Google Scholar and look around. In particular, check out the references of Marziliano's paper to get an idea about what has been tried in the past.
There is a great paper called : "analysis of focus measure operators for shape-from-focus" (https://www.researchgate.net/publication/234073157_Analysis_of_focus_measure_operators_in_shape-from-focus) , which does a comparison about 30 different techniques.
Out of all the different techniques, the "Laplacian" based methods seem to have the best performance. Most image processing programs like : MATLAB or OPENCV have already implemented this method . Below is an example using OpenCV : http://www.pyimagesearch.com/2015/09/07/blur-detection-with-opencv/
One important point to note here is that an image can have some blurry areas and some sharp areas. For example, if an image contains portrait photography, the image in the foreground is sharp whereas the background is blurry. In sports photography, the object in focus is sharp and the background usually has motion blur. One way to detect such a spatially varying blur in an image is to run a frequency domain analysis at every location in the image. One of the papers which addresses this topic is "Spatially-Varying Blur Detection Based on Multiscale Fused and Sorted Transform Coefficients of Gradient Magnitudes" (cvpr2017).
the authors look at multi resolution DCT coefficients at every pixel. These DCT coefficients are divided into low, medium, and high frequency bands, out of which only the high frequency coefficients are selected.
The DCT coefficients are then fused together and sorted to form the multiscale-fused and sorted high-frequency transform coefficients
A subset of these coefficients are selected. the number of selected coefficients is a tunable parameter which is application specific.
The selected subset of coefficients are then sent through a max pooling block to retain the highest activation within all the scales. This gives the blur map as the output, which is then sent through a post processing step to refine the map.
This blur map can be used to quantify the sharpness in various regions of the image. In order to get a single global metric to quantify the bluriness of the entire image, the mean of this blur map or the histogram of this blur map can be used
Here are some examples results on how the algorithm performs:
The sharp regions in the image have a high intensity in the blur_map, whereas blurry regions have a low intensity.
The github link to the project is: https://github.com/Utkarsh-Deshmukh/Spatially-Varying-Blur-Detection-python
The python implementation of this algorithm can be found on pypi which can easily be installed as shown below:
pip install blur_detector
A sample code snippet to generate the blur map is as follows:
import blur_detector
import cv2
if __name__ == '__main__':
img = cv2.imread('image_name', 0)
blur_map = blur_detector.detectBlur(img, downsampling_factor=4, num_scales=4, scale_start=2, num_iterations_RF_filter=3)
cv2.imshow('ori_img', img)
cv2.imshow('blur_map', blur_map)
cv2.waitKey(0)
For detecting blurry images, you can tweak the approach and add "Region of Interest estimation".
In this github link: https://github.com/Utkarsh-Deshmukh/Blurry-Image-Detector , I have used local entropy filters to estimate a region of interest. In this ROI, I then use DCT coefficients as feature extractors and train a simple multi-layer perceptron. On testing this approach on 20000 images in the "BSD-B" dataset (http://cg.postech.ac.kr/research/realblur/) I got an average accuracy of 94%
Just to add on the focussing errors, these may be detected by comparing the psf of the captured blurry images (wider) with reference ones (sharper). Deconvolution techniques may help correcting them but leaving artificial errors (shadows, rippling, ...). A light field camera can help refocusing to any depth planes since it captures the angular information besides the traditional spatial ones of the scene.

Resources