I am working on a task where my image has missing data, and I want to compute its gradient without artifacts at the internal boundaries.
The idea is to build a height map out of point cloud data (done) and then evaluate the slopes using a gradient function, however the points are sparse and thus the image presents missing data.
The first approach I tried was to use dilation to grow the valid area by a few pixels, then apply the gradient filter, and finally mask the boundaries to remove the fabricated data, but it seems to erode slopes as well:
In this picture a height map is generated from a point cloud, which in turn comes from a stereo camera system; the camera is facing a steep wall. On the left is the height map and on the right is the dilated map. On the right side it seems the wall has been "pushed back".
What would be the best approach to eliminate the internal border conditions? I thought about dilating the values with a special function that ignores the "non-available pixels" (perhaps represented by 0 or -1) and takes the average of the surrounding available pixel values (if available). Is there such a function in OpenCV?
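I am not aware of a single OpenCV call that does exactly this, but the "average of the available neighbours" idea is easy to sketch with plain NumPy. In the sketch below, `fill_missing`, the boolean `valid` mask and the 3x3 neighbourhood are all assumptions made for illustration. Unlike a dilation (which takes the maximum), the fill value is a mean over valid pixels only, so a slope is not pushed sideways.

```python
import numpy as np

def fill_missing(height, valid, iterations=1):
    """Fill invalid pixels with the mean of their valid 3x3 neighbours.

    height: 2D float array; valid: 2D bool array (True = measured pixel).
    Each iteration grows the valid area by one pixel.
    """
    h = np.where(valid, height, 0.0).astype(float)  # invalid pixels contribute 0
    v = valid.astype(float)
    rows, cols = h.shape
    for _ in range(iterations):
        hp, vp = np.pad(h, 1), np.pad(v, 1)
        total = np.zeros_like(h)   # sum of valid neighbour values
        count = np.zeros_like(v)   # number of valid neighbours
        for dy in range(3):
            for dx in range(3):
                total += hp[dy:dy + rows, dx:dx + cols]
                count += vp[dy:dy + rows, dx:dx + cols]
        fillable = (v == 0) & (count > 0)
        h[fillable] = total[fillable] / count[fillable]
        v[fillable] = 1.0
    return h, v.astype(bool)

# A ramp (height == column index) with one missing pixel in the middle.
ramp = np.tile(np.arange(5, dtype=float), (5, 1))
valid = np.ones((5, 5), dtype=bool)
valid[2, 2] = False
filled, filled_valid = fill_missing(ramp, valid)
```

After taking the gradient of `filled`, the result should still be masked with the original `valid` mask so that only measured pixels are reported.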
I'm working on a project which locates the Machine Readable Zone on ID cards.
For this I need some preprocessing to extract the ID cards from a scanned image, where they are typically placed at random on a white page. I'm able to locate the majority of the cards by using histogram equalization with CLAHE before a contour detection. But in some cases the border around the MRZ is totally invisible (white on white), as shown in the attached image.
I'd like to detect rectangles of a predefined shape, as I know the shape of the ID card will always be the same, but so far I haven't found a way to do something like this with OpenCV.
Basically, what I need is to find the two rectangles of a fixed ratio that best match the 2 cards on the scan.
I'm wondering if I need to try OpenCV matchers or if there is a simpler way to accomplish this kind of detection.
The solution to your problem is likely going to be matrix transformations. The concept is to pinpoint 4 coordinates on the card that can be easily detected using OpenCV, such as the corners of the rectangle colored in blue & cyan.
Store the coordinates of the card with the predefined shape in an array, with one corner of the card at (0, 0). Also store the coordinates of the blue & cyan rectangle, in that same card coordinate system, in an array. From the stored rectangle coordinates and the rectangle coordinates detected in the scan you can find the perspective transform using the cv2.getPerspectiveTransform method, which takes four source points and four destination points.
Using the perspective transform found, you can detect the coordinates of the whole card every time you detect the coordinates of the blue & cyan rectangle.
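To make the idea concrete, here is a minimal NumPy sketch that solves for the same kind of 3x3 matrix from 4 point pairs and then maps the card corners. Everything in it is an assumption for illustration: the helper names, the card size, and the marker rectangle coordinates are made up, and in practice cv2.getPerspectiveTransform would produce the matrix for you.

```python
import numpy as np

def perspective_from_4_points(src, dst):
    """Solve the 3x3 matrix H with H @ [x, y, 1] ~ [u, v, 1] for 4 point pairs."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)  # last entry fixed to 1

def apply_perspective(H, points):
    """Map (N, 2) points through H, including the perspective division."""
    pts = np.hstack([np.asarray(points, dtype=float), np.ones((len(points), 1))])
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical template, in card coordinates with one corner at (0, 0).
card_corners = [(0, 0), (85.6, 0), (85.6, 54), (0, 54)]
marker_in_template = [(10, 35), (75, 35), (75, 50), (10, 50)]
# Hypothetical detection of the same marker rectangle in the scan (pixels).
marker_in_scan = [(120, 120), (250, 120), (250, 150), (120, 150)]

H = perspective_from_4_points(marker_in_template, marker_in_scan)
card_in_scan = apply_perspective(H, card_corners)
```

The four detected points must be listed in the same order as the stored ones, otherwise the recovered card outline will be twisted.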
I am trying to find a reliable method to calculate the corner points of a container. The idea is to calculate the center point of the container from these corner points for the localization of a robot; that is, the calculated center point will be the robot's destination when it picks up the container. I am looking for suggestions on how to calculate the corner points, or on whether the center point can be calculated directly. Up to this point the PCL library (C/C++) has been used for processing the 3D data.
The image below is the screenshot of the container.
thanks in advance.
afterApplyingPassthrough
I did the following things:
I binarized the image (black pixels = 0, green pixels = 1),
inverted the image (black pixels = 1, green pixels = 0),
eroded the image with 3x3 kernel N-times and dilated it with same kernel M-times.
Left: N=2, M=1;Right: N=6, M=6
After that:
I computed contours of all non-zero areas and
removed the contour that surrounded entire image.
These are the contours that remained:
I do not know what a "typical" input image looks like in your case. Since I only have access to one sample image, I would rather not speculate about a "general solution" that will be suitable for you. But to solve this particular case, you could analyze every contour in the following way:
compute the rotated rectangle that fits best around your contour (you need something similar to minAreaRect from OpenCV)
compute areas of rectangle and contour interior
if the difference between contour area and the area of the rotated bounding rectangle is small, the contour has approximately rectangular shape
find the contour that is both rectangular and satisfies some other condition (for example: typical area of the container). Assume that this belongs to container and compute its center.
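The rectangularity test in the steps above can be sketched without OpenCV. This is only an illustration under stated assumptions: contours are ordered (N, 2) arrays, the helper names are made up, and the minimum-area rectangle is found by trying every convex hull edge direction, which is a stand-in for what minAreaRect would give you.

```python
import numpy as np

def polygon_area(contour):
    """Shoelace formula for an ordered, closed polygon given as an (N, 2) array."""
    x, y = contour[:, 0], contour[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in order."""
    pts = sorted(map(tuple, points))

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return np.array(half_hull(pts) + half_hull(pts[::-1]), dtype=float)

def min_rect_area(contour):
    """Area of the smallest rotated bounding rectangle (one side lies on a hull edge)."""
    hull = convex_hull(contour)
    edges = np.roll(hull, -1, axis=0) - hull
    best = np.inf
    for ex, ey in edges:
        angle = np.arctan2(ey, ex)
        c, s = np.cos(angle), np.sin(angle)
        aligned = hull @ np.array([[c, -s], [s, c]])  # rotate hull by -angle
        width, height = np.ptp(aligned, axis=0)
        best = min(best, width * height)
    return best

def rectangularity(contour):
    """Contour area divided by rotated bounding rectangle area; 1.0 for a perfect rectangle."""
    return polygon_area(contour) / min_rect_area(contour)

t = np.deg2rad(30.0)
rot = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
box = np.array([[0, 0], [4, 0], [4, 2], [0, 2]], dtype=float) @ rot.T
l_shape = np.array([[0, 0], [4, 0], [4, 1], [1, 1], [1, 4], [0, 4]], dtype=float)
box_score = rectangularity(box)
l_score = rectangularity(l_shape)
```

A contour would then be accepted as "rectangular" when its score is above some threshold close to 1; the right threshold depends on how noisy your contours are.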
I am not claiming that this is a solution that will work well in real world scenarios. It is also not fast. You should view it as a "sketch" that shows how to extract some useful information.
I assume the wheels keep the cart at a known offset from the floor and that you can identify the floor. Filter out all points which are too close to the floor (this will remove the wheels and everything but the cart, which will help limit the data and simplify later steps).
If you isolate the cart, you could simply take the average point (centroid). Alternatively, if that is not precise enough, you could try finding the bounding box of the isolated cart (min and max in the primary directions) and then take the center of that bounding box (this should be more accurate, but will still need a slight vertical offset due to the top handles).
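As a rough, library-agnostic illustration of the two center estimates (in PCL you would do the height filtering with a pass-through filter instead), here is a NumPy sketch. It assumes the z axis points up, the floor is at a known height, and the cart is roughly aligned with the x/y axes; `cart_center`, `floor_z` and `min_height` are made-up names.

```python
import numpy as np

def cart_center(points, floor_z=0.0, min_height=0.15):
    """Drop points near the floor, then return (centroid, bounding-box center)."""
    above = points[points[:, 2] > floor_z + min_height]
    centroid = above.mean(axis=0)
    bbox_center = 0.5 * (above.min(axis=0) + above.max(axis=0))
    return centroid, bbox_center

# Toy cloud: two floor points plus an unevenly sampled cart outline.
cloud = np.array([
    [5.0, 5.0, 0.00], [-3.0, 2.0, 0.05],    # floor, removed by the filter
    [0.0, 0.0, 0.5], [0.0, 1.0, 0.5],
    [2.0, 0.0, 0.5], [2.0, 1.0, 0.5],
    [0.1, 0.5, 0.5], [0.2, 0.5, 0.5],       # extra samples on one side
])
centroid, bbox_center = cart_center(cloud)
```

In this toy cloud the centroid is pulled towards the densely sampled side, while the bounding-box center is not, which is the reason the bounding box is suggested as the more accurate option.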
If you cannot isolate the cart, or the other methods are not working well, you could try PCL sample consensus, specifically SACMODEL_LINE. This is a more involved strategy, but it should give very solid results: repeatedly find the best line and subtract its inliers from the cloud so that the next best line can be found. After you have your 4 primary cart lines, use their parameters to find your centroid. This would also be robust against random items being in or on the cart, as well as against carts of various sizes (assuming they always have straight, perpendicular walls).
I am doing a project in OpenCV to detect handwritten characters in a user-filled form. I have written an algorithm to detect the skew angle of the scanned image using the Hough Line Transform. But it does not work when the image is rotated by 180 degrees, since 0 and 180 degrees are treated as the same by the Hough line function. My image contains some rectangles to be filled with data and some text. So how do I detect whether a scanned image is rotated by 180 degrees or not?
I first have to correct the skew angle of the image; only then can I locate exactly where the user-filled data (which I need to extract) lies, using rectangle coordinates from the empty template form provided earlier. Answers that do not use character recognition are appreciated.
To lift the 180° ambiguity, only OCR can tell you: perform two reads on the deskewed text, one using the given angle, the other one using the angle + 180°, and keep the most successful read.
Unless you have some a priori information it's the only way, as other image processing operations don't know about characters.
UPDATE:
Some strings are forever ambiguous, like 0689HINOSXZ <=> ZXSONIH6890.
If the layout of the text is known (boxes) and asymmetric, it is a relatively easy matter to check matching of the text strings to the layout: choose a box (such as the topmost) and a string (the topmost), and align them by translation; then see how the other boxes and strings match (using a nearest neighbor rule) and establish the correspondences. Compare results with the straight and flipped layout, and keep the best overall area of overlap.
For reliability, it can be better to try more than a starting box/string pair, as there can be some ambiguity to which is the topmost (it could even be missing).
Isn't your problem more general? Let's say, you detect a skew angle of +45 degrees and rotate the image by -45 degrees. Then it could still be that the image is rotated by 180 degrees because it was not rotated +45 degrees but -135 instead.
Anyway, to the actual question: I am not an expert in character recognition but I think if you use it anyway in your application, couldn't you just try character recognition for both rotations and then choose the one that gets stronger response?
If you match the rectangles in your template with those of the skew corrected image, you'll be able to get the correct orientation (but only if there's no symmetry in the placement of those rectangles). For matching you may be able to use the rectangles in your template as a mask to extract regions from skew corrected image.
EDIT
Suppose your template and the skew corrected image look like this (in the best case where there are no displacements in skew corrected) :
Then you can use the template as a mask to copy data from skew corrected image. Then check what fraction of the white pixels in the template is contained in the copied image. This value will be very low for a 180 degree rotated image.
But as you say, this won't work in practice because of the displacements. Then maybe you can try template matching (cross-correlation), in which you use the template image as the template. The location and strength of the strongest peak would give you some indication of the orientation. You can perform template matching at a reduced resolution so it runs faster.
You could try to match keypoints (Harris, SIFT, ...) from the scanned image and the empty template. With the matched points you can easily find a transformation to align the scanned image with the template. This may work for your case, but you are more likely to succeed if there are some textured logos in the images, as is usually the case for forms.
Can't you simply compute two cross-correlations? One with the 180° rotation and one without? The one with the matching rectangles should give you a higher correlation maximum (provided the image contrast of the remaining page is not too misleading, but some pre-filtering could help here).
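A minimal sketch of the two-correlation idea, assuming the deskewed scan and the template have the same size and that a circular (FFT-based) correlation is acceptable; in OpenCV you would more likely use matchTemplate. The function names and the toy layout are made up for illustration.

```python
import numpy as np

def correlation_peak(image, template):
    """Maximum of the circular cross-correlation of two same-sized, mean-removed arrays."""
    a = image - image.mean()
    b = template - template.mean()
    corr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
    return corr.max()

def is_upside_down(scan, template):
    """True when the 180-degree rotated scan correlates better with the template."""
    upright = correlation_peak(scan, template)
    flipped = correlation_peak(np.rot90(scan, 2), template)
    return flipped > upright

# Toy template with an asymmetric layout of two boxes.
template = np.zeros((40, 60))
template[5:10, 5:30] = 1.0
template[25:35, 40:55] = 1.0

# One scan is only displaced; the other is rotated by 180 degrees and displaced.
shifted_scan = np.roll(template, (3, -2), axis=(0, 1))
rotated_scan = np.roll(np.rot90(template, 2), (3, -2), axis=(0, 1))
```

Because only the peak value is compared, the displacement between scan and template does not matter. The test cannot work if the box layout is symmetric under a 180° rotation.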
What is Distance Transform? What is the theory behind it? If I have 2 similar images but in different positions, how does distance transform help in overlapping them? The results that the distance transform function produces look as if they are divided in the middle; is it to find the center of one image so that the other is overlapped just halfway? I have looked into the OpenCV documentation but it's still not clear.
Look at the picture below (you may want to increase your monitor brightness to see it better). The picture shows the distance from the red contour depicted with pixel intensities, so in the middle of the image, where the distance is maximum, the intensities are highest. This is a manifestation of the distance transform. Here is an immediate application: the green shape is a so-called active contour or snake that moves according to the gradient of distances from the contour (and also follows some other constraints) and curls around the red outline. Thus one application of distance transform is shape processing.
Another application is text recognition: one of the powerful cues for text is a stable stroke width. The distance transform run on segmented text can confirm this. A corresponding method is called the stroke width transform (SWT).
As for aligning two rotated shapes, I am not sure how you can use DT. You can find a center of a shape to rotate the shape but you can also rotate it about any point as well. The difference will be just in translation which is irrelevant if you run matchTemplate to match them in correct orientation.
Perhaps if you upload your images it will be more clear what to do. In general you can match them as a whole or by features (which is more robust to various deformations or perspective distortions), or even using outlines/silhouettes if there are only a few features. Finally, you can figure out the orientation of your object (if it has a dominant orientation) by running PCA or fitting an ellipse (as a rotated rectangle).
cv::RotatedRect rect = cv::fitEllipse(points2D);
float angle_to_rotate = rect.angle;
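For the PCA alternative mentioned above, a small NumPy sketch could look like this. The function name is made up, and the angle convention (degrees from the x axis, folded into [0, 180) because an axis has no direction) is this sketch's own choice and may not match rect.angle from fitEllipse.

```python
import numpy as np

def dominant_angle_deg(points):
    """Angle of the principal axis of a 2D point set, in degrees within [0, 180)."""
    centered = points - points.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    major = eigvecs[:, np.argmax(eigvals)]          # direction of largest variance
    return np.degrees(np.arctan2(major[1], major[0])) % 180.0

# Toy shape: an elongated grid of points rotated by 30 degrees.
gx, gy = np.meshgrid(np.linspace(-2, 2, 21), np.linspace(-0.5, 0.5, 6))
grid = np.column_stack([gx.ravel(), gy.ravel()])
t = np.deg2rad(30.0)
rot = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
shape_points = grid @ rot.T + np.array([10.0, 5.0])
angle = dominant_angle_deg(shape_points)
```

Like the ellipse fit, this only recovers the axis, so a 180° ambiguity remains and the shape needs a clearly dominant direction.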
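The distance transform is an operation that works on a single binary image and fundamentally measures the distance from every point of one kind (e.g. every empty, zero pixel) to the nearest point of the other kind (e.g. a non-zero boundary pixel). Note that OpenCV's distanceTransform uses the opposite polarity: it computes, for each non-zero pixel, the distance to the nearest zero pixel.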
An example is provided here and here.
The measurement can be based on various definitions, calculated discretely or precisely: e.g. Euclidean, Manhattan, or Chessboard. Indeed, the parameters in the OpenCV implementation allow some of these, and control their accuracy via the mask size.
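To see what the three metrics mean, here is a brute-force NumPy sketch. It is only meant to illustrate the definition on tiny images and is not how OpenCV computes it; the function name and the `metric` strings are made up. It uses the polarity where every pixel gets the distance to the nearest zero pixel.

```python
import numpy as np

def distance_transform(binary, metric="euclidean"):
    """For every pixel, the distance to the nearest zero pixel (brute force)."""
    ys, xs = np.indices(binary.shape)
    zy, zx = np.nonzero(binary == 0)                # coordinates of all zero pixels
    dy = np.abs(ys[..., None] - zy)                 # shape (rows, cols, n_zero)
    dx = np.abs(xs[..., None] - zx)
    if metric == "euclidean":
        d = np.hypot(dy, dx)
    elif metric == "manhattan":
        d = dy + dx
    elif metric == "chessboard":
        d = np.maximum(dy, dx)
    else:
        raise ValueError("unknown metric: " + metric)
    return d.min(axis=-1)

# 4x4 image of ones with a single zero pixel in the top-left corner.
img = np.ones((4, 4), dtype=np.uint8)
img[0, 0] = 0
d_euclid = distance_transform(img, "euclidean")
d_manhattan = distance_transform(img, "manhattan")
d_chess = distance_transform(img, "chessboard")
```

For the pixel in the opposite corner, three steps away in each direction, the Euclidean distance is sqrt(18), the Manhattan distance is 6 and the chessboard distance is 3.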
The function can return the output measurement image (floating point) - as well as a labelled connected components image (a Voronoi diagram). There is an example of it in operation here.
I see from another question you have asked recently you are looking to register two images together. I don't think the distance transform is really what you are looking for here. If you are looking to align a set of points I would instead suggest you look at techniques like Procrustes, Iterative Closest Point, or Ransac.
I'm new to image processing and I'm working on detecting lines in a document image. I read the theory of the Hough line transform, but I can't see why I must use Canny before calling that function in OpenCV, as many tutorials say. What's the point of finding edges in this case? The fact is that if I don't use Canny or a threshold before HoughLines(), the results are very messy. I hope someone can explain the reason to me.
2 of the tutorials I've read:
Imgproc Feature Detection
Hough Line Transform
Short Answer
cvCanny is used to detect edges, as well as to increase contrast and remove image noise.
HoughLines, which uses the Hough Transform, is used to determine whether those edges are lines or not. The Hough Transform requires edges to be detected well in order to be efficient and provide meaningful results.
Long Answer
The Limitations of the Hough Transform are described in more detail on Wikipedia.
The efficiency of the Hough Transform relies on the bins of accumulated pixels being distinct, e.g. a direct contrast between a pixel and its surrounding neighbours or, if using a mask region, between a pixel region and its surrounding regions. If all pixels had similar accumulated values, nothing would stand out as a line or circle. This leads to the reduction of colour (colour to grayscale, grayscale to black and white) in order to increase contrast.
The number of parameters of the Hough Transform also increases the spread of votes in the pixel bins and increases the complexity of the transform, which means that normally only lines or circles are reliably detected using it, as they have no more than 3 parameters (two for a line, three for a circle).
The edges need to be detected well before running the Hough Transform, otherwise its efficiency suffers further. Also, noisy images don't work well with the Hough Transform unless the noise is removed beforehand.
First of all, to detect lines you need to work on a boolean matrix image (or binary), I mean: the color is black or white, there's no grayscale.
HoughLines() requires this kind of image as input to work properly. That's the reason you have to use Canny or a threshold: to convert the colored image matrix into a boolean one.
Hough transformation
A line in a picture is actually an edge. The Hough transform scans the whole image and applies a transformation that converts the Cartesian coordinates of all white pixels into polar coordinates; the black pixels are left out. So you won't be able to get a line if you don't first detect edges, because HoughLines() doesn't know how to behave when there's grayscale.
Theoretically, you are correct. Finding edges is not absolutely required for the Hough Line algorithm to work.
The way the Hough works is basically this: every point votes for every line that could pass through it, and whichever lines collect the most votes stay. For this, we need points. The Canny creates those points. Theoretically you could use any sort of filter (isolate all blue or purple points and connect them, whatever), but edges work well.
The Hough also does not weight its lines or points. To the Hough, an image is binary: made up of either 1s or 0s, points or not points. There is no need for greyscale, and the Canny conveniently returns binary images.
That is why the Canny is nearly always used with the Hough.
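A tiny accumulator sketch shows why the input has to be "points or not points". This is an illustration of the voting idea only, not OpenCV's HoughLines implementation; the function name, the 1-pixel rho resolution and the choice of 180 theta bins are assumptions.

```python
import numpy as np

def hough_accumulator(binary, n_theta=180):
    """Each non-zero pixel casts one vote per theta bin, at rho = x*cos(theta) + y*sin(theta)."""
    ys, xs = np.nonzero(binary)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    diag = int(np.ceil(np.hypot(*binary.shape)))    # largest possible |rho|
    acc = np.zeros((2 * diag + 1, n_theta), dtype=int)
    cols = np.arange(n_theta)
    for x, y in zip(xs, ys):
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int) + diag
        acc[rhos, cols] += 1
    return acc, thetas, diag

# Binary image containing one horizontal segment of 16 pixels on row 7.
edges = np.zeros((20, 20), dtype=np.uint8)
edges[7, 2:18] = 1
acc, thetas, diag = hough_accumulator(edges)
votes_for_row_7 = acc[diag + 7, 90]   # bin for rho = 7, theta = 90 degrees
```

Here all 16 pixels of the segment vote for the bin at rho = 7, theta = 90°, and no bin can hold more than 16 votes. If every pixel of a grayscale image were allowed to vote, almost all bins would fill up and no line would stand out, which matches the "very messy" results described in the question.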
all is about processing binary data,
complex data -> (a binary data, b binary data, c binary data, ..) (using canny(),sobel(), etc)
a binary data -> function1() (using houghlines())
b binary data -> function2()
c binary data -> function3() ..
a binary data -X-> function2() ..
complex data -X-> function1() ..
HTH