find rectangle coordinates in a given image - machine-learning

I'm trying to blindly detect signals in a spectra.
one way that came to my mind is to detect rectangles in the waterfall (a 2D matrix that can be interpret as an image) .
Is there any fast way (in the order of 0.1 second) to find center and width of all of the horizontal rectangles in an image? (heights of rectangles are not considered for me).
an example image will be uploaded (Note I know that all rectangles are horizontal.
I would appreciate it if you give me any other suggestion for this purpose.
e.g. I want the algorithm to give me 9 center and 9 coordinates for the above image.

Since the rectangle are aligned, you can do that quite easily and efficiently (this is not the case with unaligned rectangles since they are not clearly separated). The idea is first to compute the average color of each line and for each column. You should get something like that:
Then, you can subtract the background color (blue), compute the luminance and then compute a threshold. You can remove some artefact using a median/blur before.
Then, you can just scan the resulting 1D array filled with binary values so to locate where each rectangle start/stop. The center of each rectangle is ((x_start+x_end)/2, (y_start+y_end)/2).

Related

Detecting contours of predefined shape with OpenCV

I'm working on a project which locates the Machine Readable Zone on ID cards.
For this I need to do some pre processing to extract the ID card from a scanned image which typically are randomly disposed on a white page. I'm able to locate the majority of the cards by using a Histogram equalization with CLAHE before a contour detection. But in some cases the border around the MRZ is totally invisible (white on white) as shown on the attached image.
I'd like to detect rectangle of a predefined shape as I know the shape of the ID card will be always the same but so far I wasn't able to find a way do do something like this with OpenCV.
Basically what I need is to find two rectangle of a fixed ratio that best match the 2 cards on the scan.
I'm wondering if I need to try OpenCV matchers or if there is a simpler way to accomplish this kind of detection.
The solution to you problem is likely going to be matrix transformations. The concept is to pinpoint 4 coordinates on the card that can be easily detected using opencv, such as the the rectangle colored in blue & cyan.
Have coordinates of the card with the predefined shape stored in an array, where a corner of the card is at the 0, 0. Also store the coordinates of the blue * cyan rectangle in an array. With the two arrays you can find the perspective transform of the two arrays using the cv2.getPerspectiveTransform method.
Using the perspective transform found, you can detect the coordinates of the whole card every time you detect the coordinates of the blue & cyan rectangle.

Calculation of center point for the localization of robot in 3D data

I am trying to find a reliable method to calculate the corner points of a container. From these corner point’s idea is to calculate the center point of the container for the localization of robot, it means that the calculated center point will be the destination of robot in order to pick the container. For this I am looking for any suggestions to calculate the corner points or may be if any possibility to calculate the center point directly. Up to this point PCL library C/C++ is used for the processing of the 3D data.
The image below is the screenshot of the container.
thanks in advance.
afterApplyingPassthrough
I did the following things:
I binarized the image (black pixels = 0, green pixels = 1),
inverted the image (black pixels = 1, green pixels = 0),
eroded the image with 3x3 kernel N-times and dilated it with same kernel M-times.
Left: N=2, M=1;Right: N=6, M=6
After that:
I computed contours of all non-zero areas and
removed the contour that surrounded entire image.
This are the contours that remained:
I do not know how "typical" input image looks like in your case. Since I only have access to one sample image, I would rather not speculate about "general solution" that will be suitable for you. But to solve this particular case, you could analyze every contour in the following way:
compute rotatated rectangle that fits best around your contour (you need something similar to minAreaRect from OpenCV)
compute areas of rectangle and contour interior
if the difference between contour area and the area of the rotated bounding rectangle is small, the contour has approximately rectangular shape
find the contour that is both rectangular and satisfies some other condition (for example: typical area of the container). Assume that this belongs to container and compute its center.
I am not claiming that this is a solution that will work well in real world scenarios. It is also not fast. You should view it as a "sketch" that shows how to extract some useful information.
I assume the wheels maintain the cart a known offset from the floor and you can identify the floor. Filter out all points which are too close to the floor (this will remove wheels and everything but cart which will help limit data and simplify later steps.
If you isolate the cart, you could apply a simple average point (centroid), alternately, if that is not precise, you could try finding the bounding box of the isolated cart (min max in primary directions) and then take the centroid of that bounding box (this should be more accurate, but will still need a slight vertical offset due to the top handles).
If you can not isolate the cart or the other methods are not working well, you could try using PCL sample consensus specifically SACMODEL_LINE. This will be an involved strategy, but will give very solid results, basically run through and find each line and subtract its members from the cloud so as to find the next best line. After you have your 4 primary cart lines, use their parameters to find your centroid. *this would also be robust against random items being in or on the cart as well as carts of various sizes (assuming they always had linear perpendicular walls)

Understanding Distance Transform in OpenCV

What is Distance Transform?What is the theory behind it?if I have 2 similar images but in different positions, how does distance transform help in overlapping them?The results that distance transform function produce are like divided in the middle-is it to find the center of one image so that the other is overlapped just half way?I have looked into the documentation of opencv but it's still not clear.
Look at the picture below (you may want to increase you monitor brightness to see it better). The pictures shows the distance from the red contour depicted with pixel intensities, so in the middle of the image where the distance is maximum the intensities are highest. This is a manifestation of the distance transform. Here is an immediate application - a green shape is a so-called active contour or snake that moves according to the gradient of distances from the contour (and also follows some other constraints) curls around the red outline. Thus one application of distance transform is shape processing.
Another application is text recognition - one of the powerful cues for text is a stable width of a stroke. The distance transform run on segmented text can confirm this. A corresponding method is called stroke width transform (SWT)
As for aligning two rotated shapes, I am not sure how you can use DT. You can find a center of a shape to rotate the shape but you can also rotate it about any point as well. The difference will be just in translation which is irrelevant if you run matchTemplate to match them in correct orientation.
Perhaps if you upload your images it will be more clear what to do. In general you can match them as a whole or by features (which is more robust to various deformations or perspective distortions) or even using outlines/silhouettes if they there are only a few features. Finally you can figure out the orientation of your object (if it has a dominant orientation) by running PCA or fitting an ellipse (as rotated rectangle).
cv::RotatedRect rect = cv::fitEllipse(points2D);
float angle_to_rotate = rect.angle;
The distance transform is an operation that works on a single binary image that fundamentally seeks to measure a value from every empty point (zero pixel) to the nearest boundary point (non-zero pixel).
An example is provided here and here.
The measurement can be based on various definitions, calculated discretely or precisely: e.g. Euclidean, Manhattan, or Chessboard. Indeed, the parameters in the OpenCV implementation allow some of these, and control their accuracy via the mask size.
The function can return the output measurement image (floating point) - as well as a labelled connected components image (a Voronoi diagram). There is an example of it in operation here.
I see from another question you have asked recently you are looking to register two images together. I don't think the distance transform is really what you are looking for here. If you are looking to align a set of points I would instead suggest you look at techniques like Procrustes, Iterative Closest Point, or Ransac.

Deforming an image so that curved lines become straight lines

I have an image with free-form curved lines (actually lists of small line-segments) overlayed onto it, and I want to generate some kind of image-warp that will deform the image in such a way that these curves are deformed into horizontal straight lines.
I already have the coordinates of all the line-segment points stored separately so they don't have to be extracted from the image. What I'm looking for is an appropriate method of warping the image such that these lines are warped into straight ones.
thanks
You can use methods similar to those developed here:
http://www-ui.is.s.u-tokyo.ac.jp/~takeo/research/rigid/
What you do, is you define an MxN grid of control points which covers your source image.
You then need to determine how to modify each of your control points so that the final image will minimize some energy function (minimum curvature or something of this sort).
The final image is a linear warp determined by your control points (think of it as a 2D mesh whose texture is your source image and whose vertices' positions you're about to modify).
As long as your energy function can be expressed using linear equations, you can globally solve your problem (figuring out where to send each control point) using linear equations solver.
You express each of your source points (those which lie on your curved lines) using bi-linear interpolation weights of their surrounding grid points, then you express your restriction on the target by writing equations for these points.
After solving these linear equations you end up with destination grid points, then you just render your 2D mesh with the new vertices' positions.
You need to start out with a mapping formula that given an output coordinate will provide the corresponding coordinate from the input image. Depending on the distortion you're trying to correct for, this can get exceedingly complex; your question doesn't specify the problem in enough detail. For example, are the curves at the top of the image the same as the curves on the bottom and the same as those in the middle? Do horizontal distances compress based on the angle of the line? Let's assume the simplest case where the horizontal coordinate doesn't need any correction at all, and the vertical simply needs a constant correction based on the horizontal. Here x,y are the coordinates on the input image, x',y' are the coordinates on the output image, and f() is the difference between the drawn line segment and your ideal straight line.
x = x'
y = y' + f(x')
Now you simply go through all the pixels of your output image, calculate the corresponding point in the input image, and copy the pixel. The wrinkle here is that your formula is likely to give you points that lie between input pixels, such as y=4.37. In that case you'll need to interpolate to get an intermediate value from the input; there are many interpolation methods for images and I won't try to get into that here. The simplest would be "nearest neighbor", where you simply round the coordinate to the nearest integer.

Stretch an image to fit in any quadrangle

The application PhotoFiltre has an option to stretch part of an image. You select a rectangular shape and you can then grab and move the vertexes somewhere else to make any quadrangle. The image part which you selected will stretch along. Hopefully these images make my point a little clearer:
Is there a general algorithm which can handle this? I would like to obtain the same effect on HTML5 canvas - given an image and the resulting corner points, I would like to be able to draw the stretched image in such a way that it fills the new quadrangle neatly.
A while ago I asked something similar, where the solution was to divide the image up in triangles and stretch each triangle so that each three points correspond to the three points on the original image. This technique turned out to be rather exprensive and I would like if there is a more general method of accomplishing this.
I would like to use this in a 3D renderer, but I would like to work with a (2D) quadrangle.
I don't know whether PhotoFiltre internally also uses triangles, or whether it uses another (cheaper) algorithm to stretch an image like this.
Does someone perhaps know if there is a cheaper or more general method/algorithm to stretch a rectangular image, so that it fills a quadrangle given four points?
The normal method is to start with the destination, pick an appropriate grid size and then for each point in the new shape calculate the corresponding point in the source image (possibly with interpolation depending on the quality you need)
Affine transform.
Given four points for the "stretched" figure and four points for the figure it should match (e.g. a rectangle), an affine transform provides the spatial mapping you need. For each point (x1,y1) in the original image there is a corresponding point (x2,y2) in the second, "stretched" image.
For each integer-valued pixel (x2, y2) in the stretched image, use the affine transform to find the corresponding real-valued point (x1, y1) in the original image and apply its color to (x2,y2).
http://demonstrations.wolfram.com/AffineTransform/
You'll find sample code for Java and other languages online. .NET has the Matrix class.

Resources