Given an image containing text in both horizontal and vertical orientations, I want to detect which bounding boxes contain vertical text and how the text in each such box is oriented (read top to bottom, or horizontal characters stacked vertically).
So far, what I have come up with only helps in detecting a box's orientation:
Use a Sobel edge detector to get the edges of the text and then dilate them. I can then perform connected component analysis to get the bounding box of each component.
By comparing a bounding box's width and height, I can tell whether its orientation is horizontal or vertical (assuming the text is written closely enough that, after dilation, the whole text becomes a single component rather than one character per bounding box). [Sample output of dilated edge mask]
But this only works when the text is written closely enough to be detected as a single component (what if the text is large and has big spaces between characters?), and it only tells me the orientation of the box.
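For reference, here is a rough sketch of that pipeline (Python/OpenCV; the kernel size, dilation iterations, thresholds, and the file name 'page.png' are placeholders I would still have to tune):

    import cv2
    import numpy as np

    img = cv2.imread('page.png', cv2.IMREAD_GRAYSCALE)

    # Sobel edges in x and y, combined into a single magnitude image.
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)
    mag = cv2.convertScaleAbs(cv2.magnitude(gx, gy))
    _, edges = cv2.threshold(mag, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Dilate so the characters of one line/column merge into a single blob.
    edges = cv2.dilate(edges, np.ones((5, 5), np.uint8), iterations=2)

    # Connected component analysis; classify each box by its aspect ratio.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(edges)
    for i in range(1, n):                      # label 0 is the background
        x, y, w, h, area = stats[i]
        print(i, (x, y, w, h), 'vertical' if h > w else 'horizontal')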
I have seen a lot of posts and research papers, but most of them deal with the case where the whole document is rotated at some angle, or they use machine learning.
I just need some heuristics using image processing that would help me detect the above with reasonable accuracy.
Related
I need to find the orientation of corn pictures (examples below); they are tilted at different angles to the right or left. I need to turn them upright (at a 90 degree angle to their normal), so that they look like a water drop.
Is there any way I can do this easily?
As a starting point, find the image moments (and Hu moments for complex shapes like a pear). From the link:
Information about image orientation can be derived by first using the second order central moments to construct a covariance matrix.
I suspect that using an image processing library like OpenCV could give more reliable results in the common case.
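For illustration, a minimal sketch of the moment-based orientation estimate (Python/OpenCV; it assumes a binary image with the corn in white on a black background, and 'corn.png' is just a placeholder name):

    import cv2
    import math

    img = cv2.imread('corn.png', cv2.IMREAD_GRAYSCALE)
    _, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Second order central moments -> covariance matrix -> major axis angle.
    m = cv2.moments(bw, binaryImage=True)
    mu20 = m['mu20'] / m['m00']
    mu02 = m['mu02'] / m['m00']
    mu11 = m['mu11'] / m['m00']
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)

    # Rotate by -theta (plus 90 degrees if the long axis should be vertical).
    print('major axis angle in degrees:', math.degrees(theta))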
From the OP I got the impression you are a rookie at this, so I will stick to something simple:
compute the bounding box of the image
Simple enough: go through all pixels and remember the min and max of the x,y coordinates of non-background pixels.
compute critical dimensions
Just cast a few lines through the bounding box and compute the positions of the red points. For the start points I chose 25%, 50%, and 75% of the height. First start from the left and stop at the first non-background pixel, then start from the right and stop at the first non-background pixel.
axis-aligned position
Start rotating the image in small steps and remember/stop at the position where the red dots are symmetric, i.e. almost the same distance from the left as from the right. The bounding box also has maximal height and minimal width in the axis-aligned position, so you can exploit that instead ...
determine the position
You get 4 options. If I call the distances l0,l1,l2 and r0,r1,r2:
l means from the left, r means from the right
0 is the upper (bluish) line, 1 the middle, 2 the bottom
Then the wanted position is when (l0==r0)>=(l1==r1)>=(l2==r2) and the bounding box is bigger along the y axis than along the x axis, so rotate by 90 degrees until a match is found, or determine the orientation directly from the distances and rotate just once ...
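A minimal sketch of the scan-line measurement in Python/NumPy (not the original C++/VCL code; it assumes 'mask' is a 2D boolean array that is True on non-background pixels):

    import numpy as np

    def edge_distances(mask):
        # Bounding box of the non-background pixels.
        ys, xs = np.nonzero(mask)
        x0, x1, y0, y1 = xs.min(), xs.max(), ys.min(), ys.max()
        dists = []
        for f in (0.25, 0.50, 0.75):           # the three horizontal scan lines
            y = int(y0 + f * (y1 - y0))
            row = np.nonzero(mask[y, x0:x1 + 1])[0]
            l = row[0]                          # distance from the left edge
            r = (x1 - x0) - row[-1]             # distance from the right edge
            dists.append((l, r))
        return dists, (x1 - x0 + 1, y1 - y0 + 1)

    # Rotate in small steps (e.g. scipy.ndimage.rotate), recompute the distances,
    # and keep the angle where l and r match on all three lines and the bounding
    # box is tallest and narrowest.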
[Notes]
You will need access to the image pixels, so I strongly recommend using Graphics::TBitmap from VCL. Look here: gfx in C, especially the section GDI Bitmap; finding horizon on high altitude photo might also help a bit.
I use C++ and VCL, so you will have to translate it to Pascal, but the VCL stuff is the same ...
I'd like to figure out a method for finding the bounding boxes of words or pairs of words in a binary image. The image itself looks like this (the bounding boxes I need are marked by blue rectangles):
The image is free of any other objects. I'm thinking about some form of connected component analysis, like detecting single letters first, then "drawing" their bounding boxes on another Mat object in such a way that neighbouring letters connect. There is a useful piece of information I'd like to utilize: a word or a pair of words forms a horizontal line, which could be used to separate "Hello there" from "abcdf"; I just don't know how to do it.
Contour the image.
Pick contours with a suitable area and width/height to be letters; get the coordinates of their centers.
From the list of centers, decide how far apart two centers can be to count as adjacent letters rather than a gap.
Group these contours into a word and take their bounding box.
OpenCV has clustering, contour-area, and bounding-box functions if you don't want to do it yourself.
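A rough sketch of those steps in Python/OpenCV (the letter-size limits, the line-bucket height, the maximum gap, and the file name 'words.png' are guesses that depend on the font size):

    import cv2

    bw = cv2.imread('words.png', cv2.IMREAD_GRAYSCALE)
    contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # Keep contours that look like letters and remember their bounding boxes.
    letters = [cv2.boundingRect(c) for c in contours]
    letters = [b for b in letters if 5 < b[2] < 100 and 5 < b[3] < 100]

    # Sort by rough text line, then by x, and group boxes whose horizontal
    # gap is small enough to belong to the same word.
    letters.sort(key=lambda b: (b[1] // 20, b[0]))
    max_gap = 30
    words, current = [], [letters[0]]
    for b in letters[1:]:
        prev = current[-1]
        same_line = abs(b[1] - prev[1]) < 20
        if same_line and b[0] - (prev[0] + prev[2]) < max_gap:
            current.append(b)
        else:
            words.append(current)
            current = [b]
    words.append(current)

    # Union of the letter boxes = word bounding box.
    for group in words:
        x0 = min(b[0] for b in group); y0 = min(b[1] for b in group)
        x1 = max(b[0] + b[2] for b in group); y1 = max(b[1] + b[3] for b in group)
        print('word box:', (x0, y0, x1 - x0, y1 - y0))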
Do an OX (horizontal) dilation using window size N, where N is approximately 1-2 letter widths; you will then have black filled "boxes".
Find contours (see http://docs.opencv.org/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html).
Find the rectangles and correct their width (subtract approximately one letter width) to compensate for the widening caused by the dilation.
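A small sketch of this in Python/OpenCV (it assumes white text on a black background; N and the file name are placeholders to tune):

    import cv2

    N = 15                                          # roughly 1-2 letter widths
    bw = cv2.imread('words.png', cv2.IMREAD_GRAYSCALE)

    # Dilate along the x axis only, so the letters of one word merge into a box.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (N, 1))
    merged = cv2.dilate(bw, kernel)

    contours, _ = cv2.findContours(merged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        # Compensate for the horizontal widening caused by the dilation.
        print('word box:', (x + N // 2, y, max(w - N, 1), h))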
I am trying to crop a picture right along the contour. The object is detected using SURF features, and then I want to crop the image exactly as detected.
When cropping, part of the boundary of other objects gets included. I want to crop along the green line below. OpenCV has RotatedRect, but I am unsure whether it is good for cropping.
Is there a way to crop exactly along the green line?
I assume you got your example from http://docs.opencv.org/doc/tutorials/features2d/feature_homography/feature_homography.html. What you can do is find the minimum axis-aligned bounding box around the green bounding box, crop it from the image, use the inverted homography matrix (H.inv()) to transform that sub-image into a new image (call cv::warpPerspective), and then crop your green bounding box (it should be axis-aligned in your new image).
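A minimal sketch of that idea in Python (assuming, as in the tutorial, that H maps points of the object image into the scene image, and that obj_w and obj_h are the size of the original object image):

    import cv2
    import numpy as np

    def crop_detected(scene, H, obj_w, obj_h):
        # Warp the scene back into the object's coordinate frame; there the
        # green quadrilateral becomes the axis-aligned rectangle
        # (0, 0)-(obj_w, obj_h), so the warp output is already the crop.
        return cv2.warpPerspective(scene, np.linalg.inv(H), (obj_w, obj_h))

If warping the full scene is too expensive, you can first crop the axis-aligned bounding box as described above and adjust H for the crop offset.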
You can get the equations of the lines from their end points. Use these equations to check whether any given pixel lies within the green box, i.e. whether it lies between the left and right lines and between the top and bottom lines. Run this over the entire image and reset anything that doesn't lie within the box to black.
I'm not sure about built-in functionality for this, but this simple methodology is guaranteed to work. For higher accuracy, you may want to consider sub-pixel checks.
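A small sketch of the per-pixel test (Python/NumPy; it assumes the four green-box corners are known and given in clockwise order):

    import numpy as np

    def black_out_outside(img, corners):
        h, w = img.shape[:2]
        ys, xs = np.mgrid[0:h, 0:w]
        inside = np.ones((h, w), dtype=bool)
        for i in range(4):
            x0, y0 = corners[i]
            x1, y1 = corners[(i + 1) % 4]
            # The sign of the cross product says on which side of the edge
            # (corner i -> corner i+1) each pixel lies.
            inside &= (x1 - x0) * (ys - y0) - (y1 - y0) * (xs - x0) >= 0
        out = img.copy()
        out[~inside] = 0                       # reset everything outside to black
        return out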
Let's say I take a picture of two hammers side by side (they may be aligned differently, but there is always one on the right and one on the left), where each might look like this, and I want to calculate the ratio of the lengths of the hammers' handles.
For example, the output for an input image would be the length of the red part (the handle) of the hammer on the left divided by the length of the handle of the hammer on the right.
How would I go about doing this?
If you know the handle color, it doesn't sound hard. Just select those pixels and take the longer side of a minimum oriented bounding box.
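For example (Python/OpenCV; the HSV thresholds for "reddish" are guesses, the file name is a placeholder, and it simply assumes one hammer in each half of the picture):

    import cv2

    img = cv2.imread('hammers.png')
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 80, 80), (10, 255, 255))   # low-hue reds

    lengths = []
    half = mask.shape[1] // 2
    for part in (mask[:, :half], mask[:, half:]):
        pts = cv2.findNonZero(part)
        # Longer side of the minimum oriented bounding box = handle length.
        (_, _), (w, h), _ = cv2.minAreaRect(pts)
        lengths.append(max(w, h))

    print('handle length ratio (left / right):', lengths[0] / lengths[1])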
Here are a couple of hints:
Make sure that the bounding boxes of the hammers don't overlap. If you can guarantee this, try this approach:
Scale the image to width = 10%, height = 10 px. Find the largest run of background-colored pixels near the middle of the image; that allows you to separate the two hammers into individual images. Multiply the positions by 10 to transform them back into coordinates of the original image.
Create two images (one for each hammer)
Crop the border
Scale the image to width = 10 px, height = 10%. Count all reddish pixels (save the image and examine the pixel values for the red and non-red parts to get an idea of what to look for).
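A sketch of those hints in Python/OpenCV (it assumes a light background and reddish handles; the brightness and HSV thresholds and the file name are guesses):

    import cv2
    import numpy as np

    img = cv2.imread('hammers.png')
    h, w = img.shape[:2]

    # Shrink to width = 10%, height = 10 px and look for the most background-like
    # column near the middle; multiply by 10 to map it back to the original image.
    small = cv2.cvtColor(cv2.resize(img, (w // 10, 10)), cv2.COLOR_BGR2GRAY)
    bg_per_col = (small > 200).sum(axis=0)
    mid = small.shape[1] // 2
    split = (mid // 2 + int(np.argmax(bg_per_col[mid // 2:mid + mid // 2]))) * 10

    def handle_length(part):
        # Collapse the width to 10 px, keep 10% of the height; the count of
        # reddish pixels is then roughly proportional to the handle length.
        strip = cv2.resize(part, (10, part.shape[0] // 10))
        hsv = cv2.cvtColor(strip, cv2.COLOR_BGR2HSV)
        return cv2.countNonZero(cv2.inRange(hsv, (0, 80, 80), (10, 255, 255)))

    print('ratio:', handle_length(img[:, :split]) / handle_length(img[:, split:]))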
Consider that we are given an isometric grid of tiles (think of something like Diablo). We have some measurements for the grid, such as grid height, grid width, and tile height/width. Consider this image:
The center cell of the grid is 0,0, extending iso-north (+y), iso-south (-y), iso-east (+x), iso-west (-x).
Let's say we want to draw a rectangle at an arbitrary location on the grid. We do NOT have the isometric position of the rectangle, but rather its normal draw coordinates on the grid, where the top-left corner is 0,0, south is +y, and right is +x.
Given the top, left, height, and width of the rectangle in question, how could we calculate an array of the iso-cells that are crossed by the bottom edge of the rectangle?
Any language you choose to demonstrate this will suffice.
Some papers and books about isometric programming (Isometric programming with Direct X7; yes, it's old, but it gives an overview of the problems and techniques) use mousemaps.
There is also the technique of rendering the area of the map covered by the rectangle into an image, giving each tile a unique color (and rendering only that color). Afterwards, you check which colors appear in the image and thereby extract the list of tiles.
Since you are using classic isometric tiles (width is twice the height), there could be a mathematical solution too. Unfortunately, any suggested algorithm would depend heavily on your map layout; a sketch of the idea follows below.
The code for a Java-based TileSystem can be found here.
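As a sketch of the mathematical route (Python, assuming one common diamond layout with tile width = 2 × tile height and the screen origin at the centre of cell 0,0; the tile size is a placeholder and the signs may need flipping for your axis conventions):

    TILE_W, TILE_H = 64, 32        # placeholder tile size (2:1 diamond tiles)

    def screen_to_iso(sx, sy, origin_x, origin_y):
        # Invert the usual diamond projection to get the cell under a pixel.
        dx = (sx - origin_x) / (TILE_W / 2.0)
        dy = (sy - origin_y) / (TILE_H / 2.0)
        return int(round((dx + dy) / 2.0)), int(round((dy - dx) / 2.0))

    def cells_under_bottom_edge(left, top, width, height, origin_x, origin_y):
        # Sample the rectangle's bottom edge every quarter tile and collect
        # each distinct cell it passes through.
        y = top + height
        cells, x = [], float(left)
        while x <= left + width:
            cell = screen_to_iso(x, y, origin_x, origin_y)
            if cell not in cells:
                cells.append(cell)
            x += TILE_W / 4.0
        return cells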