I am trying to build a document scanner using OpenCV, and I want to auto-crop an uploaded image. I have a few use cases where there is a gap in the border because the document extends out of the frame (the captured image).
Example image:
Below is the Canny edge detection of the given image.
The borders are missing here, and findContours does not return proper results because of this.
How can I handle such images?
Neither automatic Canny edge detection nor dilation works in such cases, because they can only join edges across small gaps.
Also, a few documents might have only 2 or 3 sides captured by the camera. How can we crop out the other areas, which are not required?
Example Image:
Is there any specific technique for handling such documents?
Please suggest a few ideas.
Your problem is unusual. One way to solve this problem which comes to my mind is to:
Add white borders around the image.
https://docs.opencv.org/3.4/dc/da3/tutorial_copyMakeBorder.html
Find lines in the edge image.
http://www.robindavid.fr/opencv-tutorial/chapter5-line-edge-and-contours-detection.html
https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_houghlines/py_houghlines.html
Use probabilistic Hough lines (HoughLinesP).
Crop the image along these lines. This should work for images like the first one.
For images like the second one, you can use perpendicular and parallel lines.
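To make the "crop along these lines" step a bit more concrete, here is a minimal sketch in plain Python that turns four detected border lines into four corner points. It assumes the lines are already available in the (rho, theta) form that cv2.HoughLines returns (HoughLinesP returns segment endpoints instead, so those would need converting first). The function names and the left/top/right/bottom labelling are hypothetical; choosing which detected lines are the document borders is the hard part and is not covered here.

```python
import math


def intersect_hough_lines(line_a, line_b, eps=1e-9):
    """Intersect two lines given as (rho, theta), i.e. x*cos(theta) + y*sin(theta) = rho.

    Returns (x, y), or None when the lines are (nearly) parallel.
    """
    rho_a, theta_a = line_a
    rho_b, theta_b = line_b
    a1, b1 = math.cos(theta_a), math.sin(theta_a)
    a2, b2 = math.cos(theta_b), math.sin(theta_b)
    det = a1 * b2 - a2 * b1  # equals sin(theta_b - theta_a)
    if abs(det) < eps:
        return None
    # Cramer's rule on the 2x2 system.
    x = (rho_a * b2 - rho_b * b1) / det
    y = (a1 * rho_b - a2 * rho_a) / det
    return x, y


def document_corners(left, top, right, bottom):
    """Return corners in the order top-left, top-right, bottom-right, bottom-left."""
    return [
        intersect_hough_lines(top, left),
        intersect_hough_lines(top, right),
        intersect_hough_lines(bottom, right),
        intersect_hough_lines(bottom, left),
    ]
```

When a side of the document is out of frame, the line along the added border can stand in for the missing side, which is the point of padding the image first.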
Your algorithm will certainly need to be fairly complex to work well. The easiest way is to take a picture of the whole document, if that is possible.
Good luck!
I'm working on a project with ARKit and I'm trying to do a perspective correction of the ARFrame.capturedImage to orient a piece of paper sitting on a detected plane so I can feed that into a CoreML model which expects images to be taken from directly overhead.
ARKit gives me the device orientation relative to the plane (ARCamera.transform, ARCamera.eulerAngles, and ARCamera.projectionMatrix all look promising).
So I have the orientation of the camera (and I know the plane is horizontal, since that's all ARKit detects right now), but I can't quite figure out how to create a GLKMatrix4 that will perform the correct perspective correction.
Originally I thought it would be as easy as transforming by the inverse of ARCamera.projectionMatrix, but that doesn't appear to work at all. I'm not entirely sure what that matrix is describing; it doesn't seem to change much based on the device orientation.
I've tried creating my own matrix using GLKMatrix4Rotate and the roll/pitch/yaw, but that didn't work. I couldn't even get it working with a single axis of rotation.
I found GLKMatrix4MakePerspective, GLKMatrix4MakeOrtho, and GLKMatrix4MakeFrustum which seem to do perspective transforms but I can't figure out how to take the information I have and translate it to the inputs of those functions to make the proper perspective transformation.
Edit:
As an example to better explain what I'm trying to do, I used the Perspective Warp tool in Photoshop to transform an example image; what I want to know is how to come up with a matrix that will perform a similar transform given the info I have about the scene.
I ended up using iOS11 Vision's Rectangle Detection and then feeding it into Core Image's CIPerspectiveCorrection filter.
I solved this using OpenCV's perspective transformation (https://docs.opencv.org/trunk/da/d6e/tutorial_py_geometric_transformations.html, https://docs.opencv.org/2.4/modules/imgproc/doc/geometric_transformations.html#getperspectivetransform).
If you're able to get the corners of your paper in the scene (for example with an ARReferenceImage, projecting them into 2D), use them. Otherwise you can try to detect the corners through OpenCV directly (see https://stackoverflow.com/a/12636153/9298773) from the UIImage returned by sceneView.snapshot(), with sceneView of type ARSCNView. In this last case I'd suggest binarizing the image first and changing the MAX_CORNERS variable in the snippet at the link above to 4 (the 4 corners of your paper).
Then create a new cv::Mat with a width and height of your choice that respect the width-to-height ratio of your paper, and do the perspective transform. For a guide to this last step, take a look at the section "Perspective Correction using Homography" at this link: https://www.learnopencv.com/homography-examples-using-opencv-python-c/#download. Succinctly: you ask OpenCV to find an appropriate transform that projects the paper's corner points, as seen in perspective, onto a perfectly rectangular plane (your new cv::Mat).
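For illustration, this is roughly the computation behind that step, written in plain Python so it runs without OpenCV. It is only a sketch under a couple of assumptions: the four corners are already known and ordered consistently in source and destination, and the bottom-right entry of the matrix can be fixed to 1. In OpenCV itself you would normally call getPerspectiveTransform and then warpPerspective rather than hand-rolling this; the function names below are hypothetical.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_corners(src, dst):
    """3x3 matrix mapping four src points onto four dst points (h33 assumed to be 1)."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map one (x, y) point through h, including the perspective divide."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )
```

For example, with the detected paper corners as src and [(0, 0), (210, 0), (210, 297), (0, 297)] as dst, the output rectangle keeps the A4 proportions.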
I'd like to register two images with the findTransformECC function offered by OpenCV.
My images have an irregular surrounding border that I'd like to mask. I have worked with feature-based matching functions from the Feature2D library together with findHomography, which worked quite well and allowed masking of image parts that should not be taken into account when estimating the transformation parameters.
findTransformECC doesn't offer such masking, so I clipped the images to a centered rectangle. The clipped images are aligned very well after transformation. Since I'm using MOTION_EUCLIDEAN, which is just a rotation and a translation, I thought I could use exactly the same transformation matrix to align the images at their original extent, but I was proved wrong. The images aren't correctly aligned after transforming them. The orientation of the transformed images seems to be OK, but the images show a wrong translation. My thought was that when the input images are clipped with exactly the same centered clipping area and the rotation is performed around the center, the final translation should fit as well?
Any suggestions appreciated.
In OpenCV 3.* masking is possible with the findTransformECC function. You can use the argument inputMask in the function.
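As for the translation mismatch described in the question, one likely cause is that a 2x3 Euclidean warp matrix rotates about the image origin (the top-left pixel) rather than the image center, so the translation it contains depends on where that origin is. Below is a minimal sketch of converting a matrix estimated on the crops into one for the full images. It assumes both images were cropped at the same top-left offset, and the function name is hypothetical.

```python
def euclidean_warp_for_full_image(warp_cropped, crop_offset):
    """Re-express a 2x3 Euclidean warp estimated on crops in full-image coordinates.

    warp_cropped: [[r00, r01, tx], [r10, r11, ty]] estimated between the two crops.
    crop_offset:  (ox, oy), top-left corner of the crop inside the full image
                  (assumed identical for both images).

    With p_crop = p_full - o and p_crop' = R * p_crop + t, it follows that
    p_full' = R * p_full + (t + o - R * o).
    """
    (r00, r01, tx), (r10, r11, ty) = warp_cropped
    ox, oy = crop_offset
    full_tx = tx + ox - (r00 * ox + r01 * oy)
    full_ty = ty + oy - (r10 * ox + r11 * oy)
    return [[r00, r01, full_tx], [r10, r11, full_ty]]
```

The rotation part is unchanged, which matches the observation that the orientation looks right while the translation is off. With a pure translation (identity rotation) the correction term vanishes and the matrix can be reused as is.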
I am trying to apply this tutorial (http://docs.opencv.org/doc/tutorials/imgproc/imgtrans/canny_detector/canny_detector.html) to an image using OpenCV on Android.
The problem is that in that guide they use a 5x5 Gaussian filter. Now I know that you lose pixels at the edges: with a 3x3 filter you lose one pixel on each side, so with a 5x5 filter you are going to lose 2, right?
OpenCV, however, seems to keep rendering them even after the edge detection has been applied. How do they do it?
Now I know that you lose pixels, right?
No, you don't lose pixels.
Usually this is achieved by creating a border around the image before applying the filter.
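As a small illustration of the idea, here is a sketch in plain Python of a 3x3 mean filter that reads "virtual" border pixels by mirroring indices, so the output has the same size as the input. The mirroring rule used here corresponds to what OpenCV calls BORDER_REFLECT_101; whether a particular OpenCV call uses that rule or another one depends on its borderType argument, so treat the choice here as an assumption. The function names are hypothetical, and the image must be larger than the filter radius in both dimensions.

```python
def reflect_101(i, n):
    """Mirror an out-of-range index back into [0, n) without repeating the edge pixel."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def box_filter(image, radius=1):
    """Mean filter over a (2*radius+1)^2 window; output has the same size as the input."""
    h, w = len(image), len(image[0])
    window = (2 * radius + 1) ** 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Pixels outside the image are read from their mirrored position.
                    total += image[reflect_101(y + dy, h)][reflect_101(x + dx, w)]
            row.append(total / window)
        out.append(row)
    return out
```

Because every output pixel, including those on the edge, gets a full window of values, nothing is cropped away.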
I have an application which requires that a solid black outline be drawn around a partly-transparent UIImage. Not around the frame of the image, but rather around all the opaque parts of the image itself. I.e., think of a transparent PNG with an opaque white "X" on it -- I need to outline the "X" in black.
To make matters trickier, AFTER the outline is drawn, the opacity of the original image will be adjusted, but the outline must remain opaque -- so the outline I generate has to include only the outline, and not the original image.
My current technique is this:
Create a new UIView which has the dimensions of the original image.
Duplicate the UIImage 4 times and add the duplicates as subviews of the UIView, with each UIImage offset diagonally from the original location by a couple pixels.
Turn that UIView into an image (via the typical UIGraphicsGetImageFromCurrentImageContext method).
Using CGImageMaskCreate and CGImageCreateWithMask, subtract the original image from this new image, so only the outline remains.
It works. Even with only the 4 offset images, the result looks quite good. However, it's horribly inefficient, and causes a good solid 4-second delay on an iPhone 4.
So what I need is a nice, speedy, efficient way to achieve the same thing, which is fully supported by iOS 4.0.
Any great ideas? :)
I would like to point out that whilst a few people have suggested edge detection, this is not an appropriate solution. Edge detection is for finding edges within image data where there is no obvious exact edge representation in the data.
For you, edges are much better defined: you are looking for a well-defined outline. An edge in your case is any pixel which is fully transparent and next to a pixel which is not fully transparent. Simple as that! Iterate through every pixel in the image and set it to black if it fulfils these conditions.
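A minimal sketch of that pixel test, in Python for brevity (on iOS you would run the same loop over the raw alpha bytes of the bitmap). It assumes the alpha channel is available as a 2D list of values from 0 to 255, and it treats "next to" as the 8 surrounding pixels, which is my assumption; the function name is hypothetical.

```python
def outline_mask(alpha):
    """Return a 2D list of booleans marking the outline pixels.

    A pixel is part of the outline when it is fully transparent (alpha == 0)
    and at least one of its 8 neighbours is not fully transparent.
    """
    h, w = len(alpha), len(alpha[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if alpha[y][x] != 0:
                continue  # only transparent pixels can become outline
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if (dy or dx) and 0 <= ny < h and 0 <= nx < w and alpha[ny][nx] > 0:
                        mask[y][x] = True
    return mask
```

Painting the True pixels opaque black into a separate image gives an outline that contains none of the original image, so its opacity can stay fixed when the original is faded. This gives a one-pixel outline; for a thicker one, widen the neighbourhood.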
Alternatively, for an anti-aliased result, get a boolean representation of the image, and pass over it a small anti-aliased circle kernel. I know you said custom filters are not supported, but if you have direct access to image data this wouldn't be too difficult to implement by hand...
Cheers, hope this helps.
For the sake of contributing new ideas:
A variant on your current implementation would use CALayer's support for shadows, which it calculates from the actual pixel contents of the layer rather than merely its bounding rectangle, and for which it uses the GPU. You can try amping up the shadowOpacity to some massive value to try to eliminate the feathering; failing that, you could render to a suitable CGContext, take out the alpha channel only and manually process it to apply a threshold test on alpha values, pushing them either to fully opaque or fully transparent.
You can achieve that final processing step on the GPU even under ES 1 through a variety of ways. You'd use the alpha test to apply the actual threshold, you could then, say, prime the depth buffer to 1.0, disable colour output and the depth test, draw the version with the shadow at a depth of 0.5, draw the version without the shadow at a depth of 1.0 then enable colour output and depth tests and draw a solid black full-screen quad at a depth of 0.75. So it's like using the depth buffer to emulate stencil (since the GPU Apple used before the ES 2 capable device didn't support a stencil buffer).
That, of course, assumes that CALayer shadows appear outside of the compositor, which I haven't checked.
Alternatively, if you're willing to limit your support to ES 2 devices (everything 3GS+) then you could upload your image as a texture and do the entire process over on the GPU. But that would technically leave some iOS 4 capable devices unsupported so I assume isn't an option.
You just need to implement an edge detection algorithm, but instead of using brightness or color to determine where the edges are, use opacity. There are a number of different ways to go about that. For example, you can look at each pixel and its neighbors to identify areas where the opacity crosses whatever threshold you've set. Whenever you need to look at every pixel of an image in MacOS X or iOS, think Core Image. There's a helpful series of blog posts starting with this one that looks at implementing a custom Core Image filter -- I'd start there to build an edge detection filter.
Instead of using a UIView, I suggest just pushing an image context, like the following:
UIGraphicsBeginImageContextWithOptions(image.size, NO, 0.0);
// Draw your image 4 times and mask it however you like; you can just copy and paste
// your current drawing code here.
....
outlinedimage = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
This will be much faster than your UIView.