Image plotted by plt.imshow() is inverted while same image by cv2_imshow() is fine, how do I know what my neural net gets? - opencv

Here is my snippet for both of them
from google.colab.patches import cv2_imshow
import cv2
pt = '/content/content/DATA/testing_data/1/126056495_AO_BIZ-0000320943-Process_IP_Cheque_page-0001.jpg' ##param
img = cv2.imread(pt)
cv2_imshow(img)
and here is the other one
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
pt = '/content/content/DATA/testing_data/1/126056495_AO_BIZ-0000320943-Process_IP_Cheque_page-0001.jpg'
image = mpimg.imread(pt)
plt.imshow(image)
Now, the image in the second case is inverted,
while the image on my system is upright.
What I am mostly afraid of is that my ML model is consuming an inverted image, which is probably hurting my accuracy. What could be the reason for this, and how do I fix it?
(ps: I cannot share the pictures unfortunately, as they are confidential )
(Run on google colab)
All the help is appreciated

So your picture is upside-down when you read it with one method and upright when you read it with the other?
You use two different functions to read the image file:
OpenCV's cv2.imread()
Matplotlib's mpimg.imread() (mpimg is matplotlib.image in your snippet)
They behave differently. By default, OpenCV's imread() respects the EXIF orientation stored in the file's metadata and rotates the image as instructed. Matplotlib's function does not.
Solution: Stick to OpenCV's imread(). Don't use Matplotlib's function to read the file.
The issue is not with how matplotlib displays the image. When plt.imshow() is called, it presents the image with the origin in the top left corner, i.e. the Y-axis grows downward. That corresponds to how cv.imshow() behaves.
If your plot does have a Y-axis growing upwards, causing the image to stand upside-down, then you must have set this plot up in specific ways that aren't presented in your question.
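To check what a model would actually receive, it can help to inspect the orientation tag directly. The following is a minimal sketch using Pillow, which is not part of the original snippets; the helper names exif_orientation and load_upright are made up for illustration. It assumes the file stores its rotation in the standard EXIF Orientation tag (0x0112).

```python
from PIL import Image, ImageOps

ORIENTATION_TAG = 0x0112  # standard EXIF "Orientation" tag id


def exif_orientation(source):
    """Return the EXIF orientation value (1 means 'already upright')."""
    with Image.open(source) as img:
        return img.getexif().get(ORIENTATION_TAG, 1)


def load_upright(source):
    """Open an image and apply its EXIF orientation, if it has one."""
    img = Image.open(source)
    # exif_transpose rotates/flips the pixels according to the tag
    # and returns a copy when no change is needed.
    return ImageOps.exif_transpose(img)
```

If exif_orientation() returns something other than 1 for a training file, any reader that ignores the tag will hand the network a rotated or flipped array, so the same reader should be used for training and inference.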

Related

Midas depth map: strange lines on sharp edge

I'm using Hugging Face's DPT large to compute depth map.
Here is an example of my problem:
(credit: museum of Genève)
The depth map contains some little white lines just above the mountains in the background.
How can I avoid them?
btw: I have cloned the repo and it works well on my local computer, so I have access to the code. I can do pre- and post-processing, but as a non-specialist I cannot patch Midas itself.
EDIT: I'm using Midas exactly as in the example: https://huggingface.co/spaces/akhaliq/DPT-Large/blob/main/app.py By the way, the effect I describe is visible in the official demo.
EDIT: when I feed the extractor with the original 1148x790 image, the issue does not appear. It appears with an image resized to 600x413. Thus a solution could be to use only non-resized images.
Answering my own question.
It turns out that the issue disappears with:
Using the model "DPT_BEiT_L_512"
transform = midas_transforms.dpt_transform
prediction = torch.nn.functional.interpolate(
    prediction.unsqueeze(1),
    size=img.shape[:2],
    mode="bilinear",  # <--- instead of bicubic
    antialias=True,
    align_corners=True,
)

imagemagick cropping image creates jagged edges or saw-tooth shapes

The picture above (it looks zoomed in) comes from the first-level conversion of the 1.AI file into the 1_cropped.AI file, which is what I get after cropping. I don't resize during cropping; the image gets resized automatically.
I am trying to crop an image. It seems that ImageMagick is unable to crop without +repage. The problem is that a simple crop created the jagged lines you can see in the snapshot taken from a portion of the image.
How can I remove them? Somewhere in a Stack Overflow post I found a recommendation to use "Gaussian blur", but I didn't find a proper command for it. Many thanks! I am doing just the crop and no resizing.
Original: due to copyright I can't show the entire image, but one section is below:
I am now looking into http://www.imagemagick.org/Usage/antialiasing/ but have been unable to smooth the 'staircase' or 'jaggies' so far.
UPDATE from the comments:
Yes, the input is AI and the output can be almost any format: AI/SVG/PNG/GIF/JPEG/BMP. For lower-resolution outputs such as PNG/GIF I don't get the jagged shapes. I tried turning on anti-aliasing, blurring and Gaussian blurring, but no luck. I think the repaging zooms the image, which I don't need. Is it possible to set the canvas somehow so that the original resolution is kept intact when converting from AI to AI? Yes, I first convert AI to AI after cropping and then feed the converted AI into further processing. The stair-stepping appears in that first AI-to-AI conversion itself.

How to separate the query and the train image from the Mat object returned from DrawMatches() method

I am trying to detect an object in a video. I am using SURF as the feature detector and descriptor extractor, and BRUTEFORCE as the matcher. I tested my work with faces: I captured a picture of myself, and when I run the camera and point it at me, my face gets detected and a rectangle is drawn around it. As another test, I captured an image of my mouse and resized it, but when I run the camera, the mouse is not detected.
The problems I am facing are:
1- Does the size of the query/object image matter in such cases? I ask because the image I captured of myself is bigger than the one of the mouse, and the face is detected while the mouse is not.
2- Regardless of which image I use as the query/object image, how do I display a camera preview of only the train/scene image, without the query/object image? I ask because what I am getting is shown in the images posted below, while what I want is what is shown here. I checked the code in that link; it is in C++, but I followed the same steps. The tutorial uses the 'drawMatches' method, whose Java counterpart is Features2D.DrawMatches(), and both return a Mat object with the query/object image on the left side and the train/scene image on the right side, as also shown in the image I posted below.
What I want is to display only the camera output, without the query/object image: the area designated for the camera output should show only the train/scene image captured from the camera.
Please let me know how to solve these issues; I want to do something like what is shown in the tutorial I linked.
1 - Size matters, but in your case I think the most crucial problem is "textureness". SURF detects interest points where the "texture gradient" is strong. In the case of your mouse, the gradient is mostly smooth, except around the logo (fujitsu), the button and the border of the image. In the tutorial you point to, you will notice it uses a very textured object to demonstrate the effect.
2 - To the best of my knowledge, there is no fully automatic method to do what you want, but it can be done in a few steps. Basically, you must determine the bounding box of your object in the scene image and then draw it. For drawing, the easiest option is cv::rectangle, but you can be more precise with four (or more) cv::line calls. To determine the bounding box, you can estimate the extreme points among the filtered matches.
Good luck!
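The "extreme points" step in point 2 can be sketched as follows. This is a minimal illustration in plain Python rather than the asker's Java; bounding_box is a hypothetical helper, and the sample coordinates stand in for the scene-image positions of the matches that survived filtering.

```python
def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) enclosing a list of (x, y) points."""
    if not points:
        raise ValueError("no matched points to enclose")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical keypoint locations of the good matches in the camera frame.
matched_scene_points = [(120.5, 80.0), (200.0, 95.5), (150.0, 160.0)]
x_min, y_min, x_max, y_max = bounding_box(matched_scene_points)
```

The two corners (x_min, y_min) and (x_max, y_max) can then be handed to the rectangle-drawing call on the camera frame alone, so the query image never has to be shown.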

Best way to plot array as image with ROI selection and scale

I have a 2D numpy array that I need to plot as an image with a certain scale. Within that image I need to be able to select a ROI or at least be able to display the mouse coordinates (of a specific target contained in the image). I tried using pyqtgraph but I can't seem to plot an image as a data source rather than just an image (i.e. can't seem to set axes, etc)... what would be the best way to do this, then? The image browser is compiled as a widget with a slider that scrolls through frames of the file; this widget is then embedded in a main window with a few table widgets.
I think imshow in matplotlib might work for you. It is easy to zoom, pan, and scale, and works easily with numpy.
(If this answer doesn't work for you, could you please refine your question? I'm unsure whether you're looking for any tool that will do the job, or something that works within the context of a GUI that you've already implemented. If the latter, I think you'll probably need to do the ROI yourself by, say, selecting areas of the numpy array to plot, e.g. a[xmin:xmax, ymin:ymax].)
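Doing the ROI by hand, as suggested above, mostly comes down to converting scaled coordinates into array indices before slicing. Here is a minimal sketch, assuming a uniform scale (pixel_size, in physical units per pixel) and the usual image layout where rows are y and columns are x; crop_roi is a made-up helper name.

```python
import numpy as np


def crop_roi(a, x_range, y_range, pixel_size=1.0):
    """Return the part of 2D array `a` covering the given scaled x/y ranges."""
    x0, x1 = sorted(x_range)
    y0, y1 = sorted(y_range)
    # Convert physical units to indices and clamp them to the array bounds.
    c0 = max(int(np.floor(x0 / pixel_size)), 0)
    c1 = min(int(np.ceil(x1 / pixel_size)), a.shape[1])
    r0 = max(int(np.floor(y0 / pixel_size)), 0)
    r1 = min(int(np.ceil(y1 / pixel_size)), a.shape[0])
    return a[r0:r1, c0:c1]
```

With matplotlib, the same scale can be given to imshow through its extent argument so that the axes and the reported mouse coordinates are in scaled units as well.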

What processing steps should I use to clean photos of line drawings?

My usual method of 100% contrast and some brightness adjusting to tweak the cutoff point usually works reasonably well to clean up photos of small sub-circuits or equations for posting on E&R.SE, however sometimes it's not quite that great, like with this image:
What other methods besides contrast (or instead of) can I use to give me a more consistent output?
I'm expecting a fairly general answer, but I'll probably implement it in a script (that I can just dump files into) using ImageMagick and/or PIL (Python) so if you have anything specific to them it would be welcome.
Ideally a better source image would be nice, but I occasionally use this on other folk's images to add some polish.
The first step is to equalize the illumination differences in the image while taking into account the white balance issues. The theory here is that the brightest part of the image within a limited area represents white. By blurring the image beforehand we eliminate the influence of noise in the image.
from PIL import Image
from PIL import ImageFilter
im = Image.open(r'c:\temp\temp.png')
white = im.filter(ImageFilter.BLUR).filter(ImageFilter.MaxFilter(15))
The next step is to create a grey-scale image from the RGB input. By scaling to the white point we correct for white balance issues. By taking the max of R,G,B we de-emphasize any color that isn't a pure grey such as the blue lines of the grid. The first line of code presented here is a dummy, to create an image of the correct size and format.
grey = im.convert('L')  # dummy: only provides an image of the right size and mode
width, height = im.size
impix = im.load()
whitepix = white.load()
greypix = grey.load()
for y in range(height):
    for x in range(width):
        r, g, b = impix[x, y][:3]
        wr, wg, wb = (max(1, w) for w in whitepix[x, y][:3])  # avoid division by zero
        greypix[x, y] = min(255, max(255 * r // wr, 255 * g // wg, 255 * b // wb))
The result of these operations is an image that has mostly consistent values and can be converted to black and white via a simple threshold.
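The per-pixel loop above can be slow on large photos. A vectorized sketch of the same white-point scaling with NumPy might look like this; it assumes the image and the "white" image have already been converted to H x W x 3 arrays (for example with np.asarray), uses integer division, and the function names and the default threshold of 128 are illustrative choices, not part of the original answer.

```python
import numpy as np


def normalize_to_white(rgb, white):
    """Scale each channel by its local white point and keep the max channel."""
    rgb = np.asarray(rgb, dtype=np.uint32)
    # Clamp the white point to at least 1 to avoid division by zero.
    white = np.maximum(np.asarray(white, dtype=np.uint32), 1)
    scaled = (255 * rgb) // white
    return np.minimum(scaled.max(axis=2), 255).astype(np.uint8)


def binarize(grey, threshold=128):
    """Simple fixed threshold: True where the pixel counts as paper (white)."""
    return grey >= threshold
```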
Edit: It's nice to see a little competition. nikie has proposed a very similar approach, using subtraction instead of scaling to remove the variations in the white level. My method increases the contrast in the regions with poor lighting, and nikie's method does not - which method you prefer will depend on whether there is information in the poorly lighted areas which you wish to retain.
My attempt to recreate this approach resulted in this:
for y in range(height):
    for x in range(width):
        greypix[x,y] = min(255, max(255 + impix[x,y][0] - whitepix[x,y][0], 255 + impix[x,y][1] - whitepix[x,y][1], 255 + impix[x,y][2] - whitepix[x,y][2]))
I'm working on a combination of techniques to deliver an even better result, but it's not quite ready yet.
One common way to remove the different background illumination is to calculate a "white image" from the image with a morphological filter (the code below dilates and then erodes, which is strictly speaking a closing, although the variable is named opened).
In this sample Octave code, I've used the blue channel of the image, because the lines in the background are least prominent in this channel (EDITED: using a circular structuring element produces fewer visual artifacts than a simple box):
src = imread('lines.png');
blue = src(:,:,3);
mask = fspecial("disk",10);
opened = imerode(imdilate(blue,mask),mask);
Result:
Then subtract this from the source image:
background_subtracted = opened-blue;
(contrast enhanced version)
Finally, I'd just binarize the image with a fixed threshold:
binary = background_subtracted < 35;
How about detecting edges? That should pick up the line drawings.
Here's the result of Sobel edge detection on your image:
If you then threshold the image (using either an empirically determined threshold or the Otsu method), you can clean up the image using morphological operations (e.g. dilation and erosion). That will help you get rid of broken/double lines.
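For the automatic option, the Otsu method picks the threshold that maximizes the variance between the two resulting classes. A minimal pure-Python sketch, assuming the input is a flat sequence of 8-bit grey values (otsu_threshold is an illustrative name, not a library call):

```python
def otsu_threshold(pixels):
    """Return t such that values <= t form one class and values > t the other."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class so far
    sum0 = 0  # sum of grey values in the lower class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break  # upper class empty, nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t
```

Binarizing is then a single comparison per pixel, e.g. [p > t for p in pixels]; libraries such as OpenCV offer a built-in Otsu mode that would normally be preferred over hand-rolled code.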
As Lambert pointed out, you can pre-process the image using the blue channel to get rid of the grid lines if you don't want them in your result.
You will also get better results if you light the page evenly before you image it (or just use a scanner), because then you don't have to worry as much about global vs. local thresholding.
