Midas depth map: strange lines on sharp edge - opencv

I'm using Hugging Face's DPT large to compute a depth map.
Here is an example of my problem:
(credit: museum of Genève)
The depth map contains some little white lines just above the mountains in the background.
How can I avoid them?
By the way, I have cloned the repo and it works well on my local computer, so I have access to the code. I can do pre- or post-processing, but as a non-specialist I cannot patch Midas itself.
EDIT: I'm using Midas exactly as in the example: https://huggingface.co/spaces/akhaliq/DPT-Large/blob/main/app.py By the way, the effect I describe is visible in the official demo.
EDIT: when I feed the extractor with the original 1148x790 image, the issue does not appear. It appears with an image resized to 600x413. Thus one solution could be to use only non-resized images.

Answer to myself.
It turns out that the issue disappears with:
Using the model "DPT_BEiT_L_512"
transform = midas_transforms.dpt_transform
prediction = torch.nn.functional.interpolate(
    prediction.unsqueeze(1),
    size=img.shape[:2],
    mode="bilinear",  # <--- instead of bicubic
    antialias=True,
    align_corners=True,
)
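The answer does not say why switching from bicubic to bilinear helps. One plausible explanation (an assumption, not something confirmed above) is that cubic interpolation kernels have negative lobes, so they can overshoot and undershoot next to a sharp step such as the mountain/sky boundary, while linear interpolation always stays between its two neighbours. The 1-D sketch below uses only the standard library; the names cubic_weight and resample are made up for this illustration, and the coefficient a = -0.75 is an assumed value for the cubic kernel.

```python
import math

def cubic_weight(t, a=-0.75):
    """Keys cubic convolution kernel; a=-0.75 is an assumed coefficient."""
    t = abs(t)
    if t <= 1:
        return (a + 2) * t**3 - (a + 3) * t**2 + 1
    if t < 2:
        return a * t**3 - 5 * a * t**2 + 8 * a * t - 4 * a
    return 0.0

def resample(values, x, mode):
    """Sample a 1-D signal at fractional position x, clamping at the borders."""
    i = math.floor(x)
    frac = x - i

    def at(k):
        return values[min(max(k, 0), len(values) - 1)]

    if mode == "linear":
        return (1 - frac) * at(i) + frac * at(i + 1)
    # Cubic: weighted sum of the four nearest samples.
    return sum(at(i + k) * cubic_weight(frac - k) for k in (-1, 0, 1, 2))

# A sharp edge: "far" depth (0.0) next to "near" depth (1.0).
edge = [0.0] * 4 + [1.0] * 4
positions = [k / 4 for k in range(29)]  # 4x upsampling over 0..7

linear = [resample(edge, x, "linear") for x in positions]
cubic = [resample(edge, x, "cubic") for x in positions]

print("linear range:", min(linear), max(linear))
print("cubic range: ", min(cubic), max(cubic))
```

The cubic result leaves the original [0, 1] range on both sides of the edge. After a depth map is normalised for display, values like these would show up as thin bright or dark lines along the edge.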


Image plotted by plt.imshow() is inverted while same image by cv2_imshow() is fine, how do I know what my neural net gets?

Here is my snippet for both of them
from google.colab.patches import cv2_imshow
import cv2
pt = '/content/content/DATA/testing_data/1/126056495_AO_BIZ-0000320943-Process_IP_Cheque_page-0001.jpg' ##param
img = cv2.imread(pt)
cv2_imshow(img)
and here is the other one
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
pt = '/content/content/DATA/testing_data/1/126056495_AO_BIZ-0000320943-Process_IP_Cheque_page-0001.jpg'
image = mpimg.imread(pt)
plt.imshow(image)
Now, the image in the second case is inverted,
while the image on my system is upright.
What I am mostly afraid of is that my ML model is consuming an inverted image, which is probably hurting my accuracy. What could be the reason for this, and how do I fix it?
(ps: I cannot share the pictures unfortunately, as they are confidential )
(Run on google colab)
All the help is appreciated
Your picture is upside-down when you use one method for reading, and upright when you use the other method?
You use two different methods to read the image file:
OpenCV cv.imread()
Matplotlib mpimg.imread()
They behave differently. OpenCV's imread() respects the file's orientation metadata (EXIF) and rotates the image as instructed. Matplotlib's imread() does not.
Solution: Stick to OpenCV's imread(). Don't use Matplotlib's imread().
The issue is not with how matplotlib displays the image. When plt.imshow() is called, it presents the image with the origin in the top left corner, i.e. the Y-axis grows downward. That corresponds to how cv.imshow() behaves.
If your plot does have a Y-axis growing upwards, causing the image to stand upside-down, then you must have set this plot up in specific ways that aren't presented in your question.
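To check what the file itself says, one option is to read the EXIF Orientation tag directly. The sketch below uses only the Python standard library and assumes a JPEG file whose EXIF block sits in an APP1 segment with the Orientation tag (0x0112) in the first IFD. The function names are made up for this example. A value of 1, or no tag at all, means the pixels are stored upright; any other value means a reader that honours the tag will rotate or flip the image, and a reader that ignores it will not.

```python
import struct

def _orientation_from_tiff(tiff):
    """Look for tag 0x0112 (Orientation) in the first IFD of a TIFF block."""
    if len(tiff) < 8:
        return None
    if tiff[:2] == b"II":
        endian = "<"  # little-endian
    elif tiff[:2] == b"MM":
        endian = ">"  # big-endian
    else:
        return None
    (ifd_offset,) = struct.unpack(endian + "I", tiff[4:8])
    if ifd_offset + 2 > len(tiff):
        return None
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        entry = tiff[start:start + 12]
        if len(entry) < 12:
            return None
        tag, _type, _n = struct.unpack(endian + "HHI", entry[:8])
        if tag == 0x0112:
            # Orientation is a SHORT stored in the first two value bytes.
            (value,) = struct.unpack(endian + "H", entry[8:10])
            return value
    return None

def exif_orientation(data):
    """Return the EXIF Orientation value from JPEG bytes, or None if absent."""
    if data[:2] != b"\xff\xd8":
        return None  # not a JPEG
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: metadata segments are over
            return None
        (seg_len,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + seg_len]
        if marker == 0xE1 and payload[:6] == b"Exif\x00\x00":
            return _orientation_from_tiff(payload[6:])
        pos += 2 + seg_len
    return None
```

Usage would be along the lines of exif_orientation(open(pt, "rb").read()), with pt being the path from the question. If it returns something other than 1 or None, the two readers can legitimately disagree about which way is up.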

imagemagick cropping image creates jagged edges or saw-tooth shapes

The picture above (it looks zoomed in) comes from the first-level conversion from the 1.AI file to 1_cropped.AI, the file I get after cropping. I don't resize during cropping; it gets resized automatically.
I am trying to crop an image. It seems that without +repage ImageMagick is unable to crop. The problem is that a simple crop created the jagged lines you can see in the snapshot taken from a portion of the image.
How can I remove them? Somewhere in a Stack Overflow post I found a recommendation to use "Gaussian blur", but I didn't find a proper command to do that. Many thanks! I am doing just the crop and no resizing.
Original: due to copyright I can't show the entire image, but below is one section:
I am now looking into http://www.imagemagick.org/Usage/antialiasing/ but have been unable to smooth the 'staircase' or 'jaggies' so far.
UPDATE from the comments:
Yes, the input is AI and the output is almost any format: AI/SVG/PNG/GIF/JPEG/BMP. For smaller-resolution files such as PNG/GIF I don't get the jagged shapes. I tried turning on anti-aliasing, blurring and Gaussian blurring, but no luck. I think the repaging zooms the image, which I don't need. Is it possible to set the canvas somehow so that the original resolution is kept intact when converting from AI to AI? Yes, initially I convert AI to AI after cropping and then feed the converted AI file into further processing. The stair-stepping appears in the first-level AI to AI conversion itself.

Cropping image By selecting Object and color matching

We are developing an app where we need to crop an image according to the selected object area. The user will draw a line, and we need to select the object and crop it. The crop needs to work like the app YourMoji.
So far we have tried to get the color of the pixels along the line, compare those with the color of every pixel in the image, and make a path from the result to clip the image. But we are getting almost nowhere.
Is it possible to crop an image this way, or are we going in the wrong direction? Can anyone provide a way to do this, or suggest how to modify what we have done so far? Any advice and suggestions will be greatly appreciated!
Thanks in advance.
I guess what you want is the image segmentation algorithm called Graph Cut.
Here are two Github repositories, hope these would help:
GraphCut
GrabCutIOS
I'm not exactly clued up on image manipulation, but the first algorithm that comes to mind is something like this:
Take the average of the pixels in the line (as you have)
Since you appear to want faces, you might want to weight reds and blues over green. Not much green in faces of any skin tone.
For each pixel, if its colour differs from your selected average by more than a given threshold, remove it / make it transparent.
Perhaps the closer to the original line (or centroid), the less strict the threshold becomes.
I'd then provide the user with some tools for:
Sensitivity: how large the threshold is
Eraser: to remove parts of the image that your algorithm missed
Paintbrush: to replace parts of the image that your algorithm incorrectly removed.
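The averaging and thresholding steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the GraphCut or GrabCut approach from the other answer: the image is a plain list of rows of (r, g, b) tuples, the stroke is a list of (x, y) positions, and the names mean_colour, colour_distance and select_object are hypothetical. The optional weights argument is one way to weight red and blue over green; the distance-dependent threshold is left out.

```python
import math

def mean_colour(samples):
    """Average a list of (r, g, b) tuples channel by channel."""
    n = len(samples)
    return tuple(sum(c[i] for c in samples) / n for i in range(3))

def colour_distance(a, b, weights=(1.0, 1.0, 1.0)):
    """Weighted Euclidean distance between two (r, g, b) colours."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def select_object(image, stroke, threshold, weights=(1.0, 1.0, 1.0)):
    """Return a mask: True where a pixel is close enough to the stroke average."""
    reference = mean_colour([image[y][x] for x, y in stroke])
    return [
        [colour_distance(pixel, reference, weights) <= threshold for pixel in row]
        for row in image
    ]

# Tiny 2x2 example: three skin-like pixels and one green background pixel.
image = [
    [(200, 150, 120), (205, 148, 118)],
    [(20, 200, 30), (198, 152, 125)],
]
stroke = [(0, 0), (1, 0)]  # the user drew across the top row
mask = select_object(image, stroke, threshold=10)
print(mask)
```

The threshold plays the role of the "Sensitivity" tool: raising it keeps more pixels, lowering it keeps fewer. Pixels where the mask is False are the ones to make transparent.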

How to separate the query and the train image from the Mat object returned from DrawMatches() method

I am trying to detect an object in a video. I am using SURF as the feature detector and descriptor extractor, and brute force as the matcher. I tested my work with faces: I captured a picture of myself, and when I run the camera and point it at me, my face is detected and a rectangle is drawn around it. For another test, I captured an image of my mouse and resized it, but when I run the camera the mouse is not detected.
The problems I am facing are:
1 - Does the size of the query/object image matter in such cases? I ask because the image I captured of myself is bigger than the one of the mouse, and the face is detected while the mouse is not.
2 - Regardless of which image I use as the query/object image, how can I display a camera preview of only the train/scene image, without the query/object image? I ask because what I get looks like the images posted below, while what I want is what is shown here. I checked the code in that link; it is in C++, but I followed the same steps. The tutorial uses the drawMatches method, whose Java counterpart is Features2D.DrawMatches(), and both return a Mat object with the query/object image on the left and the train/scene image on the right, as in the image I posted below.
What I want is to display the camera output without the query/object image: the area designated for the camera output should show only the train/scene image captured from the camera.
Please let me know how to solve these issues; I want to do something like what is shown in the tutorial I linked.
1 - Size matters, but in your case I think the most crucial problem is "textureness". SURF detects interest points where the "texture gradient" is strong. In the case of your mouse, the surface is mostly smooth, except around the logo (Fujitsu), the button and the border of the image. In the tutorial you point to, you will notice that it uses a very textured object to demonstrate the effect.
2 - To the best of my knowledge, there is no fully automatic method to do what you want, but it can be done in a few steps. Basically, you must determine the bounding box of your object and then draw it. To draw it, the easiest way is to use cv::rectangle, but you can be more precise with four (or more) cv::line calls. To determine the bounding box, you can take the extreme points among the filtered matches.
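The "extreme points among the filtered matches" step can be sketched like this. It is a plain-Python illustration with no OpenCV calls; the tuple layout (scene_x, scene_y, distance) and the function names are assumptions made for the example, standing in for whatever the matcher and keypoint list actually provide.

```python
def good_scene_points(matches, max_distance):
    """Keep the scene-image coordinates of matches below a distance cut-off.

    Each match is assumed to be a (scene_x, scene_y, distance) tuple.
    """
    return [(x, y) for x, y, distance in matches if distance <= max_distance]

def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) of the points, or None if empty."""
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

# Hypothetical matches: two good ones and one poor one that gets filtered out.
matches = [(10, 20, 0.1), (50, 5, 0.2), (30, 40, 0.9)]
box = bounding_box(good_scene_points(matches, max_distance=0.5))
print(box)
```

The two corners of the box can then be passed to the rectangle-drawing call on the camera frame itself, so the preview shows only the scene image with the box on top, rather than the side-by-side Mat produced by drawMatches.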
Good luck!

flood fill performance issue on iPad

I am using a 4-way flood fill algorithm.
I have a transparent image with a black outline.
This is the starting image (without color).
And after filling in the color, it looks like this
Please help me and let me know what I can do to get a proper fill.
I have used and implemented flood fill myself in other projects. The algorithm goes through the whole drawing, looking for closed spaces, and then draws inside (or outside) them.
Your problem happens with every tool that fills a drawing, and the cause is always the same: the spaces are not 100% closed.
The flood fill algorithm goes pixel by pixel and stops when it detects a black pixel. For example, if the arm of the scuba diver is not thick enough or has holes in it, the flood fill algorithm manages to go through it and does not detect it as an empty space.
Nobody here can tell you why unless we take your project and analyse it, so the best I can offer is a guideline about where your error could be.
I tried the code with an image that has a very precisely defined border around it (from here) and it seems to work OK with that image. I suspect that if you zoom into your image, you will find some grey anti-aliasing around the edges which won't get filled. Perhaps the algorithm has a threshold function that can be tweaked?
Try setting the andTolerance value (I tried 4 which seemed to improve my example).
//Call function to flood fill and get new image with filled color
UIImage *image1 = [self.image floodFillFromPoint:tpoint withColor:newcolor andTolerance:4];
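To see why a tolerance helps with anti-aliased edges, here is a minimal 4-way flood fill sketch in Python. It is not the implementation behind floodFillFromPoint:withColor:andTolerance:; it works on a 2D list of grey levels (0 to 255) rather than a UIImage, and the function name and argument layout are assumptions made for the illustration.

```python
from collections import deque

def flood_fill(pixels, start, new_value, tolerance=0):
    """4-way flood fill in place; returns the number of pixels filled.

    A neighbour is filled when its value is within `tolerance` of the
    value found at the start point.
    """
    height, width = len(pixels), len(pixels[0])
    sx, sy = start
    target = pixels[sy][sx]
    queue = deque([(sx, sy)])
    seen = {(sx, sy)}
    filled = 0
    while queue:
        x, y = queue.popleft()
        pixels[y][x] = new_value
        filled += 1
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < width and 0 <= ny < height
                    and (nx, ny) not in seen
                    and abs(pixels[ny][nx] - target) <= tolerance):
                seen.add((nx, ny))
                queue.append((nx, ny))
    return filled

# White area (255), one slightly grey anti-aliased pixel (250),
# a black outline pixel (0), then white on the other side.
row = [255, 255, 250, 0, 255]

strict = [row[:]]
flood_fill(strict, (0, 0), 100)               # tolerance 0
print(strict)   # [[100, 100, 250, 0, 255]]: the grey pixel is left unfilled

relaxed = [row[:]]
flood_fill(relaxed, (0, 0), 100, tolerance=8)
print(relaxed)  # [[100, 100, 100, 0, 255]]: filled up to the outline
```

With tolerance 0 the near-white pixel next to the outline stays unfilled, which is the halo effect described above. A small tolerance fills it while still stopping at the black outline; a tolerance that is too large would eventually leak through the outline.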
