So I have some images that contain some sort of color and scale grid. The grids used when taking the photos have fixed physical sizes, but their apparent size may differ from image to image.
The final goal is to feed them into machine learning models.
An example image is shown below:
I wonder how I can automatically resize the images so that the grid appears at the same size in every image?
My thoughts:
Detect the crosses shown in the grid, or the black and white boxes, then find their apparent lengths in the image.
Compute the ratio (actual length)/(apparent length) for each image.
Rescale the image by multiplying both of its dimensions by this ratio (a rough sketch of these steps is below).
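Something like this is roughly what I have in mind for the detect/measure/rescale steps (just a sketch; the pattern size, the physical square size, and the target resolution below are made-up placeholder values, and I'm assuming the grid is close enough to a chessboard for OpenCV's findChessboardCorners to pick it up):

```python
import cv2
import numpy as np

PATTERN_SIZE = (7, 5)     # inner corners (cols, rows) of the grid -- placeholder
SQUARE_MM = 10.0          # physical size of one grid square -- placeholder
TARGET_PX_PER_MM = 10.0   # resolution I want every output image to have -- placeholder

img = cv2.imread("photo.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

found, corners = cv2.findChessboardCorners(gray, PATTERN_SIZE)
if found:
    corners = corners.reshape(PATTERN_SIZE[1], PATTERN_SIZE[0], 2)
    # Apparent square size = mean distance between horizontally adjacent corners.
    apparent_px = np.linalg.norm(np.diff(corners, axis=1), axis=2).mean()
    scale = TARGET_PX_PER_MM / (apparent_px / SQUARE_MM)
    resized = cv2.resize(img, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
    cv2.imwrite("photo_rescaled.jpg", resized)
```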
However, there is the problem of camera distortion. We cannot simply apply one ratio to all parts of the image when there is distortion.
I am also not sure how to detect the crosses or the black and white boxes within the image and obtain their coordinates. OpenCV has a findChessboardCorners function, but there doesn't seem to be one specifically for this type of grid. Maybe I should simply detect corners or boxes? But how can I make sure the boxes I find are the ones I want, and extract the information I need for computing lengths?
And how can I do camera calibration here?
Appreciate any guidance!
Related
I'm working on a graduation project for image forgery detection using a CNN. Most of the papers I have read downscale the images before feeding the dataset to the network. I want to know: how does this process affect the image information?
Images are resized/rescaled to a specific size for a few reasons:
(1) It allows the user to fix the input size of their network. When designing a CNN you need to know the shape (dimensions) of your data at each step, so having a static input size is an easy way to make sure your network gets data of the shape it was designed to take.
(2) Using a full resolution image as the input to the network is very inefficient (super slow to compute).
(3) In most cases the features you want to extract/learn from an image are still present after downsampling. So, in a way, resizing an image to a smaller size denoises it, filtering out many of the unimportant details for you.
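For example, a minimal sketch of point (1), assuming a network that expects 224x224 RGB input (the size is just an illustration):

```python
import cv2

TARGET_SIZE = (224, 224)   # whatever your network's input layer expects

img = cv2.imread("sample.jpg")                      # full-resolution image
small = cv2.resize(img, TARGET_SIZE, interpolation=cv2.INTER_AREA)
print(img.shape, "->", small.shape)                 # e.g. (3024, 4032, 3) -> (224, 224, 3)
```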
Well, you change the image's size. Of course that changes its information.
You cannot reduce image size without discarding information. Simple case: throw away every second pixel to scale the image to 50%.
Scaling up adds new pixels. In its simplest form you duplicate pixels, creating redundant information.
More complex solutions create the new pixels (fewer or more of them) by averaging neighbouring pixels or interpolating between them.
Scaling up is reversible. It neither creates nor destroys information.
Scaling down divides the amount of information by the square of the downscaling factor*. Upscaling after downscaling results in a blurred image.
(*This is true in a first approximation. If the image doesn't have high frequencies, they are not lost, hence no loss of information.)
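A toy NumPy illustration of the two simple cases above, pixel dropping and pixel duplication (my own example, not from the question):

```python
import numpy as np

img = np.arange(16).reshape(4, 4)        # toy 4x4 "image"

# Downscaling by throwing away every second pixel: the other values are gone.
down = img[::2, ::2]                     # 2x2

# Upscaling by duplicating pixels: redundant information, but reversible.
up = np.repeat(np.repeat(down, 2, axis=0), 2, axis=1)   # back to 4x4
print(np.array_equal(up[::2, ::2], down))   # True  -- upscaling lost nothing
print(np.array_equal(up, img))              # False -- the discarded detail is not restored
```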
I have an image and a version that is scaled down to exactly half the width and height. The Lanczos filter (with a = 3) has been used to scale the image. Color spaces can be ignored, all colors are in a linear space.
Since the small image contains one pixel for each 2x2 pixel block of the original I'm thinking it should be possible to restore the original image from the small one with just 3 additional color values per 2x2 pixel block. However, I do not know how to calculate those 3 color values.
The original image has four times as much information as the scaled version. Using the original image I want to calculate the 3/4 of information that is missing in the scaled version such that I can use the scaled version and the calculated missing information to reconstruct the original image.
Consider the following use-case: Over a network you send the scaled image to a user as a thumbnail. Now the user wants to see the image at full size. How can we avoid repeating information that is already in the thumbnail? As far as I can tell progressive image compression algorithms do not manage to do this with more complex filtering.
For the box filter the problem is trivial. But since the kernels of the Lanczos filter overlap each other I do not know how to solve it. Given that this is just a linear system of equations I believe it is solvable. Additionally I would rather avoid deconvolution in frequency space.
How can I calculate the information that is missing in the down-scaled version and use it to restore the original image?
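For comparison, here is the trivial box-filter case spelled out (my own toy sketch, not the Lanczos solution I'm looking for): the thumbnail pixel is the 2x2 block average, the "missing information" is three of the four original pixels, and the fourth follows from the average.

```python
import numpy as np

orig = np.random.rand(4, 4)                        # toy image with even dimensions

# Box-filter downscale: each thumbnail pixel is the mean of a 2x2 block.
blocks = orig.reshape(2, 2, 2, 2).swapaxes(1, 2)   # (block_row, block_col, 2, 2)
thumb = blocks.mean(axis=(2, 3))

# "Missing information": keep 3 of the 4 pixels per block.
extra = blocks.reshape(2, 2, 4)[:, :, :3]

# Reconstruction: the fourth pixel follows from the block average.
fourth = 4 * thumb - extra.sum(axis=2)
rebuilt = (np.concatenate([extra, fourth[..., None]], axis=2)
           .reshape(2, 2, 2, 2).swapaxes(1, 2).reshape(4, 4))
print(np.allclose(rebuilt, orig))                  # True
```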
I have an image in which my region of interest is the dots in the middle. I would like to get rid of the pixels at the top right, circled in blue. How can I solve this?
I assume that the image is binary. In that case:
If the "dots" (small circles) that you want to detect are larger than the spots you want to remove, you can apply a median filter on the image. The size of the median filter can be determined according to the size of the noise spots. Another possibility is to use morphological operations (erosion and dilation). All these operations are supported by OpenCV.
If the image is not binary, you can start by converting it to binary using a threshold value.
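A minimal OpenCV sketch of both options (the threshold and kernel sizes here are guesses you would tune to the size of your noise spots):

```python
import cv2

img = cv2.imread("dots.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Option 1: median filter -- removes speckles smaller than the kernel.
cleaned_median = cv2.medianBlur(binary, 5)

# Option 2: morphological opening (erosion followed by dilation).
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
cleaned_open = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
```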
My current project is to calculate the surface area covered by paste on a cylinder.
Refer to the images below. They are cropped from the original images taken with a phone camera.
I am thinking in terms of segmentation, but due to light reflections and shadows a simple segmentation won't work.
Can anyone tell me how to find the surface area covered by paste on the cylinder?
First I'd simplify the problem by rectifying the perspective effect (you may need to upscale the image to not lose precision here).
Then I'd scan vertical lines across the image.
Further, you can simplify the problem by segmenting the pixels into two classes, base and painted. Do some statistical analysis to find the value range of the larger region, the base pixels; this will probably use the median of all pixels.
Then expand the color range around this representative value until you find the largest gap in color distance, and repeat the procedure to retrieve the painted pixels. There are other image-processing steps you may need, such as smoothing out noise, removing outliers and the background, etc.
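A rough sketch of the rectification step plus a crude base/painted split (the corner coordinates, output size, and distance threshold are placeholders; the split below is just a median-distance threshold, not the full gap search described above):

```python
import cv2
import numpy as np

img = cv2.imread("cylinder.jpg")

# Rectify the perspective: map four corners of the region of interest to a rectangle.
# These pixel coordinates are placeholders -- read them off your own image.
src = np.float32([[120, 80], [880, 60], [900, 620], [100, 640]])
dst = np.float32([[0, 0], [800, 0], [800, 600], [0, 600]])
M = cv2.getPerspectiveTransform(src, dst)
rect = cv2.warpPerspective(img, M, (800, 600))

# Crude two-class split: distance from the median (base) colour.
base_colour = np.median(rect.reshape(-1, 3), axis=0)
dist = np.linalg.norm(rect.astype(np.float32) - base_colour, axis=2)
painted = dist > 60                       # placeholder threshold -- tune or find the gap
print(f"covered fraction: {painted.mean():.2%}")
```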
I have to find the crosses in the image. What I know is the exact position of each red square. Now I have to decide whether there is a cross inside the square or not. What is the best and fastest way to do this? I am using OpenCV/C++! Well, I could try OpenCV's SVM, but is it fast? Do you have any other ideas?
If you really have the coordinates of the center of each number box and can perhaps adjust the image acquisition a bit, this should be a feasible task.
The problem I see here is that you have a brightness gradient in your image, which you should get rid of either by taking a better picture or by using a big Gaussian filter and an image subtraction.
Otherwise I'm not sure you'll find a good brightness threshold to separate crossed from non-crossed boxes.
Another approach you could use is to calculate the variance of your pixels. This gives you a nice local measure of whether or not a dark pen stroke spread out your pixel distribution. A quick test looks promising.
Note that I didn't have the real positions of the boxes. I just divided your image into equally sized regions, which is not quite correct given the box/sub-box structure. Therefore there are some false positives, caused by the red triangles in each upper-left corner and by some overlapping crosses.
So here's what I did:
Took your image without the red channel and made it a graylevel image.
Filtered this image with a Gaussian of radius 100 and subtracted the result from the image.
Divided your image into (7*6)x(7*2) subregions.
Calculated the variance of each subregion and used a variance threshold of about 0.017 for the above image.
Every box with a larger variance was counted as crossed.
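A small Python/NumPy sketch of that pipeline (Python instead of C++ for brevity; the subregion grid and the 0.017 threshold are the values from above, and the Gaussian sigma is my rough stand-in for the radius-100 filter):

```python
import cv2
import numpy as np

img = cv2.imread("form.png").astype(np.float32) / 255.0

# Drop the red channel and make a graylevel image (mean of blue and green; OpenCV is BGR).
gray = img[:, :, :2].mean(axis=2)

# Remove the brightness gradient: subtract a heavily blurred copy of the image.
background = cv2.GaussianBlur(gray, (0, 0), 100)
flat = gray - background

# Split into subregions and threshold the variance of each one.
rows, cols = 7 * 2, 7 * 6
h, w = flat.shape
crossed = np.zeros((rows, cols), dtype=bool)
for r in range(rows):
    for c in range(cols):
        region = flat[r * h // rows:(r + 1) * h // rows,
                      c * w // cols:(c + 1) * w // cols]
        crossed[r, c] = region.var() > 0.017   # variance threshold from above
```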
Simple solution: if you know the a-priori locations of all boxes, just calculate the average brightness of each box. Boxes with a mark will be much darker than empty boxes.
If not detecting red ink is an option, keep it simple: accumulate all pixels within the red square and threshold on the "redness", i.e. the sum of red values divided by the sum of all color values.
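In code, that redness measure is simply (my own sketch; box would be the RGB pixels inside one red square):

```python
import numpy as np

def redness(box):
    """Sum of red values divided by the sum of all colour values (box is an RGB array)."""
    box = box.astype(np.float64)
    return box[:, :, 0].sum() / box.sum()

# Threshold redness(box) to decide whether the square counts as marked.
```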
Just find the rectangles and then do a simple pixel comparison.