The problem is fairly simple: I have the following image.
My list of points is the white pixels, I have them stored in a texture. What would be the best and possibly most efficient method to determine the trapezoid they define? (Convex shape with 4 corners, doesn't necessarily have 90 degree angles).
The texture is fairly small (800x600), so going for CUDA/CL is definitely not worth it (I'd rather iterate over the pixels if possible).
You should be able to do what you want, i.e. detect lines from incomplete information, using the Hough Transform.
There is a cool demo of it in the examples accompanying CImg which itself is a rather nice, simple, header-only C++ image processing library. I have made a video of it here, showing how the accumulator space on the right is updated as I move the mouse first along a horizontal bar of the cage and then down a vertical bar. You can see the votes cast in the accumulator and that the point in the accumulator gradually builds up to a peak of bright white:
You can also experiment with ImageMagick on the command-line without needing to write or compile any code, see example here. ImageMagick is installed on most Linux distros and is available for macOS and Windows.
So, using your image:
magick trapezoid.png -background black -fill red -hough-lines 9x9+10 result.png
Or, if you want the underlying information that identifies the 4 lines:
magick trapezoid.png -threshold 50% -hough-lines 9x9+10 mvg:
# Hough line transform: 9x9+10
viewbox 0 0 784 561
# x1,y1 x2,y2 # count angle distance
line 208.393,0 78.8759,561 # 14 13 312
line 0,101.078 784,267.722 # 28 102 460
line 0,355.907 784,551.38 # 14 104 722
line 680.493,0 550.976,561 # 12 13 772
If you look at the numbers immediately following the hash (#), i.e. 14, 28, 14, 12, they are the votes, which correspond to the number of points/dots in your original image along that line. That is why I set the threshold to 10, in the 9x9+10 part, rather than using the 40 from the ImageMagick example I linked to: you have relatively few points on each line, so you need a lower threshold.
Note that the Hough Transform is also available in other packages, such as OpenCV.
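If you do end up writing code, OpenCV's Hough implementation is only a few lines from Python. This is just a sketch, assuming your white points are in trapezoid.png and that a vote threshold of 10 is appropriate, as above:

import cv2
import numpy as np

# Load the point image as greyscale; the white pixels cast the votes.
img = cv2.imread("trapezoid.png", cv2.IMREAD_GRAYSCALE)

# Accumulate with 1 px distance resolution, 1 degree angle resolution,
# accepting any line supported by at least 10 points (cf. 9x9+10 above).
lines = cv2.HoughLines(img, 1, np.pi / 180, 10)

if lines is not None:
    # Each entry is (rho, theta); the strongest four should be the sides.
    for rho, theta in lines[:4, 0]:
        print(f"rho={rho:.1f}, theta={np.degrees(theta):.1f} deg")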
Assume I convert an 8-bit TIFF image to 16-bit with the following ImageMagick command:
$ convert 8bit-image.tif -depth 16 16bit-image.tif
The result is a file that is detected by other programs as a file with 16-bit depth:
$ identify 16bit-image.tif
16bit-image.tif TIFF 740x573 740x573+0+0 16-bit sRGB 376950B 0.000u 0:00.000
Naturally, this file does not have "true" 16 bit, since it's an 8 bit file which has simply been marked as 16 bit. It hasn't got the subtle nuances one would expect from true 16 bit. How can I distinguish a true 16 bit image from one that just "pretends"?
Best,
Bela
When you have an 8-bit image, the pixel values range from 0 to 255. For a 16-bit image, the pixel range is from 0 to 65535. So you can express more nuances in 16 bit than you can in 8 bit.
Usually, when you have a 16-bit imager in a camera, it is able to capture these nuances and map them to the full 16 bit range. An 8 bit imager will be limited to a smaller range, so when taking the picture, some information is lost compared to the 16 bit imager.
Now when you start out with an 8 bit image, that information is already lost, so converting to 16 bit will not give you greater nuance, because ImageMagick cannot invent information where there is none.
What image processing tools usually do is simply copy the pixel values of your 8-bit image into the 16-bit image, so your 16-bit image will still contain only values in the range [0,255]. If this is the case in your example, you can check whether the brightest pixel of your 16-bit image is greater than 255. If it is, you can assume that it is a native 16-bit image. If it isn't, it was likely converted from 8 bit.
However, this is no guarantee that the 16-bit image was really converted from 8 bit: it could simply be a very dark native 16-bit image that happens to use only values within the 8-bit range.
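A quick way to run that check, as a rough sketch in Python with Pillow and NumPy (the file name is just the one from the question):

import numpy as np
from PIL import Image

# Look at the actual pixel values stored in the 16-bit file.
img = np.array(Image.open("16bit-image.tif"))

print("maximum pixel value:", img.max())
if img.max() > 255:
    print("values above 255 present - likely a native 16-bit image")
else:
    print("all values fit in 8 bits - possibly an up-converted 8-bit image")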
Edit: It is possible that someone converts the 8-bit image to 16 bit using the full 16-bit range. This could mean that a pixel of value 0 remains at 0, a pixel at 255 is now at 65535, and all values in between are distributed evenly across the 16-bit range.
However, since no new information can be invented, there will be gaps in the pixel values used, e.g. you might have pixels of value 0, 255, 510 and so on, but values in between do not occur.
Depending on the algorithm used for stretching the pixel range, these specific values may differ, but you would be able to spot a conversion like that by looking at the image's histogram:
It will have a distinctive comb-like structure (image taken from http://www.northlight-images.co.uk/digital-black-and-white-working-in-16-bit/)
So depending on how the conversion from 8 to 16 bit was done, finding out whether an image is native 16 bit may be a bit more complicated, and even then you cannot robustly guarantee whether the image was actually converted or not.
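If you suspect the stretched variant, you can also count how many distinct levels the image actually uses; a sketch of that idea (the 256-level cut-off is a heuristic of mine, not a hard rule):

import numpy as np
from PIL import Image

img = np.array(Image.open("16bit-image.tif"))

# An 8-bit image stretched to 16 bit can use at most 256 distinct levels,
# whereas a native 16-bit capture will normally use far more.
levels = np.unique(img).size
print("distinct levels used:", levels)
if levels <= 256:
    print("comb-like histogram - probably stretched from 8 bit")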
I have to stitch a number of tiles using GraphicsMagick to create one single image. I am currently using gm convert with -mosaic and some overlap to stitch the tiles. But the stitched image has a border where the overlap is.
Following is the command I am using:
gm convert -background transparent
-page "+0+0" "E:/Images/Scan 001_TileScan_001_s00_ch00.tif"
-page "+0+948" "E:/Images/Scan 001_TileScan_001_s01_ch00.tif"
-page "+0+1896" "E:/Images/Scan 001_TileScan_001_s02_ch00.tif"
-page "+0+2844" "E:/Images/Scan 001_TileScan_001_s03_ch00.tif"
-mosaic "E:/Output/temp/0.png"
The final image looks like this:
How can I stitch and blend without a border?
I've been part of several projects to make seamless image mosaics. There are a couple of other factors you might like to consider:
Flatfielding. Take a shot of a piece of white card with your lens and lighting setup, then use that to flatten out the image lightness. I don't know if GM has a thing to do this, @fmw42 would know. A flatfield image is specific to a lighting setup, lens aperture setting, focus setting and zoom setting, so you need to lock focus/aperture/zoom after taking one. You'll need to do this correction in linear light.
Lens distortion. Some lenses, especially wide-angle ones, will introduce significant geometric distortion. Take a shot of a piece of graph paper and check that the lines are all parallel. It's possible to use a graph-paper shot to automatically generate a lens model you can use to remove geometric errors, but simply choosing a lens with low distortion is easier.
Scatter. Are you moving the object or the camera? Is the lighting moving too? You can have problems with scatter if you shift the object: bright parts of the object will scatter light into dark areas when they move under a light. You need to model and remove this or you'll see seams in darker areas.
Rotation. You can get small amounts of rotation, depending on how your translation stage works and how carefully you've set the camera up. You can also get the focus changing across the field. You might find you need to correct for this too.
libvips has a package of functions for making seamless image mosaics, including all of the above features. I made an example for you: with these source images (near IR images of painting underdrawing):
Entering:
$ vips mosaic cd1.1.jpg cd1.2.jpg join.jpg horizontal 531 0 100 0
Makes a horizontal join to the file join.jpg. The numbers give a guessed overlap of 100 pixels -- the mosaic program will do a search and find the exact position for you. It then does a feathered join using a raised cosine to make:
Although the images have been flatfielded, you can see a join. This is because the camera sensitivity has changed as the object has moved. The libvips globalbalance operation will automatically take the mosaic apart, calculate a set of weightings for each frame that minimise average join error, and reassemble it.
For this pair I get:
nip2, the libvips GUI, has all this with a GUI interface. There's a chapter in the manual (press F1 to view) about assembling large image mosaics:
https://github.com/jcupitt/nip2/releases
Global balance won't work from the CLI, unfortunately, but it will work from any of the libvips language bindings (C#, Python, Ruby, JavaScript, C, C++, Go, Rust, PHP etc. etc.). For example, in pyvips you can write:
import pyvips

left = pyvips.Image.new_from_file("cd1.1.jpg")
right = pyvips.Image.new_from_file("cd1.2.jpg")

# Horizontal join at the guessed 100-pixel overlap; mosaic searches for the exact offset.
join = left.mosaic(right, "horizontal", 531, 0, 100, 0)

# Re-balance the frame brightnesses to minimise the visible seam.
balance = join.globalbalance()
balance.write_to_file("x.jpg")
Here is an example using ImageMagick. Since the colors are different, a ramped blend will only mitigate the sharp edge. The closer the colors are and the more gradual the blend (i.e. over a larger area), the less it will show.
1) Create red and blue images
convert -size 500x500 xc:red top.png
convert -size 500x500 xc:blue btm.png
2) Create a mask that is solid white for most of the image and a gradient where you want to overlap them. Here I have a 100-pixel gradient for a 100-pixel overlap.
convert -size 500x100 gradient: -size 500x400 xc:black -append -negate mask_btm.png
convert mask_btm.png -flip mask_top.png
3) Put masks into the alpha channels of each image
convert top.png mask_top.png -alpha off -compose copy_opacity -composite top2.png
convert btm.png mask_btm.png -alpha off -compose copy_opacity -composite btm2.png
4) Mosaic the two images one above the other with an overlap of 100
convert -page +0+0 top2.png -page +0+400 btm2.png -background none -mosaic result.png
See also my tidbit about shaping the gradient at http://www.fmwconcepts.com/imagemagick/tidbits/image.php#composite1. But I would use a linear gradient for such work (as shown here), because as you overlap linear gradients they sum to a constant white, so the result will be fully opaque where they overlap.
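If you ever want to do the same feathered overlap outside ImageMagick, it is only a few lines of NumPy. This is just a sketch assuming the two 500x500 tiles from above and a vertical overlap of 100 rows:

import numpy as np
from PIL import Image

overlap = 100  # rows shared by the two tiles

top = np.asarray(Image.open("top.png").convert("RGB"), dtype=float)
btm = np.asarray(Image.open("btm.png").convert("RGB"), dtype=float)

# Linear ramp: the top tile's weight goes 1 -> 0 across the overlap and the
# bottom tile gets the complement, so the two weights always sum to 1.
alpha = np.linspace(1.0, 0.0, overlap)[:, None, None]
blend = top[-overlap:] * alpha + btm[:overlap] * (1 - alpha)

result = np.vstack([top[:-overlap], blend, btm[overlap:]])
Image.fromarray(result.astype(np.uint8)).save("result.png")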
One other thing to consider is trying to match the colors of the images to some common color map. This can be done by a number of methods. For example, histogram matching or mean/std (brightness/contrast) matching. See for example, my scripts: histmatch, matchimage and redist at http://www.fmwconcepts.com/imagemagick/index.php and ImageMagick -remap at https://www.imagemagick.org/Usage/quantize/#remap
I could not find any good explanations of sigmoidal-contrast parameter. For example, if we have such a command:
$ convert -channel B -gamma 1.25 -channel G -gamma 1.25 -channel RGB -sigmoidal-contrast 25x25% 564.tif 564-adj.tif
What does this 25x25% mean? What is the right syntax of this parameter? Can we have values like LxMxN%? Are these values - integer numbers only? Thanks!
Looking at the following sigmoidal curve between input (horizontal axis) and output (vertical axis):
-sigmoidal-contrast c1,c2%
c1 is the contrast, or slope of the curve at the midpoint. A c1 of 0 gives a straight line from lower left to upper right of the diagram; a larger c1 makes the central, straightest part of the curve more vertical.
c2 is the horizontal centre point of the curve, in the range 0 to 100%: at 0% the curve is shifted left so that the straight part sits on the left of the figure, and at 100% it is shifted right so that the straight part sits on the right.
You can see this from the diagrams of my script, sigmoidal, in terms of using the sigmoidal curve as a means of adjusting brightness and contrast with little clipping. See http://www.fmwconcepts.com/imagemagick/sigmoidal/index.php
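If it helps to see the curve numerically, here is a small Python sketch of the sigmoidal function as I read it from the ImageMagick documentation: a logistic curve with slope c1 at the centre c2, rescaled so that 0 still maps to 0 and 1 to 1. Treat the exact form as my reading rather than gospel.

import numpy as np

def sigmoidal_contrast(u, c1=3.0, c2=50.0):
    # u is the normalised input in [0,1]; c1 is the contrast (slope at the
    # midpoint) and c2 the centre in percent, as described above.
    alpha, beta = c2 / 100.0, c1
    def g(x):
        return 1.0 / (1.0 + np.exp(beta * (alpha - x)))
    # Rescale so the curve still passes exactly through (0,0) and (1,1).
    return (g(u) - g(0.0)) / (g(1.0) - g(0.0))

print(sigmoidal_contrast(np.linspace(0, 1, 5), c1=25, c2=25))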
I think the answer here is going to be quite subjective. As I said in the comments, there is an already excellent explanation by Anthony Thyssen here.
As far as I understand, there are two parameters which are:
the amount of contrast increase, with 0 being least and 10 being most, and
the centre-point about which to increase the contrast, which is on a scale of 0-100%, where 50% would increase the contrast centred around mid-grey (i.e. 128 on a scale of 0-255).
convert input.png -sigmoidal-contrast <AMOUNT>,<CENTRE>% result.png
Let's look at the <CENTRE> value first. In general, you would want to draw a line of increasing contrast through the range of pixel brightnesses that interests you. The histogram is the easiest way I know of to determine where that is. So, if your histogram looks like this:
then I would suggest you use something like 25% for the <CENTRE> value. Whereas if your histogram looks more like this:
then you would probably want to set the <CENTRE> to 75%. So, in general, 50% is not an unreasonable default for the <CENTRE> parameter.
The <AMOUNT> parameter is going to be very subjective and vary from photo to photo. If, as I suspect, you are analysing satellite imagery, you can probably experiment to find a sensible value and then bulk-apply it to your images from the same series. I would start with 3-5 for normal photos maybe.
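As a rough sketch of that reasoning in Python: take the median brightness as a first guess for the <CENTRE> value, then tune <AMOUNT> by eye (the file name is a placeholder):

import numpy as np
from PIL import Image

# Median brightness of the image suggests where to centre the contrast boost.
grey = np.asarray(Image.open("input.png").convert("L"))
centre_pct = 100.0 * np.median(grey) / 255.0
print(f"starting point: -sigmoidal-contrast 3,{centre_pct:.0f}%")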
There is an original high-quality label. After it has been printed, we scan a sample and want to compare it with the original to find errors in the printed text, for example. The original and scanned images are almost the same size (but slightly different).
ImageMagick can compare them well, but not with the scanned image (I suppose it compares bitwise, but the scanned image contains too much "noise").
Is there a utility that can do such a comparison? Or maybe an algorithm (implemented or easy to implement) - like the one that uses the Cauchy–Schwarz inequality in signal processing?
Adding sample pics.
Original:-
Scanned:-
Further Thoughts
As I explained in the comments, I think the registration of the original and scanned images is going to be important as your scans are not exactly horizontal nor the same size. To do a crude registration, you could find some points of high-contrast that are hopefully unique in the original image. So, say I wanted one on the top-left (called tl.jpg), one in the top-right (tr.jpg), one in the bottom-left (bl.jpg) and one in the bottom-right (br.jpg). I might choose these:
I can now find these in the original image and in the scanned image using a sub-image search, for example:
compare -metric RMSE -subimage-search original.jpg tl.jpg a.png b.png
1148.27 (0.0175214) # 168,103
That shows me where the sub-image has been found, and the second (greyish) image shows me a white peak where the image is actually located. It also tells me that the sub image is at coordinates [168,103] in the original image.
compare -metric RMSE -subimage-search scanned.jpg tl.jpg a.png b.png
7343.29 (0.112051) # 173,102
And now I know that same point is at coordinates [173,102] in the scanned image. So I need to transform [173,102] to [168,103].
I then need to do that for the other sub images:
compare -metric RMSE -subimage-search scanned.jpg br.jpg result.png
8058.29 (0.122962) # 577,592
Ok, so we can get 4 points, one near each corner of the original image, and their corresponding locations in the scanned image. Then we need to do an affine transformation - which I may or may not do in the future. There are notes on how to do it here.
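For what it is worth, once all four correspondences are known, the affine step is short in Python with OpenCV. This is only a sketch: the first pair is the tl match found above, and the remaining pairs are made-up placeholders you would replace with the positions you actually find:

import cv2
import numpy as np

# Matching coordinates, scanned -> original. Only the first pair comes from
# the searches above; the other three are placeholders for illustration.
scanned_pts = np.float32([[173, 102], [577, 592], [175, 590], [575, 100]])
original_pts = np.float32([[168, 103], [572, 593], [170, 591], [570, 101]])

# Least-squares affine transform taking the scan into the original's frame.
M, _ = cv2.estimateAffine2D(scanned_pts, original_pts)

scan = cv2.imread("scanned.jpg")
orig = cv2.imread("original.jpg")
registered = cv2.warpAffine(scan, M, (orig.shape[1], orig.shape[0]))
cv2.imwrite("registered.jpg", registered)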
Original Answer
It would help if you were able to supply some sample images to show what sort of problems you are expecting with the labels. However, let's assume you have these:
label.png
unhappy.png
unhappy2.png
I have only put a red border around them so you can see the edges on this white background.
If you use Fred Weinhaus's script similar from his superb website, you can now compute a normalised cross correlation between the original image and the unhappy ones. So, taking the original label and the one with one track of white across it, they come out pretty similar (96%)
./similar label.png unhappy.png
Similarity Metric: 0.960718
If we now try the more unhappy one with two tracks across it, they are less similar (92%):
./similar label.png unhappy2.png
Similarity Metric: 0.921804
Ok, that seems to work. We now need to deal with the shifted and differently sized scan, so I will attempt to trim the images down to just the important stuff, blur them to lose any noise, and resize them to a standardised size for comparison, using a little script.
#!/bin/bash
image1=$1
image2=$2
fuzz="10%"
filtration="-median 5x5"
resize="-resize 500x300"
echo DEBUG: Preparing $image1 and $image2...
# Get cropbox from blurred image
cropbox=$(convert "$image1" -fuzz $fuzz $filtration -format %@ info:)
# Now crop original unblurred image and resize to standard size
convert "$image1" -crop "$cropbox" $resize +repage im1.png
# Get cropbox from blurred image
cropbox=$(convert "$image2" -fuzz $fuzz $filtration -format %@ info:)
# Now crop original unblurred image and resize to standard size
convert "$image2" -crop "$cropbox" $resize +repage im2.png
# Now compare using Fred's script
./similar im1.png im2.png
We can now compare the original label with a new image called unhappy-shifted.png
./prepare label.png unhappy-shifted.png
DEBUG: Preparing label.png and unhappy-shifted.png...
Similarity Metric: 1
And we can see they compare the same despite being shifted. Obviously I cannot see your images, how noisy they are, what sort of background you have, how big they are, what colour they are and so on - so you may need to adjust the preparation where I have just done a median filter. Maybe you need a blur and/or a threshold. Maybe you need to go to greyscale.
Given an image (like the one given below) I need to convert it into a binary image (black and white pixels only). This sounds easy enough, and I have tried with two thresholding functions. The problem is I can't get the perfect edges using either of these functions. Any help would be greatly appreciated.
The filters I have tried are the Euclidean distance in the RGB and HSV colour spaces.
Sample image:
Here it is after running an RGB threshold filter (at 40%; it shows more artefacts after this).
Here it is after running an HSV threshold filter. (at 30% the paths become barely visible but clearly unusable because of the noise)
The code I am using is pretty straightforward: change the input image to the appropriate color space and check the Euclidean distance with the black color.
sqrt(R*R + G*G + B*B)
since I am comparing with black (0, 0, 0)
Your problem appears to be the variation in lighting over the scanned image, which suggests that a locally adaptive thresholding method would give you better results.
The Sauvola method calculates the value of a binarized pixel based on the mean and standard deviation of pixels in a window of the original image. This means that if an area of the image is generally darker (or lighter) the threshold will be adjusted for that area and (likely) give you fewer dark splotches or washed-out lines in the binarized image.
http://www.mediateam.oulu.fi/publications/pdf/24.p
I also found a method by Shafait et al. that implements the Sauvola method with greater time efficiency. The drawback is that you have to compute two integral images of the original, one at 8 bits per pixel and the other potentially at 64 bits per pixel, which might present a problem with memory constraints.
http://www.dfki.uni-kl.de/~shafait/papers/Shafait-efficient-binarization-SPIE08.pdf
I haven't tried either of these methods, but they do look promising. I found Java implementations of both with a cursory Google search.
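If Python is an option, scikit-image ships a Sauvola implementation, so a quick experiment might look like the sketch below (the window size and k are just starting values to tune, and the file name is a placeholder):

from skimage import io
from skimage.filters import threshold_sauvola

# Sauvola computes a per-pixel threshold from the local mean and standard
# deviation in a sliding window, so uneven lighting is handled locally.
img = io.imread("input.png", as_gray=True)
thresh = threshold_sauvola(img, window_size=25, k=0.2)

binary = img > thresh   # True = light background, False = dark lines
io.imsave("binary.png", (binary * 255).astype("uint8"))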
Running an adaptive threshold over the V channel in the HSV color space should produce brilliant results. Best results would come with a window larger than 11x11; don't forget to choose a negative value for the threshold constant.
Adaptive thresholding basically is:
if (pixel value + constant > average pixel value in the window around the pixel)
    Pixel_Binary = 1;
else
    Pixel_Binary = 0;
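That is essentially what OpenCV's adaptiveThreshold does. A sketch of the suggestion above (window larger than 11x11, negative constant; the file names are placeholders):

import cv2

img = cv2.imread("input.png")
v = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)[:, :, 2]   # V channel

# Compare each pixel against the mean of a 25x25 neighbourhood; the negative
# constant means a pixel must be noticeably brighter than its surroundings
# to come out white.
binary = cv2.adaptiveThreshold(
    v, 255,
    cv2.ADAPTIVE_THRESH_MEAN_C,
    cv2.THRESH_BINARY,
    25,    # window size (odd, larger than 11)
    -10)   # the negative constant
cv2.imwrite("binary.png", binary)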
Due to the noise and the illumination variation you may need adaptive local thresholding; thanks to Beaker for his answer too.
Therefore, I tried the following steps:
Convert it to grayscale.
Do the mean or the median local thresholding, I used 10 for the window size and 10 for the intercept constant and got this image (smaller values might also work):
Please refer to http://homepages.inf.ed.ac.uk/rbf/HIPR2/adpthrsh.htm if you need more information on this technique.
To make sure the thresholding was working fine, I skeletonized it to see if there are any line breaks. This skeleton may be the one needed for further processing.
To get rid of the remaining noise you can just keep the longest connected component in the skeletonized image.
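A sketch of those steps with scikit-image, using parameter values in the same ballpark as above (they will likely need tuning for your image, and "longest" is approximated here by the component with the most pixels):

import numpy as np
from skimage import io, filters, morphology, measure

grey = io.imread("input.png", as_gray=True)

# Local mean threshold with a small window and offset, as described above.
local_mean = filters.threshold_local(grey, block_size=11, method="mean", offset=0.04)
binary = grey < local_mean              # keep the dark paths

# Skeletonize to check for line breaks.
skeleton = morphology.skeletonize(binary)

# Keep only the largest connected component to drop the remaining noise.
labels = measure.label(skeleton)
largest = labels == np.argmax(np.bincount(labels.ravel())[1:]) + 1

io.imsave("paths.png", (largest * 255).astype("uint8"))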
Thank you.
You probably want to do this as a three-step operation.
Use leveling, not just thresholding: take the input and scale the intensities (gamma correct) with parameters that simply dull the mid tones, without removing the darks or the lights (your RGB threshold is too strong, for instance; you lost some of your lines).
Edge-detect the resulting image using a small kernel convolution (5x5 for binary images should be more than enough). Use a simple [1 2 3 2 1 ; 2 3 4 3 2 ; 3 4 5 4 3 ; 2 3 4 3 2 ; 1 2 3 2 1] kernel (normalised).
Threshold the resulting image. You should now have a much better binary image.
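A literal sketch of those three steps in Python (the gamma value and the final threshold are placeholders to experiment with, and the file name is assumed):

import numpy as np
from scipy import ndimage
from skimage import exposure, io

grey = io.imread("input.png", as_gray=True)

# 1) Level: a gamma adjustment to dull the mid tones (placeholder value).
levelled = exposure.adjust_gamma(grey, gamma=1.5)

# 2) Convolve with the (normalised) 5x5 kernel given above.
k = np.array([[1, 2, 3, 2, 1],
              [2, 3, 4, 3, 2],
              [3, 4, 5, 4, 3],
              [2, 3, 4, 3, 2],
              [1, 2, 3, 2, 1]], dtype=float)
filtered = ndimage.convolve(levelled, k / k.sum())

# 3) Threshold the result (again, a placeholder value).
binary = filtered < 0.5
io.imsave("binary.png", (binary * 255).astype("uint8"))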
You could try a black top-hat transform. This involves subtracting the image from the closing of the image. I used a structuring element window size of 11 and a constant threshold of 0.1 (25.5 on a 0-255 scale).
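If it helps, here is a sketch of that in Python with OpenCV (the file name is a placeholder):

import cv2

grey = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

# Black top-hat: closing(image) - image, which pulls out dark features
# smaller than the structuring element (11x11, as described above).
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (11, 11))
tophat = cv2.morphologyEx(grey, cv2.MORPH_BLACKHAT, kernel)

# Then the constant threshold of 0.1, i.e. about 25 on a 0-255 scale.
_, binary = cv2.threshold(tophat, 25, 255, cv2.THRESH_BINARY)
cv2.imwrite("binary.png", binary)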
You should get something like:
Which you can then easily threshold:
Best of luck.