I'm trying to implement image downsampling with Lanczos2.
However, the kernel seems to have zeros everywhere except at the center pixel, since sin(pi*x) = 0 whenever x is an integer.
Thus, if the downsampling factor is an integer (e.g. the output size is 1/2 of the original size in each dimension), Lanczos downsampling yields exactly the same result as nearest-neighbor interpolation (just taking every other pixel for 2x downsampling).
I believe that this is not intended to be the case, so my question is:
What am I missing?
How do I use a Lanczos2 filter for 2x downsampling, and is the result expected to differ from simply taking every other pixel?
The kernel for 2x downsampling is given in section "Decimation by factor of 2 with the Lanczos2 sinc function" on page 10 of the reference you linked, with the coefficients:
0, -0.032, 0, 0.284, 0.496, 0.284, 0, -0.032, 0
This kernel is obtained by evaluating the given lanczos2(x) function at x = 0.5n, where n is the sample number (an integer). This reflects the fact that the output rate is half the original sampling rate (which requires a half-band filter before decimation to avoid aliasing).
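To see where those coefficients come from, here is a minimal sketch assuming the usual definition lanczos2(x) = sinc(x) * sinc(x/2) for |x| < 2 (and 0 elsewhere), sampled at x = 0.5n and normalized to unit sum:

#include <cmath>
#include <cstdio>
#include <vector>

const double PI = 3.14159265358979323846;

// Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1.
double sinc(double x) {
    return x == 0.0 ? 1.0 : std::sin(PI * x) / (PI * x);
}

// Lanczos2 window: sinc(x) * sinc(x/2) for |x| < 2, else 0.
double lanczos2(double x) {
    return std::abs(x) < 2.0 ? sinc(x) * sinc(x / 2.0) : 0.0;
}

int main() {
    // Evaluate at x = 0.5*n (the output rate is half the input rate),
    // then normalize so the taps sum to 1.
    std::vector<double> taps;
    double sum = 0.0;
    for (int n = -4; n <= 4; ++n) {
        taps.push_back(lanczos2(0.5 * n));
        sum += taps.back();
    }
    for (double t : taps)
        std::printf("% .3f ", t / sum);  // ~ 0.000 -0.032 0.000 0.284 0.496 0.284 0.000 -0.032 0.000
    std::printf("\n");
}

Applying these nine taps centered on every other input pixel performs the filtering and the decimation in one step.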
P.S.: the kernel with zeros everywhere except at the center pixel that you obtained would typically be used in conjunction with a phase-1/2 kernel for interpolation by a factor of 2 (although implementations would usually optimize this kernel out as a simple pixel copy).
Related
I can clearly understand how bilinear interpolation works when upscaling an image, like filling in values from the 4 nearest neighbours, but I can't understand how it works when downscaling the image. It would mean a lot to me if someone could clarify this for me.
Scaling an image requires mapping pixels from the input to pixels on the output. If those pixel coordinates don't map to an integer, interpolation is required to estimate what the pixel value would have been. The "Bi" part of bilinear means it's linear interpolation applied in two dimensions independently. If for example output pixel 2,3 needs to come from input coordinates 1.5,7.2 you would interpolate in the X direction by taking 0.5 of each of the pixels at 1.0 and 2.0, then interpolate in the Y direction by taking 0.8 of the pixel at 7.0 and 0.2 of the pixel at 8.0. Usually these operations are combined into a single set of equations, but they can be applied separately if needed.
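As a rough sketch of those two 1D steps (assuming a grayscale image stored row-major; the function is illustrative, not from any particular library):

#include <algorithm>
#include <cmath>
#include <vector>

// Sample a grayscale image (row-major, 0 <= x < width, 0 <= y < height)
// at fractional coordinates by interpolating in X on two rows, then in Y.
double bilinearSample(const std::vector<double>& img, int width, int height,
                      double x, double y) {
    int x0 = static_cast<int>(std::floor(x)), y0 = static_cast<int>(std::floor(y));
    int x1 = std::min(x0 + 1, width - 1),    y1 = std::min(y0 + 1, height - 1);
    double fx = x - x0;  // weight toward x1 (0.5 in the example above)
    double fy = y - y0;  // weight toward y1 (0.2 for y = 7.2)
    double top    = (1 - fx) * img[y0 * width + x0] + fx * img[y0 * width + x1];
    double bottom = (1 - fx) * img[y1 * width + x0] + fx * img[y1 * width + x1];
    return (1 - fy) * top + fy * bottom;
}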
Bilinear is a poor choice for downscaling because it leads to aliasing artifacts. This is when you attempt to create spatial frequencies that are beyond the Nyquist sampling limit, and high frequency detail turns into low frequency artifacts. You can minimize this by blurring the image before you downscale it. Or you can choose an interpolation algorithm that incorporates some low pass filtering.
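A sketch of the blur-then-downscale approach (using OpenCV for brevity; the sigma heuristic below is an assumption, not a fixed rule):

#include <opencv2/opencv.hpp>

// Low-pass filter before resampling: the Gaussian removes spatial
// frequencies beyond the output's Nyquist limit, suppressing aliasing.
cv::Mat downscaleAntialiased(const cv::Mat& src, double scale /* e.g. 0.5 */) {
    cv::Mat blurred, dst;
    double sigma = 0.5 / scale;  // rough heuristic: wider blur for stronger shrink
    cv::GaussianBlur(src, blurred, cv::Size(0, 0), sigma);
    cv::resize(blurred, dst, cv::Size(), scale, scale, cv::INTER_LINEAR);
    return dst;
}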
I'm trying to translate a Photoshop setting for sharpening images to GraphicsMagick. I found this helpful article:
https://redskiesatnight.com/2005/04/06/sharpening-using-image-magick/
The problem is that if I use the Photoshop-equivalent values explained in the article in GraphicsMagick, the images are not as sharp and clear as in Photoshop.
For example, I use these settings in Photoshop:
Strength: 500%
Radius: 2.0 Pixel
Threshold: 8
In the article the parameters are explained like this:
The radius parameter
The radius parameter specifies (official documentation)
“the radius of the Gaussian, in pixels, not counting the center pixel”
Unsharp masking, like many other image-processing filters, is a
convolution kernel operation. The filter processes the image pixel by
pixel. For each pixel it examines a block of pixels surrounding it
(the kernel) and does some calculations on them to render the output
pixel value. The radius parameter determines which pixels surrounding
the center pixel will be considered in the convolution kernel: (think
of a circle) the larger the radius, the more pixels that will need to
be processed for each individual pixel.
Image Magick’s radius is similar to the same parameter in Photoshop,
GIMP and other image editors. In practical terms, it affects the size
of the “halos” created to increase contrast along the edges in the
image, increasing acutance and thus apparent sharpness.
How do you know how big of a radius to use? It depends on your output
target resolution, for one thing. It also depends on your personal
preferences, as well as the specific needs of the image at hand. As
far as the resolution issue goes, the GIMP User Manual recommends that
unsharp mask radius be set as follows:
radius = (output ppi / 30) * 0.2

Which is very similar to another commonly found rule of thumb:

radius = output ppi / 150

So for a monitor with 72 PPI resolution, you'd use a radius of approximately 0.5; if you're targeting a printer at 300 PPI you'd use a value of 2.0. Use these as a starting point;
different images have different sharpening requirements, and
individual preference is also a consideration. [Aside: there are a few
postings around the net (including some referenced in this article)
that suggest that Image Magick accepts, but does not honor, fractional
radii; that is, if you specify a radius of 0.5 or 1.2 it is rounded,
or defaults to an integer, or is silently ignored, etc. This is not
true, at least as of version 5.4.7, which is the one that I am using
as I write this article. You can easily see for yourself by doing
something like the following:
$ convert -unsharp 1.2x1.2+5+0 test.tif testo1.tif
$ convert -unsharp 1.4x1.4+5+0 test.tif testo2.tif
$ composite -compose difference testo1.tif testo2.tif diff.tif
$ display diff.tif

You can also load
them into the GIMP or Photoshop into different layers and change the
blend mode to “Difference”; the resulting image is not black (you may
need to look closely for a 0.2 difference in radius). No, this
mistaken impression likely comes from the fact that there is a
relationship between the radius and sigma parameters, and if you do
not specify sigma properly in relation to the radius, the radius may
indeed be changed, or at least not work as expected. Read on for more
on this.]
Please note that the default radius (if you do not specify anything)
is 0, a special value which tells the unsharp mask algorithm to
“select an appropriate value for the radius”!
The sigma parameter
The sigma parameter specifies (official documentation)
“the standard deviation of the Gaussian, in pixels”
This is the most confusing parameter of the four, probably because it
is “invisible” in other implementations of unsharp masking, and it is
most sparsely documented. The best explanation I have found for it
came from a google search that unearthed an archived mailing list
thread which had the following snippet:
Comparing the results of
convert -unsharp 1.2x1+4+0 test test1.2x1+4+0
and
convert -unsharp 30x1+4+0 test test30x1+4+0
results in no significant differences but the latter takes approx. 50 times
longer to complete.
That is not surprising. A radius of 30 involves on the order of 61x61
input pixels in the convolution of each output pixel. A radius of 1.2
involves 3x3 or 5x5 pixels.
Please, can anybody give me any hints on what 'sigma' means?
It describes the relative weight of pixels as a function of their
distance from the center of the convolution kernel. For small sigma,
the outer pixels have little weight. Another important clue comes from
the documentation for the -unsharp option to convert (emphasis mine):
The -unsharp option sharpens an image. We convolve the image with a
Gaussian operator of the given radius and standard deviation (sigma).
For reasonable results, radius should be larger than sigma. Use a
radius of 0 to have the method select a suitable radius.
Combining the two clues provides some good insight: sigma is a
parameter that gives you some control over how the strength (given by
the amount parameter) of the sharpening is “graduated” or lessened as
you radiate away from a given pixel at the center of the convolution
matrix to the limit defined by the radius. My testing confirms this
inferred conclusion, namely that a bigger sigma causes more pronounced
sharpening for a given radius. That is why the poster in the mailing
list question (above) did not see any significant difference in the
sharpening even though he was using an amount of 400% (!!) and a
threshold of 0%; with a sigma of only 1.0, the strength of the filter
falls off too rapidly to be noticed despite the large difference in
radius between the two invocations. This is also why the man page says
“for reasonable results, radius should be larger than sigma”; if it is
not, then the sigma parameter does not have a graduated effect, as
designed, to “soften” the halos toward their edges; instead it simply
applies the amount evenly to the edge of the radius (which may be what
you want in some circumstances). A general rule of thumb for choosing
sigma might be:
if radius < 1, then sigma = radius
else sigma = sqrt(radius)

Summary: choose your radius first, then choose a sigma smaller than or equal to that. Experimentation will yield the best results. Please note that
the default sigma (if you do not specify anything) is 1.0. This is the
main culprit for why most people don’t see as much effect with Image
Magick’s unsharp mask operator as they do with other implementations
of unsharp mask if they are using a larger radius: unless you bump up
this parameter you are not getting the full benefit of the larger
radius!
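To make this concrete, here is a small sketch (my own illustration, not from the article) that prints normalized 1D Gaussian weights for a radius-4 kernel at two sigmas; with sigma = 1.0 the outermost taps are around 0.0001, which is why a huge radius adds cost but no visible effect:

#include <cmath>
#include <cstdio>
#include <vector>

// Print normalized 1D Gaussian weights for the given radius and sigma.
void printGaussianWeights(int radius, double sigma) {
    std::vector<double> w(2 * radius + 1);
    double sum = 0.0;
    for (int i = -radius; i <= radius; ++i) {
        w[i + radius] = std::exp(-(i * i) / (2.0 * sigma * sigma));
        sum += w[i + radius];
    }
    for (double v : w) std::printf("%.4f ", v / sum);
    std::printf("\n");
}

int main() {
    printGaussianWeights(4, 1.0);  // outer weights ~0.0001: the radius is wasted
    printGaussianWeights(4, 2.0);  // outer weights ~0.03: the radius now matters
}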
[Aside: you might be wondering what happens if sigma is specified
larger than the radius. The answer, as the documentation states, is
that the result may not be “reasonable”. In my testing, the usual
result is that the sharpening is extended at the specified amount to
the edge of the specified radius, and larger values of sigma have
little if any effect. In some cases (e.g. for radius < 1) specifying a
larger sigma increased the effective radius (e.g. to 1); this may be
the result of a “sanity check” on the parameters in the code. In any
case, keep in mind that the algorithm is designed for sigma to be less
than or equal to the radius, and results may be unexpected if used
otherwise.]
The amount parameter
The amount parameter specifies (official documentation)
“the percentage of the difference between the original and the blur
image that is added back into the original”
The amount parameter can be thought of as the “strength” of the
sharpening: higher values increase contrast in the halos more than
lower ones. Very large values may cause highlights on halos to blow
out or shadows on them to block up. If this happens, consider using a
lower amount with a higher radius instead.
amount should be specified as a decimal value indicating the
percentage. So, for example, if in Photoshop you would use an amount
of 170 (170%), in Image Magick you would use 1.7.
Please note that the default amount (if you do not specify anything)
is 1.0 (i.e. 100%).
The threshold parameter
The threshold parameter specifies (official documentation)
“as a fraction of MaxRGB, needed to apply the difference amount”
The threshold specifies a minimum amount of difference between the
center pixel and the surrounding pixels in the convolution kernel
necessary to apply the local contrast enhancement. Increasing this
value causes the algorithm to become less sensitive to differences
that may define edges. Specifying a positive threshold is often used
to avoid sharpening smooth areas that may contain noise (e.g. an area
of blue sky). If you have a noisy image, strongly consider raising the
threshold, or using some kind of smart sharpening technique instead.
The threshold parameter should be specified as a decimal value
indicating this percentage. This is different than GIMP or Photoshop,
which both specify the threshold in actual pixel levels between 0 and
the maximum (for 8-bit images, 255).
Please note that the default threshold (if you do not specify
anything) is 0.05 (i.e. 5%; this corresponds to a threshold of .05 *
255 = 12-13 in Photoshop). Photoshop uses a default threshold of 0
(i.e. no threshold) and the unsharp masking is applied evenly
throughout the image. If that is what you want you will need to
specify a 0.0 value for Image Magick’s threshold. This is undoubtedly
another source of confusion regarding Image Magick’s sharpening
algorithm.
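To tie the four parameters together: the per-pixel operation the article describes can be sketched as follows (a simplified single-channel illustration with values normalized to [0, 1]; my own sketch, not ImageMagick's actual code):

#include <cmath>

// Unsharp mask for one channel: add back 'amount' times the difference
// between the original and a Gaussian-blurred copy (whose kernel is shaped
// by radius and sigma), but only where the difference exceeds 'threshold'.
double unsharpPixel(double original, double blurred,
                    double amount, double threshold) {
    double diff = original - blurred;
    if (std::fabs(diff) < threshold)
        return original;              // smooth area (or noise): leave untouched
    return original + amount * diff;  // sharpen: exaggerate the local contrast
}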
So I did it like that and came up with this command:
gm convert file1.jpg -unsharp 2x1.41+5+0.03 file1_2x1.41+5+0.03.jpg
But as I said, the images do not get as sharp as in Photoshop. We also experimented with a lot of other values, but without good results. So is it possible to do Photoshop-style sharpening with GraphicsMagick, or is it just not a good library for this? The main problem with just using Photoshop for sharpening is that we want to process the images on our Linux server, and Photoshop does not run well on Linux.
Speaking of this 1D discrete denoising via variational calculus, I would like to know how to handle the smoothness term, whose length should be N-1 while the length of the data term is N. Here is the equation:
E = 0;
for i = 1:n
    E += (u(i) - f(i))^2 + lambda*(u(i+1) - u(i))^2;
E is the cost of the current u in the optimization process
f is the given (noisy) image
u is the output (denoised) image
n is the length of the 1D vectors
lambda >= 0 is the weight of smoothness in the optimization process (described around minute 13 in the video)
Here the lengths of the first and second terms mismatch. How do I resolve this?
More importantly, I would like to use a linear equation system to solve this problem.
This is nowhere near my cup of tea, but I think you are referring to the fact that u(i+1) - u(i) accesses the next pixel, making the term defined only on a grid 1 pixel smaller than the original image f.
In graphics and filtering this is usually resolved in two ways:

1. Use a default value for pixels outside the image resolution. You can set a default or neutral (for the process) color for those pixels (like black), use the color of the closest neighbor inside the image resolution, or interpolate the missing pixels (bilinear, bicubic, ...). I think the plain-default-color choice is not suitable for your denoising technique.

2. Change the resolution of the output image. Usually after some filtering techniques (FIR, etc.) the result is 1 pixel smaller than the input, which resolves the missing-data problem. In your case it looks like your resulting u image should be 1 pixel bigger than the input image f while computing the cost function.

So either enlarge it via bullet #1 and, when the optimization is done, crop back to the original size. Or virtually crop f one pixel down (just say n' = n - 1) before computing the cost function so you avoid access violations (and you can also restore it back after the optimization). A linear-system formulation that handles the boundary directly is sketched below.
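Regarding the linear-system approach: setting the derivative of E with respect to each u(i) to zero yields a tridiagonal system (I + lambda * D^T * D) u = f, which can be solved directly in O(n). A minimal sketch (my own illustration, assuming the squared smoothness term and free boundary conditions):

#include <vector>

// Minimize  E(u) = sum_i (u_i - f_i)^2 + lambda * sum_i (u_{i+1} - u_i)^2
// by solving the tridiagonal normal equations with the Thomas algorithm.
std::vector<double> denoise1D(const std::vector<double>& f, double lambda) {
    const int n = static_cast<int>(f.size());
    std::vector<double> a(n), b(n), c(n), d(f), u(n);
    for (int i = 0; i < n; ++i) {
        a[i] = (i > 0) ? -lambda : 0.0;       // sub-diagonal
        c[i] = (i < n - 1) ? -lambda : 0.0;   // super-diagonal
        b[i] = 1.0 - a[i] - c[i];             // diagonal: 1 + lambda or 1 + 2*lambda
    }
    for (int i = 1; i < n; ++i) {             // forward elimination
        double m = a[i] / b[i - 1];
        b[i] -= m * c[i - 1];
        d[i] -= m * d[i - 1];
    }
    u[n - 1] = d[n - 1] / b[n - 1];           // back substitution
    for (int i = n - 2; i >= 0; --i)
        u[i] = (d[i] - c[i] * u[i + 1]) / b[i];
    return u;
}

Note how the boundary rows naturally get a (1 + lambda) diagonal instead of (1 + 2*lambda), so no out-of-range access ever occurs.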
I have a question about the Convolve function in OpenCV using GPU acceleration.
The convolution is roughly 3.5x faster using the GPU
when running:
convolve(src_32F, kernel, cresult, false, cbuffer);
However, the image borders are missing in cresult.
The result is otherwise excellent (the kernel size is 60x60).
Thanks
This is the way convolution works.
It calculates the value of every pixel as a weighted average of the surrounding ones. So if the kernel extends 30 pixels to each side, the convolution is not defined for any pixel closer than 30 pixels to the image border.
In the CPU implementation of filtering functions, those missing pixels are supplemented with bogus values based on a given strategy (copy, mirror, blank, etc).
What you can do is manually pad your matrix with the desired values into a bigger matrix, filter the big one, and then crop it back. For that you can use the gpu::copyMakeBorder() function.
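A sketch of that pad/filter approach, assuming the OpenCV 2.4-era gpu module used in the question (supported border types may vary by version):

#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>

// gpu::convolve() only returns the "valid" region (input size minus kernel
// size plus 1), so padding the input by kernel-1 pixels in total makes the
// output match the original size, with no separate crop needed.
cv::gpu::GpuMat convolveFullSize(const cv::gpu::GpuMat& src_32F,
                                 const cv::gpu::GpuMat& kernel) {
    int top  = kernel.rows / 2, bottom = kernel.rows - 1 - top;
    int left = kernel.cols / 2, right  = kernel.cols - 1 - left;

    cv::gpu::GpuMat padded, result;
    cv::gpu::copyMakeBorder(src_32F, padded, top, bottom, left, right,
                            cv::BORDER_REPLICATE);  // or mirror/constant
    cv::gpu::convolve(padded, kernel, result);
    return result;  // same size as src_32F
}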
When applying a Gaussian blur to an image, typically the sigma is a parameter (examples include Matlab and ImageJ).
How does one know what sigma should be? Is there a mathematical way to figure out an optimal sigma? In my case, I have some objects in images that are bright compared to the background, and I need to find them computationally. I am going to apply a Gaussian filter to make the centers of these objects even brighter, which hopefully facilitates finding them. How can I determine the optimal sigma for this?
There's no formula to determine it for you; the optimal sigma will depend on image factors - primarily the resolution of the image and the size of your objects in it (in pixels).
Also, note that Gaussian filters aren't actually meant to brighten anything; you might want to look into contrast maximization techniques - sounds like something as simple as histogram stretching could work well for you.
edit: More explanation - sigma basically controls how "fat" your kernel function is going to be; higher sigma values blur over a wider radius. Since you're working with images, bigger sigma also forces you to use a larger kernel matrix to capture enough of the function's energy. For your specific case, you want your kernel to be big enough to cover most of the object (so that it's blurred enough), but not so large that it starts overlapping multiple neighboring objects at a time - so actually, object separation is also a factor along with size.
Since you mentioned MATLAB: you can take a look at various Gaussian kernels with different parameters using the fspecial('gaussian', hsize, sigma) function, where hsize is the size of the kernel and sigma is, well, sigma. Try varying the parameters to see how the kernel changes.
I use this convention as a rule of thumb: if k is the size of the kernel, then sigma = (k - 1)/6. This is because virtually all (about 99.7%) of a Gaussian PDF's mass lies within ±3 sigma of the center, i.e. within a width of 6 sigma.
You have to find a min/max of a function G(X, sigma), where X is the set of your observations (in your case, your image's grayscale values). This function can be anything that maintains the "order" of the intensities of the image; for example, this can be done with the first derivative of the image (as G):
fil = fspecial('sobel');
im = imfilter(I,fil);
imagesc(im);
colormap(gray);  % note: colormap must be called, not assigned to
This gives you the result of the first derivative of the image. Now you want to find the best sigma by maximizing G(X, sigma): try a few sigmas (say, in increasing order) until you reach the sigma that makes G maximal. This can also be done with the second derivative.
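If I understand that procedure correctly, a sketch of the search might look like this (using OpenCV rather than MATLAB; the Sobel-energy score standing in for G is my assumption):

#include <opencv2/opencv.hpp>

// Try a few sigmas in increasing order; score each blurred image with a
// first-derivative (Sobel) energy G and keep the sigma where G peaks.
double findSigma(const cv::Mat& gray32F) {
    double bestSigma = 0.5, bestScore = -1.0;
    for (double sigma = 0.5; sigma <= 5.0; sigma += 0.5) {
        cv::Mat blurred, gx, gy, mag;
        cv::GaussianBlur(gray32F, blurred, cv::Size(0, 0), sigma);
        cv::Sobel(blurred, gx, CV_32F, 1, 0);
        cv::Sobel(blurred, gy, CV_32F, 0, 1);
        cv::magnitude(gx, gy, mag);
        double score = cv::mean(mag)[0];  // G(X, sigma)
        if (score > bestScore) { bestScore = score; bestSigma = sigma; }
    }
    return bestSigma;
}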
Given that the central value of the kernel equals 1, the kernel size that guarantees the outermost value is less than a given limit (e.g. 1/100) is as follows:
// Requires <cmath>; sigma is the Gaussian standard deviation chosen earlier.
double limit = 1.0 / 100.0;
// Solve exp(-x^2 / (2*sigma^2)) < limit for x to get the required radius,
// then double it; the final increment keeps the size odd (a center pixel).
int size = static_cast<int>(2 * std::ceil(std::sqrt(-2.0 * sigma * sigma * std::log(limit))));
if (size % 2 == 0)
{
    size++;
}