I'm using libjpeg to produce JPEGs from raw RGB data at work. It works correctly, but the quality of the JPEG output is not satisfactory, and the differences from a similarly saved PNG (libpng) are significant. I thought maybe the default quality setting was too low, but I checked that the default is set to 100, which is the maximum you can get for that library. Even setting it explicitly to 100 using jpeg_set_quality() didn't help. I then looked in the library description and changed the default J_DCT_METHOD from JDCT_ISLOW to JDCT_FLOAT, just because it says the latter is a little more accurate than the former. The final output, however, is no different, and the image is still 'blurry' in places. I also checked that 'smoothing' is set to zero, which could have made a difference if non-zero. If I didn't care about speed/memory, are there any other settings I can change to increase the fidelity of the image produced? I'm referencing this page for library methods: https://www4.cs.fau.de/Services/Doc/graphics/doc/jpeg/libjpeg.html
Thanks!
Poor JPEG quality comes from a combination of subsampling and quantization tables. In libjpeg, the "quality" setting affects only the latter. If you have set that to 100 and are still getting bad output, check the subsampling: by default libjpeg stores the chroma channels of a YCbCr image at half resolution in each direction (2x2 subsampling), which blurs sharp color edges regardless of the quality setting.
Also, JPEG is sensitive to the type of image. It does not do a good job with images that consist of areas of identical color (e.g., drawings and cartoons, as opposed to photographs).
Changing the DCT method, as you have suggested, will do little for the quality.
I have some PNG files of approximately 1 MB each. I tried several commands to reduce their size, but none of them worked for me. Any suggestions? One of the commands is below:
"C:\\Program Files\\ImageMagick-6.9.9-Q16\\mogrify.exe" -depth 8 -format png -define PNG:compression-strategy=2 -define PNG:compression-filter=0 test.png
Thanks,
As pointed out by @fmw42 in the comments, your image may already be optimized. Also, @Mark's comment regarding reducing colors is true.
But apart from this, the important thing to know is that there is no ideal command. You will have to work out the bit depth of your color channels and reduce it. There will always be a trade-off between the number of colors you keep and the quality you want.
Apart from that, there can also be other methods that you can use:
If the PNG is fully opaque, you can strip the alpha channel, as it makes no sense in that case. This can give you some file size savings.
If the image is visibly grayscale but its color type is still true-color, true-color-alpha or indexed-color, you can make significant savings by saving the image with a grayscale color space.
Retry optimizing the PNG files using adaptive delta filtering and LZ77 optimizations. This can be done easily using "optipng". But if the image is already well optimized, this won't reduce the file size significantly. Moreover, the choice of filter depends on the PNG bit depth, so you would have to read up on PNG compression in the documentation available online.
I am studying jpeg compression and it seems to work by reducing high frequency components in images. Since noise is usually high frequency, does this imply that jpeg compression somewhat works on reducing noise in images?
JPEG compression can reduce noise by smoothing out the high-frequency components of the image, but it also introduces visual noise in the form of compression artifacts. Here is a zoomed-in (3x) view of part of my avatar (a high-quality JPEG) and part of your avatar (a PNG drawing), on the left as downloaded and on the right as compressed with ImageMagick using -quality 60. To my eye they both look "noisier" when JPEG-compressed.
Strictly speaking, no.
JPEG does remove high frequencies (see below), but not selectively enough to be a denoising algorithm. In other words, it will remove high frequencies if they are noise, but also if they are useful detail information.
To understand this, it helps to know the basics of how JPEG works. First, the image is divided into 8x8 blocks. Then the discrete cosine transform (DCT) is applied to each block. As a result, each element of the 8x8 block contains the "weight" of a different frequency. Then the elements are quantized in a fixed way depending on the quality level selected a priori. This quantization means gaining coding performance at the cost of losing precision. The amount of precision lost is fixed a priori, and (as I said above) it does not differentiate between noise and useful detail.
You can test this yourself by saving the same image with different qualities (which technically control the amount of quantization applied to each block) and see that not only noise is removed. There is a nice video showing this effect for different quality levels here: https://upload.wikimedia.org/wikipedia/commons/f/f3/Continuously_varied_JPEG_compression_for_an_abdominal_CT_scan_-_1471-2342-12-24-S1.ogv.
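The DCT-then-quantize step described above can be sketched in plain Python. This is only an illustration under stated assumptions: the step table `base * (1 + u + v)` is made up for the example (real encoders use tuned tables), and the function names are hypothetical. It compares a flat block with a block of genuine fine texture (a checkerboard, which is detail rather than noise).

```python
import math

N = 8  # JPEG works on 8x8 blocks

def dct2(block):
    """2-D DCT-II of an 8x8 block, with the scaling JPEG uses."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def quantize(coeffs, base):
    """Divide each coefficient by its step and round.

    Assumed step table: base * (1 + u + v), i.e. coarser for higher frequencies.
    """
    return [[round(coeffs[u][v] / (base * (1 + u + v))) for v in range(N)]
            for u in range(N)]

def count_nonzero(q):
    return sum(1 for row in q for value in row if value != 0)

# A flat block and a block with real high-frequency detail (values +/-20).
flat = [[50] * N for _ in range(N)]
texture = [[20 if (x + y) % 2 == 0 else -20 for y in range(N)] for x in range(N)]

for base in (2, 20):  # small base ~ high quality, large base ~ low quality
    print(base,
          count_nonzero(quantize(dct2(flat), base)),
          count_nonzero(quantize(dct2(texture), base)))
```

With the fine step table the texture keeps some non-zero coefficients; with the coarse one every texture coefficient rounds to zero while the flat block's DC term survives. The quantizer has no way of knowing that the checkerboard was detail and not noise.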
I'm wondering if there is a way to automatically choose a reasonable JPEG compression level in OpenCV.
The JPEG sizes I'm currently getting are too large, and nailing the quality to a fixed value feels dirty. If I recall correctly, such features existed in image editors such as Dreamweaver. If there is no such feature, I'm also wondering if somebody knows of an algorithm that can estimate this parameter without performing hard disk I/O.
std::vector<int> params;
params.push_back(CV_IMWRITE_JPEG_QUALITY);
params.push_back(magic); //Want a way to estimate magic
cv::imwrite("my.jpg",image,params);
Unfortunately, to "optimize" JPEG compression, one would have to learn and apply many technical details about the JPEG compression. Because of this, many libraries do not offer the full suite of adjustment parameters. The 0-100 JPEG quality parameter is already a good compromise.
ImageMagick may have such functionality.
You are looking for a way to "automatically choose a reasonable JPEG compression level in OpenCV".
However, "reasonable" is subjective, and depends on the image owner's perception of which features are important in the given image. This means the perception can be different for every combination of (different owners) x (different images).
The short answer
No, OpenCV does not currently offer this functionality.
The "sysadmin" answer
Look at OpenCV ImageMagick integration.
http://www.imagemagick.org/discourse-server/viewtopic.php?f=22&t=20333&start=45
The quick and dirty answer
Use the method of bisection (0, 100, 50, 75, 87, ...) to search for a JPEG quality level that approaches a specified output file size.
The secant method may also be applicable.
Edited: Newton's method is probably not useful, because one cannot obtain the first derivative of the quality-file size curve without an analytical model.
Obviously this is too inefficient for practical every-day use, so it is not provided by the library.
If you want to use it, you have to implement it yourself with your own choice of techniques.
To avoid disk I/O, use cv::imencode which writes to memory instead of to disk.
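A minimal sketch of the bisection search described above, in Python. It assumes the encoded size grows (at least weakly) with the quality setting. The `encode` callback and `fake_encode` stand-in are hypothetical names introduced here so the example runs without any imaging library.

```python
from typing import Callable

def find_quality(encode: Callable[[int], bytes], max_bytes: int,
                 lo: int = 1, hi: int = 100) -> int:
    """Return the highest quality in [lo, hi] whose encoded size fits max_bytes.

    Assumes size does not shrink as quality rises; returns lo if nothing fits.
    """
    best = lo
    while lo <= hi:
        mid = (lo + hi) // 2
        if len(encode(mid)) <= max_bytes:
            best = mid          # fits: try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1        # too large: try a lower quality
    return best

# Stand-in encoder (hypothetical): pretends each quality step costs 1000 bytes.
def fake_encode(quality: int) -> bytes:
    return bytes(1000 * quality)

print(find_quality(fake_encode, 60_500))
```

This needs at most seven encodes for the range 1 to 100. In a real program, `encode` would wrap an in-memory encoder such as OpenCV's imencode with the JPEG quality parameter, so nothing touches the disk.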
The slightly longer answer
Although OpenCV doesn't implement this functionality, it would clearly be a nice feature to have.
If someone is willing to implement it with code quality good enough for OpenCV, the project may consider accepting it.
The yet longer answer
OpenCV uses libjpeg, or optionally libjpeg-turbo, and both libraries allow one to configure the technical details of JPEG compression.
Below I will focus on these technical details.
Read first: JPEG compression on Wikipedia
Of the JPEG compression pipeline, three of the compression steps can be configured by users of libjpeg or libjpeg-turbo:
Chroma subsampling
After the conversion from RGB to YCbCr, the chroma (color-carrying) channels, Chroma-blue (Cb) and Chroma-red (Cr), are optionally stored at a lower resolution than the Luminance (Y) channel, also known as the Intensity or Grayscale channel. The Luminance channel is always stored at full resolution.
Most JPEG decoders can support these downsampling factors:
(1, 1) - no subsampling
(1, 2), (2, 1), (2, 2) - moderate subsampling, where one or both dimensions may be subsampled by 2.
(1, 4), (2, 4), (4, 2), (4, 1) - heavy subsampling. Note that the original JPEG specification forbids some of these combinations, but most JPEG decoders are able to decode them nevertheless.
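To get a feel for what the downsampling factors listed above mean for the amount of data, here is a small sketch that counts samples before any DCT or entropy coding takes place. The function name is hypothetical, and the result describes the raw sample count, not the final file size.

```python
def relative_sample_count(h: int, v: int) -> float:
    """Fraction of raw samples kept when both chroma channels are
    subsampled by h horizontally and v vertically (luminance kept in full)."""
    luma = 1.0                 # Y: one sample per pixel
    chroma = 2.0 / (h * v)     # Cb and Cr together
    return (luma + chroma) / 3.0

for factors in [(1, 1), (2, 1), (2, 2), (4, 2)]:
    print(factors, relative_sample_count(*factors))
```

A factor of (2, 2), commonly written 4:2:0, halves the number of samples that have to be coded; (1, 1) keeps all of them.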
Quantization table
Each JPEG image can define its own quantization tables (typically one for the luminance channel and one shared by the two chroma channels). Each table holds 64 step sizes, one per DCT coefficient of an 8x8 block.
The first entry applies to the "DC coefficient" (i.e. the scaled average value of the 8x8 block); the other 63 apply to the "AC coefficients".
Quantization is the "lossy step" of JPEG compression. So, a technical user will have to decide how much loss (quantization) is acceptable, and then configure the quantization table accordingly.
Huffman table
Huffman coding is a lossless compression technique. In other words, if one could really spend time optimizing the Huffman coding table based on the statistics of the quantized DCT coefficients of the whole image, one can often construct a good Huffman table to optimize compression without having to trade off quality.
Unfortunately, the reality is more complicated, and such optimization is often not enabled.
It requires keeping all DCT coefficients in memory, for the whole image. This bloats memory usage.
Writing to the file cannot start until everything is in memory. In contrast, if a library chooses the quantization table and Huffman table up-front, without looking at the statistics of the DCT coefficients, then it can write to the file incrementally as rows of pixels are processed. Because libjpeg is designed to be usable on lowest-common-denominator devices (including smart watches, and maybe your refrigerator too?), being able to operate with minimum memory is an important feature.
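The idea of building a Huffman table from the statistics of the whole image can be sketched as follows. This is a simplification: a real JPEG encoder codes run-length/size symbols and limits code lengths to 16 bits, whereas this sketch codes raw coefficient values. The sample data and function names are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for an optimal Huffman code."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    # Heap items: (total count, unique tie-breaker, {symbol: depth so far})
    heap = [(n, i, {sym: 0}) for i, (sym, n) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

def total_bits(symbols, lengths):
    return sum(lengths[s] for s in symbols)

# Hypothetical quantized coefficients: mostly zeros, as is typical.
coeffs = [0] * 50 + [1] * 8 + [-1] * 4 + [2] * 2
lengths = huffman_code_lengths(Counter(coeffs))
print(lengths, total_bits(coeffs, lengths), "bits vs", 2 * len(coeffs), "fixed")
```

Note that `Counter(coeffs)` needs every coefficient before the first code can be assigned, which is exactly the memory cost described above.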
Sorry, but there is no way to tell the size before you compress the file. If you are not in a hurry, compress the image using different quality values and then select the best one.
I have some (millions) of 16-bit losslessly compressed TIFFs (about 2MB each) and after exhausting TB of disk space I think it's time I archive the older TIFFs as 8-bit JPEGs. Each individual image is a grayscale image, though there may be as many as 5 such images representing the same imaging area at different wavelengths. Now I want to preserve as much information as possible in this process, including the ability to restore the images to their approximate original values. I know there are ways to get further savings through spatial correlations across multiple channels, but the number of channels can vary, and it would be nice to be able to load channels independently.
The images themselves suggest some possible strategies to use since close to ~60% of the area in each image is dark 'background'. So one way to preserve more of the useful image range is just to threshold away anything below this 'background' before scaling and reducing the bit depth. This strategy is, of course, pretty subjective, and I'm looking for any other suggestions for strategies that are demonstrably superior and/or more general. Maybe something like trying to preserve the most image entropy?
Thanks.
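The thresholding strategy described in the question (discard everything below the background level, then rescale the remaining range to 8 bits) could be sketched like this. The function names and the sample values are hypothetical, and the sketch ignores the additional loss that JPEG compression itself would introduce afterwards.

```python
def to_8bit(pixels, background, top=65535):
    """Map 16-bit values to 8 bits, clipping everything <= background to 0."""
    scale = 255 / (top - background)
    return [max(0, min(255, round((p - background) * scale))) for p in pixels]

def restore_16bit(codes, background, top=65535):
    """Approximate inverse of to_8bit; clipped pixels come back as `background`."""
    scale = (top - background) / 255
    return [round(c * scale) + background for c in codes]

pixels = [0, 1000, 1200, 30000, 65535]   # hypothetical 16-bit samples
background = 1000                         # hypothetical background level

codes = to_8bit(pixels, background)
restored = restore_16bit(codes, background)
print(codes, restored)
```

To restore approximate original values, `background` and `top` have to be stored alongside each image. For pixels above the background the round-trip error is bounded by about half a quantization step, (top - background) / 510.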
Your 2MB TIFFs are already losslessly compressed, so you would be hard-pressed to find a method that allows you to "restore the images" to their original value ranges without some loss of intensity detail.
So here are some questions to narrow down your problem a bit:
What are the image dimensions and number of channels? It's a bit difficult to guess from the filesize and bit depth alone, because as you've mentioned you're using lossless compression. A sample image would be good.
What sort of images are they? E.g. are they B/W blueprints, X-ray/MRI images, color photographs. You mention that around 60% of the images is "background" -- could you tell us more about the image content?
What are they used for? Is it just for a human viewer, or are they training images for some computer algorithm?
What kind of coding efficiency are you expecting? E.g. for the current 2MB filesize, how small do you want your compressed files to be?
Based on that information, people may be able to suggest something. For example, if your images are just color photographs that people will look at, 4:2:0 chroma subsampling will give you a 50% reduction in space without any visually detectable quality loss. You may even be able to keep your 16-bit image depth, if the reduction is sufficient.
Finally, note that you've compared two fundamentally different things in your question:
"top ~40% of the pixels" -- here it sounds like you're talking about contiguous parts of the intensity spectrum (e.g. intensities from 0.6 to 1.0) -- essentially the probability density function of the image.
"close to ~60% of the area in each image" -- here you're talking about the distribution of pixels in the spatial domain.
In general, these two things are unrelated and comparing them is meaningless. There may be an exception for specific image content -- please put up a representative image to make it obvious what you're dealing with.
If you edit your question, I'll have a look and reply if I think of something.
This is really a two part question, since I don't fully understand how these things work just yet:
My situation: I'm writing a web app which lets the user upload an image. My app then resizes to something displayable (eg: 640x480-ish) and saves the file for use later.
My questions:
Given an arbitrary JPEG file, is it possible to tell what the quality level is, so that I can use that same quality when saving the resized image?
Does this even matter?? Should I be saving all the images at a decent level (eg: 75-80), regardless of the original quality?
I'm not so sure about this because, as I figure it (let's take an extreme example), if someone had a 5 megapixel image saved at quality 0, it would be blocky as anything. Reducing the image size to 640x480, the blockiness would be smoothed out and much less noticeable... until I saved it with quality 0 again...
On the other end of the spectrum, if there was an image which was 800x600 with q=0, resizing to 640x480 isn't going to change the fact that it looks like utter crap, so saving with q=80 would be redundant.
Am I even close?
I'm using GD2 library on PHP if that is of any use
You can view the compression level using the identify tool in ImageMagick. Download and installation instructions can be found at the official website.
After you install it, run the following command from the command line:
identify -format '%Q' yourimage.jpg
This will return a value from 0 (low quality, small filesize) to 100 (high quality, large filesize).
Information source
JPEG is a lossy format. Every time you save the same image as a JPEG, regardless of quality level, you reduce the actual image quality. Therefore, even if you did obtain a quality level from the file, you could not maintain that same quality when you save the JPEG again (even at quality=100).
You should save your JPEG at as high a quality as you can afford in terms of file size, or use a lossless format such as PNG.
Low-quality JPEG files do not simply become more blocky. Instead, colour depth is reduced and detail in sections of the image is removed. You can't rely on lower-quality images being blocky and looking OK at smaller sizes.
According to the JFIF spec. the quality number (0-100) is not stored in the image header, although the horizontal and vertical pixel density is stored.
For future visitors, checking the quality of a given jpeg, you could just use imagemagick tooling:
$> identify -format '%Q' filename.jpg
92%
The JPEG compression algorithm has several parameters that influence the quality of the resulting image.
One of these is the set of quantization tables, which define how coarsely each coefficient is stored. Different programs use different quantization tables.
Some programs allow the user to set a quality level of 0-100, but there is no common definition of this number. The image made with Photoshop at 60% quality takes 46 KB, while the image made with GIMP takes only 26 KB.
The quantization tables are also different.
There are other parameters, such as subsampling, DCT method, etc.
So you can't describe all of them with a single quality number, and you can't compare the quality of JPEG images by a single number. But you can define such a number, as Photoshop or GIMP do, to describe the compromise between size and quality.
More information:
http://patrakov.blogspot.com/2008/12/jpeg-quality-is-meaningless-number.html
Common practice is to resize the image to the appropriate size and apply JPEG compression after that. In this case, large and medium-sized source images will end up with the same size and quality.
Here is a formula I've found to work well:
jpg100size = width * height / 1.7 (the size in bytes that a file at 98-100% quality should not exceed)
jpgxsize = jpg100size * x (x = quality as a fraction, e.g. 0.65)
So you could use these to estimate statistically what quality your JPEG was last saved at. If you want to get it down to, say, 65% quality and want to avoid recompressing needlessly, compare the file size against jpgxsize first to make sure the quality is not already that low, and only then reduce it.
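As a sketch, the rule of thumb above translates into a few lines of Python. The constant 1.7 comes from this answer and is an empirical guess, not part of any standard; the function names are hypothetical.

```python
def jpg100_size(width: int, height: int) -> float:
    """Rule-of-thumb byte size of a 98-100% quality JPEG (empirical constant 1.7)."""
    return width * height / 1.7

def estimated_quality(file_size: int, width: int, height: int) -> float:
    """Rough quality estimate as a fraction in [0, 1], from file size alone."""
    return min(1.0, file_size / jpg100_size(width, height))

def should_recompress(file_size: int, width: int, height: int,
                      target: float = 0.65) -> bool:
    """Only recompress if the file is larger than the target-quality size."""
    return file_size > jpg100_size(width, height) * target

print(round(jpg100_size(640, 480)))             # size budget for a 640x480 image
print(should_recompress(150_000, 640, 480))     # larger than the 65% budget
print(should_recompress(90_000, 640, 480))      # already below it
```

The estimate depends heavily on image content, so it is better suited to statistics over many files than to judging a single one.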
As there are already two answers using identify, here's one that also outputs the file name (for scanning multiple files at once):
If you wish to have a simple output of filename: quality for use on multiple images, you can use
identify -format '%f: %Q' *
to show the filename + compression of all files within the current directory.
So, there are basically two cases you care about:
If an incoming image has quality set too high, it may take up an inappropriate amount of space. Therefore, you might want, for example, to reduce incoming q=99 to q=85.
If an incoming image has quality set too low, it might be a waste of space to raise its quality. Except that an image that's had a large amount of data discarded won't magically take up more space when the quality is raised -- blocky images will compress very nicely even at high quality settings. So, in my opinion it's perfectly OK to raise incoming q=1 to q=85.
From this I would think simply forcing a decent quality setting is a perfectly acceptable thing to do.
Every new save of the file will further decrease overall quality. By using higher quality values you will preserve more of the image, regardless of what the original image quality was.
If you resave a JPEG using the same software that created it originally, using the same settings, you'll find that the damage is minimized - the algorithm will tend to throw out the same information it threw out the first time. I don't think there's any way to know what level was selected just by looking at the file; even if you could, different software almost guarantees different parameters and rounding, making a match almost impossible.
This may be a silly question, but why would you be concerned about micromanaging the quality of the document? I believe if you use ImageMagick to do the conversion, it will manage the quality of the JPEG for you for best effect. http://www.php.net/manual/en/intro.imagick.php
Here are some ways to achieve your (1) and get it right.
There are ways to do this by fitting to the quantization tables. Sherloq - for example - does this:
https://github.com/GuidoBartoli/sherloq
The relevant (python) code is at https://github.com/GuidoBartoli/sherloq/blob/master/gui/quality.py
There is another algorithm written up in https://arxiv.org/abs/1802.00992 - you might consider contacting the author for any code etc.
You can also simulate file_size(image_dimensions,quality_level) and then invert that function/lookup table to get quality_level(image_dimensions,file_size). Hey presto!
Finally, you can adopt a brute-force https://en.wikipedia.org/wiki/Error_level_analysis approach by calculating the difference between the original image and recompressed versions each saved at a different quality level. The quality level of the original is roughly the one for which the difference is minimized. Seems to work reasonably well (but is linear in the for-loop..).
Most often the quality factor used seems to be 75 or 95 which might help you to get to the result faster. Probably no-one would save a JPEG at 100. Probably no-one would usefully save it at < 60 either.
I can add other links for this as they become available - please put them in the comments.
If you trust IrfanView's estimate of the JPEG compression level, you can extract that information from the info text file created by the following Windows command line (your path to i_view32.exe might be different):
"C:\Program Files (x86)\IrfanView\i_view32.exe" <image-file> /info=txtfile
Some software records the JPEG compression level in the metadata of an image.
Use exiftool (it's free) to get the metadata of an image, then search the returned string for "Photoshop Quality". Or at least put the returned data into a text document and check what's recorded. It may vary depending on the software used to save the image.
"Writer Name : Adobe Photoshop
Reader Name : Adobe Photoshop CS6
Photoshop Quality : 7"