Does running imagemagick multiple times degrade image quality? - imagemagick

I run a website and we have a number of folks working on page development. Despite multiple training sessions, many of them upload unoptimized images to the site. We want to run a batch compression with ImageMagick that catches those images and resizes them down; ImageMagick is what's already installed on our server. However, we're wondering whether we can run that batch multiple times on the same images, or whether each run will cause degradation and the images will eventually suffer from it. Is there an easy way to prevent that from happening?

The question is about generation loss from repeated compression at the same "quality" setting. The answer depends on whether chroma subsampling is enabled.
Chroma subsampling is enabled in ImageMagick by default for quality values below 90.
Therefore, for quality values of 90 or greater (or if the -sampling-factor option is used to prevent subsampling), there should be little generation loss after the first pass or two.
I suggest that the OP use quality=90 for the project.
Here is an article entitled "Why JPEG is like a photocopier" that explains generation loss and the effect of chroma subsampling.
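To make that concrete, here is a minimal sketch of the kind of batch pass the OP describes, run from Python via subprocess; the upload directory, the 500 KB cutoff, and the 1600x1600 bounding box are illustrative assumptions, not part of the question.

import subprocess
from pathlib import Path

UPLOAD_DIR = Path("/var/www/uploads")   # hypothetical upload directory
MAX_BYTES = 500_000                     # only touch files larger than ~500 KB
TARGET = "1600x1600>"                   # ">" means shrink only if the image is larger

for jpg in UPLOAD_DIR.rglob("*.jpg"):
    if jpg.stat().st_size <= MAX_BYTES:
        continue  # already small enough; skipping avoids needless re-encoding passes
    subprocess.run([
        "convert", str(jpg),
        "-resize", TARGET,
        "-quality", "90",             # at 90+ ImageMagick skips chroma subsampling by default
        "-sampling-factor", "4:4:4",  # be explicit anyway, to limit generation loss
        str(jpg),                     # overwrite in place
    ], check=True)

Skipping files that are already under the size cutoff is the simplest guard against re-compressing the same image over and over.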
EDIT: I ran some experiments, and they did not exactly bear out my assertions.
Some images (mostly photos) converged very quickly, in 2 or 3 iterations, while others (drawings) took as many as 30 iterations in the worst case. Subsampling took a little longer to converge, and quality 90 took a little longer than quality 85. (The test set was a variety of 64x64-pixel images, each enlarged 4x to make the individual pixels visible.)

Related

How to resize (reshape) images for a CNN? Mathematical intuition behind resizing

I have been working with images for a few months during my internship, and recently I have been wondering whether there is a mathematical way of resizing images.
Resizing images becomes a fairly difficult task because freshers like me often have little experience with image pre-processing.
My problem statement was gender classification from images of the human eye, and I found it difficult because:
The images were 3-channel
The images were rectangular (17:11 aspect ratio)
I did try to resize the images by following a few blogs that said to start small and then go up; while that could have worked, I still did not understand how small. I arbitrarily resized them to 800x800 and got a resource-exhausted error (I was using a GPU).
So I ask the community: is there any such mathematical formula, or a generalized way of approaching the resizing task?
Thank you in advance.
This partially answers your question. Normally, many people use transfer learning and a pre-designed architecture for computer vision tasks. Since almost all architectures are designed for a square input shape, you can get better results by making your input images square. Another solution would be to pad your 17:11 images with zero values to make them square. (You need to test which works best in your case, but the common practice is reshaping to square.)
It is fine to have 3-channel images; almost all architectures are designed for 3-channel input (even for B/W images it is suggested to repeat the channel so the model receives 3 channels).
About resizing
About resizing the image: in theory, you need to resize the image to match the model you are going to use. For example, LeNet-5 accepts MNIST images of size 28x28. In theory, larger images result in better model performance, but since your images are quite low resolution, you can start with 28x28 or 224x224 architectures and later try bigger ones to see whether that helps in your case.
About the error: it's pretty normal; your model was larger than your GPU memory, so you saw an out-of-memory error. You can use a smaller model (and a smaller input image size) on your device, or you need a device with more GPU memory.
Finally, you should consider the input size of the architecture you are going to reuse to determine the correct resize for your dataset. If you are designing your own model, a good starting point can be something around 28x28 (basically LeNet), developing it further based on needs/performance.
The resizing can be as easy as calling a transform from torchvision, like the snippet below (meaning you don't need to manually create a resized copy of the dataset just for resizing):
import torchvision.transforms as T

transform = T.Compose([
    T.Resize((224, 224)),   # resize every sample to the model's expected input size
])
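If you prefer the zero-padding route mentioned above, a minimal sketch could look like the following (the helper name pad_to_square is mine, not part of torchvision):

import torchvision.transforms.functional as F

def pad_to_square(img, fill=0):
    # Zero-pad a PIL image so that height == width, keeping the content centered.
    w, h = img.size                     # PIL convention: (width, height)
    side = max(w, h)
    left = (side - w) // 2
    top = (side - h) // 2
    return F.pad(img, [left, top, side - w - left, side - h - top], fill=fill)

You can drop it into the pipeline above with T.Lambda(pad_to_square) placed before the Resize.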

image resolution for training data for vehicle detection and tracking?

I'm new to computer vision. I'm working on a research project whose objective is (1) to detect vehicles from images and videos, and then later on (2) to be able to track moving vehicles.
I'm at the initial stage where I'm collecting training data, and I'm really concerned about getting images which are at an optimum resolution for detection and tracking.
Any ideas? The current dataset I've been given (from a past project) has images of about 1200x600 pixels. But I've been told this may or may not be an optimum resolution for the detection and tracking task. Apart from considering the fact that I will be extracting haar-like features from the images, I can't think of any factor to include in making a resolution decision. Any ideas of what a good resolution ought to be for training data images in this case?
First of all, feeding raw images directly to classifiers does not produce great results, although it is sometimes useful, as in face detection. So you need to think about feature extraction.
One big issue is that a 1200x600 image has 720,000 pixels. That defines 720,000 dimensions, which poses a challenge for training and classification because of the resulting explosion in dimensionality.
So basically you need to scale down your dimensionality, particularly through feature extraction. Which features to detect depends completely on the domain.
Another important aspect is speed. Processing bigger images takes more time, which matters especially for real-time processing at something like 15-30 fps.
In my project (see my profile), which was real-time (15 fps), I worked with 640x480 images, and for some operations I had to scale down to improve performance.
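For illustration, here is a minimal sketch of that downscale-first idea using OpenCV's Python bindings; the video path and the cars.xml cascade file are placeholders for whatever data and detector you end up training:

import cv2

cascade = cv2.CascadeClassifier("cars.xml")     # placeholder for your trained Haar cascade
cap = cv2.VideoCapture("traffic.mp4")           # placeholder input video

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Detect on a downscaled grayscale copy to keep the per-frame cost low...
    small = cv2.resize(frame, (640, 480), interpolation=cv2.INTER_AREA)
    boxes = cascade.detectMultiScale(cv2.cvtColor(small, cv2.COLOR_BGR2GRAY))
    # ...then map the detections back to the original frame's coordinates.
    sx, sy = frame.shape[1] / 640.0, frame.shape[0] / 480.0
    for (x, y, w, h) in boxes:
        cv2.rectangle(frame, (int(x * sx), int(y * sy)),
                      (int((x + w) * sx), int((y + h) * sy)), (0, 255, 0), 2)

cap.release()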
Hope this helps.

Can an image segmentation algorithm like GrabCut run on the iPhone GPU?

I've been toying around with the GrabCut algorithm (as implemented in OpenCV) on the iPhone. The performance is horrid. It takes about 10-15 seconds to run even on the simulator for an image that's about 800x800. On my phone it runs for several minutes, eventually runs out of memory, and crashes (iPhone 4). I'm sure there's probably some optimization I can do if I write my own version of the algorithm in C, but I get the feeling that no amount of optimization is going to get it anywhere near usable. I've dug up some performance measurements in some academic papers and even they were seeing 30-second runtimes on multi-core 1.8 GHz CPUs.
So my only hope is the GPU, which I know literally nothing about. I've done some basic research on OpenGL ES so far, but it is a pretty in-depth topic and I don't want to waste hours or days learning the basic concepts just so I can find out whether or not I'm on the right path.
So my question is twofold:
1) Can something like GrabCut be run on the GPU? If so, I'd love to have a starting point other than "learn OpenGL ES". Ideally I would like to know what concepts I need to pay particular attention to. Keep in mind that I have no experience with OpenGL and very little experience with image processing.
2) Even if this type of algorithm can be run on the GPU, what kind of performance improvement should I expect? Considering that the current runtime is about 30 seconds AT BEST on the CPU, it seems unlikely that the GPU will put a big enough dent in the runtime to make the algorithm useful.
EDIT: For the algorithm to be "useful", I think it would have to run in 10 seconds or less.
Thanks in advance.
It seems that GrabCut doesn't benefit from higher image resolution: the quality of the result doesn't depend directly on the resolution of the input image. On the other hand, performance does depend on size, meaning the smaller the image, the faster the algorithm performs the cutout. So try scaling the image down to 300x300, applying GrabCut, taking the mask out, scaling the mask up to the original size, and applying it to the original image to obtain the result. Let me know if it works.
Luca
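For reference, here is a minimal sketch of that downscale/upscale recipe using OpenCV's built-in grabCut in Python (desktop OpenCV rather than the iOS build; the function and variable names are mine):

import cv2
import numpy as np

def grabcut_downscaled(image, rect, small_size=(300, 300), iters=5):
    # Run GrabCut on a small copy, then upscale the resulting mask to full size.
    h, w = image.shape[:2]
    small = cv2.resize(image, small_size, interpolation=cv2.INTER_AREA)

    # Scale the user-supplied (x, y, width, height) rectangle into the small image.
    sx, sy = small_size[0] / w, small_size[1] / h
    small_rect = (int(rect[0] * sx), int(rect[1] * sy),
                  int(rect[2] * sx), int(rect[3] * sy))

    mask = np.zeros(small.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(small, mask, small_rect, bgd, fgd, iters, cv2.GC_INIT_WITH_RECT)

    # Keep definite + probable foreground, upscale the mask, and apply it.
    fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0).astype(np.uint8)
    fg = cv2.resize(fg, (w, h), interpolation=cv2.INTER_NEAREST)
    return cv2.bitwise_and(image, image, mask=fg)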

How to compare the "quality" of two image scaling algorithms?

Suppose I want to include an image upscaling/downscaling algorithm in my program. Execution time is not important, the result "quality" is. How do you define "quality" in this case and how do you then choose the best algorithm from the available choices?
On a side note, if this is too general, the underlying problem for which I'm trying to find a solution is this: suppose I have a lot of images that I will need to upscale at runtime (a video, actually). I can pre-process them and upscale them somewhat with a slow and high-quality algorithm, but I don't know the final resolution (well, people have different monitors after all), so I can't resize to that immediately. Would it be beneficial if I upscaled it somewhat with my high-quality algorithm, and then let the player upscale it further to the necessary resolution at runtime (with a fast but low quality algorithm)? Or should I leave the video as-is and leave all the upscaling to be done in one pass at runtime?
The only way to really objectively judge the quality is to do some (semi-)scientific research. Recruit several participants. Show them the upscaled images in a random order, and have them rank the subjective quality (bonus points for doing it double-blind). Then you average out the scores and choose the algorithm with the highest average score (and perhaps test for statistical significance).
You'll want to make sure the images you test give a representative sampling of the actual images you're using. If you're doing it for video, it would probably be a good idea to use short video clips as the test images, instead of stills, as I would suspect that people would perceive the upscaling quality differently for those two.
If you don't care about rigorousness, you could just perform the tests with yourself as the only test subject.
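As a small illustration of the scoring step, here is a sketch that averages per-algorithm ratings and picks the winner; the scores below are made-up placeholders, not real data:

from statistics import mean

ratings = {                       # hypothetical 1-5 scores collected from participants
    "bicubic": [3, 4, 3, 4, 3],
    "lanczos": [4, 4, 5, 4, 4],
    "nearest": [2, 1, 2, 2, 1],
}
averages = {algo: mean(scores) for algo, scores in ratings.items()}
print(max(averages, key=averages.get), averages)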
As for doing an initial prescaling, my guess is that it would not be worth it. Scaling up from a larger image shouldn't be any less expensive than scaling up from the smaller original, and I would expect it to be much more expensive than scaling up by a convenient factor, such as 2x. However, don't take my word for it... test!

good ways to preserve image information when reducing bit depth

I have millions of 16-bit losslessly compressed TIFFs (about 2 MB each), and after exhausting terabytes of disk space I think it's time I archived the older TIFFs as 8-bit JPEGs. Each individual image is a grayscale image, though there may be as many as 5 such images representing the same imaging area at different wavelengths. Now I want to preserve as much information as possible in this process, including the ability to restore the images to their approximate original values. I know there are ways to get further savings through spatial correlations across multiple channels, but the number of channels can vary, and it would be nice to be able to load channels independently.
The images themselves suggest some possible strategies to use since close to ~60% of the area in each image is dark 'background'. So one way to preserve more of the useful image range is just to threshold away anything below this 'background' before scaling and reducing the bit depth. This strategy is, of course, pretty subjective, and I'm looking for any other suggestions for strategies that are demonstrably superior and/or more general. Maybe something like trying to preserve the most image entropy?
Thanks.
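For what it's worth, the threshold-then-rescale idea described above might look something like this sketch with NumPy and Pillow; the cutoff value and filenames are placeholders, and a real pipeline would probably pick the cutoff per image:

import numpy as np
from PIL import Image

BACKGROUND_CUTOFF = 1000                           # hypothetical 16-bit background level

img16 = np.array(Image.open("frame_w1.tif"))       # hypothetical 16-bit grayscale TIFF
fg = np.clip(img16, BACKGROUND_CUTOFF, None) - BACKGROUND_CUTOFF

# Map the remaining range into 8 bits; keep the scale so the mapping can be
# approximately inverted later (img16 ~ img8 * scale + BACKGROUND_CUTOFF).
scale = fg.max() / 255.0 if fg.max() > 0 else 1.0
img8 = (fg / scale).round().astype(np.uint8)
Image.fromarray(img8).save("frame_w1.jpg", quality=90)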
Your 2 MB TIFFs are already losslessly compressed, so once you move to 8-bit JPEG you will be hard-pressed to find a method that allows you to "restore the images" to their original value ranges without some loss of intensity detail.
So here are some questions to narrow down your problem a bit:
What are the image dimensions and number of channels? It's a bit difficult to guess from the filesize and bit depth alone, because as you've mentioned you're using lossless compression. A sample image would be good.
What sort of images are they? E.g. are they B/W blueprints, X-ray/MRI images, or color photographs? You mention that around 60% of each image is "background" -- could you tell us more about the image content?
What are they used for? Is it just for a human viewer, or are they training images for some computer algorithm?
What kind of coding efficiency are you expecting? E.g. for the current 2MB filesize, how small do you want your compressed files to be?
Based on that information, people may be able to suggest something. For example, if your images are just color photographs that people will look at, 4:2:0 chroma subsampling will give you a 50% reduction in space without any visually detectable quality loss. You may even be able to keep your 16-bit image depth, if the reduction is sufficient.
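If the color-photograph case applies, the 4:2:0 suggestion is easy to try with Pillow (filenames are placeholders; in Pillow's JPEG encoder, subsampling=2 selects 4:2:0):

from PIL import Image

img = Image.open("photo.tif").convert("RGB")        # hypothetical color source
img.save("photo.jpg", quality=90, subsampling=2)    # 4:2:0 chroma subsampling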
Finally, note that you've compared two fundamentally different things in your question:
"top ~40% of the pixels" -- here it sounds like you're talking about contiguous parts of the intensity spectrum (e.g. intensities from 0.6 to 1.0) -- essentially the probability density function of the image.
"close to ~60% of the area in each image" -- here you're talking about the distribution of pixels in the spatial domain.
In general, these two things are unrelated and comparing them is meaningless. There may be an exception for specific image content -- please put up a representative image to make it obvious what you're dealing with.
If you edit your question, I'll have a look and reply if I think of something.
