I am facing a strange issue: I read the blob of a WebP image with MagickReadImageBlob and, on the very next line, fetch the same blob back with MagickGetImageBlob. The final blob is smaller than the one I read in. Can anyone explain this behaviour?
I am using Version: ImageMagick 6.9.8-10 Q16 x86_64 on ubuntu 16.04
The MagickReadImageBlob decodes an image-file buffer into a raster of authenticated pixels.
The MagickGetImageBlob encodes the raster back into an image-file buffer.
WebP format can be either lossy, or lossless, as well as implement different compression techniques during the encoding process. It's more than possible that the encoding routine simply found another way to store the raster than the previous one. Your version of ImageMagick has a quantum depth of 16 (Q16), so the decoding/scaling of WebP's 24-bit Color + 8-bit alpha to Q16 might influence some encoding variations. Try setting MagickSetImageDepth(wand, 8) to see if that helps.
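The effect is not specific to WebP. As a rough analogy only (this uses zlib from the Python standard library, not ImageMagick or libwebp, and the "raster" is made-up data), the sketch below encodes the same bytes with two different encoder settings. The encoded sizes may differ, yet both decode to identical data, so a smaller blob after a decode/encode round trip does not by itself mean pixels were lost.

```python
import zlib

# Stand-in for decoded pixel data (hypothetical; not a real image).
raster = bytes((x * y) % 251 for y in range(64) for x in range(64))

fast = zlib.compress(raster, 1)   # quick, weaker compression
best = zlib.compress(raster, 9)   # slower, stronger compression

# Different encoder settings usually give different encoded sizes...
print(len(fast), len(best))

# ...but both decode back to exactly the same raster.
same_pixels = zlib.decompress(fast) == raster and zlib.decompress(best) == raster
print(same_pixels)
```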
I am doing segmentation via deep learning in pytorch. My dataset is a .raw/.mhd format ultrasound images.
I want to input my dataset into the system via data loader.
I have a few important questions:
Does changing the format of the dataset to either .png or .jpg make the segmentation inaccurate? (I think I lose some information this way!)
Which format loses less data?
How should I make a numpy array if I don't convert the original image format, i.e., .raw/.mhd?
How should I load this dataset?
Knowing nothing about raw and mhd formats, I can give partial answers.
Firstly, jpg is lossy and png is not. So, you're surely losing information in jpg. png is lossless for "normal" images: 1, 3 or 4 channels with 8 bits of precision each (16 bits per channel may also be supported, don't quote me on that). I know nothing about ultrasound images, but if they use higher precision than the format can store, even png will be lossy.
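For what it is worth, PNG does define 16-bit-per-channel modes. The precision point itself can be shown with a small sketch using made-up sample values: squeezing 16-bit samples into 8 bits and scaling them back does not recover the originals.

```python
# Hypothetical 16-bit samples, e.g. from a high-precision scanner.
samples16 = [0, 1, 255, 256, 1000, 40000, 65535]

# Store only the top 8 bits of each sample (what an 8-bit format can hold).
as8 = [v >> 8 for v in samples16]

# Scale back up to the 16-bit range.
restored = [v << 8 for v in as8]

# The low 8 bits are gone, so the round trip is lossy.
lossy = restored != samples16
print(restored)
print(lossy)
```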
Secondly, I don't know what mhd is and what raw means in the context of ultrasound images. That being said, a simple google search reveals some package for reading the former to numpy.
Finally, to load the dataset, you can use the ImageFolder class from torchvision. You need to write a custom function which loads an image given its path (for instance using the package mentioned above) and pass it to the loader keyword argument.
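A minimal sketch of such a loader function, assuming the .mhd file is a plain-text MetaImage header with "Key = Value" lines, that the pixel data sits uncompressed in the file named by ElementDataFile, and that it is in the machine's native byte order. The name load_mhd and the small type table are my own for illustration, and only a handful of element types are covered.

```python
import os
import numpy as np

# Subset of MetaImage element types (assumption: extend as needed).
_MET_TO_DTYPE = {
    "MET_UCHAR": np.uint8,
    "MET_SHORT": np.int16,
    "MET_USHORT": np.uint16,
    "MET_FLOAT": np.float32,
}

def load_mhd(header_path):
    """Read an uncompressed .mhd/.raw pair into a numpy array."""
    fields = {}
    with open(header_path) as f:
        for line in f:
            if "=" in line:
                key, value = line.split("=", 1)
                fields[key.strip()] = value.strip()
    dims = [int(n) for n in fields["DimSize"].split()]
    dtype = _MET_TO_DTYPE[fields["ElementType"]]
    raw_path = os.path.join(os.path.dirname(header_path),
                            fields["ElementDataFile"])
    data = np.fromfile(raw_path, dtype=dtype)
    # DimSize lists the fastest-varying axis (x) first;
    # numpy wants the slowest axis first, so reverse it.
    return data.reshape(dims[::-1])
```

Note that ImageFolder only picks up common image extensions by default, so for .mhd files you will probably also need its is_valid_file argument, or DatasetFolder with extensions=(".mhd",), alongside loader=load_mhd.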
I am trying to open an image and turn it into a numpy array.
I have tried:
1) cv2.imread which gives you a numpy array directly
2) and PIL.Image.open then do a numpy.asarray to convert the image object.
Then I realised that the resulting arrays from the same picture are different (screenshots of the cv2.imread and PIL.Image.open outputs were attached).
I would expect the color channels to always come in the same order, no matter the package, but I cannot seem to find any documentation for Pillow regarding this.
Or am I just being silly? Thanks in advance for any suggestion!!!
I don't know anything about PIL but, contrary to just about every other system in the world, OpenCV stores images in BGR order, not RGB. That catches every OpenCV beginner by surprise and it looks like that's the case with your example.
OpenCV
import cv2
image = cv2.imread(image_path, cv2.IMREAD_COLOR)
image_cv = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
Pillow
import numpy
from PIL import Image
image_from_pil = Image.open(image_path).convert("RGB")
image_pillow = numpy.array(image_from_pil)
image_pillow equals image_cv
Note: when reading a JPEG image, image_pillow and image_cv may differ slightly because OpenCV and Pillow may be built against different libjpeg versions.
As SSteve and Xin Yang correctly say, the main problem could be that cv2 returns spatial domain (pixels) in BGR color plane, instead of usual RGB. You need to convert the output (reverse the order in channels' axis or use cv2.cvtColor).
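Reversing the channel axis needs nothing but numpy slicing. A tiny sketch with a hand-made two-pixel array standing in for the output of cv2.imread:

```python
import numpy as np

# A 1x2 "image" in BGR order: one pure blue pixel, one pure red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reverse the last (channel) axis: BGR -> RGB.
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # [0, 0, 255]  (blue pixel, now in RGB order)
print(rgb[0, 1].tolist())  # [255, 0, 0]  (red pixel, now in RGB order)
```

The slice is a view with a negative stride. Some libraries reject such arrays, in which case wrapping it in np.ascontiguousarray(rgb) makes a normal copy.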
Even after the color plane conversion, the output might not be the same. Both PIL and cv2 use libjpeg under the hood, but the outputs of libjpeg do differ for different versions. Read this research paper for reference. Based on my experiments I can say that the libjpeg version used by PIL is unpredictable (differs even on two identical MacBook Pro 2020 M1 using brew and the same Python and PIL version).
If it does matter and you want to have control over which libjpeg/libjpeg-turbo/mozjpeg version is used for compression and decompression, use jpeglib. It is still in beta, but the production release is coming.
I need to convert many TIFF images to JPEG per second. Currently I'm using libmagick++ (Q16). I'm in the process of compiling ImageMagick Q8 as I read that it may improve performance (especially because I'm only working with 8-bit images).
CImg also looks like a good option and GraphicsMagick claims to be faster than ImageMagick. I haven't tested either of those yet, but I was wondering if there are any other alternatives that could be faster than using ImageMagick Q8?
I'm looking for a Linux only solution.
UPDATE with GraphicsMagick & ImageMagick Q8
Base comparison (see comment to Mark): 0.2 secs with ImageMagick Q16
I successfully compiled GraphicsMagick with Q8, but after all, it seems about 30% slower than ImageMagick (0.3 secs).
After compiling ImageMagick with Q8, there was a gain of about 25% (0.15 secs). Nice :)
UPDATE with VIPS
Thanks to Mark's post, I gave VIPS a try, using the 7.38 version found in the Ubuntu Trusty repositories:
time vips copy input.tiff output.jpg[Q=95]
real 0m0.105s
user 0m0.130s
sys 0m0.038s
Very nice :)
I also tried 7.42 (from ppa:dhor/myway) but it seems slightly slower:
real 0m0.134s
user 0m0.168s
sys 0m0.039s
I will try to compile VIPS from source and see if I can beat that time. Well done Mark!
UPDATE: with VIPS 8.0
Compiled from source, vips-8.0 gets practically the same performance as 7.38:
real 0m0.100s
user 0m0.137s
sys 0m0.031s
Configure command:
./configure CC=c99 CFLAGS=-O2 --without-magick --without-OpenEXR --without-openslide --without-matio --without-cfitsio --without-libwebp --without-pangoft2 --without-zip --without-png --without-python
I have a few thoughts...
Thought 1
If your input images are 15MB and, for argument's sake, your output images are 1MB, you are already using 80MB/s of disk bandwidth to process 5 images a second - which is already around 50% of what a sensible disk might sustain. I would do a little experiment with using a RAMdisk to see if that might help, or an SSD if you have one.
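The arithmetic behind that figure, as a quick sketch (the 1MB output size is the assumption made above for argument's sake):

```python
input_mb = 15           # size of one input TIFF
output_mb = 1           # assumed size of one output JPEG
images_per_second = 5

# Every image is read once and written once.
bandwidth_mb_s = (input_mb + output_mb) * images_per_second
print(bandwidth_mb_s)  # 80
```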
Thought 2
Try experimenting with using VIPS from the command line to convert your images. I benchmarked it like this:
# Create dummy input image with ImageMagick
convert -size 3288x1152! xc:gray +noise gaussian -depth 8 input.tif
# Check it out
ls -lrt
-rw-r--r--# 1 mark staff 11372808 28 May 11:36 input.tif
identify input.tif
input.tif TIFF 3288x1152 3288x1152+0+0 8-bit sRGB 11.37MB 0.000u 0:00.000
# Convert to JPEG with ImageMagick
time convert input.tif output.jpg
real 0m0.409s
user 0m0.330s
sys 0m0.046s
# Convert to JPEG with VIPS
time vips copy input.tif output.jpg
real 0m0.218s
user 0m0.169s
sys 0m0.036s
Mmm, seems a good bit faster. YMMV of course.
Thought 3
Depending on the result of your test on disk speed, if your disk is not the limiting factor, consider using GNU Parallel to process more than one image at a time if you have a quad core CPU. It is pretty simple to use and I have always had excellent results with it.
For example, here I sequentially process 32 TIFF images created as above:
time for i in {0..31} ; do convert input-$i.tif output-$i.jpg; done
real 0m11.565s
user 0m10.571s
sys 0m0.862s
Now, I do exactly the same with GNU Parallel, doing 16 in parallel at a time
time parallel -j16 convert {} {.}.jpg ::: *tif
real 0m2.458s
user 0m15.773s
sys 0m1.734s
So, that's now 13 images per second, rather than 2.7 per second.
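If you would rather drive the conversions from a script than from GNU Parallel, the same idea can be sketched with a thread pool. This is only an illustration: convert_all and make_cmd are names I made up, and the default command simply mirrors the convert call used in the loop above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def convert_all(paths, jobs=4, make_cmd=None):
    """Run one external converter per input file, `jobs` at a time.

    Returns the exit code of each command, in input order.
    """
    if make_cmd is None:
        # Default: convert input.tif input.jpg
        def make_cmd(src):
            return ["convert", str(src), str(Path(src).with_suffix(".jpg"))]

    def run(src):
        return subprocess.run(make_cmd(src)).returncode

    # Threads are enough here: the heavy work happens in the child processes.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run, paths))
```

Usage would be something like convert_all(sorted(Path(".").glob("*.tif")), jobs=16). As with GNU Parallel, the best value for jobs depends on your core count and disk.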
Which image format does Y800 correspond to in OpenCV? Does it always refer to grayscale? Are there any other options?
Thanks in advance.
Eight years later, I stumbled upon this question. I want to add an answer with regard to OpenCV 4.2.0.
For videos, this version features an ffmpeg backend which natively understands the FOURCC identifier "Y800". Confusingly, it does not take one-channel grayscale (CV_8UC1) frames, but the usual OpenCV three-channel BGR (CV_8UC3):
cv::VideoWriter vw("y8.avi", cv::VideoWriter::fourcc('Y', '8', '0', '0'), 60, frame.size());
vw.write(frame); // note: frame must be 8UC3!
OpenCV supports grayscale mode in a number of image file formats, including, but not limited to, PGM, PNG and JPEG.
cv::imwrite("gray.pgm", image);
I have an embedded application where an image scanner sends out a stream of 16-bit pixels that are later assembled to a grayscale image. As I need to both save this data locally and forward it to a network interface, I'd like to compress the data stream to reduce the required storage space and network bandwidth.
Is there a simple algorithm that I can use to losslessly compress the pixel data?
I first thought of computing the difference between two consecutive pixels and then encoding this difference with a Huffman code. Unfortunately, the pixels are unsigned 16-bit quantities so the difference can be anywhere in the range -65535 .. +65535 which leads to potentially huge codeword lengths. If a few really long codewords occur in a row, I'll run into buffer overflow problems.
Update: my platform is an FPGA
PNG provides free, open-source, lossless image compression in a standard format using standard tools. PNG uses zlib as part of its compression. There is also a libpng. Unless your platform is very unusual, it should not be hard to port this code to it.
How many resources do you have available on your embedded platform?
Could you port zlib and do gzip compression? Even with limited resources, you should be able to port something like LZ77 or LZ78.
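To get a feel for what a zlib-style compressor does with 16-bit pixel data before porting anything, you could prototype on a PC first. A sketch with made-up sample values, using Python's standard zlib module as a stand-in for whatever ends up on the target:

```python
import struct
import zlib

# Made-up 16-bit samples; replace with a real capture from the scanner.
pixels = [1000 + (i % 7) for i in range(4096)]

# Pack as little-endian unsigned 16-bit words, then deflate.
raw = struct.pack("<%dH" % len(pixels), *pixels)
packed = zlib.compress(raw, 6)

# Inflate and unpack again to confirm the round trip is lossless.
restored = list(struct.unpack("<%dH" % len(pixels), zlib.decompress(packed)))

print(len(raw), len(packed))
print(restored == pixels)
```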
There are a wide variety of image compression libraries available. For example, this page lists nothing but libraries/toolkits for PNG images. Which format/library works best for you will most likely depend on the particular resource constraints you are working under (in particular, whether or not your embedded system can do floating-point arithmetic).
The goal with lossless compression is to be able to predict the next pixel based on previous pixels, and then to encode the difference between your prediction and the real value of the pixel. This is what you initially thought of doing, but you were only using the one previous pixel and making the prediction that the next pixel would be the same.
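On the worry about differences spanning -65535 .. +65535: if the difference is taken modulo 2^16, it always fits in 16 bits and the decoder can still undo it exactly. A minimal sketch of previous-pixel prediction done that way (function names are my own, and the first pixel is predicted as 0):

```python
MASK = 0xFFFF  # 16-bit samples

def delta_encode(pixels):
    """Residual = (pixel - previous pixel) mod 2**16."""
    prev = 0
    out = []
    for p in pixels:
        out.append((p - prev) & MASK)
        prev = p
    return out

def delta_decode(residuals):
    """Undo delta_encode by adding residuals back modulo 2**16."""
    prev = 0
    out = []
    for r in residuals:
        prev = (prev + r) & MASK
        out.append(prev)
    return out

row = [1000, 1002, 1001, 65535, 0, 3]
residuals = delta_encode(row)
print(residuals)                        # [1000, 2, 65535, 64534, 1, 3]
print(delta_decode(residuals) == row)   # True
```

Small negative steps show up as values near 65535, so the entropy coder should treat the residuals as signed 16-bit numbers when building its code table.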
Keep in mind that if you have all of the previous pixels, you have more relevant information than just the preceding pixel. That is, if you are trying to predict the value of X, you should use the O pixels:
..OOO...
..OX
Also, you would not want to use the previous pixel, B, in the stream to predict X in the following situation:
OO...B <-- End of row
X <- Start of next row
Instead you would make your prediction based on the Os.
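A sketch of that idea, simplified to use only two of the neighbours (the pixel to the left and the pixel above) and their average as the prediction. The choice of predictor is mine for illustration; other combinations of the Os work the same way. At the start of a row it predicts from the pixel above instead of the last pixel of the previous row.

```python
MASK = 0xFFFF  # 16-bit samples

def predict(img, y, x):
    """Predict pixel (y, x) from already-known neighbours only."""
    if y == 0 and x == 0:
        return 0
    if y == 0:
        return img[y][x - 1]       # first row: only the left pixel exists
    if x == 0:
        return img[y - 1][x]       # row start: use the pixel above
    return (img[y][x - 1] + img[y - 1][x]) // 2

def encode(img):
    """Replace each pixel by (pixel - prediction) mod 2**16."""
    return [[(img[y][x] - predict(img, y, x)) & MASK
             for x in range(len(img[0]))]
            for y in range(len(img))]

def decode(res):
    """Rebuild the image in raster order, so neighbours are known in time."""
    img = [[0] * len(res[0]) for _ in res]
    for y in range(len(res)):
        for x in range(len(res[0])):
            img[y][x] = (predict(img, y, x) + res[y][x]) & MASK
    return img

image = [[100, 101, 102],
         [101, 102, 103],
         [60000, 60001, 5]]
print(decode(encode(image)) == image)  # True
```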
How 'lossless' do you need?
If this is a real scanner there is a limit to the bandwidth/resolution so even if it can send +/-64K values it may be unphysical for adjacent pixels to have a difference of more than say 8 bits.
In which case you can do a start pixel value for each row and then do differences between each pixel.
This will smear out peaks, but it may be that any peaks of more than N bits are noise anyway.
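A sketch of that scheme, assuming signed 8-bit differences (the bits parameter and function names are mine). The encoder tracks the value the decoder will reconstruct rather than the true pixel, so clamping errors do not accumulate beyond the peak itself.

```python
def encode_row(row, bits=8):
    """Store the first pixel, then clamped signed differences."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    start = row[0]
    recon = start
    diffs = []
    for p in row[1:]:
        d = max(lo, min(hi, p - recon))  # clamp to the N-bit range
        diffs.append(d)
        recon += d                       # what the decoder will see
    return start, diffs

def decode_row(start, diffs):
    out = [start]
    for d in diffs:
        out.append(out[-1] + d)
    return out

smooth = [1000, 1010, 1005, 990]
peaky = [1000, 1010, 1500, 1505, 1490]

print(decode_row(*encode_row(smooth)) == smooth)  # True: small steps survive
print(decode_row(*encode_row(peaky)))             # [1000, 1010, 1137, 1264, 1391]
```

The second row shows the smearing: the jump to 1500 is larger than 127, so the reconstruction climbs towards it over several pixels instead.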
A good LZ77/RLE hybrid with bells and whistles can get wonderful compression that is fairly quick to decompress. They will also be bigger, badder compressors on smaller files due to the lack of library overhead. For a good, but GPL'd, implementation of this, check out PUCrunch.