Using Imagemagick to convert PDFs having rasters in them results in white backgrounds behind the sections covered by the rasters

I tried to use ImageMagick (v6.8.9-9 Q16) to convert a PDF containing a PNG file embedded in it to a PNG file.
The original PNG file had a transparent background. In the PDF too it appears fine. But in the PNG obtained after conversion, the area originally occupied by the PNG in the PDF has a white background. Please see the links for more clarity.
The command I ran is as follows:
convert -colorspace sRGB dice.pdf converted_dice.png
I also tried setting the -transparent white switch but it ends up taking out whites that were actually required in the final image.
Are there any extra switches or parameters to pass to convert in order to get rid of just this white background?

Kurt already explains the whole thing in great detail. So here is just how to assemble an image with ImageMagick after running it through pdfimages -png
pdfimages -png my.pdf my
This results in two files:
identify my-0*png
my-000.png PNG 360x310 360x310+0+0 8-bit sRGB 256c 3.3KB 0.000u 0:00.000
my-001.png PNG 360x310 360x310+0+0 8-bit sRGB 256c 9.44KB 0.000u 0:00.000
my-001.png is the image labeled smask in pdfimages -list. To reassemble the image back to its original form, use -compose CopyOpacity with the ImageMagick command composite:
composite -compose CopyOpacity my-001.png my-000.png my-reassembled.png
See also http://www.imagemagick.org/Usage/masking/#masks for more information.

Your approach to this task cannot work.
The command you used will convert the complete letter-sized PDF page (612 x 792 pt) into a PNG image.
However, the original size of the image embedded in the PDF page (612 x 792 pt) is 800 x 600 pixels. This can be seen by running pdfimages -list:
pdfimages -list dice.pdf
page num type width height color comp bpc enc interp object ID x-ppi y-ppi size ratio
----------------------------------------------------------------------------------------
1 0 image 800 600 rgb 3 8 image no 12 0 72 72 277K 20%
1 1 smask 800 600 gray 1 8 image no 12 0 72 72 50.1K 11%
So this is the first problem when converting the PDF page: it does not give you the correct size of the contained images.
The second, more fundamental problem however is: any image you get from converting a PDF page is the combination of all PDF objects overlaid on each other as they appear in the page area. (Of course you could crop only part of the page, but this likewise gives you the combination of all PDF objects from the cropped area.) You encountered the results of this when you tried to convert all white pixels into transparent ones: since the originally separate objects are merged into one representation of pixels, you can no longer discriminate between them as required.
You should take a different approach and use a different tool to extract the image: use pdfimages (the tool used above with the -list parameter to display image properties from the PDF's pages). As you can see, there are two images listed: one is an RGB raster image, the other is a grayscale raster image of type smask (softmask).
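Why the merged pixels can no longer be told apart can be illustrated with a minimal Python sketch. It assumes the usual "over" compositing rule against an opaque white page; the function name and pixel values are hypothetical and not taken from ImageMagick or the PDF in question.

```python
def flatten_over_white(rgb, alpha):
    """Composite one pixel (alpha in 0.0..1.0) over an opaque white page."""
    white = 255
    return tuple(round(c * alpha + white * (1.0 - alpha)) for c in rgb)

# A fully transparent pixel from the embedded image...
transparent_pixel = flatten_over_white((255, 0, 0), 0.0)
# ...and a genuinely white, fully opaque pixel from the page content.
opaque_white = flatten_over_white((255, 255, 255), 1.0)

# Both come out as the same value, so a later "make white transparent"
# step has no way of knowing which of the two it is looking at.
print(transparent_pixel, opaque_white)
```

After flattening, both pixels carry identical RGB values, which is why -transparent white also removes whites that belong in the final image.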
Here is a command to extract both images as PNG:
pdfimages -png dice.pdf dice-images
This will extract the two:
dice-images-0000.png (a color image)
dice-images-0001.png (a grayscale image)
(Note: Only very recent versions of pdfimages, the Poppler version, will let you extract the images as PNG. Within the PDF there is no such thing as PNG. There is only raster data, compressed with different methods. Older versions will only be able to extract images as PPM or PNM. This does not have any influence on what I describe below. Even if you extract PPM/PNM images, these two files can still be processed as described below...)
Below is a side-by-side, scaled-down montage of the two:
As you can see, the image itself does not have a transparent background, but a white one. (It does not have an Alpha channel.) Within the PDF format, these two images are used in combination to create transparent areas:
what appears completely black in the softmask (right) means: this pixel of the real image (left) is meant to be fully transparent.
what appears completely white in the softmask (right) means: this pixel of the real image (left) is meant to be fully opaque.
what appears in a shade of gray in the softmask (right) means: this pixel of the real image (left) is meant to be partially transparent (in line with its level of gray/black).
To combine these two files (color image and grayscale softmask) back into one PNG with transparency, you can employ ImageMagick now...
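What combining the two files means per pixel can be sketched in a few lines of Python. This is only an illustration of the black/white/gray rule described above, not the ImageMagick implementation; the function name and the sample pixel values are made up for the example.

```python
def apply_softmask(color_pixels, mask_pixels):
    """Pair each RGB pixel with its softmask gray value as the alpha channel.

    Mask value 0 (black) means fully transparent, 255 (white) fully opaque,
    and anything in between means partially transparent.
    """
    if len(color_pixels) != len(mask_pixels):
        raise ValueError("color image and softmask must have the same size")
    return [(r, g, b, a) for (r, g, b), a in zip(color_pixels, mask_pixels)]

# Three sample pixels from the color image and the matching softmask values.
color = [(255, 255, 255), (200, 30, 30), (30, 30, 200)]
mask = [0, 255, 128]

rgba = apply_softmask(color, mask)
print(rgba)
```

The first pixel is white in the color image but ends up fully transparent, because only the softmask decides the opacity.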

Related

Image magic convert command creates more than one file

I executed below command to convert a .tif file to a .jpg file. But for some tif images it generates 3 jpg files when only one file is expected. One is the expected jpg file, one is the same image in a black background and the other is just a white image.
magick convert /<.tif image name> -intent relative -resize 1500x1500> -quality 95 -colorspace sRGB -strip -auto-orient /<output .jpg image name>
Does anyone know the reason for this? What property of the input file is causing this? Or is there an issue with the command?
magick convert /<.tif image> -intent relative -resize 1500x1500> -quality 95 -colorspace sRGB -strip -auto-orient /<output .jpg image>
Expect this to give a single jpg image. But it gives 3 images for some input .tif images

Just adding some meat to @GeeMack's comment...
TIFF files often contain multiple images - or IFDs as referred to in the documentation. These can represent many things, but the most common are:
a low-resolution, flattened preview image followed by a full-resolution image aimed at providing quick previews
multiple pages of longer documents
colour separations for printing
the many channels of multi/hyper-spectral images
the layers of a multi-layer image, e.g. Photoshop editing layers
images and their associated masks, or classes/categories/classifications
... and so on.
A quick way to check what you have is with ImageMagick's identify command, as that will produce a line for each image in the file. You can often tell by the sizes, shapes and types of the layers which is a small preview and which is the high-resolution image, or that there are 242 channels of identical resolution images for an EO-1 hyperspectral imager.
magick identify IMAGE.TIF
Here's an example:
magick identify Prokudin-Gorskii.tif
Prokudin-Gorskii.tif[0] TIFF 3702x3205 3702x3205+0+0 16-bit sRGB 134.955MiB 0.060u 0:00.064
Prokudin-Gorskii.tif[1] TIFF 3702x3205 3702x3205+0+0 16-bit sRGB 134.955MiB 0.000u 0:00.001
Prokudin-Gorskii.tif[2] TIFF 625x175 625x175+841+814 16-bit sRGB 134.955MiB 0.000u 0:00.001
and you can see from the sizes that there are two full layers followed by a reduced size layer that is only annotation or markup on a small area of the image.
Another useful technique is to lay out all the images within a TIFF beside each other in a row across the page, with 10 pixel gaps between, using a command like this:
magick IMAGE.TIFF +smush 10 contents.jpg
We can now see that the three layers in the foregoing image correspond to a flattened version of all the layers on the left, followed by the two individual layers themselves in the centre and the reduced size yellow line overlay layer on the right.
If we then determine that it is only the first, flattened image we are interested in, we can extract and manipulate that alone by adding its sequence number in square brackets afterwards. So, to extract just the first flattened image:
magick IMAGE.TIF[0] extracted.tif
You can also extract multiple individual images and ranges, using commas and dashes.
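Note also that magick convert is generally not what you want: in ImageMagick 7 it runs the command in the legacy version 6 compatibility mode, so prefer plain magick.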
Note also that exiftool is lighter weight than a full ImageMagick installation and can also tell you what's in a multi-IFD TIFF.
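If neither tool is at hand, the chain of IFDs mentioned above can be walked directly. The following is a minimal Python sketch that assumes a classic TIFF layout (8-byte header, 12-byte directory entries, 4-byte next-IFD offset); BigTIFF and sub-IFDs are not handled, and the function name is hypothetical. The demo bytes are a synthetic file with three empty IFDs, built only to exercise the code.

```python
import struct

def count_ifds(data: bytes) -> int:
    """Count the IFDs (images) in a classic TIFF by following the IFD chain."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF: bad byte-order mark")
    magic, offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")
    seen = set()
    while offset != 0:
        if offset in seen:
            raise ValueError("IFD chain loops back on itself")
        seen.add(offset)
        (entries,) = struct.unpack_from(order + "H", data, offset)
        # Each entry is 12 bytes; the next-IFD offset follows the entries.
        (offset,) = struct.unpack_from(order + "I", data, offset + 2 + 12 * entries)
    return len(seen)

# Synthetic little-endian TIFF: header, then three IFDs with zero entries each.
demo = b"II" + struct.pack("<HI", 42, 8)
demo += struct.pack("<HI", 0, 14) + struct.pack("<HI", 0, 20) + struct.pack("<HI", 0, 0)

ifd_count = count_ifds(demo)
print(ifd_count)
```

For a real file, the same function could be called on the bytes read from it, for example count_ifds(open("IMAGE.TIF", "rb").read()). A result greater than one would explain why a conversion to JPEG produces several output files.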

Gray scale to text scan image to black/white image with higher resolution

convert 0101.jp2 -threshold 50% -type bilevel -monochrome -compress LZW ../0101.tiff
The resulting image looks jagged when I use the above command to convert a colored scanned text page to a black/white image (must be one bit per pixel). I want to make it of a higher resolution to look smoother. How can I use convert to do so?
Note that SO automatically converts tif images to jpg format, so the image shown below is not the same as the actual output image. You will need to run the convert command to get the true output image in tif.

If instead of thresholding you apply a strong contrast the gray pixels on the edge remain in a range of grays and the output is not jagged.
convert Original.jpg -sigmoidal-contrast 30 Corrected.jpg
(there are several ways to increase contrast in Magick)
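The difference between the two approaches can be sketched numerically in Python. The curve below is a normalized logistic function, which is the general shape of a sigmoidal contrast stretch; it is not copied from ImageMagick's source, and the midpoint of 0.5 is an assumption since the command above does not specify one. The function names and the sample edge values are hypothetical.

```python
import math

def threshold(u, cut=0.5):
    """Hard threshold: every value becomes pure black (0.0) or pure white (1.0)."""
    return 1.0 if u >= cut else 0.0

def sigmoidal(u, contrast=30.0, midpoint=0.5):
    """Normalized logistic curve mapping 0..1 onto 0..1 (assumed shape)."""
    def s(x):
        return 1.0 / (1.0 + math.exp(-x))
    lo = s(contrast * (0.0 - midpoint))
    hi = s(contrast * (1.0 - midpoint))
    return (s(contrast * (u - midpoint)) - lo) / (hi - lo)

# Gray levels across the anti-aliased edge of a glyph, from dark to light.
edge = [0.40, 0.45, 0.50, 0.55, 0.60]

hard = [threshold(u) for u in edge]
soft = [round(sigmoidal(u), 3) for u in edge]
print(hard)
print(soft)
```

The thresholded edge jumps straight from black to white, while the sigmoidal version keeps a short ramp of intermediate grays, which is what makes the edge look smooth. Note that the result is then no longer strictly one bit per pixel.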

ImageMagick: Exact remap of greyscale values to RGB ones

I'm using ImageMagick 6.8 and I have LUT color table created in text format:
# ImageMagick pixel enumeration: 848,1,255,srgb
0,0: (0 , 0 , 0 ) #000000
1,0: (226, 226, 224) #E2E2E0
2,0: (48 , 74 , 0 ) #304A00
# ...
# few hundred more colors
Which has one colour per grayscale value (between 0 and 848 in my use case).
So, I want to convert a grayscale image to RGB one, using this LUT without any fancy gamma corrections, colour space remaps, interpolations and etc. Just straight replacement. How to do it?
The problems start right at the beginning:
Trying to convert lut.txt lut.png with various options always gives me more colours than there actually are. In the LUT, there are 540 unique colours, but inspecting the generated PNG, or even identify lut.txt, reports 615! This means that the LUT is not interpreted straight at all.
On the other hand, even if I succeed to read the LUT exactly, or probably avoid converting it to PNG, there comes another problem. Using -clut maps the whole greyscale range (0-65535) to the LUT, so I guess I have to normalize it first. But this screws up the greyscales input to begin with.
P.S. An answer which might be useful here is, if there is image format with bigger than 8-bit indexed palette. Then that text LUT be used as its palette and the greyscale raster as its pixel values.

In Imagemagick, use -clut to process a grayscale image with a colored look-up table image to colorize the grayscale image.
First create a 3-color color table LUT image with red, green and blue hex colors. I show an enlarged version.
convert xc:"#ff0000" xc:"#00ff00" xc:"#0000ff" +append colortable.gif
Here is the input - a simple gradient that I will colorize.
Now apply the color table image to the gradient using -clut.
convert gradient.png colortable.gif -clut gradient_colored.png
The default is a linear interpolation. But if you only want to see the 3 colors, then use -interpolate nearest-neighbor.
convert gradient.png colortable.gif -interpolate nearest-neighbor -clut gradient_colored2.png
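The difference between the two interpolation modes can be sketched in Python. This is only an illustration, assuming that the gray range 0.0..1.0 is spread evenly across the width of the colour table; it is not ImageMagick's code, and the function names are hypothetical.

```python
def clut_nearest(gray, table):
    """Pick the table entry closest to the gray level (gray in 0.0..1.0)."""
    index = round(gray * (len(table) - 1))
    return table[index]

def clut_linear(gray, table):
    """Blend the two neighbouring table entries linearly."""
    pos = gray * (len(table) - 1)
    lo = min(int(pos), len(table) - 2)
    frac = pos - lo
    return tuple(a + (b - a) * frac for a, b in zip(table[lo], table[lo + 1]))

# The same three-colour table as above: red, green, blue.
table = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]

print(clut_nearest(0.5, table))   # lands exactly on the green entry
print(clut_linear(0.25, table))   # halfway between red and green
```

With nearest-neighbor lookup only colours that are actually in the table can appear in the output; linear interpolation introduces in-between colours, which matters when an exact remap is required.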

How to change the depth of an image using imagemagick?

I have tried adding the option -depth 12 to the string
convert transparentPNG.png -resize 500x400 -background white -flatten -depth 12 png_small.jpg
The input file is a transparent png to which I'm adding a background and then changing the depth. But the depth remains 8 bits. I verified this using -verbose.
I'm not sure what I could be doing wrong here. I'm referring to the site link
The transparent input png file used for my test can be found here
Let me know if you have any questions on the tests I did. Hoping to get some tips.

A JPG written by ImageMagick is normally 8-bit, so your internal 12-bit image is converted back to 8-bit when you save the result.
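What the conversion back to 8-bit does to the sample values can be sketched with simple arithmetic in Python. The sketch assumes plain round-to-nearest scaling from the 12-bit range to the 8-bit range; how ImageMagick rounds internally is not stated here, and the function name is hypothetical.

```python
def to_8bit(sample_12bit):
    """Scale a 12-bit sample (0..4095) to 8 bits (0..255), rounding to nearest."""
    return (sample_12bit * 255 + 2047) // 4095

# Count how many distinct levels exist before and after the reduction.
distinct_in = len(set(range(4096)))
distinct_out = len({to_8bit(v) for v in range(4096)})
print(distinct_in, distinct_out)
```

Roughly sixteen neighbouring 12-bit levels collapse onto each 8-bit level, so the extra precision requested with -depth 12 cannot survive in the saved file. A format that supports deeper samples, such as PNG or TIFF, would be needed to keep it.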

ImageMagick convert adds several extra "border" colors from tiff to jpeg?

I created an 8-bit .tiff image ("test.tiff") containing a grid of 30 different color patches in the RGB color space using ImageMagick's convert.
When I convert this image into a jpeg (which is what I need) using:
convert -quality 100 -colorspace RGB -depth 8 test.tiff test.jpg
The identify -verbose command reveals that the resulting jpeg has several additional colors in the color table, each only taking up a few (1-4) pixels and residing very near the desired colors in RGB space. My assumption is that some kind of border bleeding is happening; maybe due to compression?
I don't understand why this border bleeding has occurred, especially given that it does not occur when I convert the tiff image to either a bmp or pcx image.
Thank you

By definition, JPEG is a lossy compression. The effects you're experiencing are expected with the JPEG format. Setting -quality to 100 will not give a 1-to-1 image result the way tiff does.
See additional answers:
Should I use JPG or TIFF for high-quality prints?
[...] because every time [JPEG] would save it it would generate some changes.
Is Jpeg lossless when quality is set to 100?
At [quality] 100, you just get the LEAST loss possible.
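One source of such small colour shifts can be shown in isolation with a Python sketch. It assumes the JPEG file stores YCbCr using the JFIF conversion coefficients and rounds each component to 8 bits; the DCT quantization that JPEG also applies is left out entirely, so this is a partial illustration and not a model of what libjpeg or ImageMagick do. All function names are hypothetical.

```python
def clamp8(x):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, round(x)))

def rgb_to_ycbcr(r, g, b):
    """Forward colour transform with JFIF coefficients, rounded to 8 bits."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(cb), clamp8(cr)

def ycbcr_to_rgb(y, cb, cr):
    """Inverse colour transform, again rounded to 8 bits."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)

def round_trip(rgb):
    """Convert a colour to YCbCr and straight back to RGB."""
    return ycbcr_to_rgb(*rgb_to_ycbcr(*rgb))

pure_red = (255, 0, 0)
mid_gray = (128, 128, 128)
print(round_trip(pure_red), round_trip(mid_gray))
```

Mid gray survives the round trip, but pure red does not come back as exactly (255, 0, 0). Shifts of this size land very near the intended colours in RGB space, which matches the description of the extra colours in the question.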

I don't know how you created your 30 colour swatch, or how your histogram looks, but you might try adding -dither None and -colors 30 options to your convert commands:
convert test.tiff -dither None -colors 30 ...
