ImageMagick parallel conversion

I want to get a screenshot of each page of a PDF as a JPG. To do this I am using ImageMagick's convert command on the command line.
I have to achieve the following:
Get a screenshot of each page of the PDF file.
Resize each screenshot into 3 different sizes (small, med and preview).
Store the different sizes in different folders (small, med and preview).
I am using the following command, which works but is slow. How can I improve its execution time or run the commands in parallel?
convert -density 400 -quality 100 /input/test.pdf -resize '170x117>' -scene 1 /small/test_%d_small.jpg & convert -density 400 -quality 100 /input/test.pdf -resize '230x160>' -scene 1 /med/test_%d_med.jpg & convert -density 400 -quality 100 /input/test.pdf -resize '1310x650>' -scene 1 /preview/test_%d_preview.jpg
Splitting the command for readability:
convert -density 400 -quality 100 /input/test.pdf -resize '170x117>' -scene 1 /small/test_%d_small.jpg
convert -density 400 -quality 100 /input/test.pdf -resize '230x160>' -scene 1 /med/test_%d_med.jpg
convert -density 400 -quality 100 /input/test.pdf -resize '1310x650>' -scene 1 /preview/test_%d_preview.jpg

Updated Answer
I see you have long, multi-page documents and while my original answer is good for making multiple sizes of a single page quickly, it doesn't address doing pages in parallel. So, here is a way of doing it using GNU Parallel which is available for free for OS X (using homebrew), installed on most Linux distros and also available for Windows - if you really must.
The code looks like this:
#!/bin/bash
shopt -s nullglob
shopt -s nocaseglob

doPage(){
    # Expecting filename as first parameter and page number as second
    # echo DEBUG: File: $1 Page: $2
    noexten=${1%.*}    # strip only the final extension
    convert -density 400 -quality 100 "$1[$2]" \
        -resize 1310x650 -write "${noexten}-p-$2-large.jpg" \
        -resize 230x160  -write "${noexten}-p-$2-med.jpg" \
        -resize 170x117  "${noexten}-p-$2-small.jpg"
}
export -f doPage

# First, get list of all PDF documents
for d in *.pdf; do
    # Now get number of pages in this document - "pdfinfo" is probably quicker
    p=$(identify "$d" | wc -l)
    for ((i=0; i<p; i++)); do
        echo "$d:$i"
    done
done | parallel --eta --colsep ':' doPage {1} {2}
If you want to see how it works, remove the | parallel .... from the last line and you will see that the preceding loop just echoes a list of filenames and page numbers, which is what gets piped into GNU Parallel. By default it runs one process per CPU core; add -j 8 if you want, say, 8 processes to run in parallel. Remove --eta if you don't want any updates on when the command is likely to finish.
In the comment I allude to pdfinfo being faster than identify, if you have that available (it's part of the poppler package under homebrew on OS X), then you can use this to get the number of pages in a PDF:
pdfinfo SomeDocument.pdf | awk '/^Pages:/ {print $2}'
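The page count can feed the same filename:page list that the loop in the script produces. The following is only a minimal sketch, assuming pdfinfo prints a "Pages:" line as described; the helper names page_count and page_jobs are made up for illustration and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original answer.

# Extract the page count from pdfinfo-style output read on stdin.
page_count() {
    awk '/^Pages:/ {print $2}'
}

# Print one "file:page" line per page. Pages are zero-based because
# ImageMagick's "file.pdf[N]" selector counts from 0.
page_jobs() {
    local file=$1 pages=$2 i
    for ((i = 0; i < pages; i++)); do
        printf '%s:%d\n' "$file" "$i"
    done
}

# Usage (needs pdfinfo from poppler), producing input for GNU Parallel:
#   page_jobs doc.pdf "$(pdfinfo doc.pdf | page_count)"
```

The output has the same shape as the echo in the loop, so it could be piped into the same parallel --colsep ':' command.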
Original Answer
Something along these lines so you only read it in once and then generate successively smaller images from the largest one:
convert -density 400 -quality 100 x.pdf \
-resize 1310x650 -write large.jpg \
-resize 230x160 -write medium.jpg \
-resize 170x117 small.jpg
Unless you mean you have, say, a 50 page PDF, and you want to do all 50 pages in parallel. If you do, say so, and I'll show you that using GNU Parallel when I get up in 10 hours...

Related

ffmpeg resize large image and high resolution

I tried to resize a very big image (457 MB, 21600x21600) with the following command:
ffmpeg -i test.png -vf scale=320:-1 out.png
but it fails with the error "Picture size 21600x21600 is invalid". How can I find out the largest resolution ffmpeg supports? Is there a way to resize this high-resolution image with ffmpeg?
If you want to use ImageMagick it is included in most Linux distros and is available for macOS and Windows.
Your command becomes:
convert test.png -resize 320x result.png
If you are running v7 or newer, use:
magick test.png -resize 320x result.png
If you have lots to do, and you want all the resized images written in a directory called thumbs you can use:
mkdir thumbs
magick mogrify -path thumbs -resize 320x *.png
Alternatively, you may find vips is a lighter-weight installation and does a faster conversion using less memory:
mkdir thumbs
vipsthumbnail -s 320 -o "thumbs/%s.png" image.png

Batch append images in groups of two with Imagemagick

I have a directory of images and need to merge those images horizontally in groups of two, then save the output of each to a new image file:
image-1.jpeg
image-2.jpeg
image-3.jpeg
image-4.jpeg
image-5.jpeg
image-6.jpeg
Using Imagemagick via command line, is there a way to loop through every other image in a directory and run magick convert image-1.jpeg image-2.jpeg +append image-combined-*.jpg?
So the result would be combined pairs of images:
image-1.jpeg image-2.jpeg -> image-combined-1.jpg
image-3.jpeg image-4.jpeg -> image-combined-2.jpg
image-5.jpeg image-6.jpeg -> image-combined-3.jpg
Get them all appended succinctly and in parallel with GNU Parallel and actually use all those lovely CPU cores you paid Intel for!
parallel -N2 convert {1} {2} +append combined-{#}.jpeg ::: *jpeg
where:
-N2 says to take two files at a time
{1} and {2} are the first two parameters
{#} is the sequential job number, and
::: demarcates the start of the parameters
If your CPU has 8 cores, GNU Parallel will run 8 converts at once, unless you specify say 4 jobs at a time by adding -j4.
If you are learning and just finding your way with GNU Parallel add:
--dry-run so you can see what it would do without actually doing anything
-k to keep the outputs in order
So, I mean:
parallel --dry-run -k -N2 convert {1} {2} +append combined-{#}.jpeg ::: *jpeg
Sample Output
convert image-1.jpeg image-2.jpeg +append combined-1.jpeg
convert image-3.jpeg image-4.jpeg +append combined-2.jpeg
convert image-5.jpeg image-6.jpeg +append combined-3.jpeg
On macOS, you can simply install GNU Parallel with:
brew install parallel
If you have thousands, or hundreds of thousands of files, you may run into an error Argument list too long - although this is pretty rare on macOS because the limit is 262,144 characters:
sysctl -a kern.argmax
kern.argmax: 262144
If that happens, you can use this syntax to pipe the filenames into GNU Parallel instead:
find /somewhere -iname "*.jpeg" -print0 | parallel -0 -N2 convert {1} {2} +append combined-{#}.jpeg
If the images are all the same size and orientation, and if your system has the memory to read in all the images in the directory, it can be done as simply as this...
magick *.jpeg -set option:doublewide "%[fx:w*2]" \
    +append +repage -crop "%[doublewide]x%[h]" +repage image-combined-%02d.jpg
This can be scripted easily using ImageMagick. I could show you how in Unix. But if you have more than 9 images, then you may have to rename with leading zeros, since alphabetically image-10 will come before image-2. You do not mention your IM version or platform and scripting will differ depending upon OS.
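The leading-zero renaming can be sketched in bash as follows. This is an illustration under assumptions, not a tested recipe for your files: it assumes names of the form prefix-NUMBER.ext (as in image-2.jpeg), and the helper name padded_name is hypothetical. The loop only prints the mv commands so nothing is renamed by accident.

```shell
#!/bin/bash
# Hypothetical helper: zero-pad the trailing number in names like image-2.jpeg.
# The second argument is the pad width (default 2).
padded_name() {
    local name=$1 width=${2:-2}
    local re='^(.*-)([0-9]+)(\.[^.]+)$'
    if [[ $name =~ $re ]]; then
        # 10# forces base 10 so that "08" is not read as invalid octal.
        printf '%s%0*d%s\n' "${BASH_REMATCH[1]}" "$width" \
            "$((10#${BASH_REMATCH[2]}))" "${BASH_REMATCH[3]}"
    else
        printf '%s\n' "$name"    # no trailing number: leave unchanged
    fi
}

# Dry run: print the renames instead of performing them.
for f in image-2.jpeg image-10.jpeg; do
    new=$(padded_name "$f")
    [ "$f" = "$new" ] || echo mv -- "$f" "$new"
done
```

After renaming, image-02 sorts before image-10, so a plain glob yields the files in numeric order.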
Here is a Unix solution. I have images rose-01.jpg ... rose-06.jpg in folder test on my desktop (Mac OSX). Each image has a label under it with its filename so we can keep track of the files.
cd
cd desktop/test
arr=(`ls *.jpg`)
num=${#arr[*]}
for ((i=0; i<num; i=i+2)); do
j=$((i+1))
k=$((i+2))
magick ${arr[$i]} ${arr[$j]} +append newimage_${j}_${k}.jpg
done
Note that arrays start with index 0. So I use j=i+1 and k=i+2 for the images that correspond to 1,2 3,4 5,6 in the filenames from ls in the array.
The result is (newimage_1_2.jpg, newimage_3_4.jpg, newimage_5_6.jpg)
An alternate solution is to montage all the images together two-by-two as an array of 2x3 and then equally crop them into 3 sections vertically. So in ImageMagick, this also works since these images are all the same size.
cd
cd desktop/test
arr=(`ls *.jpg`)
num=${#arr[*]}
num2=`magick xc: -format "%[fx:ceil($num/2)]" info:`
magick montage ${arr[*]} -tile 2x -geometry +0+0 miff:- | magick - -crop 1x3@ +repage newimage.jpg
The results are: newimage-0.jpg, newimage-1.jpg, newimage-2.jpg
Ole Tang wrote:
Fails on filenames like My summer photo.jpg
So here is the solution using ImageMagick as modified from my original post.
Images:
rose 1.png
rose 2.png
rose 3.png
rose 4.png
rose 5.png
rose 6.png
OLDIFS=$IFS
IFS=$'\n'
arr=(`ls *.png`)
for ((i=0;i<6;i++)); do
    echo "${arr[$i]}"
done
IFS=$OLDIFS
num=${#arr[*]}
for ((i=0; i<num; i=i+2)); do
    j=$((i+1))
    k=$((i+2))
    magick "${arr[$i]}" "${arr[$j]}" +append newimage_${j}_${k}.jpg
done
This produces:
newimage_1_2.jpg
newimage_3_4.jpg
newimage_5_6.jpg
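Parsing ls output and switching IFS can be avoided altogether: a glob expanded into a bash array keeps each filename as one element, spaces included. The following is a minimal sketch of that idea, assuming bash; the helper name append_pairs is hypothetical, and it prints the magick commands rather than running them so the pairing can be checked first.

```shell
#!/bin/bash
# Hypothetical helper: print one "magick A B +append OUT" command
# for each consecutive pair of arguments.
append_pairs() {
    local n=1
    while (( $# >= 2 )); do
        # %q shell-quotes the names so the printed command is copy-paste safe.
        printf 'magick %q %q +append newimage_%d_%d.jpg\n' \
            "$1" "$2" "$n" "$((n + 1))"
        shift 2
        n=$((n + 2))
    done
    # A leftover unpaired file is reported rather than silently dropped.
    if (( $# == 1 )); then
        printf 'unpaired: %s\n' "$1" >&2
    fi
    return 0
}

shopt -s nullglob
files=(*.png)               # "rose 1.png" stays a single array element
append_pairs "${files[@]}"
```

Once the printed commands look right, they could be piped to bash to execute them.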

Resize indexed PNG image with ImageMagick while preserving color map

I am using a custom batch script to make resized copies (33% and 66%) of all PNG images in a folder. Here is my code:
for f in $(find /myFolder -name '*.png');
do
sudo cp -a $f "${f/%.png/-3x.png}";
sudo convert $f -resize 66.67% "${f/%.png/-2x.png}";
sudo convert $f -resize 33.33% $f;
done
It works fine, except when the original image is indexed. In that case the smaller version of the image is RGB (so its file size is even larger than the original's).
I have tried several variants, but none worked. The one I expected to sort this out was the following:
for f in $(find /myFolder -name '*.png');
do
sudo cp -a $f "${f/%.png/-3x.png}";
sudo convert $f -define png:preserve-colormap -resize 66.67% "${f/%.png/-2x.png}";
sudo convert $f -define png:preserve-colormap -resize 33.33% $f;
done
But it doesn't work.
EDIT:
This is the updated code, but it still doesn't work as it is supposed to (see the attached image: left is the original, right is the resized version):
for f in $(find /myFolder -name '*.png');
do
sudo cp -a $f "${f/%.png/-3x.png}";
numberOfColors=`identify -format "%k" $f`
convert "$f" \
\( +clone -resize 66.67% -colors $numberOfColors -write "${f/%.png/-2x.png}" +delete \) \
-resize 33.33% -colors $numberOfColors "$f"
done
Original image:
Scaled version:
Use "-sample" instead of "-resize" to preserve the color set. This causes the resizing to be done by nearest-neighbor color selection rather than any kind of interpolation.
Otherwise, the colormap ends up with more than 256 colors and the png encoder can't preserve it, due to the 256-color limit on the size of a PNG PLTE chunk. I cannot guarantee that you'll like the appearance of the result, though.
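As a rough illustration of the -sample suggestion, the two resizing steps from the question could look like this. It is a sketch only: the wrapper name sample_copies is hypothetical, and it assumes the same 66.67% and 33.33% scales and the same -2x naming used in the question.

```shell
#!/bin/bash
# Hypothetical wrapper: make the -2x copy and shrink the original in place,
# using nearest-neighbour sampling so that no new colours are introduced.
sample_copies() {
    local f=$1
    convert "$f" -sample 66.67% "${f/%.png/-2x.png}"
    convert "$f" -sample 33.33% "$f"
}

# Usage, after the original has been copied to its -3x name:
#   sample_copies "/myFolder/icon.png"
```

Because -sample only picks existing pixels, the output cannot contain colours that were not in the input palette.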
Also, be sure you are using a recent version of ImageMagick.
I'm not observing this problem with the current release (6.9.3-7). Your script works fine and produces clean -2x and -3x images.
There are several things to address here...
find vs glob
You say you want to process all files in a folder, then you use find which will search down into sub-directories as well. If you just want to process files in the current directory, you can let bash do the globbing directly for you. So, instead of
for f in $(find . -name "*.png"); do
you can just do:
shopt -s nullglob
for f in *.png; do
Performance
You run convert twice and load the original image twice, and that is not very efficient. You can run a single process that loads a single image and resizes to two different sizes and writes both to disk. So, instead of
for ...; do
convert ...
convert ...
done
you can write the following to start one convert, read the image once, clone it in memory and write it out, delete the spare copy in memory and then resize the original image and re-save that.
for ...; do
convert "$f" \
\( +clone -resize 66.67% -write "${f/%.png/-2x.png}" +delete \) \
-resize 33.33% "$f"
done
Palette
It seems you actually only want to output palettised (indexed) images with "any" colormap rather than with a "specific" colormap. Glenn's answer is perfect if you want to retain a specific colormap. However, if any colormap is ok, you can use -colors to reduce the colours in the resulting image to a level where the PNG library can make the decision to create a palettised image. Glenn knows a lot more than me about that as he wrote it! However, I think if you reduce the colours to 250 (or so) you will probably get a 256 entry colormap and if you reduce the colours to around 60 or so, you will get a 64 entry colourmap. So, you would do:
shopt -s nullglob
for f in *.png; do
sudo cp ... ...
convert "$f" \
\( +clone -resize 66.67% -colors 250 -write "${f/%.png/-2x.png}" +delete \) \
-resize 33.33% -colors 250 "$f"
done
You can try experimenting with other numbers of colours and see how that affects filesize - the number you need will depend on your images.

resize images in a folder with imagemagick and move to another with specific name for file

I need to take the files from the folder Images, resize them, and write them to the folder test with a new name:
image.jpg -> image_big.jpg
This works great:
mogrify -resize 200 -path images/../test/ images/*.*
But when I try to change the file name like this:
mogrify -resize 200 -format %t_big.%e -path images/../test/ images/*.*
I get a file name like image._big.big
I also tried convert, but I will have 3000 images and I have read that, unlike mogrify, it loads them all into RAM:
convert images/*.jpg -resize 200 images/../test/
What can I do?
For mogrify you can use the -set and -format options.
mogrify -resize 200x -set filename:f "big.%e" \
-format "%[filename:f]" -path test/ images/*
This will give destination filenames of "image.big.jpg"
For convert, use bash to iterate over all 3000 files:
for infile in images/*.jpg
do
    convert "$infile" -resize 200x -set filename:f "%t_big.%e" "test/%[filename:f]"
done
This will give you filenames like "image_big.jpg"
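The output name can also be built in the shell itself with parameter expansion, which avoids relying on the filename: escapes. This is a minimal sketch under assumptions: the helper names big_name and resize_all are hypothetical, and the 200x width and the .jpg pattern are taken from the commands above.

```shell
#!/bin/bash
# Hypothetical helper: images/image.jpg -> image_big.jpg
big_name() {
    local base=${1##*/}      # drop the directory part
    local stem=${base%.*}    # drop the final extension
    local ext=${base##*.}    # keep the final extension
    printf '%s_big.%s\n' "$stem" "$ext"
}

# Hypothetical helper: resize every .jpg in $1 into $2, one convert per file,
# so only one image is held in memory at a time.
resize_all() {
    local src=$1 dst=$2 f
    for f in "$src"/*.jpg; do
        [ -e "$f" ] || continue    # the glob matched nothing
        convert "$f" -resize 200x "$dst/$(big_name "$f")"
    done
}

# Usage:
#   resize_all images test
```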

ImageMagick fails on php but works in shell

I've this command:
/usr/local/bin/convert -density 200 /singlePage.pdf -colorspace RGB -verbose -geometry 1155 -quality 10 -limit area 100mb singlePicture.jpg
When I execute it from PHP (via the browser, using the exec() function), it produces no output.
When I execute the same command in a shell, it works perfectly.
I tried another PDF file, which works from both PHP and the shell. The only difference is the file size:
1,0806 MB => Works
1,0962 MB => Does not work
Any ideas?
So this:
/usr/local/bin/convert -density 200 /singlePage.pdf -colorspace RGB -verbose -geometry 1155 -quality 10 -limit area 100mb singlePicture.jpg
implies that the singlePage.pdf file is located on the root of your filesystem. I doubt that is true. My guess is the "/singlePage.pdf" path is wrong.

Resources