ImageMagick convert tif to png or jpeg producing larger images

I have a lot of TIF images I would like to convert to a web-friendly format. I've been playing a bit with ImageMagick (and a couple of other libraries) to try to convert them, but the resulting image is a lot larger than the original, uncompressed image.
For example:
1.tif: 348.2 kB
1.png: 781.7 kB
1.jpg: 429.1 kB
2.tif: 49.8 kB
2.png: 76.2 kB
2.jpg: 900.4 kB
3.tif: 7.7 kB
3.png: 21.4 kB
3.jpg: 191.3 kB
The command I'm using:
convert *.tif -set filename: "%t" %[filename:].jpg
I'm not an expert, but I don't understand how going from an uncompressed source image to a compressed one makes the size explode. Any idea what is happening?
Edit: After running the proposed command
identify -format "%f: size: %B, compression: %C\n" *png *tif *jpg
I get the following output
00000001.png: size: 104522, compression: Zip
00000002.png: size: 23565, compression: Zip
00000003.png: size: 58936, compression: Zip
00000001.tif: size: 74122, compression: Group4
00000002.tif: size: 10946, compression: Group4
00000003.tif: size: 29702, compression: Group4
00000001.jpg: size: 1011535, compression: JPEG
00000002.jpg: size: 226068, compression: JPEG
00000003.jpg: size: 457045, compression: JPEG

It's hard to say what is happening without seeing your images - not all software uses the same compression method or quality, and some formats, such as PNG, are lossless. (Note that your identify output above shows the TIFFs use Group4 compression - CCITT fax compression designed for 1-bit black-and-white images - which is extremely compact for scanned text, so 8-bit PNG or JPEG versions will almost always come out larger.)
As a first attempt, try running this command to check the filename, file size and compression type of all your images:
identify -format "%f: size: %B, compression: %C\n" *png *tif *jpg
Sample Output
zHZB9.png: size: 1849, compression: Zip
result.tif: size: 290078, compression: None
z.tif: size: 213682, compression: LZW
sky.jpg: size: 88162, compression: JPEG
z.jpg: size: 8122, compression: JPEG
Now that you can see the numbers, you can decide what you want to address. To get a list of all the compression types, you can use:
identify -list compress
Sample Output
B44A
B44
BZip
DWAA
DWAB
DXT1
DXT3
DXT5
Fax
Group4
JBIG1
JBIG2
JPEG2000
JPEG
LosslessJPEG
Lossless
LZMA
LZW
None
Piz
Pxr24
RLE
RunlengthEncoded
WebP
ZipS
Zip
Zstd
Now, you can experiment, e.g. by making some TIFFs with different settings:
convert -size 1024x768 gradient: 1.tif
convert -size 1024x768 gradient: -compress lzw 2.tif
convert -size 1024x768 gradient: -compress jpeg 3.tif
convert -size 1024x768 gradient: -compress jpeg -quality 40 4.tif
Now check:
1.tif: size: 1573138, compression: None
2.tif: size: 6316, compression: LZW
3.tif: size: 17520, compression: JPEG
4.tif: size: 9830, compression: JPEG
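Incidentally, if your TIFFs really are 1-bit black-and-white scans, as their Group4 compression suggests, you can keep the PNGs much smaller by not letting them get promoted to 8-bit. A minimal sketch, assuming ImageMagick's convert is on your PATH and using the filenames from the question:
import subprocess
# -type bilevel asks ImageMagick to keep the output 1-bit,
# rather than writing 8-bit grey/RGB PNGs
for name in ['1', '2', '3']:
    subprocess.run(['convert', f'{name}.tif', '-type', 'bilevel', f'{name}.png'], check=True)
A JPEG, by contrast, is always lossy 8-bit and will never compete with Group4 on scanned black-and-white text.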

Related

How to overlay sequence of frames on video using ffmpeg-python?

I tried the code below, but it is only showing the background video.
background_video = ffmpeg.input("input.mp4")
overlay_video = ffmpeg.input(f'{frames_folder}*.png', pattern_type='glob', framerate=25)
subprocess = ffmpeg.overlay(
    background_video,
    overlay_video,
).filter("setsar", sar=1)
I also tried assembling the sequence of frames into a .webm/.mov video, but transparency is lost; the video uses black as the background.
P.S. The frame size is the same as the background video size, so no scaling is needed.
Edit
I tried @Rotem's suggestions:
Try using single PNG image first
overlay_video = ffmpeg.input('test-frame.png')
It's not working for frames generated by OpenCV, but it works for any other PNG image. This is weird; when I manually view the frames folder, it shows blank images (Link to my frames folder).
But if I convert these frames into a video (see below), it correctly shows what I drew on each frame.
output_options = {
    'crf': 20,
    'preset': 'slower',
    'movflags': 'faststart',
    'pix_fmt': 'yuv420p'
}
ffmpeg.input(f'{frames_folder}*.png', pattern_type='glob', framerate=25, reinit_filter=0).output(
    'movie.avi',
    **output_options
).global_args('-report').run()
try creating a video from all the PNG images without overlay
It's working as expected; the only issue is transparency. Is there a way to create a video with a transparent background? I tried .webm/.mov/.avi but no luck.
Add .global_args('-report') and check the log file
Report written to "ffmpeg-20221119-110731.log"
Log level: 48
ffmpeg version 5.1 Copyright (c) 2000-2022 the FFmpeg developers
built with Apple clang version 13.1.6 (clang-1316.0.21.2.5)
configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/5.1 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox --enable-neon
libavutil 57. 28.100 / 57. 28.100
libavcodec 59. 37.100 / 59. 37.100
libavformat 59. 27.100 / 59. 27.100
libavdevice 59. 7.100 / 59. 7.100
libavfilter 8. 44.100 / 8. 44.100
libswscale 6. 7.100 / 6. 7.100
libswresample 4. 7.100 / 4. 7.100
libpostproc 56. 6.100 / 56. 6.100
Input #0, image2, from './frames/*.png':
Duration: 00:00:05.00, start: 0.000000, bitrate: N/A
Stream #0:0: Video: png, rgba(pc), 1920x1080, 25 fps, 25 tbr, 25 tbn
Codec AVOption crf (Select the quality for constant quality mode) specified for output file #0 (movie.avi) has not been used for any stream. The most likely reason is either wrong type (e.g. a video option with no video streams) or that it is a private option of some encoder which was not actually used for any stream.
Codec AVOption preset (Configuration preset) specified for output file #0 (movie.avi) has not been used for any stream. The most likely reason is either wrong type (e.g. a video option with no video streams) or that it is a private option of some encoder which was not actually used for any stream.
Stream mapping:
Stream #0:0 -> #0:0 (png (native) -> mpeg4 (native))
Press [q] to stop, [?] for help
Output #0, avi, to 'movie.avi':
Metadata:
ISFT : Lavf59.27.100
Stream #0:0: Video: mpeg4 (FMP4 / 0x34504D46), yuv420p(tv, progressive), 1920x1080, q=2-31, 200 kb/s, 25 fps, 25 tbn
Metadata:
encoder : Lavc59.37.100 mpeg4
Side data:
cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: N/A
frame= 125 fps= 85 q=31.0 Lsize= 491kB time=00:00:05.00 bitrate= 804.3kbits/s speed=3.39x
video:482kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.772174%
To draw the frames I used the code below.
for i in range(num_frames):
    transparent_img = np.zeros((height, width, 4), dtype=np.uint8)
    cv2.line(transparent_img, (x1, y1), (x2, y2), (255, 255, 255), thickness=1, lineType=cv2.LINE_AA)
    self.frames.append(transparent_img)

## To save each frame of the video in the given folder
for i, f in enumerate(frames):
    cv2.imwrite("{}/{:0{n}d}.png".format(path_to_frames, i, n=num_digits), f)
Here are answers to your two questions:
For drawing a white line on BGRA image, use (255, 255, 255, 255) color instead of (255, 255, 255).
The last 255 is the alpha (transparency) channel value, and 255 makes the line fully opaque.
For creating a video with a transparent background, try the .webm file type, the libvpx-vp9 video codec, and -pix_fmt yuva420p - the a in yuva adds the alpha (transparency) channel.
Here is a "self contained" code sample (please read the comments):
import cv2
import numpy as np
import ffmpeg

# Create a synthetic MP4 video file for testing
ffmpeg.input('testsrc=size=192x108:rate=1:duration=10', f='lavfi').output('tmp.mp4').overwrite_output().run()

width, height, fps = 192, 108, 1

def make_sample_image(i):
    p = width//60
    img = np.zeros((height, width, 4), np.uint8)  # Fully transparent
    cv2.putText(img, str(i), (width//2-p*10*len(str(i)), height//2+p*10), cv2.FONT_HERSHEY_DUPLEX, p, (255, 255, 255, 255), p*2)  # White number
    return img

# Create 10 PNG files with a transparent background and a white number (counter).
for i in range(1, 11):
    transparent_img = make_sample_image(i)
    cv2.imwrite(f'{i:03d}.png', transparent_img)

output_options = {
    'vcodec': 'libvpx-vp9',        # libvpx-vp9 supports transparency.
    'crf': 20,
    #'preset': 'slower',           # Not supported by libvpx-vp9
    #'movflags': 'faststart',      # Not supported by WebM
    'pix_fmt': 'yuva420p'          # yuva420p includes transparency.
}

frames_folder = './'

# Create the video with transparency:
# reinit_filter=0 is required only if the PNG images have different characteristics (example: some are RGB and some RGBA).
# Use %03d.png instead of a glob pattern, because my Windows version of FFmpeg doesn't support glob patterns.
ffmpeg.input(f'{frames_folder}%03d.png', framerate=fps, reinit_filter=0).output(
    'movie.webm',  # The WebM container supports transparency
    **output_options
).global_args('-report').overwrite_output().run()

# Overlay the PNG images on top of tmp.mp4
background_video = ffmpeg.input("tmp.mp4")
overlay_video = ffmpeg.input(f'{frames_folder}%03d.png', framerate=fps)
subprocess = ffmpeg.overlay(
    background_video,
    overlay_video,
).filter("setsar", sar=1)
subprocess.output('overlay_video.webm', **output_options).global_args('-report').overwrite_output().run()
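If you want to confirm that the transparency actually survived, you can inspect the output stream (a quick check, assuming ffprobe is on your PATH and using the movie.webm written above):
import subprocess
# The video stream of movie.webm should report pix_fmt yuva420p -
# the "a" is the alpha (transparency) channel
subprocess.run(['ffprobe', '-v', 'error', '-select_streams', 'v:0',
                '-show_entries', 'stream=pix_fmt', '-of', 'csv=p=0',
                'movie.webm'], check=True)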

FFMPEG - Extract color-accurate image sequence from UHD MOV in bt.2020

I'm trying to use FFmpeg to extract an image sequence from a MOV that comes from an iPhone 13 with bt2020:
Metadata:
major_brand : qt
minor_version : 0
compatible_brands: qt
creation_time : 2021-10-11T16:12:07.000000Z
com.apple.quicktime.make: Apple
com.apple.quicktime.model: iPhone 13 Pro
com.apple.quicktime.software: 15.0.1
com.apple.quicktime.creationdate: 2021-10-11T18:12:07+0200
Duration: 00:00:07.13, start: 0.000000, bitrate: 8693 kb/s
Stream #0:0(und): Video: hevc (Main 10) (hvc1 / 0x31637668), yuv420p10le(tv, bt2020nc/bt2020/arib-std-b67), 1920x1080, 8472 kb/s, 29.99 fps, 30 tbr, 600 tbn, 600 tbc (default)
But I haven't been able to get a color match on any extracted images. I have tried following a number of posts, a few on JPGs (most said it would have to be PNG) and a few on PNG. But no matter what I try, I end up with a lifted, less-saturated image. I can only use JPG or PNG in this case - am I just stuck with non-accurate color extraction from HDR, or is there something I am missing here?
I have tried playing with the scale and colormatrix commands, trying different options to go from bt2020 to bt709 (though I'm not sure this is the correct approach, or whether I mis-formatted something). I also tried a suggestion from another post to get a 10-bit PNG out, but ended up with exactly the same color issue as just extracting a standard JPEG:
ffmpeg -i img_2106.mov -vsync 0 -f image2 test.%04d.png
and many other variations of options.
Anyone have any suggestions?
As added info, I'm using:
ffmpeg version 4.4 Copyright (c) 2000-2021 the FFmpeg developers
built with Apple clang version 12.0.5 (clang-1205.0.22.9)
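One approach that is widely suggested for this kind of bt2020/HLG to bt709 extraction (a sketch only, not a verified fix for this exact footage - it assumes an FFmpeg build with the zscale/libzimg filter, and hable is just one of several tone-mapping curves to try) is to linearize, tone-map, then convert to bt709 before writing the PNGs:
import subprocess
# Linearize the bt2020/HLG input, tone-map the highlights into SDR range,
# then convert/tag everything as bt709 for the PNG output
vf = ('zscale=t=linear:npl=100,format=gbrpf32le,'
      'zscale=p=bt709,tonemap=hable,'
      'zscale=t=bt709:m=bt709:r=tv,format=yuv420p')
subprocess.run(['ffmpeg', '-i', 'img_2106.mov', '-vsync', '0',
                '-vf', vf, 'test.%04d.png'], check=True)
Even then, an exact perceptual match is unlikely: mapping HDR into an 8-bit SDR format is inherently lossy, so some lift or desaturation may remain whatever settings you choose.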

When reading 4-channel tif file, value different from SKIMAGE, TIFFFILE and so on

I know OpenCV uses BGR order, but in my experiment not only the order but also the values are totally messed up.
import cv2 as cv
import tifffile as tiff
import skimage.io
img_path = r"C:\test\pics\t100r50s16_1_19.tif"
c = cv.imread(img_path,cv.IMREAD_UNCHANGED)
t = tiff.imread(img_path)
s = skimage.io.imread(img_path)
print("c:", c.shape, "t:", t.shape, "s:", s.shape)
print("c:", c.dtype, "t:", t.dtype, "s:", s.dtype)
print(c[0, 0], c[1023, 0], c[0, 1023], c[1023, 1023])
print(t[0, 0], t[1023, 0], t[0, 1023], t[1023, 1023])
print(s[0, 0], s[1023, 0], s[0, 1023], s[1023, 1023])
print(c.sum())
print(t.sum())
print(s.sum())
And the output looks like this:
c: (1024, 1024, 4) t: (1024, 1024, 4) s: (1024, 1024, 4)
c: uint8 t: uint8 s: uint8
[ 50 63 56 182] [131 137 140 193] [29 28 27 94] [123 130 134 190]
[ 79 88 70 182] [185 181 173 193] [74 77 80 94] [180 174 165 190]
[ 79 88 70 182] [185 181 173 193] [74 77 80 94] [180 174 165 190]
# It seems that OpenCV only reads the alpha channel correctly;
# the values of the first three channels are much different from the other packages
539623146
659997127
659997127
The image I used can be downloaded here. So, here is my question: how does OpenCV handle 4-channel TIFF files? When I test on a 3-channel image, everything looks alright.
I don't buy it for a minute that there is a rounding error or some error related to JPEG decoding, as the linked article suggests.
Firstly, because your image is integer, specifically uint8, so there is no rounding of floats; and secondly, because the compression of your TIF image is not JPEG - in fact there is no compression. You can see that for yourself if you use ImageMagick and do:
identify -verbose a.tif
or if you use tiffinfo that ships with libtiff, like this:
tiffinfo -v a.tif
So, I did some experiments by generating sample images with ImageMagick like this:
# Make 8x8 pixel TIF full of RGBA(64,128,192) with full opacity
convert -depth 8 -size 8x8 xc:"rgba(64,128,192,1)" a.tif
# Make 8x8 pixel TIFF with 4 rows per strip
convert -depth 8 -define tiff:rows-per-strip=4 -size 8x8 xc:"rgba(64,128,192,1)" a.tif
OpenCV was able to read all of those correctly. However, when I did the following, it went wrong.
# Make 8x8 pixel TIFF with RGB(64,128,192) with 50% opacity
convert -depth 8 -define tiff:rows-per-strip=1 -size 8x8 xc:"rgba(64,128,192,0.5)" a.tif
And the values came out in OpenCV as 32, 64, 96 - yes, exactly HALF the correct values - like OpenCV is pre-multiplying the alpha. So I tried with an opacity of 25% and the values came out at 1/4 of the correct ones. So, I suspect there is a bug in OpenCV that premultiplies the alpha.
If you look at your values, you will see that tifffile and skimage read the first pixel as:
[ 79 88 70 182 ]
if you look at the alpha of that pixel, it is 0.713725 (182/255), and if you multiply each of those values by that, you will get:
[ 50 63 56 182 ]
which is exactly what OpenCV did.
As a workaround, I guess you could divide by the alpha to scale correctly.
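A minimal sketch of that divide-by-alpha workaround (expect occasional off-by-one values, because the premultiplication already rounded):
import cv2
import numpy as np

img = cv2.imread('a.tif', cv2.IMREAD_UNCHANGED)     # BGRA, uint8, premultiplied by OpenCV
alpha = img[..., 3:4].astype(np.float64)
colour = img[..., :3].astype(np.float64)
# Un-premultiply: scale the colour channels back up by 255/alpha where alpha > 0
scale = 255.0 / np.maximum(alpha, 1.0)              # np.maximum avoids division by zero
img[..., :3] = np.clip(np.rint(np.where(alpha > 0, colour * scale, colour)), 0, 255).astype(np.uint8)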
In case the argument is that OpenCV intentionally pre-multiplies the alpha, that raises the question of why it does that for TIFF files but NOT for PNG files:
# Create 8x8 PNG image full of rgb(64,128,192) with alpha=0.5
convert -depth 8 -size 8x8 xc:"rgba(64,128,192,0.5)" a.png
Check with OpenCV:
import cv2
c = cv2.imread('a.png',cv2.IMREAD_UNCHANGED)
In [4]: c.shape
Out[4]: (8, 8, 4)
In [5]: c
Out[5]:
array([[[192, 128, 64, 128],
[192, 128, 64, 128],
...
...
In case anyone thinks that the values in the TIF file are as OpenCV reports them, I can only say that I wrote rgb(64,128,192) at 50% opacity, and I tested each of the following and found that they all agree (with the sole exception of OpenCV) that that is exactly what the file contains:
ImageMagick v7
libvips v8
Adobe Photoshop CC 2017
PIL/Pillow v5.2.0
GIMP v2.8
scikit-image v0.14
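If you want to reproduce that cross-check yourself, reading the file with one of the agreeing libraries takes only a couple of lines (a sketch assuming Pillow is installed and a.tif is the 50%-opacity test image from above):
from PIL import Image
import numpy as np

img = np.array(Image.open('a.tif'))   # RGBA, straight (non-premultiplied) values
print(img[0, 0])                      # expect [ 64 128 192 128]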

Software/tool to generate R-G-B values of every pixel from an image and vice-versa

Is there a software/tool that can generate a matrix of RGB values from a simple raw 8-bit RGB image?
Also, is there a software/tool that can generate an image from a given matrix of RGB values?
Thank you.
PS:
i) I am aware that this can be done using Matlab. I am looking for a tool other than Matlab that can do it.
ii) I am aware of the existing question about doing similar things programmatically. I need a software tool, if there is one, that can do this task.
I would suggest you use the venerable NetPBM, which is available for Linux, macOS and Windows. Alternatively, you could use ImageMagick, but that is a much heavier-weight installation - see later.
NetPBM Method - see Wikipedia NetPBM entry
So, let's start with a raw, 8-bit RGB file that contains a red, a green and a blue pixel:
-rw-r--r-- 1 mark staff 9 10 Oct 07:47 rgb888.bin
As you can see, it has 9 bytes. Let's look at them:
xxd -g3 rgb888.bin
00000000: ff0000 00ff00 0000ff
Now, if we want that image as a matrix of legible values:
rawtoppm -plain 3 1 rgb888.bin
Sample Output
P3
3 1
255
255 0 0 0 255 0 0 0 255
where:
-plain means to display in ASCII rather than binary
P3 tells us it is colour and ASCII
3 1 tells us its dimensions are 3 pixels wide by 1 pixel high
255 is the maxval, which essentially tells us it is 8-bit (65535 would mean 16-bit)
the last row is the pixels
Converting back to binary is a little harder; let's assume we start with a PPM file created like this:
rawtoppm -plain 3 1 rgb888.bin > image.ppm
So, we can get the binary version like this:
ppmtoppm < image.ppm | tail -c 9 > rgb888.bin
and look at it with:
xxd -g3 rgb888.bin
00000000: ff0000 00ff00 0000ff
ImageMagick Method
# Convert binary RGB888 to text
convert -depth 8 -size 3x1 RGB:rgb888.bin txt:
Sample Output
# ImageMagick pixel enumeration: 3,1,65535,srgb
0,0: (65535,0,0) #FF0000 red
1,0: (0,65535,0) #00FF00 lime
2,0: (0,0,65535) #0000FF blue
Or, slightly different appearance:
# Convert binary RGB888 to matrix
convert -depth 8 -size 3x1 RGB:rgb888.bin -compress none ppm:
Sample Output
P3
3 1
255
255 0 0 0 255 0 0 0 255
And now going the other way, PPM to binary
# Convert PPM image to binary
convert image.ppm rgb:image.bin
# Check how the binary looks
xxd -g 3 image.bin
00000000: ff0000 00ff00 0000ff .........
Plain dump method
Maybe you are happy with a plain dump from od:
od -An -t u1 rgb888.bin
Sample Output
255 0 0 0 255 0 0 0 255

24 bit bmp to RGB565 file conversion

I want to convert 24-bit BMP files to RGB565 format to write to a serial TFT colour display.
The size of the 24-bit BMP will always be 320x240 pixels, as my TFT display is 320x240.
Has anyone any experience of doing this? It could be C/C++, shell, Python, JavaScript and so on...
I would use either NetPBM (much smaller and lighter-weight) or ImageMagick (much bigger installation) to convert the BMP into a format that is simple to parse and then use Perl to convert that to RGB565 format.
I assume you are planning to write the RGB565 data to a frame buffer, so you would do something like:
./bmp2rgb565 image.bmp > /dev/fb1
So, save the following as bmp2rgb565:
#!/bin/bash
################################################################################
# bmp2rgb565
# Mark Setchell
################################################################################
if [ $# -ne 1 ]; then
    echo "Usage: $0 image.bmp"
    exit 1
fi

file=$1

# Use NetPBM's "bmptopnm" to convert BMP to PNM for easy reading
# You could use ImageMagick: convert "$file" PNM: | perl ...
bmptopnm "$file" 2> /dev/null |
perl -e '
    my $debug=0;   # Change to 1 for debugging
    # Discard first 3 lines of PNM header:
    # P6
    # 320 240
    # 255
    my $line=<STDIN>; $line=<STDIN>; $line=<STDIN>;
    # Read file, 3 RGB bytes at a time
    {
        local $/ = \3;
        while(my $pixel=<STDIN>){
            # Extract 8-bit R, G and B from pixel
            my ($r,$g,$b)=unpack("CCC",$pixel);
            printf("R/G/B: %d/%d/%d\n",$r,$g,$b) if $debug;
            # Convert to RGB565
            my $r5=$r>>3;
            my $g6=$g>>2;
            my $b5=$b>>3;
            my $rgb565 = ($r5<<11) | ($g6<<5) | $b5;
            # Convert to little-endian 16-bit (VAX order) and write 2 bytes
            my $v=pack("v",$rgb565);
            syswrite(STDOUT,$v,2);
        }
    }
'
I don't have a frame buffer handy to test, but it should be pretty close.
Note that you could make the code more robust by starting off with:
convert "$file" -depth 8 -resize 320x240\! PNM: | perl ...
which would make sure the image always matches the framebuffer size, and that it is 8-bit and not 16-bit. You may also want a -flip or -flop in there if the BMP image is upside-down or back-to-front.
Note that if you use ImageMagick convert, the code will work for GIFs, TIFFs, JPEGs, PNGs and around 150 other formats as well as BMP.
Note that if you want to test the code, you can generate a black image with ImageMagick like this:
convert -size 320x240 xc:black BMP3:black.bmp
and then look for a bunch of zeroes in the output if you run:
./bmp2rgb565 black.bmp | xxd -g2
Likewise, you can generate a white image and look for a bunch of ffs:
convert -size 320x240 xc:white BMP3:white.bmp
And so on with red, green and blue:
convert -size 320x240 xc:red BMP3:red.bmp
convert -size 320x240 xc:lime BMP3:green.bmp
convert -size 320x240 xc:blue BMP3:blue.bmp
# Or make a cyan-magenta gradient image
convert -size 320x240 gradient:cyan-magenta cyan-magenta-gradient.bmp
Example:
./bmp2rgb565 red.bmp | xxd -g2 | more
00000000: 00f8 00f8 00f8 00f8 00f8 00f8 00f8 00f8 ................
00000010: 00f8 00f8 00f8 00f8 00f8 00f8 00f8 00f8 ................
Example:
./bmp2rgb565 blue.bmp | xxd -g2 | more
00000000: 1f00 1f00 1f00 1f00 1f00 1f00 1f00 1f00 ................
00000010: 1f00 1f00 1f00 1f00 1f00 1f00 1f00 1f00 ................
Keywords: RGB565, rgb565, framebuffer, frame-buffer, pack, unpack, Perl, BMP, PGM, image, Raspberry Pi, RASPI
To convert to RGB565 you can use ImageMagick:
convert test.png -resize 320x240 -ordered-dither threshold,32,64,32 test2.png
In what format do you need the output to be?
I'm assuming you simply want to stream out the pixel data without any headers or additional data.
The core of the problem has already been described here.
If you want to implement it for yourself, the following approach will probably help:
After opening the file, you need to skip the BMP header by reading the bfOffBits value (see this BMP format description); the easiest way would probably be to implement it in C by reading into a struct that matches the BMP header.
Then seek forward by bfOffBits and, while the file has not ended, transform three consecutive bytes (uint8_t) into one 16-bit value (uint16_t) using the approach from the question I mentioned above. (Note: this only works because the dimensions of the image are divisible by 4, so the rows carry no padding bytes - check the format description for more details.) The basic C file operations you would need are fopen, fclose, fread, fwrite, fgetc/fgetwc, fputc/fputwc and fseek.
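If you would like to prototype that approach before committing to C, here is a rough Python sketch of the same logic (assuming a plain, uncompressed 24-bit BMP; note that BMP stores pixels in B,G,R order and rows bottom-up, which you may need to account for when writing to the display):
import struct

with open('image.bmp', 'rb') as f:
    data = f.read()

# bfOffBits: little-endian uint32 at byte offset 10 of the BMP header,
# giving the start of the pixel array
off = struct.unpack_from('<I', data, 10)[0]

out = bytearray()
# 320 * 3 = 960 bytes per row is already a multiple of 4, so no row padding here
for i in range(off, len(data) - 2, 3):
    b, g, r = data[i], data[i + 1], data[i + 2]
    rgb565 = ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)
    out += struct.pack('<H', rgb565)   # little-endian 16-bit, like the Perl version

with open('image.rgb565', 'wb') as f:
    f.write(out)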
