Convert JPEG-In-Tiff to normal TIff - image-processing

I have a directory structure containing 300 gigs worth of Tiff Images. Some are encoded as Jpeg-in-Tiff, others as "Group 4 fax encoding". I need to change the format of all to the later without losing my folder structures.
Anyone have any ideas?

Pretty much any image viewer with a batch conversion shoudl do this I like irfanview

Related

Reverse Engineering proprietary TIFF format

I'm deep in the weeds reverse engineering a very old proprietary document storage format (Keyfile). Embedded in the middle of a larger file is a block of image data (the scan of a single document page) that is encoded with CCITT4. I've learned enough about the file and the TIFF spec so far to write a filter that extracts the data from the source file and writes a new file that is supposed to be a plain TIFF, but it's not quite there yet, and I can't figure out what I'm still missing.
Encouragingly Adobe Photoshop opens my newly minted TIFF file and displays the document just fine (no errors, no warnings). Unfortunately, none of the other common tools will. I'm on a mac and have access to linux so I've tried:
Gimp
Preview (OSX)
ImageMagick
some of the libtiff utilities like fax2pdf
I suspect there's something wrong still with my TIFF file, that Photoshop is silently overlooking. I hope it's not in the raw CCITT4 image data, because I would rather not have to write code to decode that completely.
I can't post the files I'm working with because they contain sensitive data. However, I'm hoping that I'm just doing something wrong with my tiff header block that someone can point out. To that end. here's some basic information about my test file (the one that opens fine in Photoshop).
Keyfile.tiff 31K (32300 bytes)
Keyfile TIFF Version 1.01
0100.0004.00000001.000009f0 ImageWidth
0101.0004.00000001.00000ce0 ImageLength
0102.0003.00000001.00000001 BitsPerSample
0103.0003.00000001.00000004 Compression
0106.0003.00000001.00000000 PhotometricInterpolation
0111.0004.00000001.00000200 StripOffsets
0115.0003.00000001.00000001 SamplesPerPixel
0116.0004.00000001.00000ce0 RowsPerStrip
0117.0004.00000001.00007c2c StripByteCounts
011a.0005.00000001.000001d6 XResolution
011b.0005.00000001.000001de YResolution
0128.0004.00000001.00000002 ResolutionUnit
0131.0002.0000001a.000001e6 Software
This decode of the TIFF header block comes from code that I've written. Here's a hex dump of the header portion of the file to address 0x200.
49492A00080000000D000001040001000000F00900000101040001000000E00C00000201030001000000010000000301030001000000040000000601030001000000000000001101040001000000000200001501030001000000010000001601040001000000E00C000017010400010000002C7C00001A01050001000000D60100001B01050001000000DE010000280104000100000002000000310102001A000000E6010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002C010000010000002C010000010000004B657966696C6520544946462056657273696F6E20312E303100
What follows is exactly 0x7c2c bytes of compressed image data. I say this based on the tiff compression tag (4), which is copied over intact form the original file, and from looking at dozens and dozens of files with a hex editor and learning to recognize the image data block. Also the fact that Photoshop opens this file would seem to indicate I am correct.
Any help figuring out what I still need to do to make this file compatible with the rest of the utilities would be much appreciated.
For what it's worth here's the error produced by imagemagick:
>convert Keyfile.tiff Keyfile.pdf
convert: Premature EOL at line 0 of strip 0 (got 0, expected 2544). `Fax4Decode' # warning/tiff.c/TIFFWarnings/881.
I'm new to coding for TIFF and so any utilities or hints that would allow me to gather more detailed information about what's going on would also be appreciated.
Update:
Here are the first 0x318 bytes of the file. There's nothing sensitive here and you have the first 0x118 bytes of the image data. I can probably provide a bit more of the file if needed.
49492A00080000000D000001040001000000F00900000101040001000000E00C00000201030001000000010000000301030001000000040000000601030001000000000000001101040001000000000200001501030001000000010000001601040001000000E00C000017010400010000002C7C00001A01050001000000D60100001B01050001000000DE010000280104000100000002000000310102001A000000E6010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002C010000010000002C010000010000004B657966696C6520544946462056657273696F6E20312E3031000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000FFFFFFFC8085B51FFFFFFFFFFFFFFFFFFFFFFFF90154E0C4221836AC80A900F04142050814204679705E823C0D3089900E92D641B9B1D2907364E94886C112854118E6208686E6492B47D11C1A29289806DC25083A41427495102E6D349641736AA96439B08496113867960B314A08CC1A2102141410221AADC28102123E918508E02AC41143D2C5131C3C68B1620B8CCB02A8238F564536394D16F11AA050CEA8A9944105DB92591D12D04513E195B23E1252561A742191D11B0628110DA6E5259A6881891832C74B704A0C8F1B4618450E2AA4087391D17988888EA41CDAD8A2B0AAA4436A2647D94CC585
Update 2:
OK, I found a file that I can post. It's a mostly white page, but if rendered correctly, you will see the two darkish crescent moons which are the reflection of the holes on the original scanned page. There's also a bit of noise over to the right and along the top. Here's what it looks like (image):
I used Photoshop to convert/save a file I could upload. Here's a hex dump of the file my code generated, which opens fine in Photoshop, but not with anything else.
49492A00080000000D000001040001000000F00900000101040001000000E00C00000201030001000000010000000301030001000000040000000601030001000000000000001101040001000000000200001501030001000000010000001601040001000000E00C00001701040001000000530300001A01050001000000D60100001B01050001000000DE010000280104000100000002000000310102001A000000E6010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002C010000010000002C010000010000004B657966696C6520544946462056657273696F6E20312E3031002C19461170350282E88E8AF52889A91024623806A1C8F97C8E8D111D1847115B44CF3A2388DA2E8C2388122F98C868E23451112508B88600D4297C8E88E44788F91E308BC4745CC8F91E23A2EC8E88F11E23B36447C8F11CC8E611020711111A6888390E39C738E0848E8BA23A388D4A224111B03681C206478DA892946E2E06D06B51121718036032092844E0AE470350604AA229C88E0680CC224511803402E24A11F88E0660D8224A40CD1016ACC8E0B606048906482C101752460C8E19006E224AC3203901D091B03C08122D9C0DA12141BFFFFFFFFFFFFFFFFFFFFFFFFFFFFF2D2125082123F1A2EA08124122EB6820A475E2105130A8209826474388886475612449543B295550C8E88224EC591D1174295B23A48C0EC591E08762111E23A2F9F46D11D02E22323E088A3870447542223EE35BDF56AD5856AD430A1856AC2879692C06C2FC304259A688BA23D2D23211A4088FC504162A5373447C20A2396062188A891F23F7C48E89502F41A46D11B417126E51328709EDE4747D04171D8B23A650E5714E13158921F111588AB0AF72CA6AB50ED27690664750C286B6B1B29D351609F21976B8685A8613C309A96014631FFFFFFFFFFFFF2039C720383A5C5DFEB56B0B51FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF9601A8FFFFFFFFFFFFFFFFFFFFFFFFFFCEC6947FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF95CEA3FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE5A852A3FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFC004004
Here are it's specs.
Keyfile_66.tiff 1K (1363 bytes)
Keyfile TIFF Version 1.01
0100.0004.00000001.000009f0 ImageWidth
0101.0004.00000001.00000ce0 ImageLength
0102.0003.00000001.00000001 BitsPerSample
0103.0003.00000001.00000004 Compression
0106.0003.00000001.00000000 PhotometricInterpolation
0111.0004.00000001.00000200 StripOffsets
0115.0003.00000001.00000001 SamplesPerPixel
0116.0004.00000001.00000ce0 RowsPerStrip
0117.0004.00000001.00000353 StripByteCounts
011a.0005.00000001.000001d6 XResolution
011b.0005.00000001.000001de YResolution
0128.0004.00000001.00000002 ResolutionUnit
0131.0002.0000001a.000001e6 Software
Here's a link to download the file.
Any idea why this is would be much appreciated.

In OpenCV many conversions to JPG using imEncode fails

For a specific purpose I am trying to convert an AVI video to a kind of Moving JPEG format using OpenCV. In order to do so I read images from the source video, convert them to JPEG using imEncode, and write these JPEG images to the target video.
After several hundreds of frames suddenly the size of the resulting JPEG image nearly doubles. Here's a list of sizes:
68045
68145
68139
67885
67521
67461
67537
67420
67578
67573
67577
67635
67700
67751
127800
127899
127508
127302
126990
126904
Anybody got a clue what's going on here?
By the way: I'm using OpenCV.Net as a wrapper for OpenCV.
Thanks a lot in advance,
Paul
I found the solution. If I explicitly enter the third parameter to imEncode (for JPEG encoding this indicates the quality of the encoding, ranging from 0 to 100) instead of using the default (95) the problem disappears. It's likely this is a bug in OpenCV.Net, but it could also be a bug in OpenCV itself.

Is there anyway (commandline tools) to calculate MD5 hash for .NEF (also .CR2, .TIFF) regardless any metadata?

Is there anyway (commandline tools) to calculate MD5 hash for .NEF (also .CR2, .TIFF) regardless any metadata, e.g. EXIF, IPTC, XMP and so on?
The MD5 hash should be same once we update any metadata inside the image file.
I searched for a while, the closest solution is:
exiftool test.nef -all= -o - -m | md5
but 'exiftool -all=' still keeps a set of EXIF tags in the output file. The MD5 hash can be changed if I update remaining tags.
ImageMagick has a method for doing exactly this. It is installed on most Linux distros and is available for OSX (ideally via homebrew) and also Windows. There is an escape for the image signature which includes only pixel data and not metadata - you use it like this:
identify -format %# _DSC2007.NEF
feb37d5e9cd16879ee361e7987be7cf018a70dd466d938772dd29bdbb9d16610
I know it does what you want and that the calculated checksum does not change when you modify the metadata on PNG files for example, and I know it does calculate the checksum correctly for CR2 and NEF files. However, I am not in the habit of modifying RAW files such as you have and have not tested it does the right thing in that case - though I would be startled if it didn't! So please test before use.
The reason that there is still some Exif data left is because the image data for a NEF file (and similar TIFF based filetypes) is located within that Exif block. Remove that and you have removed the image data. See ExifTool FAQ 7, which has an example shortcut tag that may help you out.
I assume your intention is to verify the actual image data has not been tampered with.
An alternate approach to stripping the meta-data can be to convert the image to a format that has no metadata.
ImageMagick is a well known open source (Apache 2 license) for image manipulation and conversion. It provides libraries with various language bindings as well as command line tools for various operating systems.
You could try:
convert test.nef bmp:- | md5
This converts test.nef to bmp on stdout and pipes it to md5.
AFAIR bmp has no support for metadata and I'm not sure if ImageMagick even preserves metadata across conversions.
This will only work with single image files (i.e. not multi-image tiff or gif animations). There is also the slight possibility some changes can be made to the image which result in the same conversion because of color space conversions, but these changes would not be visible.

OpenCV imwrite increases the size of png image

I am doing image manipulation on the png images. I have the following problem. After saving an image with imwrite() function, the size of the image is increased. For example previously image is 847KB, after saving it becomes 1.20 MB. Here is a code. I just read an image and then save it, but the size is increased. I tried to set compression params but it doesn't help.
Mat image;
image = imread("5.png", -1);
vector<int> compression_params;
compression_params.push_back(CV_IMWRITE_PNG_COMPRESSION);
compression_params.push_back(9);
compression_params.push_back(0);
imwrite("output.png",image,compression_params);
What could be a problem? Any help please.
Thanks.
PNG has several options that influence the compression: deflate compression level (0-9), deflate strategy (HUFFMAN/FILTERED), and the choice (or strategy for dynamically chosing) for the internal prediction error filter (AVERAGE, PAETH...).
It seems OpenCV only lets you change the first one, and it hasn't a good default value for the second. So, it seems you must live with that.
Update: looking into the sources, it seems that compression strategy setting has been added (after complaints), but it isn't documented. I wonder if that source is released. Try to set the option CV_IMWRITE_PNG_STRATEGY with Z_FILTERED and see what happens
See the linked source code for more details about the params.
#Karmar, It's been many years since your last edit.
I had similar confuse to yours in June, 2021. And I found out sth which might benefit others like us.
PNG files seem to have this thing called mode. Here, let's focus only on three modes: RGB, P and L.
To quickly check an image's mode, you can use Python:
from PIL import Image
print(Image.open("5.png").mode)
Basically, when using P and L you are attributing 8 bits/pixel while RGB uses 3*8 bits/pixel.
For more detailed explanation, one can refer to this fine stackoverflow post: What is the difference between images in 'P' and 'L' mode in PIL?
Now, when we use OpenCV to open a PNG file, what we get will be an array of three channels, regardless which mode that
file was saved into. Three channels with data type uint8, that means when we imwrite this array into a file, no matter
how hard you compress it, it will be hard to beat the original file if it was saved in P or L mode.
I guess #Karmar might have already had this question solved. For future readers, check the mode of your own 5.png.

File Format Conversion to TIFF. Some issues?

I'm having a proprietary image format SNG( a proprietary format) which is having a countinous array of Image data along with Image meta information in seperate HDR file.
Now I need to convert this SNG format to a Standard TIFF 6.0 Format. So I studied the TIFF format i.e. about its Header, Image File Directories( IFD's) and Stripped Image Data.
Now I have few concerns about this conversion. Please assist me.
SNG Continous Data vs TIFF Stripped Data: Should I convert SNG Data to TIFF as a continous data in one Strip( data load/edit time problem?) OR make logical StripOffsets of the SNG Image data.
SNG Data Header uses only necessary Meta Information, thus while converting the SNG to TIFF, some information can’t be retrieved such as NewSubFileType, Software Tag etc.
So this raises a concern that after conversion whether any missing directory information such as NewSubFileType, Software Tag etc is necessary and sufficient condition for TIFF File.
Encoding of each pixel component of RGB Sample in SNG data:
Here each SNG Image Data Strip per Pixel component is encoded as:
Out^[i] := round( LineBuffer^[i * 3] * **0.072169** + LineBuffer^[i * 3 + 1] * **0.715160** + LineBuffer^[i * 3+ 2]* **0.212671**);
Only way I deduce from it is that each Pixel is represented with 3 RGB component and some coefficient is multiplied with each component to make the SNG Viewer work RGB color information of SNG Image Data. (Developer who earlier work on this left, now i am following the trace :))
Thus while converting this to TIFF, the decoding the same needs to be done. This raises a concern that the how RBG information in TIFF is produced, or better do we need this information?.
Please assist...
Are you able to load this into a standard windows bitmap handle? If so, there are probably a bunch of free and commercial libraries for saving it as TIFF.
The standard for TIFF is libtiff -- it's a C library. Here's a version for Delphi made by an expert in the TIFF format:
http://www.awaresystems.be/imaging/tiff/delphi.html
There seems to be a lot of choices.
I think the approach of
Loading your format into an in-memory standard bitmap (which you need to do to show it, right?)
Using a pre-existing TIFF encoding library to save as TIFF
Will be a lot easier than trying to do a direct format-to-format conversion. The only reasons I wouldn't do it this way are:
The bitmap is too big to keep in memory
The original format is lossy and I will lose more quality in the re-encoding -- but you'd have to be saving in a standard lossy format (JPEG) to save quality.
Disclaimer: I work for Atalasoft.
We make .NET imaging codecs (including TIFF) -- that are a lot easier to use than LibTiff -- you can call them in Delphi through COM. We can convert standard windows bitmaps to TIFF or PDF (or other formats) with a couple of lines of code.
One approach, if you have a Windows application which handles and can print this format, would be to let it do the work for you, and call it to print the file to one of the many available 'printer drivers' which support direct output to TIFF.

Resources