Read the PNG image as NSData, and some extra bytes appear - ios

Above is the data to read the picture using iOS SDK,
00000004 43674249 50002006 2cb87766
are extra bytes.
Does anyone know what these bytes are?

Your image is not a standard PNG according to the official specifications. The bytes you point out form a valid PNG chunk:
00000004 Data length
43674249 Block ID "CgBI"
50002006 4 bytes of proprietary information
2CB87766 CRC checksum of the data block
but the location and the indicated type of the chunk are invalid for a conforming PNG image.
This indicates the image is processed with Apple's modified version of pngcrush. See http://iphonedevwiki.net/index.php/CgBI_file_format#Differences_from_PNG for non-official information on the block contents and other deviations from the official specification.

Related

Reading byte by byte HEIF/HEIC images XMP metadata

I am trying to build a native byte parser that given an HEIF image it returns back its metadata (mainly width and height of the image).
I am struggling a lot at the moment finding the right documentation and specs to use for parsing such info. I have to do such thing for both XMP and EXIF metadata, but let's focus only on XMP for now.
What I need is the exact byte structure of where to find what. According to the HEIF international standard doc (here):
For image items, XMP metadata shall be stored as an item of item_type value 'mime' and content type'application/rdf+xml'. The body of the item shall be a valid XMP document, in XML form.
Perfect, if I analyse a sample image I can find such marker:
From now on I can't find anywhere how to get the info I need. I would expect something saying "the first 2 bytes are the header, with marker 0xFF 0xCE (just an example), the next 2 bytes are the width, and following 2 bytes the height...etc".
In my case I am going by intuition. My sample image is of dimensions 8736x5856. If in the tool I look for Big-Endian 2 byte integer 8736, I can find it:
And hey, 2 bytes later there is the 5856 height as well:
But again, I arrived here by luck and intuition. I need a proper schema that tells me where to find what in such a way that I can traslate it to code.
What I think you'r seeing is a "mime" and "ispe" mp4 box as HEIF is ISOBMFF based. I would recommend looking at the file using a mp4 capable tool like mp4dump, HexFiend or fq (note: my tool). The "ispe" (Image Spatial Extents) box i probably what you want to read.
fq does no support ispe box yet but you could read it like this:
$ fq 'grep_by(.type=="ispe").data | tobytes | [.[-8:-4], .[-4:] | tonumber]' file.heif
[
8736,
5856
]
So what you need is probably a basic ISOBMFF reader and then look for the "ispe" box and decode it. If you'r only looking for the first of a specific box you can probably ignore that ISOBMFF is a tree structure.

Reverse Engineering proprietary TIFF format

I'm deep in the weeds reverse engineering a very old proprietary document storage format (Keyfile). Embedded in the middle of a larger file is a block of image data (the scan of a single document page) that is encoded with CCITT4. I've learned enough about the file and the TIFF spec so far to write a filter that extracts the data from the source file and writes a new file that is supposed to be a plain TIFF, but it's not quite there yet, and I can't figure out what I'm still missing.
Encouragingly Adobe Photoshop opens my newly minted TIFF file and displays the document just fine (no errors, no warnings). Unfortunately, none of the other common tools will. I'm on a mac and have access to linux so I've tried:
Gimp
Preview (OSX)
ImageMagick
some of the libtiff utilities like fax2pdf
I suspect there's something wrong still with my TIFF file, that Photoshop is silently overlooking. I hope it's not in the raw CCITT4 image data, because I would rather not have to write code to decode that completely.
I can't post the files I'm working with because they contain sensitive data. However, I'm hoping that I'm just doing something wrong with my tiff header block that someone can point out. To that end. here's some basic information about my test file (the one that opens fine in Photoshop).
Keyfile.tiff 31K (32300 bytes)
Keyfile TIFF Version 1.01
0100.0004.00000001.000009f0 ImageWidth
0101.0004.00000001.00000ce0 ImageLength
0102.0003.00000001.00000001 BitsPerSample
0103.0003.00000001.00000004 Compression
0106.0003.00000001.00000000 PhotometricInterpolation
0111.0004.00000001.00000200 StripOffsets
0115.0003.00000001.00000001 SamplesPerPixel
0116.0004.00000001.00000ce0 RowsPerStrip
0117.0004.00000001.00007c2c StripByteCounts
011a.0005.00000001.000001d6 XResolution
011b.0005.00000001.000001de YResolution
0128.0004.00000001.00000002 ResolutionUnit
0131.0002.0000001a.000001e6 Software
This decode of the TIFF header block comes from code that I've written. Here's a hex dump of the header portion of the file to address 0x200.
49492A00080000000D000001040001000000F00900000101040001000000E00C00000201030001000000010000000301030001000000040000000601030001000000000000001101040001000000000200001501030001000000010000001601040001000000E00C000017010400010000002C7C00001A01050001000000D60100001B01050001000000DE010000280104000100000002000000310102001A000000E6010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002C010000010000002C010000010000004B657966696C6520544946462056657273696F6E20312E303100
What follows is exactly 0x7c2c bytes of compressed image data. I say this based on the tiff compression tag (4), which is copied over intact form the original file, and from looking at dozens and dozens of files with a hex editor and learning to recognize the image data block. Also the fact that Photoshop opens this file would seem to indicate I am correct.
Any help figuring out what I still need to do to make this file compatible with the rest of the utilities would be much appreciated.
For what it's worth here's the error produced by imagemagick:
>convert Keyfile.tiff Keyfile.pdf
convert: Premature EOL at line 0 of strip 0 (got 0, expected 2544). `Fax4Decode' # warning/tiff.c/TIFFWarnings/881.
I'm new to coding for TIFF and so any utilities or hints that would allow me to gather more detailed information about what's going on would also be appreciated.
Update:
Here are the first 0x318 bytes of the file. There's nothing sensitive here and you have the first 0x118 bytes of the image data. I can probably provide a bit more of the file if needed.
49492A00080000000D000001040001000000F00900000101040001000000E00C00000201030001000000010000000301030001000000040000000601030001000000000000001101040001000000000200001501030001000000010000001601040001000000E00C000017010400010000002C7C00001A01050001000000D60100001B01050001000000DE010000280104000100000002000000310102001A000000E6010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002C010000010000002C010000010000004B657966696C6520544946462056657273696F6E20312E3031000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000FFFFFFFC8085B51FFFFFFFFFFFFFFFFFFFFFFFF90154E0C4221836AC80A900F04142050814204679705E823C0D3089900E92D641B9B1D2907364E94886C112854118E6208686E6492B47D11C1A29289806DC25083A41427495102E6D349641736AA96439B08496113867960B314A08CC1A2102141410221AADC28102123E918508E02AC41143D2C5131C3C68B1620B8CCB02A8238F564536394D16F11AA050CEA8A9944105DB92591D12D04513E195B23E1252561A742191D11B0628110DA6E5259A6881891832C74B704A0C8F1B4618450E2AA4087391D17988888EA41CDAD8A2B0AAA4436A2647D94CC585
Update 2:
OK, I found a file that I can post. It's a mostly white page, but if rendered correctly, you will see the two darkish crescent moons which are the reflection of the holes on the original scanned page. There's also a bit of noise over to the right and along the top. Here's what it looks like (image):
I used Photoshop to convert/save a file I could upload. Here's a hex dump of the file my code generated, which opens fine in Photoshop, but not with anything else.
49492A00080000000D000001040001000000F00900000101040001000000E00C00000201030001000000010000000301030001000000040000000601030001000000000000001101040001000000000200001501030001000000010000001601040001000000E00C00001701040001000000530300001A01050001000000D60100001B01050001000000DE010000280104000100000002000000310102001A000000E6010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002C010000010000002C010000010000004B657966696C6520544946462056657273696F6E20312E3031002C19461170350282E88E8AF52889A91024623806A1C8F97C8E8D111D1847115B44CF3A2388DA2E8C2388122F98C868E23451112508B88600D4297C8E88E44788F91E308BC4745CC8F91E23A2EC8E88F11E23B36447C8F11CC8E611020711111A6888390E39C738E0848E8BA23A388D4A224111B03681C206478DA892946E2E06D06B51121718036032092844E0AE470350604AA229C88E0680CC224511803402E24A11F88E0660D8224A40CD1016ACC8E0B606048906482C101752460C8E19006E224AC3203901D091B03C08122D9C0DA12141BFFFFFFFFFFFFFFFFFFFFFFFFFFFFF2D2125082123F1A2EA08124122EB6820A475E2105130A8209826474388886475612449543B295550C8E88224EC591D1174295B23A48C0EC591E08762111E23A2F9F46D11D02E22323E088A3870447542223EE35BDF56AD5856AD430A1856AC2879692C06C2FC304259A688BA23D2D23211A4088FC504162A5373447C20A2396062188A891F23F7C48E89502F41A46D11B417126E51328709EDE4747D04171D8B23A650E5714E13158921F111588AB0AF72CA6AB50ED27690664750C286B6B1B29D351609F21976B8685A8613C309A96014631FFFFFFFFFFFFF2039C720383A5C5DFEB56B0B51FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF9601A8FFFFFFFFFFFFFFFFFFFFFFFFFFCEC6947FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF95CEA3FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE5A852A3FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFC004004
Here are it's specs.
Keyfile_66.tiff 1K (1363 bytes)
Keyfile TIFF Version 1.01
0100.0004.00000001.000009f0 ImageWidth
0101.0004.00000001.00000ce0 ImageLength
0102.0003.00000001.00000001 BitsPerSample
0103.0003.00000001.00000004 Compression
0106.0003.00000001.00000000 PhotometricInterpolation
0111.0004.00000001.00000200 StripOffsets
0115.0003.00000001.00000001 SamplesPerPixel
0116.0004.00000001.00000ce0 RowsPerStrip
0117.0004.00000001.00000353 StripByteCounts
011a.0005.00000001.000001d6 XResolution
011b.0005.00000001.000001de YResolution
0128.0004.00000001.00000002 ResolutionUnit
0131.0002.0000001a.000001e6 Software
Here's a link to download the file.
Any idea why this is would be much appreciated.

Find out if PNG is 8 or 24 bits and alpha

Given a UIImage/NSImage or NSData instance, how do you find out programatically if a PNG is 8 bits or 24 bits? And if it has an alpha channel?
Is there any Cocoa/Cocoa Touch API that helps with this?
To avoid duplication, here is the non-programatic answer to the question, and here's a way to find out if an image is a PNG.
As a person who programmed for a long time on a J2ME platform, I know the PNG format very well. If you have the raw data as a NSData instance, looking up the information is very easy since the PNG format is very straightforward.
See PNG Specification
The PNG file starts with a signature and then it contains a sequence of chunks. You are interested only in the first chunk, IHDR.
So, the code should be along the lines of:
Skip the first 8 bytes (signature)
Skip 4 bytes (IHDR chunk length)
Skip 4 bytes (IHDR chunk type = "IHDR")
Read IHDR
Width: 4 bytes
Height: 4 bytes
Bit depth: 1 byte
Color type: 1 byte
Compression method: 1 byte
Filter method: 1 byte
Interlace method: 1 byte
For alpha, you should also check if there is a tRNS (transparency) chunk in the file. To find a chunk, you can go with the following algorithm:
Read chunk length (4 bytes)
Read chunk type (4 bytes)
Check chunk type whether it is the type we are looking for
Skip chunk length bytes
Skip 4 bytes of CRC
Repeat
EDIT:
To find info about a UIImage instance, get its CGImage and use one of the CGImageGet... functions.
It should be noted that All integer values in a PNG file are read in Big-endian format.

Jfif/jpeg parsing, bytes between streams

I'm parsing an Jpeg/JFIF file and I noticed that after the SOI (0xFF D8) I parse the different "streams" starting with 0xFFXX (where XX is a hexadecimal number) until I find the EOI (0XFFD9). Now the structure of the diffrent chunks is:
APP0 marker 2 Bytes
Length 2 Bytes
Now when I parse the a chunk I parse until i reach the length written in the 2 Bytes of the length field. After that I thought I would immediately find another Marker, followed by a length for the next chunk. According to my parser that is not always true, there might be data between the chunks. I couldn't find out what that data is, and if it is relevant to the image. Do you have any hints what this could be and how to interpret those bytes?
I'm lost and would be happy if somebody could point me in the correct direction. Thanks in advance
I've recently noticed this too. In my case it's an APP2 chunk which is the ICC profile which doesn't contain the length of the chunk.
In fact so far as I can see the length of the chunk needn't be the first 2 bytes (though it usually is).
In JFIF all 0xFF bytes are replaced with 0xFF 0x00 in the data section, so it should just be a matter of calculating the length from that. I just read until I hit another header, however I've noticed that sometimes (again in the ICC profile) there are byte sequences which don't make sense such as 0xFF 0x6D, so I may still be missing something.

File Format Conversion to TIFF. Some issues?

I'm having a proprietary image format SNG( a proprietary format) which is having a countinous array of Image data along with Image meta information in seperate HDR file.
Now I need to convert this SNG format to a Standard TIFF 6.0 Format. So I studied the TIFF format i.e. about its Header, Image File Directories( IFD's) and Stripped Image Data.
Now I have few concerns about this conversion. Please assist me.
SNG Continous Data vs TIFF Stripped Data: Should I convert SNG Data to TIFF as a continous data in one Strip( data load/edit time problem?) OR make logical StripOffsets of the SNG Image data.
SNG Data Header uses only necessary Meta Information, thus while converting the SNG to TIFF, some information can’t be retrieved such as NewSubFileType, Software Tag etc.
So this raises a concern that after conversion whether any missing directory information such as NewSubFileType, Software Tag etc is necessary and sufficient condition for TIFF File.
Encoding of each pixel component of RGB Sample in SNG data:
Here each SNG Image Data Strip per Pixel component is encoded as:
Out^[i] := round( LineBuffer^[i * 3] * **0.072169** + LineBuffer^[i * 3 + 1] * **0.715160** + LineBuffer^[i * 3+ 2]* **0.212671**);
Only way I deduce from it is that each Pixel is represented with 3 RGB component and some coefficient is multiplied with each component to make the SNG Viewer work RGB color information of SNG Image Data. (Developer who earlier work on this left, now i am following the trace :))
Thus while converting this to TIFF, the decoding the same needs to be done. This raises a concern that the how RBG information in TIFF is produced, or better do we need this information?.
Please assist...
Are you able to load this into a standard windows bitmap handle? If so, there are probably a bunch of free and commercial libraries for saving it as TIFF.
The standard for TIFF is libtiff -- it's a C library. Here's a version for Delphi made by an expert in the TIFF format:
http://www.awaresystems.be/imaging/tiff/delphi.html
There seems to be a lot of choices.
I think the approach of
Loading your format into an in-memory standard bitmap (which you need to do to show it, right?)
Using a pre-existing TIFF encoding library to save as TIFF
Will be a lot easier than trying to do a direct format-to-format conversion. The only reasons I wouldn't do it this way are:
The bitmap is too big to keep in memory
The original format is lossy and I will lose more quality in the re-encoding -- but you'd have to be saving in a standard lossy format (JPEG) to save quality.
Disclaimer: I work for Atalasoft.
We make .NET imaging codecs (including TIFF) -- that are a lot easier to use than LibTiff -- you can call them in Delphi through COM. We can convert standard windows bitmaps to TIFF or PDF (or other formats) with a couple of lines of code.
One approach, if you have a Windows application which handles and can print this format, would be to let it do the work for you, and call it to print the file to one of the many available 'printer drivers' which support direct output to TIFF.

Resources