Is there any difference in terms of precision and speed when using SIFT with JPEG, PNG or PGM images? Assuming, of course, the same image size.
This question does not make sense.
SIFT is an algorithm; it operates on raw pixel data in memory, not on files.
You would need some framework that loads image files into some data structure in memory.
SIFT itself doesn't care whether you load a JPEG or a PGM. It will never know where the pixels came from.
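To illustrate the point, here is a minimal sketch with OpenCV (assuming opencv-python 4.4 or later, where SIFT lives in the main module; the file names are placeholders for the same image saved in two lossless containers):

```python
# Minimal sketch: SIFT only ever sees the decoded pixel array.
# Assumes opencv-python >= 4.4 and placeholder file names.
import cv2
import numpy as np

img_png = cv2.imread("photo.png", cv2.IMREAD_GRAYSCALE)  # decoded to a plain array
img_pgm = cv2.imread("photo.pgm", cv2.IMREAD_GRAYSCALE)  # same pixels, different container

sift = cv2.SIFT_create()
kp_png, desc_png = sift.detectAndCompute(img_png, None)
kp_pgm, desc_pgm = sift.detectAndCompute(img_pgm, None)

# For a lossless round-trip the two arrays are identical, so keypoints and
# descriptors are identical too; the file format never reaches the algorithm.
print(np.array_equal(img_png, img_pgm))
print(len(kp_png), len(kp_pgm))
```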
Having the same file size (to the bit) for the same image in three different formats would be practically impossible, in my opinion. Since PGM is uncompressed, JPEG and PNG would also have to be stored essentially uncompressed to reach roughly the same size. If you have no compression, you have no losses, so there would be no difference in SIFT performance.
Related
I'm wondering about what types of metrics I can use for comparing different (lossy) image compression methods (i.e., things other than compression ratio). For example, comparing JPEG, JPEG 2000, and JPEG XR on a set of different images.
Ideally I'd like to do this in a python notebook, but I'm open to any suggestions.
Thank you!
I think this boils down to comparing the lossy images to their original, i.e. measuring the loss of image quality.
Measure the difference between the lossy image and its original, do the same for the other lossy images, and compare the results.
How this can be done has been asked before and I have just added an answer mentioning some approaches.
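For example, a small notebook-style sketch of such a comparison (a sketch only: it assumes scikit-image and Pillow are installed, and the file names are placeholders for the decoded output of each codec):

```python
# Compare each lossy result against the original with PSNR and SSIM.
# Assumes scikit-image and Pillow; file names are placeholders.
import numpy as np
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

original = np.asarray(Image.open("original.png").convert("RGB"))

for name in ["out_jpeg.png", "out_jpeg2000.png", "out_jpegxr.png"]:  # decoded outputs
    decoded = np.asarray(Image.open(name).convert("RGB"))
    psnr = peak_signal_noise_ratio(original, decoded)
    ssim = structural_similarity(original, decoded, channel_axis=-1)
    print(f"{name}: PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```

Lower distortion at the same file size (or the same distortion at a smaller size) is the usual basis for comparing codecs, so it helps to record bits per pixel alongside the metrics.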
The title pretty much sums up the question but I was wondering if JPG/PNG files have a notable difference in speed and performance when using CIFilters. Is using one type of file preferred over the other? Is there another file type that could be potentially faster than both JPG and PNG?
JPEG and PNG are storage mechanisms. Filters have to be performed on uncompressed data, not on JPEG or PNG streams.
The speed difference between JPEG and PNG occurs reading or writing. PNG compression generally is slower than JPEG compression. PNG expansion is generally faster than JPEG.
JPEG is not suitable for images that have abrupt changes in color, e.g. drawings, cartoons.
JPEG is not suitable for images that are repeatedly stored, retrieved, modified, and stored again. Each cycle changes the image.
JPEG generally produces much smaller compressed streams than PNG.
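Since the filter work itself always runs on decoded bitmaps, the only place a format difference can show up is decode time, and that is easy to measure on your own images. A rough sketch (in Python with Pillow rather than Core Image, just to illustrate the measurement; file names are placeholders for the same picture saved as JPEG and as PNG):

```python
# Rough decode-time comparison. Assumes Pillow; file names are placeholders.
import time
from PIL import Image

def decode_time(path, runs=20):
    start = time.perf_counter()
    for _ in range(runs):
        with Image.open(path) as im:
            im.load()  # force the actual decode, not just the header read
    return (time.perf_counter() - start) / runs

print("JPEG:", decode_time("photo.jpg"))
print("PNG: ", decode_time("photo.png"))
```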
It actually depends! PNGs are better for smaller images: they support transparency, so the area around the actual content can be cut away, and their compression is lossless, so quality is preserved at the cost of larger files, which can slow down performance. JPEG compresses more aggressively, cutting down file size but also compromising quality to a point. I'd say go for JPEG if it is a huge image for the app, but go for PNG if you want quality.
I have just downloaded the latest Win32 jpegtran.exe from http://jpegclub.org/jpegtran/ and observed the following:
I prepared a 24 BPP JPEG test image of 14500 x 10000 pixels.
The compressed size in the file system is around 7.5 MB.
Decompressing it into memory (with some image viewer) inflates it to around 450 MB.
Monitoring the jpegtran.exe command-line tool's memory consumption during a lossless rotation (180 degrees), I can see the process consuming up to 900 MB of memory!
I would have assumed that such JPEG lossless transformations don't require decoding the image file into memory, and would instead just perform some mathematical transformations on the encoded data itself, keeping the memory footprint very low.
So which of the following is true?
Some bug in this particular tool's implementation?
Some configuration switch I have missed?
Some misunderstanding on my end (i.e. do JPEG lossless transformations also need to decode the image into memory)?
The "mathematical operations" consuming even more memory than "decoding the image into memory"?
Edit:
According to the answer by JasonD, the reason seems to be the latter. So I'll extend my question:
Are there any implementations that can do those operations in small chunks (to avoid high memory usage)? Or does it always have to be done on the whole image, with no way around it?
PS:
I'm not planning to implement my own codec or algorithm. Instead, I'm asking whether there are any implementations out there that meet my requirements, or whether there could be, at least in theory.
I don't know about the library in question, but in order to perform a lossless rotation on a jpeg image, you would at least have to decompress the DCT coefficients in order to rotate them, and then re-compress.
The DCT coefficients, fully expanded, will be the same size or larger than the original image data, as they have more bits of information.
It's lossless, because the loss in a jpeg is caused by quantization of the DCT coefficients. So long as you don't decode/re-encode/re-quantize these, no loss will be incurred.
But it will be memory intensive.
jpeg compression works very roughly as follows:
Convert image into YCbCr colour space.
Optionally downsample some of the channels (colour error is less perceptible than luminance error, so it is typical to 2x downsample the chroma channels). This is obviously lossy, but very predictably/stably so.
Transform 8x8 blocks of the image by a discrete cosine transform (DCT), moving the image into frequency space. The DCT coefficients are also in an 8x8 block, and use more bits for storage than the 8-bit image data did.
Quantize the DCT coefficients by a variable amount (this is the quality setting in most packages). The aim is to produce as many small, and especially zero, coefficients as possible. This is the main "lossy" aspect of JPEG compression.
Zig-zag through the 2D data to turn it into a 1D stream of coefficients which is roughly in frequency order. High frequencies are more likely to be zeroed out, so many blocks will ideally end in a run of zeros which can be truncated.
Compress the (now quite compressible) data losslessly using Huffman encoding.
So a 'non-lossy' transformation would want to avoid redoing as much of that as possible, especially anything beyond the DCT quantization, but that does not avoid expanding the coefficient data in memory.
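To make steps 3 and 4 concrete, here is a toy sketch of one 8x8 block going through the DCT and quantization (assuming numpy and scipy; the quantization table is the standard JPEG luminance table):

```python
# Toy illustration of the DCT + quantization steps for a single 8x8 block.
# Assumes numpy and scipy.
import numpy as np
from scipy.fft import dctn, idctn

# Standard JPEG luminance quantization table (roughly quality 50).
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

block = np.random.randint(0, 256, (8, 8)).astype(float) - 128  # one level-shifted block

coeffs = dctn(block, norm="ortho")       # step 3: into frequency space
quantized = np.round(coeffs / Q)         # step 4: the lossy part; many entries become 0

# A lossless transform such as jpegtran's rotation rearranges the quantized
# coefficients directly and never repeats the rounding above, but it still has
# to hold them, which is why the memory footprint resembles a decoded image.
restored = idctn(quantized * Q, norm="ortho") + 128
```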
If we compare image processing of losslessly compressed images with image processing of lossy compressed images, does the latter give results comparable to the former?
I am asking because the images produced by lossy compression look fine to the human eye, but they differ in minute details which may affect how a computer processes them. I just can't tell by how much.
I don't see much of a question here, but you are right. It is especially visible when processing a JPG image with a medium compression ratio: the 8x8 blocks that JPGs are built from tend to become more visible after filtering.
This is comparable to the accumulation of numerical error when operating on floating-point numbers.
Your best bet is to use lossless formats for image processing. PNGs are a good choice because they provide lossless compression, decent support for bit depths and transparency, and are browser-compatible.
Another format, more often used in the professional world, is TIFF (Tagged Image File Format).
However, note that if your source image is already in a lossy format, converting it to a lossless one will only prevent additional artifacts from being added; the existing ones will still be there and can still be spread and amplified by further processing. You can, however, reduce the extent of the error by converting to a lossless format and running the image through a gentle Gaussian blur.
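If it helps, that last suggestion looks roughly like this with Pillow (a sketch; the file names and blur radius are placeholders to tune for your own images):

```python
# Re-save a lossy source losslessly and soften the existing block artifacts.
# Assumes Pillow; file names and radius are placeholders.
from PIL import Image, ImageFilter

img = Image.open("source.jpg")                                # already has JPEG artifacts
smoothed = img.filter(ImageFilter.GaussianBlur(radius=0.8))   # gently soften block edges
smoothed.save("source_clean.png")                             # lossless from here on
```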
Perhaps you are looking for the Perceptual Image Diff utility?
For iPhone game development, I switched from the PNG format to the PVRTC format for the sake of performance. But PVRTC compression is creating files that are much bigger than the PNG files. So a PNG of 140 KB (1024x1024) gets bloated to 512 KB or more in PVRTC format. I read somewhere that a 50 KB PNG file got compressed to some 10 KB; in my case, it's the other way around.
Is there any reason why it happens this way, and how can I avoid it? If PVRTC compression is blindly doing a 4bpp conversion (1024 x 1024 x 0.5) irrespective of the transparency in the PNG, then what compression are we achieving here?
I have hundreds of these 1024x1024 images in my game, as there are numerous characters, each doing some complex animations. At this rate of 512 KB per image, my app would grow to more than 50 MB, which is unacceptable for my customer (with PNG, I could have kept my app at around 10 MB).
In general, uncompressed image data is either 24bpp (RGB) or 32bpp (RGBA) at a flat rate. PVRTC is 4bpp (or 2bpp) at a flat rate, so there is a compression of 6 or 8 (or 12 or 16) times compared to this.
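To make those ratios concrete for the 1024x1024 textures in the question, the arithmetic works out as follows (a quick sketch):

```python
# Memory for one 1024x1024 texture at various flat bit rates.
pixels = 1024 * 1024

rgba_32bpp = pixels * 32 // 8   # 4,194,304 bytes (~4 MB)
rgb_24bpp  = pixels * 24 // 8   # 3,145,728 bytes (~3 MB)
pvrtc_4bpp = pixels * 4  // 8   #   524,288 bytes (512 KB), matching the figure above
pvrtc_2bpp = pixels * 2  // 8   #   262,144 bytes (256 KB)

print(rgba_32bpp / pvrtc_4bpp)  # 8.0  (8x smaller than uncompressed RGBA)
print(rgb_24bpp / pvrtc_4bpp)   # 6.0  (6x smaller than uncompressed RGB)
```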
A requirement for graphics hardware to use textures natively is that the texture format must be randomly accessible by the hardware. PVRTC is this kind of format, PNG is not, and this is why PNG can achieve greater compression ratios. PVRTC is a runtime, deployment format; PNG is a storage format.
PVRTC compression is carried out on 4x4 blocks of pixels at a time and at a flat bit rate so it is easy to calculate where in memory to retrieve the data required to derive a particular texel's value from and there is only one access to memory required. There is dedicated circuitry in the graphics core which will decode this 4x4 block and give the texel value to your shader/texture combiner etc.
PNG compression does not work at a flat bitrate and is more complicated to retrieve specific values from; memory needs to be accessed from multiple locations in order to retrieve a single colour value and far more memory and processing would be required every single time a texture read occurs. So it's not suitable for use as a native texture format and this is why your textures must be decompressed before the graphics hardware will use them. This increases bandwidth use when compared to PVRTC, which requires no decompression for use.
So for offline storage (the size of your application on disk), PNG is smaller than PVRTC which is smaller than completely uncompressed. For runtime memory footprint and performance, PVRTC is smaller and faster than PNG which, because it must be decompressed, is just as large and slow as uncompressed textures. You might gain some advantage with PNG at initialisation for disk access, but then you'd lose time for decompression.
If you want to reduce the storage footprint of PVRTC you could try zip-style compression on the texture files and expand these when you load from disk.
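A quick way to estimate what that buys you (a Python sketch using the standard zlib module; the .pvr path is a placeholder, and the runtime side would inflate the data back to raw PVRTC with whatever zlib binding the platform provides):

```python
# Estimate the deflate savings on a PVRTC texture file.
# Assumes only the Python standard library; the path is a placeholder.
import zlib

path = "character_atlas.pvr"
raw = open(path, "rb").read()
packed = zlib.compress(raw, 9)

print("on-disk PVRTC:", len(raw), "bytes")
print("deflated     :", len(packed), "bytes")
print("ratio        :", len(packed) / len(raw))
# Ship the deflated file and inflate it back to raw PVRTC when loading from disk.
```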
PVRTC (PowerVR Texture Compression) is a texture compression format. On devices using PowerVR, e.g. most higher-end mobile phones including the iPhone and other ARM-based gadgets like the iPod, it is very fast to draw, since drawing it is hardware-accelerated. It also uses much less memory, since images are kept in their compressed form and decoded on each draw, whereas a PNG needs to be decompressed before being drawn.
PNG is lossless compression.
PVRTC is lossy compression, meaning it approximates the image. It has completely different design criteria.
PVRTC will 'compress' (by approximating) any type of artwork, giving a fixed bits per texel, including photographic images.
PNG does not approximate the image, so if the image contains little redundancy it will hardly compress at all. On the other hand, a uniform image, e.g. an illustration, will compress best with PNG.
It's apples and oranges.
Place more than one frame tiled onto a single image and blit the subrectangles of the texture. This will dramatically reduce your memory consumption.
If your images are, say, 64x64, then you can place 256 of them on a 1024x1024 texture in a 16x16 arrangement.
With a little effort, images do not need to be all the same size, just so long as you keep track in the code of the rectangle in the texture that each image is at.
This is how iPhone game developers do it.
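The bookkeeping for that is simple. A tiny sketch of the lookup, using the 16x16 grid of 64x64 frames from the example above (pure arithmetic, so the same idea carries over to whatever language the game uses):

```python
# Sub-rectangle lookup for a 16x16 grid of 64x64 frames in one 1024x1024 atlas.
FRAME_W, FRAME_H = 64, 64
COLS = 1024 // FRAME_W          # 16 frames per row

def frame_rect(index):
    """Return (x, y, width, height) in pixels for frame `index` (0..255)."""
    col = index % COLS
    row = index // COLS
    return (col * FRAME_W, row * FRAME_H, FRAME_W, FRAME_H)

print(frame_rect(0))    # (0, 0, 64, 64)
print(frame_rect(17))   # (64, 64, 64, 64)
```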
I agree with Will. There is no real question here. I read the question three times, but I still don't know what Sankar wants to know. It's just a complaint, not a question.
The only thing I can advise: don't use PVRTC if you don't have to. It offers a performance gain and saves VRAM, but it won't help you in this case, because what you want is simply to reduce the game's download size, not to trade off performance against quality.