I am interested in flipping a few bits in the image-data section of popular image formats such as JPG, TIFF, PNG, and HEIC. Let's consider an example.
Given an image as a byte array, the first few bytes represent the header section, say [0 to N].
Then a few more bytes contain metadata such as EXIF, say [N+1 to M].
Then many more bytes contain the image pixels, compressed according to some compression algorithm, say [M+1 to X].
Lastly, there is a tail section, [X+1 to Z], where Z is the length of the given byte array.
I am interested in flipping a few bits in the bytes from [M+1 to X]. I am assuming that these bytes contain nothing except the compressed bits of the image, so that when bits are flipped, any image viewer will still work without any loss of image quality.
I need recommendations for Java or Python libraries that can parse the image and give me the indices M+1 and X.
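For PNG specifically, the offsets can be found without a third-party library, because the file is a fixed 8-byte signature followed by chunks of the form length, type, payload, CRC. The sketch below is only an illustration under that layout: the function names idat_ranges, make_chunk, and tiny_png are hypothetical, and the tiny 1x1 image exists only to make the example self-contained. Note that each chunk's CRC covers its payload, so any flipped bit in an IDAT payload also requires recomputing that chunk's CRC.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def idat_ranges(png):
    """Return (start, end) offsets, end exclusive, of every IDAT payload."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    ranges = []
    pos = 8  # first chunk starts right after the signature
    while pos + 8 <= len(png):
        # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        data_start = pos + 8
        if ctype == b"IDAT":
            ranges.append((data_start, data_start + length))
        pos = data_start + length + 4  # skip payload and CRC
        if ctype == b"IEND":
            break
    return ranges


def make_chunk(ctype, data):
    """Hypothetical helper: build one PNG chunk with a valid CRC."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


def tiny_png():
    """Hypothetical helper: a 1x1 red RGB PNG used only for the demo."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
    raw = b"\x00\xff\x00\x00"  # filter byte 0 + one RGB pixel
    return (PNG_SIGNATURE
            + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", zlib.compress(raw))
            + make_chunk(b"IEND", b""))


if __name__ == "__main__":
    for start, end in idat_ranges(tiny_png()):
        print("IDAT payload bytes:", start, "to", end - 1)
```

A PNG may contain several IDAT chunks, which is why the function returns a list of ranges rather than a single pair. JPEG, TIFF, and HEIC use different container layouts and would each need their own walker.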
Thanks for reading and helping out in advance.
Best
Related
I need to iterate over the pixels of a YUV NV12 buffer and set color. I think the conversion for NV12 format should be easy but I can't figure it out. If I could set the top 50x50 pixels at 0,0 to white, I'd be set. Thank you in advance.
Have you tried setting the first 1.5 bytes (12 bits per pixel in NV12) * number of pixels to all 0x00 or all 0xFF?
Since you really don't seem to care about the conversion simply overwriting the buffer would suffice. If that works, you can tackle the other problems, like finding the right color and producing a rect instead of a line.
For the first, you need to understand the YUV coding: https://wiki.videolan.org/YUV#NV12. According to this document you will most likely need to overwrite values in the Y range and in the UV range, so you write at two different locations. That's quite different from an RGB buffer, where all color components of a pixel sit next to each other. So you can start by overwriting the first byte (8 bits) in the Y range and the first U and V bytes in the UV range. In NV12 each U/V pair is shared by a 2x2 block of pixels, so this changes the brightness of one pixel and the color of its block.
Finally you can tackle the display of the 50x50 rectangle. You'll need to know the image dimensions, because you'll need to offset after each row (if the buffer is transmitted by rows!). E.g., this graph:
.------.
|xx    |
|xx    |
|      |
'------'
In an RGB color space, with values transmitted in row-major order, the buffer would look like this (one character per pixel): xx0000xx0000000000. So you would need to overwrite bytes 0-5 and bytes 18-23. Because: the first range is 2 pixels * 3 bytes (RGB). The next range starts at row number (1) * image width (6) * 3 bytes (RGB) = byte 18, and so on. You have to apply the same thinking to the YUV color space.
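Applied to NV12, the same row-offset arithmetic could look like the sketch below. It is only an illustration under some assumptions: the buffer is tightly packed (row stride equals the width), width and height are even, and white is taken as Y=235, U=V=128 (video range; use Y=255 for full range). The function name fill_rect_nv12 is hypothetical.

```python
def fill_rect_nv12(buf, width, height, x0, y0, w, h, y=235, u=128, v=128):
    """Fill a rectangle in a tightly packed NV12 buffer, in place.

    Layout assumed: width*height luma bytes, followed by
    (height/2) rows of interleaved U,V bytes, each row `width` bytes long.
    """
    # Luma plane: one byte per pixel, one slice per row.
    for row in range(y0, y0 + h):
        start = row * width + x0
        buf[start:start + w] = bytes([y]) * w

    # Chroma plane: one (U, V) pair per 2x2 block of pixels.
    uv_base = width * height
    first_pair = x0 // 2
    n_pairs = (x0 + w + 1) // 2 - first_pair
    for crow in range(y0 // 2, (y0 + h + 1) // 2):
        start = uv_base + crow * width + first_pair * 2
        buf[start:start + 2 * n_pairs] = bytes([u, v]) * n_pairs


if __name__ == "__main__":
    width, height = 64, 64
    frame = bytearray(width * height * 3 // 2)  # all-zero test frame
    fill_rect_nv12(frame, width, height, 0, 0, 50, 50)
    print(frame[0], frame[width * height], frame[width * height + 1])
```

Because chroma is subsampled, a rectangle with odd coordinates or size also recolors the neighbouring pixels that share its edge blocks.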
While reading the JPEG spec, I learned that during encoding the image is first broken into 8x8 blocks, and then the DCT and other steps are applied.
So I am curious: how would an image (raw file) containing a single row get encoded using JPEG?
Would JPEG add 7 extra rows to the file so that it can be broken into 8x8 blocks?
A very nice explanation is given in https://dsp.stackexchange.com/questions/35339/jpeg-dct-padding
From Baseline JPEG:
The image is partitioned into blocks of size 8x8.
Each block is then independently transformed using the 8x8 DCT. If the image dimensions are not exact multiples of 8, the blocks on the lower and right hand boundaries may be only partially occupied. These boundary blocks must be padded to the full 8x8 block size and processed in an identical fashion to every other block. The compressor is free to select the value used to pad partial boundary blocks.
In JPEG compression, images that are not multiples of the MCU size are padded upwards to that size.
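The padding rule can be expressed as rounding each dimension up to the next multiple of the MCU size. The sketch below is only that arithmetic, not an encoder; the function name padded_size is hypothetical, and the 8x8 default assumes no chroma subsampling (with 4:2:0 subsampling the MCU is typically 16x16).

```python
def padded_size(width, height, mcu_w=8, mcu_h=8):
    """Round image dimensions up to whole MCUs (ceiling division)."""
    padded_w = -(-width // mcu_w) * mcu_w
    padded_h = -(-height // mcu_h) * mcu_h
    return padded_w, padded_h


if __name__ == "__main__":
    # A single-row image 100 pixels wide is processed as 104x8.
    print(padded_size(100, 1))
    # 1920x1080 with 16x16 MCUs is processed as 1920x1088.
    print(padded_size(1920, 1080, 16, 16))
```

So a single-row image is indeed padded with 7 extra rows for encoding, while the header still records the true height so that a decoder crops the padding away.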
The task at hand is to split a raw BGR image into N equal sub-images. Can someone give me a hint on how raw BGR images are stored in memory?
For example:
If I have a 1920 * 1080 pixel BGR image and would like to split it into 8 equal parts, is there any framework that can help me? I'm trying to write native C++ code on Android, and working with OpenCV would be expensive. Is there any other alternative?
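If the buffer is packed BGR in row-major order with no row padding (an assumption, since the question does not state the layout), then splitting into horizontal strips needs no framework: each strip is one contiguous byte range. The sketch below shows the offset arithmetic in Python; the function name split_rows is hypothetical, and the same arithmetic carries over directly to pointer offsets in C++.

```python
def split_rows(buf, width, height, parts, bytes_per_pixel=3):
    """Split a packed row-major image buffer into equal horizontal strips.

    Returns zero-copy memoryview slices, one per strip.
    """
    if height % parts:
        raise ValueError("height is not divisible by the number of parts")
    stride = width * bytes_per_pixel      # bytes per image row
    strip_bytes = (height // parts) * stride
    view = memoryview(buf)
    return [view[i * strip_bytes:(i + 1) * strip_bytes] for i in range(parts)]


if __name__ == "__main__":
    image = bytes(1920 * 1080 * 3)        # dummy all-black BGR frame
    strips = split_rows(image, 1920, 1080, 8)
    print(len(strips), len(strips[0]))    # 8 strips of 135 rows each
```

Splitting into vertical tiles is different: the bytes of a tile are not contiguous, so each tile row has to be copied separately or addressed with a stride.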
Whenever I read a 3-channel color image via cv::imread, its data alignment is a bit awkward (neither a byte nor an integer), which slows me down when I read a single pixel's data in GPU memory. The logic behind the cv::Mat class's alignment also seems to differ from what I had initially thought. It does not add an extra byte between two pixels in a row so that each pixel starts on a 4-byte boundary; rather, it pads some extra bytes at the END of each row so that every row starts on a 4-byte boundary.
What should I do to pack each pixel's data into a single unsigned integer? Is there a built-in method in OpenCV so that I do not have to pack each pixel one by one with logical OR operations?
Kind Regards.
You can convert the pixel format from BGR to BGRA
See this example.
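In OpenCV itself the conversion is a single call to cv::cvtColor with the cv::COLOR_BGR2BGRA code. To show what that does to the memory layout, the sketch below performs the same repacking on a plain byte buffer in Python without OpenCV; the function names bgr_to_bgra and as_uint32 are hypothetical, and a tightly packed buffer on a little-endian machine is assumed.

```python
import struct


def bgr_to_bgra(bgr, alpha=255):
    """Expand packed 3-byte BGR pixels to 4-byte BGRA pixels."""
    n = len(bgr) // 3
    out = bytearray(n * 4)
    out[0::4] = bgr[0::3]             # blue
    out[1::4] = bgr[1::3]             # green
    out[2::4] = bgr[2::3]             # red
    out[3::4] = bytes([alpha]) * n    # constant alpha
    return bytes(out)


def as_uint32(bgra):
    """Reinterpret each 4-byte pixel as one little-endian unsigned integer."""
    return list(struct.unpack("<%dI" % (len(bgra) // 4), bgra))


if __name__ == "__main__":
    pixels = bytes([1, 2, 3, 4, 5, 6])   # two BGR pixels
    packed = as_uint32(bgr_to_bgra(pixels))
    print([hex(p) for p in packed])
```

After the conversion every pixel occupies exactly 4 bytes, so a GPU kernel can load it as one 32-bit value.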
I am using libpng to convert raw image data (3 channels, 8 bits, no metadata) to PNG and store it in a buffer. My problem is allocating the right amount of buffer space for the PNG data. It is clear to me that the compressed data might be larger than the raw data (cf. the overhead for a 1x1 image).
Is there any general rule for an upper margin of the compressed data size with respect to the image size and the different filtering/compression options? If that is too generic, let's say we use PNG_COLOR_TYPE_RGB, PNG_INTERLACE_NONE, PNG_COMPRESSION_TYPE_DEFAULT, PNG_FILTER_TYPE_DEFAULT.
Thank you
PNG overhead is 8 (signature) + 25 (IHDR) + 12 (first IDAT) + 12 (IEND), plus 1 byte per row (filter byte), plus 12 bytes per additional IDAT when the size exceeds the zlib buffer size, which is typically 8192. Zlib overhead is 6 (2-byte header and 4-byte checksum). Deflate overhead is 5 bytes plus 5 bytes per additional 32k in size.
So figure (1.02 * (3*W+1) * H) + 68.
You can decrease the 1.02 factor if you use a larger Zlib buffer size or increase it if you use a smaller buffer size. For example, a 256x256 RGB PNG compressed with a 1000000-byte buffer size (1000000 bytes per IDAT chunk) will have only one IDAT chunk and the total overhead will be around 330 bytes, or less than .2 percent, while if you compress it with a very small buffer size, for example 100 bytes, then there will be around 2000 IDAT chunks and the overhead will be about twelve percent.
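As a quick way to apply the rule of thumb above, the sketch below evaluates (1.02 * (3*W+1) * H) + 68 for a given image size. It assumes 8-bit samples and the default buffer size behind the 1.02 factor; the function name png_size_estimate is hypothetical, and the result is the answer's estimate, not a bound guaranteed by libpng.

```python
import math


def png_size_estimate(width, height, channels=3, factor=1.02):
    """Estimate the worst-case PNG size in bytes for 8-bit samples."""
    # One filter byte per row plus the pixel bytes of that row.
    raw = (channels * width + 1) * height
    # 68 = 57 bytes of PNG chunks + 6 bytes zlib + 5 bytes deflate.
    return math.ceil(factor * raw) + 68


if __name__ == "__main__":
    print(png_size_estimate(256, 256))
    print(png_size_estimate(1, 1))
```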
See RFC-1950, RFC-1951, and RFC-2083.
You can use compressBound() in zlib to determine an upper bound on the size of the compressed data given an uncompressed data length, assuming the default zlib settings. For a specific set of different zlib settings, you can use deflateBound() after deflateInit2() has been used to establish the settings.
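Python's zlib module does not expose compressBound(), so the sketch below reproduces the expression that zlib's compressBound() is understood to use for default settings. Treat that expression as an assumption and check it against the source of your zlib version. The function name compress_bound is hypothetical. The demo compares the value with the actual compressed size of incompressible data, which is close to the worst case.

```python
import random
import zlib


def compress_bound(source_len):
    """Upper bound on zlib-compressed size for default settings.

    Assumed to mirror zlib's compressBound(); verify against your zlib.
    """
    return (source_len + (source_len >> 12) + (source_len >> 14)
            + (source_len >> 25) + 13)


if __name__ == "__main__":
    width, height = 64, 64
    raw_len = (3 * width + 1) * height    # filtered scanlines of an RGB PNG
    rng = random.Random(0)
    data = bytes(rng.getrandbits(8) for _ in range(raw_len))
    compressed = zlib.compress(data)
    print("raw:", raw_len)
    print("compressed:", len(compressed))
    print("bound:", compress_bound(raw_len))
```

The bound covers only the zlib stream; the PNG signature and chunk overhead described above still have to be added on top.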