I have a camera that outputs 160 fps of 1024x1280 pixels in 8-bit grayscale (256 colors).
I need to encode this live without any loss.
What's the best codec for this?
I can code this in either python or c++, and have lots of cores, so parallelization is an option.
Thank you
Motion JPEG-2000 supports lossless and gray scale.
ffv1 https://github.com/FFmpeg/FFV1/blob/master/ffv1.md is another common option for lossless.
Your uncompressed data rate is 160 fps * 1024 * 1280 bytes, which is about 210 MB/s.
I am guessing 50% compression, so you end up with about 100 MB/s of compressed video.
This should be a doable I/O rate for an SSD.
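The arithmetic can be checked with a few lines of Python. The 50% ratio is only the guess made above, not a measured value, and the variable names are illustrative.

```python
# Raw data rate for the camera described in the question.
WIDTH, HEIGHT = 1280, 1024      # pixels per frame
BYTES_PER_PIXEL = 1             # 8-bit grayscale
FPS = 160

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
raw_rate = frame_bytes * FPS    # bytes per second

ASSUMED_RATIO = 0.5             # guess from the answer, not a measurement
compressed_rate = raw_rate * ASSUMED_RATIO

print(f"frame size:      {frame_bytes / 1e6:.2f} MB")
print(f"raw rate:        {raw_rate / 1e6:.1f} MB/s")
print(f"compressed rate: {compressed_rate / 1e6:.1f} MB/s (assumed 50%)")
```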
Concerning the CPU: I suggest a naïve parallelization where you run one video compressor per core. You then have to do some scheduling, pipelining, and re-sorting of the output frames.
So if you have a 16 (32) core CPU, each core needs to handle 10 (5) fps, which sounds pretty reasonable.
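A minimal sketch of that idea in Python, under some loud assumptions: zlib stands in for a real lossless video codec (it is not FFV1 or JPEG 2000), the frames are synthetic, and `encode_frame` and `encode_stream` are hypothetical names. Threads are used because, as far as I know, CPython's zlib releases the GIL while compressing; a real codec binding may behave differently.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 1280, 1024

def encode_frame(raw: bytes) -> bytes:
    """Losslessly compress one frame (zlib is a stand-in for a real codec)."""
    return zlib.compress(raw, 1)

def encode_stream(frames, workers: int = 4):
    """Compress frames in parallel and yield the results in capture order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so no manual
        # re-sorting of finished frames is needed here.
        yield from pool.map(encode_frame, frames)

# Synthetic frames: each is a flat gray level, so they are easy to tell apart.
frames = [bytes([i]) * (WIDTH * HEIGHT) for i in range(8)]
encoded = list(encode_stream(frames))
decoded = [zlib.decompress(e) for e in encoded]
print(decoded == frames)  # lossless round trip, order preserved
```

For a live camera you would also want to bound the number of frames in flight, since `map` submits its whole input up front.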
Related
I'm trying to capture an image using AVFoundation and save it as JPEG with maximum quality. I'm aware of AVVideoCodecJPEG and AVVideoQualityKey for AVFoundation capture, but they don't produce the output I'm looking for. Namely, I need the final JPEG to use 4:4:4 chroma subsampling. Instead, even with quality set to 1.0 (max), AVFoundation produces an image with 4:2:0 chroma subsampling. I tried capturing an image in BGRA32 format and compressing it later with VideoToolbox: same result. The only way I can get the desired output is by using the CGImageDestinationCreateWithData method, but it's too slow: 0.4 sec for a 12 MPx image (if I set quality to less than 1.0, the image can be compressed in 0.1 sec for the same size, but then I get 4:2:0 chroma subsampling again).
Has anyone ever succeeded in producing a JPEG with 4:4:4 chroma subsampling using AVFoundation or VideoToolbox? Or is there any way to accelerate the CGImageDestinationCreateWithData method? Thanks in advance
Ok, there seems to be no support for 4:4:4 chroma sub-sampling using hardware accelerated JPG compression.
I'm testing different compressions for my spritesheets for a game on iOS. Surprisingly, I get higher memory (RAM) use with PVR 2 bits with alpha than with a PNG 32 (RGBA 4444). I would say consumption is 25% higher with PVR 2 bits than with PNG 32 once the spritesheets are loaded into memory. I'm using Instruments in Xcode to check memory use on a physical device (iPad Air 2).
I'm using TexturePacker to generate my spritesheets.
I've read everywhere that PVR 2 or 4 consumes much less memory than PNG 32. How is this possible?
Edit:
This is strange because, according to my observations, PVRTC 4 bits RGBA uses a lot more memory (RAM) than PNG 32, nearly 3 times more according to Instruments in Xcode. PVRTC 2 bits RGBA is 25% higher than PNG 32 RGBA 4444. I'm talking about live RAM consumption, not size on disk, which is unrelated and is not a problem. It seems iOS manages PVR differently than it's supposed to, especially when loading them into RAM.
Edit2:
My textures are 2048x2048; they are POT and square. Everything works fine, except that RAM consumption is much higher than it should be. I run all my tests on a physical iPad Air 2 connected to my Mac with a USB cable. I use Instruments in Xcode to follow RAM consumption live. I've solved the RAM consumption problem by switching to a PNG 8 bits (indexed) format with the texture size halved (1024x1024). I scale by 2 in code to recover a normal-size texture. RAM consumption dropped to 240 MB (PNG 8 bits indexed) instead of 950 MB (PVR 2 bits RGBA). My game is a video puzzle game (with 8-second video loops at 15 fps) and uses a lot of sprites (43 spritesheets in each puzzle generated by TexturePacker, around 130 sprites in each spritesheet).
It is my understanding that PVR textures, made with texturetool, are simply compressed images. Therefore the difference lies in the file size.
Frankly, the file size doesn't interest me. What I want to know is, can a PVR texture consume less RAM than a normal .PNG texture? Or does this depend entirely on the texture format (like RGBA8888 etc)?
The essential question would be:
Given X.png and X.pvr, if I display both with texture format RGBA8888, will one consume less RAM than the other?
Yes, the PVR will consume less RAM at all stages: it's unpacked live by the GPU as it's accessed. There's no intermediate decompression.
A PVR-like approach used in digital video is that instead of storing RGB at every pixel, you convert to YUV, then store Y at every pixel and U and V only twice per four-pixel block. So at 8 bits per sample you go from 96 bits for the block (128 if each pixel is padded to 32 bits) to 64 bits. To get back to RGB, the outputter reads the exact Y and interpolates or takes the nearest U and V as necessary.
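To make the bit count for the four-pixel block concrete, here is a small tally assuming 8 bits per sample (an assumption; the answer does not state the sample depth). The 128-bit figure corresponds to pixels padded to 32 bits each; tightly packed RGB would be 96 bits.

```python
BITS_PER_SAMPLE = 8   # assumed sample depth
PIXELS = 4            # one four-pixel block

rgb_bits = PIXELS * 3 * BITS_PER_SAMPLE         # R, G, B at every pixel
padded_rgb_bits = PIXELS * 4 * BITS_PER_SAMPLE  # 32-bit pixels (RGBX/RGBA)
yuv_bits = (PIXELS + 2 + 2) * BITS_PER_SAMPLE   # 4 Y samples, 2 U, 2 V

print(rgb_bits, padded_rgb_bits, yuv_bits)
```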
Schemes like PVR do a similar thing of not storing the full value at every pixel but inferring parts of it from nearby context. What counts as nearby is chosen to match however the caching is arranged on that GPU. It's usually more in-depth than just scaling down the sampling resolution of some of the channels, e.g. specifying a base offset for samples and then using a tiny precision for each is also common.
So the GPU can always get a value for pixel X by reading only values in a very small, local region of the data.
This contrasts with traditional schemes like PNG where having to know every pixel in the stream prior to X is acceptable if it improves the compression. Processing such things live would flood the GPU's memory bandwidth and hence be completely impractical, so such textures are decompressed from disk and then uploaded.
So schemes like PVR tend to lead to poorer compression and lower per-pixel quality but the win is that they can sit in VRAM compressed. A game will often increase the resolution of its textures if using PVR to try to find a comfortable balance.
Uncompressed textures are:
16 bit per pixel (RGB565, RGBA4444),
24 bit per pixel (RGB888)
PVRTC textures are either 4bpp or even 2bpp. So yes they do use less memory.
They also perform better because they need less memory bandwidth to fetch textures.
I have just downloaded the latest win32 jpegtran.exe from http://jpegclub.org/jpegtran/ and observed the following:
I have prepared a 24 BPP JPEG test image with 14500 x 10000 pixels.
Compressed size in the file system is around 7.5 MB.
Decompressing into memory (with some image viewer) inflates it to around 450 MB.
Monitoring the jpegtran.exe command line tool's memory consumption during a lossless rotation (180), I can see the process consuming up to 900 MB of memory!
I would have assumed that such JPEG lossless transformations don't require decoding the image file into memory and instead would just perform some mathematical transformations on the encoded file itself, keeping the memory footprint very low.
So which of the following is true?
some bug in this particular tool's implementation
some configuration switch i have missed
some misunderstanding at my end (i.e. jpeg lossless transformations also need to decode the image into memory?)
the "mathematical operations" consuming even more memory than "decoding the image into memory"
Edit:
According to the answer by JasonD, the reason seems to be the latter one. So I'll extend my question:
Are there any implementations that can do those operations in small chunks (to avoid high memory usage)? Or does it always need to be done on the whole image, with no way around it?
PS:
I'm not planning to implement my own codec / algorithm. Instead, I'm asking whether there are any implementations out there that meet my requirements, or whether there could be in theory, at least.
I don't know about the library in question, but in order to perform a lossless rotation on a jpeg image, you would at least have to decompress the DCT coefficients in order to rotate them, and then re-compress.
The DCT coefficients, fully expanded, will be the same size or larger than the original image data, as they have more bits of information.
It's lossless, because the loss in a jpeg is caused by quantization of the DCT coefficients. So long as you don't decode/re-encode/re-quantize these, no loss will be incurred.
But it will be memory intensive.
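A rough estimate shows why the number lands near 900 MB. This sketch assumes 16-bit storage per coefficient and no chroma subsampling; both are assumptions about the test image and the tool, and `coefficient_bytes` is a hypothetical helper, not part of any library.

```python
import math

def coefficient_bytes(width, height, components=3, bytes_per_coef=2):
    """Memory needed to hold every DCT coefficient of an image at once.

    Assumes every component is stored at full resolution and that each
    coefficient takes bytes_per_coef bytes.
    """
    blocks_w = math.ceil(width / 8)    # 8x8 blocks per row
    blocks_h = math.ceil(height / 8)   # 8x8 blocks per column
    return blocks_w * blocks_h * 64 * components * bytes_per_coef

pixel_bytes = 14500 * 10000 * 3              # decoded 24 bpp image
coef_bytes = coefficient_bytes(14500, 10000)

print(f"decoded pixels:   {pixel_bytes / 1e6:.0f} MB")
print(f"DCT coefficients: {coef_bytes / 1e6:.0f} MB")
```

Under these assumptions the coefficients take about twice the space of the decoded pixels, which is in the same range as the figures reported in the question.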
jpeg compression works very roughly as follows:
Convert image into YCbCr colour space.
Optionally downsample some of the channels (colour error is less perceptible than luminance error, so it is typical to 2x downsample the chroma channels). This is obviously lossy, but very predictably/stably so.
Transform 8x8 blocks of the image by a discrete cosine transform (DCT), moving the image into frequency space. The DCT coefficients are also in an 8x8 block, and use more bits for storage than the 8-bit image data did.
Quantize the DCT coefficients by a variable amount (this is the quality setting in most packages). The aim is to produce as many small and especially zero coefficients as possible. This is the main "lossy" aspect of jpeg compression.
Zig-zag through the 2D data to turn it into a 1D stream of coefficients which is roughly in frequency order. High frequencies are more likely to be zeroed out, so many blocks will ideally end in a run of zeros which can be truncated.
Compress (non-lossily) the (now quite compressible) data using huffman encoding.
So a 'non-lossy' transformation would want to avoid doing as much as possible of that - especially anything beyond the DCT quantization, but that does not avoid expanding the data.
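The zig-zag step from the list above can be sketched as follows. This is a simplified illustration: the block values are made up, and a real JPEG encoder uses run-length coding with an end-of-block symbol rather than simply dropping the trailing zeros.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zig-zag scan order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals; the direction alternates on each diagonal.
    return sorted(
        cells,
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def scan_and_truncate(block):
    """Flatten a quantized block in zig-zag order and drop trailing zeros."""
    flat = [block[r][c] for r, c in zigzag_order(len(block))]
    while flat and flat[-1] == 0:
        flat.pop()
    return flat

# Hypothetical quantized block: only a few low-frequency values survive.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 52, -3, 2, 1

print(scan_and_truncate(block))  # [52, -3, 2, 0, 1]
```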
For iPhone game development, I switched from PNG format to PVRTC format for the sake of performance. But PVRTC compression is creating files that are much bigger than the PNG files. A PNG of 140 KB (1024x1024) gets bloated to 512 KB or more in the PVRTC format. I read somewhere that a PNG file of 50 KB got compressed to some 10 KB, but in my case it's the other way around.
Is there any reason why it happens this way, and how can I avoid it? If PVRTC compression is blindly doing a 4bpp conversion (1024x1024x0.5) irrespective of the transparency in the PNG, then what compression are we achieving here?
I have hundreds of these 1024x1024 images in my game, as there are numerous characters each doing some complex animations. At this rate of 512 KB per image, my app would be more than 50 MB, which is unacceptable for my customer (with PNG, I could have got my app down to 10 MB).
In general, uncompressed image data is either 24bpp (RGB) or 32bpp (RGBA) at a flat rate. PVRTC is 4bpp (or 2bpp) at a flat rate, so the compression is 6 or 8 (12 or 16) times compared to this.
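The flat-rate sizes for a 1024x1024 texture can be tabulated directly. This sketch ignores file headers and mipmaps, and `texture_bytes` is a hypothetical helper.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Size of a fixed-rate texture, ignoring headers and mipmaps."""
    return width * height * bits_per_pixel // 8

formats = [("RGBA8888", 32), ("RGB888", 24), ("PVRTC 4bpp", 4), ("PVRTC 2bpp", 2)]
for name, bpp in formats:
    size = texture_bytes(1024, 1024, bpp)
    print(f"{name:11s} {size // 1024:5d} KB")
```

The 4bpp row comes out at 512 KB, which matches the file size reported in the question.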
A requirement for graphics hardware to use textures natively is that the format of the texture must be random accessible for the hardware. PVRTC is this kind of format, PNG is not and this is why PNG can achieve greater compression ratios. PVRTC is a runtime, deployment format; PNG is a storage format.
PVRTC compression is carried out on 4x4 blocks of pixels at a time and at a flat bit rate so it is easy to calculate where in memory to retrieve the data required to derive a particular texel's value from and there is only one access to memory required. There is dedicated circuitry in the graphics core which will decode this 4x4 block and give the texel value to your shader/texture combiner etc.
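Why a flat bit rate makes lookup easy can be shown with a generic fixed-rate block format. This sketch assumes blocks stored in plain row-major order, which is an assumption for illustration only and not a description of PVRTC's actual memory layout; `block_offset` is a hypothetical name.

```python
BLOCK_W = BLOCK_H = 4
BYTES_PER_BLOCK = 8   # 16 texels * 4 bits per texel

def block_offset(x, y, texture_width):
    """Byte offset of the block holding texel (x, y), row-major blocks assumed."""
    blocks_per_row = texture_width // BLOCK_W
    bx, by = x // BLOCK_W, y // BLOCK_H
    return (by * blocks_per_row + bx) * BYTES_PER_BLOCK

print(block_offset(0, 0, 1024))        # first block
print(block_offset(5, 0, 1024))        # second block in the first row
print(block_offset(1023, 1023, 1024))  # last block of a 512 KB texture
```

The address is pure arithmetic on the texel coordinates; nothing before the block has to be read or decoded, which is what PNG cannot offer.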
PNG compression does not work at a flat bitrate and is more complicated to retrieve specific values from; memory needs to be accessed from multiple locations in order to retrieve a single colour value and far more memory and processing would be required every single time a texture read occurs. So it's not suitable for use as a native texture format and this is why your textures must be decompressed before the graphics hardware will use them. This increases bandwidth use when compared to PVRTC, which requires no decompression for use.
So for offline storage (the size of your application on disk), PNG is smaller than PVRTC which is smaller than completely uncompressed. For runtime memory footprint and performance, PVRTC is smaller and faster than PNG which, because it must be decompressed, is just as large and slow as uncompressed textures. You might gain some advantage with PNG at initialisation for disk access, but then you'd lose time for decompression.
If you want to reduce the storage footprint of PVRTC you could try zip-style compression on the texture files and expand these when you load from disk.
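A minimal sketch of that suggestion using Python's zlib. The buffer below is a synthetic stand-in, not real PVRTC data, and how much a real texture shrinks depends entirely on its content; `pack_texture` and `unpack_texture` are hypothetical names.

```python
import zlib

def pack_texture(raw: bytes) -> bytes:
    """Compress texture bytes for storage on disk."""
    return zlib.compress(raw, 9)

def unpack_texture(packed: bytes) -> bytes:
    """Restore the original texture bytes at load time."""
    return zlib.decompress(packed)

# Synthetic stand-in for a 512 KB texture payload (all zeros, so it
# compresses far better than real texture data would).
fake_texture = bytes(512 * 1024)

packed = pack_texture(fake_texture)
restored = unpack_texture(packed)
print(len(fake_texture), len(packed), restored == fake_texture)
```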
PVRTC (PowerVR Texture Compression) is a texture compression format. On devices using PowerVR e.g. most higher end mobile phones including the iPhone and other ARM-based gadgets like the iPod it is very fast to draw since drawing it is hardware accelerated. It also uses much less memory since images are represented in their compressed form and decoded each draw, whereas a PNG needs to be decompressed before being drawn.
PNG is lossless compression.
PVRTC is lossy compression, meaning it approximates the image. It has completely different design criteria.
PVRTC will 'compress' (by approximating) any type of artwork, giving a fixed bits per texel, including photographic images.
PNG does not approximate the image, so if the image contains little redundancy it will hardly compress at all. On the other hand, a uniform image, e.g. an illustration, will compress best with PNG.
It's apples and oranges.
Place more than one frame tiled onto a single image and blit the subrectangles of the texture. This will dramatically reduce your memory consumption.
If your images are, say, 64x64, then you can place 256 of them on a 1024x1024 texture in a 16x16 arrangement.
With a little effort, images do not need to be all the same size, just so long as you keep track in the code of the rectangle in the texture that each image is at.
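For the uniform-grid case, the rectangle for each frame can be computed from its index. This is a sketch with a hypothetical `sprite_rect` helper; for mixed sizes you would instead keep a table of rectangles, such as the one a packing tool exports.

```python
def sprite_rect(index, sprite_w=64, sprite_h=64, atlas_w=1024):
    """Return (x, y, width, height) of sprite number `index` in a grid atlas."""
    per_row = atlas_w // sprite_w
    col, row = index % per_row, index // per_row
    return (col * sprite_w, row * sprite_h, sprite_w, sprite_h)

print(sprite_rect(0))    # (0, 0, 64, 64)
print(sprite_rect(17))   # (64, 64, 64, 64)
print(sprite_rect(255))  # (960, 960, 64, 64)
```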
This is how iPhone game developers do it.
I agree with Will. There is no point in the question. I read the question 3 times, but I still don't know what Sankar wants to know. It's just a complaint, not a question.
The only advice I can give: don't use PVRTC if you mind using it. It offers a performance gain and saves VRAM, but it won't help you in this case, because what you want is just to reduce the size of the game, not to weigh a trade-off between performance and quality.