How to add a colour palette to b/w image? - image-processing

I have noticed that in some tile-based game engines, tiles are saved as grayscale or sometimes even black and white, and the colour is then added by storing a 'palette' along with the image to apply to certain pixels. However, I've never seen how it knows which pixels.
Just to name a few engines I've seen use this: Notch's Minicraft and the old Pokemon games for the Gameboy. This is what informed me of how a colour palette is used in old games: deconstructulator
From the little I've seen of people using this technique in tutorials, it uses a form of bit-shifting. I'd like to know how that was so efficient that it was next to mandatory on old 8-bit consoles - how is it possible to apply red, green and blue to specific pixels of an image every frame instead of saving the whole coloured image? (Some pseudo-code would be nice.)

The efficient thing about it is that it saves memory. Storing RGB values usually requires 24 bits (8 bits per channel), while with a palette of 256 colors (requiring 256 * 24 bits = 768 bytes) each pixel needs just 8 bits (2^8 = 256 colors). So three times as many pixels can be stored in the same amount of memory (not counting the palette itself), but with a limited set of colors, obviously. This used to be supported in hardware, so the graphics memory could be used more efficiently too (it is actually still supported in modern PC hardware, but almost never used since graphics memory isn't that limited anymore).
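As a rough sketch of the idea (illustrative Python with made-up palette values): the image stores small indices, and swapping the palette recolours everything without touching the pixel data.

# A palette of RGB tuples; the "image" stores one small index per pixel
# instead of a full 24-bit colour.  Purely illustrative values.
palette = [
    (0, 0, 0),        # index 0: black
    (255, 255, 255),  # index 1: white
    (136, 192, 112),  # index 2: light green
    (48, 104, 80),    # index 3: dark green
]

indexed_pixels = [0, 1, 2, 3, 2, 1, 0, 3]          # one byte (or less) per pixel
rgb_pixels = [palette[i] for i in indexed_pixels]  # expanded only when drawing

# Swapping the palette recolours the whole image without touching the pixels.
night_palette = [(0, 0, 32), (96, 96, 160), (32, 64, 96), (16, 32, 64)]
rgb_night = [night_palette[i] for i in indexed_pixels]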
Some hardware (including the Super Gameboy supported by the first Pokemon games) used more than one hardware palette at a time. Different sets of tiles are mapped to different palettes, but the way tiles are mapped to palettes is very hardware dependent and often not very straightforward.
The way the bits of the image are stored isn't always so straightforward either. If the pixels are 8 bits, it can be an array of bytes where every byte is simply one pixel (as in the classic VGA mode used in many old DOS games). But the Gameboy, for example, uses 2 bits per pixel, and others used 4, that is 2^4 = 16 colors per palette. A common way to arrange the bits in memory is to use bitplanes, that is (in the case of 16-color graphics) storing 4 separate b/w images. The bitplanes can in some cases also be interleaved in different ways. So there is no simple and generic answer to how to decode/encode graphics this way, and I guess you have to be more specific about what you want pseudo-code for.
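As one concrete example of the bitplane idea, here is a rough Python sketch of decoding a Gameboy-style tile, where each row of 8 pixels is stored as two bytes (a low-bit-plane byte followed by a high-bit-plane byte):

def decode_gb_tile(tile_bytes):
    """Decode 16 bytes of Gameboy-style tile data into an 8x8 grid of
    2-bit palette indices.  Each row is a pair of bytes: the first holds
    the low bit of every pixel, the second the high bit."""
    rows = []
    for y in range(8):
        low, high = tile_bytes[2 * y], tile_bytes[2 * y + 1]
        row = []
        for x in range(8):
            bit = 7 - x                    # leftmost pixel is the highest bit
            lo = (low >> bit) & 1
            hi = (high >> bit) & 1
            row.append((hi << 1) | lo)     # 0..3, an index into a palette
        rows.append(row)
    return rows

# 16 bytes -> one tile; the indices are mapped to colours through a palette later.
tile = decode_gb_tile(bytes([0x3C, 0x7E] + [0x42, 0x42] * 6 + [0x3C, 0x7E]))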
And I'm not sure how any of this applies to Minicraft. Maybe just how the graphics are stored on disk, but that has no significance once the graphics are loaded into graphics memory. (Maybe you have some other feature of Minicraft in mind?)

Related

How to get color depth of DICOM pixel data in reliable way?

A DICOM file may have either uncompressed or compressed pixel data. Its PhotometricInterpretation (0028,0004) can be MONOCHROME1, MONOCHROME2, RGB, PALETTE COLOR, YBR, etc. There is also a Pydicom page about color space.
But from these pages, or any other DICOM websites, it is not clear to me how to get the color depth.
Does either the BitsAllocated (0028,0100) or the BitsStored (0028,0101) tag refer to the color depth? Can the color depth be different from these two tag values?
Bits Stored is the number of bits that are used for the actual color or grayscale data, so it is at least related to the color depth. Bits Allocated is always a multiple of 8, as the data is always organized in bytes, where some of the upper bits may not be used for data (with the exception of single-bit data, where Bits Allocated is 1).
Getting the bit depth is not as straightforward as it may seem. While the number of bits used for the data can mostly be determined, the resolution of the data (i.e. the distance between adjacent values) may also depend on the Photometric Interpretation, and of course on the resolution provided by the modality itself.
The easiest case is monochrome data (Photometric Interpretation is MONOCHROME1 or MONOCHROME2), where the color depth is directly defined by Bits Stored, typical values being 12, 14 or 16. The same is mostly true for RGB data (e.g. data originally recorded as RGB), and while it is true that Bits Stored can have different values for JPEG2000-encoded images, as correctly mentioned by #kritzel_sw, I have yet to see any RGB data with Bits Stored different from 8. Update: I still haven't seen this, but found that RTDOSE images can have 32 Bits Stored.
For color data in the YBR color space (Photometric Interpretation is YBR_xxx) this is less clear. It somewhat depends on your definition of color depth. Given that the used color space is YBR instead of RGB, and that the number of bits used for each component may be different (for example in YBR_FULL_422, which is used for some JPEG-compressed images, two channels are downsampled), the resulting image, if converted into RGB (which is mostly done), uses 8 bits for each color component, but the actual number of possible values is less than 256 for that reason. So if your definition of color depth depends on the number of bits used per RGB channel, the answer would probably be 8 in this case, but if you define the color depth per YBR channel, the answer could be different and depends both on the Photometric Interpretation and on Bits Stored.
A special case is the Photometric Interpretation of PALETTE COLOR, where the possible colors are defined in the color table. In this case, the number of colors per color component is defined in the first value of the Palette Color Lookup Table Descriptor (0028,1101-1104), which is equal for all 3 tables (i.e. for the Red, Green and Blue components). The actual color depth has to be derived from that value.
Given all that, the answer is probably: it depends. I'll also add the note by #kritzel_sw that many of the IODs significantly limit the degrees of freedom in how pixel data is encoded, which narrows down the possibilities for the color depth for any concrete type of image.
I'm interested if anybody has a more straightforward answer.
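For what it's worth, here is a minimal pydicom sketch for inspecting the tags discussed above (assuming the pydicom package is available; the file path is a placeholder):

import pydicom

ds = pydicom.dcmread("image.dcm")  # placeholder path

print("Photometric Interpretation:", ds.PhotometricInterpretation)
print("Bits Allocated:", ds.BitsAllocated)
print("Bits Stored:", ds.BitsStored)

if ds.PhotometricInterpretation == "PALETTE COLOR":
    # First value of the descriptor: number of entries in each LUT
    # (a value of 0 conventionally means 65536 entries).
    print("Palette entries:", ds.RedPaletteColorLookupTableDescriptor[0])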

Why Microsoft rescaled their HSL ranges to [0-240]?

I'm studying digital image processing at university, and right now we're doing an assignment where we have to convert RGB values to HSL, and the result has to match Microsoft Paint. When I finished the implementation I saw that it did not match Paint, and after searching I found that they don't use the [0-360] range, as in the usual implementation, but a [0-240] range.
Source
But even after reading it, I couldn't understand why they do this.
There are several reasons.
We really prefer ranges with just 256 values, because then a value fits in a single byte. We could use two bytes, but that doubles the memory usage; using more than 8 bits without a full second byte means doing bit operations on every pixel before doing anything interesting, which is very inefficient. Note: with two bytes we could use half-floats, which are very useful in some contexts, but floating point is also inefficient for various calculations (though it helps if you do filters and layer compositing).
The second reason is stated in the link you included in the question, but it may need some more explanation: many formats (especially for video or television) use a limited range (not 0 to 255, but 16 to 235 or 16 to 240). There are a few reasons for this: the analog signal could carry "blacker than black" (which may be physically meaningless) and "whiter than white" (which is realistic). TVs were calibrated to define the white and black points, and broadcasters could filter out values below black. This was useful for several reasons (converting analog data, and whiter-than-white is something we experience a lot). With digitalization, TVs needed analog-to-digital converters (and the inverse). These converters cost money (more bits: more cost), and the range outside the "visible colours" was already being used in the analog signal for other, non-colour purposes. So instead of having different converters depending on the kind of data (or just more bits per converter), the limited range was kept.
For every colour operation, one should check the range (and the definition of the colour space). Quantization may also be a problem (much professional software uses floating point, so the range is less of a limitation; there are other reasons for floating point too, e.g. precision and efficiency when working in linear space, without gamma correction). Note: what matters is only the ratio of a value to its maximum value. Angles have various units: 360 degrees, 400 gradians, or 2*Pi radians (the most common), and in the past values were sometimes written as scaled degrees (e.g. 3600, 36000, or 21600 minutes of arc). User interfaces may simply show a different convention (more convenient for users).
And to make things more complex, the simple HSL conversion is not exact if you want H, S, and L in their stricter definitions; it is just a quick shortcut.
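To illustrate the 240 scaling itself, here is a small Python sketch using the standard colorsys module (the exact rounding convention is an assumption and may differ slightly from what Paint does):

import colorsys

def rgb_to_windows_hsl(r, g, b):
    """Convert 8-bit RGB to Windows-style HSL, where hue, saturation and
    luminance are all expressed on a 0-240 scale instead of hue in 0-360."""
    # colorsys works on floats in [0, 1] and returns (h, l, s) in [0, 1]
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    # Scale a full hue turn to 240 units instead of 360 (240 wraps back to 0)
    return round(h * 240) % 240, round(s * 240), round(l * 240)

print(rgb_to_windows_hsl(255, 0, 0))  # pure red -> (0, 240, 120), as Paint shows it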
Short Answer
The reason for a hue range of 240 is only about fitting the value in a single 8-bit integer, a remnant from a bygone era.
Longer Answer
Regarding the accepted answer: it has nothing at all to do with digital video code values, nothing to do with AtoD converters, nor the other comments on the purpose of video range, etc.
The use of a hue with 240 values instead of 360 dates back to when computers did not have math coprocessors (i.e. possibly before the OP was born, considering he's at university now, but I date myself LOL). Without a math coprocessor, doing anything in floating point was a significant performance bottleneck. Moreover, low CPU speed, limited RAM, etc... back then much was done to be efficient in ways that might not make sense today.
Remember we are talking about decades before it was possible to have a hand-held computer connected to all the world's knowledge that also happened to double as a phone, camera, map with your current location, music player, speech to text interpreter, video production system, and luminance adjustable flashlight...
LOL 😂
Back in the late 80s/early 90s, 8bit integer math was how almost everything was done, and going out of that 8bit integer zone was... was uphill... both ways!! ....
Reducing 360 to 240 keeps that value in 8bit, and has the advantage that it is exactly 2/3rds of 360.
Since there are three primaries in an RGB monitor, HSL breaks things down accordingly. If you think of sRGB as a "cube" with one corner as black, one corner as white, then you have 6 corners as the max values of full saturation, such as #f00 for red and #0ff for aqua.
HSL is not a perceptually uniform color model — it is just a low-tech way to make color adjustments "intuitive" for the user.
Side Note
Color (hue) does not exist as an "angle". Color is not actually real; there are simply wavelengths of visible light, from longer than 700nm (red) to shorter than 400nm (blue). The sensation of color is just a perception, a function of your neurological system.
Nevertheless, mapping those onto a circle makes it fairly intuitive. So in any application where color values are defined as an angle, it is purely a construct of convenience.
Speaking of convenience, my hand held computer that is connected to all the world's knowledge is beeping at me to remind me it's time to feed my cat. And it's not even a beep like we had in the 80s. It's a stereo harp sound with some tubular bells in the background...
Edit to add:
The "uphill both ways" is an old person joke that is common here in the US, and that I find myself making more and more often, at my age, LOL... I don't know how well it tranlates to EU be honest...

Apple Metal MPSImage memory layout

I'm having trouble working with the underlying memory of an MPSImage. I've been using the getBytes and replace methods on MPSImage's texture member variable to read and write the underlying data. The problem is I can't find documentation of how the memory is interpreted as an image (i.e. how the rows, columns, and channels are laid out). Part of what complicates the issue is that regardless of the number of feature channels, the data is stored as a stack of RGBA texture slices, with some channels possibly left unused. For instance, with 3 feature channels, there will be one RGBA texture slice, and one channel's worth of space will be left unused.
The problem is, how is the MPSImage data actually arranged within the texture? It seems more complicated than I originally would have guessed.
After much experimentation, it seems like the data is arranged differently depending on whether the number of feature channels is <= 4 or > 4. But I'm still having trouble figuring it out.
Can anyone explain the MPSImage data layout to me?
The first four feature channels are encoded as they would be for a standard RGBA texture. Feature channel 0 is in the "R" position, feature channel 1 is in the "G" position and so forth.
The next four feature channels are present as the next slice in a texture2d_array. If you have a 100x100 image with 20 feature channels, this will be encoded as a 100x100 texture array with (20/4=) 5 slices in the array.
To make matters more complicated, you can have MPSImages with multiple images in them, each with more than 4 feature channels. This is frequently referred to as batching. The second image is found in the texture array immediately after the first image. If we have multiple 100x100x20 images in the MPSImage, then the second one starts at slice 5, the third one at slice 10, and so forth.
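Here is a small Python sketch of the indexing implied by that layout (function and parameter names are just illustrative):

import math

def mpsimage_slice(image_index, feature_channel, channels_per_image):
    """Locate a feature channel in the texture2d_array backing an MPSImage.
    Returns (slice, component), where component 0..3 corresponds to the
    R, G, B, A positions within that slice.  Illustrative only, based on
    the layout described above."""
    slices_per_image = math.ceil(channels_per_image / 4)
    slice_index = image_index * slices_per_image + feature_channel // 4
    component = feature_channel % 4
    return slice_index, component

# A batch of 100x100 images with 20 feature channels: 5 slices per image.
print(mpsimage_slice(image_index=1, feature_channel=7, channels_per_image=20))  # -> (6, 3)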

Most Efficient way of Multi-Texturing - iOS, OpenGL ES2, optimization

I'm trying to find the most efficient way of handling multi-texturing in OpenGL ES2 on iOS. By 'efficient' I mean the fastest rendering even on older iOS devices (iPhone 4 and up) - but also balancing convenience.
I've considered (and tried) several different methods. But have run into a couple of problems and questions.
Method 1 - My base and normal values are RGB with NO ALPHA. For these objects I don't need transparency. My emission and specular information are each only one channel. To reduce texture2D() calls I figured I could store the emission as the alpha channel of the base map, and the specular as the alpha of the normal map, with each combination in its own file.
My problem so far has been finding a file format that will support a full non-premultiplied alpha channel. PNG just hasn't worked for me. Every way that I've tried to save this as a PNG premultiplies the .alpha with the .rgb on file save (via Photoshop), basically destroying the .rgb. Any pixel with a 0.0 alpha has a black rgb when I reload the file. I posted that question here with no activity.
I know this method would yield faster renders if I could work out a way to save and load this independent 4th channel. But so far I haven't been able to and had to move on.
Method 2 - When that didn't work I moved on to a single 4-way texture where each quadrant has a different map. This doesn't reduce texture2D() calls but it reduces the number of textures that are being accessed within the shader.
The 4-way texture does require that I modify the texture coordinates within the shader. For model flexibility I leave the texcoords as is in the model's structure and modify them in the shader like so:
v_fragmentTexCoord0 = a_vertexTexCoord0 * 0.5;
v_fragmentTexCoord1 = v_fragmentTexCoord0 + vec2(0.0, 0.5); // illumination frag is up half
v_fragmentTexCoord2 = v_fragmentTexCoord0 + vec2(0.5, 0.5); // shininess frag is up and over
v_fragmentTexCoord3 = v_fragmentTexCoord0 + vec2(0.5, 0.0); // normal frag is over half
To avoid dynamic texture lookups (Thanks Brad Larson) I moved these offsets to the vertex shader and keep them out of the fragment shader.
But my question here is: Does reducing the number of texture samplers used in a shader matter? Or would I be better off using 4 different smaller textures here?
The one problem I did have with this was bleed-over between the different maps. A texcoord of 1.0 was averaging in some of the blue normal pixels due to linear texture filtering. This added a blue edge on the object near the seam. To avoid it I had to change my UV mapping to not get too close to the edge, and that's a pain to do with very many objects.
Method 3 would be to combine methods 1 and 2, and have the base.rgb + emission.a on one side and the normal.rgb + specular.a on the other. But again, I still have the problem of getting an independent alpha saved in a file.
Maybe I could save them as two files but combine them during loading before sending it over to openGL. I'll have to try that.
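For what it's worth, here is a rough sketch of that load-time merge using Pillow in Python (the file names are placeholders; on iOS the same merge would be done with whatever image loader you already use):

from PIL import Image

# Merge an RGB base map and a single-channel emission map into one RGBA
# buffer at load time, so the alpha channel carries emission instead of
# transparency.  File names are placeholders.
base = Image.open("base_rgb.png").convert("RGB")
emission = Image.open("emission_gray.png").convert("L")

r, g, b = base.split()
rgba = Image.merge("RGBA", (r, g, b, emission))

pixel_data = rgba.tobytes()  # ready to upload as a single GL_RGBA texture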
Method 4 - Finally, in a 3D world, if I have 20 different panel textures for walls, should these be individual files or all packed into a single texture atlas? I recently noticed that at some point Minecraft moved from an atlas to individual textures - albeit they are 16x16 each.
With a single model and by modifying the texture coordinates (which I'm already doing in method 2 and 3 above), you can easily send an offset to the shader to select a particular map in an atlas:
v_fragmentTexCoord0 = u_texOffset + a_vertexTexCoord0 * u_texScale;
This offers a lot of flexibility and reduces the number of texture bindings. It's basically how I'm doing it in my game now. But IS IT faster to access a small portion of a larger texture and have the above math in the vertex shader? Or is it faster to repeatedly bind smaller textures over and over? Especially if you're not sorting objects by texture.
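For reference, here is a rough Python sketch of how the u_texOffset and u_texScale values for a grid atlas could be computed on the CPU side (assuming equally sized tiles and model UVs running from 0.0 to 1.0):

def atlas_offset_scale(tile_index, tiles_per_row, tiles_per_col):
    """Compute the (u_texOffset, u_texScale) pair for one tile in a grid
    atlas, matching the vertex-shader line above.  Assumes every tile is
    the same size."""
    scale = (1.0 / tiles_per_row, 1.0 / tiles_per_col)
    col = tile_index % tiles_per_row
    row = tile_index // tiles_per_row
    offset = (col * scale[0], row * scale[1])
    return offset, scale

# Tile 5 in a 4x4 atlas of wall panels:
print(atlas_offset_scale(5, 4, 4))  # -> ((0.25, 0.25), (0.25, 0.25))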
I know this is a lot. But the main question here is: what's the most efficient method considering speed + convenience? Will method 4 be faster for multiple textures, or would multiple rebinds be faster? Or is there some other way that I'm overlooking? I see all these 3D games with a lot of graphics and area coverage. How do they keep frame rates up, especially on older devices like the iPhone 4?
**** UPDATE ****
Since I've suddenly had 2 answers in the last few days, I'll say this: basically I did find the answer. Or AN answer. The question is which method is more efficient, meaning which method will result in the best frame rates. I've tried the various methods above, and on the iPhone 5 they're all just about as fast. The iPhone 5/5S has an extremely fast GPU. Where it matters is on older devices like the iPhone 4/4S, or on larger devices like a Retina iPad. My tests were not scientific and I don't have ms timings to report. But 4 texture2D() calls to 4 RGBA textures were actually just as fast or maybe even faster than 4 texture2D() calls to a single texture with offsets. And of course I do those offset calculations in the vertex shader and not the fragment shader (never in the fragment shader).
So maybe someday I'll do the tests and make a grid with some numbers to report. But I don't have time to do that right now and write a proper answer myself. And I can't really checkmark any other answer that isn't answering the question cause that's not how SO works.
But thanks to the people who have answered. And check out this other question of mine that also answered some of this one: Load an RGBA image from two jpegs on iOS - OpenGL ES 2.0
Have a post-process step in your content pipeline where you merge your RGB texture with the alpha texture and store it in a .ktx file, either when you package the game or as a post-build event when you compile.
It's a fairly trivial format, and it would be simple to write a command-line tool that loads 2 PNGs and merges them into one KTX, RGB + alpha.
Some benefits of doing that are:
- Less CPU overhead when loading the file at game startup, so the game starts quicker.
- Some GPUs do not natively support a 24-bit RGB format, which forces the driver to internally convert it to 32-bit RGBA. This adds more time to the loading stage and temporary memory usage.
Now, once you have the data in a texture object, you do want to minimize texture sampling, as it means a lot of GPU operations and memory accesses depending on the filtering mode.
I would recommend having 2 textures with 2 layers each. The issue with packing all of them into the same texture is potential artifacts when you sample with bilinear filtering or mipmapping, since the sample may include neighbouring pixels close to the edge where one texture layer ends and the next begins, or if you decide to have mipmaps generated.
As an extra improvement I would recommend not putting raw 32-bit RGBA data in the KTX, but actually compressing it into a DXT or PVRTC format. This uses much less memory, which means faster loading times and fewer memory transfers for the GPU, as memory bandwidth is limited.
Of course, adding the compressor to the post process tool is slightly more complex.
Do note that compressed textures do lose a bit of quality, depending on the algorithm and implementation.
Silly question but are you sure you are sampler limited? It just seems to me that, with your "two 2-way textures" you are potentially pulling in a lot of texture data, and you might instead be bandwidth limited.
What if you were to use 3 textures [ BaseRGB, NormalRBG, and combined Emission+Specular] and use PVRTC compression? Depending on the detail, you might even be able to use 2bpp (rather than 4bpp) for the BaseRGB and/or Emission+Specular.
For the Normals I'd probably stick to 4bpp. Further, if you can afford the shader instructions, only store the R&G channels (putting 0 in the blue channel) and re-derive the blue channel with a bit of maths. This should give better quality.
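To spell out that "bit of maths": the blue channel is re-derived from the unit-length constraint of the normal. A rough sketch of that reconstruction (shown in Python for clarity; in practice it would live in the fragment shader):

import math

def reconstruct_normal(r, g):
    """Rebuild a tangent-space normal from the stored R and G channels.
    The texture stores x and y remapped from [-1, 1] to [0, 1]; z is
    re-derived from the unit-length constraint x^2 + y^2 + z^2 = 1."""
    x = r * 2.0 - 1.0
    y = g * 2.0 - 1.0
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return x, y, z

print(reconstruct_normal(0.5, 0.5))  # a flat-facing normal -> (0.0, 0.0, 1.0)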

Understanding just what is an image

I suppose the simplest understanding of what a (bitmap) image is would be an array of pixels. After that, it gets pretty technical.
I've been trying to understand the sort of information that an image may provide and have come across a large collection of technical terms like "mipmap", "pitch", "stride", "linear", "depth", as well as other format-specific things.
These seem to pop up across a lot of different formats, so it would probably be useful to understand what purpose they serve in an image. Looking at the DDS, BMP, PNG, TGA and JPG documentation has only made it clear that an image is pretty confusing.
Despite searching around for some hours, I haven't found any nice tutorial-like breakdown of just what an image is and all of its different properties.
The eventual goal would be to take proprietary image formats and convert them to more common formats like DDS or BMP. Or to make up some image format.
Any good readings?
Even your simplified explanation of an image doesn't encompass all the possibilities. For example, an image can be divided into planes, where the red pixel values are all together, followed by the green pixel values, followed by the blue pixel values. Such layouts are uncommon but still possible.
Assuming a simple layout of pixels you must still determine the pixel format. You might have a paletted image where some number of bits (1, 4, or 8) will be an index into a palette or color table which will define the RGB color of the pixel along with the transparency of the pixel (one index will typically be reserved as a transparent pixel). Otherwise the pixel will be 3 or 4 bytes depending on whether a transparency or alpha value is included. The order of the values (R,G,B) or (B,G,R) will depend on the format - Windows bitmaps are B,G,R while everything else will most likely be R,G,B.
The stride is the number of bytes between rows of the image. Windows bitmaps for example will take the width of the image times the number of bytes per pixel and round it up to the next multiple of 4 bytes.
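For example, a quick sketch of that stride rule for Windows bitmaps:

def bmp_stride(width_pixels, bytes_per_pixel):
    """Bytes per row for a Windows bitmap: the row size rounded up to the
    next multiple of 4 bytes."""
    return (width_pixels * bytes_per_pixel + 3) // 4 * 4

print(bmp_stride(101, 3))  # 101 * 3 = 303 bytes of pixel data -> stride 304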
I've never heard of DDS, and BMP is only common in the Windows world (and there's a lot more computing in the non-Windows world than you might think). Rather than worry about all of the technical details of this, why not just use an existing toolkit such as ImageMagick, which can already batch convert from dozens of formats to your one common format?
Unless you're doing specialized work where you would need something fancy like HDR (which most image formats don't even support, so most of your sources would not have it in the first place), you're probably best off picking something standard like PNG or JPG. They both have plusses and minuses. You might want to support both of those depending on the image.

Resources