Multi-Channel EXR Files - image-processing

I am relatively new to computer vision and image processing. I have a single EXR file with 7 channels: channels 1-3 give me the RGB values, 4-6 give me the surface normals encoded as RGB values, and the 7th channel contains the depth from the camera of the rendered image. Is there any way to view these 7 channels separately? For example, I would like an image showing only the depth values in grayscale. So far I haven't found a multi-channel EXR viewer that does this for me.
Thanks for the help!
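(Since no viewer was at hand, the channels can also be pulled out programmatically. Below is a minimal Python sketch using the OpenEXR/Imath bindings plus NumPy and imageio; the file name and the channel names "R", "G", "B" and "Z" are assumptions, so print the header first to see what your renderer actually wrote.)

```python
# Minimal sketch: list the channels of an EXR and save the depth channel as
# an 8-bit greyscale preview.  Channel names ("Z" etc.) and the file name are
# assumptions -- check the printed header for the real ones.
import OpenEXR
import Imath
import numpy as np
import imageio.v2 as imageio

exr = OpenEXR.InputFile("render.exr")          # hypothetical file name
print(list(exr.header()["channels"].keys()))   # the 7 channel names

dw = exr.header()["dataWindow"]
width = dw.max.x - dw.min.x + 1
height = dw.max.y - dw.min.y + 1
pt = Imath.PixelType(Imath.PixelType.FLOAT)

def read_channel(name):
    """Read one channel as a float32 array of shape (height, width)."""
    return np.frombuffer(exr.channel(name, pt), dtype=np.float32).reshape(height, width)

depth = read_channel("Z")                      # adjust to your depth channel's name

# Normalise to 0-255 and write a greyscale image of the depth values.
d = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)
imageio.imwrite("depth_gray.png", (d * 255).astype(np.uint8))
```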

Related

How to create a more natural-looking GIF with dithering from video frames or an array of UIImages?

Here is the link (https://imgplay.zendesk.com/hc/en-us/articles/360029411991-What-is-GIF-Dithering-Option-) where it says "When you save the file as GIF with dithering, it can make your GIF more natural."
How can I implement dithering to create a more natural-looking GIF from UIImages or video frames using Objective-C or Swift?
Assuming your source image is 8-bit per channel RGB, you could use vImage. vImage doesn't have a vImageConvert_RGB88toIndexed8, but you can split your interleaved image into three 8-bit planar buffers for RGB. I don't know exactly how well this would work, but you could convert two of the three channels to Indexed2 with vImageConvert_Planar8toIndexed2 and the other channel to Indexed4 with vImageConvert_Planar8toIndexed4. That would give you the required 8-bit lookup table.
Apple have loads of Accelerate sample code projects here. Optimising Image-Processing Performance discusses converting interleaved images to planar format. If you have a known palette, Applying Color Transforms to Images with a Multidimensional Lookup Table may be a solution to quantising your image to 256 colors.
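(The answer above is about vImage on iOS. Purely to illustrate the same idea of quantising to a 256-colour palette with error-diffusion dithering, here is a Python/Pillow sketch; it is not the vImage route, the frame file names and timing are made up, and it assumes Pillow 9.1 or newer for the enum constants.)

```python
# Illustration only: adaptive 256-colour palette + Floyd-Steinberg dithering
# with Pillow, then assembling the frames into a GIF.
from PIL import Image

frame_paths = ["frame_000.png", "frame_001.png", "frame_002.png"]  # hypothetical
frames = [Image.open(p).convert("RGB") for p in frame_paths]

# Dithering spreads the quantisation error to neighbouring pixels, which is
# what makes the reduced-palette GIF look more natural than a flat mapping.
paletted = [
    f.convert("P", palette=Image.Palette.ADAPTIVE, colors=256,
              dither=Image.Dither.FLOYDSTEINBERG)
    for f in frames
]

paletted[0].save(
    "out.gif",
    save_all=True,
    append_images=paletted[1:],
    duration=100,   # milliseconds per frame
    loop=0,
)
```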

Converting images captured via active IR illumination to standard RGB images, for inference through a convolutional neural network

I am building and training a CNN for a binary classification task. I have extracted images (frames) from a labelled video database I had. The database claims its videos were recorded under active IR illumination, and the frames I have extracted have 3-channel information.
The resulting trained model would be deployed on an embedded board, which would take a video feed from a standard RGB USB camera and work on it frame by frame.
Question PART-1:
Now correct me if I am wrong, but I am concerned: since the data distribution of actively IR-illuminated video should differ from that of a standard RGB feed, would this model classify frames from RGB images with the same precision?
Note 1: Although the videos in the database look 'greyscale' (due to the visible grey tone of the video, perhaps because of the active IR illumination), upon processing they were found to contain 3-channel information.
Note 2: The per-pixel differences between the three channel values are considerably larger in normal RGB images than in the frames extracted from the database.
For example, in a normal RGB image, if you pick any pixel at random, the values of the three channels can differ substantially from each other, something like (128, 32, 98) or (34, 209, 173), etc. (Look at the difference between the values in the three channels.)
In the frames extracted from the videos of my database, the three channel values of a pixel DO NOT vary as much as they do in regular RGB images; they are something along the lines of (112, 117, 109), (231, 240, 235) or (32, 34, 30), etc. I suppose this is because the videos are generally grey-ish, similar to a black-and-white filter, but not exactly black and white.
Question PART-2:
Would it be fair to convert the RGB images to greyscale and duplicate the single channel twice, essentially making it a three-channel image?
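(To make Note 2 and Part 2 concrete, here is a small OpenCV/NumPy sketch; it assumes 8-bit BGR frames on disk and the file name is a placeholder.)

```python
# Sketch assuming 8-bit BGR frames; the file name is a placeholder.
import cv2
import numpy as np

frame = cv2.imread("frame_0001.png")             # shape (H, W, 3), dtype uint8

# Note 2: how far apart are the three channel values at each pixel?
spread = frame.max(axis=2).astype(np.int16) - frame.min(axis=2).astype(np.int16)
print("mean per-pixel channel spread:", spread.mean())   # close to 0 for grey-ish IR frames

# Part 2: greyscale conversion with the single channel replicated three times.
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
gray3 = cv2.merge([gray, gray, gray])            # equivalently np.repeat(gray[..., None], 3, axis=2)
print(gray3.shape)                               # (H, W, 3), all three channels identical
```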
Part 1: the neural net will perform best on the more contrasted channels, and a model trained on one type of image will perform poorly on the other.
Part 2: an RGB image has three channels. It would be nonsense to make the channels equal and throw away the good information.
Most probably, your IR images are not grayscale; they are packed as an RGB image for viewing. As the three channels are very similar to each other, the colors are very desaturated, i.e. nearly gray.
And sorry to say, capturing three IR channels is of little use.

How can I convert a depth image with 32-bit color depth to PCD or PLY with PCL?

I am trying to get a point cloud from a 32-bit depth image from a HoloLens, but I am having a hard time because I do not have much information about it. Do I need the camera parameters to get a point cloud from the depth image? Is there a way to convert it with PCL or OpenCV?
I have added a comment and a picture. I can finally get the point cloud using the depth image from the HoloLens, but I converted the 32-bit depth image to grayscale and feel that the depth sensor alone has a lot of distortion. To compensate for this, I think we need to find a way to undistort and filter the depth image.
Do you have any other information about this?
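(Regarding the camera-parameters question: yes, you need the intrinsics to back-project each pixel. Below is a NumPy sketch of the standard pinhole back-projection that writes an ASCII PLY readable by PCL or MeshLab; the depth file and the fx, fy, cx, cy values are placeholders, not actual HoloLens calibration.)

```python
# Pinhole back-projection of a depth image to a point cloud.  The intrinsics
# below are placeholders, not HoloLens calibration values.
import numpy as np

depth = np.load("depth.npy").astype(np.float32)   # hypothetical H x W depth in metres
H, W = depth.shape
fx, fy, cx, cy = 525.0, 525.0, W / 2.0, H / 2.0   # placeholder pinhole intrinsics

u, v = np.meshgrid(np.arange(W), np.arange(H))    # pixel coordinates
x = (u - cx) * depth / fx
y = (v - cy) * depth / fy
pts = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
pts = pts[pts[:, 2] > 0]                          # drop invalid / zero-depth pixels

# Minimal ASCII PLY that PCL, MeshLab or Open3D can load.
with open("cloud.ply", "w") as f:
    f.write("ply\nformat ascii 1.0\n")
    f.write(f"element vertex {len(pts)}\n")
    f.write("property float x\nproperty float y\nproperty float z\nend_header\n")
    np.savetxt(f, pts, fmt="%.4f")
```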

JPEG: remove chroma, keep only luma, without recompressing?

Is it possible to strip the chroma information from a JPEG file without any loss on the luma?
Ideally I'd like a smaller file-size, greyscale version of an existing and optimized image.
Assuming the scans are not interleaved, you could update the SOF marker to declare a single component and then delete the 2nd and 3rd scans (the chroma components) from the stream.
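(Not the marker surgery described above, but as a practical route: jpegtran's lossless -grayscale transform drops the chroma while leaving the luma coefficients untouched. A trivial Python wrapper sketch, assuming jpegtran from libjpeg/libjpeg-turbo is installed and with placeholder file names:)

```python
# Wrapper around jpegtran's lossless grayscale transform; file names are
# placeholders and jpegtran must be on the PATH.
import subprocess

subprocess.run(
    ["jpegtran", "-grayscale", "-copy", "none",
     "-outfile", "out_gray.jpg", "in.jpg"],
    check=True,
)
```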

GPUImage : YUV or RGBA impact on performance?

I'm working on some still image processing, and GPUImage is a really awesome framework (thank you Brad Larson!).
I understand that:
some filters can be done with only 1 component. In this case, the image should be YUV (YCbCr), and we use only Y (luma = the image grey level).
other filters need all the color information from the 3 components: R, G and B.
YUV -> RGB conversion is provided (in GPUVideoCamera); RGB -> YUV may be hard-coded into the fragment shader (e.g. GPUImageChromaKeyFilter).
I have many image-processing steps, some which can be based on YUV, others on RGB.
Basically, I want to mix RGB and YUV filters, so my general question is this:
What is the cost / information loss of such successive conversions, and would you recommend any particular design?
Thanks!
(PS: what is the problem with the iPhone 4's YUV -> RGB conversion and the AVCaptureStillImageOutput pixel format?)
The use of YUV in GPUImage is a fairly new addition, and something I'm still experimenting with. I wanted to pull in YUV to try to improve filter performance, reduce memory usage, and possibly increase color fidelity. So far, my modifications have only achieved one of these three.
I pull in YUV frames from the camera and then decide what to do with them at subsequent stages in the filter pipeline. If all of the filters that the camera input targets only want monochrome inputs, the camera input will send only the unprocessed Y channel texture down the pipeline. If any of the filters need RGB input, the camera input will perform a shader-based YUV -> RGB conversion.
For filters that take in monochrome, this can lead to a significant performance boost with the elimination of the RGB conversion phase (done by AV Foundation when requesting BGRA data, or in my conversion shader), as well as a redundant conversion of RGB back to luminance. On an iPhone 4, the performance of the Sobel edge detection filter running on 720p frames goes from 36.0 ms per frame with RGB input to 15.1 ms using the direct Y channel. This also avoids a slight loss of information due to rounding from converting YUV to RGB and back to luminance. 8-bit color channels only have so much dynamic range.
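(As a rough, GPUImage-independent illustration of that rounding loss: the NumPy sketch below round-trips random 8-bit colours through full-range BT.601 YUV and back, then recomputes luminance; the coefficients are the standard ones and may differ slightly from GPUImage's shader constants.)

```python
# Quantifies the Y -> RGB -> luminance rounding loss with full-range BT.601
# coefficients (these may not match GPUImage's exact shader constants).
import numpy as np

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(100_000, 3)).astype(np.float32)
R, G, B = rgb[:, 0], rgb[:, 1], rgb[:, 2]

# RGB -> YUV, quantised to 8 bits (roughly what the camera hands over).
Y = np.round(0.299 * R + 0.587 * G + 0.114 * B)
U = np.round(-0.168736 * R - 0.331264 * G + 0.5 * B + 128)
V = np.round(0.5 * R - 0.418688 * G - 0.081312 * B + 128)

# YUV -> RGB in a shader, again quantised to 8 bits.
R2 = np.clip(np.round(Y + 1.402 * (V - 128)), 0, 255)
G2 = np.clip(np.round(Y - 0.344136 * (U - 128) - 0.714136 * (V - 128)), 0, 255)
B2 = np.clip(np.round(Y + 1.772 * (U - 128)), 0, 255)

# Luminance recomputed by a monochrome filter from the converted RGB.
Y2 = np.round(0.299 * R2 + 0.587 * G2 + 0.114 * B2)

err = np.abs(Y2 - Y)
print("fraction of pixels whose luma changed:", (err > 0).mean())
print("max luma error (8-bit codes):", err.max())
```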
Even when using RGB inputs, the movement of this conversion out of AV Foundation and into my shader leads to a performance win. On an iPhone 4S, running a saturation filter against 1080p inputs drops from 2.2 ms per frame to 1.5 ms per frame with my conversion shader instead of AV Foundation's built-in BGRA output.
Memory consumption is nearly identical between the two RGB approaches, so I'm experimenting with a way to improve this. For monochrome inputs, memory usage drops significantly due to the smaller texture size of the inputs.
Implementing an all-YUV pipeline is a little more challenging, because you would need to maintain parallel rendering pathways and shaders for the Y and UV planes, with separate input and output textures for both. Extracting planar YUV from RGB is tricky, because you'd need to somehow pull two outputs from one input, something that isn't natively supported in OpenGL ES. You'd need to do two render passes, which is fairly wasteful. Interleaved YUV444 might be more practical as a color format for a multistage pipeline, but I haven't played around with this yet.
Again, I'm just beginning to tinker with this.
