I am new to AI. When I try to get the summary info from my network, I get a RuntimeError:
What parameter should I pass, according to the error? Thanks.
Your first layer expects a 3-channel input (for images this usually means a color image with red, green, and blue channels), but your input seems to have only one channel (for images this usually means your input is a grayscale image rather than a color image).
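A minimal sketch of the two usual fixes, assuming a PyTorch model (the layer sizes and input shape here are made up):

import torch
import torch.nn as nn

# Option 1: change the first layer so it accepts 1-channel input
conv = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3)  # was in_channels=3

# Option 2: replicate the single gray channel three times so the input
# matches a first layer that expects 3 channels
gray = torch.randn(1, 1, 224, 224)        # (batch, channels, height, width)
rgb_like = gray.repeat(1, 3, 1, 1)        # now (1, 3, 224, 224)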
I happened to encounter the API cv2.COLOR_GRAY2RGB. I found it strange, because there should be no way to convert a grayscale image to an RGB image. So I tried something like this:
I took an image like this:
The image is shown by plt.imshow(img) (with default arguments).
Then I converted it to grayscale with cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) and got this:
I know it does not look grayscale because imshow() by default does not display single-channel images in gray (it applies a colormap, more like a heat map, I think). So I used cv2.cvtColor(img, cv2.COLOR_GRAY2RGB) and got this:
It appears gray to our eyes even though it now has three channels. So I conclude that cv2.COLOR_GRAY2RGB is a workaround to display a grayscale image in grayscale fashion without changing the settings of imshow().
Now my question is: when I use cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) again to convert this three-channel gray image back to one channel, the pixel values are exactly the same as the first time I converted the original image to one channel with cv2.cvtColor(img, cv2.COLOR_BGR2GRAY):
In other words, cv2.COLOR_BGR2GRAY can perform a many-to-one mapping. I wonder how that is possible.
The COLOR_BGR2GRAY color mode estimates a gray value for each pixel as a weighted sum of the B, G, and R channels, Y = w_R*R + w_G*G + w_B*B per pixel. So any 3-channel image becomes a 1-channel image.
The COLOR_GRAY2BGR color mode basically replaces all of the B, G, and R channels with the gray value Y, so B = Y, G = Y, R = Y. It converts a single-channel image to a multichannel one by replication. Since the weights sum to 1, converting that replicated image back with COLOR_BGR2GRAY gives w_R*Y + w_G*Y + w_B*Y = Y, which is why the round trip reproduces exactly the same gray values.
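A quick check of this round trip (a sketch; the file name is hypothetical):

import cv2
import numpy as np

img = cv2.imread('photo.jpg')                        # hypothetical file name
gray1 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)        # many-to-one: Y = w_R*R + w_G*G + w_B*B
gray3 = cv2.cvtColor(gray1, cv2.COLOR_GRAY2BGR)      # one-to-many: B = G = R = Y
gray2 = cv2.cvtColor(gray3, cv2.COLOR_BGR2GRAY)      # weights sum to 1, so Y comes back unchanged
print(np.array_equal(gray1, gray2))                  # expected: True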
More documentation about color modes is here.
I would like to use GPUImage's Histogram Equalization filter (link to .h) (link to .m) for a camera app. I'd like to use it in real time and present it as an option to be applied on the live camera feed. I understand this may be an expensive operation and cause some latency.
I'm confused about how this filter works. When selected in GPUImage's example project (Filter Showcase), the filter shows a very dark image that is biased toward red and blue, which does not seem to be the way equalization should work.
Also what is the difference between the histogram types kGPUImageHistogramLuminance and kGPUImageHistogramRGB? Filter Showcase uses kGPUImageHistogramLuminance but the default in the init is kGPUImageHistogramRGB. If I switch Filter Showcase to kGPUImageHistogramRGB, I just get a black screen. My goal is an overall contrast optimization.
Does anyone have experience using this filter? Or are there current limitations with this filter that are documented somewhere?
Histogram equalization of RGB images is done using the luminance, as equalizing the RGB channels separately would render the colour information useless.
You basically convert RGB to a colour space that separates colour from intensity information. Then you equalize the intensity image and finally convert it back to RGB.
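This is not GPUImage code, but the same idea sketched with OpenCV in Python (the file name is an assumption): convert to YCrCb, equalize only the luminance channel, and convert back:

import cv2

img = cv2.imread('photo.jpg')                        # hypothetical file name
ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)       # separate intensity (Y) from colour (Cr, Cb)
y, cr, cb = cv2.split(ycrcb)
y_eq = cv2.equalizeHist(y)                           # equalize only the intensity channel
result = cv2.cvtColor(cv2.merge([y_eq, cr, cb]), cv2.COLOR_YCrCb2BGR)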
According to the documentation: http://oss.io/p/BradLarson/GPUImage
GPUImageHistogramFilter: This analyzes the incoming image and creates an output histogram with the frequency at which each color value occurs. The output of this filter is a 3-pixel-high, 256-pixel-wide image with the center (vertical) pixels containing pixels that correspond to the frequency at which various color values occurred. Each color value occupies one of the 256 width positions, from 0 on the left to 255 on the right. This histogram can be generated for individual color channels (kGPUImageHistogramRed, kGPUImageHistogramGreen, kGPUImageHistogramBlue), the luminance of the image (kGPUImageHistogramLuminance), or for all three color channels at once (kGPUImageHistogramRGB).
I'm not very familiar with the programming language used, so I can't tell whether the implementation is correct. But in the end, colours should not change too much; pixels should just become brighter or darker.
Here's what I would like to do:
I have an image of a rice leaf. I have another image of a rice leaf that has brown spots on it. What I want to do is separate the color pixels that are not common to the two images using OpenCV (the color of the spots can vary).
I tried to do this using histogram intersection, but only managed to find the number of pixels common to the two images.
Is there any way to do this using OpenCV? Please be kind enough to help me.
If the 2 images match perfectly:
Use RhinoDevel's approach: loop through all pixels of the first image and compare each pixel with the corresponding pixel of the second image. If the difference is higher than a threshold, you have found a non-matching pixel; do with it what you need, for example add the pixel to some output map, or recolor the (brown) pixel to the color from the first image, or whatever. A sketch of this is shown below.
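A minimal sketch of this per-pixel comparison in Python with OpenCV (the file names and the threshold of 40 are assumptions):

import cv2
import numpy as np

img1 = cv2.imread('reference_leaf.jpg')            # hypothetical file names
img2 = cv2.imread('spotted_leaf.jpg')              # must be the same size as img1

diff = cv2.absdiff(img1, img2)                     # per-pixel difference
mask = diff.max(axis=2) > 40                       # threshold is arbitrary

out = np.zeros_like(img2)
out[mask] = img2[mask]                             # keep only the non-matching pixels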
If the 2 images do not match:
So you just have some reference leaf image, and the processed image can have any position/rotation/skew. Create a list of colors for each image, sort them ascending by color, and cross-compare both lists. If any color is in list2 but not in list1, then recolor/copy all pixels that contain such a color in/from image2 (see the sketch after this block). This approach is slower, O(xs*ys*n), where xs, ys is the image-2 resolution and n is the number of non-common colors.
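A sketch of the colour-list idea (exact colour equality; real images are noisy, so in practice you would quantise the colours or compare within a tolerance):

import cv2
import numpy as np

img1 = cv2.imread('reference_leaf.jpg')            # hypothetical file names
img2 = cv2.imread('spotted_leaf.jpg')

colors1 = set(map(tuple, img1.reshape(-1, 3)))     # colours present in image 1
uncommon = {c for c in map(tuple, img2.reshape(-1, 3)) if c not in colors1}

out = np.zeros_like(img2)
for y in range(img2.shape[0]):
    for x in range(img2.shape[1]):
        if tuple(img2[y, x]) in uncommon:          # colour exists only in image 2
            out[y, x] = img2[y, x]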
[Notes]
RGB is usually fine, but you may get better results in the HSV color space. In HSV you can compare all 3 parameters or just a few of them, for example ignoring the V value (see the sketch below).
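For example (a sketch, reusing the hypothetical file names above), comparing only H and S while ignoring V:

import cv2
import numpy as np

img1 = cv2.imread('reference_leaf.jpg')            # hypothetical file names
img2 = cv2.imread('spotted_leaf.jpg')

hsv1 = cv2.cvtColor(img1, cv2.COLOR_BGR2HSV)
hsv2 = cv2.cvtColor(img2, cv2.COLOR_BGR2HSV)

# compare only the H and S channels, ignoring V (brightness);
# note that hue is circular, which this simple difference ignores
diff = np.abs(hsv1[..., :2].astype(int) - hsv2[..., :2].astype(int))
mask = diff.max(axis=2) > 15                       # tolerance is arbitrary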
When given an image such as this:
And not knowing the color of the object in the image, I would like to be able to automatically find the best H, S and V ranges to threshold the object itself, in order to get a result such as this:
In this example, I manually found the values and thresholded the image using cv::inRange. The output I'm looking for is the best H, S and V ranges (min and max value for each, six integer values in total) to threshold the given object in the image, without knowing in advance what color the object is. I need to use these values later on in my code.
Keypoints to remember:
- All given images will be of the same size.
- All given images will have the same dark background.
- All the objects I'll put in the images will be of full color.
I could brute force over all possible combinations of the 6 HSV range values, threshold with each one, and find a clever way to figure out when the best blob was found (blob size, maybe?). That seems like a very cumbersome, slow, and highly ineffective solution, though.
What would be a good way to approach this? I did some research and found that OpenCV has some machine learning capabilities, but I need to have the actual 6 values at the end of the process, not just a thresholded image.
You could create a small 2-layer neural network for the task of dynamic HSV masking.
Steps:
- create/generate ground-truth annotations for the image and its HSV range for the required object
- design a small neural network with at least 1 conv layer and 1 fully connected layer (a rough sketch follows below)
- input: the mask of the image after applying the HSV range from the ground truth (m x n)
- output: an m x n binary mask of the image
- post-processing: multiply the mask with the original image to get the required object highlighted
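A rough sketch of the kind of small network described above, in PyTorch (one conv layer plus one fully connected layer; all sizes are assumptions):

import torch
import torch.nn as nn

H, W = 32, 32                                  # assumed fixed mask size

class HSVMaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8 * H * W, H * W)  # one logit per output pixel

    def forward(self, x):                      # x: (batch, 1, H, W) HSV-range mask
        x = torch.relu(self.conv(x))
        x = self.fc(x.flatten(1))
        return torch.sigmoid(x).view(-1, 1, H, W)  # per-pixel mask probabilities

net = HSVMaskNet()
out = net(torch.rand(1, 1, H, W))              # (1, 1, 32, 32), values in [0, 1]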
I'm writing an Android app in OpenCV to detect blobs. One task is to threshold the image to differentiate the foreground objects from the background (see image).
It works fine as long as the image is known and I can manually pass a threshold value to threshold() (in this particular image, say, 200). But assuming that the image is not known, with the only knowledge being that there will be a dark solid background and lighter foreground objects, how can I dynamically figure out the threshold value?
I've come across the histogram, with which I can compute the intensity distribution of the grayscale image. But I couldn't find a method to analyze the histogram and choose the value where the objects of interest (the lighter ones) lie. That is, I want to separate the obviously dark background spikes from the lighter foreground spikes; in this case above 200, but in another case it could be, say, 100 if the objects are grayish.
If all your images are like this, or can be brought to this style, I think cv2.THRESH_OTSU, i.e. Otsu's thresholding algorithm, is a good shot.
Below is a sample using Python in a command terminal:
>>> import cv2
>>> import numpy as np
>>> img2 = cv2.imread(r'D:\Abid_Rahman_K\work_space\sofeggs.jpg', 0)  # raw string, so the backslashes are not treated as escapes
>>> ret, thresh = cv2.threshold(img2, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
>>> ret
122.0
ret is the threshold value that is calculated automatically. We just pass '0' as the threshold value for this.
I got 124 in GIMP (which is comparable to the result we got). It also removes the noise. See the result below:
If you say that the background is dark (black) and the foreground is lighter, then I recommend using the YUV color space (or any other YXX one like YCrCb, etc.), because the first component of such color spaces is luminance (or lightness).
So after the Y channel is extracted (via the extractChannel function) we need to analyse the histogram of this channel (image):
See the first (left) hump? It represents the dark areas (the background, in your situation) of your image. So our aim now is to find the segment (on the abscissa; it's the red part in the image) that contains this hump. Obviously the left point of this segment is zero. The right point is the first point where:
- the (local) maximum of the histogram is to the left of the point
- the value of the histogram is less than some small epsilon (you can set it to 10)
I drew a green vertical line to show the location of the right point of the segment in this histogram.
And that's it! This right point of the segment is the needed threshold. Here's the result (epsilon is 10 and the calculated threshold is 50):
I think that it's not a problem for you to delete the noise in the image above.
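A sketch of this histogram walk in Python with OpenCV (the file name, and the assumption that the dark hump sits in the lower half of the histogram, are mine):

import cv2
import numpy as np

img = cv2.imread('blobs.jpg')                      # hypothetical file name
yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
y = cv2.extractChannel(yuv, 0)                     # luminance channel

hist = cv2.calcHist([y], [0], None, [256], [0, 256]).ravel()

epsilon = 10
peak = int(np.argmax(hist[:128]))                  # assume the dark hump is in the lower half
thresh = next((i for i in range(peak, 256) if hist[i] < epsilon), 255)

_, binary = cv2.threshold(y, thresh, 255, cv2.THRESH_BINARY)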
The following is a C++ implementation of Abid's answer that works with OpenCV 3.x:
// Convert the source image to a 1-channel grayscale image:
Mat gray;
cvtColor(src, gray, COLOR_BGR2GRAY);
// Apply the threshold function with the THRESH_OTSU flag set as well.
// You can skip having it return the value, but I include it to show
// the result from Otsu:
double thresholdValue = threshold(gray, gray, 0, 255, THRESH_BINARY + THRESH_OTSU);
// Present the threshold value:
printf("Threshold value: %f\n", thresholdValue);
Running this against the original image, I get the following:
OpenCV calculated a threshold value of 122 for it, close to the value Abid found in his answer.
Just to verify, I altered the original image as seen here:
And produced the following, with a new threshold value of 178: