Hi guys,
I'm trying to reduce the number of bits per pixel to below 8 on grayscale images using Scilab.
Is this possible?
If so, how can I do it?
Thank you.
I think it is not possible: the integer types available in Scilab are one or more bytes wide, see types here.
If you are looking to lose the high-frequency information, you could shift it out.
Pseudo implementation
for x = 1:width
  for y = 1:height
    // Get the pixel and make it a 1-byte unsigned integer
    pixel = uint8(picture(x,y))
    // Display the bits
    disp( dec2bin(pixel) )
    // Keep 8 - 4 = 4 bits of information
    bits_to_shift = 4
    shifted_down_pixel = pixel / (2^bits_to_shift)
    // Display the shifted-down value
    disp( dec2bin(shifted_down_pixel) )
    // Shift it back up
    shifted_back_pixel = shifted_down_pixel * (2^bits_to_shift)
    disp( dec2bin(shifted_back_pixel) )
    // Replace the old pixel with the new one
    picture(x,y) = shifted_back_pixel
  end
end
Of course you can do the above much faster with one big matrix operation, but the loop is just to show the concept.
Working example
rgb = imread('your_image.png')
gry = rgb2gray(rgb)
gry8bit = im2uint8(gry)
function result = reduce_bits(img, bits)
    reduced = img / (2^bits);
    // the value of "result" at endfunction is what gets returned
    result = reduced * (2^bits);
endfunction
gry2bit = reduce_bits(gry8bit, 6)
imshow(gry2bit)
I have a task - to multiply a big row vector (10,000 elements) by a big column-major matrix (10,000 rows, 400 columns). I decided to go with ARM NEON since I'm curious about this technology and would like to learn more about it.
Here's a working example of vector-matrix multiplication I wrote:
//float* vec_ptr - a pointer to the vector
//float* mat_ptr - a pointer to the matrix
//float* out_ptr - a pointer to the output vector
//int matCols - matrix columns
//int vecRows - vector rows, the same as the matrix row count
for (int i = 0, max_i = matCols; i < max_i; i++) {
    for (int j = 0, max_j = vecRows - 3; j < max_j; j += 4, mat_ptr += 4, vec_ptr += 4) {
        float32x4_t mat_val = vld1q_f32(mat_ptr); // load 4 elements from the matrix
        float32x4_t vec_val = vld1q_f32(vec_ptr); // load 4 elements from the vector
        float32x4_t out_val = vmulq_f32(mat_val, vec_val); // multiply element-wise
        float32_t total_sum = vaddvq_f32(out_val); // horizontal add of the 4 lanes
        out_ptr[i] += total_sum;
    }
    vec_ptr = &myVec[0]; // reset the vector pointer back to the first element
}
The problem is that it takes a very long time to compute: 30 ms on an iPhone 7+, while my goal is 1 ms or even less if possible. The current execution time is understandable, since I run the inner multiplication step 400 * (10000 / 4) = 1,000,000 times.
Also, I tried processing 8 elements at a time instead of 4. It seems to help, but the numbers are still very far from my goal.
I understand that I might be making some horrible mistakes, since I'm a newbie with ARM NEON, and I would be happy if someone could give me a tip on how to optimize my code.
Also - is it worth doing big vector-matrix multiplication with ARM NEON? Does this technology fit such a purpose well?
Your code is completely flawed: it iterates 16 times assuming both matCols and vecRows are 4. What's the point of SIMD then?
And the major performance problem lies in float32_t total_sum = vaddvq_f32(out_val);:
You should never convert a vector to a scalar inside a loop, since it causes a pipeline hazard that costs around 15 cycles every time.
The solution:
float32x4x4_t myMat;
float32x2_t myVecLow, myVecHigh;
myVecLow = vld1_f32(&pVec[0]);
myVecHigh = vld1_f32(&pVec[2]);
myMat = vld4q_f32(pMat);
myMat.val[0] = vmulq_lane_f32(myMat.val[0], myVecLow, 0);
myMat.val[0] = vmlaq_lane_f32(myMat.val[0], myMat.val[1], myVecLow, 1);
myMat.val[0] = vmlaq_lane_f32(myMat.val[0], myMat.val[2], myVecHigh, 0);
myMat.val[0] = vmlaq_lane_f32(myMat.val[0], myMat.val[3], myVecHigh, 1);
vst1q_f32(pDst, myMat.val[0]);
This version does three things:
Compute all four rows in a single pass.
Do a matrix transpose (rotation) on the fly via vld4.
Do vector-by-scalar multiply-accumulate instead of vector-by-vector multiply plus a horizontal add, which causes the pipeline hazards.
You were asking if SIMD is suitable for matrix operations? A simple "yes" would be a monumental understatement. You don't even need a loop for this.
I have a visualization output of a Gabor filter with 12 different orientations. I want to superimpose the visualization image on my image of a retina for vessel extraction. How do I do it? I have tried the method below. Is there any other way to perform superimposition of images in MATLAB?
Here is my code:
I = getimage();
I=I(:,:,2);
lambda = 8;
theta = 0;
psi = [0 pi/2];
gamma = 0.5;
bw = 1;
N = 2;
img_in = im2double(I);
%img_in(:,:,2:3) = []; % discard redundant channels, it's gray anyway
img_out = zeros(size(img_in,1), size(img_in,2), N);
for n=1:N
gb = gabor_fn(bw,gamma,psi(1),lambda,theta)...
+ 1i * gabor_fn(bw,gamma,psi(2),lambda,theta);
% gb is the n-th gabor filter
img_out(:,:,n) = imfilter(img_in, gb, 'symmetric');
% filter output to the n-th channel
%theta = theta + 2*pi/N
%figure;
%imshow(img_out(:,:,n));
imshow(img_in); hold on;
h = imagesc(img_out(:,:,n)); % here i am getting error saying CDATA must be size[M*N]
set( h, 'AlphaData', .5 ); % .5 transparency
figure;
imshow(h);
theta = 15 * n; % next orientation
end
This is my original image.
This is my visualized image obtained from the Gabor filter for one orientation.
This is the kind of image I need to get: the visualized image superimposed on my original image.
With the information you have provided, my understanding is you want the third/final image to be an overlay on top of the first/initial image. I do things like this when using segmentation to detect hemorrhaging in MRI images of the brain.
First, let's set up some definitions:
I_src = source/original image
I_out = output/final image
Now, make a copy of I_src and turn it into a color (RGB) image rather than grayscale, for example by replicating the gray channel:
I_hybrid = cat(3, I_src, I_src, I_src);
Let's assume both I_src and I_out are the same visual dimensions (ie: width, height), and that I_out is strictly black-and-white (ie: monochrome). Now, we can use I_out as a mask template for alpha channel adjustments in the resulting image. This is where it gets fun.
BLACK = 0;
WHITE = 1;
[height, width] = size(I_out);
for i = 1:height
    for j = 1:width
        if (I_out(i,j) == WHITE)
            % brighten the red channel where the mask is white
            % (assumes I_hybrid is of class double, in the range [0,1])
            I_hybrid(i,j,1) = I_hybrid(i,j,1) + 0.25;
        end
    end
end
This will result in your original image with the blood vessels in the eye being slightly brighter and tinted red. You now have a beautiful composite of your original image with the desired features highlighted, but not overwritten (ie: you can undo the highlighting by subtracting the same offset from the red channel).
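As a side note, the same red tint can be applied without the explicit loops. A minimal vectorized sketch, assuming I_src is a grayscale image of class double in [0,1] and I_out is a binary mask of the same size:
I_hybrid = cat(3, I_src, I_src, I_src);         % grayscale replicated into R, G, B
R = I_hybrid(:,:,1);
R(I_out == 1) = min(R(I_out == 1) + 0.25, 1);   % brighten the red channel under the mask
I_hybrid(:,:,1) = R;
imshow(I_hybrid);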
I will include an example of what the output would look like, but it's noisy because I had to create it in GIMP as I don't have Matlab installed right now. The results will be similar, but yours would be much cleaner and prettier.
Please let me know how this goes.
References
"Converting Images from Grayscale to Color" http://blogs.mathworks.com/pick/2012/11/25/converting-images-from-grayscale-to-color/
I'm trying to implement the Otsu binarization technique on document images such as the one shown:
Could someone please tell me how to implement the code in MATLAB?
Taken from Otsu's method on Wikipedia
I = imread('cameraman.tif');
Step 1. Compute histogram and probabilities of each intensity level.
nbins = 256; % Number of bins
counts = imhist(I,nbins); % Each intensity increments the histogram from 0 to 255
p = counts / sum(counts); % Probabilities
Step 2. Set up initial omega_i(0) and mu_i(0)
omega1 = 0;
omega2 = 1;
mu1 = 0;
mu2 = mean(I(:));
Step 3. Step through all possible thresholds from 0 to maximum intensity (255)
Step 3.1 Update omega_i and mu_i
Step 3.2 Compute sigma_b_squared
for t = 1:nbins
omega1(t) = sum(p(1:t));
omega2(t) = sum(p(t+1:end));
mu1(t) = sum(p(1:t).*(1:t)');
mu2(t) = sum(p(t+1:end).*(t+1:nbins)');
end
sigma_b_squared_wiki = omega1 .* omega2 .* (mu2-mu1).^2; % Eq. (14)
sigma_b_squared_otsu = (mu1(end) .* omega1-mu1) .^2 ./(omega1 .* (1-omega1)); % Eq. (18)
Step 4 Desired threshold corresponds to the location of maximum of sigma_b_squared
[~,thres_level_wiki] = max(sigma_b_squared_wiki);
[~,thres_level_otsu] = max(sigma_b_squared_otsu);
There are some differences between the Wikipedia version, eq. (14), and eq. (18), and I don't know why. But thres_level_otsu corresponds to MATLAB's implementation graythresh(I).
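To actually binarize the image with the threshold found in Step 4, here is a small sketch (assuming I is the uint8 image loaded above and nbins = 256 as before):
level = (thres_level_otsu - 1) / (nbins - 1);   % map the 1-based bin index to [0,1]
BW = im2bw(I, level);                           % pixels above the threshold become white
imshow(BW);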
Since the function graythresh in MATLAB implements Otsu's method, what you have to do is convert your image to grayscale and then use the im2bw function to binarize the image using the threshold level returned by graythresh.
To convert your image I to grayscale you can use the following code:
I = im2uint8(I);
if size(I,3) ~= 1
I = rgb2gray(I);
end;
To get the binary image Ib using the Otsu's method, use the following code:
Ib = im2bw(I, graythresh(I));
You should get the following result:
Starting with your initial question about implementing Otsu thresholding: it's true that MATLAB's graythresh function is based on that method.
Otsu's method takes the threshold to be the valley between two histogram peaks, one formed by the foreground pixels and the other by the background pixels.
Since your image looks like a historical manuscript, I found this paper that compares the methods that can be used for thresholding document images.
You can also download and read up on Sauvola thresholding from here; a rough sketch of its formula is given below.
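For reference, Sauvola's method computes a per-pixel threshold from the local mean m and local standard deviation s as T = m * (1 + k * (s/R - 1)). The following is only a rough illustration of that formula, not the tuned algorithm from the paper; the file name, window size and parameter values are assumptions, and it needs the Image Processing Toolbox for stdfilt:
I = imread('your_document.png');     % placeholder file name
if size(I,3) > 1, I = rgb2gray(I); end
I = double(I);
w = 25;                              % local window size (assumed)
k = 0.5;  R = 128;                   % commonly used Sauvola parameters
m = conv2(I, ones(w)/w^2, 'same');   % local mean
s = stdfilt(I, ones(w));             % local standard deviation
T = m .* (1 + k*(s./R - 1));         % per-pixel Sauvola threshold
BW = I > T;                          % background white, text black
imshow(BW);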
Good luck with its implementation =)
Corrected MATLAB Implementation (for 2d matrix)
function [T] = myotsu(I,N);
% create histogram
nbins = N;
[x,h] = hist(I(:),nbins);
% calculate probabilities
p = x./sum(x);
% initialisation
om1 = 0;
om2 = 1;
mu1 = 0;
mu2 = mode(I(:));
for t = 1:nbins,
om1(t) = sum(p(1:t));
om2(t) = sum(p(t+1:nbins));
mu1(t) = sum(p(1:t).*[1:t]);
mu2(t) = sum(p(t+1:nbins).*[t+1:nbins]);
end
sigma = (mu1(nbins).*om1-mu1).^2./(om1.*(1-om1));
idx = find(sigma == max(sigma));
T = h(idx(1));
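A quick usage sketch for the function above (the test image is just a stock MATLAB image; converting to double avoids any integer-type issues with hist):
I = imread('cameraman.tif');    % any grayscale image
T = myotsu(double(I), 256);     % threshold in the original intensity range
Ib = I > T;                     % binarize
imshow(Ib);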
This is a formula for LoG filtering (the equation image is from ed.ac.uk); written out, it is:
LoG(x,y) = -(1/(pi*sigma^4)) * (1 - (x^2 + y^2)/(2*sigma^2)) * exp(-(x^2 + y^2)/(2*sigma^2))
Also, in applications of LoG filtering I see that the function is called with only one parameter:
sigma (σ).
I want to try LoG filtering using that formula (my previous attempt was a Gaussian filter followed by a Laplacian filter with some filter window size).
But looking at that formula I can't understand how the filter size is connected to it. Does it mean that the filter size is fixed?
Can you explain how to use it?
As you've probably figured out by now from the other answers and links, LoG filter detects edges and lines in the image. What is still missing is an explanation of what σ is.
σ is the scale of the filter. Is a one-pixel-wide line a line or noise? Is a line 6 pixels wide a line or an object with two distinct parallel edges? Is a gradient that changes from black to white across 6 or 8 pixels an edge or just a gradient? It's something you have to decide, and the value of σ reflects your decision: the larger σ is, the wider the lines, the smoother the edges, and the more noise is ignored.
Do not get confused between the scale of the filter (σ) and the size of the discrete approximation (usually called stencil). In Paul's link σ=1.4 and the stencil size is 9. While it is usually reasonable to use stencil size of 4σ to 6σ, these two quantities are quite independent. A larger stencil provides better approximation of the filter, but in most cases you don't need a very good approximation.
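If you just want to see a discrete LoG stencil without coding the formula yourself, MATLAB's Image Processing Toolbox can generate one. A small sketch (the 3σ stencil rule here is only the usual rule of thumb mentioned above):
sigma   = 1.4;
stencil = 2*ceil(3*sigma) + 1;         % about 6*sigma, rounded up to an odd size
h = fspecial('log', stencil, sigma);   % discrete LoG approximation
surf(h);                               % shows the mexican-hat shape (possibly inverted, depending on sign convention)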
This was something that confused me too, and it wasn't until I had to do the same as you for a uni project that I understood what you were supposed to do with the formula!
You can use this formula to generate a discrete LoG filter. If you write a bit of code to implement that formula, you can then generate a filter for use in image convolution. To generate, say, a 5x5 template, simply call the code with x and y ranging from -2 to +2.
This will generate the values to use in a LoG template. If you graph the values this produces you should see the "mexican hat" shape typical of this filter, like so:
(source: ed.ac.uk)
You can fine tune the template by changing how wide it is (the size) and the sigma value (how broad the peak is). The wider and broader the template the less affected by noise the result will be because it will operate over a wider area.
Once you have the filter, you can apply it to the image by convolving the template with the image. If you've not done this before, check out these tutorials: there are Java applet tutorials as well as some more mathsy ones.
Essentially, at each pixel location, you "place" your convolution template, centred at that pixel. You then multiply the surrounding pixel values by the corresponding "pixel" in the template and add up the result. This is then the new pixel value at that location (typically you also have to normalise (scale) the output to bring it back into the correct value range).
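In MATLAB terms, that whole place-multiply-sum procedure is just a convolution of the image with the template. A short sketch, assuming the Image Processing Toolbox is available (the image name and kernel size are placeholders):
img      = im2double(imread('cameraman.tif'));
logKern  = fspecial('log', 9, 1.4);                % 9x9 LoG template, sigma = 1.4
response = imfilter(img, logKern, 'replicate', 'conv');
imshow(response, []);                              % [] rescales the output for display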
The code below gives a rough idea of how you might implement the template generation itself. Please forgive any mistakes/typos etc. as it hasn't been tested.
I hope this helps.
private float LoG(float x, float y, float sigma)
{
    // LoG(x,y) = -1/(pi*sigma^4) * (1 - (x^2+y^2)/(2*sigma^2)) * exp(-(x^2+y^2)/(2*sigma^2))
    double r2 = x * x + y * y;
    double s2 = sigma * sigma;
    return (float) ((-1.0 / (Math.PI * s2 * s2))
            * (1.0 - r2 / (2.0 * s2))
            * Math.exp(-r2 / (2.0 * s2)));
}
private void GenerateTemplate(int templateSize, float sigma)
{
// Make sure it's an odd number for convenience
if(templateSize % 2 == 1)
{
// Create the data array
float[][] template = new float[templateSize][templateSize];
// Work out the "min and max" values. Log is centered around 0, 0
// so, for a size 5 template (say) we want to get the values from
// -2 to +2, ie: -2, -1, 0, +1, +2 and feed those into the formula.
int min = -(templateSize / 2);   // e.g. -2 for a 5x5 template
int max = templateSize / 2;      // e.g. +2 for a 5x5 template
// We also need counters to index into the data array...
int xCount = 0;
for(int x = min; x <= max; ++x)
{
    int yCount = 0;   // reset for each new column
    for(int y = min; y <= max; ++y)
    {
        // Get the LoG value for this (x,y) pair
        template[xCount][yCount] = LoG(x, y, sigma);
        ++yCount;
    }
    ++xCount;
}
}
}
Just for visualization purposes, here is a simple Matlab 3D colored plot of the Laplacian of Gaussian (Mexican Hat) wavelet. You can change the sigma(σ) parameter and see its effect on the shape of the graph:
sigmaSq = 0.5 % Square of σ parameter
[x y] = meshgrid(linspace(-3,3), linspace(-3,3));
z = (-1/(pi*(sigmaSq^2))) .* (1-((x.^2+y.^2)/(2*sigmaSq))) .*exp(-(x.^2+y.^2)/(2*sigmaSq));
surf(x,y,z)
You could also compare the effects of the sigma parameter on the Mexican Hat doing the following:
t = -5:0.01:5;
sigma = 0.5;
mexhat05 = exp(-t.*t/(2*sigma*sigma)) * 2 .*(t.*t/(sigma*sigma) - 1) / (pi^(1/4)*sqrt(3*sigma));
sigma = 1;
mexhat1 = exp(-t.*t/(2*sigma*sigma)) * 2 .*(t.*t/(sigma*sigma) - 1) / (pi^(1/4)*sqrt(3*sigma));
sigma = 2;
mexhat2 = exp(-t.*t/(2*sigma*sigma)) * 2 .*(t.*t/(sigma*sigma) - 1) / (pi^(1/4)*sqrt(3*sigma));
plot(t, mexhat05, 'r', ...
t, mexhat1, 'b', ...
t, mexhat2, 'g');
Or simply use the Wavelet toolbox provided by Matlab as follows:
lb = -5; ub = 5; n = 1000;
[psi,x] = mexihat(lb,ub,n);
plot(x,psi), title('Mexican hat wavelet')
I found this useful when implementing this for edge detection in computer vision. Although not the exact answer, hope this helps.
It appears to be a continuous circular filter whose radius is sqrt(2) * sigma. If you want to implement this for image processing you'll need to approximate it.
There's an example for sigma = 1.4 here: http://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm
I need to convert an 8-bit IplImage to a 32-bit IplImage. Using documentation from all over the web I've tried the following things:
// general code
img2 = cvCreateImage(cvSize(img->width, img->height), 32, 3);
int height = img->height;
int width = img->width;
int channels = img->nChannels;
int step1 = img->widthStep;
int step2 = img2->widthStep;
int depth1 = img->depth;
int depth2 = img2->depth;
uchar *data1 = (uchar *)img->imageData;
uchar *data2 = (uchar *)img2->imageData;
for(h=0;h<height;h++) for(w=0;w<width;w++) for(c=0;c<channels;c++) {
// attempt code...
}
// attempt one
// result: white image, two red spots which appear in the original image too.
// this is the closest result, what's going wrong?!
// see: http://files.dazjorz.com/cache/conversion.png
((float*)data2+h*step2+w*channels+c)[0] = data1[h*step1+w*channels+c];
// attempt two
// when I change float to unsigned long in both previous examples, I get a black screen.
// attempt three
// result: seemingly random data to the top of the screen.
data2[h*step2+w*channels*3+c] = data1[h*step1+w*channels+c];
data2[h*step2+w*channels*3+c+1] = 0x00;
data2[h*step2+w*channels*3+c+2] = 0x00;
// and then some other things. Nothing did what I wanted. I couldn't get an output
// image which looked the same as the input image.
As you see I don't really know what I'm doing. I'd love to find out, but I'd love it more if I could get this done correctly.
Thanks for any help I get!
The function you are looking for is cvConvertScale(). It automagically does any type conversion for you. You just have to specify that you want to scale by a factor of 1/255 (which maps the range [0...255] to [0...1]).
Example:
IplImage *im8 = cvLoadImage(argv[1]);
IplImage *im32 = cvCreateImage(cvSize(im8->width, im8->height), 32, 3);
cvConvertScale(im8, im32, 1/255.);
Note the dot in 1/255. - to force a double division. Without it you get a scale of 0.
Perhaps this link can help you?
Edit: In response to the second edit of the OP and the comment:
Have you tried
float value = 0.5
instead of
float value = 0x0000001;
I thought the range for a float color value goes from 0.0 to 1.0, where 1.0 is white.
Floating point colors go from 0.0 to 1.0, and uchars go from 0 to 255. The following code fixes it:
// h is height, w is width, c is current channel (0 to 2)
int b = ((uchar *)(img->imageData + h*img->widthStep))[w*img->nChannels + c];
((float *)(img2->imageData + h*img2->widthStep))[w*img2->nChannels + c] = ((float)b) / 255.0;
Many, many thanks to Stefan Schmidt for helping me fix this!
If you do not put the dot (.), the compiler treats 1/255 as an integer division, giving you an integer result (zero in this case).
You can create an IplImage wrapper using boost::shared_ptr and template-metaprogramming. I have done that, and I get automatic garbage collection, together with automatic image conversions from one depth to another, or from one-channel to multi-channel images.
I have called the API blImageAPI and it can be found here:
http://www.barbato.us/2010/10/14/image-data-structure-based-shared_ptr-iplimage/
It is very fast and makes code very readable (good for maintaining algorithms).
It can also be used instead of IplImage in OpenCV algorithms without changing anything.
Good luck and have fun writing algorithms!!!
IplImage *img8,*img32;
img8 =cvLoadImage("a.jpg",1);
cvNamedWindow("Convert",1);
img32 = cvCreateImage(cvGetSize(img8),IPL_DEPTH_32F,3);
cvConvertScale(img8,img32,1.0/255.0,0.0);
//For confirmation, check the pixel values (should be between 0 and 1)
for (int row = 0; row < img32->height; row++) {
    float* pt = (float*) (img32->imageData + row * img32->widthStep);
    for (int col = 0; col < img32->width; col++)
        printf("\n %3.3f , %3.3f , %3.3f ", pt[3*col], pt[3*col+1], pt[3*col+2]);
}
cvShowImage("Convert",img32);
cvWaitKey(0);
cvReleaseImage(&img8);
cvReleaseImage(&img32);
cvDestroyWindow("Convert");