image shuffling and slicing - image-processing

This is my code for slicing my 512*512 image into a cube of dimension 64*64*64. But when I reshape it back into a 2D array, why does it not give me the original image? Am I doing something incorrect? Please help.
clc;
im=ind2gray(y,ymap);
% im=imresize(im,0.125);
[rows ,columns, colbands] = size(im)
end
image3d=reshape(image3d,512,512);
figure,imshow(uint8(image3d));

Just a small hint.
P(:,:,1) = [0,0;0,0]
P(:,:,2) = [1,1;1,1]
P(:,:,3) = [2,2;2,2]
P(:,:,4) = [3,3;3,3]
B = reshape(P,4,4)
B =
0 1 2 3
0 1 2 3
0 1 2 3
0 1 2 3
Each 2-by-2 slice ends up as one column of B rather than as a 2-by-2 block, so you might change the slicing or do the reshaping on your own.

If I have understood your question right, you can look into the code below to perform the same operation.
% Random image of the given size, 512x512
imageX = rand(512,512);
imagesc(imageX)
% Convert the image "imageX" into a cube of dimension 64x64x64
sliceColWise = reshape(imageX,64,64,64);
size(sliceColWise)
% Reshape the cube back into the original image "imageX".
% To show that the two are identical, their difference is plotted.
imageY = reshape(sliceColWise,512,512);
imagesc(imageX-imageY)
N.B.: The MATLAB help states that reshape works column-wise:
reshape(X,M,N) or reshape(X,[M,N]) returns the M-by-N matrix
whose elements are taken columnwise from X. An error results
if X does not have M*N elements.
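For readers working in Python, the same behaviour can be sketched with NumPy. This is only an illustration under assumptions: the array `image` is a made-up stand-in for the 512x512 picture, and `order="F"` is used to mimic MATLAB's column-wise reshape. It shows that a plain reshape round-trips exactly, that each slice of the cube is a strip of 8 image columns rather than a 64x64 tile, and one possible way to get real tiles.

```python
import numpy as np

# Hypothetical stand-in for the 512x512 image from the question.
image = np.arange(512 * 512, dtype=np.float64).reshape(512, 512)

# order="F" mimics MATLAB's column-wise reshape.
cube = image.reshape((64, 64, 64), order="F")
restored = cube.reshape((512, 512), order="F")
round_trip_ok = np.array_equal(image, restored)

# Each slice cube[:, :, k] holds 8 consecutive image columns folded
# column-wise into 64x64; it is not a 64x64 tile of the picture.
first_slice_cols = image[:, :8].reshape((64, 64), order="F")
slice_is_column_strip = np.array_equal(cube[:, :, 0], first_slice_cols)

# To get real 64x64 tiles, split both axes into (block index, offset).
tiles = image.reshape(8, 64, 8, 64).transpose(0, 2, 1, 3).reshape(64, 64, 64)
tile_matches = np.array_equal(tiles[0], image[:64, :64])

# Undo the tiling by reversing the same steps.
untiled = tiles.reshape(8, 8, 64, 64).transpose(0, 2, 1, 3).reshape(512, 512)
tiling_round_trip_ok = np.array_equal(image, untiled)
```

Note that in this sketch the tile index is the first axis (`tiles[k]`), which is the usual NumPy convention, whereas the MATLAB code above indexes slices with the last axis.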

Related

How to export all the information from 3d numpy array to a csv file

Kaggle Dataset and code link
I'm trying to solve the above Kaggle problem, and I want to export the preprocessed data as a CSV so that I can build a model in Weka. But when I try to save it as a CSV I lose a dimension; I want to retain all the information in that CSV.
Please help me with the relevant code or any resource.
Thanks
print (scaled_x)
|x |y |z |label
|1.485231 |-0.661030 |-1.194153 |0
|0.888257 |-1.370361 |-0.829636 |0
|0.691523 |-0.594794 |-0.936247 |0
Fs=20
frame_size = Fs*4 #80
hop_size = Fs*2 #40
def get_frames(df, frame_size, hop_size):
    N_FEATURES = 3
    frames = []
    labels = []
    for i in range(0, len(df) - frame_size, hop_size):
        x = df['x'].values[i: i+frame_size]
        y = df['y'].values[i: i+frame_size]
        z = df['z'].values[i: i+frame_size]
        label = stats.mode(df['label'][i: i+frame_size])[0][0]
        frames.append([x, y, z])
        labels.append(label)
    frames = np.asarray(frames).reshape(-1, frame_size, N_FEATURES)
    labels = np.asarray(labels)
    return frames, labels
x,y = get_frames(scaled_x, frame_size, hop_size)
x.shape, y.shape
((78728, 80, 3), (78728,))
According to the link you posted, the data is time series accelerometer/gyro data sampled at 20 Hz, with a label for each sample. They want to aggregate the time series into frames (with the corresponding label being the most common label during a given frame).
So frame_size is the number of samples in a frame, and hop_size is the amount the sliding window moves forward each iteration. In other words, the frames overlap by 50% since hop_size = frame_size / 2.
Thus at the end you get a 3D array of 78728 frames of length 80, with 3 values (x, y, z) each.
EDIT: To answer your new question about how to export as CSV, you'll need to "flatten" the 3D frame array to a 2D array since that's what a CSV represents. There are multiple different ways to do this but I think the easiest may just be to concatenate the final two dimensions, so that each row is a frame, consisting of 240 values (80 samples of 3 co-ordinates each). Then concatenate the labels as the final column.
x_2d = np.reshape(x, (x.shape[0], -1))
full = np.concatenate([x_2d, y[:, None]], axis=1)  # labels become the last column
import pandas as pd
df = pd.DataFrame(full)
df.to_csv("frames.csv")
If you also want proper column names:
columns = []
for i in range(1, x.shape[1] + 1):
    columns.extend([f"{i}_X", f"{i}_Y", f"{i}_Z"])
columns.append("label")
df = pd.DataFrame(full, columns=columns)
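To check that no information is lost by flattening, here is a small round-trip sketch using only NumPy. The array sizes and label values are made up for illustration, an in-memory buffer stands in for the CSV file, and it assumes the last axis of the frame array really holds (x, y, z) for each sample.

```python
import io
import numpy as np

# Hypothetical small stand-in for the (78728, 80, 3) frame array.
n_frames, frame_size, n_features = 4, 5, 3
frames = np.arange(n_frames * frame_size * n_features, dtype=float)
frames = frames.reshape(n_frames, frame_size, n_features)
labels = np.array([0, 1, 1, 0])

# One row per frame: frame_size * n_features values, then the label.
flat = frames.reshape(n_frames, -1)
table = np.column_stack([flat, labels])

# Column names in the same order as the flattened values.
header = [f"{i}_{axis}" for i in range(1, frame_size + 1) for axis in "XYZ"]
header.append("label")

buffer = io.StringIO()  # stands in for a file such as "frames.csv"
np.savetxt(buffer, table, delimiter=",", header=",".join(header), comments="")

# Reading back: skip the header, split off the label, restore the 3D shape.
buffer.seek(0)
loaded = np.loadtxt(buffer, delimiter=",", skiprows=1)
restored_frames = loaded[:, :-1].reshape(-1, frame_size, n_features)
restored_labels = loaded[:, -1].astype(int)
```

The only extra information needed to undo the flattening is `frame_size` and `n_features`, so it may be worth recording them alongside the file.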

Need a vectorized solution in pytorch

I'm doing an experiment using face images in the PyTorch framework. The input x is a face image of size 5 * 5 (height * width) with 192 channels.
Objective: to obtain patches of x of size patch_size (given as an argument).
I have obtained the required result with the help of two for loops, but I want a vectorized solution so that the computation cost is much lower than with the two loops.
Used: PyTorch 0.4.1, (12 GB) Nvidia TitanX GPU.
The following is my implementation using two for loops:
def extractpatches(x, patch_size):  # x is bsx192x5x5
    patches = x.unfold(2, patch_size, 1).unfold(3, patch_size, 1)
    bs, c, pi, pj, _, _ = patches.size()  #bs,192,
    cnt = 0
    p = torch.empty((bs, pi*pj, c, patch_size, patch_size)).to(device)
    s = torch.empty((bs, pi*pj, c*patch_size*patch_size)).to(device)
    # Want a vectorized method instead of the two for loops below
    for i in range(pi):
        for j in range(pj):
            p[:, cnt, :, :, :] = patches[:, :, i, j, :, :]
            s[:, cnt, :] = p[:, cnt, :, :, :].view(-1, c*patch_size*patch_size)
            cnt = cnt + 1
    return s
Thanks for your help in advance.
I think you can try the following. I used some parts of your code for my experiment and it worked for me. Here l and f are lists of tensor patches; stacking f along dimension 1 with torch.stack(f, dim=1) gives the tensor s from the question.
l = [patches[:, :, i // pj, i % pj, :, :] for i in range(pi * pj)]
f = [l[i].contiguous().view(-1, c*patch_size*patch_size) for i in range(pi * pj)]
You can verify the above code using toy input values.
Thanks.
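The list comprehensions above still iterate in Python. A fully vectorized variant only needs the patch-grid axes moved ahead of the channel axis before flattening. The sketch below shows the idea in NumPy rather than PyTorch so it can be run without a GPU; `sliding_window_view` (NumPy 1.20 or later) plays the role of the two `unfold` calls, and the function names are made up for this example. In PyTorch the analogous step would presumably be `patches.permute(0, 2, 3, 1, 4, 5).contiguous().view(bs, pi * pj, -1)`, but that line is an untested assumption here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def extract_patches(x, patch_size):
    """x has shape (bs, c, h, w); returns (bs, pi * pj, c * patch_size**2)."""
    # Shape (bs, c, pi, pj, patch_size, patch_size), like the two unfold calls.
    patches = sliding_window_view(x, (patch_size, patch_size), axis=(2, 3))
    bs, c, pi, pj = patches.shape[:4]
    # Move the patch grid ahead of the channels, then flatten each patch.
    return patches.transpose(0, 2, 3, 1, 4, 5).reshape(bs, pi * pj, -1)


def extract_patches_loop(x, patch_size):
    """Reference version that mirrors the two loops from the question."""
    patches = sliding_window_view(x, (patch_size, patch_size), axis=(2, 3))
    bs, c, pi, pj = patches.shape[:4]
    out = np.empty((bs, pi * pj, c * patch_size * patch_size), dtype=x.dtype)
    cnt = 0
    for i in range(pi):
        for j in range(pj):
            out[:, cnt, :] = patches[:, :, i, j].reshape(bs, -1)
            cnt += 1
    return out


# Toy input with the sizes from the question: batch of 2, 192 channels, 5x5.
x = np.random.default_rng(0).normal(size=(2, 192, 5, 5))
fast = extract_patches(x, 3)
slow = extract_patches_loop(x, 3)
same = np.array_equal(fast, slow)
```

Comparing against the loop version on a toy input, as done here, is a cheap way to confirm the axis order before switching over.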

How to split the image into chunks without breaking character - python

I am trying to read text from an image.
I get better results if I break the image into small chunks, but the problem is that when I split the image, the cut goes through my characters.
Code I am using:
from __future__ import division
import math
import os
from PIL import Image
def long_slice(image_path, out_name, outdir, slice_size):
    """slice an image into parts slice_size tall"""
    img = Image.open(image_path)
    width, height = img.size
    upper = 0
    left = 0
    slices = int(math.ceil(height/slice_size))
    count = 1
    for slice in range(slices):
        #if we are at the end, set the lower bound to be the bottom of the image
        if count == slices:
            lower = height
        else:
            lower = int(count * slice_size)
        #set the bounding box! The important bit
        bbox = (left, upper, width, lower)
        working_slice = img.crop(bbox)
        upper += slice_size
        #save the slice
        working_slice.save(os.path.join(outdir, "slice_" + out_name + "_" + str(count)+".png"))
        count +=1
if __name__ == '__main__':
    #slice_size is the max height of the slices in pixels
    long_slice("/python_project/screenshot.png","longcat", os.getcwd(), 100)
Sample image: the image I want to process
Expected result / what I am trying to do:
I want to split every line into a separate image without cutting through the characters
Line 1:
Line 2:
Current result: characters in the image are cropped
I don't want to cut the image at fixed pixel positions, since each document will have different spacing and line width.
Thanks
Jk
Here is a solution that finds the brightest rows in the image (i.e., the rows without text) and then splits the image on those rows. So far I have just marked the sections, and am leaving the actual cropping up to you.
The algorithm is as follows:
1. Find the sum of the luminance (I am just using the red channel) of every pixel in each row
2. Find the rows whose sums are at least 0.999 times (the threshold I am using) the sum of the brightest row
3. Mark those rows
Here is the code that will return a list of these rows:
def find_lightest_rows(img, threshold):
    line_luminances = [0] * img.height
    for y in range(img.height):
        for x in range(img.width):
            line_luminances[y] += img.getpixel((x, y))[0]
    line_luminances = [x for x in enumerate(line_luminances)]
    line_luminances.sort(key=lambda x: -x[1])
    lightest_row_luminance = line_luminances[0][1]
    lightest_rows = []
    for row, lum in line_luminances:
        if(lum > lightest_row_luminance * threshold):
            lightest_rows.append(row)  # lists have append, not add
    return lightest_rows
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 ... ]
After colouring these rows red, we have this image:
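The cropping step that the answer leaves open could be approached by grouping the remaining (non-bright) rows into consecutive bands, one band per text line. The following is a minimal sketch of that grouping only; the function name and the toy row sums are made up, and it works on a plain list of per-row luminance sums rather than on an image.

```python
def text_line_bands(row_sums, threshold=0.999):
    """Return (top, bottom) row ranges that contain text; bottom is exclusive."""
    brightest = max(row_sums)
    # Same test as in find_lightest_rows: a row is blank if it is nearly
    # as bright as the brightest row.
    is_blank = [s > brightest * threshold for s in row_sums]
    bands = []
    start = None
    for y, blank in enumerate(is_blank):
        if not blank and start is None:
            start = y                 # first dark row of a text line
        elif blank and start is not None:
            bands.append((start, y))  # the text line ended on the previous row
            start = None
    if start is not None:             # text runs to the bottom edge
        bands.append((start, len(row_sums)))
    return bands


# Toy row sums: blank rows sum to 1000, rows crossing text are darker.
row_sums = [1000, 1000, 700, 650, 1000, 1000, 800, 1000]
bands = text_line_bands(row_sums)
```

With Pillow, each band could then be cut out with `img.crop((0, top, img.width, bottom))`; adding a few rows of padding above and below each band is probably a good idea for OCR.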

Gaussian filter in scipy

I want to apply a Gaussian filter of dimension 5x5 pixels on an image of 512x512 pixels. I found a scipy function to do that:
scipy.ndimage.filters.gaussian_filter(input, sigma, truncate=3.0)
How do I choose the sigma parameter to make sure that my Gaussian window is 5x5 pixels?
Check out the source code here: https://github.com/scipy/scipy/blob/master/scipy/ndimage/filters.py
You'll see that gaussian_filter calls gaussian_filter1d for each axis. In gaussian_filter1d, the width of the filter is determined implicitly by the values of sigma and truncate. In effect, the width w is
w = 2*int(truncate*sigma + 0.5) + 1
So
(w - 1)/2 = int(truncate*sigma + 0.5)
For w = 5, the left side is 2. The right side is 2 if
2 <= truncate*sigma + 0.5 < 3
or
1.5 <= truncate*sigma < 2.5
If you choose truncate = 3 (overriding the default of 4), you get
0.5 <= sigma < 0.83333...
We can check this by filtering an input that is all 0 except for a single 1 (i.e. find the impulse response of the filter) and counting the number of nonzero values in the filtered output. (In the following, np is numpy.)
First create an input with a single 1:
In [248]: x = np.zeros(9)
In [249]: x[4] = 1
Check the change in the size at sigma = 0.5...
In [250]: np.count_nonzero(gaussian_filter1d(x, 0.49, truncate=3))
Out[250]: 3
In [251]: np.count_nonzero(gaussian_filter1d(x, 0.5, truncate=3))
Out[251]: 5
... and at sigma = 0.8333...:
In [252]: np.count_nonzero(gaussian_filter1d(x, 0.8333, truncate=3))
Out[252]: 5
In [253]: np.count_nonzero(gaussian_filter1d(x, 0.8334, truncate=3))
Out[253]: 7
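The width formula can also be evaluated directly, without running the filter. This is a small sketch that simply restates the expression quoted above in plain Python; the function name is made up for illustration.

```python
def kernel_width(sigma, truncate=4.0):
    """Width in samples of the 1-D Gaussian kernel, per the formula above."""
    radius = int(truncate * sigma + 0.5)
    return 2 * radius + 1


# The same four sigma values as in the impulse-response check.
widths = {s: kernel_width(s, truncate=3) for s in (0.49, 0.5, 0.8333, 0.8334)}
```

The resulting widths agree with the nonzero counts from the impulse-response check.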
Following the excellent previous answer:
1. Set sigma: s = 2
2. Set the window size: w = 5
3. Evaluate the 'truncate' value: t = (((w - 1)/2) - 0.5)/s
4. Filter: filtered_data = scipy.ndimage.filters.gaussian_filter(data, sigma=s, truncate=t)
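The recipe above picks the lower edge of the interval of valid truncate values, where floating-point rounding could in principle tip the width down by one step. A slightly more cautious sketch is to aim half a sample further in, using truncate = radius / sigma. The helper names below are made up, and the width formula is the one quoted in the first answer, restated in plain Python.

```python
def truncate_for_width(width, sigma):
    """Pick a truncate value that should give an odd kernel width."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    radius = (width - 1) // 2
    # truncate * sigma is then close to radius, well inside
    # [radius - 0.5, radius + 0.5).
    return radius / sigma


def kernel_width(sigma, truncate):
    """Width formula from the first answer, used here only as a check."""
    return 2 * int(truncate * sigma + 0.5) + 1


t = truncate_for_width(5, sigma=2.0)
w = kernel_width(2.0, t)
```

The value `t` would then be passed as the `truncate` argument, together with the chosen sigma, exactly as in step 4.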

logistic regression with gradient descent error

I am trying to implement logistic regression with gradient descent.
I compute my cost function j_theta at each iteration, and fortunately j_theta decreases when plotted against the number of iterations.
The data set I use is given below:
x =          y =
1 20 30      0
1 40 60      1
1 70 30      1
1 50 50      1
1 50 40      0
1 60 40      1
1 30 40      0
1 40 50      0
1 10 20      0
1 30 40      0
1 70 70      1
The code that I managed to write for logistic regression using Gradient descent is:
%1. The below code would load the data present in your desktop to the octave memory
x=load('stud_marks.dat');
%y=load('ex4y.dat');
y=x(:,3);
x=x(:,1:2);
%2. Now we want to add a column x0 with all the rows as value 1 into the matrix.
%First take the length
[m,n]=size(x);
x=[ones(m,1),x];
X=x;
% Now we limit the x1 and x2 we need to leave or skip the first column x0 because they should stay as 1.
mn = mean(x);
sd = std(x);
x(:,2) = (x(:,2) - mn(2))./ sd(2);
x(:,3) = (x(:,3) - mn(3))./ sd(3);
% We will not use vectorized technique, Because its hard to debug, We shall try using many for loops rather
max_iter=50;
theta = zeros(size(x(1,:)))';
j_theta=zeros(max_iter,1);
for num_iter=1:max_iter
% We calculate the cost Function
j_cost_each=0;
alpha=1;
theta
for i=1:m
z=0;
for j=1:n+1
% theta(j)
z=z+(theta(j)*x(i,j));
z
end
h= 1.0 ./(1.0 + exp(-z));
j_cost_each=j_cost_each + ( (-y(i) * log(h)) - ((1-y(i)) * log(1-h)) );
% j_cost_each
end
j_theta(num_iter)=(1/m) * j_cost_each;
for j=1:n+1
grad(j) = 0;
for i=1:m
z=(x(i,:)*theta);
z
h=1.0 ./ (1.0 + exp(-z));
h
grad(j) += (h-y(i)) * x(i,j);
end
grad(j)=grad(j)/m;
grad(j)
theta(j)=theta(j)- alpha * grad(j);
end
end
figure
plot(0:max_iter-1, j_theta(1:max_iter), 'b', 'LineWidth', 2) % j_theta has only max_iter entries
hold off
figure
%3. In this step we will plot the graph for the given input data set just to see how is the distribution of the two class.
pos = find(y == 1); % This will take the postion or array number from y for all the class that has value 1
neg = find(y == 0); % Similarly this will take the position or array number from y for all class that has value 0
% Now we plot the graph column x1 Vs x2 for y=1 and y=0
plot(x(pos, 2), x(pos,3), '+');
hold on
plot(x(neg, 2), x(neg, 3), 'o');
xlabel('x1 marks in subject 1')
ylabel('y1 marks in subject 2')
legend('pass', 'Failed')
plot_x = [min(x(:,2))-2, max(x(:,2))+2]; % This min and max decides the length of the decision graph.
% Calculate the decision boundary line
plot_y = (-1./theta(3)).*(theta(2).*plot_x +theta(1));
plot(plot_x, plot_y)
hold off
%%%%%%% The only difference is In the last plot I used X where as now I use x whose attributes or features are featured scaled %%%%%%%%%%%
If you view the graph of x1 vs x2, it looks like this:
After I run my code I create a decision boundary. The shape of the decision line seems to be okay, but it is a bit displaced. The graph of x1 vs x2 with the decision boundary is given below:
![enter image description here][2]
Please suggest where I am going wrong.
Thanks :)
The new graph:
![enter image description here][1]
If you look at the new graph, the coordinates of the x axis have changed. That's because I use x (feature scaled) instead of X.
The problem lies in your cost function calculation and/or gradient calculation; your plotting function is fine. I ran your dataset through my own logistic regression implementation, which uses the vectorized technique because in my opinion it is easier to debug.
The final values I got for theta were
theta =
[-76.4242,
0.8214,
0.7948]
I also used alpha = 0.3
I plotted the decision boundary and it looks fine. I would recommend using the vectorized form, as in my opinion it is easier to implement and to debug.
I also think your implementation of gradient descent is not quite correct. 50 iterations is simply not enough, and the cost at the last iteration is still too high. Maybe you should try running it for more iterations with a stopping condition.
Also check this lecture for optimization techniques:
https://class.coursera.org/ml-006/lecture/37
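For comparison, here is a minimal vectorized sketch in Python with NumPy. It is not the answerer's implementation. It uses the data set from the question and alpha = 0.3 from the answer, while the iteration count of 2000 and all variable names are assumptions. One difference from the loop in the question is that all components of theta are updated at once; in the question, theta(j) is changed inside the loop over j, so later gradient components are computed from a partially updated theta. Whether that explains the displaced boundary is not verified here.

```python
import numpy as np

# Data set from the question: marks in two subjects and a pass/fail label.
marks = np.array([[20, 30], [40, 60], [70, 30], [50, 50], [50, 40], [60, 40],
                  [30, 40], [40, 50], [10, 20], [30, 40], [70, 70]], dtype=float)
y = np.array([0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1], dtype=float)

# Feature scaling, then a leading column of ones for the intercept.
mean, std = marks.mean(axis=0), marks.std(axis=0)
X = np.column_stack([np.ones(len(marks)), (marks - mean) / std])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cost(theta):
    h = np.clip(sigmoid(X @ theta), 1e-12, 1 - 1e-12)  # avoid log(0)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))


alpha, max_iter = 0.3, 2000  # alpha from the answer; max_iter is an assumption
theta = np.zeros(X.shape[1])
costs = []
for _ in range(max_iter):
    costs.append(cost(theta))
    grad = X.T @ (sigmoid(X @ theta) - y) / len(y)  # all components at once
    theta = theta - alpha * grad                    # simultaneous update

# Express theta in the original (unscaled) units, for plotting against X.
theta_raw = np.empty(3)
theta_raw[1:] = theta[1:] / std
theta_raw[0] = theta[0] - np.sum(theta[1:] * mean / std)
```

The last three lines address the scaled-versus-unscaled confusion in the question: a boundary learned on scaled features has to be either plotted against the scaled features or converted back, as done for `theta_raw`.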
