Line fitting: how to deal with continuous values? - image-processing

I'm trying to fit a line using a quadratic polynomial, but because the fit produces continuous values, the integer conversion (needed for CartesianIndex) rounds them off, and I lose data at those pixels.
I tried the method here. So I get the new y values as:
using Images, Polynomials, Plots,ImageView
img = load("jTjYb.png")
img = Gray.(img)
img = img[end:-1:1, :]
nodes = findall(img.>0)
xdata = map(p->p[2], nodes)
ydata = map(p->p[1], nodes)
f = fit(xdata, ydata, 2)
ydata_new = round.(Int, f.(xdata))
new_line_fitted_img=zeros(size(img))
new_line_fitted_img[xdata,ydata_new].=1
imshow(new_line_fitted_img)
which results in a chopped line as shown below, whereas I was expecting it to be a continuous line, as it was in pre-processing.

Do you expect the following:
Raw Image
Fitted Polynomial
Superposition
Code:
using Images, Polynomials
img = load("img.png");
img = Gray.(img)
fx(data, dCoef, cCoef, bCoef, aCoef) = @. data^3 * aCoef + data^2 * bCoef + data * cCoef + dCoef;
function fit_poly(img::Array{<:Gray, 2})
img = img[end:-1:1, :]
nodes = findall(img.>0)
xdata = map(p->p[2], nodes)
ydata = map(p->p[1], nodes)
f = fit(xdata, ydata, 3)
xdt = unique(xdata)
xdt, fx(xdt, f.coeffs...)
end;
function draw_poly!(X, y)
the_min = minimum(y)
if the_min<0
y .-= the_min - 1
end
initialized_img = Gray.(zeros(maximum(X), maximum(y)))
initialized_img[CartesianIndex.(X, y)] .= 1
dif = diff(y)
for i in eachindex(dif)
the_dif = dif[i]
if abs(the_dif) >= 2
segment = the_dif ÷ 2
initialized_img[i, y[i]:y[i]+segment] .= 1
initialized_img[i+1, y[i]+segment+1:y[i+1]-1] .= 1
end
end
rotl90(initialized_img)
end;
X, y = fit_poly(img);
y = convert(Vector{Int64}, round.(y));
draw_poly!(X, y)
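The same gap-filling idea can also be sketched with NumPy; the following is a minimal, hypothetical version (function and variable names are illustrative, not taken from the Julia code above), assuming the fitted values stay inside the image:
import numpy as np

def rasterize_curve(xdata, ydata, degree=2):
    # least-squares polynomial fit, then evaluate at every integer column
    coeffs = np.polyfit(xdata, ydata, degree)
    xs = np.arange(int(min(xdata)), int(max(xdata)) + 1)
    ys = np.rint(np.polyval(coeffs, xs)).astype(int)   # rounding is what creates the gaps
    canvas = np.zeros((int(ys.max()) + 1, int(xs.max()) + 1), dtype=np.uint8)
    for i in range(len(xs) - 1):
        lo, hi = sorted((ys[i], ys[i + 1]))
        canvas[lo:hi + 1, xs[i]] = 1                    # fill the vertical jump between columns
    canvas[ys[-1], xs[-1]] = 1
    return canvas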

Related

Why does Tesseract fail to recognize 6 out of 26 of my alphabetic keyboard keys even with several parameter tunings?

TL;DR I'm using:
adaptive thresholding
segmenting by keys (width/height ratio) - see green boxes in image result
psm 10 to treat each key as a character
but it fails to recognize some keys, falsely identifies others, or identifies 2 characters for 1 key (see the L character in the image result; it is read as both L and P), etc.
Note: I cropped the image and re-ran the results to get it to fit on this site, but before cropping it did slightly better (recognized more keys, fewer false positives, etc).
I just want it to recognize the alphabet keys. Ultimately I will want it to work for realtime video.
config:
'-l eng --oem 1 --psm 10 -c tessedit_char_whitelist="ABCDEFGHIJKLMNOPQRSTUVWXYZ"'
I've tried scaling the image differently, scaling the individual key segments, using opening/closing/etc but it doesn't recognize all the keys.
original image
image result
Update: if I make the image straighter (bird's-eye view) and remove the whitelisting, I get new results: it manages to detect almost everything (although it thinks the O is a 0 and the I is a |, which is understandable). Why is this, and how could I make it adaptive enough for dynamic video when it is so sensitive to these conditions?
Code:
import pytesseract
import numpy as np
try:
from PIL import Image
except ImportError:
import Image
import cv2
from tqdm import tqdm
from collections import defaultdict
def get_missing_chars(dict):
capital_alphabet = [chr(ascii) for ascii in range(65, 91)]
return [let for let in capital_alphabet if let not in dict]
def draw_box_and_char(img, contour_dims, c, box_col, text_col):
x, y, w, h = contour_dims
top_left = (x, y)
bot_right = (x + w, y+h)
font_offset = 3
text_pos = (x+h//2+12, y+h-font_offset)
img_copy = img.copy()
cv2.rectangle(img_copy, top_left, bot_right, box_col, 2)
cv2.putText(img_copy, c, text_pos, cv2.FONT_HERSHEY_SIMPLEX, fontScale=.5, color=text_col, thickness=1, lineType=cv2.LINE_AA)
return img_copy
def detect_keys(img):
scaling = .25
img = cv2.resize(img, None, fx=scaling, fy=scaling, interpolation=cv2.INTER_AREA)
print("img shape", img.shape)
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ratio_min = 0.7
area_min = 1000
nbrhood_size = 1001
bias = 2
# adapt to different lighting
bin_img = cv2.adaptiveThreshold(gray_img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
cv2.THRESH_BINARY_INV, nbrhood_size, bias)
items = cv2.findContours(bin_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = items[0] if len(items) == 2 else items[1]
key_contours = []
for c in contours:
x, y, w, h = cv2.boundingRect(c)
ratio = h/w
area = cv2.contourArea(c)
# square-like ratio, try to get character
if ratio > ratio_min and area > area_min:
key_contours.append(c)
detected = defaultdict(int)
n_kept = 0
img_copy = cv2.cvtColor(bin_img, cv2.COLOR_GRAY2RGB)
let_to_contour = {}
n_contours = len(key_contours)
# offset to get smaller square within the key segment for easier char recognition
offset = 10
show_each_char = False
for _, c in tqdm(enumerate(key_contours), total=n_contours):
x, y, w, h = cv2.boundingRect(c)
ratio = h/w
area = cv2.contourArea(c)
base = np.zeros(bin_img.shape, dtype=np.uint8)
base.fill(255)
n_kept += 1
new_y = y+offset
new_x = x+offset
new_h = h-2*offset
new_w = w-2*offset
base[new_y:new_y+new_h, new_x:new_x+new_w] = bin_img[new_y:new_y+new_h, new_x:new_x+new_w]
segment = cv2.bitwise_not(base)
# try scaling up individual keys
# scaling = 2
# segment = cv2.resize(segment, None, fx=scaling, fy=scaling, interpolation=cv2.INTER_CUBIC)
# psm 10: treats the segment as a single character
custom_config = r'-l eng --oem 1 --psm 10 -c tessedit_char_whitelist="ABCDEFGHIJKLMNOPQRSTUVWXYZ"'
d = pytesseract.image_to_data(segment, config=custom_config, output_type='dict')
conf = d['conf']
c = d['text'][-1]
if c:
# sometimes recognizes multiple keys even though there is only 1
for sub_c in c:
# save character and contour to draw on image and show bounds/detection
if sub_c not in let_to_contour or (sub_c in let_to_contour and conf > let_to_contour[sub_c]['conf']):
let_to_contour[sub_c] = {'conf': conf, 'cont': (new_x, new_y, new_w, new_h)}
else:
c = "?"
text_col = (0, 0, 255)
if show_each_char:
contour_dims = (new_x, new_y, new_w, new_h)
box_col = (0, 255, 0)
text_col = (0, 0, 0)
segment_with_boxes = draw_box_and_char(segment, contour_dims, c, box_col, text_col)
cv2.imshow('segment', segment_with_boxes)
cv2.waitKey(0)
cv2.destroyAllWindows()
# draw boxes around recognized keys
for c, data in let_to_contour.items():
box_col = (0, 255, 0)
text_col = (0, 0, 0)
img_copy = draw_box_and_char(img_copy, data['cont'], c, box_col, text_col)
detected = {k: 1 for k in let_to_contour}
for det in let_to_contour:
print(det, let_to_contour[det])
print("total detected: ", let_to_contour.keys())
missing = get_missing_chars(detected)
print(f"n_missing: {len(missing)}")
print(f"chars missing: {missing}")
return img_copy
if __name__ == "__main__":
img_file = "keyboard.jpg"
img = cv2.imread(img_file)
img_with_detected_keys = detect_keys(img)
cv2.imshow("detected", img_with_detected_keys)
cv2.waitKey(0)
cv2.destroyAllWindows()
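One way to approximate the bird's-eye view mentioned in the update is a perspective warp. A minimal sketch follows; it assumes the four outer keyboard corners have already been located (for example from the largest quadrilateral contour), and the corner ordering and output size are assumptions:
import cv2
import numpy as np

def birds_eye(img, corners, out_w=1200, out_h=400):
    # corners: 4x2 array, assumed ordered top-left, top-right, bottom-right, bottom-left
    dst = np.float32([[0, 0], [out_w - 1, 0], [out_w - 1, out_h - 1], [0, out_h - 1]])
    M = cv2.getPerspectiveTransform(np.float32(corners), dst)
    return cv2.warpPerspective(img, M, (out_w, out_h))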

How to solve logistic regression using gradient descent in octave?

I am taking the Machine Learning course by Andrew Ng on Coursera. I have written code for logistic regression in Octave, but it is not working. Can someone help me?
I have taken the dataset from the following link:
Titanic survivors
Here is my code:
pkg load io;
[An, Tn, Ra, limits] = xlsread("~/ML/ML Practice/dataset/train_and_test2.csv", "Sheet2", "A2:H1000");
# As per CSV file we are reading columns from 1 to 7. 8-th column is Survived, which is what we are going to predict
X = [An(:, [1:7])];
Y = [An(:, 8)];
X = horzcat(ones(size(X,1), 1), X);
# Initializing theta values as zero for all
#theta = zeros(size(X,2),1);
theta = [-3;1;1;-3;1;1;1;1];
learningRate = -0.00021;
#learningRate = -0.00011;
# Step 1: Calculate Hypothesis
function g_z = estimateHypothesis(X, theta)
z = theta' * X';
z = z';
e_z = -1 * power(2.72, z);
denominator = 1.+e_z;
g_z = 1./denominator;
endfunction
# Step 2: Calculate Cost function
function cost = estimateCostFunction(hypothesis, Y)
log_1 = log(hypothesis);
log_2 = log(1.-hypothesis);
y1 = Y;
term_1 = y1.*log_1;
y2 = 1.-Y;
term_2 = y2.*log_2;
cost = term_1 + term_2;
cost = sum(cost);
# no.of.rows
m = size(Y, 1);
cost = -1 * (cost/m);
endfunction
# Step 3: Using gradient descent I am updating theta values
function updatedTheta = updateThetaValues(_X, _Y, _theta, _hypothesis, learningRate)
#s1 = _X * _theta;
#s2 = s1 - _Y;
#s3 = _X' * s2;
# no.of.rows
#m = size(_Y, 1);
#s4 = (learningRate * s3)/m;
#updatedTheta = _theta - s4;
s1 = _hypothesis - _Y;
s2 = s1 .* _X;
s3 = sum(s2);
# no.of.rows
m = size(_Y, 1);
s4 = (learningRate * s3)/m;
updatedTheta = _theta .- s4';
endfunction
costVector = [];
iterationVector = [];
for i = 1:1000
# Step 1
hypothesis = estimateHypothesis(X, theta);
#disp("hypothesis");
#disp(hypothesis);
# Step 2
cost = estimateCostFunction(hypothesis, Y);
costVector = vertcat(costVector, cost);
#disp("Cost");
#disp(cost);
# Step 3 - Updating theta values
theta = updateThetaValues(X, Y, theta, hypothesis, learningRate);
iterationVector = vertcat(iterationVector, i);
endfor
function plotGraph(iterationVector, costVector)
plot(iterationVector, costVector);
ylabel('Cost Function');
xlabel('Iteration');
endfunction
plotGraph(iterationVector, costVector);
This is the graph I am getting when I am plotting against no.of.iterations and cost function.
I am tired of adjusting theta values and the learning rate. Can someone help me solve this problem?
Thanks.
I made a mathematical error: I should have used either power(2.72, -z) or exp(-z). Instead, I used -1 * power(2.72, z). Now I'm getting a proper curve.
Thanks.
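For reference, a minimal NumPy sketch of the corrected sigmoid and one gradient-descent step (illustrative only, with a positive learning rate; the names are not taken from the Octave code above):
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))            # exp(-z), not -1 * exp(z)

def gradient_step(X, y, theta, lr):
    h = sigmoid(X @ theta)                     # hypothesis for all m examples
    grad = X.T @ (h - y) / len(y)              # gradient of the cross-entropy cost
    return theta - lr * grad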

How to find accumulator matrix for line in an image?

I am a newbie in the field of CV and IP. I was writing the Hough transform algorithm for finding lines, and I am not getting what is wrong with this code, in which I'm trying to compute the accumulator array:
numRowsInBW = size(BW,1);
numColsInBW = size(BW,2);
%length of the diagonal of image
D = sqrt((numRowsInBW - 1)^2 + (numColsInBW - 1)^2);
%number of rows in the accumulator array
nrho = 2*(ceil(D/rhoStep)) + 1;
%number of cols in the accumulator array
ntheta = length(theta);
H = zeros(nrho,ntheta);
%this means the particular pixle is white
%i.e the edge pixle
[allrows allcols] = find(BW == 1);
for i = (1 : size(allrows))
y = allrows(i);
x = allcols(i);
for th = (1 : 180)
d = floor(x*cos(th) - y*sin(th));
H(d+floor(nrho/2),th) += 1;
end
end
I am applying this to a simple image.
I am getting this result.
But this is what is expected.
I am not able to find the mistake. Please help me. Thanks in advance.
There are several issues with your code. The main issue is here:
ntheta = length(theta);
% ...
for i = (1 : size(allrows))
% ...
for th = (1 : 180)
d = floor(x*cos(th) - y*sin(th));
% ...
th seems to be an angle in degrees, but cos(th) treats its argument as radians, so the result is meaningless here. Instead, use cosd and sind.
Another issue is that th iterates from 1 to 180, but there is no guarantee that ntheta is 180. So, loop as follows instead:
for i = 1 : size(allrows)
% ...
for j = 1 : numel(theta)
th = theta(j);
% ...
and use th as the angle, and j as the index into H.
Finally, given your image and your expected output, you should apply some edge detection first (Canny, for example). Maybe you already did this?
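For illustration, here is a minimal NumPy sketch of the accumulator fill described above; it uses an explicit theta vector (in degrees) indexed by position and the standard normal parameterisation rho = x*cos(theta) + y*sin(theta), which differs slightly from the sign convention in the original code:
import numpy as np

def hough_accumulator(edge_img, theta_deg=np.arange(-90, 90)):
    rows, cols = edge_img.shape
    diag = int(np.ceil(np.hypot(rows - 1, cols - 1)))   # maximum possible |rho|
    H = np.zeros((2 * diag + 1, len(theta_deg)), dtype=np.uint64)
    cos_t = np.cos(np.deg2rad(theta_deg))
    sin_t = np.sin(np.deg2rad(theta_deg))
    ys, xs = np.nonzero(edge_img)                       # coordinates of edge pixels
    for x, y in zip(xs, ys):
        rho = np.round(x * cos_t + y * sin_t).astype(int)
        H[rho + diag, np.arange(len(theta_deg))] += 1   # one vote per angle
    return H, theta_deg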

Emgu CV - Anisotropic Diffusion

Can anybody guide me to some existing implementations of anisotropic diffusion, preferably Perona-Malik diffusion? I would like to translate the following MATLAB code:
% pm2.m - Anisotropic Diffusion routines
function ZN = pm2(ZN,K,iterate);
[m,n] = size(ZN);
% lambda = 0.250;
lambda = .025;
%K=16;
rowC = [1:m]; rowN = [1 1:m-1]; rowS = [2:m m];
colC = [1:n]; colE = [2:n n]; colW = [1 1:n-1];
result_save=0;
for i = 1:iterate,
%i;
% result=PSNR(Z,ZN);
% if result>result_save
% result_save=result;
% else
% break;
% end
deltaN = ZN(rowN,colC) - ZN(rowC,colC);
deltaS = ZN(rowS,colC) - ZN(rowC,colC);
deltaE = ZN(rowC,colE) - ZN(rowC,colC);
deltaW = ZN(rowC,colW) - ZN(rowC,colC);
% deltaN = deltaN .*abs(deltaN<K);
% deltaS = deltaS .*abs(deltaS<K);
% deltaE = deltaE .*abs(deltaE<K);
% deltaW = deltaW .*abs(deltaW<K);
fluxN = deltaN .* exp(-((abs(deltaN) ./ K).^2) );
fluxS = deltaS .* exp(-((abs(deltaS) ./ K).^2) );
fluxE = deltaE .* exp(-((abs(deltaE) ./ K).^2) );
fluxW = deltaW .* exp(-((abs(deltaW) ./ K).^2) );
ZN = ZN + lambda*(fluxN +fluxS + fluxE + fluxW);
%ZN=max(0,ZN);ZN=min(255,ZN);
end
The code is not mine; it has been taken from: http://www.csee.wvu.edu/~xinl/code/pm2.m
OpenCV implementation (it needs a 3-channel image):
from cv2.ximgproc import anisotropicDiffusion
ultrasound_ad_cv2 = anisotropicDiffusion(im,0.075 ,80, 100)
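(If I read the cv2.ximgproc.anisotropicDiffusion signature correctly, the positional arguments are the input image, alpha (the diffusion step per iteration), K (the conduction/contrast threshold) and the number of iterations, and it expects an 8-bit 3-channel image.)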
Juxtapose comparison
From scratch in Python (for grayscale images only):
import scipy.ndimage.filters as flt
import numpy as np
import warnings
def anisodiff(img,niter=1,kappa=50,gamma=0.1,step=(1.,1.),sigma=0, option=1,ploton=False):
"""
Anisotropic diffusion.
Usage:
imgout = anisodiff(im, niter, kappa, gamma, option)
Arguments:
img - input image
niter - number of iterations
kappa - conduction coefficient 20-100 ?
gamma - max value of .25 for stability
step - tuple, the distance between adjacent pixels in (y,x)
option - 1 Perona Malik diffusion equation No 1
2 Perona Malik diffusion equation No 2
ploton - if True, the image will be plotted on every iteration
Returns:
imgout - diffused image.
kappa controls conduction as a function of gradient. If kappa is low
small intensity gradients are able to block conduction and hence diffusion
across step edges. A large value reduces the influence of intensity
gradients on conduction.
gamma controls speed of diffusion (you usually want it at a maximum of
0.25)
step is used to scale the gradients in case the spacing between adjacent
pixels differs in the x and y axes
Diffusion equation 1 favours high contrast edges over low contrast ones.
Diffusion equation 2 favours wide regions over smaller ones.
"""
# ...you could always diffuse each color channel independently if you
# really want
if img.ndim == 3:
warnings.warn("Only grayscale images allowed, converting to 2D matrix")
img = img.mean(2)
# initialize output array
img = img.astype('float32')
imgout = img.copy()
# initialize some internal variables
deltaS = np.zeros_like(imgout)
deltaE = deltaS.copy()
NS = deltaS.copy()
EW = deltaS.copy()
gS = np.ones_like(imgout)
gE = gS.copy()
# create the plot figure, if requested
if ploton:
import pylab as pl
from time import sleep
fig = pl.figure(figsize=(20,5.5),num="Anisotropic diffusion")
ax1,ax2 = fig.add_subplot(1,2,1),fig.add_subplot(1,2,2)
ax1.imshow(img,interpolation='nearest')
ih = ax2.imshow(imgout,interpolation='nearest',animated=True)
ax1.set_title("Original image")
ax2.set_title("Iteration 0")
fig.canvas.draw()
for ii in np.arange(1,niter):
# calculate the diffs
deltaS[:-1,: ] = np.diff(imgout,axis=0)
deltaE[: ,:-1] = np.diff(imgout,axis=1)
if 0<sigma:
deltaSf=flt.gaussian_filter(deltaS,sigma);
deltaEf=flt.gaussian_filter(deltaE,sigma);
else:
deltaSf=deltaS;
deltaEf=deltaE;
# conduction gradients (only need to compute one per dim!)
if option == 1:
gS = np.exp(-(deltaSf/kappa)**2.)/step[0]
gE = np.exp(-(deltaEf/kappa)**2.)/step[1]
elif option == 2:
gS = 1./(1.+(deltaSf/kappa)**2.)/step[0]
gE = 1./(1.+(deltaEf/kappa)**2.)/step[1]
# update matrices
E = gE*deltaE
S = gS*deltaS
# subtract a copy that has been shifted 'North/West' by one
# pixel. don't ask questions. just do it. trust me.
NS[:] = S
EW[:] = E
NS[1:,:] -= S[:-1,:]
EW[:,1:] -= E[:,:-1]
# update the image
imgout += gamma*(NS+EW)
if ploton:
iterstring = "Iteration %i" %(ii+1)
ih.set_data(imgout)
ax2.set_title(iterstring)
fig.canvas.draw()
# sleep(0.01)
return imgout
Usage:
#anisodiff(img,niter=1,kappa=50,gamma=0.1,step=(1.,1.),sigma=0, option=1,ploton=False)
us_im_ad = anisodiff(ultrasound,100,80,0.075,(1,1),2.5,1)
Source
Juxtapose comparison
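For reference, the update that both implementations above perform is the discrete Perona-Malik scheme (here lambda corresponds to gamma in the Python version and K to kappa):

$$I_{t+1} = I_t + \lambda \sum_{d \in \{N,S,E,W\}} g\!\left(\nabla_d I_t\right)\,\nabla_d I_t, \qquad g_1(s) = e^{-(|s|/K)^2}, \quad g_2(s) = \frac{1}{1 + (|s|/K)^2}$$

where $\nabla_d I_t$ is the finite difference to the neighbour in direction $d$, $g_1$ is diffusion equation 1 and $g_2$ is diffusion equation 2.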

DCT equation in openCV

I am writing JPEG compression in Scilab (an equivalent of MATLAB) using the function imdct. That function uses the DCT function from OpenCV, and I don't know which equation the dct function uses.
lenna by imdct
lenna by my_function
You can see Lenna produced by imdct, which is the internal function, and Lenna produced by my_function, which is my own function in Scilab.
Here is my code in Scilab:
function vystup = dct_rovnice(vstup)
[M,N] = size(vstup)
for u=1:M
for v=1:N
cos_celkem = 0;
for m=1:M
for n=1:N
pom = double(vstup(m,n));
cos_citatel1 = cos(((2*m) * u * %pi)/(2*M));
cos_citatel2 = cos(((2*n) * v * %pi)/(2*N));
cos_celkem = cos_celkem + (pom * cos_citatel1 * cos_citatel2);
end
end
c_u = 0;
c_v = 0;
if u == 1 then
c_u = 1 / sqrt(2);
else
c_u = 1;
end
if v == 1 then
c_v = 1 / sqrt(2);
else
c_v = 1;
end
vystup(u,v) = (2/sqrt(n*m)) * c_u * c_v * cos_celkem;
end
end
endfunction
function vystup = dct_prevod(vstup)
Y = vstup(:,:,1);
Cb = vstup(:,:,2);
Cr = vstup(:,:,3);
[rows,columns]=size(vstup)
vystup = zeros(rows,columns,3)
for y=1:8:rows-7
for x=1:8:columns-7
blok_Y = Y(y:y+7,x:x+7)
blok_Cb = Cb(y:y+7,x:x+7)
blok_Cr = Cr(y:y+7,x:x+7)
blok_dct_Y = dct_rovnice(blok_Y)
blok_dct_Cb = dct_rovnice(blok_Cb)
blok_dct_Cr = dct_rovnice(blok_Cr)
vystup(y:y+7,x:x+7,1)= blok_dct_Y
vystup(y:y+7,x:x+7,2)= blok_dct_Cb
vystup(y:y+7,x:x+7,3)= blok_dct_Cr
end
end
vystup = uint8(vystup)
endfunction
You can see the equation I used:
EQUATION
The issue seems to be in the use of a different normalization of the resulting coefficients.
The OpenCV library uses this equation for a forward transform (N = 8, in your case):

$$Y = G^{(N)} \cdot X \cdot \left(G^{(N)}\right)^{T}$$

The basis g is defined as

$$g^{(N)}_{jk} = \sqrt{\frac{\alpha_j}{N}}\,\cos\!\left(\frac{\pi\,(2k+1)\,j}{2N}\right)$$

where

$$\alpha_0 = 1, \qquad \alpha_j = 2 \ \text{for } j > 0.$$
Take care: there are several definitions of the DCT (DCT-I, DCT-II, DCT-III and DCT-IV, normalized and un-normalized).
Moreover, have you tried the Scilab builtin function dct (from FFTW), which can be applied straightforwardly to images?
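A quick way to check which definition OpenCV uses (assuming SciPy is available) is to compare it against SciPy's DCT-II with orthonormal scaling, which it should match:
import numpy as np
import cv2
from scipy.fft import dctn

block = np.random.rand(8, 8).astype(np.float32)
# compare OpenCV's 2-D DCT against an orthonormalised DCT-II; expected output: True
print(np.allclose(cv2.dct(block), dctn(block, type=2, norm='ortho'), atol=1e-5))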
