I'm trying to fit a line using a quadratic polynomial, but because the fit produces continuous values, the integer conversion (for CartesianIndex) rounds them off, and I lose data at those pixels.
I tried the method here, so I get the new y values as:
using Images, Polynomials, Plots, ImageView
img = load("jTjYb.png")
img = Gray.(img)
img = img[end:-1:1, :]
nodes = findall(img.>0)
xdata = map(p->p[2], nodes)
ydata = map(p->p[1], nodes)
f = fit(xdata, ydata, 2)
ydata_new = round.(Int, f.(xdata))
new_line_fitted_img = zeros(size(img))
new_line_fitted_img[CartesianIndex.(ydata_new, xdata)] .= 1
imshow(new_line_fitted_img)
which results in a chopped line, as below, whereas I was expecting a continuous line, as it was after pre-processing.
Do you expect the following:
Raw Image
Fitted Polynomial
Superposition
Code:
using Images, Polynomials
img = load("img.png");
img = Gray.(img)
fx(data, dCoef, cCoef, bCoef, aCoef) = @. data^3 * aCoef + data^2 * bCoef + data * cCoef + dCoef;
function fit_poly(img::Array{<:Gray, 2})
    img = img[end:-1:1, :]
    nodes = findall(img .> 0)
    xdata = map(p -> p[2], nodes)
    ydata = map(p -> p[1], nodes)
    f = fit(xdata, ydata, 3)
    xdt = unique(xdata)
    xdt, fx(xdt, f.coeffs...)
end;
function draw_poly!(X, y)
    # shift y so that all indices are positive
    the_min = minimum(y)
    if the_min < 0
        y .-= the_min - 1
    end
    initialized_img = Gray.(zeros(maximum(X), maximum(y)))
    initialized_img[CartesianIndex.(X, y)] .= 1
    # fill gaps where consecutive fitted y values jump by 2 or more,
    # splitting each gap between the two neighboring x positions
    dif = diff(y)
    for i in eachindex(dif)
        the_dif = dif[i]
        if abs(the_dif) >= 2
            segment = the_dif ÷ 2
            initialized_img[i, y[i]:y[i]+segment] .= 1
            initialized_img[i+1, y[i]+segment+1:y[i+1]-1] .= 1
        end
    end
    rotl90(initialized_img)
end;
X, y = fit_poly(img);
y = convert(Vector{Int64}, round.(y));
draw_poly!(X, y)
TL;DR I'm using:
adaptive thresholding
segmenting by keys (width/height ratio) - see green boxes in image result
psm 10 to treat each key as a character
but it fails to recognize some keys, falsely identifies others, or identifies 2 characters for 1 (see the L character in the image result: it reads both an L and a P), etc.
Note: I cropped the image and re-ran the results to get it to fit on this site, but before cropping it did slightly better (recognized more keys, fewer false positives, etc).
I just want it to recognize the alphabet keys. Ultimately I will want it to work for realtime video.
config:
'-l eng --oem 1 --psm 10 -c tessedit_char_whitelist="ABCDEFGHIJKLMNOPQRSTUVWXYZ"'
I've tried scaling the image differently, scaling the individual key segments, and using opening/closing/etc., but it doesn't recognize all the keys.
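For reference, this is roughly the kind of opening/closing step I mean; a minimal sketch where the thresholding and the 3x3 kernel are illustrative placeholders, not the exact values I used:
import cv2
import numpy as np

img = cv2.imread("keyboard.jpg", cv2.IMREAD_GRAYSCALE)
_, bin_img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
kernel = np.ones((3, 3), np.uint8)                           # placeholder kernel size
opened = cv2.morphologyEx(bin_img, cv2.MORPH_OPEN, kernel)   # removes small specks
closed = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)   # bridges small gaps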
original image
image result
Update: new results. If I make the image straighter (bird's eye) and remove the whitelisting, it manages to detect all keys for the most part (although it thinks the O is a 0 and the I is a |, which is understandable). Why is this, and how could I make this adaptive enough for dynamic video when it is so sensitive to these conditions?
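For context, this is roughly how I straightened the image; a minimal sketch assuming the four keyboard corners are known (the coordinates and output size below are placeholders, not measured values):
import cv2
import numpy as np

img = cv2.imread("keyboard.jpg")
# placeholder corner coordinates (TL, TR, BR, BL); a real pipeline would detect these
src = np.float32([[110, 80], [1180, 95], [1215, 540], [75, 520]])
w, h = 1100, 420  # placeholder output size
dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
M = cv2.getPerspectiveTransform(src, dst)
birds_eye = cv2.warpPerspective(img, M, (w, h))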
Code:
import pytesseract
import numpy as np
try:
    from PIL import Image
except ImportError:
    import Image
import cv2
from tqdm import tqdm
from collections import defaultdict
def get_missing_chars(dict):
    capital_alphabet = [chr(ascii) for ascii in range(65, 91)]
    return [let for let in capital_alphabet if let not in dict]
def draw_box_and_char(img, contour_dims, c, box_col, text_col):
    x, y, w, h = contour_dims
    top_left = (x, y)
    bot_right = (x + w, y+h)
    font_offset = 3
    text_pos = (x+h//2+12, y+h-font_offset)
    img_copy = img.copy()
    cv2.rectangle(img_copy, top_left, bot_right, box_col, 2)
    cv2.putText(img_copy, c, text_pos, cv2.FONT_HERSHEY_SIMPLEX, fontScale=.5, color=text_col, thickness=1, lineType=cv2.LINE_AA)
    return img_copy
def detect_keys(img):
    scaling = .25
    img = cv2.resize(img, None, fx=scaling, fy=scaling, interpolation=cv2.INTER_AREA)
    print("img shape", img.shape)
    gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    ratio_min = 0.7
    area_min = 1000
    nbrhood_size = 1001
    bias = 2
    # adapt to different lighting
    bin_img = cv2.adaptiveThreshold(gray_img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                    cv2.THRESH_BINARY_INV, nbrhood_size, bias)
    items = cv2.findContours(bin_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = items[0] if len(items) == 2 else items[1]
    key_contours = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        ratio = h/w
        area = cv2.contourArea(c)
        # square-like ratio, try to get character
        if ratio > ratio_min and area > area_min:
            key_contours.append(c)
    detected = defaultdict(int)
    n_kept = 0
    img_copy = cv2.cvtColor(bin_img, cv2.COLOR_GRAY2RGB)
    let_to_contour = {}
    n_contours = len(key_contours)
    # offset to get smaller square within the key segment for easier char recognition
    offset = 10
    show_each_char = False
    for _, c in tqdm(enumerate(key_contours), total=n_contours):
        x, y, w, h = cv2.boundingRect(c)
        ratio = h/w
        area = cv2.contourArea(c)
        base = np.zeros(bin_img.shape, dtype=np.uint8)
        base.fill(255)
        n_kept += 1
        new_y = y+offset
        new_x = x+offset
        new_h = h-2*offset
        new_w = w-2*offset
        base[new_y:new_y+new_h, new_x:new_x+new_w] = bin_img[new_y:new_y+new_h, new_x:new_x+new_w]
        segment = cv2.bitwise_not(base)
        # try scaling up individual keys
        # scaling = 2
        # segment = cv2.resize(segment, None, fx=scaling, fy=scaling, interpolation=cv2.INTER_CUBIC)
        # psm 10: treats the segment as a single character
        custom_config = r'-l eng --oem 1 --psm 10 -c tessedit_char_whitelist="ABCDEFGHIJKLMNOPQRSTUVWXYZ"'
        d = pytesseract.image_to_data(segment, config=custom_config, output_type='dict')
        conf = d['conf']
        c = d['text'][-1]
        if c:
            # sometimes recognizes multiple keys even though there is only 1
            for sub_c in c:
                # save character and contour to draw on image and show bounds/detection
                if sub_c not in let_to_contour or (sub_c in let_to_contour and conf > let_to_contour[sub_c]['conf']):
                    let_to_contour[sub_c] = {'conf': conf, 'cont': (new_x, new_y, new_w, new_h)}
        else:
            c = "?"
            text_col = (0, 0, 255)
        if show_each_char:
            contour_dims = (new_x, new_y, new_w, new_h)
            box_col = (0, 255, 0)
            text_col = (0, 0, 0)
            segment_with_boxes = draw_box_and_char(segment, contour_dims, c, box_col, text_col)
            cv2.imshow('segment', segment_with_boxes)
            cv2.waitKey(0)
            cv2.destroyAllWindows()
    # draw boxes around recognized keys
    for c, data in let_to_contour.items():
        box_col = (0, 255, 0)
        text_col = (0, 0, 0)
        img_copy = draw_box_and_char(img_copy, data['cont'], c, box_col, text_col)
    detected = {k: 1 for k in let_to_contour}
    for det in let_to_contour:
        print(det, let_to_contour[det])
    print("total detected: ", let_to_contour.keys())
    missing = get_missing_chars(detected)
    print(f"n_missing: {len(missing)}")
    print(f"chars missing: {missing}")
    return img_copy
if __name__ == "__main__":
    img_file = "keyboard.jpg"
    img = cv2.imread(img_file)
    img_with_detected_keys = detect_keys(img)
    cv2.imshow("detected", img_with_detected_keys)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
I am taking the Machine Learning course on Coursera by Andrew Ng. I have written code for logistic regression in Octave, but it is not working. Can someone help me?
I have taken the dataset from the following link:
Titanic survivors
Here is my code:
pkg load io;
[An, Tn, Ra, limits] = xlsread("~/ML/ML Practice/dataset/train_and_test2.csv", "Sheet2", "A2:H1000");
# As per the CSV file, we are reading columns 1 to 7. The 8th column is Survived, which is what we are going to predict
X = [An(:, [1:7])];
Y = [An(:, 8)];
X = horzcat(ones(size(X,1), 1), X);
# Initializing theta values as zero for all
#theta = zeros(size(X,2),1);
theta = [-3;1;1;-3;1;1;1;1];
learningRate = -0.00021;
#learningRate = -0.00011;
# Step 1: Calculate Hypothesis
function g_z = estimateHypothesis(X, theta)
  z = theta' * X';
  z = z';
  e_z = -1 * power(2.72, z);
  denominator = 1.+e_z;
  g_z = 1./denominator;
endfunction
# Step 2: Calculate Cost function
function cost = estimateCostFunction(hypothesis, Y)
  log_1 = log(hypothesis);
  log_2 = log(1.-hypothesis);
  y1 = Y;
  term_1 = y1.*log_1;
  y2 = 1.-Y;
  term_2 = y2.*log_2;
  cost = term_1 + term_2;
  cost = sum(cost);
  # no.of.rows
  m = size(Y, 1);
  cost = -1 * (cost/m);
endfunction
# Step 3: Using gradient descent I am updating theta values
function updatedTheta = updateThetaValues(_X, _Y, _theta, _hypothesis, learningRate)
  #s1 = _X * _theta;
  #s2 = s1 - _Y;
  #s3 = _X' * s2;
  # no.of.rows
  #m = size(_Y, 1);
  #s4 = (learningRate * s3)/m;
  #updatedTheta = _theta - s4;
  s1 = _hypothesis - _Y;
  s2 = s1 .* _X;
  s3 = sum(s2);
  # no.of.rows
  m = size(_Y, 1);
  s4 = (learningRate * s3)/m;
  updatedTheta = _theta .- s4';
endfunction
costVector = [];
iterationVector = [];
for i = 1:1000
  # Step 1
  hypothesis = estimateHypothesis(X, theta);
  #disp("hypothesis");
  #disp(hypothesis);
  # Step 2
  cost = estimateCostFunction(hypothesis, Y);
  costVector = vertcat(costVector, cost);
  #disp("Cost");
  #disp(cost);
  # Step 3 - Updating theta values
  theta = updateThetaValues(X, Y, theta, hypothesis, learningRate);
  iterationVector = vertcat(iterationVector, i);
endfor
function plotGraph(iterationVector, costVector)
  plot(iterationVector, costVector);
  ylabel('Cost Function');
  xlabel('Iteration');
endfunction
plotGraph(iterationVector, costVector);
This is the graph I get when plotting the cost function against the number of iterations.
I am tired of adjusting theta values and the learning rate. Can someone help me solve this problem?
Thanks.
I made a mathematical error: I should have used either power(2.72, -z) or exp(-z). Instead I used -1 * power(2.72, z). Now I'm getting a proper curve.
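A quick numerical sketch of the difference (in Python/NumPy rather than Octave, purely for illustration):
import numpy as np

z = np.array([-2.0, 1.0, 2.0])
g_wrong = 1.0 / (1.0 + (-1) * np.power(2.72, z))  # the buggy version
g_right = 1.0 / (1.0 + np.exp(-z))                # correct sigmoid
print(g_wrong)  # values fall outside (0, 1), so log() in the cost function breaks
print(g_right)  # approximately [0.119 0.731 0.881]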
Thanks.
I am a newbie in the field of CV and IP. I was writing the Hough transform algorithm for finding lines, and I am not seeing what is wrong with this code, in which I am trying to build the accumulator array:
numRowsInBW = size(BW,1);
numColsInBW = size(BW,2);
%length of the diagonal of image
D = sqrt((numRowsInBW - 1)^2 + (numColsInBW - 1)^2);
%number of rows in the accumulator array
nrho = 2*(ceil(D/rhoStep)) + 1;
%number of cols in the accumulator array
ntheta = length(theta);
H = zeros(nrho,ntheta);
%this means the particular pixle is white
%i.e the edge pixle
[allrows allcols] = find(BW == 1);
for i = (1 : size(allrows))
    y = allrows(i);
    x = allcols(i);
    for th = (1 : 180)
        d = floor(x*cos(th) - y*sin(th));
        H(d+floor(nrho/2),th) += 1;
    end
end
I am applying this to a simple image:
I am getting this result:
But this is what is expected:
I am not able to find the mistake. Please help me. Thanks in advance.
There are several issues with your code. The main issue is here:
ntheta = length(theta);
% ...
for i = (1 : size(allrows))
    % ...
    for th = (1 : 180)
        d = floor(x*cos(th) - y*sin(th));
        % ...
th seems to be an angle in degrees, but cos and sin expect radians, so cos(th) is meaningless here. Instead, use cosd and sind.
Another issue is that th iterates from 1 to 180, but there is no guarantee that ntheta is 180. So, loop as follows instead:
for i = 1 : size(allrows)
    % ...
    for j = 1 : numel(theta)
        th = theta(j);
        % ...
and use th as the angle, and j as the index into H.
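Putting these fixes together, here is a sketch of the same accumulator logic in Python/NumPy (note it uses the common rho = x*cos(theta) + y*sin(theta) parameterization rather than the x*cos(theta) - y*sin(theta) from the question):
import numpy as np

def hough_accumulator(BW, theta_deg=np.arange(-90, 90)):
    """Vote for rho = x*cos(theta) + y*sin(theta); theta is given in degrees."""
    rows, cols = np.nonzero(BW)                       # white (edge) pixels
    D = int(np.ceil(np.hypot(BW.shape[0] - 1, BW.shape[1] - 1)))
    H = np.zeros((2 * D + 1, theta_deg.size), dtype=np.int64)  # rho in [-D, D]
    cos_t = np.cos(np.deg2rad(theta_deg))             # convert degrees once
    sin_t = np.sin(np.deg2rad(theta_deg))
    for y, x in zip(rows, cols):
        rho = np.rint(x * cos_t + y * sin_t).astype(int)
        H[rho + D, np.arange(theta_deg.size)] += 1    # one vote per angle
    return H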
Finally, given your image and your expected output, you should apply some edge detection first (Canny, for example). Maybe you already did this?
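For example, a minimal Canny step in OpenCV (the thresholds are illustrative and would need tuning):
import cv2

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)  # hypothetical filename
edges = cv2.Canny(img, 100, 200)  # hysteresis thresholds
BW = edges > 0                    # boolean edge mask to feed the accumulator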
Can anybody guide me to some existing implementations of anisotropic diffusion, preferably Perona-Malik diffusion? For example, something that translates the following MATLAB code:
% pm2.m - Anisotropic Diffusion routines
function ZN = pm2(ZN,K,iterate);
[m,n] = size(ZN);
% lambda = 0.250;
lambda = .025;
%K=16;
rowC = [1:m]; rowN = [1 1:m-1]; rowS = [2:m m];
colC = [1:n]; colE = [2:n n]; colW = [1 1:n-1];
result_save=0;
for i = 1:iterate,
    %i;
    % result=PSNR(Z,ZN);
    % if result>result_save
    %     result_save=result;
    % else
    %     break;
    % end
    deltaN = ZN(rowN,colC) - ZN(rowC,colC);
    deltaS = ZN(rowS,colC) - ZN(rowC,colC);
    deltaE = ZN(rowC,colE) - ZN(rowC,colC);
    deltaW = ZN(rowC,colW) - ZN(rowC,colC);
    % deltaN = deltaN .*abs(deltaN<K);
    % deltaS = deltaS .*abs(deltaS<K);
    % deltaE = deltaE .*abs(deltaE<K);
    % deltaW = deltaW .*abs(deltaW<K);
    fluxN = deltaN .* exp(-((abs(deltaN) ./ K).^2) );
    fluxS = deltaS .* exp(-((abs(deltaS) ./ K).^2) );
    fluxE = deltaE .* exp(-((abs(deltaE) ./ K).^2) );
    fluxW = deltaW .* exp(-((abs(deltaW) ./ K).^2) );
    ZN = ZN + lambda*(fluxN +fluxS + fluxE + fluxW);
    %ZN=max(0,ZN);ZN=min(255,ZN);
end
the code is not mine and has been taken from: http://www.csee.wvu.edu/~xinl/code/pm2.m
OpenCV implementation (it needs a 3-channel image):
from cv2.ximgproc import anisotropicDiffusion
ultrasound_ad_cv2 = anisotropicDiffusion(im, 0.075, 80, 100)
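Since the function rejects single-channel input, a grayscale frame can be replicated to 3 channels first; a small sketch ("ultrasound.png" is a hypothetical filename):
import cv2

gray = cv2.imread("ultrasound.png", cv2.IMREAD_GRAYSCALE)
im = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)  # anisotropicDiffusion needs an 8-bit 3-channel image
# arguments: src, alpha (integration step), K (contrast sensitivity), niters
ultrasound_ad_cv2 = cv2.ximgproc.anisotropicDiffusion(im, 0.075, 80, 100)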
Juxtapose comparison
From scratch in Python: (For grayscale image only)
import scipy.ndimage.filters as flt
import numpy as np
import warnings
def anisodiff(img,niter=1,kappa=50,gamma=0.1,step=(1.,1.),sigma=0, option=1,ploton=False):
    """
    Anisotropic diffusion.

    Usage:
        imgout = anisodiff(im, niter, kappa, gamma, option)

    Arguments:
        img    - input image
        niter  - number of iterations
        kappa  - conduction coefficient 20-100 ?
        gamma  - max value of .25 for stability
        step   - tuple, the distance between adjacent pixels in (y,x)
        option - 1 Perona Malik diffusion equation No 1
                 2 Perona Malik diffusion equation No 2
        ploton - if True, the image will be plotted on every iteration

    Returns:
        imgout - diffused image.

    kappa controls conduction as a function of gradient. If kappa is low
    small intensity gradients are able to block conduction and hence diffusion
    across step edges. A large value reduces the influence of intensity
    gradients on conduction.

    gamma controls speed of diffusion (you usually want it at a maximum of
    0.25)

    step is used to scale the gradients in case the spacing between adjacent
    pixels differs in the x and y axes

    Diffusion equation 1 favours high contrast edges over low contrast ones.
    Diffusion equation 2 favours wide regions over smaller ones.
    """
    # ...you could always diffuse each color channel independently if you
    # really want
    if img.ndim == 3:
        warnings.warn("Only grayscale images allowed, converting to 2D matrix")
        img = img.mean(2)

    # initialize output array
    img = img.astype('float32')
    imgout = img.copy()

    # initialize some internal variables
    deltaS = np.zeros_like(imgout)
    deltaE = deltaS.copy()
    NS = deltaS.copy()
    EW = deltaS.copy()
    gS = np.ones_like(imgout)
    gE = gS.copy()

    # create the plot figure, if requested
    if ploton:
        import pylab as pl
        from time import sleep

        fig = pl.figure(figsize=(20,5.5),num="Anisotropic diffusion")
        ax1,ax2 = fig.add_subplot(1,2,1),fig.add_subplot(1,2,2)
        ax1.imshow(img,interpolation='nearest')
        ih = ax2.imshow(imgout,interpolation='nearest',animated=True)
        ax1.set_title("Original image")
        ax2.set_title("Iteration 0")
        fig.canvas.draw()

    for ii in np.arange(1,niter):
        # calculate the diffs
        deltaS[:-1,: ] = np.diff(imgout,axis=0)
        deltaE[: ,:-1] = np.diff(imgout,axis=1)

        if 0 < sigma:
            deltaSf = flt.gaussian_filter(deltaS,sigma)
            deltaEf = flt.gaussian_filter(deltaE,sigma)
        else:
            deltaSf = deltaS
            deltaEf = deltaE

        # conduction gradients (only need to compute one per dim!)
        if option == 1:
            gS = np.exp(-(deltaSf/kappa)**2.)/step[0]
            gE = np.exp(-(deltaEf/kappa)**2.)/step[1]
        elif option == 2:
            gS = 1./(1.+(deltaSf/kappa)**2.)/step[0]
            gE = 1./(1.+(deltaEf/kappa)**2.)/step[1]

        # update matrices
        E = gE*deltaE
        S = gS*deltaS

        # subtract a copy that has been shifted 'North/West' by one
        # pixel. don't ask questions. just do it. trust me.
        NS[:] = S
        EW[:] = E
        NS[1:,:] -= S[:-1,:]
        EW[:,1:] -= E[:,:-1]

        # update the image
        imgout += gamma*(NS+EW)

        if ploton:
            iterstring = "Iteration %i" %(ii+1)
            ih.set_data(imgout)
            ax2.set_title(iterstring)
            fig.canvas.draw()
            # sleep(0.01)

    return imgout
Usage:
#anisodiff(img,niter=1,kappa=50,gamma=0.1,step=(1.,1.),sigma=0, option=1,ploton=False)
us_im_ad = anisodiff(ultrasound,100,80,0.075,(1,1),2.5,1)
Source
Juxtapose comparison
I am writing JPEG compression in Scilab (an equivalent of MATLAB) using the function imdct. This function uses the DCT function from OpenCV, and I don't know which equation the dct function uses.
Lenna by imdct
Lenna by my_function
You can see Lenna produced by imdct, which is the internal function, and Lenna produced by my_function, which is my own function in Scilab.
Here is my Scilab code:
function vystup = dct_rovnice(vstup)
    [M,N] = size(vstup)
    for u=1:M
        for v=1:N
            cos_celkem = 0;
            for m=1:M
                for n=1:N
                    pom = double(vstup(m,n));
                    cos_citatel1 = cos(((2*m) * u * %pi)/(2*M));
                    cos_citatel2 = cos(((2*n) * v * %pi)/(2*N));
                    cos_celkem = cos_celkem + (pom * cos_citatel1 * cos_citatel2);
                end
            end
            c_u = 0;
            c_v = 0;
            if u == 1 then
                c_u = 1 / sqrt(2);
            else
                c_u = 1;
            end
            if v == 1 then
                c_v = 1 / sqrt(2);
            else
                c_v = 1;
            end
            vystup(u,v) = (2/sqrt(n*m)) * c_u * c_v * cos_celkem;
        end
    end
endfunction
function vystup = dct_prevod(vstup)
    Y = vstup(:,:,1);
    Cb = vstup(:,:,2);
    Cr = vstup(:,:,3);
    [rows,columns] = size(vstup)
    vystup = zeros(rows,columns,3)
    for y=1:8:rows-7
        for x=1:8:columns-7
            blok_Y = Y(y:y+7, x:x+7)
            blok_Cb = Cb(y:y+7, x:x+7)
            blok_Cr = Cr(y:y+7, x:x+7)
            blok_dct_Y = dct_rovnice(blok_Y)
            blok_dct_Cb = dct_rovnice(blok_Cb)
            blok_dct_Cr = dct_rovnice(blok_Cr)
            vystup(y:y+7, x:x+7, 1) = blok_dct_Y
            vystup(y:y+7, x:x+7, 2) = blok_dct_Cb
            vystup(y:y+7, x:x+7, 3) = blok_dct_Cr
        end
    end
    vystup = uint8(vystup)
endfunction
This is the equation I used:

$$F(u,v) = \frac{2}{\sqrt{MN}}\, C(u)\, C(v) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m,n) \cos\!\left(\frac{(2m+1)u\pi}{2M}\right) \cos\!\left(\frac{(2n+1)v\pi}{2N}\right), \quad C(k) = \begin{cases} 1/\sqrt{2} & k = 0 \\ 1 & \text{otherwise} \end{cases}$$
The issue seems to be the use of a different normalization of the resulting coefficients.
The OpenCV library uses this equation for the forward transform ($N = 8$ in your case):

$$Y = C^{(N)} \cdot X \cdot \left(C^{(N)}\right)^{T}$$

The basis $C^{(N)}$ is defined as

$$C^{(N)}_{jk} = \sqrt{\frac{\alpha_j}{N}} \cos\!\left(\frac{\pi (2k+1) j}{2N}\right)$$

where $\alpha_0 = 1$ and $\alpha_j = 2$ for $j > 0$.
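A quick numerical check of that formula against OpenCV's dct, as a Python sketch (Scilab's OpenCV-backed imdct should behave the same way):
import numpy as np
import cv2

N = 8
j, k = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
alpha = np.where(j == 0, 1.0, 2.0)
C = np.sqrt(alpha / N) * np.cos(np.pi * (2 * k + 1) * j / (2 * N))  # basis matrix

X = np.random.rand(N, N)                     # random 8x8 block, float64
print(np.allclose(C @ X @ C.T, cv2.dct(X)))  # expected: True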
Take care: there are several definitions of the DCT (DCT-I, DCT-II, DCT-III and DCT-IV, normalized and un-normalized).
Moreover, have you tried the Scilab builtin function dct (from FFTW), which can be applied straightforwardly to images?
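As a small illustration of those variants, SciPy exposes the definition through its type and norm arguments (a Python sketch):
import numpy as np
from scipy.fft import dct

x = np.random.rand(8)
y_unnorm = dct(x, type=2)               # un-normalized DCT-II
y_ortho = dct(x, type=2, norm="ortho")  # orthonormal DCT-II (the cv2.dct convention)
print(y_unnorm[0] / y_ortho[0])         # the two differ only by per-coefficient scale factors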