Projective Transform - matlab code - image-processing

i can't use any toolbox function i need to build it from scratch.
% load images
img1 = readImage('roadSign.tif');
img2 = readImage('lena.tif');
% call the main function
mapIntoImage(img1,img2)
function [newImage] = mapIntoImage(imageA,imageB)
% Input: imageA, imageB - a grayscale image in the range [0..255].
%
% Output: newImage – imageA into which image B has been mapped.
%
showImage(imageA)
hold on
% Initially, the list of points is empty.
xy = [];
% Loop, picking up the points.
disp('Please enter corners of place to insert image in clockwise order.')
for j = 1:4
[xi,yi] = ginput(1);
%draw a yellow dot
plot(xi,yi,'yo')
xy(:,j) = [xi;yi];
end
% get x1 y1 cordinates - xy(:, 1)
imgRow = size(imageB,1);
imgCol = size(imageB,2);
[X,Y] = meshgrid(1:imgCol,1:imgRow);
imgBcords = [0 size(imageB, 1) size(imageB,1) 0 ;
0 0 size(imageB,2) size(imageB,2)];
coefs = findCoefficients(xy, imgBcords);
A = [coefs(1) coefs(2) coefs(5);coefs(3) coefs(4) coefs(6); coefs(7) coefs(8) 1];
temp = zeros(size(X,1), size(X,2), 3);
new = ones(256);
for i = 1:size(X,1)
for j = 1:size(X,2)
temp(i,j,:) =A*[X(i,j); Y(i,j); new(i,j)];
end
end
end
function [ result ] = findCoefficients( imageA, imageB )
% finds coefficients for inverse mapping algorithem
% takes 2 X 2d vectors each consists of 4 points x,y
% and returns the coef accroding to reverse mapping function
%
% x y 0 0 1 0 -xx' -yx'
% 0 0 x y 0 1 -xy' -yy'
% y' and x' are in the destenation picture;
A = [imageB(1,1) imageB(2,1) 0 0 1 0 -imageB(1,1)*imageA(1,1) -imageB(2,1)*imageA(1,1);
0 0 imageB(1,1) imageB(2,1) 0 1 -imageB(1,1)*imageA(2,1) -imageB(2,1)*imageA(2,1);
imageB(1,2) imageB(2,2) 0 0 1 0 -imageB(1,2)*imageA(1,2) -imageB(2,2)*imageA(1,2);
0 0 imageB(1,2) imageB(2,2) 0 1 -imageB(1,2)*imageA(2,2) -imageB(2,2)*imageA(2,2);
imageB(1,3) imageB(2,3) 0 0 1 0 -imageB(1,3)*imageA(1,3) -imageB(2,3)*imageA(1,3);
0 0 imageB(1,3) imageB(2,3) 0 1 -imageB(1,3)*imageA(2,3) -imageB(2,3)*imageA(2,3);
imageB(1,4) imageB(2,4) 0 0 1 0 -imageB(1,4)*imageA(1,4) -imageB(2,4)*imageA(1,4);
0 0 imageB(1,4) imageB(2,4) 0 1 -imageB(1,4)*imageA(2,4) -imageB(2,4)*imageA(2,4)];
B = [imageB(1,1); imageB(2,1); imageB(1,2); imageB(2,2); imageB(1,3); imageB(2,3); imageB(1,4); imageB(2,4)];
result = pinv(A)*B;
end
i want to build now the transform
[x' y' 1] = A*[X Y 1];
i have figured out that i would need to use repmat, but i can't seem to get to the real syntax without loops.
what's the most efficient way to do it?

A projective transform has the form of
$ x' = \frac {a_{11}x+a_{12}y+a_{13}}{a_{13}x+a_{23}y+a_{33}} \\
y' = \frac {a_{21}x+a_{22}y+a_{23}}{a_{13}x+a_{23}y+a_{33}}
$
Where the coefficients are defined up to some scale factor. One of the ways to ensure a constant scale factor is to set $a_{33}=1$. One easy way to think about it is to use the homogenous coordinates:
$ \left( \begin{array}{ccc} x'\\y'\\S\end{array} \right) =
\left( \begin{array}{ccc} a_{11} & a_{12} & a_{13}\\a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33}\end{array} \right)
\left( \begin{array}{ccc} x\\y\\1\end{array} \right)
$
These coordinates are defined up to scale. That is,
$ \left( \begin{array}{ccc} x'/S\\y'/S\\1\end{array} \right) \equiv
\left( \begin{array}{ccc} x'\\y'\\S\end{array} \right)$
Thus, in your case you should do: (Assuming that x and y are column vectors, and A is the transpose of the matrix that I described above:
XY = A * [x y ones(size(x))];
XY(:,1) = XY(:,1)./XY(:,3);
XY(:,2) = XY(:,2)./XY(:,3);

Related

How does Roblox's math.noise() deal with negative inputs?

While messing around with noise outside of Roblox, I realized Perlin/Simplex Noise does not like negative inputs. Remembering Roblox has a noise function, I tried there, and found out negative numbers do work nicely for Roblox's math.noise(). Does anybody know how they made this work, or how to get negative numbers to work for Perlin/Simplex noise in general?
The Simplex Noise I am using (copied from here but changed to have the bitwise and operation):
local function bit_and(a, b) --bitwise and operation
local p, c = 1, 0
while a > 0 and b > 0 do
local ra, rb = a%2, b%2
if (ra + rb) > 1 then
c = c + p
end
a = (a - ra) / 2
b = (b - rb) / 2
p = p * 2
end
return c
end
-- 2D simplex noise
local grad3 = {
{1,1,0},{-1,1,0},{1,-1,0},{-1,-1,0},
{1,0,1},{-1,0,1},{1,0,-1},{-1,0,-1},
{0,1,1},{0,-1,1},{0,1,-1},{0,-1,-1}
}
local p = {151,160,137,91,90,15,
131,13,201,95,96,53,194,233,7,225,140,36,103,30,69,142,8,99,37,240,21,10,23,
190, 6,148,247,120,234,75,0,26,197,62,94,252,219,203,117,35,11,32,57,177,33,
88,237,149,56,87,174,20,125,136,171,168, 68,175,74,165,71,134,139,48,27,166,
77,146,158,231,83,111,229,122,60,211,133,230,220,105,92,41,55,46,245,40,244,
102,143,54, 65,25,63,161, 1,216,80,73,209,76,132,187,208, 89,18,169,200,196,
135,130,116,188,159,86,164,100,109,198,173,186, 3,64,52,217,226,250,124,123,
5,202,38,147,118,126,255,82,85,212,207,206,59,227,47,16,58,17,182,189,28,42,
223,183,170,213,119,248,152, 2,44,154,163, 70,221,153,101,155,167, 43,172,9,
129,22,39,253, 19,98,108,110,79,113,224,232,178,185, 112,104,218,246,97,228,
251,34,242,193,238,210,144,12,191,179,162,241, 81,51,145,235,249,14,239,107,
49,192,214, 31,181,199,106,157,184, 84,204,176,115,121,50,45,127, 4,150,254,
138,236,205,93,222,114,67,29,24,72,243,141,128,195,78,66,215,61,156,180}
local perm = {}
for i=0,511 do
perm[i+1] = p[bit_and(i, 255) + 1]
end
local function dot(g, ...)
local v = {...}
local sum = 0
for i=1,#v do
sum = sum + v[i] * g[i]
end
return sum
end
local noise = {}
function noise.produce(xin, yin)
local n0, n1, n2 -- Noise contributions from the three corners
-- Skew the input space to determine which simplex cell we're in
local F2 = 0.5*(math.sqrt(3.0)-1.0)
local s = (xin+yin)*F2; -- Hairy factor for 2D
local i = math.floor(xin+s)
local j = math.floor(yin+s)
local G2 = (3.0-math.sqrt(3.0))/6.0
local t = (i+j)*G2
local X0 = i-t -- Unskew the cell origin back to (x,y) space
local Y0 = j-t
local x0 = xin-X0 -- The x,y distances from the cell origin
local y0 = yin-Y0
-- For the 2D case, the simplex shape is an equilateral triangle.
-- Determine which simplex we are in.
local i1, j1 -- Offsets for second (middle) corner of simplex in (i,j) coords
if x0 > y0 then
i1 = 1
j1 = 0 -- lower triangle, XY order: (0,0)->(1,0)->(1,1)
else
i1 = 0
j1 = 1
end-- upper triangle, YX order: (0,0)->(0,1)->(1,1)
-- A step of (1,0) in (i,j) means a step of (1-c,-c) in (x,y), and
-- a step of (0,1) in (i,j) means a step of (-c,1-c) in (x,y), where
-- c = (3-sqrt(3))/6
local x1 = x0 - i1 + G2 -- Offsets for middle corner in (x,y) unskewed coords
local y1 = y0 - j1 + G2
local x2 = x0 - 1 + 2 * G2 -- Offsets for last corner in (x,y) unskewed coords
local y2 = y0 - 1 + 2 * G2
-- Work out the hashed gradient indices of the three simplex corners
local ii = bit_and(i, 255)
local jj = bit_and(j, 255)
local gi0 = perm[ii + perm[jj+1]+1] % 12
local gi1 = perm[ii + i1 + perm[jj + j1+1]+1] % 12
local gi2 = perm[ii + 1 + perm[jj + 1+1]+1] % 12
-- Calculate the contribution from the three corners
local t0 = 0.5 - x0 * x0 - y0 * y0
if t0 < 0 then
n0 = 0.0
else
t0 = t0 * t0
n0 = t0 * t0 * dot(grad3[gi0+1], x0, y0) -- (x,y) of grad3 used for 2D gradient
end
local t1 = 0.5 - x1 * x1 - y1 * y1
if t1 < 0 then
n1 = 0.0
else
t1 = t1 * t1
n1 = t1 * t1 * dot(grad3[gi1+1], x1, y1)
end
local t2 = 0.5 - x2 * x2 - y2 * y2
if t2 < 0 then
n2 = 0.0
else
t2 = t2 * t2
n2 = t2 * t2 * dot(grad3[gi2+1], x2, y2)
end
-- Add contributions from each corner to get the final noise value.
-- The result is scaled to return values in the interval [-1,1].
return 70.0 * (n0 + n1 + n2)
end
return noise
The Lua programming language version that Roblox uses, LuaU (or Luau), is actually open-source since November of 2021. You can find it here. The math library can be found in this file called lmathlib.cpp and it contains the math.noise function along with internal functions to calculate it, perlin (main function), grad, lerp, and fade. It's a quite complicated thing I can't explain myself, but I have converted it into Lua here.

bmp aspect ratio issue

I've been trying to understand how bmp files work so I can render some Mandelbrot set pictures and output them as bmp files since that seems to be one of the easiest methods but for some reason when I use an aspect ratio that isn't 1:1 even though its something to the power of 4 (so no padding is needed) I get weird artifacts like these 200:100 48:100 what I'm trying to do is turning an array of pixels that has white for even numbers and black for odd numbers into a bmp, this (100:100) is what it looks like with 1:1 aspect ratio.
I've tried reading through the wikipedia article to see if I can figure out what I'm doing wrong but I still don't get what I'm missing.
This is the script I've written in Lua so far:
ResolutionX = 100
ResolutionY = 100
local cos, atan, sin, atan2, sqrt, floor = math.cos, math.atan, math.sin, math.atan2, math.sqrt, math.floor
local insert, concat = table.insert, table.concat
local sub, char, rep = string.sub, string.char, string.rep
io.output("Test.bmp")
function Basen(n,b)
n = floor(n)
if not b or b == 10 then return tostring(n) end
local digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
local t = {}
repeat
local d = (n % b) + 1
n = floor(n / b)
insert(t, 1, digits:sub(d,d))
until n == 0
return rep("0",32-#t)..concat(t,"")
end
FileSize = Basen(ResolutionY*ResolutionX*3 + 54,2)
FileSize4 = tonumber(sub(FileSize,1,8),2) or 0
FileSize3 = tonumber(sub(FileSize,9,16),2) or 0
FileSize2 = tonumber(sub(FileSize,17,24),2) or 0
FileSize1 = tonumber(sub(FileSize,25,32),2) or 0
Width = Basen(ResolutionX,2)
print("Width: ",Width)
Width4 = tonumber(sub(Width,1,8),2) or 0
Width3 = tonumber(sub(Width,9,16),2) or 0
Width2 = tonumber(sub(Width,17,24),2) or 0
Width1 = tonumber(sub(Width,25,32),2) or 0
Height = Basen(ResolutionY,2)
print("Height: ",Height)
Height4 = tonumber(sub(Height,1,8),2) or 0
Height3 = tonumber(sub(Height,9,16),2) or 0
Height2 = tonumber(sub(Height,17,24),2) or 0
Height1 = tonumber(sub(Height,25,32),2) or 0
BMPSize = Basen(ResolutionY*ResolutionX*3,2)
BMPSize4 = tonumber(sub(BMPSize,1,8),2) or 0
BMPSize3 = tonumber(sub(BMPSize,9,16),2) or 0
BMPSize2 = tonumber(sub(BMPSize,17,24),2) or 0
BMPSize1 = tonumber(sub(BMPSize,25,32),2) or 0
print("TotalSize: ",FileSize1,FileSize2,FileSize3,FileSize4,"\nWidth: ",Width1,Width2,Width3,Width4,"\nHeight: ",Height1,Height2,Height3,Height4,"\nImage data: ",BMPSize1,BMPSize2,BMPSize3,BMPSize4)
Result = {"BM"..char( --File type
FileSize1,FileSize2,FileSize3,FileSize4,--File size
0,0,0,0, --Reserved
54,0,0,0, --Where the pixel data starts
40,0,0,0, --DIB header
Width1,Width2,Width3,Width4, --Width
Height1,Height2,Height3,Height4, --Height
1,0, --Color planes
24,00, --Bit depth
0,0,0,0, --Compression
BMPSize1,BMPSize2,BMPSize3,BMPSize4, --The amount of bytes pixel data will consume
Width1,Width2,Width3,Width4,
Height1,Height2,Height3,Height4,
0,0,0,0, --Number of colors in palatte
0,0,0,0
)}
for X = 0, ResolutionX - 1 do
for Y = 0, ResolutionY - 1 do
insert(Result,rep(char(255 * ((X + 1) % 2) * ((Y + 1) % 2)),3))
end
end
io.write(table.concat(Result))
Ok, here's a BMP version. I've put things in a module so it might be easier to use.
local writeBMP = {}
local floor = math.floor
local insert, concat = table.insert, table.concat
local sub, char, rep = string.sub, string.char, string.rep
local function Basen(n,b)
n = floor(n)
if not b or b == 10 then return tostring(n) end
local digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
local t = {}
repeat
local d = (n % b) + 1
n = floor(n / b)
insert(t, 1, digits:sub(d,d))
until n == 0
return rep("0",32-#t)..concat(t,"")
end
local function nextMul4(x)
if ( x % 4 == 0 ) then
return x
else
return x+4-(x%4)
end
end
local function clamp(x)
local y = x
if ( x > 255 ) then
y = 255
elseif ( x < 0 ) then
y = 0
end
return floor(y)
end
-- Accepts array of type pixelsXYC[X][Y][C] of numbers 0-255
-- C=1,2,3 are the red, green and blue channels respectively
-- X increases left to right, and Y increases top to bottom
function writeBMP.data(pixelsXYC, resolutionX, resolutionY)
local Pixels = pixelsXYC
local ResolutionX = resolutionX
local ResolutionY = resolutionY
assert(#Pixels == ResolutionX, "Table size and X resolution mismatch")
assert(#Pixels[1] == ResolutionY, "Table size and Y resolution mismatch")
local FileSize = Basen(ResolutionY*nextMul4(3*ResolutionX) + 54,2)
local FileSize4 = tonumber(sub(FileSize,1,8),2) or 0
local FileSize3 = tonumber(sub(FileSize,9,16),2) or 0
local FileSize2 = tonumber(sub(FileSize,17,24),2) or 0
local FileSize1 = tonumber(sub(FileSize,25,32),2) or 0
local Width = Basen(ResolutionX,2)
local Width4 = tonumber(sub(Width,1,8),2) or 0
local Width3 = tonumber(sub(Width,9,16),2) or 0
local Width2 = tonumber(sub(Width,17,24),2) or 0
local Width1 = tonumber(sub(Width,25,32),2) or 0
local Height = Basen(ResolutionY,2)
local Height4 = tonumber(sub(Height,1,8),2) or 0
local Height3 = tonumber(sub(Height,9,16),2) or 0
local Height2 = tonumber(sub(Height,17,24),2) or 0
local Height1 = tonumber(sub(Height,25,32),2) or 0
local BMPSize = Basen(ResolutionY*nextMul4(3*ResolutionX),2)
local BMPSize4 = tonumber(sub(BMPSize,1,8),2) or 0
local BMPSize3 = tonumber(sub(BMPSize,9,16),2) or 0
local BMPSize2 = tonumber(sub(BMPSize,17,24),2) or 0
local BMPSize1 = tonumber(sub(BMPSize,25,32),2) or 0
local Result = {}
Result[1] = "BM" .. char( --File type
FileSize1,FileSize2,FileSize3,FileSize4,--File size
0,0, --Reserved
0,0, --Reserved
54,0,0,0, --Where the pixel data starts
40,0,0,0, --DIB header
Width1,Width2,Width3,Width4, --Width
Height1,Height2,Height3,Height4, --Height
1,0, --Color planes
24,0, --Bit depth
0,0,0,0, --Compression
BMPSize1,BMPSize2,BMPSize3,BMPSize4, --The amount of bytes pixel data will consume
37,22,0,0, --Pixels per meter horizontal
37,22,0,0, --Pixels per meter vertical
0,0,0,0, --Number of colors in palatte
0,0,0,0
)
local Y = ResolutionY
while( Y >= 1 ) do
for X = 1, ResolutionX do
local r = clamp( Pixels[X][Y][1] )
local g = clamp( Pixels[X][Y][2] )
local b = clamp( Pixels[X][Y][3] )
Result[#Result+1] = char(b)
Result[#Result+1] = char(g)
Result[#Result+1] = char(r)
end
-- byte alignment
if ( ( (3*ResolutionX) % 4 ) ~= 0 ) then
local Padding = 4 - ((3*ResolutionX) % 4)
Result[#Result+1] = rep(char(0),Padding)
end
Y = Y - 1
end
return table.concat(Result)
end
function writeBMP.write(pixelsXYC, resolutionX, resolutionY, filename)
local file = io.open(filename,"wb")
local data = writeBMP.data(pixelsXYC, resolutionX, resolutionY)
file:write(data)
end
return writeBMP
A simple test:
-- writeBMP example
local writeBMP = require "writeBMP"
local resolutionX = 100
local resolutionY = 100
-- Pixel data
local pixels = {}
for x=1,resolutionX do
pixels[x] = {}
for y=1, resolutionY do
pixels[x][y] = {}
local red = 255*(resolutionX-x+resolutionY-y)/(resolutionX+resolutionY)
local green = 255*y/resolutionY
local blue = 255*x/resolutionX
pixels[x][y][1] = red
pixels[x][y][2] = green
pixels[x][y][3] = blue
end
end
writeBMP.write(pixels,resolutionX,resolutionY,"testwritebmp.bmp")
return
Note: In BMP, the Y axis starts on the bottom. I am more used to Y axis going from top down in computer graphics (so I wrote it that way).
Thanks HAX for the code.
Welcome to Stack Exchange :)
I suggest having a look at PPM files, they are easy. They can be converted with other tools to png or bmp with other tools.
Wikipedia PPM Specification
Here is a PPM solution:
ResolutionX = 48
ResolutionY = 100
local cos, atan, sin, atan2, sqrt, floor = math.cos, math.atan, math.sin, math.atan2, math.sqrt, math.floor
local insert, concat = table.insert, table.concat
local sub, char, rep = string.sub, string.char, string.rep
local file = io.open("Test.ppm","w")
-- PPM File headers
local Output = { }
Output[1] = "P3"
Output[2] = tostring(ResolutionX) .. " " .. tostring(ResolutionY)
Output[3] = "255"
-- Image body
for Y = 0, ResolutionY - 1 do
for X = 0, ResolutionX - 1 do
local value = 255 * ((X + 1) % 2) * ((Y + 1) % 2)
Output[#Output+1] = rep(tostring(floor(value)),3," ")
end
end
-- Join lines together
local Result = table.concat(Output,"\n")
file:write(Result)
Notes: I could not get your code to write to file (see my usage of file). The inner loop must be X (columns) and the outer loop must be Y (rows) if the write order is English reading order (left-right down-up).

What does the distance in machine learning signify?

In Neural networks, we regularly use the equation:
w1*x1 + w2*x2 + w3*x3 ...
We can interpret this as equation of a line with each x as a dimension. To make things more clearer, lets take an example of a simple perceptron network.
Imagine a single layer perceptron with 2 ip's/features (x1 & x2) and one output (y). (Sorry stackoverflow didn't allow me to post an additional image)
Let
R = w1*x1 + w2 * x2
y = 0 if R >= threshold
y = 1 if R < threshold
Scenario 1:
Threshold = 0
w1 = 2, w2 = -1
The line separating class 0 and 1 has the equation 2*x1 - x2 = 0
Suppose we get a test sample
P = (1,1)
R = 2*1 - 1 = 1 > 0
Sample P belongs to class 1
My questions is what is this R?
From the figure, its horizontal distance from the line.
Scenario 2:
Threshold = 0
w1 = 2, w2 = 1
The line separating class 0 and 1 has the equation 2*x1 + x2 = 0
P = (1,1)
R = 2*1 + 1 = 3 > 0
Sample P belongs to class 1
From the figure, its vertical distance from the line.
R is supposed to mean some form of distance from the classifying line. More the distance, more farther away from the line and we are more confident about the classification.
Just want to know what kind of distance from the line is R?

Adaptive Median Filter

I am constructing code for adaptive median filter . When i execute it it gives me error at line No 12. Not enough arguments. and on line 28.Unexpected MATLAB Expression.
function f = adpmedian(g, Smax)
%ADPMEDIAN Perform adaptive median filtering.
% F = ADPMEDIAN(G, SMAX) performs adaptive median filtering of
% image G. The median filter starts at size 3-by-3 and iterates up
% to size SMAX-by-SMAX. SMAX must be an odd integer greater than 1.
% SMAX must be an odd, positive integer greater than 1.
**12>>**if (Smax <= 1) || (Smax/2 == round(Smax/2)) || (Smax ~= round(Smax))
error('SMAX must be an odd integer > 1.')
end
[M, N] = size(g);
% Initial setup.
f = g;
f(:) = 0;
alreadyProcessed = false(size(g));
% Begin filtering.
for k = 3:2:Smax
zmin = ordfilt2(g, 1, ones(k, k), 'symmetric');
zmax = ordfilt2(g, k * k, ones(k, k), 'symmetric');
zmed = medfilt2(g, [k k], 'symmetric');
`28>>` processUsingLevelB = (zmed > zmin) & (zmax > zmed) & ...
~alreadyProcessed;
zB = (g > zmin) & (zmax > g);
outputZxy = processUsingLevelB & zB;
outputZmed = processUsingLevelB & ~zB;
f(outputZxy) = g(outputZxy);
f(outputZmed) = zmed(outputZmed);
alreadyProcessed = alreadyProcessed | processUsingLevelB;
if all(alreadyProcessed(:))
break;
end
end
% Output zmed for any remaining unprocessed pixels. Note that this
% zmed was computed using a window of size Smax-by-Smax, which is
% the final value of k in the loop.
f(~alreadyProcessed) = zmed(~alreadyProcessed);

Sobel filter kernel of large size

I am using a sobel filter of size 3x3 to calculate the image derivative. Looking at some articles on the internet, it seems that kernels for sobel filter for size 5x5 and 7x7 are also common, but I am not able to find their kernel values.
Could someone please let me know the kernel values for sobel filter of size 5x5 and 7x7? Also, if someone could share a method to generate the kernel values, that will be much useful.
Thanks in advance.
Complete solution for arbitrary Sobel kernel sizes and angles
tl;dr: skip down to section 'Examples'
To add another solution, expanding on this document (it's not particularly high quality, but it shows some usable graphics and matrices starting at the bottom of page 2).
Goal
What we're trying to do is estimate the local gradient of the image at position (x,y). The gradient is a vector made up of the components in x and y direction, gx and gy.
Now, imagine we want to approximate the gradient based on our pixel (x,y) and its neighbours as a kernel operation (3x3, 5x5, or whatever size).
Solution idea
We can approximate the gradient by summing over the projections of all neighbor-center pairs onto the gradient direction. (Sobel's kernel is just a particular method of weighting the different contributions, and so is Prewitt, basically).
Explicit intermediate steps for 3x3
This is the local image, central pixel (x,y) marked as 'o' (center)
a b c
d o f
g h i
Let's say we want the gradient in positive x direction. The unit vector in positive x-direction is (1,0) [I'll later use the convention that the positive y direction is DOWN, i.e. (0,1), and that (0,0) is top left of image).]
The vector from o to f ('of' for short) is (1,0). The gradient in direction 'of' is (f - o) / 1 (value of image at pixel here denoted f minus value at center o, divided by distance between those pixels). If we project the unit vector of that particular neighbor gradient onto our desired gradient direction (1,0) via a dot product we get 1. Here is a little table with the contributions of all neighbors, starting with the easier cases. Note that for diagonals, their distance is sqrt2, and the unit vectors in the diagonal directions are 1/sqrt2 * (+/-1, +/-1)
f: (f-o)/1 * 1
d: (d-o)/1 * -1 because (-1, 0) dot (1, 0) = -1
b: (b-o)/1 * 0 because (0, -1) dot (1, 0) = 0
h: (h-o)/1 * 0 (as per b)
a: (a-o)/sqrt2 * -1/sqrt2 distance is sqrt2, and 1/sqrt2*(-1,-1) dot (1,0) = -1/sqrt2
c: (c-o)/sqrt2 * +1/sqrt2 ...
g: (g-o)/sqrt2 * -1/sqrt2 ...
i: (i-o)/sqrt2 * +1/sqrt2 ...
edit for clarification:
There are two factors of 1/sqrt(2) for the following reason:
We are interested in the contribution to the gradient in a specific direction (here x), so we need to project the directional gradient from the center pixel to the neighbor pixel onto the direction we are interested in. This is accomplished by taking the scalar product of the unit vectors in the respective directions, which introduces the first factor 1/L (here 1/sqrt(2) for the diagonals).
The gradient measures the infinitesimal change at a point, which we approximate by finite differences. In terms of a linear equation, m = (y2-y1)/(x2-x1). For this reason, the value difference from the center pixel to the neighbor pixel (y2-y1) has to be distributed over their distance (corresponds to x2-x1) in order to get the ascent units per distance unit. This yields a second factor of 1/L (here 1/sqrt(2) for the diagonals)
Ok, now we know the contributions. Let's simplify this expression by combining opposing pairs of pixel contributions. I'll start with d and f:
{(f-o)/1 * 1} + {(d-o)/1 * -1}
= f - o - (d - o)
= f - d
Now the first diagonal:
{(c-o)/sqrt2 * 1/sqrt2} + {(g-o)/sqrt2 * -1/sqrt2}
= (c - o)/2 - (g - o)/2
= (c - g)/2
The second diagonal contributes (i - a)/2. The perpendicular direction contributes zero. Note that all contributions from the central pixel 'o' vanish.
We have now calculated the contributions of all closest neighbours to the gradient in positive x-direction at pixel (x,y), so our total approximation of the gradient in x-direction is simply their sum:
gx(x,y) = f - d + (c - g)/2 + (i - a)/2
We can obtain the same result by using a convolution kernel where the coefficients are written in the place of the corresponding neighbor pixel:
-1/2 0 1/2
-1 0 1
-1/2 0 1/2
If you don't want to deal with fractions, you multiply this by 2 and get the well-known Sobel 3x3 kernel.
-1 0 1
G_x = -2 0 2
-1 0 1
The multiplication by two only serves to get convenient integers. The scaling of your output image is basically arbitrary, most of the time you normalize it to your image range, anyway (to get clearly visible results).
By the same reasoning as above, you get the kernel for the vertical gradient gy by projecting the neighbor contributions onto the unit vector in positive y direction (0,1)
-1 -2 -1
G_y = 0 0 0
1 2 1
Formula for kernels of arbitrary size
If you want 5x5 or larger kernels, you only need to pay attention to the distances, e.g.
A B 2 B A
B C 1 C B
2 1 - 1 2
B C 1 C B
A B 2 B A
where
A = 2 * sqrt2
B = sqrt5
C = sqrt2.
If the length of the vector connecting any two pixels is L, the unit vector in that direction has a prefactor of 1/L. For this reason, the contributions of any pixel 'k' to (say) the x-gradient (1,0) can be simplified to "(value difference over squared distance) times (DotProduct of unnormalized direction vector 'ok' with gradient vector, e.g. (1,0) )"
gx_k = (k - o)/(pixel distance^2) ['ok' dot (1,0)].
Because the dot product of the connecting vector with the x unit vector selects the corresponding vector entry, the corresponding G_x kernel entry at position k is just
i / (i*i + j*j)
where i and j are the number of steps from the center pixel to the pixel k in x and y direction. In the above 3x3 calculation, the pixel 'a' would have i = -1 (1 to the left), j = -1 (1 to the top) and hence the 'a' kernel entry is -1 / (1 + 1) = -1/2.
The entries for the G_y kernel are
j/(i*i + j*j).
If I want integer values for my kernel, I follow these steps:
check the available range of the output image
compute highest possible result from applying floating point kernel (i.e. assume max input value under all positive kernel entries, so output value is (sum over all positive kernel values) * (max possible input image value). If you have signed input, you need to consider the negative values as well. Worst case is then the sum of all positive values + sum of all abs values of negative entries (if max input under positives, -max input under negatives). edit: the sum of all abs values has also been aptly called the weight of the kernel
calculate maximum allowed up-scaling for kernel (without overflowing range of output image)
for all integer multiples (from 2 to above maximum) of floating point kernel: check which has the lowest sum of absolute round-off errors and use this kernel
So in summary:
Gx_ij = i / (i*i + j*j)
Gy_ij = j / (i*i + j*j)
where i,j is position in the kernel counted from the center. Scale kernel entries as needed to obtain integer numbers (or at least close approximations).
These formulae hold for all kernel sizes.
Examples
-2/8 -1/5 0 1/5 2/8 -5 -4 0 4 5
-2/5 -1/2 0 1/2 2/5 -8 -10 0 10 8
G_x (5x5) -2/4 -1/1 0 1/1 2/4 (*20) = -10 -20 0 20 10
-2/5 -1/2 0 1/2 2/5 -8 -10 0 10 8
-2/8 -1/5 0 1/5 2/8 -5 -4 0 4 5
Note that the central 3x3 pixels of the 5x5 kernel in float notation are just the 3x3 kernel, i.e. larger kernels represent a continued approximation with additional but lower-weighted data. This continues on to larger kernel sizes:
-3/18 -2/13 -1/10 0 1/10 2/13 3/18
-3/13 -2/8 -1/5 0 1/5 2/8 3/13
-3/10 -2/5 -1/2 0 1/2 2/5 3/10
G_x (7x7) -3/9 -2/4 -1/1 0 1/1 2/4 3/9
-3/10 -2/5 -1/2 0 1/2 2/5 3/10
-3/13 -2/8 -1/5 0 1/5 2/8 3/13
-3/18 -2/13 -1/10 0 1/10 2/13 3/18
Exact integer representations become impractical at this point.
As far as I can tell (don't have access to the original paper), the "Sobel" part to this is properly weighting the contributions. The Prewitt solution can be obtained by leaving out the distance weighting and just entering i and j in the kernel as appropriate.
Bonus: Sobel Kernels for arbitrary directions
So we can approximate the x and y components of the image gradient (which is actually a vector, as stated in the very beginning). The gradient in any arbitrary direction alpha (measured mathematically positive, in this case clockwise since positive y is downward) can be obtained by projecting the gradient vector onto the alpha-gradient unit vector.
The alpha-unit vector is (cos alpha, sin alpha). For alpha = 0° you can obtain the result for gx, for alpha = 90° you get gy.
g_alpha = (alpha-unit vector) dot (gx, gy)
= (cos a, sin a) dot (gx, gy)
= cos a * gx + sin a * gy
If you bother to write down gx and gy as sums of neighbor contributions, you realize that you can group the resulting long expression by terms that apply to the same neighbor pixel, and then rewrite this as a single convolution kernel with entries
G_alpha_ij = (i * cos a + j * sin a)/(i*i + j*j)
If you want the closest integer approximation, follow the steps outlined above.
Other sources seem to give different definitions of the larger kernels. The Intel IPP library, for example, gives the 5x5 kernel as
1 2 0 -2 -1
4 8 0 -8 -4
6 12 0 -12 -6
4 8 0 -8 -4
1 2 0 -2 -1
Intuitively, this makes more sense to me because you're paying more attention to the elements closer to the centre. It also has a natural definition in terms of the 3x3 kernel which is easy to extend to generate larger kernels. That said, in my brief search I've found 3 different definitions of the 5x5 kernel - so I suspect that (as Paul says) the larger kernels are ad hoc, and so this is by no means the definitive answer.
The 3x3 kernel is the outer product of a smoothing kernel and a gradient kernel, in Matlab this is something like
sob3x3 = [ 1 2 1 ]' * [1 0 -1]
the larger kernels can be defined by convolving the 3x3 kernel with another smoothing kernel
sob5x5 = conv2( [ 1 2 1 ]' * [1 2 1], sob3x3 )
you can repeat the process to get progressively larger kernels
sob7x7 = conv2( [ 1 2 1 ]' * [1 2 1], sob5x5 )
sob9x9 = conv2( [ 1 2 1 ]' * [1 2 1], sob7x7 )
...
there are a lot of other ways of writing it, but I think this explains exactly what is happening best. Basically, you start off with a smoothing kernel in one direction and a finite differences estimate of the derivative in the other and then just apply smoothing until you get the kernel size you want.
Because it's just a series of convolutions, all the nice properties hold, (commutativity, associativity and so forth) which might be useful for your implementation. For example, you can trivially separate the 5x5 kernel into its smoothing and derivative components:
sob5x5 = conv([1 2 1],[1 2 1])' * conv([1 2 1],[-1 0 1])
Note that in order to be a "proper" derivative estimator, the 3x3 Sobel should be scaled by a factor of 1/8:
sob3x3 = 1/8 * [ 1 2 1 ]' * [1 0 -1]
and each larger kernel needs to be scaled by an additional factor of 1/16 (because the smoothing kernels are not normalised):
sob5x5 = 1/16 * conv2( [ 1 2 1 ]' * [1 2 1], sob3x3 )
sob7x7 = 1/16 * conv2( [ 1 2 1 ]' * [1 2 1], sob5x5 )
...
UPDATE 23-Apr-2018: it seems that the kernels defined in the link below are not true Sobel kernels (for 5x5 and above) - they may do a reasonable job of edge detection, but they should not be called Sobel kernels. See Daniel’s answer for a more accurate and comprehensive summary. (I will leave this answer here as (a) it is linked to from various places and (b) accepted answers can not easily be deleted.)
Google seems to turn up plenty of results, e.g.
http://rsbweb.nih.gov/nih-image/download/user-macros/slowsobel.macro suggests the following kernels for 3x3, 5x5, 7x7 and 9x9:
3x3:
1 0 -1
2 0 -2
1 0 -1
5x5:
2 1 0 -1 -2
3 2 0 -2 -3
4 3 0 -3 -4
3 2 0 -2 -3
2 1 0 -1 -2
7x7:
3 2 1 0 -1 -2 -3
4 3 2 0 -2 -3 -4
5 4 3 0 -3 -4 -5
6 5 4 0 -4 -5 -6
5 4 3 0 -3 -4 -5
4 3 2 0 -2 -3 -4
3 2 1 0 -1 -2 -3
9x9:
4 3 2 1 0 -1 -2 -3 -4
5 4 3 2 0 -2 -3 -4 -5
6 5 4 3 0 -3 -4 -5 -6
7 6 5 4 0 -4 -5 -6 -7
8 7 6 5 0 -5 -6 -7 -8
7 6 5 4 0 -4 -5 -6 -7
6 5 4 3 0 -3 -4 -5 -6
5 4 3 2 0 -2 -3 -4 -5
4 3 2 1 0 -1 -2 -3 -4
Here is a simple solution made with python 3 using numpy and the #Daniel answer.
def custom_sobel(shape, axis):
"""
shape must be odd: eg. (5,5)
axis is the direction, with 0 to positive x and 1 to positive y
"""
k = np.zeros(shape)
p = [(j,i) for j in range(shape[0])
for i in range(shape[1])
if not (i == (shape[1] -1)/2. and j == (shape[0] -1)/2.)]
for j, i in p:
j_ = int(j - (shape[0] -1)/2.)
i_ = int(i - (shape[1] -1)/2.)
k[j,i] = (i_ if axis==0 else j_)/float(i_*i_ + j_*j_)
return k
It returns the kernel (5,5) like this:
Sobel x:
[[-0.25 -0.2 0. 0.2 0.25]
[-0.4 -0.5 0. 0.5 0.4 ]
[-0.5 -1. 0. 1. 0.5 ]
[-0.4 -0.5 0. 0.5 0.4 ]
[-0.25 -0.2 0. 0.2 0.25]]
Sobel y:
[[-0.25 -0.4 -0.5 -0.4 -0.25]
[-0.2 -0.5 -1. -0.5 -0.2 ]
[ 0. 0. 0. 0. 0. ]
[ 0.2 0.5 1. 0.5 0.2 ]
[ 0.25 0.4 0.5 0.4 0.25]]
If anyone know a better way to do that in python, please let me know. I'm a newbie yet ;)
Sobel gradient filter generator
(This answer refers to the analysis given by #Daniel, above.)
Gx[i,j] = i / (i*i + j*j)
Gy[i,j] = j / (i*i + j*j)
This is an important result, and a better explanation than can be found in the original paper. It should be written up in Wikipedia, or somewhere, because it also seems superior to any other discussion of the issue available on the internet.
However, it is not actually true that integer-valued representations are impractical for filters of size greater than 5*5, as claimed. Using 64-bit integers, Sobel filter sizes up to 15*15 can be exactly expressed.
Here are the first four; the result should be divided by the "weight", so that the gradient of an image region such as the following, is normalized to a value of 1.
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
Gx(3) :
-1/2 0/1 1/2 -1 0 1
-1/1 0 1/1 * 2 = -2 0 2
-1/2 0/1 1/2 -1 0 1
weight = 4 weight = 8
Gx(5) :
-2/8 -1/5 0/4 1/5 2/8 -5 -4 0 4 5
-2/5 -1/2 0/1 1/2 2/5 -8 -10 0 10 8
-2/4 -1/1 0 1/1 2/4 * 20 = -10 -20 0 20 10
-2/5 -1/2 0/1 1/2 2/5 -8 -10 0 10 8
-2/8 -1/5 0/4 1/5 2/8 -5 -4 0 4 5
weight = 12 weight = 240
Gx(7) :
-3/18 -2/13 -1/10 0/9 1/10 2/13 3/18 -130 -120 -78 0 78 120 130
-3/13 -2/8 -1/5 0/4 1/5 2/8 3/13 -180 -195 -156 0 156 195 180
-3/10 -2/5 -1/2 0/1 1/2 2/5 3/10 -234 -312 -390 0 390 312 234
-3/9 -2/4 -1/1 0 1/1 2/4 3/9 * 780 = -260 -390 -780 0 780 390 260
-3/10 -2/5 -1/2 0/1 1/2 2/5 3/10 -234 -312 -390 0 390 312 234
-3/13 -2/8 -1/5 0/4 1/5 2/8 3/13 -180 -195 -156 0 156 195 180
-3/18 -2/13 -1/10 0/9 1/10 2/13 3/18 -130 -120 -78 0 78 120 130
weight = 24 weight = 18720
Gx(9) :
-4/32 -3/25 -2/20 -1/17 0/16 1/17 2/20 3/25 4/32 -16575 -15912 -13260 -7800 0 7800 13260 15912 16575
-4/25 -3/18 -2/13 -1/10 0/9 1/10 2/13 3/18 4/25 -21216 -22100 -20400 -13260 0 13260 20400 22100 21216
-4/20 -3/13 -2/8 -1/5 0/4 1/5 2/8 3/13 4/20 -26520 -30600 -33150 -26520 0 26520 33150 30600 26520
-4/17 -3/10 -2/5 -1/2 0/1 1/2 2/5 3/10 4/17 -31200 -39780 -53040 -66300 0 66300 53040 39780 31200
-4/16 -3/9 -2/4 -1/1 0 1/1 2/4 3/9 4/16 * 132600 = -33150 -44200 -66300 -132600 0 132600 66300 44200 33150
-4/17 -3/10 -2/5 -1/2 0/1 1/2 2/5 3/10 4/17 -31200 -39780 -53040 -66300 0 66300 53040 39780 31200
-4/20 -3/13 -2/8 -1/5 0/4 1/5 2/8 3/13 4/20 -26520 -30600 -33150 -26520 0 26520 33150 30600 26520
-4/25 -3/18 -2/13 -1/10 0/9 1/10 2/13 3/18 4/25 -21216 -22100 -20400 -13260 0 13260 20400 22100 21216
-4/32 -3/25 -2/20 -1/17 0/16 1/17 2/20 3/25 4/32 -16575 -15912 -13260 -7800 0 7800 13260 15912 16575
weight = 40 weight = 5304000
The Ruby program appended below, will calculate Sobel filters and corresponding weights of any size, although the integer-valued filters are not likely to be useful for sizes larger than 15*15.
#!/usr/bin/ruby
# Sobel image gradient filter generator
# by <ian_bruce#mail.ru> -- Sept 2017
# reference:
# https://stackoverflow.com/questions/9567882/sobel-filter-kernel-of-large-size
if (s = ARGV[0].to_i) < 3 || (s % 2) == 0
$stderr.puts "invalid size"
exit false
end
s /= 2
n = 1
# find least-common-multiple of all fractional denominators
(0..s).each { |j|
(1..s).each { |i|
d = i*i + j*j
n = n.lcm(d / d.gcd(i))
}
}
fw1 = format("%d/%d", s, 2*s*s).size + 2
fw2 = format("%d", n).size + 2
weight = 0
s1 = ""
s2 = ""
(-s..s).each { |y|
(-s..s).each { |x|
i, j = x, y # "i, j = y, x" for transpose
d = i*i + j*j
if (i != 0)
if (n * i % d) != 0 # this should never happen
$stderr.puts "inexact division: #{n} * #{i} / ((#{i})^2 + (#{j})^2)"
exit false
end
w = n * i / d
weight += i * w
else
w = 0
end
s1 += "%*s" % [fw1, d > 0 ? "%d/%d" % [i, d] : "0"]
s2 += "%*d" % [fw2, w]
}
s1 += "\n" ; s2 += "\n"
}
f = n.gcd(weight)
puts s1
puts "\nweight = %d%s" % [weight/f, f < n ? "/%d" % (n/f) : ""]
puts "\n* #{n} =\n\n"
puts s2
puts "\nweight = #{weight}"
TL;DR: Use a Gaussian derivative operator instead.
As Adam Bowen explained in his answer, the Sobel kernel is a combination of a smoothing along one axis, and a central difference derivative along the other axis:
sob3x3 = [1 2 1]' * [1 0 -1]
The smoothing adds regularization (reduces sensitivity to noise).
(I'm leaving out all factors 1/8 in this post, as did Sobel himself, meaning that the operator determines the derivative up to scaling. Also, * always means convolution in this post.)
Let's generalize this:
deriv_kernel = smoothing_kernel * d/dx
One of the properties of the convolution is that
d/dx f = d/dx * f
That is, convolving an image with the elemental derivative operator yields the derivative of the image. Noting also that the convolution is commutative,
deriv_kernel = d/dx * smoothing_kernel = d/dx smoothing_kernel
That is, the derivative kernel is the derivative of a smoothing kernel.
Note that applying such a kernel to an image by convolution:
image * deriv_kernel = image * smoothing_kernel * d/dx = d/dx (image * smoothing_kernel)
That is, with this generalized, idealized derivative kernel we can compute the true derivative of the smoothed image. This is of course not the case with the Sobel kernel, as it uses a central difference approximation to the derivative.
But choosing a better smoothing_kernel, this can be achieved. The Gaussian kernel is the ideal option here, as it offers the best compromise between compactness in the spatial domain (small kernel footprint) with compactness in the frequency domain (good smoothing). Furthermore, the Gaussian is perfectly isotropic and separable. Using a Gaussian derivative kernel yields the best possible regularized derivative operator.
Thus, if you are looking for a larger Sobel operator, because you need more regularization, use a Gaussian derivative operator instead.
Let's analyze the Sobel kernel a little bit more.
The smoothing kernel is triangular, with samples [1 2 1]. This is a triangular function, which, sampled, leads to those three values:
2 + x , if -2 < x < 0
h = { 2 , if x = 0
2 - x , if 0 < x < 2
Its derivative is:
1 , if -2 < x < 0
d/dx h = { 0 , if x = 0 (not really, but it's the sensible solution)
-1 , if 0 < x < 2
So, we can see that the central difference derivative approximation can be seen as a sampling of the analytical derivative of the same triangular function used for smoothing. Thus we have:
sob3x3 = [1 2 1]' * d/dx [1 2 1] = d/dx ( [1 2 1]' * [1 2 1] )
So, if you want to make this kernel larger, simply enlarge the smoothing kernel:
sob5x5 = d/dx ( [1 2 3 2 1]' * [1 2 3 2 1] ) = [1 2 3 2 1]' * [1 1 0 -1 -1]
sob7x7 = d/dx ( [1 2 3 4 3 2 1]' * [1 2 3 4 3 2 1] ) = [1 2 3 4 3 2 1]' * [1 1 1 0 -1 -1 -1]
This is quite different from the advice given by Adam Bowen, who suggests convolving the kernel with the 3-tab triangular kernel along each dimension: [1 2 1] * [1 2 1] = [1 4 6 4 1], and [1 2 1] * [1 0 -1] = [1 2 0 -2 -1]. Note that, due to the central limit theorem, convolving this triangular kernel with itself leads to a filter that approximates the Gaussian a little bit more. The larger the kernel we create by repeated convolutions with itself, the more we approximate this Gaussian. So, instead of using this method, you might as well directly sample the Gaussian function.
Daniel has a long post in which he suggests extending the Sobel kernel in yet another way. The shape of the smoothing kernel here diverges from the Gaussian approximation, I have not tried to study its properties.
Note that none of these three possible extensions of the Sobel kernel are actually Sobel kernels, since the Sobel kernel is explicitly a 3x3 kernel (see an historical note by Sobel about his operator, which he never actually published).
Note also that I'm not advocating the extended Sobel kernel derived here. Use Gaussian derivatives!
I quickly hacked an algorithm to generate a Sobel kernel of any odd size > 1, based on the examples given by #Paul R:
public static void CreateSobelKernel(int n, ref float[][] Kx, ref float[][] Ky)
{
int side = n * 2 + 3;
int halfSide = side / 2;
for (int i = 0; i < side; i++)
{
int k = (i <= halfSide) ? (halfSide + i) : (side + halfSide - i - 1);
for (int j = 0; j < side; j++)
{
if (j < halfSide)
Kx[i][j] = Ky[j][i] = j - k;
else if (j > halfSide)
Kx[i][j] = Ky[j][i] = k - (side - j - 1);
else
Kx[i][j] = Ky[j][i] = 0;
}
}
}
Hope it helps.
Thanks for all, I will try second variant by #Adam Bowen, take C# code for Sobel5x5, 7x7, 9x9... matrix generaion for this variant (maybe with bugs, if you find bug or can optimize code - write it there):
static void Main(string[] args)
{
float[,] Sobel3x3 = new float[,] {
{-1, 0, 1},
{-2, 0, 2},
{-1, 0, 1}};
float[,] Sobel5x5 = Conv2DforSobelOperator(Sobel3x3);
float[,] Sobel7x7 = Conv2DforSobelOperator(Sobel5x5);
Console.ReadKey();
}
public static float[,] Conv2DforSobelOperator(float[,] Kernel)
{
if (Kernel == null)
throw new Exception("Kernel = null");
if (Kernel.GetLength(0) != Kernel.GetLength(1))
throw new Exception("Kernel matrix must be Square matrix!");
float[,] BaseMatrix = new float[,] {
{1, 2, 1},
{2, 4, 2},
{1, 2, 1}};
int KernelSize = Kernel.GetLength(0);
int HalfKernelSize = KernelSize / 2;
int OutSize = KernelSize + 2;
if ((KernelSize & 1) == 0) // Kernel_Size must be: 3, 5, 7, 9 ...
throw new Exception("Kernel size must be odd (3x3, 5x5, 7x7...)");
float[,] Out = new float[OutSize, OutSize];
float[,] InMatrix = new float[OutSize, OutSize];
for (int x = 0; x < BaseMatrix.GetLength(0); x++)
for (int y = 0; y < BaseMatrix.GetLength(1); y++)
InMatrix[HalfKernelSize + x, HalfKernelSize + y] = BaseMatrix[x, y];
for (int x = 0; x < OutSize; x++)
for (int y = 0; y < OutSize; y++)
for (int Kx = 0; Kx < KernelSize; Kx++)
for (int Ky = 0; Ky < KernelSize; Ky++)
{
int X = x + Kx - HalfKernelSize;
int Y = y + Ky - HalfKernelSize;
if (X >= 0 && Y >= 0 && X < OutSize && Y < OutSize)
Out[x, y] += InMatrix[X, Y] * Kernel[KernelSize - 1 - Kx, KernelSize - 1 - Ky];
}
return Out;
}
Results (NormalMap) or it copy there, where this metod - №2, #Paul R metod - №1. Now I am using last, becouse it give more smooth result and it's easy to generate kernels with this code.
Matlab implementation of Daniel's answer:
kernel_width = 9;
halfway = floor(kernel_width/2);
step = -halfway : halfway;
i_matrix = repmat(step,[kernel_width 1]);
j_matrix = i_matrix';
gx = i_matrix ./ ( i_matrix.*i_matrix + j_matrix.*j_matrix );
gx(halfway+1,halfway+1) = 0; % deals with NaN in middle
gy = gx';
I made a Python NumPy implementation of Daniel's answer. It seems to be about 3x faster than Joao Ponte's implementation.
def calc_sobel_kernel(target_shape: tuple[int, int]):
assert target_shape[0] % 2 != 0
assert target_shape[1] % 2 != 0
gx = np.zeros(target_shape, dtype=np.float32)
gy = np.zeros(target_shape, dtype=np.float32)
indices = np.indices(target_shape, dtype=np.float32)
cols = indices[0] - target_shape[0] // 2
rows = indices[1] - target_shape[1] // 2
squared = cols ** 2 + rows ** 2
np.divide(cols, squared, out=gy, where=squared!=0)
np.divide(rows, squared, out=gx, where=squared!=0)
return gx, gy

Resources