Torch / Lua, how to pair together two arrays into a table? - lua

I need to use the Pearson correlation coefficient in my Torch / Lua program.
This is the function:
function math.pearson(a)
  -- compute the mean
  local x1, y1 = 0, 0
  for _, v in pairs(a) do
    x1, y1 = x1 + v[1], y1 + v[2]
    print('x1 '..x1);
    print('y1 '..y1);
  end
  -- compute the coefficient
  x1, y1 = x1 / #a, y1 / #a
  local x2, y2, xy = 0, 0, 0
  for _, v in pairs(a) do
    local tx, ty = v[1] - x1, v[2] - y1
    xy, x2, y2 = xy + tx * ty, x2 + tx * tx, y2 + ty * ty
  end
  return xy / math.sqrt(x2) / math.sqrt(y2)
end
This function expects a single input table of (x, y) pairs that it can iterate with pairs().
I tried to build the right input table for it, but I could not get anything working.
I tried with:
z = {}
a = {27, 29, 45, 98, 1293, 0.1}
b = {0.0001, 0.001, 0.32132, 0.0001, 0.0009}
z = {a,b}
Unfortunately, it does not work: it only ever looks at the first couple of elements of a and b, while I want it to compute the correlation between a and b as whole vectors.
How can I do this?
Can you provide an example input object that works with the math.pearson() function?

If you are using this implementation, then I think it expects a different parameter structure. You should be passing a table of two-value tables, as in:
local z = {
{27, 0.0001}, {29, 0.001}, {45, 0.32132}, {98, 0.0001}, {1293, 0.0009}
}
print(math.pearson(z))
This prints -0.25304101592759 for me.

How to convert bounding box (x1, y1, x2, y2) to YOLO Style (X, Y, W, H)

I'm training a YOLO model, and I have the bounding boxes in this format:
x1, y1, x2, y2 => e.g. (100, 100, 200, 200)
I need to convert them to YOLO format, to be something like:
X, Y, W, H => 0.436262 0.474010 0.383663 0.178218
I have already calculated the center point X, Y, the height H, and the width W.
But I still need a way to convert them to the floating-point numbers shown above.
For those looking for the reverse of the question (YOLO format to normal bbox format):
def yolobbox2bbox(x, y, w, h):
    x1, y1 = x - w/2, y - h/2
    x2, y2 = x + w/2, y + h/2
    return x1, y1, x2, y2
Here's a code snippet in Python to convert x, y coordinates to YOLO format:
from PIL import Image

def convert(size, box):
    # size is (width, height); box is (xmin, xmax, ymin, ymax)
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x, y, w, h)

im = Image.open(img_path)
w = int(im.size[0])
h = int(im.size[1])
print(xmin, xmax, ymin, ymax)  # define your x,y coordinates
b = (xmin, xmax, ymin, ymax)
bb = convert((w, h), b)
Check my sample program to convert from LabelMe annotation tool format to Yolo format https://github.com/ivder/LabelMeYoloConverter
There is a more straightforward way to do this with pybboxes. Install it with
pip install pybboxes
and use it as below:
import pybboxes as pbx
voc_bbox = (100, 100, 200, 200)
W, H = 1000, 1000 # WxH of the image
pbx.convert_bbox(voc_bbox, from_type="voc", to_type="yolo", image_size=(W,H))
>>> (0.15, 0.15, 0.1, 0.1)
Note that converting to YOLO format requires the image width and height for scaling.
YOLO normalises the image space to run from 0 to 1 in both the x and y directions. To convert between your (x, y) coordinates and YOLO (u, v) coordinates you need to transform your data as u = x / XMAX and v = y / YMAX, where XMAX and YMAX are the maximum coordinates of the image array you are using.
This all depends on the image arrays being oriented the same way.
Here is a C function to perform the conversion
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <math.h>

struct yolo {
    float u;
    float v;
};

struct yolo
convert (unsigned int x, unsigned int y, unsigned int XMAX, unsigned int YMAX)
{
    struct yolo point;
    if (XMAX && YMAX && (x <= XMAX) && (y <= YMAX))
    {
        point.u = (float)x / (float)XMAX;
        point.v = (float)y / (float)YMAX;
    }
    else
    {
        point.u = INFINITY;
        point.v = INFINITY;
        errno = ERANGE;
    }
    return point;
} /* convert */

int main()
{
    struct yolo P;
    P = convert (99, 201, 255, 324);
    printf ("Yolo coordinate = <%f, %f>\n", P.u, P.v);
    exit (EXIT_SUCCESS);
} /* main */
There are two potential solutions. First of all, you have to work out whether your input bounding box is in Coco or Pascal_VOC format; otherwise you can't do the right math.
Here is the formatting:
Coco format: [x_min, y_min, width, height]
Pascal_VOC format: [x_min, y_min, x_max, y_max]
Here is some Python code showing how you can do the conversion:
Converting Coco to Yolo
# Convert Coco bb to Yolo
def coco_to_yolo(x1, y1, w, h, image_w, image_h):
    return [((2*x1 + w)/(2*image_w)), ((2*y1 + h)/(2*image_h)), w/image_w, h/image_h]
Converting Pascal_voc to Yolo
# Convert Pascal_Voc bb to Yolo
def pascal_voc_to_yolo(x1, y1, x2, y2, image_w, image_h):
    return [((x2 + x1)/(2*image_w)), ((y2 + y1)/(2*image_h)), (x2 - x1)/image_w, (y2 - y1)/image_h]
If you need additional conversions, you can check my article on Medium: https://christianbernecker.medium.com/convert-bounding-boxes-from-coco-to-pascal-voc-to-yolo-and-back-660dc6178742
For YOLO format to x1, y1, x2, y2 format (here dw and dh are the image width and height in pixels):
def yolobbox2bbox(x, y, w, h, dw, dh):
    x1 = int((x - w / 2) * dw)
    x2 = int((x + w / 2) * dw)
    y1 = int((y - h / 2) * dh)
    y2 = int((y + h / 2) * dh)
    if x1 < 0:
        x1 = 0
    if x2 > dw - 1:
        x2 = dw - 1
    if y1 < 0:
        y1 = 0
    if y2 > dh - 1:
        y2 = dh - 1
    return x1, y1, x2, y2
There are two things you need to do:
Divide the coordinates by the image size to normalize them to [0..1] range.
Convert (x1, y1, x2, y2) coordinates to (center_x, center_y, width, height).
If you're using PyTorch, Torchvision provides a function that you can use for the conversion:
from torch import tensor
from torchvision.ops import box_convert
image_size = tensor([608, 608])
boxes = tensor([[100, 100, 200, 200], [300, 300, 400, 400]], dtype=float)
boxes[:, :2] /= image_size
boxes[:, 2:] /= image_size
boxes = box_convert(boxes, "xyxy", "cxcywh")
Just reading the answers, I was also looking for this, but I found the following more informative for understanding what is happening at the back end.
From here: Source
Assuming x/ymin and x/ymax are your bounding corners, top left and bottom right respectively. Then:
x = xmin
y = ymin
w = xmax - xmin
h = ymax - ymin
You then need to normalize these, which means giving them as a proportion of the whole image, so simply divide each value by its respective size from the values above:
x = xmin / width
y = ymin / height
w = (xmax - xmin) / width
h = (ymax - ymin) / height
This assumes a top-left origin; you will have to apply a shift factor if that is not the case.
So that is the answer.
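If it helps, here is a minimal Python sketch of the formulas above (the variable names xmin, ymin, xmax, ymax, img_w and img_h are my own, and a top-left origin is assumed as noted):

def normalize_box(xmin, ymin, xmax, ymax, img_w, img_h):
    # Normalized top-left corner and box size, per the formulas above
    x = xmin / img_w
    y = ymin / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return x, y, w, h

Note that this keeps the top-left corner as (x, y); the other answers above (e.g. pascal_voc_to_yolo) store the box centre instead, so add w/2 and h/2 if that is what you need.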

Going back to original image from image edges

Suppose we read an image X and apply the simplest edge detector F = [1 0 -1] to it to obtain Y. Is it possible to go back and retrieve X from Y? Given that Y[n] = X[n-1] - X[n+1], can you express X in terms of Y? Can we design a 3x3 filter G that performs the opposite of F?
Assuming that X[-1] and X[0] are known (say both 0),
X[1] = Y[0] + X[-1]
X[2] = Y[1] + X[0]
X[3] = Y[2] + X[1] = Y[2] + Y[0] + X[-1]
X[4] = Y[3] + X[2] = Y[3] + Y[1] + X[0]
X[5] = Y[4] + X[3] = Y[4] + Y[2] + Y[0] + X[-1]
X[6] = Y[5] + X[4] = Y[5] + Y[3] + Y[1] + X[0]
...
This is a recursive filter, which computes the prefix sum of every other pixel.
This will only work if you have the signed values of Y. If only the absolute value is available, you are stuck.
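A small NumPy sketch may make the recursion concrete. The function name and the seed values are my own assumptions; it follows the recursion exactly as written above (X[n] = Y[n-1] + X[n-2] with X[-1] = X[0] = 0), so flip the sign of Y if your filter is defined with the opposite sign convention:

import numpy as np

def reconstruct_x(y, x_m1=0.0, x_0=0.0):
    # Recover X[1..N] from Y[0..N-1] via X[n] = Y[n-1] + X[n-2],
    # i.e. a prefix sum over every other sample; x_m1 and x_0 are
    # the assumed known seed values X[-1] and X[0].
    x = np.empty(len(y))
    for n in range(len(y)):  # x[n] holds X[n+1]
        x_prev = x_m1 if n == 0 else x_0 if n == 1 else x[n - 2]
        x[n] = y[n] + x_prev
    return x

# Quick check: forward-filter a known X with Y[n] = X[n+1] - X[n-1], then invert
X = np.array([0.0, 0.0, 1.0, 2.0, 3.0, 5.0, 8.0])  # X[-1], X[0], X[1], ...
Y = X[2:] - X[:-2]
print(reconstruct_x(Y))  # [1. 2. 3. 5. 8.]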

Correlation (offset detection) issues - Signal power concentrated at edge of domain

I'm in a bit of a bind: I am in too deep to quickly apply another technique, so here goes nothing...
I'm doing line tracking by correlating each row of a matrix with the row below it and taking the max of the correlation to compute the offset. It works extremely well EXCEPT when the signals are up against the edge of the domain, in which case it simply gives 0. I suspect this is because it is advantageous to simply add in place rather than shift zeros in at the edge. Here are some example signals that cause the issue. These signals aren't zero-mean, but they are when I correlate (I subtract the mean). I get the correct offset for the third image, but not for the first two.
Here is my correlation code
x0 -= mean(x0)
x1 -= mean(x1)
x0 /= max(x0)
x1 /= max(x1)
c = signal.correlate(x1, x0, mode='full')
m = interp_peak_offset(c)
foffset =(m - len(x0) + 1) * (f[2] - f[1])
I have tried clipping one of the signals by 20 samples on each side, correlating the gradient of the signal, and some other wonky methods with no success...
Any help is greatly appreciated! Thanks so much!
Instead of looking for the maximum amplitude, you should look for the phase difference.
This can be achieved using the PHAT (Phase Transform) method:
import numpy as np
from numpy.fft import ifft
from scipy.signal import csd

def PHAT(x, y, fs, nperseg=50):
    f, pxy = csd(x, y, fs=1.0, nperseg=nperseg, return_onesided=False)
    pxy_phase = np.divide(pxy, np.abs(pxy))
    gcc_fun = np.real(ifft(pxy_phase))  # generalized cross correlation
    TDOA = np.argmax(gcc_fun) / float(fs)
    return TDOA
I ended up minimizing the average absolute difference between the two vectors. For each time shift, I computed the absolute difference/number of points of overlap. Here is my function that does so
from numpy import zeros, arange, pad, roll, logical_and

def offset_using_diff(x0, x1, f):
    # Finds the offset of x0 from x1 such that x0(f) ~ x1(f - foffset). Does so by
    # minimizing the average absolute difference between the two signals, with one
    # signal shifted.
    # In other words, we minimize |x0 - x1|/N where N is the number of points
    # overlapping between x1 and the shifted version of x0.
    # Args:
    #   x0, x1 (vector): data
    #   f (vector): frequency vector
    # Returns:
    #   foffset (float): frequency offset
    OMAX = min(len(x0) // 2, 100)  # max offset in samples
    dvec = zeros((2 * OMAX,))
    offsetvec = arange(-OMAX + 1, OMAX + 1)
    y0 = x0.copy()
    y1 = x1.copy()
    y0 -= min(y0)
    y1 -= min(y1)
    y0 = pad(y0, (100, 100), 'constant', constant_values=(0, 0))
    y1 = pad(y1, (100, 100), 'constant', constant_values=(0, 0))
    for i, offset in enumerate(offsetvec):
        d0 = roll(y0, offset)
        d1 = y1
        iinds1 = d0 != 0
        iinds2 = d1 != 0
        iinds = logical_and(iinds1, iinds2)
        d0 = d0[iinds]
        d1 = d1[iinds]
        diff = d0 - d1
        dvec[i] = sum(abs(diff)) / len(d0)
    m = interp_peak_offset(-1 * dvec)
    foffset = (m - OMAX + 1) * (f[2] - f[1])
    return foffset

In case of logistic regression, how should I interpret this learning curve between cost and number of examples?

I have obtained the following learning curve when plotting the training and cross-validation error cost against the number of training examples (in 100s on the graph). Can someone please tell me whether this learning curve is even possible? I am under the impression that the cross-validation error should decrease as the number of training examples increases.
Learning Curve. Note that the x axis denotes the number of training examples in 100s.
EDIT :
This is the code which I use to calculate the 9 values for plotting the learning curves.
X is the 2D matrix of the training set examples. It is of dimensions m x (n+1). y is of dimensions m x 1, and each element has value 1 or 0.
for j=1:9
  disp(j)
  [theta, J] = trainClassifier(X(1:(j*100),:), y(1:(j*100)), lambda);
  [error_train(j), grad] = costprediciton_train(theta, X(1:(j*100),:), y(1:(j*100)));
  [error_cv(j), grad] = costfunction_test2(theta, Xcv(1:(j*100),:), ycv(1:(j*100)));
end
The code I use for finding the optimal value of Theta from the training set.
% Train the classifier. Return theta
function [optTheta, J] = trainClassifier(X, y, lambda)
  [m, n] = size(X);
  initialTheta = zeros(n, 1);
  options = optimset('GradObj', 'on', 'MaxIter', 100);
  [optTheta, J, Exit_flag] = fminunc(@(t)(regularizedCostFunction(t, X, y, lambda)), initialTheta, options);
end
% regularized cost
function [J, grad] = regularizedCostFunction(theta, X, y, lambda)
  [m, n] = size(X);
  h = sigmoid(X * theta);
  temp1 = -1 * (y .* log(h));
  temp2 = (1 - y) .* log(1 - h);
  thetaT = theta;
  thetaT(1) = 0;
  correction = sum(thetaT .^ 2) * (lambda / (2 * m));
  J = sum(temp1 - temp2) / m + correction;
  grad = (X' * (h - y)) * (1/m) + thetaT * (lambda / m);
end
The code I use for calculating the error cost for predictions on the training set (the code for the CV set error cost is similar):
Theta has dimensions (n+1) x 1 and consists of the coefficients of the features in the hypothesis function.
function [J, grad] = costprediciton_train(theta, X, y)
  [m, n] = size(X);
  h = sigmoid(X * theta);
  temp1 = y .* log(h);
  temp2 = (1-y) .* log(1 - h);
  J = -sum(temp1 + temp2)/m;
  t = h - y;
  grad = (X'*t)*(1/m);
end
function [J, grad] = costfunction_test2(theta, X, y)
  m = length(y);
  h = sigmoid(X*theta);
  temp1 = y .* log(h);
  temp2 = (1-y) .* log(1 - h);
  J = -sum(temp1 + temp2)/m;
  grad = (X' * (h - y)) * (1/m);
end
The Sigmoid function:
function g = sigmoid(z)
  g = zeros(size(z));
  den = 1 + exp(-1*z);
  g = 1 ./ den;
end

finding the distance between one camera and one object, knowing the dimensions of the object using opencv

I am working on a robotic arm and trying to find the distance between one camera and one object, knowing the dimensions of the object, using OpenCV.
I am not sure how to do it.
I tried using the visual servoing method but did not succeed.
Any help would be nice.
Here is a simple solution to start with:
1.) First, you have to calibrate your camera to obtain the C matrix. For example:
http://docs.opencv.org/doc/tutorials/calib3d/camera_calibration/camera_calibration.html
2.) Then you have to identify on the image two points of the object whose distance is known. Let these points be P1(X1, Y1, Z1) and P2(X2, Y2, Z2) in 3D space and p1(x1, y1) and p2(x2, y2) in the 2D image plane. P1 and P2 themselves are not known, only their projections on the image (p1 and p2) and their distance D.
3.) Next, if you know the camera calibration matrix C, you have two equations:
p1 = C * P1 and p2 = C * P2
Then C^-1 * p1 = P1' and C^-1 * p2 = P2', where P1' = (X1 / Z1, Y1 / Z1, 1) and P2' = (X2 / Z2, Y2 / Z2, 1). This means that C^-1 * p1 returns the 3D coordinates of the point as if it lay at distance Z1 = 1.
But the exact distance is not 1; it is Z1 and Z2. Now let's suppose that Z1 == Z2 == Z, and look for the distance Z.
It follows that Z * C^-1 * p1 = P1 and Z * C^-1 * p2 = P2.
Next, (Z * C^-1 * p1) - (Z * C^-1 * p2) = P1 - P2, so
Z * ((C^-1 * p1) - (C^-1 * p2)) = P1 - P2, and therefore
Z^2 * ||(C^-1 * p1) - (C^-1 * p2)||^2 = ||P1 - P2||^2 = D^2
Now, C, p1, p2 and D are known, only Z is the unknown variable in the last equation.
Certainly, this is only a basic solution and relies on the assumption that the distances of the object's points from the camera are very similar, but it can work.
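As an illustration only, here is how that last equation could be solved for Z with NumPy (the function name is my own; C is the calibration matrix from step 1, p1 and p2 are the pixel coordinates in homogeneous form, and D is the known distance):

import numpy as np

def depth_from_known_distance(C, p1, p2, D):
    # Z = D / ||C^-1 * p1 - C^-1 * p2||, assuming both object points lie at
    # (roughly) the same depth Z, as in the derivation above.
    # p1 and p2 are homogeneous pixel coordinates, e.g. (x1, y1, 1).
    C_inv = np.linalg.inv(C)
    ray_diff = C_inv @ np.asarray(p1, dtype=float) - C_inv @ np.asarray(p2, dtype=float)
    return D / np.linalg.norm(ray_diff)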
