How to do InverseDynamics with a floating base robot? - drake

I tried CalcInverseDynamics, but the returned tau is an 18-dimensional vector, 6 (floating base) + 12 (actuated joints), whereas I expected 12 (equal to the number of actuators). Is there any example of doing inverse dynamics with a floating-base robot using known_vdot and contact force trajectories?
I tried with the LittleDog.urdf model. My code is:
def DoID():
    legs = [plant.GetBodyByName("front_left_lower_leg"),
            plant.GetBodyByName("front_right_lower_leg"),
            plant.GetBodyByName("back_left_lower_leg"),
            plant.GetBodyByName("back_right_lower_leg")]
    contacts = [foot_frame[i].CalcPoseInBodyFrame(plant_context).translation()
                for i in range(4)]
    F_expected = np.array([0., 0., 0., 0., 0., 0.])
    forces = MultibodyForces(plant)
    # add the SpatialForce applied to each leg into MultibodyForces
    for i in range(4):
        legs[i].AddInForce(
            plant_context, p_BP_E=contacts[i],
            F_Bp_E=SpatialForce(tau=F_expected[:3], f=F_expected[3:]),
            frame_E=plant.world_frame(), forces=forces)
    nv = plant.num_velocities()
    vd_d = np.zeros(nv)
    tau = plant.CalcInverseDynamics(plant_context, vd_d, forces)
    return tau
Update:
The CalcInverseDynamics API documentation says:
tau = M(q)v̇ + C(q, v)v - tau_app - ∑ J_WBᵀ(q) Fapp_Bo_W
This should also work for the floating-base robot: it is the same equation, just partitioned into the six unactuated floating-base rows and the twelve actuated joint rows (from here, different notation but the same equation). I hope that when the contact forces and the known_vdot (or qddot) are 'reasonable', the floating-base rows of tau become zeros and the actuated rows become the joint torque commands. I will use APIs like CalcMassMatrix, CalcBiasTerm and CalcGravityGeneralizedForces to get these terms.
After getting the joint commands, use a PD controller or another controller to apply them to the robot. A full solution to 'control to a desired acceleration' may still need to formulate a QP like http://groups.csail.mit.edu/robotics-center/public_papers/Kuindersma13.pdf, but I will try the simpler way first.
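A minimal sketch of that simpler way, assuming the six floating-base velocities come first in the generalized-velocity ordering and that each actuator drives a single joint with unit gear ratio (the function name is just illustrative):

import numpy as np

def joint_torques_from_id(plant, context, vdot_desired, external_forces):
    # Full generalized-force vector, length nv (18 for LittleDog:
    # 6 floating-base rows + 12 actuated rows).
    tau_full = plant.CalcInverseDynamics(context, vdot_desired, external_forces)
    # MakeActuationMatrix() maps actuator inputs u to generalized forces
    # (tau = B @ u), so B.T picks out the actuated rows when each actuator
    # drives one velocity with gear ratio 1.
    B = plant.MakeActuationMatrix()           # nv x num_actuators
    u = B.T @ tau_full                        # the 12 joint torque commands
    # If the contact forces and vdot_desired are dynamically consistent,
    # the floating-base rows should come out (close to) zero.
    residual = tau_full[:6]
    return u, residual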

My guess is that you are trying to find a controller that will (approximately) follow a desired acceleration of the entire state vector using only the actuators (for littledog, you have 12 actuators, but 19 positions / 18 velocities)?
In addition, with a legged robot like littledog, you have to think about the contact forces (and their friction cones).
The most common generalization of the inverse dynamics control for situations like this involves solving a quadratic program (using a linearization of the friction cone constraints). See for instance http://groups.csail.mit.edu/robotics-center/public_papers/Kuindersma13.pdf
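To make that concrete, here is a heavily simplified sketch of such a QP using Drake's MathematicalProgram, written over matrices you have already computed (e.g. from CalcMassMatrix, CalcBiasTerm, CalcGravityGeneralizedForces, MakeActuationMatrix and a stacked contact Jacobian). The friction-cone handling is reduced to a non-negativity placeholder, and all names here are illustrative rather than Drake's own whole-body controller:

import numpy as np
from pydrake.solvers import MathematicalProgram, Solve

def id_qp(M, Cv, tauG, B, Jc, vdot_des, w_reg=1e-3):
    # One-step inverse-dynamics QP: find (vdot, u, f) satisfying
    # M*vdot + Cv = tauG + B*u + Jc^T*f while tracking vdot_des.
    nv, nu = B.shape
    nf = Jc.shape[0]                     # stacked contact-force components
    prog = MathematicalProgram()
    vdot = prog.NewContinuousVariables(nv, "vdot")
    u = prog.NewContinuousVariables(nu, "u")
    f = prog.NewContinuousVariables(nf, "f")
    # Dynamics as a linear equality: M*vdot - B*u - Jc^T*f = tauG - Cv.
    Aeq = np.hstack([M, -B, -Jc.T])
    beq = tauG - Cv
    prog.AddLinearEqualityConstraint(Aeq, beq, np.concatenate([vdot, u, f]))
    # Placeholder for linearized friction cones: here only the normal
    # components (every 3rd entry, z up) are kept non-negative.
    for i in range(2, nf, 3):
        prog.AddLinearConstraint(f[i] >= 0)
    # Track the desired acceleration; lightly regularize u and f.
    prog.AddQuadraticErrorCost(np.eye(nv), vdot_des, vdot)
    prog.AddQuadraticCost(w_reg * (u.dot(u) + f.dot(f)))
    result = Solve(prog)
    return result.GetSolution(u) if result.is_success() else None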

Related

Partial Differentiation of the Mass Matrix for Free-Floating system with 7 DoF Arm

I'm trying to partially differentiate the mass matrix with respect to the positions (including the free-floating base quaternions). However, I first try to substitute the unit quaternion constraint into the symbolic mass matrix, i.e. q0 = sqrt(1 - q1^2 - q2^2 - q3^2), and then differentiate with respect to all the positions (except q_0).
When I use plant.ToSymbolic() and then extract the symbolic mass matrix using plant.CalcMassMatrixViaInverseDynamics() (the ManipulatorDynamics function taken from this example), it works for a single free-floating rigid body (3-DoF floating) and also for a free-floating box with a 3-DoF arm (6-DoF system), but the same process does not work when I change to a 7-DoF arm attached to the floating base (13-DoF system). The mass matrix is computed, but the function that substitutes the unit quaternion constraint takes too long (it never finishes and returns). A portion of the applicable code:
q0 = Variable("q0")
q1 = Variable("q1")
q2 = Variable("q2")
q3 = Variable("q3")
unit_cons = Expression.sqrt(1 - q1**2 - q2**2 - q3**2)
mixed_q = np.array([q0, q1, q2, q3,
                    0, 0, 0,
                    0, 0, 0, 0, 0, 0, 0])
v = np.zeros(plant.num_velocities())
(M, Cv, tauG, B, tauExt) = ManipulatorDynamics(plant.ToSymbolic(), mixed_q, v)
M[0, 0].Substitute({q0: unit_cons})
This runs well for a single floating rigid body, floating body + 3-DoF arm. But the last line takes a long time and never returns for a floating base + 7-DoF arm.
Is there a better way to approach this with Drake's symbolic engine? Or would AutoDiff be a better choice for this? I only need the 1st partial derivative in this case.
As for some background, I am trying to create a custom linearization of the MultiBodyPlant which embeds the unit quaternion constraint. This requires the partial derivatives of the terms after substituting the unit quaternion constraint to only have the vector part of the quaternion in the equations of motion.
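For reference, the chain rule that the substitution encodes is dq0/dqi = -qi/q0 for i = 1..3; a quick numeric sanity check of that identity, with illustrative values:

import numpy as np

q_vec = np.array([0.1, 0.2, 0.3])          # vector part of a unit quaternion
q0 = np.sqrt(1 - np.sum(q_vec**2))         # scalar part from the constraint

# analytic partials dq0/dqi = -qi/q0
analytic = -q_vec / q0

# central finite differences as a check
eps = 1e-6
numeric = np.array([
    (np.sqrt(1 - np.sum((q_vec + eps * e)**2)) -
     np.sqrt(1 - np.sum((q_vec - eps * e)**2))) / (2 * eps)
    for e in np.eye(3)])

assert np.allclose(analytic, numeric, atol=1e-6)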
Or would AutoDiff be a better choice for this? I only need the 1st partial derivative in this case.
If you want to evaluate the partial differentiation, then I would suggest using autodiff. To handle your unit quaternion constraint, you can compute your q0 from the vector part of the quaternion.
Here is the pseudo code.
import numpy as np
from pydrake.autodiffutils import AutoDiffXd, InitializeAutoDiff

quaternion_vec_val = np.array([0.1, 0.2, 0.3])
q_non_quaternion_val = np.array([0.4, 0.5, 0.6])
q_variable = np.concatenate((quaternion_vec_val, q_non_quaternion_val))
# We want to take the derivative w.r.t. the vector part of the quaternion and all
# non-quaternion variables, hence we call InitializeAutoDiff only on those variables.
q_variable_ad = InitializeAutoDiff(q_variable)
# The quaternion scalar part is recomputed from the unit-norm constraint,
# so its derivatives propagate through autodiff automatically.
quaternion_scalar_ad = np.array([np.sqrt(1 - np.sum(q_variable_ad[:3]**2))])
q_ad = np.concatenate((quaternion_scalar_ad, q_variable_ad))
plant_ad = plant.ToAutoDiffXd()
# Initialize v_ad with zero velocities (no derivatives of their own).
v_ad = np.array([AutoDiffXd(0.) for _ in range(plant_ad.num_velocities())])
(M, C, g, B, tau_ext) = ManipulatorDynamics(plant_ad, q_ad, v_ad)
print(M[0, 0].derivatives())
(I haven't run the code, especially the part that initializes v_ad is fuzzy, but hopefully you get the idea).

Get mapping from body state to plant state indices?

Is there a way to know the mapping between the indices of a plant's state [q, v] and an individual object's [qi, vi]?
Example: if I have object-wise representations of position, velocities, or accelerations (e.g. q_obj1, q_obj2 etc.) and need to interact with a plant's state beyond what SetPositions/SetVelocities allows me to do---for instance computing M.dot(qdd), or feeding into MultibodyPositionToGeometryPose.
From the documentation it seemed like maybe the C++ method MakeStateSelectorMatrix or its inverse could be useful here, but how would one do this in python?
Edit: here's a more clear example
# object-wise positions, velocities, and accelerations
q1, qd1, qdd1, q2, qd2, qdd2 = ...
# set plant positions and velocities for each object
plant.SetPositionsAndVelocities(context, model1, np.concatenate([q1, qd1]))
plant.SetPositionsAndVelocities(context, model2, np.concatenate([q2, qd2]))
# get matrices for the plant manipulator equations
M = plant.CalcMassMatrixViaInverseDynamics(context)
Cv = plant.CalcBiasTerm(context)
# The following line is wrong, because M is in the plant state ordering
# but qdd is not!
print(M.dot(np.concatenate([qdd1, qdd2])) + Cv)
# here's what I would like, where the imaginary method ConvertToStateIndices
# maps from object indices to plant state indices,
# which SetPositions does under the hood, but I need exposed here
reordered_qdd = np.zeros(plant.num_positions(), dtype=qdd.dtype)
reordered_qdd += plant.ConvertToStateIndices(model1, qdd1)
reordered_qdd += plant.ConvertToStateIndices(model2, qdd2)
print(M.dot(reordered_qdd) + Cv)
Edit 2: for future reference, here is a workaround---do the following before casting MultiBodyPlant to AutoDiffXd or Expression:
indices = np.arange(plant.num_positions())
indices_dict = {body: plant.GetPositionsFromArray(model, indices).astype(int)
                for body, model in zip(bodies, models)}
then you can do:
reordered_qdd = np.zeros(plant.num_positions(), dtype=qdd.dtype)
reordered_qdd[indices_dict[body1]] += qdd1
reordered_qdd[indices_dict[body2]] += qdd2
print(M.dot(reordered_qdd) + Cv)
Are MultibodyPlant::SetPositionsInArray or SetVelocitiesInArray up to the job you're looking for?
reordered_qdd = np.zeros(plant.num_velocities(), dtype=qdd.dtype)  # accelerations share the velocity layout
plant.SetVelocitiesInArray(model1, qdd1, reordered_qdd)
plant.SetVelocitiesInArray(model2, qdd2, reordered_qdd)
print(M.dot(reordered_qdd) + Cv)
One could say a bit of an abuse of the API (since in this example the values are accelerations), but I think given that qd and qdd would have the same layout it should work fine.
That being said, I didn't actually test the code above, so apologies (and please let me know!) if that doesn't work out.

arbitrarily weighted moving average (low- and high-pass filters)

Given an input signal x (e.g. a voltage, sampled a thousand times per second for a couple of minutes), I'd like to calculate e.g.
/ this is not q
y[3] = -3*x[0] - x[1] + x[2] + 3*x[3]
y[4] = -3*x[1] - x[2] + x[3] + 3*x[4]
. . .
I'm aiming for variable window length and weight coefficients. How can I do it in q? I'm aware of mavg, signal processing in q, and the moving-sum q idiom.
In the DSP world this is called applying a filter kernel by convolution. The weight coefficients define the kernel, which makes it a high- or low-pass filter. The example above calculates the slope of the last four points, fitting the straight line via the least-squares method.
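As a quick cross-check of that least-squares claim (outside q, just numpy): the least-squares slope over four equally spaced samples is (-3*x[0] - x[1] + x[2] + 3*x[3]) / 10, so the example weights give that slope up to a constant factor of 10. The random-walk input here just mimics the q example's x:

import numpy as np

x = np.random.default_rng(0).normal(size=100).cumsum() + 10   # a random walk, like the q example's x
w = np.array([-3., -1., 1., 3.])                               # oldest-to-newest weights

# y[i] = -3*x[i-3] - x[i-2] + x[i-1] + 3*x[i], for i >= 3
y = np.convolve(x, w[::-1], mode="valid")

# Least-squares slope over each window of the last four samples:
slopes = np.array([np.polyfit(np.arange(4), x[i - 3:i + 1], 1)[0]
                   for i in range(3, len(x))])

assert np.allclose(y, 10 * slopes)   # the weights are the LS slope times 10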
Something like this would work for parameterisable coefficients:
q)x:10+sums -1+1000?2f
q)f:{sum x*til[count x]xprev\:y}
q)f[3 1 -1 -3] x
0n 0n 0n -2.385585 1.423811 2.771659 2.065391 -0.951051 -1.323334 -0.8614857 ..
Specific cases can be made a bit faster (running 0 xprev is not the best thing)
q)g:{prev[deltas x]+3*x-3 xprev x}
q)g[x]~f[3 1 -1 -3]x
1b
q)\t:100000 f[3 1 -1 -3] x
4612
q)\t:100000 g x
1791
There's a kx white paper of signal processing in q if this area interests you: https://code.kx.com/q/wp/signal-processing/
This may be a bit old but I thought I'd weigh in. There is a paper I wrote last year on signal processing that may be of some value. Working purely within kdb, depending on the signal sizes you are using, you will see much better performance with an FFT-based convolution between the kernel/window and the signal.
However, I've only written up a simple radix-2 FFT, although in my github repo I do have the untested work for a more flexible Bluestein algorithm which will allow for more variable signal length. https://github.com/callumjbiggs/q-signals/blob/master/signal.q
If you wish to go down the path of performing a full manual convolution by a moving sum, then the best method would be to break it up into blocks equal to the kernel/window size (which was based on some work Arthur W did many years ago)
q)vec:10000?100.0
q)weights:30?1.0
q)wsize:count weights
q)(weights$(((wsize-1)#0.0),vec)til[wsize]+) each til count vec
32.5931 75.54583 100.4159 124.0514 105.3138 117.532 179.2236 200.5387 232.168.
If your input list is not big then you could use the technique mentioned here:
https://code.kx.com/q/cookbook/programming-idioms/#how-do-i-apply-a-function-to-a-sequence-sliding-window
That uses the 'scan' adverb, but it creates multiple intermediate lists, which might be inefficient for big lists.
Other solution using scan is:
q)f:{sum y*next\[z;x]} / x-input list, y-weights, z-window size-1
q)f[x;-3 -1 1 3;3]
This function also creates multiple lists so again might not be very efficient for big lists.
Other option is to use indices to fetch target items from the input list and perform the calculation. This will operate only on input list.
q) f:{[l;w;i]sum w*l i+til 4} / w- weight, l- input list, i-current index
q) f[x;-3 -1 1 3]@'til count x
This is a very basic function. You can add more variables to it as per your requirements.

calculate the spatial dimension of a graph

Given a graph (say fully-connected), and a list of distances between all the points, is there an available way to calculate the number of dimensions required to instantiate the graph?
E.g. by construction, say we have graph G with points A, B, C and distances AB=BC=CA=1. Starting from A (0 dimensions) we add B at distance 1 (1 dimension), now we find that a 2nd dimension is needed to add C and satisfy the constraints. Does code exist to do this and spit out (in this case) dim(G) = 2?
E.g. if the points are photos, and the distances between them calculated by the Gist algorithm (http://people.csail.mit.edu/torralba/code/spatialenvelope/), I would expect the derived dimension to match the number image parameters considered by Gist.
Added: here is a 5-d python demo based on the suggestion - seemingly perfect!
'similarities' is the distance matrix.
import numpy as np
from sklearn import manifold
similarities = [[0., 1., 1., 1., 1., 1.],
                [1., 0., 1., 1., 1., 1.],
                [1., 1., 0., 1., 1., 1.],
                [1., 1., 1., 0., 1., 1.],
                [1., 1., 1., 1., 0., 1.],
                [1., 1., 1., 1., 1., 0.]]
seed = np.random.RandomState(seed=3)
for i in [1, 2, 3, 4, 5]:
    mds = manifold.MDS(n_components=i, max_iter=3000, eps=1e-9, random_state=seed,
                       dissimilarity="precomputed", n_jobs=1)
    print("%d %f" % (i, mds.fit(similarities).stress_))
Output:
1 3.333333
2 1.071797
3 0.343146
4 0.151531
5 0.000000
I find that when I apply this method to a subset of my data (distances between 329 pictures with '11' in the file name, using two different metrics), the stress doesn't decrease to 0 as I'd expect from the above - it levels off after about 5 dimensions. (On the SURF results I tried doubling max_iter, and varying eps by an order of magnitude each way, without changing the results in the first four digits.)
It turns out the distances do not satisfy the triangle inequality in ~0.02% of the triangles, with the average violation roughly equal to 8% the average distance, for one metric examined.
Overall I prefer the fractal dimension of the sorted distances since it doesn't require picking a cutoff. I'm marking the MDS response as an answer because it works for the consistent case. My results for the fractal dimension and the MDS case are below.
Another descriptive statistic turns out to be the triangle violations; results for this are further below. If anyone could generalize to higher dimensions, that would be very interesting (I'm still learning python :-).
MDS results, ignoring the triangle inequality issue:
N_dim   SURF_match stress_      GIST_match stress_
1 83859853704.027344 913512153794.477295
2 24402474549.902721 238300303503.782837
3 14335187473.611954 107098797170.304825
4 10714833228.199451 67612051749.697998
5 9451321873.828577 49802989323.714806
6 8984077614.154467 40987031663.725784
7 8748071137.806602 35715876839.391762
8 8623980894.453981 32780605791.135693
9 8580736361.368249 31323719065.684353
10 8558536956.142039 30372127335.209297
100 8544120093.395177 28786825401.178596
1000 8544192695.435946 28786840008.666389
Forging ahead with that to devise a metric to compare the dimensionality of the two results, an ad hoc choice is to set the criterion to
1.1 * stress_at_dim=100
resulting in the proposition that the SURF_match has a quasi-dimension in 5..6, while GIST_match has a quasi-dimension in 8..9. I'm curious if anyone thinks that means anything :-). Another question is whether there is any meaningful interpretation for the relative magnitudes of stress at any dimension for the two metrics. Here are some results to put it in perspective. Frac_d is the fractal dimension of the sorted distances, calculated according to Higuchi's method using code from IQM (a rough sketch of the method follows the table); Dim is the dimension as described above.
Method Frac_d Dim stress(100) stress(1)
Lab_CIE94 1.1458 3 2114107376961504.750000 33238672000252052.000000
Greyscale 1.0490 8 42238951082.465477 1454262245593.781250
HS_12x12 1.0889 19 33661589105.972816 3616806311396.510254
HS_24x24 1.1298 35 16070009781.315575 4349496176228.410645
HS_48x48 1.1854 64 7231079366.861403 4836919775090.241211
GIST 1.2312 9 28786830336.332951 997666139720.167114
HOG_250_words 1.3114 10 10120761644.659481 150327274044.045624
HOG_500_words 1.3543 13 4740814068.779779 70999988871.696045
HOG_1k_words 1.3805 15 2364984044.641845 38619752999.224922
SIFT_1k_words 1.5706 11 1930289338.112194 18095265606.237080
SURFFAST_200w 1.3829 8 2778256463.307569 40011821579.313110
SRFFAST_250_w 1.3754 8 2591204993.421285 35829689692.319153
SRFFAST_500_w 1.4551 10 1620830296.777577 21609765416.960484
SURFFAST_1k_w 1.5023 14 949543059.290031 13039001089.887533
SURFFAST_4k_w 1.5690 19 582893432.960562 5016304129.389058
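As referenced above, here is a rough sketch of Higuchi's fractal-dimension estimate as I understand it (this is not the IQM code that produced the Frac_d column; k_max and the usage line are illustrative):

import numpy as np

def higuchi_fd(x, k_max=8):
    # Rough sketch of Higuchi's fractal dimension of a 1-D series.
    x = np.asarray(x, dtype=float)
    n = len(x)
    log_inv_k, log_L = [], []
    for k in range(1, k_max + 1):
        Lk = []
        for m in range(k):
            idx = np.arange(m, n, k)          # subsampled curve x[m], x[m+k], ...
            if len(idx) < 2:
                continue
            length = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1.0) / ((len(idx) - 1) * k)   # Higuchi (1988) normalization
            Lk.append(length * norm / k)
        log_inv_k.append(np.log(1.0 / k))
        log_L.append(np.log(np.mean(Lk)))
    # the fractal dimension is the slope of log L(k) vs log(1/k)
    return np.polyfit(log_inv_k, log_L, 1)[0]

# e.g. on the sorted pairwise distances from the code above:
# frac_d = higuchi_fd(np.sort(sim[np.triu_indices(N, k=1)]))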
Looking at the Pearson correlation between columns of the table:
Pearson correlation 2-tailed p-value
FracDim, Dim: (-0.23333296587402277, 0.40262625206429864)
Dim, Stress(100): (-0.24513480360257348, 0.37854224076180676)
Dim, Stress(1): (-0.24497740363489209, 0.37885820835053186)
Stress(100),S(1): ( 0.99999998200931084, 8.9357374620135412e-50)
FracDim, S(100): (-0.27516440489210137, 0.32091019789264791)
FracDim, S(1): (-0.27528621200454373, 0.32068731053608879)
I naively wonder how all correlations but one can be negative, and what conclusions can be drawn. Using this code:
import sys
import numpy as np
from scipy.stats.stats import pearsonr
file = sys.argv[1]
col1 = int(sys.argv[2])
col2 = int(sys.argv[3])
arr1 = []
arr2 = []
with open(file, "r") as ins:
    for line in ins:
        words = line.split()
        arr1.append(float(words[col1]))
        arr2.append(float(words[col2]))
narr1 = np.array(arr1)
narr2 = np.array(arr2)
# normalize
narr1 -= narr1.mean(0)
narr2 -= narr2.mean(0)
# standardize
narr1 /= narr1.std(0)
narr2 /= narr2.std(0)
print pearsonr(narr1, narr2)
On to the number of violations of the triangle inequality by the various metrics, all for the 329 pics with '11' in their sequence:
(1) n_violations/triangles
(2) avg violation
(3) avg distance
(4) avg violation / avg distance
n_vio (1) (2) (3) (4)
lab 186402 0.031986 157120.407286 795782.437570 0.197441
grey 126902 0.021776 1323.551315 5036.899585 0.262771
600px 120566 0.020689 1339.299040 5106.055953 0.262296
Gist 69269 0.011886 1252.289855 4240.768117 0.295298
RGB
12^3 25323 0.004345 791.203886 7305.977862 0.108295
24^3 7398 0.001269 525.981752 8538.276549 0.061603
32^3 5404 0.000927 446.044597 8827.910112 0.050527
48^3 5026 0.000862 640.310784 9095.378790 0.070400
64^3 3994 0.000685 614.752879 9270.282684 0.066314
98^3 3451 0.000592 576.815995 9409.094095 0.061304
128^3 1923 0.000330 531.054082 9549.109033 0.055613
RGB/600px
12^3 25190 0.004323 790.258158 7313.379003 0.108057
24^3 7531 0.001292 526.027221 8560.853557 0.061446
32^3 5463 0.000937 449.759107 8847.079639 0.050837
48^3 5327 0.000914 645.766473 9106.240103 0.070915
64^3 4382 0.000752 634.000685 9272.151040 0.068377
128^3 2156 0.000370 544.644712 9515.696642 0.057236
HueSat
12x12 7882 0.001353 950.321873 7555.464323 0.125779
24x24 1740 0.000299 900.577586 8227.559169 0.109459
48x48 1137 0.000195 661.389622 8653.085004 0.076434
64x64 1134 0.000195 697.298942 8776.086144 0.079454
HueSat/600px
12x12 6898 0.001184 943.319078 7564.309456 0.124707
24x24 1790 0.000307 908.031844 8237.927256 0.110226
48x48 1267 0.000217 693.607735 8647.060308 0.080213
64x64 1289 0.000221 682.567106 8761.325172 0.077907
hog
250 53782 0.009229 675.056004 1968.357004 0.342954
500 18680 0.003205 559.354979 1431.803914 0.390665
1k 9330 0.001601 771.307074 970.307130 0.794910
4k 5587 0.000959 993.062824 650.037429 1.527701
sift
500 26466 0.004542 1267.833182 1073.692611 1.180816
1k 16489 0.002829 1598.830736 824.586293 1.938949
4k 10528 0.001807 1918.068294 533.492373 3.595306
surffast
250 38162 0.006549 630.098999 1006.401837 0.626091
500 19853 0.003407 901.724525 830.596690 1.085635
1k 10659 0.001829 1310.348063 648.191424 2.021545
4k 8988 0.001542 1488.200156 419.794008 3.545072
Anyone capable of generalizing to higher dimensions? Here is my first-timer code:
import sys
import time
import math
import numpy as np
import sortedcontainers
from sortedcontainers import SortedSet
from sklearn import manifold
seed = np.random.RandomState(seed=3)
pairs = sys.argv[1]
ss = SortedSet()
print time.strftime("%H:%M:%S"), "counting/indexing"
sys.stdout.flush()
with open(pairs, "r") as ins:
    for line in ins:
        words = line.split()
        ss.add(words[0])
        ss.add(words[1])
N = len(ss)
print time.strftime("%H:%M:%S"), "size ", N
sys.stdout.flush()
sim = np.diag(np.zeros(N))
dtot = 0.0
with open(pairs, "r") as ins:
    for line in ins:
        words = line.split()
        i = ss.index(words[0])
        j = ss.index(words[1])
        #val = math.log(float(words[2]))
        #val = math.sqrt(float(words[2]))
        val = float(words[2])
        sim[i][j] = val
        sim[j][i] = val
        dtot += val
avgd = dtot / (N * (N-1))
ntri = 0
nvio = 0
vio = 0.0
for i in xrange(1, N):
    for j in xrange(i+1, N):
        d1 = sim[i][j]
        for k in xrange(j+1, N):
            ntri += 1
            d2 = sim[i][k]
            d3 = sim[j][k]
            dd = d1 + d2
            diff = d3 - dd
            if (diff > 0.0):
                nvio += 1
                vio += diff
avgvio = 0.0
if (nvio > 0):
    avgvio = vio / nvio
print("tot: %d %f %f %f %f" % (nvio, (float(nvio)/ntri), avgvio, avgd, (avgvio/avgd)))
Here is how I tried sklearn's Isomap:
from sklearn.metrics import euclidean_distances

for i in [1, 2, 3, 4, 5]:
    # nbrs < points
    iso = manifold.Isomap(n_neighbors=nbrs, n_components=i,
                          eigen_solver="auto", tol=1e-9, max_iter=3000,
                          path_method="auto", neighbors_algorithm="auto")
    dis = euclidean_distances(iso.fit(sim).embedding_)
    stress = ((dis.ravel() - sim.ravel()) ** 2).sum() / 2
Given a graph (say fully-connected), and a list of distances between all the points, is there an available way to calculate the number of dimensions required to instantiate the graph?
Yes. The more general topic this problem would be part of, in terms of graph theory, is called "Graph Embedding".
E.g. by construction, say we have graph G with points A, B, C and distances AB=BC=CA=1. Starting from A (0 dimensions) we add B at distance 1 (1 dimension), now we find that a 2nd dimension is needed to add C and satisfy the constraints. Does code exist to do this and spit out (in this case) dim(G) = 2?
This is almost exactly the way that Multidimensional Scaling works.
Multidimensional scaling (MDS) would not exactly answer the question of "How many dimensions would I need to represent this point cloud / graph?" with a number but it returns enough information to approximate it.
Multidimensional Scaling methods will attempt to find a "good mapping" to reduce the number of dimensions, say from 120 (in the original space) down to 4 (in another space). So, in a way, you can iteratively try different embeddings for increasing number of dimensions and look at the "stress" (or error) of each embedding. The number of dimensions you are after is the first number for which there is an abrupt minimisation of the error.
Due to the way it works, Classical MDS, can return a vector of eigenvalues for the new mapping. By examining this vector of eigenvalues you can determine how many of its entries you would need to retain to achieve a (good enough, or low error) representation of the original dataset.
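A small sketch of that eigenvalue check (classical MDS via double-centering of the squared distance matrix; D is the distance/"similarity" matrix already discussed):

import numpy as np

def classical_mds_eigenvalues(D):
    # Eigenvalues of the Gram matrix obtained from a distance matrix D.
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    G = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    return np.linalg.eigvalsh(G)[::-1]         # sorted, largest first

# The number of clearly positive eigenvalues is roughly the number of
# dimensions needed; for the 6-point example above (all off-diagonal
# distances equal to 1) five equal positive eigenvalues come out.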
The key concept here is the "similarity" matrix, which is a fancy name for a graph's distance matrix (which you already seem to have), irrespective of its semantics.
Embedding algorithms, in general, are trying to find an embedding that may look different but at the end of the day, the point cloud in the new space will end up having a similar (depending on how much error we can afford) distance matrix.
In terms of code, I am sure that there is something available in all major scientific computing packages but off the top of my head I can point you towards Python and MATLAB code examples.
E.g. if the points are photos, and the distances between them calculated by the Gist algorithm (http://people.csail.mit.edu/torralba/code/spatialenvelope/), I would expect the derived dimension to match the number image parameters considered by Gist
Not exactly. This is a very good use case though. In this case, what MDS returns, or what you would be probing with dimensionality reduction in general, is how many of these features seem to be required to represent your dataset. Therefore, depending on the scenes, or depending on the dataset, you might realise that not all of these features are necessary for a good enough representation of the whole dataset. (In addition, you might want to have a look at this link as well.)
Hope this helps.
First, you can assume that any dataset has a dimensionality of at most 4 or 5. To get more relevant dimensions, you would need one million elements (or something like that).
Apparently, you already computed a distance. Are you sure it is actually a relevant metric? Is it effective for images that are quite distant? Perhaps you can try Isomap (geodesic distance, starting from only close neighbors) and see if your embedded space may not actually be Euclidean.

Simple registration algorithm for small sets of 2D points

I am trying to find a simple algorithm to find the correspondence between two sets of 2D points (registration). One set contains the template of an object I'd like to find and the second set mostly contains points that belong to the object of interest, but it can be noisy (missing points as well as additional points that do not belong to the object). Both sets contain roughly 40 points in 2D. The second set is a homography of the first set (translation, rotation and perspective transform).
I am interested in finding an algorithm for registration in order to get the point-correspondence. I will be using this information to find the transform between the two sets (all of this in OpenCV).
Can anyone suggest an algorithm, library or small bit of code that could do the job? As I'm dealing with small sets, it does not have to be super optimized. Currently, my approach is a RANSAC-like algorithm (a rough sketch in code follows the list):
1. Choose 4 random points from set 1 and from set 2.
2. Compute the transform matrix H (using OpenCV getPerspectiveTransform()).
3. Warp the 1st set of points using H and test how well they align to the 2nd set of points.
4. Repeat 1-3 N times and choose the best transform according to some metric (e.g. sum of squares).
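A rough Python/OpenCV sketch of those four steps; the naming and the nearest-neighbour scoring are my own choices, and note that drawing unmatched 4-point samples from both sets needs many iterations before a correct pairing is hit:

import numpy as np
import cv2

def ransac_register(pts1, pts2, n_iter=2000, inlier_tol=3.0, seed=0):
    # pts1, pts2: (N, 2) and (M, 2) float arrays; correspondences unknown.
    rng = np.random.default_rng(seed)
    best_score, best_H = np.inf, None
    for _ in range(n_iter):
        # 1. pick 4 random points from each set (the ordering is part of the guess)
        a = pts1[rng.choice(len(pts1), 4, replace=False)].astype(np.float32)
        b = pts2[rng.choice(len(pts2), 4, replace=False)].astype(np.float32)
        # 2. homography from the 4-point sample
        H = cv2.getPerspectiveTransform(a, b)
        # 3. warp all of set 1 and score against the nearest point in set 2
        warped = cv2.perspectiveTransform(
            pts1.reshape(-1, 1, 2).astype(np.float32), H).reshape(-1, 2)
        d = np.linalg.norm(warped[:, None, :] - pts2[None, :, :], axis=2).min(axis=1)
        score = np.sum(np.minimum(d, inlier_tol) ** 2)   # robust sum of squares
        # 4. keep the best transform seen so far
        if score < best_score:
            best_score, best_H = score, H
    return best_H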
Any ideas? Thanks for your input.
With Python you can use the Open3D library, which is very easy to install in Anaconda. For your purpose ICP should work fine, so we'll use classical ICP, which minimizes point-to-point distances between closest points in every iteration. Here is the code to register 2 clouds:
import numpy as np
import open3d as o3d
# Parameters:
initial_T = np.identity(4)  # Initial transformation for ICP
distance = 0.1  # Threshold distance used for searching correspondences
                # (closest points between clouds). I'm setting it to 10 cm.
# Read your point clouds:
source = o3d.io.read_point_cloud("point_cloud_1.xyz")
target = o3d.io.read_point_cloud("point_cloud_0.xyz")
# Define the type of registration:
type = o3d.pipelines.registration.TransformationEstimationPointToPoint(False)
# "False" means rigid transformation, scale = 1
# Define the number of iterations (I'll use 100):
iterations = o3d.pipelines.registration.ICPConvergenceCriteria(max_iteration = 100)
# Do the registration:
result = o3d.pipelines.registration.registration_icp(source, target, distance, initial_T, type, iterations)
result is an object with 4 things: the transformation T (4x4), 2 metrics (rmse and fitness) and the set of correspondences.
To access the transformation, use result.transformation.
I used it a lot with 3D clouds obtained from Terrestrial Laser Scanners (TLS) and from robots (Velodyne LIDAR).
With MATLAB:
We'll use the point-to-point ICP again, because your data is 2D. Here is a minimum example with two point clouds random generated inside a triangle shape:
% Triangle vertices:
V1 = [-20, 0; -10, 10; 0, 0];
V2 = [-10, 0; 0, 10; 10, 0];
% Create clouds and show pair:
points = 5000;
N1 = criar_nuvem_triangulo(V1,points);
N2 = criar_nuvem_triangulo(V2,points);
pcshowpair(N1,N2)
% Register pair N1->N2 and show:
[T,N1_transformed,RMSE]=pcregistericp(N1,N2,'Metric','pointToPoint','MaxIterations',100);
pcshowpair(N1_transformed,N2)
"criar_nuvem_triangulo" is a function to generate random point clouds inside a triangle:
function [cloud] = criar_nuvem_triangulo(V,N)
% Function which creates 2D point clouds in triangle format using random
% points
% Parameters: V = Triangle vertices (3x2 Matrix) | N = Number of points
t = sqrt(rand(N, 1));
s = rand(N, 1);
P = (1 - t) * V(1, :) + bsxfun(@times, ((1 - s) * V(2, :) + s * V(3, :)), t);
points = [P,zeros(N,1)];
cloud = pointCloud(points);
end
results:
You may just use cv::findHomography. It is a RANSAC-based approach around cv::getPerspectiveTransform.
auto H = cv::findHomography(srcPoints, dstPoints, CV_RANSAC,3);
Where 3 is the reprojection threshold.
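The same call from Python, for reference (this still assumes the two arrays are already-paired candidate correspondences; the mask output marks the RANSAC inliers):

import cv2
import numpy as np

# illustrative paired points; in practice these are your candidate correspondences
srcPoints = (np.random.rand(40, 2) * 100).astype(np.float32)
dstPoints = srcPoints + np.float32([5, 3])            # e.g. a pure translation
H, mask = cv2.findHomography(srcPoints, dstPoints, cv2.RANSAC, 3.0)
inliers = mask.ravel().astype(bool)                   # which pairs RANSAC kept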
One traditional approach to solving your problem is to use a point-set registration method, which works when you don't have matching-pair information. Point-set registration is similar to the method you are talking about. You can find a MATLAB implementation here.
Thanks
