geo-indexing: efficiently calculating proximity based on latitude/longitude

My simple web app (WSGI, Python) supports text queries to find items in the database.
Now I'd like to extend this to allow for queries like "find all items within 1 mile of {lat,long}".
Of course that's a complex job if efficiency is a concern, so I'm thinking of a dedicated external module that does indexing for geo-coordinates - sort of like Lucene would for text.
I assume a generic component like this already exists, but haven't been able to find anything so far. Any help would be greatly appreciated.

Have you checked out MongoDB? It has a geospatial indexing feature: http://www.mongodb.org/display/DOCS/Geospatial+Indexing
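If it helps, here is a minimal sketch of using that feature from Python via pymongo (the places collection, the loc field layout, and the query point are hypothetical; with the old-style 2d index, $maxDistance is expressed in the same units as the coordinates, i.e. degrees):
from pymongo import MongoClient, GEO2D

db = MongoClient().mydb
db.places.create_index([("loc", GEO2D)])  # documents store loc as [lng, lat]

lng, lat = -122.4, 37.8  # hypothetical query point
# One mile is roughly 1/69 of a degree of latitude
nearby = db.places.find(
    {"loc": {"$near": [lng, lat], "$maxDistance": 1 / 69.0}})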

I can only think of a semi-brute-force approach if you plan to implement it directly in Python, which I have already done for a similar purpose:
#!/usr/bin/python
from math import *

def distance(p1, p2):
    # Haversine formula on an ellipsoid model (WGS-84 semi-axes)
    lat1, long1 = p1
    lat2, long2 = p2
    lat1 = radians(lat1); long1 = radians(long1)
    lat2 = radians(lat2); long2 = radians(long2)
    maior = 6378.137       # equatorial ("maior" = major) radius, km
    menor = 6356.7523142   # polar ("menor" = minor) radius, km
    # local Earth radius at lat1
    R = (maior*menor) / sqrt((maior*cos(lat1))**2 + (menor*sin(lat1))**2)
    d_lat = lat2 - lat1
    d_long = long2 - long1
    a = sin(d_lat/2)**2 + cos(lat1) * cos(lat2) * sin(d_long/2)**2
    c = 2 * atan2(sqrt(a), sqrt(1-a))
    length = R * c
    x = sin(d_long) * cos(lat2)
    y = cos(lat2) * sin(lat1) - sin(lat2) * cos(lat1) * cos(d_long)
    bearing = 90 - degrees(atan2(y, -x))
    return length, bearing
To screen points by distance, you can first find candidate points whose latitude and longitude fall inside a square centered on your query position (much faster), and only then test the actual geodesic distance, as in the sketch below.
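Here is a minimal sketch of that screening step, assuming it lives in the same module as the distance function above (so the math imports are available); items is a hypothetical iterable of (lat, lon) pairs:
def find_within(items, center, radius_km):
    # Cheap bounding-box prefilter before the exact geodesic test.
    lat0, lon0 = center
    d_lat = radius_km / 111.2                         # ~111.2 km per degree of latitude
    d_lon = radius_km / (111.2 * cos(radians(lat0)))  # longitude degrees shrink with cos(lat)
    candidates = [(lat, lon) for (lat, lon) in items
                  if abs(lat - lat0) <= d_lat and abs(lon - lon0) <= d_lon]
    # Exact test only on the survivors; distance() returns (length_km, bearing)
    return [p for p in candidates if distance(center, p)[0] <= radius_km]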
Hope it helps!

Related

Computing 3D coordinates of keypoints in multiple images

I have multiple images of an object taken by the same calibrated camera. Let's say calibrated means both intrinsic and extrinsic parameters (I can put a checkerboard next to the object, so all parameters can be retrieved). On these images I can find matching keypoints using SIFT or SURF plus some matching algorithm; this is basic OpenCV. But how do I do the 3D reconstruction of these points from the multiple images? This is not a classic stereo arrangement: there are more than 2 images containing the same object points, and I want to use as many as possible for increased accuracy.
Are there any built-in OpenCV functions that do this?
(Note that this is done off-line; the solution does not need to be fast, but it must be robust.)
I guess you are looking for so-called Structure from Motion (SfM) approaches. They use multiple images from different viewpoints and return a 3D reconstruction (e.g. a point cloud). It looks like OpenCV has an SfM module in the contrib package, but I have no experience with it.
However, I used to work with Bundler. It was quite uncomplicated and returns the entire result (camera calibration and point positions) as a text file, and you can view the point cloud with Meshlab. Please note that it uses SIFT keypoints and descriptors for correspondence establishment.
I think I have found a solution for this. Structure from motion algorithms deal with the case where the cameras are not calibrated, but in this case all intrinsic and extrinsic parameters are known.
The problem reduces to a linear least squares problem:
We have to compute the coordinates of a single object point. Let
C = [x, y, z]'
be the point and
X = [[C], [1]] = [x, y, z, 1]'
its homogeneous form.
We are given n images, which have these transformation matrices:
Pi = Ki * [Ri|ti]
These matrices are already known. The object point is projected onto image i at
Ui = [ui, vi]
We can write in homogeneous coordinates (the operator * denotes matrix multiplication, dot product, or scalar multiplication, as appropriate):
[ui * wi, vi * wi, wi]' = Pi * X
Pi = [[p11i, p12i, p13i, p14i],
[p21i, p22i, p23i, p24i],
[p31i, p32i, p33i, p34i]]
Let's define the following:
p1i = [p11i, p12i, p13i] (the first row of Pi missing the last element)
p2i = [p21i, p22i, p23i] (the second row of Pi missing the last element)
p3i = [p31i, p32i, p33i] (the third row of Pi missing the last element)
a1i = p14i
a2i = p24i
a3i = p34i
Then we can write:
Q = [x, y, z]' (this is the object point C from above)
wi = p3i * Q + a3i
ui = (p1i * Q + a1i) / wi = (p1i * Q + a1i) / (p3i * Q + a3i)
ui * p3i * Q + ui * a3i - p1i * Q - a1i = 0
(ui * p3i - p1i) * Q = a1i - a3i * ui
Similarly for vi:
(vi * p3i - p2i) * Q = a2i - a3i * vi
And this holds for i = 1..n. We can write this in matrix form:
G * Q = b
G = [[u1 * p31 - p11],
[v1 * p31 - p21],
[u2 * p32 - p12],
[v2 * p32 - p22],
...
[un * p3n - p1n],
[vn * p3n - p2n]]
b = [[a11 - a31 * u1],
[a21 - a31 * v1],
[a12 - a32 * u2],
[a22 - a32 * v2],
...
[a1n - a3n * un],
[a2n - a3n * vn]]
Since G and b are known from the Pi matrices and the image points [ui, vi], we can compute the pseudoinverse of G (call it G_) and obtain:
Q = G_ * b
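Here is a minimal sketch of that computation in Python/NumPy (the 3x4 projection matrices and pixel coordinates are assumed given; np.linalg.lstsq gives the least-squares solution, which is what applying the pseudoinverse G_ computes):
import numpy as np

def triangulate(Ps, uvs):
    # Ps:  list of 3x4 projection matrices Pi = Ki * [Ri|ti]
    # uvs: list of (ui, vi) pixel coordinates of the same object point
    G, b = [], []
    for P, (u, v) in zip(Ps, uvs):
        p1, p2, p3 = P[0, :3], P[1, :3], P[2, :3]
        a1, a2, a3 = P[0, 3], P[1, 3], P[2, 3]
        G.append(u * p3 - p1); b.append(a1 - a3 * u)
        G.append(v * p3 - p2); b.append(a2 - a3 * v)
    Q, *_ = np.linalg.lstsq(np.array(G), np.array(b), rcond=None)
    return Q  # [x, y, z]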

Implementing gradient descent for multiple variables in Octave using "sum"

I'm doing Andrew Ng's course on Machine Learning and I'm trying to wrap my head around the vectorised implementation of gradient descent for multiple variables which is an optional exercise in the course.
This is the algorithm in question (taken from here):
I just cannot do this in Octave using sum, though; I'm not sure how to multiply the sum of the hypothesis of x(i) - y(i) by all the variables xj(i). I tried different iterations of the following code, but to no avail (either the dimensions are not right or the answer is wrong):
theta = theta - alpha/m * sum(X * theta - y) * X;
The correct answer, however, is entirely non-obvious (to a linear algebra beginner like me anyway, from here):
theta = theta - (alpha/m * (X * theta-y)' * X)';
Is there a rule of thumb for cases where sum is involved that governs transformations like the above?
And if so, is there the opposite version of the above (i.e. going from a sum-based solution to a general multiplication one)? I was able to come up with a correct implementation using sum for gradient descent for a single variable (albeit not a very elegant one):
temp0 = theta(1) - (alpha/m * sum(X * theta - y));
temp1 = theta(2) - (alpha/m * sum((X * theta - y)' * X(:, 2)));
theta(1) = temp0;
theta(2) = temp1;
Please note that this only concerns vectorised implementations and although there are several questions on SO as to how this is done, my question is primarily concerned with the implementation of the algorithm in Octave using sum.
The general "rule of thumb" is as follows: if you encounter something of the form
SUM_i f(x_i, y_i, ...) g(a_i, b_i, ...)
then you can easily vectorize it (and this is what is done in the above) through
f(x, y, ...)' * g(a, b, ...)
This is just a typical dot product, which in mathematics (in a finite-dimensional Euclidean space) looks like
<A, B> = SUM_i A_i B_i = A'B
thus
(X * theta - y)' * X
is just
<X * theta - y, X> = <H_theta(X) - y, X> = SUM_i (H_theta(X_i) - y_i) X_i
As you can see, this works both ways, since it is just the mathematical definition of the dot product.
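A quick way to convince yourself of the equivalence is a toy sketch in Python/NumPy (hypothetical vectors):
import numpy as np
f = np.array([1.0, 2.0, 3.0])
g = np.array([4.0, 5.0, 6.0])
# SUM_i f_i * g_i is exactly the dot product f' * g
assert np.isclose(np.sum(f * g), f @ g)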
Referring to this part of your question specifically - "I'm not sure how to multiply the sum of the hypothesis of x(i) - y(i) by the all variables xj(i)."
In Octave you can multiply xj(i) by all the predictions using the element-wise "." operator, so it can be written as:
m = size(X, 1);
predictions = X * theta;
sqrErrors = (predictions-y).^2;
J = 1 / (2*m) * sum(sqrErrors);
The vector multiplication automatically includes calculating the sum of the products, so you don't have to specify the sum() function. By using the sum() function you are converting a vector into a scalar, which is not what you want here.
You actually don't want to use summation here, because what you are trying to calculate are the individual values for each theta, not the overall cost J. Since you do this in one line of code, if you sum it up you end up with a single value (the sum over all thetas).
Summation was correct, though unnecessary, when you computed the values of theta one by one in the previous exercise. This works just the same:
temp0 = theta(1) - (alpha/m * (X * theta - y)' * X(:, 1));
temp1 = theta(2) - (alpha/m * (X * theta - y)' * X(:, 2));
theta(1) = temp0;
theta(2) = temp1;
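For comparison, here is the fully vectorized update as a sketch in Python/NumPy (the names mirror the Octave ones; X is assumed to include the leading column of ones):
import numpy as np

def gradient_descent(X, y, alpha=0.01, iters=1500):
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        # (X @ theta - y) @ X computes SUM_i (h(x_i) - y_i) * x_ij
        # for every j at once, so no explicit sum() is needed
        theta -= (alpha / m) * (X @ theta - y) @ X
    return theta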

SQLite query with location fields inside of region

I have a database with two columns, latitude and longitude.
I wonder if it's possible, in the SQL query itself, to retrieve only the records within a defined region around a location passed as a parameter.
For example, let's say you have the following records, which are all from the same location; we're just using them for testing purposes.
Now I have the SQL query:
[self.fmQueue inDatabase:^(FMDatabase *db)
{
    CLLocationCoordinate2D center = CLLocationCoordinate2DMake(latitude, longitude);
    CLCircularRegion *region = [[CLCircularRegion alloc] initWithCenter:center radius:400.0f identifier:@"identifier"];
    NSString* queryCommand = [NSString stringWithFormat:@"SELECT * FROM %@ WHERE ?EXPRESSION_ABOUT_REGION?", readingsTable];
    FMResultSet* result = [db executeQuery:queryCommand];
    if (completionblock)
    {
        completionblock([self readingsArrayFromResultSet:result]);
    }
}];
I could just retrieve all records and then test each one against a constructed CLLocationCoordinate2D with [region containsCoordinate:...], but that seems very inefficient.
I'm looking for the most performant and appropriate way to retrieve the location records in the desired region within the SQL query itself.
Issues with the Previous Approximation
The previous answer here models the Earth as a Cartesian plane, with latitude and longitude as x and y, and then draws either a circle or a rectangle around the points (the actual shape does not matter; both methods have the same properties).
This can be fine if your constraints are really loose, and they may well be in your particular case. To make this answer more useful to everybody, I've answered the problem assuming your constraints are not quite as loose. For mobile geolocation apps in general, this approximation causes problems; maybe you've seen an app that got it wrong, where the resulting distances don't make any sense. There are several flaws in the previous approach:
Bad approximation: The scale of the Earth is big, so simple approximations can be off on the order of miles. This is bad if the user is in a car and really bad if the user is on foot. Simple tricks with degrees are not an improvement over circles or squares: degrees of latitude and longitude do not correspond to uniform distances, and you need a full formula like Haversine to get accurate data.
Excludes good points: Unless you severely expand the shape, your sample will exclude valid points.
Includes bad points: And the more you expand the shape to capture excluded valid points, the more you capture invalid points.
Broken sort order: Without the real calculation, you won't be able to correctly sort the output by distance. This means any ordered list you produce will be way off. Usually you want the points in order of closeness.
Two calculations: If you go back and fix it, you're doing two computations.
Single-point Haversine in C/Objective-C
The selection code is already in the other answer. This is the recomputation you referenced, which you'll have to perform afterwards for each point in Objective-C or C:
/////////////////////////////////////
//Fill these in or turn them into function arguments
float lat1=center_point.latitude;
float lon1=center_point.longitude;
float lat2=argument_point.latitude;
float lon2=argument_point.longitude;
/////////////////////////////////////
//Conversion factor, degrees to radians
float PI=3.14159265358979;
float f=PI/180;
lat1*=f; lon1*=f; lat2*=f; lon2*=f;
//Haversine Formula (from R.W. Sinnott, "Virtues of the Haversine", Sky and Telescope, vol. 68, no. 2, 1984, p. 159):
float dlon = lon2 - lon1;
float dlat = lat2 - lat1;
float a = pow((sin(dlat/2)),2) + cos(lat1) * cos(lat2) * pow((sin(dlon/2)),2);
float c = 2 * asin(MIN(1,sqrt(a)));
float R = 6378.137;//km Radius of Earth
float d = R * c;
d *= 0.621371192; //optional: km to miles [statute] conversion factor
//NSString conversion?
//if (d >= .23) return [NSString stringWithFormat:@"%0.1f m",d]; //.23m ~ 400 yds
//d *= 5280; //miles to feet conversion factor
//d /= 3; //feet to yards
//int y=(int)d;
//return [NSString stringWithFormat:@"%d yds",y];
return d;
That should be all the tools you need to complete your task as discussed.
Haversine in SQLite
I would like to show you a direct SQLite-only solution, but I was never able to get Haversine to run satisfactorily directly inside SQLite. You don't get a square root in SQLite. You don't get a pow() function, though you can multiply an argument by itself. You don't get sin, cos, or sinh. There are extensions that add some of these features, but I don't know how well-supported they are compared to base SQLite. Even with them it's going to be too slow.
People seem to recommend adding columns of pre-computed sines. That's fine as long as you don't need the sine of a difference over a whole column; otherwise you're writing a new column to the table for every new calculation, which is terrible. At any rate, I'd like to show you a comparison of how slow SQLite is at computing the Haversine, but I can't get it to compute it at all. (I think my memory of SQLite being slow at this is actually a memory of MySQL on the server being slow.)
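If you can register host-language functions, one workaround is SQLite's user-defined function hook. Here is a sketch in Python, where sqlite3.Connection.create_function is standard library (the table and column names are hypothetical; the C API equivalent is sqlite3_create_function):
import math
import sqlite3

def haversine_km(lat1, lon1, lat2, lon2):
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 6378.137 * 2 * math.asin(min(1, math.sqrt(a)))

conn = sqlite3.connect("readings.db")
conn.create_function("haversine_km", 4, haversine_km)
lat, lon, radius_km = 37.8, -122.4, 0.4  # hypothetical query
rows = conn.execute(
    "SELECT * FROM readings "
    "WHERE haversine_km(latitude, longitude, ?, ?) <= ?",
    (lat, lon, radius_km)).fetchall()
Note this still scans every row; pairing it with a BETWEEN prefilter (as in the other answer) keeps an index usable.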
All-points Solution in Kerf
The preceding discussion I hope is a close-to-exhaustive look at what you can do with the standard tools.
The good news is if you do it right this calculation is fast on the phone. I built a thing for you that I think solves your problem in a better way. If you are willing to use another tool, in Kerf this problem is easy. I went back and committed to the repo vectorized operations for trigonometric functions so that the calculations would be fast. My iPhone 5 will do 10,000 points in 20 milliseconds, and 100,000 points in 150 milliseconds. You can do a million points in 1.5 seconds, but at that point you'd need a throbber. Disclosure as per the rules: I built it.
From the Kerf REPL:
//ARTIFICIAL POINT GENERATION ///////////////////
n: 10**4
point_lat: 80 + rand(40.0)
point_lon: 80 + rand(40.0)
mytable: {{lats: 60 + rand(n, 60.0), lons: 60 + rand(n, 60.0)}}
lats : mytable.lats
lons : mytable.lons
/////////////////////////////////////
//COMPUTATION////////////////////////
dlon: lons - point_lon
dlat: lats - point_lat
distances_km: (6378.137 * 2) * asin(mins(1,sqrt(pow(sin(dlat/2),2) + cos(point_lat) * cos(lats) * pow(sin(dlon/2) ,2))))
//distances_miles: 0.621371192 * distances_km //km to miles [statute] conversion
//sort_order: ascend distances_km
Or via the Kerf iOS SDK. Removing the semicolon at the end of a statement will allow you to log it as JSON to the terminal.
KSKerfSDK *kerf = [KSKerfSDK new];
kerf.showTimingEnabled = YES;
//Sample Data Generation
[kerf jsonObjectFromCall:@"n: 10**4;"];
[kerf jsonObjectFromCall:@"point_lat: 80 + rand(40.0);"];
[kerf jsonObjectFromCall:@"point_lon: 80 + rand(40.0);"];
[kerf jsonObjectFromCall:@"mytable: {{lats: 60 + rand(n, 60.0), lons: 60 + rand(n, 60.0)}};"];
[kerf jsonObjectFromCall:@"lats : mytable.lats;"];
[kerf jsonObjectFromCall:@"lons : mytable.lons;"];
//Computation
[kerf jsonObjectFromCall:@"dlon: lons - point_lon;"];
[kerf jsonObjectFromCall:@"dlat: lats - point_lat;"];
NSLog(@"%@", [kerf jsonObjectFromCall:@"distances_km: (6378.137 * 2) * asin(mins(1,sqrt(pow(sin(dlat/2),2) + cos(point_lat) * cos(lats) * pow(sin(dlon/2) ,2)))); "]);
To test whether points (x,y) are inside a circle with center (cx,cy) and radius r, use the equation (x-cx)² + (y-cy)² <= r².
This does not correspond to an exact circle, because a degree of longitude and a degree of latitude do not cover the same distance on the Earth's surface, but it is close enough.
In SQL:
... WHERE (longitude - :lon) * (longitude - :lon) +
(latitude - :lat) * (latitude - :lat) <= :r * :r
If you use a rectangle instead, you can use simpler expressions that have a chance of being optimized with an index:
... WHERE longitude BETWEEN :XMin AND :XMax
AND latitude BETWEEN :YMin AND :YMax

Simulate and present normal distribution

My task is to compare different methods of simulating the normal distribution. For example, I use the following code to generate 2 vectors of 1000 values each (Box-Muller method):
k=1;
mu=0;
N = 1000;
alpha = rand(1, N);
beta = rand(1, N);
val1 = sqrt(-2 * log(alpha)) .* sin(2 * pi * beta);
val2 = sqrt(-2 * log(alpha)) .* cos(2 * pi * beta);
hist([val1,val2]);
hold on;
%Now I want to make normal distr pdf over hist to see difference
[f,x] = ecdf(mu+sigma*[val1,val2]);
p = normpdf(x,mu, sigma);
plot(x,p*N,'r');
However, it looks very ugly: I can't distinguish val1 from val2, and my pdf doesn't fit the histogram well. I think I'm doing something wrong with this pdf, but I don't know what. I found different code on the Internet:
r = rand(1000,2); % 2 cols of uniform rand
%Box-Muller
n = sqrt(-2*log(r(:,1)))*[1,1].*[cos(2*pi*r(:,2)), sin(2*pi*r(:,2))];
hist(n) % plot two histograms
It looks better, but I don't know how to plot the normal distribution pdf over it; the method with ecdf causes an error.
I'm rather new to Matlab and sometimes I make simple mistakes (like with vector dimensions), but for now I can barely see them.
Can someone help me with the above, or propose another way to simulate normal random variables and compare against it (with the B-M method or another, just not so complicated)?
I think your plots have different scales; corrected code would look like this:
clear all;
sigma=1; mu=0; N = 1000;
alpha = rand(1, N); beta = rand(1, N);
val1 = sqrt(-2 * log(alpha)) .* sin(2 * pi * beta);
val2 = sqrt(-2 * log(alpha)) .* cos(2 * pi * beta);
vals = [val1,val2];
Nbins = 50; [h,hx] = hist(vals,Nbins);
bar(hx,h*0.5/(hx(2)-hx(1)))
hold on;
%Now I want to make normal distr pdf over hist to see difference
[f,x] = ecdf(mu+sigma*vals);
p = normpdf(x,mu, sigma);
plot(x,p*N,'r');
As mentioned in the comments, quantitative comparison of the distributions requires performing statistical tests (e.g. goodness of fit http://en.wikipedia.org/wiki/Goodness_of_fit)
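For reference, here is the same experiment as a sketch in Python (NumPy/Matplotlib), normalizing the histogram to a density so the N(0,1) pdf can be overlaid directly:
import numpy as np
import matplotlib.pyplot as plt

N = 1000
alpha, beta = np.random.rand(N), np.random.rand(N)
# Box-Muller: two independent standard normal samples per uniform pair
val1 = np.sqrt(-2 * np.log(alpha)) * np.sin(2 * np.pi * beta)
val2 = np.sqrt(-2 * np.log(alpha)) * np.cos(2 * np.pi * beta)
vals = np.concatenate([val1, val2])

plt.hist(vals, bins=50, density=True, alpha=0.5)  # density, not raw counts
x = np.linspace(-4, 4, 200)
plt.plot(x, np.exp(-x**2 / 2) / np.sqrt(2 * np.pi), 'r')  # N(0,1) pdf
plt.show()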

Madgwick's sensor fusion algorithm on iOS

I'm trying to run Madgwick's sensor fusion algorithm on iOS. Since the code is open source, I have already included it in my project and call the methods with the provided sensor values.
But it seems that the algorithm expects the sensor measurements in a different coordinate system. (The question included a picture of the two coordinate systems: Apple's CoreMotion system on the right, Madgwick's on the left.) Both systems follow the right-hand rule.
To me it seems like there is a 90 degree rotation around the z axis, but applying that didn't work. I also tried flipping the x and y axes (and inverting z), as suggested by other Stack Overflow posts for Windows Phone, but that didn't work either. So, do you have a hint?
It would be perfect if Madgwick's algorithm's output could be in the same system as the CoreMotion output (CMAttitudeReferenceFrameXMagneticNorthZVertical).
Furthermore, I'm looking for a good working value for betaDef on the iPhone. betaDef acts as a kind of proportional gain and is currently set to 0.1f.
Any help on how to achieve the goal would be appreciated.
I'm not sure how to write this in Objective-C, but here's how I accomplished the coordinate transformations in vanilla C. I also wanted to rotate the orientation so that +y is north; that rotation is included in the method below as well.
This method expects a 4 element quaternion in the form of wxyz, and returns a translated quaternion in the same format:
// Quaternion multiplication operator. Expects its 4-element arrays in wxyz order.
void quatMult(float *a, float *b, float *ret);

void madgeq_to_openglq(float *fMadgQ, float *fRetQ) {
    float fTmpQ[4];
    // Rotate around the Z-axis by 90 degrees:
    float fXYRotationQ[4] = { sqrt(0.5), 0, 0, -1.0*sqrt(0.5) };
    // Invert the rotation vector components to accommodate handedness issues:
    fTmpQ[0] = fMadgQ[0];
    fTmpQ[1] = fMadgQ[1] * -1.0f;
    fTmpQ[2] = fMadgQ[2];
    fTmpQ[3] = fMadgQ[3] * -1.0f;
    // And then store the translated rotation into fRetQ:
    quatMult((float *) &fTmpQ, (float *) &fXYRotationQ, fRetQ);
}

void quatMult(float *a, float *b, float *ret) {
    ret[0] = (b[0] * a[0]) - (b[1] * a[1]) - (b[2] * a[2]) - (b[3] * a[3]);
    ret[1] = (b[0] * a[1]) + (b[1] * a[0]) + (b[2] * a[3]) - (b[3] * a[2]);
    ret[2] = (b[0] * a[2]) + (b[2] * a[0]) + (b[3] * a[1]) - (b[1] * a[3]);
    ret[3] = (b[0] * a[3]) + (b[3] * a[0]) + (b[1] * a[2]) - (b[2] * a[1]);
    return;
}
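If you want to sanity-check the rotation quaternion, here is a small standalone sketch in Python/NumPy using the standard Hamilton product in wxyz order (note quatMult above may order its operands differently); it confirms fXYRotationQ is a 90-degree rotation about the z axis:
import numpy as np

def quat_mult(a, b):
    # Standard Hamilton product, wxyz order
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 + y1*w2 + z1*x2 - x1*z2,
                     w1*z2 + z1*w2 + x1*y2 - y1*x2])

def rotate(q, v):
    # Rotate vector v by unit quaternion q: q * (0, v) * conj(q)
    conj = q * np.array([1.0, -1.0, -1.0, -1.0])
    return quat_mult(quat_mult(q, np.array([0.0, *v])), conj)[1:]

q_z = np.array([np.sqrt(0.5), 0.0, 0.0, -np.sqrt(0.5)])
print(rotate(q_z, [1.0, 0.0, 0.0]))  # ~[0, -1, 0]: the x axis maps to -y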
Hope that helps!
