How to construct an R-tree from given data points - r-tree

I need to construct an R-tree from a given set of data points. I have searched for R-tree implementations, but every one I found builds the tree from rectangle coordinates given as input. I need to build the tree from the data points themselves (they can be 1-dimensional). The code should take care of creating the rectangles that enclose these data points and then construct the R-tree.

Use MBRs (minimum bounding rectangles) with min = max = coordinate. They all do it this way. Good implementations will, however, exploit the fact that a point needs only half the coordinates of a rectangle, so leaves that hold point data can have approximately twice the capacity of directory nodes.
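As a minimal sketch of the min = max idea (in Python; the names `points_to_mbrs` and `enclosing_mbr` are hypothetical and not taken from any R-tree library), each point becomes a box whose two corners coincide, and a parent node stores the smallest box covering its children. This works for any number of dimensions, including one:

```python
from typing import List, Sequence, Tuple

# A box is (min corner, max corner); each corner is a tuple of coordinates.
Box = Tuple[Tuple[float, ...], Tuple[float, ...]]


def points_to_mbrs(points: Sequence[Sequence[float]]) -> List[Box]:
    """Wrap each point in a zero-extent bounding box (min == max)."""
    boxes = []
    for p in points:
        corner = tuple(float(c) for c in p)
        boxes.append((corner, corner))
    return boxes


def enclosing_mbr(boxes: Sequence[Box]) -> Box:
    """Smallest box covering all given boxes, as a directory node would store."""
    dims = len(boxes[0][0])
    lo = tuple(min(b[0][d] for b in boxes) for d in range(dims))
    hi = tuple(max(b[1][d] for b in boxes) for d in range(dims))
    return lo, hi
```

The resulting boxes can be fed to any implementation that only accepts rectangles as input.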

If you're looking for a C++ implementation, the one contained in Boost.Geometry currently (Boost 1.57) is able to store Points, Boxes and Segments. The obvious advantage is that the data in the leaves of the tree is not duplicated, which means that less memory is used, caching is better, etc. The usage looks like this:
#include <boost/geometry.hpp>
#include <boost/geometry/geometries/geometries.hpp>
#include <boost/geometry/index/rtree.hpp>
#include <iterator>
#include <vector>
namespace bg = boost::geometry;
namespace bgi = boost::geometry::index;
int main()
{
typedef bg::model::point<float, 2, bg::cs::cartesian> point;
typedef bg::model::box<point> box;
// a container of points
std::vector<point> points;
// create the rtree
bgi::rtree< point, bgi::linear<16> > rtree(points.begin(), points.end());
// insert some additional points
rtree.insert(point(/*...*/));
rtree.insert(point(/*...*/));
// find points intersecting a box
std::vector<point> query_result;
rtree.query(bgi::intersects(box(/*...*/)), std::back_inserter(query_result));
// do something with the result
}

I guess that using an R-tree to store points is a misuse. Although this kind of structure is intended for spatial data, after some research I found that it is best suited for storing regions with non-zero area (the R in the name stands for Region or Rectangle). Creating a simple table with a good index should offer better performance both for updating and for searching data. Consider my example below:
CREATE TABLE locations (id, latitude, longitude);
CREATE INDEX idx_locations ON locations (latitude, longitude);
is preferable over
CREATE VIRTUAL TABLE locations USING rtree( id, minLatitude, maxLatitude, minLongitude, maxLongitude);
if you are just planning to repeat minLatitude as maxLatitude and minLongitude as maxLongitude for every row so as to represent points and not rectangles. Although the latter approach will work as expected, R-trees are suited to indexing rectangular areas, and using them to store points is a misuse with worse performance. Prefer a compound index as above.
Further reading: http://www.deepdyve.com/lp/acm/r-trees-a-dynamic-index-structure-for-spatial-searching-ZH0iLI4kb0?key=acm
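To show what a box query against the compound-index table looks like, here is a small sketch using Python's built-in sqlite3 module. The column types, the sample rows and the helper name `points_in_box` are assumptions added for illustration. It only demonstrates the shape of the query and makes no claim about how its speed compares with an R-tree on real data:

```python
import sqlite3

# In-memory database with the plain table and compound index from the answer.
# The column types are an assumption; the original statement omits them.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE locations (id INTEGER PRIMARY KEY, latitude REAL, longitude REAL)"
)
conn.execute("CREATE INDEX idx_locations ON locations (latitude, longitude)")
conn.executemany(
    "INSERT INTO locations (id, latitude, longitude) VALUES (?, ?, ?)",
    [(1, 10.0, 20.0), (2, 10.5, 20.5), (3, 50.0, 60.0)],  # made-up sample points
)


def points_in_box(min_lat, max_lat, min_lon, max_lon):
    """Return the ids of all points inside the given box, ordered by id."""
    rows = conn.execute(
        "SELECT id FROM locations "
        "WHERE latitude BETWEEN ? AND ? AND longitude BETWEEN ? AND ? "
        "ORDER BY id",
        (min_lat, max_lat, min_lon, max_lon),
    )
    return [row[0] for row in rows]
```

Calling `points_in_box(9.0, 11.0, 19.0, 21.0)` selects the two sample points near (10, 20).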


iOS Import .obj file to Model I/O without duplicating vertices

I'm trying to import a .obj file to use in Scene Kit using the Model I/O framework. I initially used the simple MDLAsset initWithURL: function, but after transferring the mesh to a SCNGeometry, I realized this function was triangulating the mesh, such that each face had 3 unique vertices, and there were separate vertices at the same location for border faces. This was causing some major problems with my other functions, so I tried to fix it by instead using the MDLAsset initWithURL:vertexDescriptor:bufferAllocator:preserveTopology function with preserveTopology set to YES and with the descriptor/allocator set to the default with nil. Preserving topology fixed my problem of duplicated vertices, so the faces/edges were all good, but in the process I lost the normals data.
By lost the normals, I don't mean multiple indexing, I mean after setting preserveTopology to YES, the buffer did not contain any normals values at all. Whereas before it was v1/n1/v2/n2... and the stride was 24 bytes (3 dimensions *4 bytes/float * 2 attributes), now the first half of the buffer is v1/v2/... with a stride of 12 and the entire 2nd half of the buffer is just 0.0 floats.
Also something weird with this: when you look at the SCNGeometrySources of the Geometry, there are 2 sources, 1 with semantic kGeometrySourceSemanticVertex, and 1 with semantic kGeometrySourceSemanticNormal. You would think that the semantic vertex source would contain the position data, and the semantic normal source would contain the normal data. However, that is not the case. No matter what you set preserveTopology to, they are buffers sized to contain both position and normal data, with identical values. So when I said before there was no normal data, I mean both of these buffers, semantic vertex AND semantic normal, went from being v1/n1/v2/n2... to v1/v2/.../(0.0, 0.0, 0.0)/(0.0, 0.0, 0.0)/... I went into the mdlmesh's buffer (before the transfer to scene kit) and found the same problem, so the problem must be with the initWithURL, not with the model i/o to scenekit bridge.
So I figured there must be something wrong with the default vertex descriptor and buffer allocator (since I was using nil) and went about trying to create my own that matched these 2 possible data formats. Alas after much trying I was unable to get something that worked.
Any ideas on how I should do this? How to give MDLAsset the proper vertexDescriptor and bufferAllocator (I feel like nil should be ok here) for importing a .obj file? Thanks
An obj file with vertices and normals has vertices, indicated by v lines, normals, indicated by vn lines, and faces, indicated by f lines.
The v and vn lines will just be the floating point values you expect, and the f line will be of the form -
f v0//n0 v1//n1 etc
Since OpenGL and Metal don't allow multiple indexing, you'll see the first effect of vertices being duplicated. For example,
f 0//0 1//2 2//0
can't work as a vertex buffer because it would require different indices per vertex. So typical OBJ parsers have to create new vertices that allow the face to become
f 0//0 1//1 2//2
The preserve topology option doesn't help you. It preserves the connectivity and shape of the mesh (no triangulation occurs, shared edges remain shared) but it still enforces a single index per vertex component.
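The duplication described above can be sketched as follows. This is only an illustration in Python of how a typical parser might collapse (position index, normal index) pairs into single indices. It is not Model I/O's actual implementation, and the function name `unify_indices` is hypothetical:

```python
def unify_indices(faces):
    """Convert faces of (position_index, normal_index) pairs to single indexing.

    Returns (vertex_pairs, new_faces): vertex_pairs[k] is the
    (position, normal) pair that output vertex k must carry, and
    new_faces holds one index into vertex_pairs per face corner.
    """
    pair_to_index = {}
    vertex_pairs = []
    new_faces = []
    for face in faces:
        new_face = []
        for pair in face:
            # A pair seen for the first time becomes a new output vertex,
            # even if its position is already used with a different normal.
            if pair not in pair_to_index:
                pair_to_index[pair] = len(vertex_pairs)
                vertex_pairs.append(pair)
            new_face.append(pair_to_index[pair])
        new_faces.append(new_face)
    return vertex_pairs, new_faces
```

When two faces share positions but use different normals (a hard edge), every shared position yields two output vertices, which is the duplication seen at border faces.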
One solution would be to make sure that your tool that is outputting the OBJ files uses single indexing during export, if that is an option.
Another option, though it won't solve the problem immediately, would be to file a request that multiple indexing be supported at the Model I/O level. SceneKit would still have to index uniquely because it has to be able to render.
Another option would be to use a format like PLY that doesn't have multiple indexing.

Difficulty counting cells due to clustering and pixel value cut off

EDIT:
I have continued working on my problem among other things and have made significant progress. Using one of Dr. Ashby's macros provided on the ImageJ wiki, and using some of my own makeshift code, I can now do batch processing of images taken of Hoechst, Calcein AM, and Ethidium Homodimer stains and get decent recognition of objects. Reducing exposure time and the levels of stain used (specifically calcein AM) has helped with the pixel value cut-offs I was dealing with earlier. The macro still has problems with distinguishing clumped cells from one another, though. To address this issue I want to implement a command in my macro that divides clusters of cells that it identifies as one cell, based on the average size of our cells. The only problem is that in all my reading I haven't seen anything that mentions this. Does anyone have any thoughts on how I could implement this code? I have copied the macro below.
//get appropriate directories from user
dir1 = getDirectory("Choose Source Directory ");
dir2 = getDirectory("Choose Destination directory");
list = getFileList(dir1);
//give user an opportunity to adjust default parameters to better fit their application
Dialog.create("Adjust for objective magnification");
Dialog.addNumber("Objective Magnification (use 10 if unknown)", 10);
Dialog.addMessage("\tIf needed particle size limits can be adjusted below \nLeave mag. at 10 if customizing particle size limits\n");
Dialog.addNumber("Minimum particle size (pixels^2)",420);
Dialog.addNumber("Maximum particle size (pixels^2)",1600);
Dialog.addMessage("\tIn the following dialogs select \n first the Source Directory, \nthen a Destination directory for Results");
Dialog.show();
//Assigning the entered values to variables
magnification=Dialog.getNumber();
userMin=Dialog.getNumber();
userMax=Dialog.getNumber();
sMin=magnification*magnification/100*userMin;
sMax=magnification*magnification/100*userMax;
setBatchMode(true);
for (i=0; i<list.length; i++){
//print (list[i]);
open(dir1+list[i]);
name=File.nameWithoutExtension;
//Prepare the image by removing any scale and making 8-bit
run("Set Scale...", "distance=0 known=0 pixel=1 unit=pixel");
run("8-bit");
saveAs("Tiff", dir2+i+" Original "+name);//Saving with this naming scheme is required for the MeLast macro to function
//run("Brightness/Contrast...");
setMinAndMax(50, 255);
setOption("BlackBackground", false);
run("Make Binary", "method=Yen background=Light calculate black");
run("Watershed", "stack");
//Analyze particles
run("Analyze Particles...", "size="+sMin+"-"+sMax+" circularity=0.50-1.00 show=[Count Masks] display exclude include summarize");
//Save the masks file
saveAs("Tiff", dir2+i+" CountMask "+name);//Saving with this naming scheme is required for the MeLast macro to function
close();
//Save the thresholded image
saveAs("Tiff", dir2+i+" Thresholded "+name);//Saving with this naming scheme is required for the MeLast macro to function
}
//Save the results
selectWindow("Results");
saveAs("Results", dir2+"ZZ Results.xls");
//Save the summary
selectWindow("Summary");
saveAs("Text", dir2+"Z Summary.txt");
You need to find those clusters and analyze each one to estimate how many cells might belong to it, using spatial information about the cells and other information specific to your problem domain. I believe that's a common image analysis task.
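One simple way to make that estimate, in line with the question's idea of dividing by the average cell size, is sketched below in Python. It assumes you can export every particle area, including particles above the maximum size limit; the function name `estimate_cell_counts` and the rule "round area / mean single-cell area" are illustrative assumptions, not an established ImageJ feature:

```python
def estimate_cell_counts(areas, min_area, max_area):
    """Estimate how many cells each particle contains from its area.

    Particles within [min_area, max_area] count as single cells, and their
    mean area is used to split larger particles into an estimated count.
    Particles below min_area are treated as debris (0 cells).
    """
    singles = [a for a in areas if min_area <= a <= max_area]
    if not singles:
        raise ValueError("no single-cell particles to calibrate against")
    mean_single = sum(singles) / len(singles)

    counts = []
    for a in areas:
        if a < min_area:
            counts.append(0)
        elif a <= max_area:
            counts.append(1)
        else:
            # A cluster holds at least two cells by definition here.
            counts.append(max(2, round(a / mean_single)))
    return counts
```

With the macro's default limits of 420 and 1600 pixels^2, particle areas of 500, 700, 1800 and 100 would be counted as 1, 1, 3 and 0 cells. This ignores overlap between cells, so it tends to undercount tightly packed clusters.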
As for cut-off pixel values, I guess you can consider the cut-off pixels as censored data. But I am not sure how meaningful it would be for 8 bit depth images.
There is another free, open-source program called CellProfiler (http://www.cellprofiler.org) that has some more specialized methods for separating cells -- more advanced than the standard watershed. See, for example, part of the manual here: http://www.cellprofiler.org/CPmanual/IdentifyPrimaryObjects.html.
Perhaps CellProfiler can do the job, or point you to the right algorithms to bring into the ImageJ macro.

Efficient matrix copying in OpenCV

I have no idea how to implement this matrix operation efficiently in OpenCV.
I have a binary Mat nz(150,600) with 0 and 1 elements.
I have a Mat mk(150,600) with double values.
I would like to implement the equivalent of this Matlab command:
sk = mk(nz);
That command copies into sk only those elements of mk at locations where nz is 1. The result sk is then made into a row matrix.
How can I implement this in OpenCV efficiently in terms of speed and memory?
You should take a look at Mat::copyTo and Mat::clone.
copyTo makes a copy with an optional mask whose non-zero elements indicate which matrix elements need to be copied.
mk.copyTo(sk, nz);
And if you really want a row matrix then call sk.reshape() as member sansuiso already suggested. This method ...
creates alternative matrix header for the same data, with different
number of channels and/or different number of rows.
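Note that a masked copyTo keeps the full 150x600 shape (unselected elements of a freshly allocated destination are zero), so the result is not yet the compact vector that Matlab's mk(nz) produces. As a hedged sketch of the compacting step, here is the equivalent in Python with NumPy, which is also how matrices are represented in OpenCV's Python bindings. The function name `select_masked` and the `matlab_order` flag are assumptions for illustration:

```python
import numpy as np


def select_masked(mk, nz, matlab_order=False):
    """Return the elements of mk where nz is non-zero, as a 1 x N row."""
    mask = np.asarray(nz) != 0
    values = np.asarray(mk, dtype=np.float64)
    if matlab_order:
        # MATLAB walks matrices column by column, NumPy row by row;
        # transposing both arrays reproduces MATLAB's element order.
        picked = values.T[mask.T]
    else:
        picked = values[mask]
    return picked.reshape(1, -1)
```

Boolean indexing allocates only as many elements as the mask selects, which addresses the memory side of the question.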
bkausbk gave the best answer. However, a second way around:
cv::bitwise_and(mk, mk, A, nz); // A keeps mk where nz is non-zero; nz must be an 8-bit single-channel mask
If you access A, you can copy the non-zero into a std::vector. If you want your output to be a cv::Mat instance then you have to allocate the memory first:
S=countNonZero(A); //size of the final output matrix
Now, fast element access is a topic in itself. Google it, or have a look at opencv/modules/core/src/stat.cpp, where countNonZero() is implemented, to get some ideas.
There are two steps involved in your task.
First, convert the input matrix to double:
cv::Mat binaryMat; // source matrix, filled somewhere
cv::Mat doubleMat; // target matrix (with doubles)
binaryMat.convertTo(doubleMat, CV_64F); // Perform the conversion
Then, reshape the result as a row matrix:
doubleMat = doubleMat.reshape(1, 1);
// Alternatively:
cv::Mat doubleRow = doubleMat.reshape(1, 1);
The Mat::reshape operation is efficient in the sense that the data is not copied; only the matrix header changes.
This function returns a new reference to a matrix (by creating a new header), so you should not forget to assign its result.

Check if user is near route checkpoint with GPS

Here's the situation:
I have a predetermined GPS route that the user will run. The route has some checkpoints and the user should pass near all of them (think of them as a racing game checkpoint, that prevents the user from taking shortcuts). I need to ensure that the user passes through all the checkpoints.
I want to determine an area that will be considered inside a checkpoint's radius, but I don't want it to be just a radial area, it should be an area taking into consideration the form of the path.
Didn't understand it? Neither did I. Look at this poorly drawn image to understand it better:
The black lines represent the predetermined path, the blue ball is the checkpoint and the blue polygon is the wanted area. The green line is a more precise user, and the red line is a less accurate user (a drunk guy driving maybe? lol). Both lines should be inside the polygon, but a user who skips the route entirely shouldn't.
I already saw a function somewhere here to check whether the user is inside a polygon like this, but I need to know how to calculate the polygon.
Any suggestions?
EDIT:
I'm considering using the simple distanceTo() function to just draw an imaginary circle and check whether the user is inside it. That's good because it is much simpler to implement and understand, and bad because, to make sure the most erratic user passes within the checkpoint, I would need a big radius, making the accurate user enter the checkpoint area sooner than expected.
And just so you guys understand the situation better, this is for an app that is supposed to be used in traffic (car or bus), and the checkpoints should be landmarks or spots that divide your route, for example, somewhere where a traffic jam starts or stops.
You could just check the distance between the two, assuming you know the GeoLocation of the checkpoint.
Use the distanceTo function and setup a threshold of however many meters the user needs to be from the checkpoint to continue on.
Edit
Since you want to avoid distanceTo, here is a small function I wrote a while back to check if a point is in a polygon:
public boolean PIP(Point point, List<Point> polygon){
    boolean nodepolarity=false;
    int sides = polygon.size();
    int j = sides -1;
    for(int i=0;i<sides;i++){
        // does the edge (j -> i) cross the horizontal line through the point?
        if((polygon.get(i).y<point.y && polygon.get(j).y>=point.y) ||(polygon.get(j).y<point.y && polygon.get(i).y>=point.y)){
            // cast to double so the ratio is not truncated if Point has integer fields
            if (polygon.get(i).x+(double)(point.y-polygon.get(i).y)/(polygon.get(j).y-polygon.get(i).y)*(polygon.get(j).x-polygon.get(i).x)<point.x) {
                nodepolarity=!nodepolarity;
            }
        }
        j=i;
    }
    return nodepolarity; //FALSE=OUTSIDE, TRUE=INSIDE
}
List<Point> polygon is a list of the points that make up a polygon.
This uses the Ray casting algorithm to determine how many intersections a ray makes through the polygon.
All you would need to do is create the 'boundary' around the area you need with GeoPoints being translated into pixels using the toPixels method.
Store those points into a List<> of points, and you should be all set.
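For readers who want to experiment with the ray-casting test outside Android, the same logic can be sketched in Python. This is a plain re-expression of the function above using (x, y) tuples; the name `point_in_polygon` is just for illustration:

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertex tuples in order (either winding).
    """
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does the edge (j -> i) straddle the horizontal line through y?
        if (yi < y <= yj) or (yj < y <= yi):
            # x coordinate where the edge crosses that horizontal line
            x_cross = xi + (y - yi) / (yj - yi) * (xj - xi)
            if x_cross < x:
                inside = not inside
        j = i
    return inside
```

Each crossing to the left of the point flips the result, so an odd number of crossings means the point is inside.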
Check a few algorithms for doing this at the link below:
http://geospatialpython.com/2011/01/point-in-polygon.html
I know this is an old question, but maybe it would be useful for someone.
This is a simpler method, with much less computation needed. This would not trigger the first time the user comes inside the threshold area, it only gets the closest point where the user has passed near the checkpoint AND (s)he has come close enough.
The idea is to maintain a 3 item list of distances for every checkpoint, with the last three distances in it (so it would be [d(t), d(t-1), d(t-2)]). This list should be rotated on every distance calculation.
If on any distance calculation the previous distance d(t-1) is smaller than both the current one d(t) and the preceding one d(t-2), then the moving point has passed the checkpoint. Whether this was a real pass or only a glitch can be decided by checking the actual distance d(t-1).
import java.util.*;
import java.util.Map.Entry;

private long DISTANCE_THRESHOLD = 2000;

private Checkpoint calculateCheckpoint(Map<Checkpoint, List<Double>> checkpointDistances)
{
    Map<Checkpoint, Double> candidates = new LinkedHashMap<Checkpoint, Double>();
    for (Checkpoint checkpoint: checkpointDistances.keySet())
    {
        List<Double> distances = checkpointDistances.get(checkpoint);
        if (distances == null || distances.size() < 3)
            continue;
        // d(t-1) is a local minimum and close enough to the checkpoint
        if (distances.get(0) > distances.get(1) && distances.get(1) < distances.get(2) && distances.get(1) < (DISTANCE_THRESHOLD)) //TODO: make this depend on current speed
            candidates.put(checkpoint, distances.get(1));
    }
    List<Entry<Checkpoint, Double>> list = new LinkedList<Entry<Checkpoint,Double>>(candidates.entrySet());
    Collections.sort(list, comp);
    if (list.size() > 0)
        return list.get(0).getKey();
    else
        return null;
}

Comparator<Entry<Checkpoint, Double>> comp = new Comparator<Entry<Checkpoint,Double>>()
{
    @Override
    public int compare(Entry<Checkpoint, Double> o1, Entry<Checkpoint, Double> o2)
    {
        return o1.getValue().compareTo(o2.getValue());
    }
};
The function gets one parameter - a Map<Checkpoint, List<Double>> with the checkpoints and the list of the last three distances. It outputs the closest Checkpoint passed or null (if there were none).
The DISTANCE_THRESHOLD should be chosen wisely.
The Comparator is just to be able to sort the Checkpoints based on their distance to the user to get the closest one.
Naturally this has some minor flaws, e.g. if the moving point is moving criss-cross, or if the movement error from GPS precision is comparable to the actual speed of the user, then this would give multiple pass marks, but this would affect almost any algorithm.
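The three-distance rule for a single checkpoint can also be sketched compactly in Python. The class name `CheckpointTracker` and the threshold in metres are assumptions for illustration, and the distances are expected to come from something like distanceTo():

```python
from collections import deque


class CheckpointTracker:
    """Keeps the last three distances to one checkpoint."""

    def __init__(self, threshold_m):
        self.threshold_m = threshold_m
        # Oldest first: d(t-2), d(t-1), d(t); older samples drop off.
        self.history = deque(maxlen=3)

    def update(self, distance_m):
        """Record a new distance; return True if the checkpoint was just passed."""
        self.history.append(distance_m)
        if len(self.history) < 3:
            return False
        older, middle, newest = self.history
        # Local minimum: the middle sample is closer than both neighbours,
        # and close enough to count as a real pass rather than a glitch.
        return middle < older and middle < newest and middle < self.threshold_m
```

Feeding the distances 3000, 1500, 800, 1200 with a threshold of 2000 reports a pass on the fourth sample, the first one at which 800 is known to be a local minimum.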

Mapping points from Euclidean 2-space onto a Poincaré disc

For some reason it seems that everyone writing webpages about Poincaré discs is only concerned with how to represent lines and measure distances.
I'd like to morph a collection of 2D points (as defined by x,y coordinates in the Euclidean plane) onto a Poincaré disc, but I have no idea what the algorithm is supposed to be like. At this point I don't even know if it's possible to create a mapping between Euclidean 2-space and a Poincaré disc...
Any pointers?
Goodwill,
David
You describe your data as a collection of points. But from your comments, you want to make lines in the plane still map to lines in the disk. You seem to want to preserve the "structure" of the space somehow, which is probably why you use the term "morph". I think that you want a conformal map.
There is no conformal bijection between the disk and the plane. There is such a mapping between the half-plane and the disk, and it preserves "lines", but not the kind that you want, unfortunately.
You said "I don't even know if it's possible to create a mapping" ... there are a number of mappings for you to choose from (see the Unit Disk page for an example) but there are none with all the features you seem to want.
If I understand everything correctly, the answer you got on the other forum is for the Beltrami–Klein model. Once you have that, you can get the coordinates in the Poincaré disk with
p = b / (1 + sqrt(1 - b * b))
where p is the vector of coordinates in the Poincaré disk (i.e. what you need), b is the vector in the Beltrami–Klein model (i.e. what you get from the other answer), and b * b denotes the dot product of b with itself.
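A minimal Python sketch of this conversion, together with its standard inverse b = 2p / (1 + p * p), is given below. The function names are made up for illustration, and points are assumed to be plain coordinate tuples lying strictly inside the unit disk:

```python
import math


def klein_to_poincare(b):
    """Map a point b from the Beltrami-Klein disk to the Poincare disk."""
    norm_sq = sum(c * c for c in b)  # b * b, the dot product of b with itself
    if norm_sq >= 1.0:
        raise ValueError("point must lie strictly inside the unit disk")
    scale = 1.0 / (1.0 + math.sqrt(1.0 - norm_sq))
    return tuple(c * scale for c in b)


def poincare_to_klein(p):
    """Inverse mapping: b = 2p / (1 + p * p)."""
    norm_sq = sum(c * c for c in p)
    scale = 2.0 / (1.0 + norm_sq)
    return tuple(c * scale for c in p)
```

The mapping keeps the origin fixed and pulls every other point toward the centre along its radius, so a Klein point at radius 0.5 lands at a Poincaré radius of roughly 0.268.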
