Identify most-visited places when coordinates (latitude and longitude) are given - machine-learning

I'm working on a project where the locations visited by people are captured as latitude and longitude, and I need to analyze all these coordinates to identify the most visited places.
So far I have retrieved all the coordinates visited by the people, sent the data to a database, and written them to a text file. I tried to cluster the data by reading them back from the text files. Since I'm totally new to machine learning, I'm finding it hard to figure out what exactly to do with the data.
Can anyone please help me figure out a correct approach to identify the most visited places by analyzing the coordinates I have?

As stated, there is quite a bit of missing information in this question, but I will have a go from what I understand.
I can think of two scenarios, and the approach to solving each is not something I would really consider as machine learning. The first scenario is that you have already attributed the lat/long to a definitive location e.g. “visitors of Buckingham Palace”, which would have a set lat/long coordinate associated with it. You’d then be able to generate a list of (Monument_lat, Monument_lon, weight) where weight is the number of visitors attributed to that location. Then it would simply be a case of sorting that list by weight, as has been suggested. I’m not clear on why you don’t think this is the most efficient way (list sorting is trivial and fast).
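A minimal sketch of that first scenario in Python (the place names and counts are made up for illustration):

```python
# First scenario: locations are already attributed, so the problem
# reduces to sorting a weighted list. Data here is illustrative.
visits = [
    ("Buckingham Palace", 51.5014, -0.1419, 1200),
    ("Tower Bridge",      51.5055, -0.0754, 1800),
    ("Hyde Park",         51.5073, -0.1657,  950),
]
# Sort descending by the visitor count (the last element of each tuple).
most_visited = sorted(visits, key=lambda place: place[3], reverse=True)
print(most_visited[0][0])  # -> "Tower Bridge"
```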
The other scenario involves raw lat/long data from a phone, where you might have extremely similar lat/long pairs, but not exactly the same ones. You want to group these into single locations. You could divide the region of interest into small rectangular zones, storing the lat/long corners of each zone. You then run a ray-casting algorithm to solve the point-in-polygon problem, thereby attributing each raw lat/long pair to a zone, and use the centre coordinate of each zone as the place the "weight" applies to.
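For reference, the ray-casting point-in-polygon test is short enough to write directly. This is a standard crossing-count implementation, not any particular open-source package:

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting point-in-polygon test.

    `polygon` is a list of (lat, lon) corner tuples. Cast a horizontal
    ray from the point and count edge crossings: an odd count means
    the point is inside.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Does this edge straddle the point's latitude?
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which the edge crosses this latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside
```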
I don't know what language you are using, but there is an open-source ray casting algorithm for Python. Depending on the scope of your problem, there are slight alterations you might want to make. Firstly, if you are defining locations by monument name and you don't have too many, you could go on Google Maps and define your own lat/long corners of the zones, storing them as a list. If you're not interested in classifying by monument name, you simply divide the whole area into even rectangles. If you wanted, say, 10 metre precision across an entire country, then you need layers of different-sized zones to minimise the computational effort: you might divide the country into 10x10 km squares and ray cast at that scale as a rough sorting stage, before doing another ray cast at the 10x10 m scale within the matching 10x10 km zone.
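A minimal sketch of the even-rectangle version at a single scale. It assumes your coordinates live in a file named visits.txt with one "lat,lon" pair per line (the file name and cell size are illustrative):

```python
from collections import Counter

CELL_DEG = 0.001  # ~100 m of latitude per cell; tune to the precision you need

def zone_of(lat, lon):
    """Snap a coordinate to the south-west corner of its grid cell."""
    return (round(lat // CELL_DEG * CELL_DEG, 6),
            round(lon // CELL_DEG * CELL_DEG, 6))

# visits.txt is assumed to hold one "lat,lon" pair per line.
with open("visits.txt") as f:
    coords = [tuple(map(float, line.split(","))) for line in f]

counts = Counter(zone_of(lat, lon) for lat, lon in coords)
for (zlat, zlon), weight in counts.most_common(5):
    # Report the centre of each cell along with its visit count.
    print(zlat + CELL_DEG / 2, zlon + CELL_DEG / 2, weight)
```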

Related

Contact mechanics in Drake

I have a general question regarding the accuracy of the contact mechanics in Drake. So far I have tried several different open-source robotic simulation tools. They all appear to have the same problem when simulating the contact between two meshed objects: the objects are unstable and fall off each other. E.g., in Gazebo I tried stacking two meshed objects (see https://youtu.be/_4qQh3pvAZ8) without success.
I am trying to learn assembly tasks using reinforcement learning. RL needs a lot of iterations (simulations) before it converges to a valid policy, so it has to learn in a reasonable time; it is not possible to increase the accuracy (by reducing the step size) because that increases the computation time too much. In the end the only solution was to move to Adams, an (expensive) multibody mechanics software toolbox where there is more freedom to optimize the contact between two specific objects. I also tried the simulator Klampt, where the contact is more accurate, but it also adds a layer around each object.
Today I came across Drake and saw that in the videos the contact mechanics are really accurate. But most of the objects seem to be non-meshed objects (blocks and cylinders), whose behavior is easier to approximate. So I am wondering: does Drake also exhibit inaccurate behavior, like in the video, with meshed objects stacked on top of each other? And is the simulation speed around the same as the real world?
I can't comment specifically on what may be causing your instability in other applications (e.g., Gazebo), but I can shed some light into contact stability in Drake.
Drake's default contact model is the very frequently implemented "point contact" model (discussed here). Given two bodies in contact, intersection between the representative collision geometries is detected, the penetration is characterized by a pair of points representing the deepest penetration, and the force is applied at that contact point.
For a sphere on a plane, this is perfectly sufficient, because the contact between a rigid sphere and plane is a single point. For stacking boxes, it is a poor approximation; the contact interface between two stacked boxes isn't a point, but a polygon where force would be applied across the full contact interface. Representing it as a point introduces artificial torques.
Drake has an additional contact model, one that is currently in development and incomplete. It is called "hydroelastic" contact: instead of representing contact by a measurement at a single point, it computes an entire surface of contact, distributing the contact force over that full surface. As you might imagine, this leads to far more stable contact. However, because the model is not complete, there are restrictions on how you can use it and when it will actually provide value. That said, the feature is available in Drake's public API and you are free to investigate it. A basic explanation of the characterization can be found here.
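For illustration, switching the contact model in pydrake looks roughly like the sketch below. This is a hedged sketch: the enum spelling (kHydroelasticsOnly) matches Drake around the time of writing and may have been renamed since, so check the current documentation.

```python
# Hedged sketch: asking MultibodyPlant to use the hydroelastic contact
# model in pydrake. Enum names may differ across Drake versions.
from pydrake.multibody.plant import AddMultibodyPlantSceneGraph, ContactModel
from pydrake.systems.framework import DiagramBuilder

builder = DiagramBuilder()
plant, scene_graph = AddMultibodyPlantSceneGraph(builder, time_step=1e-3)
plant.set_contact_model(ContactModel.kHydroelasticsOnly)
# ...load models, then plant.Finalize() and build the diagram as usual.
```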
Some further thoughts based on the details above:
General, non-convex meshes.
For the point contact model, non-convex meshes are not directly supported. Instead, it uses the implicit convex hull of that mesh (which can negatively impact performance). If you know the mesh to be convex, you can declare it as such and Drake can use techniques which may improve the efficiency.
The hydroelastic contact model does support non-convex meshes, but they must be for strictly "rigid" objects. And that model only computes contact between soft and rigid objects. So, if you're hoping to compute contact between two non-convex meshes, you won't be able to use this model. In its current state, you need to model things as contact between rigid meshes and soft primitives. (e.g., you can create a "soft" box to serve as a table, and place any number of rigid, non-convex meshes on it stably, but contact between those rigid objects will not be stable).
Tricks for stable contact
One "trick" for getting better stability in contact with a point contact model is to change the representation of the collision geometry. Place small spheres on the surface of the object. The goal is to ensure that when it is in contact with other objects, you will get contact at multiple points. The challenge here is placing the spheres such that meaningful contact will generally get you at least three results.

Point in polygon based search vs geo hash based search

I'm looking for some advice.
I'm developing a system with geographic triggers; these enable my device to perform certain actions depending on where it is. The triggers are contained within polygons that are stored in my database. I've explored multiple options to get this working; however, I'm not very familiar with geo-spatial systems.
An option would be to use the current location of the device and query the DB directly for all the polygons that contain that point, and thus all the triggers, since they are linked together. A potential problem with this approach, I think, is the potentially large number of polygons stored and the frequency of the queries, since this system serves multiple devices simultaneously and each one of them polls every few seconds.
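For comparison, an in-memory version of this first option using Shapely with an STRtree spatial index is sketched below (the polygons are illustrative; the same pattern maps onto a PostGIS ST_Contains query backed by a spatial index):

```python
# Hedged sketch: find every polygon containing the device's location.
# STRtree prunes by bounding box; contains() confirms the exact hit.
from shapely.geometry import Point, Polygon
from shapely.strtree import STRtree

polygons = [
    Polygon([(0, 0), (0, 1), (1, 1), (1, 0)]),   # illustrative trigger zones
    Polygon([(2, 2), (2, 3), (3, 3), (3, 2)]),
]
tree = STRtree(polygons)

device_location = Point(0.5, 0.5)
# In Shapely 2.x, query() returns candidate indices (geometries in 1.x).
hits = [polygons[i] for i in tree.query(device_location)
        if polygons[i].contains(device_location)]
print(len(hits))  # -> 1
```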
Another option I'm exploring is to encode the polygons as an array of geohashes and then attach the trigger to each one of them.
Green marks the geohashes that the trigger will be attached to; yellow marks areas that need to be recalculated at a higher precision. The idea is to encode the polygon in the most efficient way down to X precision.
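A hedged sketch of that encoding, assuming the python-geohash package (for geohash.bbox) together with Shapely; keeping a cell corresponds to the green case, and subdividing to the yellow one:

```python
# Hedged sketch: cover a polygon with geohashes, refining only the
# boundary cells to higher precision. Coordinates are (lon, lat).
import geohash  # python-geohash package (assumed)
from shapely.geometry import Polygon, box

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet

def geohash_cell(gh):
    b = geohash.bbox(gh)  # {'s': ..., 'w': ..., 'n': ..., 'e': ...}
    return box(b["w"], b["s"], b["e"], b["n"])

def cover(polygon, gh, max_precision):
    cell = geohash_cell(gh)
    if not polygon.intersects(cell):
        return []                    # fully outside: discard
    if polygon.contains(cell) or len(gh) >= max_precision:
        return [gh]                  # fully inside, or precise enough: keep
    # Boundary ("yellow") cell: subdivide into the 32 child geohashes.
    return [h for c in BASE32 for h in cover(polygon, gh + c, max_precision)]

trigger_zone = Polygon([(-0.13, 51.50), (-0.13, 51.51),
                        (-0.11, 51.51), (-0.11, 51.50)])  # illustrative
cells = [h for c in BASE32 for h in cover(trigger_zone, c, max_precision=6)]
```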
Another optimization I came up with is to only store the intersection of the polygons with roads, since these devices are only used in motor vehicles.
Doing this enables the device to work offline, performing its own encoding and lookup, with a potential disadvantage being that the device has to implement logic to stay up to date with triggers being added or removed (potentially every 24 hours).
I'm looking for the most efficient way to implement this given some constraints such as:
Potentially unreliable networks (the device has LTE connectivity)
Limited processing power; the devices for now are based on a Raspberry Pi 3 Compute Module, and they also perform other tasks such as image processing.
Limited storage, since they also store videos and images.
A potentially large number of triggers/polygons.
A potentially large number of devices.
Any thoughts are greatly appreciated.

Image analysis technique to determine approximate change in view over a short period of time?

I am working on an open source package for robot owners. I want to do a decent job of detecting when the robot is having movement problems. One of the problems the robot commonly has is that the back wheel gets "tucked underneath" in a bad way and makes it turn very slowly when on carpet. I believe that with a combination of accelerometer value inspection and (I hope) a relatively simple yet robust vision analysis technique, I will be able to tell when the robot is having this specific problem.
What I need is to be able to analyze two images, separated by about 1/2 second in time, and get a numerical value that tells roughly how close they are, but in a way that has some intelligence about the objects in the scene instead of just a simple color/hue/etc. analysis. I've heard of an algorithm called optical flow that is used in object and scene tracking, but I'm hoping I don't need something that heavyweight.
Is there a machine vision algorithm/function that can analyze two JPEGs and tell if they belong to the same scene and viewpoint, yet can also deliver a numerical, monotonically increasing value that tells me roughly how different they are? If I could get that numerical value and compare it to the number of milliseconds passed, while examining the current accelerometer activity, I believe I could detect when the robot is having the "slow turn of death" problem.
If so, please tell me the basic technique involved, and if you know of a machine vision library that implements it, which one it is.
but in a way that has some intelligence about the objects in the scene instead of just a simple color/hue/etc. analysis
What you are suggesting is a complex problem in itself, so forget about 'lightweight' solutions. You are probably going to need something like optical flow (see the sketch after the list below).
Other options I would recommend looking into are:
Vanishing-point detection and its variation from image to image. This fits your problem domain quite well (see Wikipedia).
Disparity maps: related to optical flow. They are used for stereoscopic vision, but I think you can use them for the kind of application you are looking at. Take a look at this.
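Following on the optical-flow suggestion, a hedged OpenCV sketch that reduces two frames to a single difference score (file names are illustrative):

```python
# Hedged sketch: dense (Farneback) optical flow between two frames,
# reduced to one number measuring how much the view changed.
import cv2
import numpy as np

prev = cv2.imread("frame_t0.jpg", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_t1.jpg", cv2.IMREAD_GRAYSCALE)

# Parameters: pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)

# Per-pixel motion magnitude; the mean acts as a difference score:
# near zero for a static scene, larger as the view changes faster.
magnitude, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
score = float(np.mean(magnitude))
print(score)
```

Dividing that score by the elapsed milliseconds between frames gives a rate of visual change you can compare against the accelerometer readings.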

pattern recognition in a time series

I understand that the question I am asking seems to be somewhat related to other questions which have already been asked here and here.
But I feel that this is an entirely different question. (I have also submitted this question on dsp.stackexchange.)
I have a huge (over 100K data points) time series of the position (x, y coordinates) of an element in space. This element is vibrating randomly, and both the amplitude and the frequency of the vibration are random. I want to look at events which are similar and see if there is any pattern in those events: are they periodic or related somehow?
I am working on a biological problem and have very little knowledge about signal processing. I can provide more details. Any help would be really appreciated.
One of the areas of research on time-series patterns is called Motif Detection or Discovery; some approaches use association strategies, others are probabilistic.
Some links here http://dl.acm.org/results.cfm?query=motif+discovery&Go.x=5&Go.y=12
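As one concrete starting point (my own suggestion, not from the links above), the matrix profile is a popular motif-discovery technique; a hedged sketch using the stumpy library on one coordinate of your series:

```python
# Hedged sketch: motif discovery via the matrix profile with stumpy.
# Uses the x coordinate as a 1-D series; stumpy.mstump handles the
# multi-dimensional (x, y) case. The file name is illustrative.
import numpy as np
import stumpy

x = np.loadtxt("positions.txt")[:, 0]  # assumed: column 0 holds x positions
m = 200                                # motif length in samples; tune to your data

mp = stumpy.stump(x, m)                # column 0: distances, column 1: nn indices
motif_idx = int(np.argmin(mp[:, 0]))   # best-matching pair of subsequences
neighbor_idx = int(mp[motif_idx, 1])
print(f"Motif at sample {motif_idx}, recurs at sample {neighbor_idx}")
```

Once you have motif locations, the gaps between their occurrences tell you directly whether the similar events recur periodically.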

People counting using OpenCV

I'm starting research to implement a system that must count the flow of people through a place.
The end goal is to have something like http://www.youtube.com/watch?v=u7N1MCBRdl0. I'm working with OpenCV to start creating it, and I'm reading and studying about it, but I'd like to know if someone can give me some hints about source code examples, articles, and anything else that can help me get there faster.
I started by studying the blobtrack.exe sample, but I didn't get good results.
Thanks in advance.
Blob detection is the correct way to do this, as long as you choose good threshold values and your lighting is even and consistent; but the real problem here is writing a tracking algorithm that can keep track of multiple blobs, being resistant to dropped frames. Basically you want to be able to assign persistent IDs to each blob over multiple frames, keeping in mind that due to changing lighting conditions and due to people walking very close together and/or crossing paths, the blobs may drop out for several frames, split, and/or merge.
To do this 'properly' you'd want a fuzzy ID-assignment algorithm that is resistant to dropped frames (i.e. the blob ID remains, and its motion is ideally predicted, if the blob drops out for a frame or two). You'd probably also want to keep a history of ID merges and splits, so that if two IDs merge into one, and then that one splits back into two, you can re-assign the individual merged IDs to the resulting two blobs.
In my experience the openFrameworks openCv basic example is a good starting point.
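A hedged OpenCV sketch of the blob-detection stage only (the video source is illustrative); the persistent-ID tracking described above still has to be built on top of it:

```python
# Hedged sketch: per-frame blob detection with background subtraction.
# Finds person-sized foreground blobs; tracking/counting comes later.
import cv2

cap = cv2.VideoCapture("entrance.mp4")  # illustrative video source
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    # Drop shadow pixels (value 127) and noise, then clean the mask.
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    mask = cv2.morphologyEx(
        mask, cv2.MORPH_OPEN,
        cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    blobs = [c for c in contours if cv2.contourArea(c) > 500]  # person-sized
    # ...feed the blob centroids into the ID-assignment/tracking stage here.

cap.release()
```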
I won't put this forward as the right answer; it is just an option for those who can read Portuguese or use a translator. It's my graduation project, and it explains an approach for counting people.
Limitations:
It does not behave well in environments where the background light changes a lot.
It must be configured for each location where you will use it.
Advantages:
It's fast!
I used OpenCV for the basic features, such as capturing the screen and going through the pixels, but the people-counting algorithm was done by myself.
You can check it in this paper.
Final opinion about this project: it's not ready to go live or become a product, but it works very well as a base for study.
