How do I combine GPS track data with another time-coded dataset?

I have GPS track data from a logging device, in GPX format (or any other format; conversion is easy with gpsbabel). I also have non-GPS data from another measurement device covering the same time period. However, the measurement intervals of the two devices are not synced.
Is there any software available that can combine the measurement data with the GPS data, so I can plot the measured values in a spatial context?
This would require matching measurement times, interpolating GPS trackpoints, and combining the results into a new track file.
I could start scripting all of this, but if there are existing tools that can do this, I'd be happy to know about them. I was thinking that GPSBabel might be able to do this, but I haven't found how.
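For reference, the scripting I have in mind would look roughly like this: a minimal Python sketch, assuming the GPX file and the measurement log have already been parsed into time-sorted lists (the tuple layout is invented for illustration, and plain linear interpolation is only reasonable for closely spaced trackpoints):

```python
import bisect

# Assumed inputs, both sorted by unix timestamp:
# trackpoints:  list of (time, lat, lon) from the GPX file
# measurements: list of (time, value) from the other device

def interpolate_position(trackpoints, t):
    """Linearly interpolate lat/lon at time t between the bracketing trackpoints."""
    times = [tp[0] for tp in trackpoints]
    i = bisect.bisect_left(times, t)
    if i == 0:                      # before the track starts: clamp
        return trackpoints[0][1:]
    if i == len(trackpoints):       # after the track ends: clamp
        return trackpoints[-1][1:]
    t0, lat0, lon0 = trackpoints[i - 1]
    t1, lat1, lon1 = trackpoints[i]
    f = (t - t0) / (t1 - t0)
    return lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0)

def georeference(trackpoints, measurements):
    """Attach an interpolated position to every measurement."""
    return [(t, value, *interpolate_position(trackpoints, t))
            for t, value in measurements]
```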

A simple Excel macro would do the job.

In desktop GIS software you could import the two datasets in their appropriate formats (which you haven't specified), whether shapefiles or simple tables. Then a table join can be performed on attributes. Using the measurement times as the join fields produces a table where rows whose measurement times match in both datasets are appended to one another.
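If scripting is acceptable instead of desktop GIS, a rough equivalent of that time-based join can be done with pandas; merge_asof matches each row to the nearest key rather than requiring exact equality, which suits unsynced measurement intervals (the file and column names here are assumptions):

```python
import pandas as pd

# Both tables are assumed to have a datetime column "time", sorted ascending.
track = pd.read_csv("track.csv", parse_dates=["time"])        # time, lat, lon
readings = pd.read_csv("readings.csv", parse_dates=["time"])  # time, value

# Nearest-in-time join; the tolerance drops pairs more than 5 s apart.
joined = pd.merge_asof(readings, track, on="time",
                       direction="nearest",
                       tolerance=pd.Timedelta("5s"))
joined.to_csv("joined.csv", index=False)
```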


How to deal with data historicization in a data lake vs. a data warehouse?

Historicizing data is possible (and even a core functionality) within a classic data warehouse: data is added to the warehouse over time, and it is possible to move through time over the data.
If I want to use only the data lake and still offer data historicization to business users, would this be possible? And if so, what would a possible approach look like?
Yes - you can do it. If you just do inserts of data then you will have, by default, a full history of all your data.
The possible approaches would be entirely dependent on the technology running your data lake, how you have structured the data in it, the tools your business users use to access it, and so on. So without much more information it's not possible to give a specific answer, other than the generic "yes, it is possible to hold historic data in a data lake".
Your classic data warehouse will bring data together, modelled with time series at the centre.
Data lakes hold the raw data in its original format, which typically will not be stored with time series in mind. You can store your data so that time series and historical changes can be worked out, but a data lake will be missing the pre-modelled, easily accessible time-series aspect of a data warehouse.
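As a toy illustration of the insert-only idea (all names here are invented): each record carries the timestamp it was loaded with, and an "as-of" query filters down to the latest version at or before a given point in time.

```python
from datetime import datetime

# Append-only store: records are never updated or deleted in place.
history = []  # dicts of the form {"key": ..., "value": ..., "loaded_at": datetime}

def insert(key, value, loaded_at):
    history.append({"key": key, "value": value, "loaded_at": loaded_at})

def as_of(key, when):
    """Return the latest value for `key` loaded at or before `when`."""
    versions = [r for r in history
                if r["key"] == key and r["loaded_at"] <= when]
    return max(versions, key=lambda r: r["loaded_at"])["value"] if versions else None

insert("price", 10, datetime(2020, 1, 1))
insert("price", 12, datetime(2020, 6, 1))
print(as_of("price", datetime(2020, 3, 1)))  # -> 10 (the value as of March 2020)
```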

Firebase - App developing - calculate the delta without generating a high data traffic

We are developing a social app with Firebase (swift / iOS).
We face the problem that we have two data trees and have to calculate the delta between them without generating heavy data traffic.
Example:
We have a structure cars and a structure user.
The cars structure contains 100 different vehicle models.
The user structure contains all vehicle models that the user has already driven.
We now want to implement a high-performance solution in order to determine all the vehicles that have not yet been driven by a user without downloading the whole tree structure.
The number of users and the number of vehicles are growing steadily.
Does anyone have a solution approach or idea in which direction we need to think?
I think the key to effectively using Firebase is data duplication. So if you want to display a list of cars the user has and hasn't driven, create a separate table containing only the information displayed in that list (like the path to an image, and the make and model), using unique IDs as the keys to entries in that table. You wouldn't need to know things like top speed and price until the user taps into the details, right? (I'm making some assumptions here, though.)
Then, simply get the list of unique IDs for the cars the user already has driven, and manipulate your offline model accordingly.
Right now I'm using an external server to manage data duplication; it propagates a write operation to other places in the database when necessary. I believe Ray Wenderlich has an article about this.
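To make the set-difference part concrete: once both nodes hold only IDs, the client-side work is tiny. A sketch in Python (the node names and IDs are invented; the two sets stand in for the lightweight lists fetched from Firebase):

```python
# Assumed to be fetched from two small, ID-only nodes,
# e.g. /carIndex (all model IDs) and /users/<uid>/driven (IDs already driven).
all_car_ids = {"car_001", "car_002", "car_003", "car_004"}
driven_ids = {"car_002", "car_004"}

# Cars the user has not driven yet: a plain set difference, computed locally,
# so only the two small ID lists ever cross the network.
not_driven = all_car_ids - driven_ids
print(sorted(not_driven))  # ['car_001', 'car_003']
```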

Bloomberg real-time data with lot sizes

I am trying to download real-time trading data from Bloomberg using the api.
So far I can get bid/ask/last prices successfully, but on some exchanges (like Canada) quote sizes are in lots.
I can of course query the lot sizes with the reference data API and store them per security in a database or something like that, but converting the size for every quote tick is a very "expensive" conversion, since ticks arrive every second or even more often.
So is there any other way to achieve this?
Why do you need to multiply each value by lot size? As long as the lot size is constant, each quote is comparable and any computation can be implemented using the exchange values. Any results can be scaled in a presentation layer if necessary.
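In other words, fetch each lot size once via the reference data API, cache it, and only multiply when a value is actually displayed. A sketch of that split (field names and tickers are illustrative, not actual Bloomberg mnemonics):

```python
# Lot sizes fetched once per security via the reference data API and cached.
lot_sizes = {"RY CN Equity": 100, "TD CN Equity": 100}

tick_store = []

def on_quote_tick(security, bid_size_lots, ask_size_lots):
    # Hot path: store the raw exchange values, no per-tick multiplication.
    tick_store.append((security, bid_size_lots, ask_size_lots))

def display_size(security, size_in_lots):
    # Presentation layer: scale only on demand.
    return size_in_lots * lot_sizes.get(security, 1)

on_quote_tick("RY CN Equity", 5, 7)
print(display_size("RY CN Equity", 5))  # -> 500 shares
```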

Synchronized random numbers

I have two devices, and I am looking for a way to sync random number generation between them.
More background: the two devices connect, and one sends the other a file containing a data set. The data set is then loaded on both devices. The data is displayed with randomization at various levels. I want the display to be synced between the devices, yet still randomized.
A conceptual example: take a stack of pictures. A copy of the stack is sent to the remote device and stored for future use. The stacks are then shuffled the same way on both devices, so that drawing the first picture on each device will result in the same output. This is overly simplified; there are far more random numbers required in my application, so optimizations such as sharing the sort order are not applicable...
Breaking it down: I need a simple way to draw from the same random number pool on 2 devices. I do not know how many random draws may occur before the devices sync, but once synced it should be predictable that they will draw the same number of random numbers since they are using the same data sets, however there is a chance one could draw more than the other before proceeding to the next batch (which would require a re-sync of the random data).
I'm looking to avoid having to transfer sort orders, position info, etc. for each entity already transferred in the data set at display time (which also raises structural concerns, since the project wasn't initially designed to share that info). Instead I want to generate the same placement on both devices, which requires that the random numbers come out in the same order.
Any thoughts or suggestions would be much appreciated.
You can use an LCG algorithm and set the same seed for the generation. Because an LCG algorithm is deterministic, as long as you seed both devices with the same seed, they will produce exactly the same pseudo-random numbers.
You can find more information on the LCG algorithm here:
Linear congruential generator
This LCG is used for example by java.util.Random.
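As a self-contained illustration, here is a minimal LCG in Python using the constants of java.util.Random (48-bit state); the next_int method is deliberately simplified, so it is not bit-for-bit identical to Java's nextInt:

```python
class LCG:
    """Minimal linear congruential generator (constants from java.util.Random)."""
    MULTIPLIER = 0x5DEECE66D
    INCREMENT = 0xB
    MASK = (1 << 48) - 1  # 48-bit state

    def __init__(self, seed):
        # Same initial seed scrambling that java.util.Random applies.
        self.state = (seed ^ self.MULTIPLIER) & self.MASK

    def next_int(self, bound):
        """Pseudo-random int in [0, bound). Simplified: slightly biased for
        bounds that don't divide 2**32, unlike Java's rejection sampling."""
        self.state = (self.state * self.MULTIPLIER + self.INCREMENT) & self.MASK
        return (self.state >> 16) % bound

# Two "devices" seeded with the same value produce identical sequences:
a, b = LCG(42), LCG(42)
assert [a.next_int(100) for _ in range(10)] == [b.next_int(100) for _ in range(10)]
```

So the only thing the two devices need to agree on up front is the seed value.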
If you give rand() the same seed on each device, i.e. call srand(SEED); with the same SEED on both, the (pseudo-)random numbers that come out are guaranteed to be the same every time, and you can keep pulling numbers out indefinitely without reseeding.
Most random number generators let you set the "seed". If you create two random number generators, implementing the exact same generation algorithm, on two different machines (need not even be of the same type or running the same operating system) and then supply both machines with the same "seed" value, they will both produce the exact same random number sequence.
So your "sync" should really only need to transfer one number (generally itself a randomly-chosen number) from the first machine to the second. Then both machines use that same number as the "seed".
(I'd look up the specifics for the iPhone random number generators, but the Apple documentation site has apparently been affected by the Minnesota government shutdown.)
If you do not always want to specify the seed, you could simply designate one device as the master. When the master generates a random number, it sends a message to the other device containing that random number.
If it were truly random, no seed would produce the same numbers on the second machine; reproducibility is a property only of the pseudo-random generators described above.

Tools for mapping time series data

I'm looking for suggestions/examples of tools or APIs that enable the mapping of large amounts of time series data into an intensity map.
The data includes dimensions for country, series, and year. Here's an example http://spreadsheets.google.com/ccc?key=t9ZwziZAgy768ZTXDEg8Maw&authkey=CPn0pdoH&hl=en_GB&ui=1
UUorld is a good choice if you want to create videos that show data changing over time. They're heavy on the 3-D, but I found some examples of what appear to be 2-D intensity maps in the gallery. The trial version is free and does not expire.
For static images, I love indiemapper. It's very simple to use and has beautiful color palettes and typography options. It also has 16 different map projections, if you're into that. The free trial is 30 days.
The caveat with these (and other mapping software) is that you may have to convert your data into a certain format, depending on what it is now. For example, indiemapper takes shapefiles, KML, and GPX as input.
Try GeoCommons. You would need to reformat your spreadsheet a bit, but once you get it in there you can join it to country boundaries, create an interactive temporal map, and embed it wherever you want. Everything is web-based, so there's no need to download anything.
