Fastest free geocoding coordinates library for large datasets - geolocation

I know the title might be ambitious.
I'm working with a large dataset, and using Nominatim to determine latitude and longitude from a given city and country took many hours...
I looked for free libraries that can handle large datasets quickly, but I can't find a specific one.
I read about QGIS, but that's a plugin, right? Not a code/script I can run without installing anything.
The good ones are paid.
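Not a library recommendation, but a minimal sketch of one offline approach that avoids a per-row Nominatim request: build (or download, e.g. from the free GeoNames dumps) a city/country to coordinates lookup table once, then join it against the dataset with pandas. The file names and column names below (`city_coords.csv`, `records.csv`, `city`, `country`) are assumptions for illustration, not real published files.

```python
import pandas as pd

# Hypothetical lookup table, e.g. exported once from a free GeoNames dump.
# Assumed columns: city, country, latitude, longitude
lookup = pd.read_csv("city_coords.csv")

# The large dataset to geocode; column names are assumptions.
data = pd.read_csv("records.csv")  # assumed to have 'city' and 'country' columns

# Normalise the join keys so the match is case/whitespace insensitive.
for df in (lookup, data):
    df["city_key"] = df["city"].str.strip().str.lower()
    df["country_key"] = df["country"].str.strip().str.lower()

# One vectorised merge instead of one web request per row.
result = data.merge(
    lookup[["city_key", "country_key", "latitude", "longitude"]],
    on=["city_key", "country_key"],
    how="left",
)

result.to_csv("records_geocoded.csv", index=False)
```

Rows that fail to match (misspelled or ambiguous places) could still be sent to Nominatim afterwards, which is usually a much smaller set.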

Related

What is the best visualization tool to analyze evolving communities in graph data?

I'm trying to analyze graphs about social media. The graphs contain time information, so it's possible to do some time series analysis. For each time point, I can run a community detection algorithm (e.g. the Louvain method) to detect communities at that time. I can see that the communities are evolving: nodes in smaller communities sometimes merge into a bigger community, and sometimes they split up. However, I have failed to find a comprehensive visualization tool to analyze and demonstrate the evolution of the communities.
Does anyone recommend a tool to serve this purpose? Thank you.
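Not a full tool, but a minimal sketch of how the evolution could be quantified before visualising it, assuming networkx >= 2.8 (for `louvain_communities`) and one graph per snapshot; the Jaccard threshold is an arbitrary assumption.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

def jaccard(a, b):
    """Overlap between two node sets."""
    return len(a & b) / len(a | b)

def track_communities(snapshots, threshold=0.3):
    """For consecutive snapshots, link communities at time t to communities
    at time t+1 by node overlap; merges and splits show up as many-to-one
    or one-to-many links. The threshold is an arbitrary choice."""
    links = []
    prev = None
    for t, g in enumerate(snapshots):
        comms = louvain_communities(g, seed=42)
        if prev is not None:
            for i, c_old in enumerate(prev):
                for j, c_new in enumerate(comms):
                    score = jaccard(c_old, c_new)
                    if score >= threshold:
                        links.append((t - 1, i, t, j, round(score, 2)))
        prev = comms
    return links

# Toy example with two random snapshots standing in for real data.
snapshots = [nx.erdos_renyi_graph(60, 0.08, seed=s) for s in (1, 2)]
for link in track_communities(snapshots):
    print(link)
```

The resulting links could then be fed into something like a Sankey or alluvial plot to show merges and splits over time.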

RapidMiner - Time Series Segmentation

I am fairly new to RapidMiner. I have a historical financial data set (with attributes Date, Open, Close, High, Low, Volume Traded) from Yahoo Finance, and I am trying to find a way to segment it as in the image below:
I am also planning to perform this segmentation on more than one such data set and then compare the segmentations (i.e. Segment 1 of Data Set A against Segment 1 of Data Set B), so I would preferably require an equal number of segments in each.
I am aware that certain extensions are available within the RapidMiner Marketplace, however I do not believe that any of them have what I am looking for. Your assistance is much appreciated.
Edit: I am currently trying to replicate the Voting-Based Outlier Mining for Multiple Time Series (V-BOMM) with multiple data sets. So far, I am able to perform the operation by recording and comparing common dates against each other.
However, I would like to enhance the process to compare Segments rather than simply dates. I have gone through the existing functionalities of RapidMiner, and thus far I don't believe any fit my requirements.
I have also considered Dynamic Time Warping, but I can't seem to find an available functionality in RapidMiner.
Ultimate question: Can someone guide me to functionalities that can help replicate the segmentation in the attached image such that the segments can be compared between Historic Data Sets in RapidMiner? Also, can someone guide me on how to implement Dynamic Time Warping using RapidMiner?
I would use the new version of the Time Series extension, using the windowing features to segment the time series into whatever parts you want. There is a nice explanation of the new tools in the blog section of the community.
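If the extension's operators don't cover the segment-to-segment comparison step, here is a rough sketch of equal-count segmentation plus a plain dynamic-programming DTW distance. It runs outside RapidMiner (or possibly via its Python Scripting extension, if installed); the random series below are only stand-ins for the Close columns of two data sets.

```python
import numpy as np

def dtw_distance(a, b):
    """Plain O(n*m) dynamic-programming DTW distance between two 1-D series."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def segment(series, n_segments):
    """Split a series into n_segments (near-)equal-length windows."""
    return np.array_split(np.asarray(series, dtype=float), n_segments)

# Toy example: compare corresponding segments of two price-like series.
series_a = np.cumsum(np.random.default_rng(0).normal(size=300))  # stand-in for Close of data set A
series_b = np.cumsum(np.random.default_rng(1).normal(size=300))  # stand-in for Close of data set B

for k, (seg_a, seg_b) in enumerate(zip(segment(series_a, 6), segment(series_b, 6)), start=1):
    print(f"Segment {k}: DTW distance = {dtw_distance(seg_a, seg_b):.2f}")
```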

Collecting webGL app framerate histogram data

I'm planning to stick with a particular framework for my academic course, but only based on results I can prove. I want to plot a graph for all three frameworks with the number of vertices on one axis and FPS (threshold of 60) on the other. Would it be good enough to take a single predefined model in formats like OBJ, COLLADA, JSON, etc. and load it in the three frameworks? Then I would log the frame rate and number of vertices to an external file and afterwards use that data to plot a graph reporting the best of the three frameworks based on performance. I'm looking for some boilerplate code for all these frameworks to load different models (to vary the number-of-vertices dimension in my graph) and log the frame rate every second to an external file. This is the approach I've been considering, but I couldn't find much help on it on the internet. I hope someone can help me?
You can get FPS histogram data using the stats.js library, which is bundled with all the Three.js examples:
https://github.com/mrdoob/stats.js
Exporting the collected data to a file can be done using the HTML5 File System API:
http://www.html5rocks.com/en/tutorials/file/filesystem/

Extract points from very large grids

I have 10 grids (currently stored as ASCII grids from a GIS), each about 4.5 GB uncompressed. In addition, I have about 100,000 locations with x and y coordinates. I need to extract the grid value at each of these locations. I am currently doing it with GRASS GIS, which works but is very slow. Can anyone recommend a library or a programming language best suited to such a task?
Thanks in advance!
Sounds like the classic use-case for Hadoop MapReduce.
Hadoop MapReduce is a programming model and software framework for writing applications that rapidly process vast amounts of data in parallel on large clusters of compute nodes.
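Whatever framework ends up driving it, the per-grid sampling step itself can be sketched in a few lines. This is a minimal sketch assuming rasterio/GDAL can read the ASCII grids directly; the file name and CSV column names are placeholders.

```python
import csv
import rasterio  # reads Arc/Info ASCII grids through GDAL

# Placeholder inputs: one of the ten grids and a CSV of x,y locations.
grid_path = "grid01.asc"
points_path = "locations.csv"   # assumed columns: x, y

with open(points_path, newline="") as f:
    coords = [(float(row["x"]), float(row["y"])) for row in csv.DictReader(f)]

with rasterio.open(grid_path) as src:
    # sample() streams a value per (x, y) without loading the full 4.5 GB grid.
    values = [v[0] for v in src.sample(coords)]

for (x, y), val in zip(coords[:5], values[:5]):
    print(x, y, val)
```

Converting the ASCII grids to a tiled format such as GeoTIFF first would likely speed the sampling up considerably, since plain ASCII grids are slow to access randomly.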

Writing an image processing application for analysis of satellite imagery

I have to start work on an application for analysis of satellite imagery to identify man-made structures. I would like to use C or Java for this.
For satellite imagery, I am planning to use Google Maps data.
I have three questions here:
What is the best source of GIS data besides Google Maps/Earth?
What is the best language to write such an application in, considering I will have to use third-party APIs?
Is there an open image processing engine available that identifies man-made structures?
That's a lot of questions, but I hope the smarter folks here can help me.
Overly processed imagery such as Google or Bing maps is a horrible source of imagery for performing feature extraction or feature recognition. Usually, you want the most unprocessed, raw form possible with camera models... of course, if you don't have access to this sort of data, then you have to work with what you have.
A more important consideration with Google Maps/Earth imagery is that you may run afoul of their License Agreement. I suggest you check it before you decide on their data as your imagery source. In particular, if you bypass their APIs, you've violated their license agreement.
As far as libraries and languages go, there are dozens of machine vision libraries available. I can't recommend one over the other, as I've only been a downstream consumer of their results. My understanding of the problem is that the biggest concern is how you build the "models" to compare against... i.e. how do you give the system an "example" of what you're looking for.
Once you've found a library, then you can make a decision on the language. Generally, a high-level language like Python or Matlab is used for this kind of prototyping. Once a method has been found, then conversion to a "higher performance" language is done--if necessary.
Personally, I'd probably use Python because (1) it's freely available, (2) has a significant community in the scientific and research worlds, and (3) can interop with a wide variety of languages and platforms.
Specifically, check out Glovis: http://glovis.usgs.gov/
You can browse the earth, and download maps from several different satellites and sensors. Even though you have to go through a bogus "ordering" process, the imagery is free.
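As a toy illustration of the kind of Python prototyping suggested above (not a real building detector), here is a sketch using OpenCV to flag long straight edges, which man-made structures tend to produce; the thresholds and the input file name are arbitrary assumptions.

```python
import cv2
import numpy as np

# Placeholder input; in practice this would be a raw satellite tile.
img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
if img is None:
    raise SystemExit("scene.png not found")

# Edge detection followed by a probabilistic Hough transform:
# long straight segments are a crude proxy for man-made structures.
edges = cv2.Canny(img, 50, 150)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 80,
                        minLineLength=40, maxLineGap=5)

# Draw the candidate segments on a colour copy for visual inspection.
overlay = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
for x1, y1, x2, y2 in (lines.reshape(-1, 4) if lines is not None else []):
    cv2.line(overlay, (x1, y1), (x2, y2), (0, 0, 255), 2)

print(0 if lines is None else len(lines), "candidate line segments")
cv2.imwrite("candidates.png", overlay)
```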
You may find the USGS (United States Geological Survey) website helpful. They provide both GIS information and a wide range of data sets.
I agree with James Schek. Google gives you RGB images - not the most helpful for your task. Most imagery will have a couple of additional channels that may be better suited to you. Different channels show different features: water, urban areas, types of foliage, etc. For example, an infra-red channel could be used to pick out buildings in a cool climate. If you contact several data providers, they may be able to recommend the best channels to use in their data.
Aerial imagery can be huge, numerous terabytes for a detailed world database. Carefully consider how much information you need to process. If you are only doing a few square miles, performance is not an issue. If you are processing thousands of square miles, performance becomes an issue. If you are processing millions, performance is mission-critical and must be considered from day one.
Knowing the number of channels you need to process, your performance requirements and the file format of your data, look around for libraries that fulfil all your requirements. Many of them are written in C/C++, so using a language that interoperates with both could be helpful.
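For example, with a red and a near-infrared band loaded as arrays, the standard NDVI index separates vegetation from bare or built-up ground as a crude first-pass mask. A minimal NumPy sketch follows; the toy 2x2 arrays stand in for real bands read with whatever reader fits your provider's format, and the 0.2 threshold is an arbitrary assumption.

```python
import numpy as np

def ndvi(nir, red):
    """Normalised Difference Vegetation Index: (NIR - Red) / (NIR + Red).
    High values usually mean vegetation; low or negative values are often
    water, bare soil or built-up areas."""
    nir = nir.astype(float)
    red = red.astype(float)
    denom = nir + red
    out = np.zeros_like(denom)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Toy 2x2 bands standing in for real imagery channels.
nir_band = np.array([[0.60, 0.55], [0.20, 0.30]])
red_band = np.array([[0.10, 0.12], [0.25, 0.28]])

index = ndvi(nir_band, red_band)
non_vegetation_mask = index < 0.2   # threshold is an arbitrary assumption
print(index)
print(non_vegetation_mask)
```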
Take a look at this demo: Finding Vegetation in a Multispectral Image, part of the Image Processing Toolbox in MATLAB. It is related to your problem of analysing satellite images to find specific patterns.
I believe it's an excellent example of the sort of things you can achieve easily with MATLAB using very little code.
