Spatial Reasoning Using SparQ tool - spatial

I am trying to use the SparQ too, the tool has been installed but when building it says the error attached in the screenshot.
Any help or advice would be appreciated!
Here is the tool:
https://github.com/dwolter/SparQ
Qualitative reasoning about spatial and temporal domains aims at dealing with knowledge, for example, spatial region X overlaps with region Y, etc. SparQ implements many qualitative formalisms (Allen's interval algebra, RCC-5/8, etc.) and several reasoning algorithms.
Contacting the tools developer and waiting for their replay!

Related

What is the best visualization tool to analyze evolving communities in graph data?

I'm trying to analyze graphs about social media. The graphs contains time information, so it's possible to do some time series analysis. For each time point, I can run a community detection algorithm (e.g. Louvain method) to detect communities at that time. I can see that the communities are evolving: nodes in smaller communities are sometimes merging into a bigger community, and sometimes they are splitting up. However, I failed to find a comprehensive visualization tool to analyze and demonstrate the evolution of the communities.
Does anyone recommend a tool to serve this purpose? Thank you.

sharding a neo4j graph, min-cut

I've heard of a max flow min cut method for sharding or segmenting a graph database. Does someone have a sample cypher query that can do that say against the movielens dataset? Basically I want to segment users into different shards/clusters based on what they like so maybe the min cuts can naturally find clusters of users around the genres say Horror, Drama, or maybe it will create non-intuitive clusters/segments like hipster/romantics and conservative/comedy/horror groups.
my short answer is no, sorry I don't know how you would express that.
my longer answer is even if this were possible - which it very well may be - I would advise against it.
multiple algorithms 'do' min-cut max-flow, these will all have different performance characteristics and, because clustering is computationally expensive, I'd guess you want control over the specific algorithm implementation used.
Cypher is a declarative language, you specify what you're looking for but not how to do it, and it will be difficult to specify such a complex problem in a way that the Cypher engine can figure out what you're trying to do. that will make it hard for Cypher (or any declarative language engine) to produce an efficient query plan.
my suggestion is find the specific algorithm you wish to use and implement it using the Neo4j Java API.
if you're running Neo4j in embedded mode you're done at that point. if you're running Neo4j server you'll then just have to run that code as an Unmanaged Server Extension
AFAIK you're after 'Community Detection' algorithms. There are non-overlapping (communities do not overlap) and overlapping variants, where non-overlapping is generally easier to implement and understand. Common algorithms are:
Non-overlapping: Louvain
Overlapping: Label Propagation Algorithm (LPA) (typically non-overlapping, but there are extensions to make it overlapping)
Here are a few C++ code examples for the algorithms: Louvain, Oslom (overlapping), LPA (non-overlapping), and Infomap)
And if you want bleeding edge I was recommended the SCD algorithm
Academic paper: "High Quality, Scalable and Parallel Community Detection for Large Real Graphs"
C++ implementation

GIS mapping tool on detailed scale

I'm looking for good GIS solutions that work on a relatively smaller, detailed scale. Specifically, I was wondering what APIs or toolkits are available for mapping out spaces in a building (like rooms, hallways, shelves). This need not be a 3D solution, like one might envisage for architectural CAD-type drawings. Something relatively lightweight would be ideal.
I feel like ArcGIS is a bit of overkill for that, though I may very well be wrong.
Since it's mapping but not quite GPS/routing/distance/Earth/road-type mapping I'm in a bit of a quandary.
Thanks in advance for your input!
The boundary between GIS and CAD softwares depends on the scale and level of detail of your data. GIS are said more suitable for topographic scales (less than 1:10000) with simplified 2D representations, and CAD softs for bigger scales (more then 1:10000) with very detailled representations, usually in 3D. Nevertheless, there is a convergence between both: GIS supports more and more 3D representations, and CAD softs provide some spatial analysis features.
If your purpose is to build a 3D model, I will suggest you a CAD software like blender. You can also have a look at sketchup or the tools used to build citygml data.
PostGIS is available for many solutions, just check wikipedia.

Map Reduce Algorithms on Terabytes of Data?

This question does not have a single "right" answer.
I'm interested in running Map Reduce algorithms, on a cluster, on Terabytes of data.
I want to learn more about the running time of said algorithms.
What books should I read?
I'm not interested in setting up Map Reduce clusters, or running standard algorithms. I want rigorous theoretical treatments or running time.
EDIT: The issue is not that map reduce changes running time. The issue is -- most algorithms do not distribute well to map reduce frameworks. I'm interested in algorithms that run on the map reduce framework.
Technically, there's no real different in the runtime analysis of MapReduce in comparison to "standard" algorithms - MapReduce is still an algorithm just like any other (or specifically, a class of algorithms that occur in multiple steps, with a certain interaction between those steps).
The runtime of a MapReduce job is still going to scale how normal algorithmic analysis would predict, when you factor in division of tasks across multiple machines and then find the maximum individual machine time required for each step.
That is, if you have a task which requires M map operations, and R reduce operations, running on N machines, and you expect that the average map operation will take m time and the average reduce operation r time, then you'll have an expected runtime of ceil(M/N)*m + ceil(R/N)*r time to complete all of the tasks in question.
Prediction of the values for M,R,m, and r are all something that can be accomplished with normal analysis of whatever algorithm you're plugging into MapReduce.
There are only two books that i know of that are published, but there are more in the works:
Pro hadoop and Hadoop: The Definitive Guide
Of these, Pro Hadoop is more of a beginners book, whilst The Definitive Guide is for those that know what Hadoop actually is.
I own The Definitive Guide and think its an excellent book. It provides good technical details on how the HDFS works, as well as covering a range of related topics such as MapReduce, Pig, Hive, HBase etc. It should also be noted that this book was written by Tom White who has been involved with the development of Hadoop for a good while, and now works at cloudera.
As far as the analysis of algorithms goes on Hadoop you could take a look at the TeraByte sort benchmarks. Yahoo have done a write up of how Hadoop performs for this particular benchmark: TeraByte Sort on Apache Hadoop. This paper was written in 2008.
More details about the 2009 results can be found here.
There is a great book about Data Mining algorithms applied to the MapReduce model.
It was written by two Stanford Professors and it if available for free:
http://infolab.stanford.edu/~ullman/mmds.html

Writing an image processing application for analysis of satellite imagery

I have to start work on application for analysis of satellite imagery to identify some man made structure. I would like to use C or Java for this.
For satellite I am planning to use Google Maps data.
I have three questions here:
What is best source for GIS data besides Google Maps/earth.
Best language to write such an application considering i will have to use third-party APIs
Is there a open image processing engine available which identifies man made structures?
Thats a lot of questions but I hope the smarter guys here can help me here.
Overly processed imagery such as Google or Bing maps is a horrible source of imagery for performing feature extraction or feature recognition. Usually, you want the most unprocessed, raw form possible with camera models... of course, if you don't have access to this sort of data, then you have to work with what you have.
A more important consideration of Google Maps/Earth imagery is that you may run afoul of their License Agreement. I suggest you check it before you decide on their data as your imagery source. In particular, if you bypass their API's, you've violated their license agreement.
As far as libraries and langauges, there are dozens of machine vision libraries available. I can't recommend one over the other as I've only been a down-stream consumer of their results. My understanding of the problem is that the biggest concern is how you build the "models" to compare against... i.e. how do you give the system an "example" of what you're looking for.
Once you've found a library, then you can make a decision on the language. Generally, a high-level language like Python or Matlab is used for this kind of prototyping. Once a method has been found, then conversion to a "higher performance" language is done--if necessary.
Personally, I'd probably use Python because (1) it's freely available, (2) has a significant community in the scientific and research worlds, and (3) can interop with a wide variety of languages and platforms.
Specifically, check out Glovis: http://glovis.usgs.gov/
You can browse the earth, and download maps from several different satellites and sensors. Even though you have to go through a bogus "ordering" process, the imagery is free.
You may find the USGS (United States Geological Survey) website helpful. They provide both GIS information and a wide range of data sets.
I agree with James Schek. Google gives you RGB images - not the most helpful fot your task. Most imagery will have a couple of additional channels that may be better suited for you. Different channels show different features, water, urban areas, types of foliage etc. For example an infra-red channel could be used to pick out buildings in a cool climate. If you contact several data provider they may be able to recommend the best channels to use in their data.
Ariel imagery can be huge, numerous terrabytes for a detailed world database. Carefully consider how much information you need to process. If you are only doing a few square miles performance is not an issue. If you are processing thousands of square miles, performance becomes an issue. Processing millions, performance is mission critical and must be considered from day one.
Knowing the number of channels you need to process, your performance requirements and the file format of your data, look around for libraries that fulfil all your requirements. Many of them are written in C/C++ so using a language that interops with them both could be helpful
Take a look at this demo:
Finding Vegetation in a Multispectral Image
, part of the Image Processing Toolbox in MATLAB. It is related to your problem of analysing satellite images to find specific patterns.
I believe it's an excellent example of the sort of things you can achieve easily with MATLAB using very little code.

Resources