I want to integrate the Neo4j graph database in my Rails app with GraphLab for data analytics. Is it possible to integrate GraphLab directly, without explicitly taking database snapshots?
Are there any other tools that can be easily integrated with Neo4j for the same purpose?
If that is not possible, my concern is that Neo4j doesn't allow exporting data in CSV format, while GraphLab only allows CSV imports.
If the graph is small enough to fit into RAM, you could do this import in a few steps:
Use neo4j-shell-tools to export from Neo4j to GraphML.
Use NetworkX to import from GraphML to a NetworkX Graph object (let's call it g).
Use a loop or list comprehension to add the vertices and edges from the NetworkX graph to a graphlab.SGraph (let's call it sg):
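import graphlab
# g is the NetworkX graph loaded from the GraphML export in the previous step.
sg = graphlab.SGraph()
# add_vertices and add_edges return a graph, so the result is reassigned to sg.
sg = sg.add_vertices([graphlab.Vertex(i) for i in g.nodes()])
# Each NetworkX edge is a (source, target) tuple, unpacked into graphlab.Edge.
sg = sg.add_edges([graphlab.Edge(*edge) for edge in g.edges()])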
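If a CSV file is what you ultimately want, the GraphML export from the first step can also be converted with the Python standard library alone. The following is only a minimal sketch: it assumes the export uses the standard GraphML namespace, it ignores node and edge properties, and the function names and the `id`, `src`, and `dst` column names are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Standard GraphML namespace; assumed to be declared on the export's root element.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def graphml_to_rows(graphml_text):
    """Return (vertex_ids, edge_pairs) parsed from a GraphML string."""
    root = ET.fromstring(graphml_text)
    vertex_ids = [node.get("id") for node in root.iterfind(".//g:node", GRAPHML_NS)]
    edge_pairs = [
        (edge.get("source"), edge.get("target"))
        for edge in root.iterfind(".//g:edge", GRAPHML_NS)
    ]
    return vertex_ids, edge_pairs


def rows_to_csv(header, rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Tiny hand-written GraphML document standing in for a real export.
sample = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <node id="n0"/>
    <node id="n1"/>
    <edge id="e0" source="n0" target="n1"/>
  </graph>
</graphml>"""

vertex_ids, edge_pairs = graphml_to_rows(sample)
vertices_csv = rows_to_csv(["id"], [[v] for v in vertex_ids])
edges_csv = rows_to_csv(["src", "dst"], edge_pairs)
```

Writing `vertices_csv` and `edges_csv` to disk gives you two plain CSV files that any CSV importer should be able to read.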
You could also use py2neo to query the graph (as described in the comments above), but instead of writing to CSV, build the SGraph directly from the query results, either using add_vertices and add_edges, or by building vertex/edge SFrames and then using those to construct the graph. This might be a faster solution for production (with no intermediate disk representation) and may also help get around the memory size limitation if your graph is too large to fit in RAM.
Is there any way to get all the versions of all OneDrive items in my Drive using the Graph API? I want a single query to do this.
The DriveItemVersion resource type doesn't seem to support this (https://learn.microsoft.com/en-us/graph/api/resources/driveitemversion?view=graph-rest-1.0). It looks like we need a separate query to get the versions of each OneDrive item, which is not very efficient.
Let me know if there is any workaround/fix for this problem.
This isn't possible, and it would be extremely inefficient if it were. Just as an example, I have ~100k DriveItems in my OneDrive. Attempting to retrieve all of the items and each of their versions would take an exceedingly long time.
It is far more efficient to retrieve the minimum DriveItem properties you need using a Delta query. You can then process individual DriveItems in batches. Once complete, you can retrieve another Delta and process any files that have changed in the meantime.
I would also suggest taking another look at your requirements. There are very few scenarios where it makes sense to query every file in a Drive. You shouldn't attempt to apply the same patterns used for local/networked storage to cloud storage solutions (be it OneDrive, Google Drive, DropBox, etc.). They are much more akin to a database of binaries than a file system.
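To make the Delta approach a bit more concrete, here is a rough sketch of the paging loop. It assumes a hypothetical `fetch_page` callable that performs an authenticated GET and returns the parsed JSON body; the fake pages below stand in for real responses so the loop can be run without a tenant. The `@odata.nextLink` and `@odata.deltaLink` keys are the ones Graph returns for delta queries, and the `$select` in the start URL is just one example of trimming properties.

```python
def collect_delta(fetch_page, start_url):
    """Follow @odata.nextLink pages and return (items, delta_link)."""
    items = []
    url = start_url
    delta_link = None
    while url:
        page = fetch_page(url)
        items.extend(page.get("value", []))
        # The last page carries the link to use for the next round of changes.
        delta_link = page.get("@odata.deltaLink", delta_link)
        url = page.get("@odata.nextLink")
    return items, delta_link


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Fake responses keyed by URL; a real fetch_page would call the Graph API instead.
FIRST = "https://graph.microsoft.com/v1.0/me/drive/root/delta?$select=id,name"
fake_pages = {
    FIRST: {"value": [{"id": "1"}, {"id": "2"}], "@odata.nextLink": "page-2"},
    "page-2": {"value": [{"id": "3"}], "@odata.deltaLink": "delta-token-url"},
}

items, delta_link = collect_delta(fake_pages.__getitem__, FIRST)
for batch in batches(items, 2):
    pass  # request the versions of each DriveItem in this batch here
```

Storing `delta_link` and passing it as `start_url` on the next run should return only the items that changed since this pass.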
Is there a way to export Neo4j data as SVG output, like the visualization that appears in the Neo4j browser, so we can use it inside our application?
This question is about the neo4j gem in Rails, but any other suggestions are also welcome.
The Neo4j browser application builds its own user interface (including SVGs) using the standard query data from the Neo4j JavaScript driver. The Neo4j database doesn't contain a built-in way to export information as SVG, nor should it. SVG graphics are a business decision. The way you'd like to visualize graph data in your app is unlikely to be the same way I want to visualize graph data. You'll need to build this functionality yourself.
Neo4j has an article on graph visualization which may help you implement this. I believe the Neo4j browser app makes use of the very popular (and open source) D3.js data visualization library for building its graph visualizations / SVGs.
https://neo4j.com/developer/guide-data-visualization/
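As a rough illustration of what "building it yourself" can look like on the server side, the sketch below turns a list of node labels and edges into an SVG string using a simple circular layout. This is not how the Neo4j browser renders graphs; the function name, layout, and styling are all assumptions chosen to keep the example short, and the node and edge lists would come from your own query results.

```python
import math
from xml.sax.saxutils import escape


def graph_to_svg(nodes, edges, size=300, radius=15):
    """Render node labels and (source, target) label pairs on a circle layout."""
    center = size / 2
    ring = center - 2 * radius
    positions = {}
    for index, label in enumerate(nodes):
        angle = 2 * math.pi * index / max(len(nodes), 1)
        positions[label] = (
            center + ring * math.cos(angle),
            center + ring * math.sin(angle),
        )

    parts = ['<svg xmlns="http://www.w3.org/2000/svg" width="%d" height="%d">' % (size, size)]
    # Draw edges first so the node circles are painted on top of them.
    for source, target in edges:
        (x1, y1), (x2, y2) = positions[source], positions[target]
        parts.append(
            '<line x1="%.1f" y1="%.1f" x2="%.1f" y2="%.1f" stroke="gray"/>' % (x1, y1, x2, y2)
        )
    for label, (x, y) in positions.items():
        parts.append(
            '<circle cx="%.1f" cy="%.1f" r="%d" fill="lightblue" stroke="black"/>' % (x, y, radius)
        )
        parts.append('<text x="%.1f" y="%.1f" text-anchor="middle">%s</text>' % (x, y, escape(label)))
    parts.append("</svg>")
    return "\n".join(parts)


svg = graph_to_svg(["Alice", "Bob", "Carol"], [("Alice", "Bob"), ("Bob", "Carol")])
```

The returned string can be written to a `.svg` file or embedded in a page; for interactive graphs, a client-side library such as D3.js is usually a better fit.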
The Neo4j browser can be used to view and export the result of a Cypher query as an image. Is there a way to do this using the REST API, the Java API, or any other interface?
I can probably get the result as JSON and visualize it using Linkurious, but the built-in Neo4j visualization is better for my purpose.
Any ideas?
Thanks!
There are a large number of options. Coming at it from the REST API, there are a few client packages available. One is RNeo4j for R. The README for the package includes a section on visualization.
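If you go the JSON route yourself, the transactional HTTP endpoint can return results in a "graph" format that lists nodes and relationships, which is convenient input for a visualization library. The sketch below only builds the request body and flattens a response; it assumes the response shape of the older `/db/data/transaction/commit` endpoint (the path differs in newer Neo4j versions), and the sample response is hand-written, not real server output.

```python
import json


def build_payload(cypher):
    """Request body asking for results in the 'graph' format."""
    return json.dumps(
        {"statements": [{"statement": cypher, "resultDataContents": ["graph"]}]}
    )


def extract_graph(response):
    """Collect unique nodes and relationships across all result rows."""
    nodes, relationships = {}, {}
    for result in response.get("results", []):
        for row in result.get("data", []):
            graph = row.get("graph", {})
            for node in graph.get("nodes", []):
                nodes[node["id"]] = node
            for rel in graph.get("relationships", []):
                relationships[rel["id"]] = rel
    return list(nodes.values()), list(relationships.values())


payload = build_payload("MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 25")

# Hand-written stand-in for the parsed JSON response.
sample_response = {
    "results": [{
        "columns": ["n", "r", "m"],
        "data": [{
            "graph": {
                "nodes": [
                    {"id": "1", "labels": ["Person"], "properties": {"name": "Alice"}},
                    {"id": "2", "labels": ["Person"], "properties": {"name": "Bob"}},
                ],
                "relationships": [
                    {"id": "7", "type": "KNOWS", "startNode": "1",
                     "endNode": "2", "properties": {}},
                ],
            }
        }],
    }],
    "errors": [],
}

nodes, relationships = extract_graph(sample_response)
```

Deduplicating by `id` matters because the same node typically appears in several result rows.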
I am working on using Neo4j with py2neo for analyzing Twitter data. I'm a newbie in all of this, so the question might be pretty basic, but I could not find the answer in any of the documentation.
I have two CSV files, one with 100 followers, the other with about 22000 tweets.
For each tweet I have information such as whether it is a reply to another tweet and which other users are mentioned in it.
I want to add followers and tweets as nodes, then use the reply_to and mentions_user fields of the tweets to add connections between tweets (reply_to) and between tweets and users (mentions).
Adding the nodes works well with batch. However, when I want to iterate through all Tweets using py2neo to add the relationships I get OutOfMemoryError: Java heap space.
I'm trying to iterate through the tweets like this:
for tweet in graph.find("Tweet"):
My questions are now:
a) Is there another way in py2neo to iterate through (a lot of) nodes?
b) A little broader: I read in the py2neo documentation it is better to use cypher transactions than batch. Should I do that and could that also help for a)?
Thanks in advance for any help!
KMM
There are certainly ways to load bulk data effectively but this particular method (finding all items of a particular "type") is not one that takes advantage of the graph structure of the database and therefore won't scale well.
You can of course increase the Java heap size if this is a one-off, and you may get away with it. But your best bet is probably to look into the LOAD CSV operation: http://neo4j.com/docs/stable/query-load-csv.html
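For the relationships in the question, a LOAD CSV statement could look roughly like the ones generated below. This is a sketch under several assumptions: the nodes already exist with an `id` property stored as a string, the CSV has a header row with `id`, `reply_to`, and `mentions_user` columns, the file sits where the server can read it as `file:///tweets.csv`, and the `User` label and relationship type names are invented for illustration. The helper only builds the Cypher text; run it through whatever client you use.

```python
def load_csv_relationship_query(file_url, source_column, target_column,
                                source_label, target_label, rel_type):
    """Build a LOAD CSV statement that links existing nodes row by row."""
    return "\n".join([
        "LOAD CSV WITH HEADERS FROM '%s' AS row" % file_url,
        "MATCH (s:%s {id: row.%s})" % (source_label, source_column),
        "MATCH (t:%s {id: row.%s})" % (target_label, target_column),
        "MERGE (s)-[:%s]->(t)" % rel_type,
    ])


# Tweet -> Tweet replies and Tweet -> User mentions, one statement each.
reply_query = load_csv_relationship_query(
    "file:///tweets.csv", "id", "reply_to", "Tweet", "Tweet", "REPLY_TO"
)
mention_query = load_csv_relationship_query(
    "file:///tweets.csv", "id", "mentions_user", "Tweet", "User", "MENTIONS"
)
```

Because the server reads the file and creates the relationships itself, the client never has to hold all the Tweet nodes in memory. An index on the `id` property of each label keeps the MATCH lookups fast.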
What is the best open-source visualization software for Neo4j? By best, I mean:
* Fully featured
* Open Source
* Still being developed/supported for the latest Neo4j stable release
* Interactive
I've tried the data browser in Neo4j's web admin, but get the impression there are many other offerings at: http://www.neo4j.org/develop/visualize
I've spent some time looking at the offerings there, but it looks like many of them are either no longer supported for the latest Neo4j stable release, are still under development, or are not open source.
I've been looking at Neoclipse and Gephi, but:
* Can't tell if Neoclipse is really very widely used
* Don't know how robust GraphML export from Gremlin is (the Gephi Neo4j plugin seems oriented towards the older Neo4j v1.5; also, Gephi can't display multiple relationships between nodes, though it can count them).
Any shared wisdom would be happily accepted!
VivaGraphJS is one available choice. Max De Marzi frequently blogs about visualizing graphs, so check his posts to see if you can find others.
There are a few open-source visualization tools for Neo4j. I recommend:
* Gephi: it allows visualization and SNA, and it has a great community (https://gephi.org)
* Cytoscape: it is mostly used for bioinformatics, but it is a great platform for working with graph data (http://www.cytoscape.org/)
There are also less fully featured alternatives:
* Neovigator: a tool to visually explore graphs (https://github.com/maxdemarzi/neovigator)
* Neoclipse: lets you view and edit your data (https://github.com/neo4j-contrib/neoclipse)
* D3.js: a data visualization library
* Sigma.js, VivaGraphJS: graph visualization libraries, both compatible with WebGL
The Neo4j website mentions some of the options: http://www.neo4j.org/develop/visualize
Also take a look at Mashed Datatoes, a visualization tool for Neo4j databases that draws bar charts and pie charts.
Its demo uses the Movie database. Try selecting "Person" as the start label name.