"Resultset too large (over 1000 rows)" in neo4j browser - neo4j

I'm using Neo4j 2.1.2 Community Edition. I have loaded a CSV file with 2500 rows and created nodes and relationships among the columns. When I run the Cypher query below
match (n) return count(*);
I get the node count as 17275. But when I match the nodes with match (n) return n and try to get the corresponding graph in the Neo4j Browser, it says
Resultset too large (over 1000 rows)
I know this is because more than 1000 nodes were requested. So if I want to see the complete graph in the Neo4j Browser, how can I do it?
I tried the same query in the Neo4j web admin, where I was able to get the data in tabular format, but I want to see the data as a graph.
Also, I'm not able to find neo4j-shell in my Neo4j installation's bin directory. Why is that?
Thanks

Update 1
The Neo4j web UI is built on top of D3.js using SVG; because of SVG performance in the browser, the user experience starts to degrade quite quickly once a network has more than 500 nodes.
Handling more than 1000 nodes adds to the technical challenge: with so many nodes, what you get most of the time is the "hairball" effect.
This is a blog post about visualizing big networks, with some design hints, that might be useful (disclaimer: I am a developer for KeyLines).
As you can imagine, visualizing more than 1000 nodes is not that easy, which is why companies such as Cambridge Intelligence (KeyLines), Tom Sawyer (Perspectives) or Linkurious came up with specific products for that.
You can of course build the visualization yourself for fun with open-source libraries, but keep in mind that it can take a very long time.
If your Neo4j project is not commercial, I suggest having a look at Gephi to visualize it: it is a desktop application and it has a Neo4j adapter plugin. It can easily handle huge datasets, but of course it lacks the portability of a web app.
And in case you ONLY need storage for your graph/data, then of course no visualization is required at all.
Original Answer
I think you might have to implement a custom visualization to see such a graph in the browser, using one of the options on this page: http://www.neo4j.org/develop/visualize .
Alternatively, have a look at the more extensive list here: Big data visualization using "search, show context, and expand on demand" concept
Or maybe take a different visualization approach with one of the following: Data Visualization libraries

Look at the settings in the Neo4j Browser. You can change the graph visualization limits however you like, but the browser can get much slower if you want to see the complete graph.
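If you only need to inspect a manageable subset, you can also cap the result in Cypher itself instead of raising the browser limit. A minimal sketch (the cap of 300 is arbitrary):

// Return only the first 300 nodes so the browser can render them comfortably
MATCH (n)
RETURN n
LIMIT 300

Raising the limit far above a few hundred nodes mostly produces the "hairball" effect described above anyway.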

Related

Neo4j to grafana

I want to present the release data complexity associated with each node (epic, user story, etc.) in Grafana in the form of charts, but Grafana does not support the Neo4j database. Is there any way, direct or indirect, to present Neo4j data in Grafana?
I'm having the same issues and found this question among others. From my research I cannot agree with that answer completely, so I felt I should point some things out here.
Just to clarify: a graph database may seem structurally different from a relational or time-series database, but it is possible to build Cypher queries that return graph data as tables with proper columns, just as with any other supported data source. Therefore this sentence from the answer mentioned above:
So what you want to do is just not possible.
is not absolutely true, I'd say.
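For illustration, a hedged Cypher sketch (the labels and properties are made up to match the question) that returns plain tabular rows of the kind a charting tool expects:

// One row per epic: a name column plus two numeric columns
MATCH (e:Epic)<-[:BELONGS_TO]-(s:UserStory)
RETURN e.name AS epic, count(s) AS storyCount, avg(s.complexity) AS avgComplexity
ORDER BY storyCount DESC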
The actual problem is that there is no datasource plugin for Neo4j available at the moment. You would need to implement one on your own, which would be a lot of work (as far as I can see), but I suspect it is possible. For me at least this would be too much work, so I won't use any approach that reads data directly from Neo4j into Grafana.
As a (possibly dirty) workaround (in my case), a service will regularly copy the relevant portions of the Neo4j graph into a relational database (or a time-series database, if the data model is simple enough for that), which Grafana is aware of (see datasource plugins), so I can query it from there. This is basically the replication idea also given in the answer mentioned above. You obviously end up with at least two different database systems and an additional service, which is not great, but at the moment it seems to be the quickest way to work around the missing datasource plugin. Maybe this is applicable in your case, too.
Using Neo4j's Graphite metrics you can configure metric data to be sent to Grafana, and from there build whichever dashboards you like.
Up until recently, Graphite/Grafana wasn't supported, but it is now (in the recent 3.4 series releases), along with Prometheus and other options.
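A sketch of the relevant neo4j.conf settings (setting names as in the Neo4j 3.x metrics documentation; note that metrics are an Enterprise feature, so verify against your edition and version):

# Enable the metrics subsystem and push to a Graphite/Carbon endpoint
metrics.enabled=true
metrics.graphite.enabled=true
metrics.graphite.server=localhost:2003
metrics.graphite.interval=15s
metrics.prefix=neo4j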
Update July 2021
There is a new plugin called Node Graph Panel (currently in beta) that can visualise graph structures in Grafana. A prerequisite for displaying your graph is to make sure that you have an API that exposes two data frames, one for nodes and one for edges, and that you set frame.meta.preferredVisualisationType = 'nodeGraph' on both data frames. See the Data API specification for more information.
So, one option would be to set up an API around your Neo4j instance that returns the nodes and edges according to the specification above. Note that I haven't tried it myself (yet), but it seems like a viable way to get Neo4j data into Grafana.
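As a hedged sketch, the two Cypher queries behind such an API could shape the frames like this (the column names follow the Node Graph panel's documented fields: id/title for nodes, id/source/target for edges; everything else is an assumption):

// Nodes frame: one row per node
MATCH (n)
RETURN toString(id(n)) AS id, coalesce(n.name, labels(n)[0]) AS title

// Edges frame: one row per relationship
MATCH (a)-[r]->(b)
RETURN toString(id(r)) AS id, toString(id(a)) AS source, toString(id(b)) AS target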
Grafana supports these databases, but not Neo4j: Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, CloudWatch.
So what you want to do is just not possible.
You can replicate your Neo4j data into one of those databases, but the data models are really different (time series vs. graph).
If you just want to have some charts, you can use Apache Zeppelin for that.

Queries on 200 GB graph

I need a scalable solution to create a geohash-connected graph.
I found Cypher for Apache Spark, a project that lets you use Cypher on Spark DataFrames to create a graph. However, it can only create immutable graphs by mapping the different DataFrames, so I couldn't get the graph that I need.
I can get the graph that I need if I run some other Cypher queries in the Neo4j Browser; however, my stored graph is about 200 GB.
So I'm asking: is it reasonable, and fast, to run queries on 200 GB of graph data using the Neo4j Browser and APOC functions?
If you're asking if Neo4j can handle databases of this size, then the answer is yes. But you'll see different results depending on how your data is modeled and the kind of queries you want to run.
Performance correlates not so much with the size of the graph as with the portion of the graph touched and traversed by your queries. Graph-wide analytical queries must touch the entire graph, while tightly defined queries that touch a smaller, local part of the graph will be quite quick.
Anything you can do in your queries to constrain the portion of the graph you have to traverse or filter will help your query speed, so good modeling and use of indexes and constraints is key.
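For example (a hedged sketch with made-up labels and properties), an index lets a query anchor on a single node instead of scanning 200 GB of graph:

// Index the property used as the query's entry point
CREATE INDEX ON :Geohash(code)

// Anchored query: touches only a small neighbourhood of the graph
MATCH (g:Geohash {code: 'u4pruyd'})-[:NEIGHBOUR]->(n)
RETURN n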

database solution for multiple isolated graphs

I have an interesting problem that I don't know how to solve.
I have collected a large dataset of 80 million graphs (they are CFGs, as in control-flow graphs, produced from programs I have analysed on GitHub) which I need to be able to search efficiently.
I looked into existing solutions like Neo4j, but they are all designed to store a single global graph.
In my case it is the opposite: all the graphs are independent, like rows in a table, but I need to search through all of them efficiently.
For example, I want to find all CFGs that have a particular IF condition, or a WHILE loop with a particular condition.
What's the best database for this use case?
I don't think there's a reason not to simply store all those graphs in a single graph, whether in Neo4j or a different graph database. It's not a problem to have many disparate graphs in a single database where the disparate graphs are disconnected from one another.
As for searching them efficiently, you would either (1) identify properties in your CFGs that you want to search on and convert them to some indexed value of the graph, or (2) introduce some graph structure (additional vertices/edges) between the CFGs that will allow you to do the searches you want via graph traversal (both are sketched below).
Depending on what you need to search on, approach 1 may not be flexible enough for you, especially if what you intend to search on is not completely known at the time of loading the data. Also, it is important to note that with approach 2 you do not really lose the fact that you have 80 million distinct graphs just because you provided some connection between them: those physical connections don't change that basic logical fact. You just need to account for the additional connections when you write traversals that you expect to stay within a single CFG.
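A hedged Cypher sketch of both approaches (labels, properties, and values are made up):

// Approach 1: tag every vertex with its owning CFG and index that property
CREATE INDEX ON :Node(graphId)

// Approach 2: add a metadata vertex per CFG, connected to that CFG's entry node,
// so searches can start from metadata and traverse into a single CFG
MATCH (entry:Node {graphId: 'cfg-42', kind: 'ENTRY'})
CREATE (m:Cfg {graphId: 'cfg-42', repo: 'github.com/example/project'})-[:HAS_ENTRY]->(entry)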
I'm not sure what Neo4j supports in this area, but with Apache TinkerPop (an open-source graph processing framework that lets you write vendor-agnostic code over different graph databases, including Neo4j), you might consider some form of graph partitioning to help with approach 2. Or you might subgraph() the larger graph down to just the one CFG and then operate on that purely in memory when querying. Both of these approaches help restrict your query to just the individual CFG you want to traverse.
Ultimately, however, I see this issue as a modelling problem. You will just need to make some choices on how to best establish the schema for your use case and virtually any graph database should be able to support that.

Visualizing graph database

Assuming I am working with Neo4j, the only way I can think of to visualize my mock-up data is to generate Cypher code and paste it into Neo4j's data browser.
Is there another (better, simpler?) way to create a visualization without using Cypher? Generating Cypher code seems like a complex enough task by itself.
Writing tests is of course another way of making sure relationships are set up right, but as I am learning the system, I'd like to see things visually to make sure they are set up as expected.
This gist contains an example of how to use the Neo4j Graphviz component to generate output in Graphviz DOT notation, which is supported by a range of graph visualization software (and of course by Graphviz itself).
(Link to the original blog post where I found the example: http://blog.neo4j.org/2012/05/graph-this-rendering-your-graph-with.html)
There is a new solution to explore the content of a Neo4j graph database using a web browser: http://linkurio.us/
It allows you to search nodes by properties, inspect nodes, expand their neighborhoods, and more.
Disclaimer: I'm a co-founder of Linkurious and Gephi.
There are also some options listed at http://www.neo4j.org/develop/visualize .

BigData Vs Neo4J

I've been looking for a triple store for my project. In this project I want to store my data according to certain ontologies (OWL).
From my research I ended up with two technologies, Neo4j and Bigdata, that seem to fit well in this case.
I want to know if either of the two is more appropriate to use with RDF, RDFS, OWL and SPARQL queries.
Neo4j can be used to store data in entity-relationship-entity form. In the case of big data, you should not upload your whole dataset into Neo4j, because it will become very heavy and processing will be very slow. You should use a complementary database for storing the actual data, and store IDs and a few parameters in Neo4j so that graph traversal can power your graph analytics. Neo4j is mainly built for graph analytics (that is its strength); otherwise you would have to use a graph engine, e.g. GraphX (Spark).
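A hedged Cypher sketch of that pattern (all names are made up): keep only a lightweight reference node per record in Neo4j, with the heavy payload living in the complementary store:

// The payload stays in the other database; Neo4j holds the external ID plus
// the few properties needed for traversal
CREATE (d:Entity {externalId: 'doc-123', kind: 'owl-class'})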
You might want to try out the SPARQL plugin for Neo4j; see here for an HTTP-based test, and this Berlin Dataset test for embedded usage.
Neo4j is a specific technology, while "big data" is more of a generic term. I think what you're really asking about is OLAP vs. OLTP. As data gets bigger, there are differences between use cases for RDF-style graph databases, which are often used for OLAP (On-line Analytical Processing) style analytics. In short, OLAP is designed for analytics that look across a big data set, while OLTP is more aimed at INSERTs/DELETEs (on potentially big data).
OLAP-based traversals tend to process the entire graph, while OLTP-based traversals tend to process smaller data sets by starting with one or a handful of vertices and traversing from there.
For example, let's say you wanted to calculate the average age of the friends of one particular user. That's a great use case for OLTP, since the query data set is small. However, if you wanted to calculate the average age of everyone in the database, OLAP is the preferred technology.
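In Cypher terms (a hedged sketch with made-up labels and properties), the two styles look like this:

// OLTP-style: anchored on one user, touches only a small neighbourhood
MATCH (u:User {name: 'alice'})-[:FRIEND]->(f)
RETURN avg(f.age) AS avgFriendAge

// OLAP-style: scans every User node in the database
MATCH (u:User)
RETURN avg(u.age) AS overallAvgAge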
OLAP is optimal for deep analysis of a lot of data, while OLTP is better suited to fast-running queries and lots of INSERTs. If you're trying to meet an SLA where the analytics must complete within a certain timeframe, consider the type of analytics and which approach is better suited. Or maybe you need both.
