I'm looking for a Cypher query that will show an excerpt of the data in a Neo4j database. I need this to provide a quick overview of what kind of data can be found in the db.
The query should show a certain number of nodes for all labels with all possible relations between them. Basically I want to get a subset of the nodes in the database which contains the full complexity of the data in the database.
I tried to accomplish this with LIMIT but this only limits the total number of nodes returned.
Thanks for your help
There's a set of procedures for this in the APOC library (Neo4j 3.x): apoc.meta.graph
e.g. CALL apoc.meta.graph will iterate over the graph and collect labels and relationships it finds.
There's a writeup of the meta procedures in this blog post.
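For a sample of actual data rather than the meta graph, a plain Cypher query along these lines may work. This is only a sketch: the sample size of 5 is arbitrary, the query scans every node (so it can be slow on large graphs), and it returns every relationship of the sampled nodes, including relationships to nodes outside the sample.

```cypher
// Group nodes by their label combination and keep up to 5 per group
MATCH (n)
WITH labels(n) AS labelSet, collect(n)[..5] AS sample
UNWIND sample AS s
// Pull in each sampled node's relationships and neighbours, if any
OPTIONAL MATCH (s)-[r]-(m)
RETURN s, r, m
```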
Is there any efficient way to get all relationship types currently defined?
I know this works:
match ()-[r]-() RETURN DISTINCT TYPE(r)
But I guess this will consume significant time if the number of relationships is huge and there is no inherent indexing under the hood.
CALL db.relationshipTypes()
To learn more about what mysteries a graph holds see
How to get a high level inventory of objects in your graph
At https://neo4j.com/download/ you can download Neo4j Desktop, which installs a local copy of Neo4j server. Here is the online guide.
In there is a list of sample queries to help you learn about a database.
I am working on an application in which I have only 11,263 nodes in my neo4j database.
I am using following cypher query to form the relationships between the nodes:
let CreateRelations(fromToList : FromToCount list) =
    client.Cypher
        .Unwind(fromToList, "fromToList")
        .Match("(source)", "(target)")
        .Where("source.Id = fromToList.SId and target.Id = fromToList.FId")
        .Merge("(source)-[relation:Fights_With]->(target)")
        .OnCreate()
        .Set("relation.Count = fromToList.Count, relation.Date = fromToList.Date")
        .OnMatch()
        .Set("relation.Count = (relation.Count + fromToList.Count)")
        .Set("relation.Date = fromToList.Date")
        .ExecuteWithoutResults()
It takes 47 to 50 seconds to create, say, 1000 relationships in the neo4j database.
I am new to neo4j. Is there a more efficient way to do this?
The big thing slowing you down is that you're not using an index to lookup your starting nodes. Your match to source is performing a scan of all nodes in your db to find possible matches, per row in your unwound list. Then it does the same thing with target.
You need to add labels on your nodes, and if they already have labels, use the labels in your query. You'll need either an index or unique constraint on the label and id property so the index will be used for lookup.
Best way to go about tuning your queries is to try them out in the browser, and use EXPLAIN to ensure you're using index lookups, and if it's still slow, use PROFILE on the query (it will execute the query) to see the rows generated and db hits as the query executes.
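As a rough sketch of that advice in raw Cypher: the label :Fighter is a made-up placeholder (the question does not say which labels the nodes have), and the list is assumed to be passed as a parameter named fromToList. The index statement and the data statement need to run as separate queries.

```cypher
// Assumption: nodes carry a (hypothetical) label :Fighter.
// This is the Neo4j 3.x index syntax; newer versions use
//   CREATE INDEX FOR (n:Fighter) ON (n.Id)
CREATE INDEX ON :Fighter(Id);

// Same logic as the original query, but with label-qualified matches
// so the planner can use the index. Prefix with EXPLAIN to check that
// the plan shows NodeIndexSeek instead of AllNodesScan.
UNWIND $fromToList AS row
MATCH (source:Fighter {Id: row.SId})
MATCH (target:Fighter {Id: row.FId})
MERGE (source)-[relation:Fights_With]->(target)
ON CREATE SET relation.Count = row.Count, relation.Date = row.Date
ON MATCH SET relation.Count = relation.Count + row.Count, relation.Date = row.Date;
```

In the client code from the question, the equivalent change would presumably be to put the label into the Match patterns, e.g. "(source:Fighter)" and "(target:Fighter)".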
I've got my graph database, populated with nodes, relationships, properties etc. I'd like to see an overview of how the whole database is connected, each relationship to each node, properties of a node etc.
I don't mean view each individual node, but rather something like an ERD from a relational database, something like this, with the node labels. Is this possible?
You can use the metadata by running the command call db.schema().
In Neo4j v4, call db.schema() is deprecated; use call db.schema.visualization() instead.
As far as I know, there is no straightforward way to get a nicely pictured diagram of a neo4j database structure.
There is a pre-defined query in the neo4j browser which finds all node types and their relationships. However, it traverses the complete graph and may fail due to memory errors if you have too much data.
Also, there is neoprofiler. It's a tool which claims to do what you ask. I have never tried it, and it hasn't had many updates lately. Still worth a try: https://github.com/moxious/neoprofiler
Even though this is not a graphical representation, this query will give you an idea on what type of nodes are connected to other nodes with what type of relationship.
MATCH (n)
OPTIONAL MATCH (n)-[r]->(x)
WITH DISTINCT {l1: labels(n), r: type(r), l2: labels(x)}
AS `first degree connection`
RETURN `first degree connection`;
You could use this query to then unwind the labels to write that next cypher query dynamically (via a scripting language and using the REST API) and then paste that query back into the neo4j browser to get an example set of the data.
But this should be good enough to get an overview of your graph. Expand from here.
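If you would rather stay inside Cypher for the example set, something like the following might do. It is a sketch, not a tuned query: the limit of 3 examples per pattern is arbitrary, it still scans all relationships, and nodes without any relationship are not included.

```cypher
// Group relationships by (source labels, type, target labels)
// and keep up to 3 concrete examples of each pattern
MATCH (n)-[r]->(x)
WITH labels(n) AS l1, type(r) AS relType, labels(x) AS l2,
     collect({source: n, rel: r, target: x})[..3] AS examples
UNWIND examples AS e
RETURN e.source AS source, e.rel AS rel, e.target AS target
```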
I have a fullDB, (a graph clustered by Country) that contains ALL countries and I have various single country test DBs that contain exactly the same schema but only for one given country.
My query's "start" node is identified via a match on a given value for a property, e.g.
match (country:Country{name:"UK"})
and then proceeds to the main query defined by the variable country. So I am expecting the query times to be similar given that we are starting from the same known node and it will be traversing the same number of nodes related to it in both DBs.
But I am getting very different performance for my query depending on whether I run it in the full DB or in a single-country DB.
I immediately thought that I must have some kind of "Cartesian Relationship" issue going on so I profiled the query in the full DB and a single country DB but the profile is exactly the same for each step in the plan. I was assuming that the profile would reveal a marked increase in db hits at some point in the plan, but the values are the same. Am I mistaken in what profile is displaying?
Some sizing:
The fullDB has about 70k nodes and the test DB has 672 nodes. The query takes 218764ms to complete in the full DB and circa 3407ms in the test DB.
While writing this I realised that there will be an increase in the number of outgoing relationships on certain nodes (suppliers can supply different countries) which I think is probably the cause, but the question remains as to why I am not seeing any indication of this in the profiling.
Any thoughts welcome.
What version are you using?
Both query times are way too long for your dataset size.
So you might check your configuration / disk.
Did you create an index/constraint for :Country(name) and is that index online?
And please share your query and your query plans.
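To check the index question, statements along these lines should help. The syntax shown is for Neo4j 3.x/4.x and is an assumption about your version; newer releases use SHOW INDEXES and CREATE INDEX FOR ... ON ... instead. Run each statement separately.

```cypher
// List indexes and their state (look for ONLINE on :Country(name))
CALL db.indexes();

// Create the index if it is missing (Neo4j 3.x syntax)
CREATE INDEX ON :Country(name);

// Confirm the plan starts with an index seek rather than a label scan
EXPLAIN MATCH (country:Country {name: "UK"}) RETURN country;
```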
I know that neo4j stores data structured as graphs rather than in tables. In an RDBMS we have schemas for the tables, but in neo4j there are no tables; only nodes, relationships and properties are defined. So is there any concept of metadata in neo4j? Is any information about nodes and relationships stored in the database? If yes, what does it store in the metadata, and how? Also, where can we find the metadata-related information in the graph database (location)?
Thanks,
Neo4J doesn't directly store metadata in the way that you're looking for. The NeoProfiler tool was written precisely for this purpose. You can run it on a Neo4J database, and it will pull out as much information on labels, indexes, constraints, properties, nodes, and relationships as it can. The way that this works isn't too far off from the queries that @ulkas suggests in the other answer here; the output is just much better.
More broadly, in an RDBMS the schema information you pull out substantially constrains the database. The schema there is like a set of rules; you can't insert data unless it conforms to that schema. In Neo4J, because it's so flexible, even if there was a schema it would just be documentation of what's there, it would not be a set of constraints on what you can put in. At any time, you can insert new data that has nothing to do with the present schema (except that you can't violate things like uniqueness constraints).
If you want to see an equivalent schema for your database in neo4j, check out neoprofiler linked above. A few people out there have written about "metagraphs" - that is, they talk about representing a neo4j schema as a graph itself, where for example a node refers to a label. Relationships from that "label node" then go out to other kinds of label nodes, specifying what sorts of relationships can exist between nodes. For example, nodes labeled "Employee" may frequently have "works_for" relationships to nodes of label "Company".
No, direct metadata is not present. The most you can do is query all the structure types to get a small insight into what kind of graph is stored in the db.
START r=rel(*)
RETURN type(r), count(*)
START n=node(*)
RETURN labels(n), count(*)
The specific database files are stored in the folder data/graph.db, but apart from some index and key files they are binary and not easy to read.
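Newer Neo4j versions no longer accept the START clause, so the two counting queries above would need to be written with MATCH. A sketch of the equivalents (both still scan the whole graph):

```cypher
// Count relationships per type
MATCH ()-[r]->() RETURN type(r) AS relType, count(*) AS total;

// Count nodes per label combination
MATCH (n) RETURN labels(n) AS labelSet, count(*) AS total;
```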
Meanwhile there is the official APOC Library.
This includes functions like apoc.meta.graph, apoc.meta.schema and others.
The link above describes the installation. If you run into sandbox errors, check the answers in this question.