I found the following information on how to calculate the size of neo4j database: https://neo4j.com/developer/guide-sizing-and-hardware-calculator/#_disk_storage
An example disk space calculation is:
10,000 Nodes x 14B = 140kB 1,000,000 Rels x 33B = 31.5MB 2,010,000
Props x 41B = 78.6MB
Total is 110.2MB
Is there a query that could simply fetch this information for me?
For the node count the query is simple:
match (n) return count(n);
for the rel count the query would be as follows:
match (n)-[r]-() return count(r);
How do I get the count of all properties of all nodes and relations combined though?
Answering your main question, use the keys function to get a list of property names and summarize their length:
MATCH (n)
WITH SUM(SIZE(KEYS(n))) AS countOfNodeProps,
COUNT(n) AS countOfNodes
MATCH ()-[r]->()
WITH countOfNodeProps,
countOfNodes,
SUM(SIZE(KEYS(r))) AS countOfRelProps,
COUNT(r) AS countOfRels
RETURN countOfNodeProps,
countOfRelProps,
(countOfNodeProps + countOfRelProps) as countOfProps,
countOfNodes,
countOfRels
But it's easier to use the apoc.monitor.store function to get the exact information about the storage:
CALL apoc.monitor.store() YIELD
logSize,
stringStoreSize,
arrayStoreSize,
relStoreSize,
propStoreSize,
totalStoreSize,
nodeStoreSize
RETURN *
Related
How can I know how many nodes and edges are involved in a MATCH? Is there another way besides Explain / Profile Match?
If you mean how many nodes are matched in a path, such as a variable-length path, then you can assign a path variable for this:
MATCH p = (k:Person {name:'Keanu Reeves'})-[*..8]-(t:Person {name:'Tom Hanks'})
WITH p LIMIT 1
RETURN p, length(p) as pathLength, length(p) + 1 as numberOfNodesInPath
You can also use nodes(p) and relationships(p) to get the collection of nodes and relationships that make up the path, and you can use size() on those collections to get their size.
There exists the COUNT() function of Cypher that allows you to count the number of elements. As for example in this query:
MATCH (n)
RETURN COUNT(n);
This query will count all nodes in your database.
You can find more information in the cypher manual, under the aggregating functions. Check it out.
The following Cypher snippet should return the number of distinct nodes and relationships found by any given MATCH clause. Just replace <your code here> with your MATCH pattern.
MATCH <your code here>
WITH COLLECT(NODES(p)) AS ns, SUM(SIZE(RELATIONSHIPS(p))) AS relCount
UNWIND ns AS nodeList
UNWIND nodeList AS node
RETURN COUNT(DISTINCT node) AS nodeCount, relCount;
How can I get the percentage of nodes by labes in Neo4j?
It should be something like this?:
MATCH (n)
WITH COUNT(*) As total
MATCH (n)
WHERE NOT (n)--()
WITH DISTINCT count(labels(n)) as c, labels(n) as l
RETURN (c/total)*100, l;
Thanks in advance.
The fastest way to get this info is via the count store, and APOC Procedures has the easiest means to get access to this all at once:
CALL apoc.meta.stats() YIELD nodeCount, labels
WITH toFloat(nodeCount) as nodeCount, labels
UNWIND keys(labels) as label
RETURN label, labels[label] as count, round(labels[label] / nodeCount * 100) as percentage
Keep in mind that since nodes can be multi-labeled, your percentages will likely exceed 100%.
I'm starting with Neo4j and using graphs, and I'm trying to get the following:
I have to find the subtraction(difference) between the number of users (each user is a node) and the number of differents names they have. I have 16 nodes, and each one has his own name (name is one of the properties it has), but some of them have the same name (for example the node A has (Name:Amanda,City:Roma) and node B has (Name:Amanda, City:Paris), so I will have less name's count because some of them are repeated.
I have tried this:
match (n) with n, count(n) as c return sum(c)
That gives me the number of nodes. And then I tried this
match (n) with n, count(n) as nodeC with n, count( distinct n.Name) as
nameC return sum(nodeC) as sumN, sum(nameC) as sumC, sumN-sumC
But it doesn't work (I'm not sure if even i'm getting the names well, because when I try it, separated, it doesn't work neither).
I think this is what you are looking for:
MATCH (n)
RETURN COUNT(n) - COUNT(DISTINCT n.name) AS diff;
My query is:
MATCH (n)-[:NT]->(p)
WHERE ...some properties filters...
RETURN n,p
The result is on the screenshot below.
How to count the total nodes?
I need 14 as a text result. Something like RETURN COUNT(n)+COUNT(p) but it shows 24.
The following request doesn't work correctly:
MATCH (n)-[:NT]->(p)
WHERE ...some properties filters...
RETURN count(n)
Returns me 12, which is the number of relationships pairs as on the picture, not nodes.
MATCH (n)-[:NT]-(p)
WHERE ...some properties filters...
RETURN count(n)
Returns 24.
How to count toward that two nodes (in this example) that have outgoing ONLY arrows? Should be 14 at once.
UPD:
MATCH (n)-[:NT]->(p)
WHERE ...
RETURN DISTINCT FILTER(x in n.myID WHERE NOT x in p.myID)
MATCH (n)-[:NT]->(p)
WHERE ...
RETURN DISTINCT FILTER(x in p.myID WHERE NOT x in n.myID)
The COUNT of DISTINCT UNION of myID gives me the result.
I don't know how to make it with cypher.
Or the DISTINCT UNION of collections:
MATCH (n)-[:NT]->(p)
WHERE ...
RETURN collect(DISTINCT p.myID), collect(DISTINCT n.myID)
The result is:
collect(DISTINCT p.myID)
26375, 26400, 21636, 29939, 20454, 26543, 19089, 4483, 26607, 30375, 26608, 26605
collect(DISTINCT n.myID)
11977, 19478, 20454
Which is 15 items. One is common. If you UNION or DISTINCT the 20454 the total COUNT would be 14. The actual number of nodes on the picture.
I can not achieve this simple pattern.
Your original queries are working correctly.
If you want to get a count of distinct n nodes, your queries should RETURN COUNT(DISTINCT n).
To count the number of nodes that only have outgoing relationships:
MATCH (n)-->()
WHERE NOT ()-->(n)
COUNT(DISTINCT n);
To count the number of distinct nodes that are directly involved in an :NT relationship:
MATCH (n)-[:NT]-()
COUNT(DISTINCT n);
MATCH (n)-[:NT]->(p)
WHERE ...some properties filters...
WITH collect(DISTINCT p.myID) AS set1
MATCH (n)-[:NT]->(p)
WHERE ...some properties filters...
WITH collect(DISTINCT n.myID) AS set2, set1
WITH set1 + set2 AS BOTH
UNWIND BOTH AS res
RETURN COUNT(DISTINCT res);
Is it possible to extract in a single cypher query a limited set of nodes and the total number of nodes?
match (n:Molecule) with n, count(*) as nb limit 10 return {N: nb, nodes: collect(n)}
The above query properly returns the nodes, but returns 1 as number of nodes. I certainly understand why it returns 1, since there is no grouping, but can't figure out how to correct it.
The following query returns the counter for the entire number of rows (which I guess is what was needed). Then it matches again and limits your search, but the original counter is still available since it is carried through via the WITH-statement.
MATCH
(n:Molecule)
WITH
count(*) AS cnt
MATCH
(n:Molecule)
WITH
n, cnt LIMIT 10
RETURN
{ N: cnt, nodes:collect(n) } AS molecules
Here is an alternate solution:
match (n:Molecule) return {nodes: collect(n)[0..5], n: length(collect(n))}
84 ms for 30k nodes, shorter but not as efficient as the above one proposed by wassgren.