Neo4j Update node with aggregated count - neo4j

I'm trying to take a count and store it as a property on a node. For example, the following query:
MATCH (n:node)-[]->() return n, count(*)
returns the node alongside each of the counts. I would expect to be able to do something like this:
MATCH (n:node)-[]->() set n.relationCount = count(*)
However executing the above returns an error:
Aggregations should not be used like this.

I looked through similar questions and, although I didn't find this exact use of aggregates when setting data, I managed to adapt an example into this:
MATCH (n:node)-[]->()
WITH n, count(*) as c
SET n.data = c
Which appears to work!
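This works because the aggregation happens in a WITH clause first, and SET then only sees the resulting value. Note that the pattern only matches nodes with at least one outgoing relationship, so nodes without any are left untouched. As a hedged sketch (not part of the original answer), an OPTIONAL MATCH keeps those nodes and stores 0 for them; the property name relationCount is taken from the question:

```cypher
// Sketch: store the outgoing-relationship count on every :node,
// including nodes that have no outgoing relationships (they get 0).
MATCH (n:node)
OPTIONAL MATCH (n)-[r]->()
WITH n, count(r) AS c   // count(r) skips the nulls produced by OPTIONAL MATCH
SET n.relationCount = c
RETURN n, c
```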

Related

WHERE condition in neo4j | Filtering by relationship property

How does the WHERE condition in Neo4j work?
I have a simple data set with the following relationship:
Client -[CONTAINS {created:"yesterday or today"}]-> Transaction -[INCLUDES]-> Item
I would like to filter this to get the items for transactions that were created yesterday, and I use the following query:
Match
(c:Client) -[r:CONTAINS]-> (t:Transaction),
(t) -[:INCLUDES]-> (i:Item)
where r.created="yesterday"
return c,t,i
But it still returns the data set without filtering. What is wrong? And how does filtering work in Neo4j across multiple MATCH statements, say when I want to run my query on the filtered data set from previous steps?
Thank you very much in advance.
Your query seems fine to me. However, there are two things I would like to point out:
The WHERE clause can be removed by matching on the property inside the pattern instead.
The two MATCH patterns can be combined into one.
So the query would be:
MATCH (c:Client) -[r:CONTAINS {created: "yesterday"}]-> (t:Transaction) -[:INCLUDES]-> (i:Item)
RETURN c, t, i
Regarding your second question: when you want to run another query on the filtered data set from the previous step, use the WITH clause. Instead of returning the result, WITH pipes it to the next part of the query.
For example, with your query, we can do something like this to order the result by client name and return only the client:
MATCH (c:Client) -[r:CONTAINS {created: "yesterday"}]-> (t:Transaction) -[:INCLUDES]-> (i:Item)
WITH c, t, i
ORDER BY c.name DESC
RETURN c
There does not seem to be anything wrong with the Cypher statement.
Subsequent MATCH statements can be applied with the WITH clause, which is documented here: https://neo4j.com/docs/cypher-manual/current/clauses/with/
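As a minimal sketch of running a second MATCH on an already filtered result, the question's query could be split into two steps. The labels and property are those from the question; the split itself is only an illustration, since the single-pattern form above returns the same rows:

```cypher
// Step 1: keep only the client/transaction pairs created "yesterday".
MATCH (c:Client)-[r:CONTAINS]->(t:Transaction)
WHERE r.created = "yesterday"
// Step 2: carry the filtered rows forward and expand to the items.
WITH c, t
MATCH (t)-[:INCLUDES]->(i:Item)
RETURN c, t, i
```

Only variables listed in WITH remain visible afterwards, so r is no longer available in the second MATCH.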

can we use datetime filter in MATCH clause instead of WHERE clause on a node

I have some sample tweets stored in Neo4j. The query below finds the top hashtags for a specific country. It takes a lot of time because the time filter for status nodes is in the WHERE clause, which slows the response. Is it possible to move this filter into the MATCH clause so that status nodes are filtered before relationships are traversed?
match (c:country{countryCode:"PK"})-[*0..4]->(s:status)-[*0..1]->(h:hashtag)
where s.createdAt >= datetime('2017-06-01T00:00:00') AND s.createdAt <= datetime('2017-06-01T23:59:59')
return h.name, count(h.name) as hCount order by hCount desc limit 100
thanks
As mentioned in my comment, whether a predicate for a property is in the MATCH clause or the WHERE clause shouldn't matter, as this is just syntactical sugar and is interpreted the same way by the query planner.
You can use PROFILE or EXPLAIN to inspect the query plan. PROFILE gives you more information but has to actually execute the query. You can also try planner hints to force the planner to plan the match differently, which may yield a better approach.
You will want to ensure you have an index on :status(createdAt).
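A hedged sketch of both suggestions follows. The index name status_created_at is made up for illustration, and the CREATE INDEX ... FOR form assumes Neo4j 4.x or later; on 3.x the equivalent is CREATE INDEX ON :status(createdAt). The two statements are meant to be run separately:

```cypher
// Statement 1: index the property used in the time filter
// (index name is hypothetical; syntax assumes Neo4j 4.x+).
CREATE INDEX status_created_at FOR (s:status) ON (s.createdAt);

// Statement 2: prefix a query with PROFILE to see the executed plan
// and check whether the index is used for the range predicate.
PROFILE
MATCH (s:status)
WHERE s.createdAt >= datetime('2017-06-01T00:00:00')
  AND s.createdAt <= datetime('2017-06-01T23:59:59')
RETURN count(s);
```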
You can also try altering your match a little, and moving the portion connecting to the country in question into your WHERE clause instead. Also it's a good idea to get the count based upon the hashtag node itself (assuming there's only one :hashtag node for a given name) so you can order and limit before you do property access:
MATCH (s:status)-[*0..1]->(h:hashtag)
WHERE s.createdAt >= datetime('2017-06-01T00:00:00')
AND s.createdAt <= datetime('2017-06-01T23:59:59')
AND (:country{countryCode:"PK"})-[*0..4]->(s)
WITH h, count(h) as hCount
ORDER BY hCount DESC
LIMIT 100
RETURN h.name, hCount

cql doesnt pass node in WITH statement

I'm using the following query to count the number of created users and to create a user if no user with that id exists:
MERGE (uc:UserCounter)
ON CREATE SET uc.count = 0
WITH uc
MATCH (u:User{id:X})
WITH uc, count(u) as counts
MERGE (u:User{id:X})
ON CREATE SET uc.count = uc.count+1, u.id = uc.count, u.creation_ts = TIMESTAMP()
RETURN counts
I'm also returning counts to see whether the user existed before or not. This query gives me back (no rows). After some debugging, I came to the conclusion that the uc node is not being passed through to the end. What am I missing?
This looks like the same problem as this question: if the user doesn't exist yet, the MATCH will not return any row despite the count() aggregation. You'll need an OPTIONAL MATCH for it to work instead.
Your query seems off though: why would you match/merge on id X, then overwrite it on creation with the current count? It's also dubious it would work correctly when executed concurrently.
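A minimal sketch of the OPTIONAL MATCH variant, under some assumptions: the id is passed as a parameter named $id (the question only shows a placeholder X), the variable name existing is made up, and the overwrite of u.id questioned above is left out. It does not address the concurrency concern:

```cypher
MERGE (uc:UserCounter)
ON CREATE SET uc.count = 0
WITH uc
// OPTIONAL MATCH keeps the row (with existing = null) when no user is found,
// so the aggregation below still produces one row with counts = 0.
OPTIONAL MATCH (existing:User {id: $id})
WITH uc, count(existing) AS counts
MERGE (u:User {id: $id})
ON CREATE SET uc.count = uc.count + 1, u.creation_ts = timestamp()
RETURN counts
```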

How to update Nodes within a random manner in Neo4j

How can I update a random set of nodes in Neo4j? I tried the following:
match (Firstgraph)
with id(Firstgraph) as Id
return Firstgraph.name, Firstgraph.version,id(Firstgraph)
order by rand();
match (G1:FirstGraph)
where id(G1)=Id
set G1.Version=5
My idea is to get a random set and then update it, but I got the error:
Expected exactly one statement per query but got: 2
Thanks for your help.
Let's look at the problem. First of all, your error:
Expected exactly one statement per query but got: 2
This comes from your query: it contains two statements in a single request, which is why you get this error.
match (Firstgraph) with id(Firstgraph) as Id
return Firstgraph.name, Firstgraph.version,id(Firstgraph) order by
rand(); match (G1:FirstGraph) where id(G1)=Id set G1.Version=5
This does not work because ; is the end-of-statement marker, so everything after it is treated as a second statement. UNION would not help here either: it only combines the results of two queries that return the same columns, and it cannot hand Id over to the update. Instead, keep everything in one statement and pass the randomly ordered nodes on to SET using WITH (the LIMIT of 10 is only an example value):
MATCH (g:FirstGraph)
WITH g ORDER BY rand() LIMIT 10
SET g.Version = 5
RETURN g.name, g.Version
Also, if you want to match a random set of nodes, you can simply do this (in this example each node has a 50% chance of being selected):
Match (node) Where rand() > 0.5 return node
And then do whatever you want with the node using WITH
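Put together with the update from the question, this could look like the sketch below. The label FirstGraph and the property Version are taken from the question; the WITH is shown only to illustrate the hand-over, since SET can also follow the WHERE directly:

```cypher
// Sketch: each :FirstGraph node has roughly a 50% chance of being updated.
MATCH (g:FirstGraph)
WHERE rand() > 0.5
WITH g
SET g.Version = 5
RETURN g.name, g.Version
```

The number of updated nodes varies from run to run with this approach.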

Neo4j 1.9.RC2, cypher sorting and ranking

I am using Neo4j 1.9.RC2 and testing ORDER BY together with WITH.
What I want to do is generate a dynamic ranking and store the current sort index in each sorted node.
I have something like: parent-[r:has_child]->rank_node
I would like to do something like:
start n=node(1)
match n-[r:has_child]->rank_node
with rank_node
order by rank_node.score
set rank_node.position = "CURRENT ORDER BY INDEX"
I would like to have a counter that increments from 0 to "n", but I can't manage to do that.
Here CURRENT ORDER BY INDEX stands for the current index of each node returned by ORDER BY.
I don't know if it is possible to do that with Cypher. It would be very useful, because we could do a big sort and store the position directly in the node to read it back later.
Talked to Michael Hunger and we solved it like this:
start n=node(0)
match n-[r:rank]->rank_node
with rank_node, n
match n-[r:rank]->rn
where rn.score <= rank_node.score
with rank_node,count(*) as pos
set rank_node.rank = pos
return rank_node;
For live example see: http://console.neo4j.org/?id=d07p7r
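The query above ranks each node by counting the nodes whose score is less than or equal to its own, so ranks start at 1 and tied scores share a rank. As a hedged alternative sketch for newer Neo4j versions that support UNWIND (so not 1.9), the sorted nodes can be collected into a list and the list index used as a 0-based position. The variable names ranked, idx and rn are made up, and the parent is left unrestricted here:

```cypher
// Sketch for newer Cypher versions: store the 0-based sort index on each node.
MATCH (parent)-[:has_child]->(rank_node)   // restrict parent as needed
WITH rank_node ORDER BY rank_node.score
WITH collect(rank_node) AS ranked          // list keeps the sorted order
UNWIND range(0, size(ranked) - 1) AS idx
WITH ranked[idx] AS rn, idx
SET rn.position = idx
RETURN rn, idx
```

Unlike the count-based query, this gives every node a distinct position, even when scores are equal.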
MATCH (a:person)
OPTIONAL MATCH ()-[r:knows|knowsyy]->(a)
RETURN COUNT(*) AS rank, a.mobno // rank with two direction
Here person is the label, and knows and knowsyy are the relationship types.
MATCH (n:person)-[r:knows]->(a:phonbook)
RETURN COUNT(*) AS rank, n.mobno, r.name ORDER BY n.mobno DESC // rank with relation
