How to update nodes in a random manner in Neo4j

How can I update a random set of nodes in Neo4j? I tried the following:
match (Firstgraph)
with id(Firstgraph) as Id
return Firstgraph.name, Firstgraph.version,id(Firstgraph)
order by rand();
match (G1:FirstGraph)
where id(G1)=Id
set G1.Version=5
My idea is to get a random set and then update it, but I got the error:
Expected exactly one statement per query but got: 2
Thanks for your help.

Let's look at what the problem is. First of all, your error:
Expected exactly one statement per query but got: 2
This comes from your query. If we check it, we see that it contains two statements in a single submission, which is why you get this error:
match (Firstgraph) with id(Firstgraph) as Id
return Firstgraph.name, Firstgraph.version,id(Firstgraph) order by
rand(); match (G1:FirstGraph) where id(G1)=Id set G1.Version=5
This query cannot work because ; is the statement terminator, so nothing may follow it in the same query. UNION does not help here either: both parts of a UNION must return the same columns, and a variable such as Id is not shared between the parts. Instead, chain the clauses in a single statement and keep the node itself in scope with WITH (the LIMIT of 10 is an arbitrary example value):
MATCH (G1:FirstGraph)
WITH G1 ORDER BY rand() LIMIT 10
SET G1.Version = 5
RETURN G1.name, G1.Version, id(G1)
Also, if you want to match a random set of nodes, you can simply do this (in this example each node has a 50% chance of being selected):
MATCH (node) WHERE rand() > 0.5 RETURN node
You can then do whatever you want with the selected nodes by carrying them forward with WITH instead of returning them.
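To make the difference between the two random-selection styles concrete, here is a minimal Python sketch that models them outside the database. The node ids, the seed, and the sample size of 100 are assumptions chosen for illustration; nothing here talks to Neo4j.

```python
import random

# Hypothetical stand-ins for node ids; a real graph would supply these.
node_ids = list(range(1000))

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Analogue of WHERE rand() > 0.5: each node is kept independently,
# so the size of the selection is not fixed in advance.
bernoulli_pick = [n for n in node_ids if rng.random() > 0.5]

# Analogue of ordering randomly and keeping a fixed number of rows.
fixed_pick = rng.sample(node_ids, 100)

# "Update" the picked nodes, mirroring SET node.Version = 5.
version = {n: None for n in node_ids}
for n in fixed_pick:
    version[n] = 5

updated = [n for n in node_ids if version[n] == 5]
print(len(bernoulli_pick), len(fixed_pick), len(updated))
```

The per-node filter gives roughly half of the nodes but never an exact count, while the shuffle-and-limit style gives an exact count at the cost of ordering every candidate first.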

Related

Unexpected relation and node COUNT in Neo4j

Whenever I use a query to get the count of a specific node, I always get a number greater than 1, even though only one distinct node of that type exists.
Sample query:
MATCH (p)-[rel]->(v:myDistinctNode) RETURN COUNT(v)
Output: 80
MATCH (p)-[rel]->(v:myDistinctNode) RETURN COUNT(DISTINCT v)
Output: 1
I see different results when using DISTINCT, but I cannot use DISTINCT all the time. Why am I seeing this, and how can I avoid it? Thanks!
Neo4j Kernel-Version: 3.5.14
The short answer is that you can use collect() to get a result of one:
MATCH (p)-[rel]->(v:myDistinctNode) WITH collect(v) AS nodes RETURN count(nodes)
This returns one, but only because collect() folds all matched rows into a single list and count() then counts that single row; it would return one no matter how many distinct nodes matched. To count distinct nodes, COUNT(DISTINCT v) remains the right tool.
I'm not a Cypher expert, but I believe the reason the plain COUNT(v) is higher is that the Cypher result is like a table with one column each for p, rel, and v, and one row per matched path. Even though v is a single node, it appears in 80 of those rows.
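The row-based view of a Cypher result can be modelled in a few lines of Python. This is only a sketch: the 80 rows and the identifiers p0..p79 and v0 are hypothetical data shaped to match the counts in the question.

```python
# Hypothetical data: 80 distinct p nodes, each with one relationship to
# the same single v node, mirroring the counts reported in the question.
rows = [{"p": f"p{i}", "rel": f"rel{i}", "v": "v0"} for i in range(80)]

# COUNT(v): counts every row in which v is not null.
count_v = sum(1 for row in rows if row["v"] is not None)

# COUNT(DISTINCT v): counts unique v values across the rows.
count_distinct_v = len({row["v"] for row in rows if row["v"] is not None})

# WITH collect(v) AS nodes RETURN count(nodes): collect() folds all rows
# into one row holding a list, and count() then sees a single row.
collected = [[row["v"] for row in rows]]
count_collected = len(collected)

print(count_v, count_distinct_v, count_collected)  # 80 1 1
```

Note that the collected list still holds 80 entries; only the number of result rows has dropped to one.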

Neo4j count Query

match(m:master_node:Application)-[r]-(k:master_node:Server)-[r1]-(n:master_node)
where (m.name contains '' and (n:master_node:DeploymentUnit or n:master_node:Schema))
return distinct m.name,n.name
Hi, I am trying to get the total number of records for the above query. How do I change the query to use the count function and get the record count directly?
Thanks in advance
The following query uses the aggregating function COUNT. Distinct pairs of m.name, n.name values are used as the "grouping keys".
MATCH (m:master_node:Application)--(:master_node:Server)--(n:master_node)
WHERE EXISTS(m.name) AND (n:DeploymentUnit OR n:Schema)
RETURN m.name, n.name, COUNT(*) AS cnt
I assume that m.name contains '' in your query was an attempt to test for the existence of m.name. This query uses the EXISTS() function to test that more efficiently.
[UPDATE]
To determine the number of distinct n and m pairs in the DB (instead of the number of times each pair appears in the DB):
MATCH (m:master_node:Application)--(:master_node:Server)--(n:master_node)
WHERE EXISTS(m.name) AND (n:DeploymentUnit OR n:Schema)
WITH DISTINCT m.name AS n1, n.name AS n2
RETURN COUNT(*) AS cnt
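The difference between counting per grouping key and counting distinct pairs can be sketched in plain Python. The rows below are hypothetical (m.name, n.name) values, one per matched path, invented purely for illustration.

```python
from collections import Counter

# Hypothetical (m.name, n.name) rows, one per matched path.
rows = [
    ("app1", "unitA"),
    ("app1", "unitA"),
    ("app1", "schemaB"),
    ("app2", "unitA"),
]

# RETURN m.name, n.name, COUNT(*): the non-aggregated columns act as
# grouping keys, so each distinct pair gets its own count.
per_pair = Counter(rows)

# WITH DISTINCT m.name AS n1, n.name AS n2 RETURN COUNT(*):
# deduplicate first, then count the remaining rows.
distinct_pairs = len(set(rows))

print(dict(per_pair))
print(distinct_pairs)  # 3
```

The first form answers "how often does each pair occur", the second "how many different pairs are there".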
Some things to consider for speeding up the query even further:
Remove unnecessary label tests from the MATCH pattern. For example, can we omit the master_node label test from any nodes? In fact, can we omit all label testing for any nodes without affecting the validity of the result? (You will likely need a label on at least one node, though, to avoid scanning all nodes when kicking off the query.)
Can you add a direction to each relationship (to avoid having to traverse relationships in both directions)?
Specify the relationship types in the MATCH pattern. This will filter out unwanted paths earlier. Once you do so, you may also be able to remove some node labels from the pattern as long as you can still get the same result.
Use the PROFILE clause to evaluate the number of DB hits needed by different Cypher queries.
You can find examples of how to use count in the Neo4j docs.
In your case, the first example there should work, where
count(*)
is used to return a count of each returned item.

Including vars in Neo4j WITH statement changes query output

I'm trying to find the number of nodes of a certain kind in my database that are connected to more than one other node of another kind. In my case, it's place nodes connected to several name nodes. I have a query that works:
MATCH rels=(p:Place)-[c:Called]->(n:Name)
WITH p,count(n) as counts
WHERE counts > 1
RETURN p;
However, that only returns the place nodes, and ideally I'd like it to return all the nodes and edges involved. I've found a question on returning variables from before the WITH, but if I include any of the other variables I've defined, the query returns no responses, i.e. this query returns nothing:
MATCH rels=(p:Place)-[c:Called]->(n:Name)
WITH p, count(n) as counts, rels
WHERE counts > 1
RETURN p;
I don't know how to return the information that I want without changing the results of the query. Any help would be much appreciated.
Your second query returns nothing because its WITH clause uses both p and rels as aggregation "grouping keys". Since each rels path contains only a single n value, counts is always 1.
Something like this might work for you:
MATCH path=(p:Place)-[:Called]->(:Name)
WITH p, COLLECT(path) as paths
WHERE SIZE(paths) > 1
RETURN p, paths;
This returns each matching Place node and all its paths.
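The effect of adding a variable to the grouping keys can be reproduced with a small Python model. The place and name values are hypothetical, and group_counts is a helper made up for this sketch, not a Neo4j function.

```python
from collections import defaultdict

# Hypothetical matched paths as (place, name) tuples. Each tuple stands
# for one rels path in the question's query.
paths = [("place1", "nameA"), ("place1", "nameB"), ("place2", "nameC")]

def group_counts(rows, key):
    """Count rows per grouping key, like WITH <key>, count(n)."""
    counts = defaultdict(int)
    for row in rows:
        counts[key(row)] += 1
    return dict(counts)

# WITH p, count(n): grouped by place only.
by_place = group_counts(paths, key=lambda row: row[0])

# WITH p, count(n), rels: the path joins the grouping key, so every
# group holds exactly one row and the count can never exceed 1.
by_place_and_path = group_counts(paths, key=lambda row: (row[0], row))

# WITH p, COLLECT(path) AS paths WHERE SIZE(paths) > 1
collected = defaultdict(list)
for row in paths:
    collected[row[0]].append(row)
multi = {p: ps for p, ps in collected.items() if len(ps) > 1}

print(by_place)
print(max(by_place_and_path.values()))  # 1
print(multi)
```

Collecting the paths keeps the grouping key as p alone while still carrying the path data forward, which is why the COLLECT-based query returns the expected places.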
Try this:
MATCH (p:Place)-[c:Called]->(n:Name)
WHERE size((p)-[:Called]->(:Name)) > 1
WITH p,count(n) as counts, collect(n) AS names, collect(c) AS calls
RETURN p, names, calls, counts ORDER BY counts DESC;
This query makes use of Cypher's collect() function to create lists of the names and Called relationships for each place that has more than one Called relationship with a Name node.

How to query for multiple OR'ed Neo4j paths?

Does anyone know of a fast way to query multiple paths in Neo4j?
Let's say I have movie nodes that can have a type that I want to match (this is pseudo-code):
MATCH
(m:Movie)<-[:TYPE]-(g:Genre { name:'action' })
OR
(m:Movie)<-[:TYPE]-(x:Genre)<-[:G_TYPE*1..3]-(g:Genre { name:'action' })
(m)-[:SUBGENRE]->(sg:SubGenre {name: 'comedy'})
OR
(m)-[:SUBGENRE]->(x)<-[:SUB_TYPE*1..3]-(sg:SubGenre {name: 'comedy'})
The problem is that the first "m:Movie" nodes to be matched must match one of the paths specified, and the second SubGenre match is dependent on the first match.
I can make a query that works using MATCH and WHERE, but it's really slow (30 seconds with a small 20MB dataset).
I don't know how to OR match in Neo4j with other OR matches hanging off of the first results.
If I use WHERE, then I have to declare all the nodes used in any of the statements in the initial MATCH, which makes the query slow (since you cannot introduce new nodes in a WHERE).
Does anyone know an elegant way to solve this? Thanks!
You can try a variable length path with a minimal length of 0:
MATCH
(m:Movie)<-[:TYPE|:SUBGENRE*0..4]-(g)
WHERE (g:Genre AND g.name = 'action') OR (g:SubGenre AND g.name = 'comedy')
RETURN DISTINCT m
For the query to use an index to find your genre / subgenre I recommend a UNION query though.
MATCH
(m:Movie)<-[:TYPE*0..4]-(g:Genre { name:'action' })
RETURN distinct m
UNION
MATCH
(m:Movie)-[:SUBGENRE]->(x)<-[:SUB_TYPE*1..3]-(sg:SubGenre {name: 'comedy'})
RETURN distinct m
Perhaps the OPTIONAL MATCH clause might help here. OPTIONAL MATCH behaves like MATCH, except that when its pattern finds no match, the row is kept and any variables the pattern would have introduced are bound to null instead of the row being dropped.
For example, to match on a movie, its genre and a possible sub-genre:
MATCH (m:Movie)-[:IS_GENRE]->(g:Genre)
WHERE m.title = "The Matrix"
OPTIONAL MATCH (g)<-[:IS_SUBGENRE]-(sub:Genre)
RETURN m, g, sub
This will return the movie node, the genre node and, if it exists, the sub-genre. If there is no sub-genre then it will return null for sub. You can use variable length paths as you have above as well with OPTIONAL MATCH.
[EDITED]
The following MATCH clause should be equivalent to your pseudocode. There is also a USING INDEX clause that assumes you have first created an index on :SubGenre(name), for efficiency. (You could use an index on :Genre(name) instead, if Genre nodes are more numerous than SubGenre nodes.)
MATCH
(m:Movie)<-[:TYPE*0..4]-(g:Genre { name:'action' }),
(m)-[:SUBGENRE]->()<-[:SUB_TYPE*0..3]-(sg:SubGenre { name: 'comedy' })
USING INDEX sg:SubGenre(name)
Here is a console that shows the results for some sample data.

Multiple match with where clause in Neo4j Cypher gives error "Cannot match on a pattern containing only already bound identifiers"

We are using neo4j-community-2.1.2. Right now we have only 3 nodes with the Job label in the database, and we have schema indexes on all fields used in this query. Total DB hits are approximately 40.
The query is:
PROFILE match (job1:Job) where (job1.jobType="Adhoc" or job1.jobType="Virtual") AND (job1.mode="Free" or job1.mode="Paid") with collect(job1) as jobs1
match (job2:Job)-[REQUIRED_SKILL]-(skill:Skill) where skill.name="Neo4j" and (job2 in jobs1) with collect(job2) as jobs2
match (job3:Job)-[REQUIRED_SKILL]-(skill:Skill) where skill.name="Java" and (job3 IN jobs2) with collect(job3) as jobs3 return jobs3
So we tried something like this:
match (job1:Job) where (job1.jobType="Adhoc" or job1.jobType="Virtual")
match (job1) where (job1.mode="Free" or job1.mode="Paid") with collect(job1) as jobs1 return jobs1
The idea is that the result of the first MATCH feeds the next one, so the second filter has fewer nodes to check. But we get this exception:
Cannot match on a pattern containing only already bound identifiers (line 2, column 1)
"match (job1) where (job1.mode="Free" or job1.mode="Paid") with collect(job1) as jobs1 return jobs1"
How can we optimize this query?
You cannot MATCH job1 a second time on its own. Once it is matched, you can keep using the same variable (carrying it forward with WITH if needed), or, as in this case, filter on both conditions at once using AND. Your query also becomes simpler if you replace the OR chains with IN membership tests, like this:
match (job1:Job)
where job1.jobType in ["Adhoc", "Virtual"]
and job1.mode in ["Free", "Paid"]
return collect(job1) as jobs1
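As a quick check that the IN form selects the same nodes as the OR chains, the two filters can be compared in plain Python. The three Job records are hypothetical, and the values "Trial" and "Permanent" are invented so that some records fail the filter.

```python
# Hypothetical Job records; property names follow the question.
jobs = [
    {"id": 1, "jobType": "Adhoc", "mode": "Free"},
    {"id": 2, "jobType": "Virtual", "mode": "Trial"},
    {"id": 3, "jobType": "Permanent", "mode": "Paid"},
]

# OR-chain form, as in the original query.
with_or = [
    j for j in jobs
    if (j["jobType"] == "Adhoc" or j["jobType"] == "Virtual")
    and (j["mode"] == "Free" or j["mode"] == "Paid")
]

# IN form: one membership test per property, both applied in one pass.
with_in = [
    j for j in jobs
    if j["jobType"] in ("Adhoc", "Virtual") and j["mode"] in ("Free", "Paid")
]

print([j["id"] for j in with_in])  # [1]
```

Both conditions are evaluated against each node in a single pass, so there is no need for a second MATCH to narrow the set.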

Resources