PatternExpressions are not allowed to introduce new variables


I have two types of relationship that can exist between the same two nodes. I want to extract the nodes that have only a type1 relationship and no type2 relationship. My query is:
Match (n) where (n)-[:type1]-(m) and (not (n)-[:type2]-(m)) return n
This gives the error:
PatternExpressions are not allowed to introduce new variables: 'm'. (line 1, column 32 (offset: 31))
"Match (n) where (n)-[:type1]-(m) and (not (n)-[:type2]-(m)) return n"
^
Neither Googling around nor the documentation (Patterns - Neo4j Cypher Manual) gave me any useful help. Do you know why this happens?

How about this.
match(n)-[:type1]-(m)
where not (n)-[:type2]-(m)
return n
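The error arises because a pattern inside WHERE is only a predicate: it can test variables that are already bound, but it cannot bind a new one such as m. A minimal sketch of two ways around this follows. The DISTINCT is an addition (n would otherwise be returned once per matching m), and the second variant assumes Neo4j 4.0 or later, where existential subqueries are available.

```cypher
// Variant 1: bind m in MATCH, then filter with a pattern predicate.
MATCH (n)-[:type1]-(m)
WHERE NOT (n)-[:type2]-(m)
RETURN DISTINCT n;

// Variant 2 (assumes Neo4j 4.0+): an EXISTS subquery may introduce m locally.
MATCH (n)
WHERE EXISTS {
  MATCH (n)-[:type1]-(m)
  WHERE NOT (n)-[:type2]-(m)
}
RETURN n;
```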

Related

Adding New Relationships in Neo4j Database using apoc.periodic.iterate

I have a Neo4j database with two kinds of nodes - Authors and Articles. Some of the articles have more than one author. I am trying to create an undirected relationship between the authors who worked together on an article. My current non-functional query is this:
CALL apoc.periodic.iterate(
"MATCH (a:Author) WHERE (a)-[:WROTE]->()<-[:WROTE]-(b:Author) RETURN a,b",
"WITH {a} AS a, {b} as b CREATE (a)-[r:COAUTHOR]-(b)", {batchSize:10000, parallel:true})
I get the following error:
Failed to invoke procedure apoc.periodic.iterate: Caused by: org.neo4j.exceptions.SyntaxException: PatternExpressions are not allowed to introduce new variables: 'b'. (line 1, column 60 (offset: 59))
"EXPLAIN MATCH (a:Author) WHERE (a)-[:WROTE]->()<-[:WROTE]-(b:Author) RETURN a,b"
I can see that the issue is that I am trying to do too much in the first MATCH statement, but I'm new to Cypher and am having trouble breaking it up.
Thanks very much,
John

The problem stems from your first statement:
MATCH (a:Author)
WHERE (a)-[:WROTE]->()<-[:WROTE]-(b:Author)
RETURN a,b
Cypher does not allow new variables to be introduced in the WHERE part of the query, so your (b:Author) is rejected because it appears only in the WHERE clause.
To fix this, move (b:Author) to the MATCH statement. Also, if you are testing for a pattern in the WHERE clause, you should use the exists() function so that the pattern is evaluated as a boolean.
MATCH (a:Author), (b:Author)
WHERE exists((a)-[:WROTE]->()<-[:WROTE]-(b))
RETURN a, b
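Matching two unrelated Author nodes and then filtering builds a Cartesian product first. A sketch of an alternative is to put the whole pattern in MATCH. The id(a) < id(b) filter, the DISTINCT, the switch from CREATE to MERGE, and parallel:false are all assumptions added here: CREATE requires a relationship direction, MERGE avoids duplicates when the statement is rerun, and parallel batches that create relationships on shared nodes can contend for locks.

```cypher
// Bind both authors in MATCH; id(a) < id(b) keeps each pair only once.
MATCH (a:Author)-[:WROTE]->()<-[:WROTE]-(b:Author)
WHERE id(a) < id(b)
RETURN DISTINCT a, b;

// The same idea inside apoc.periodic.iterate. The variables returned by the
// first statement (a, b) are available by name in the second statement.
CALL apoc.periodic.iterate(
  "MATCH (a:Author)-[:WROTE]->()<-[:WROTE]-(b:Author) WHERE id(a) < id(b) RETURN DISTINCT a, b",
  "MERGE (a)-[:COAUTHOR]->(b)",
  {batchSize: 10000, parallel: false});
```

The relationship is stored with a direction, but it can still be queried as undirected with (a)-[:COAUTHOR]-(b).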

Returning multiple columns

Hi all, newbie here in Neo4j.
I am trying to return keys or properties using the following simple query in the Neo4j Browser environment.
MATCH (n:movies),(m:people)
RETURN properties(n,m)
What I am trying to achieve is to return the properties for both the movies and people nodes
However, I always get an error:
Too many parameters for function 'properties' (line 2, column 9 (offset: 36))
" RETURN properties(n,m)"
I have tried,
MATCH (n:movies),(m:people)
RETURN properties(k) in [n,m]
The error I would get
Variable `k` not defined (line 2, column 20 (offset: 47))
" RETURN properties(k) in [n,m]"
I am trying to pass a list here into k but Neo4j is not permitting me to do so. Is it even possible to pass a list into the function properties()?
Thank you in advance.

The properties function takes exactly one node or a relationship as input.
MATCH (n:movies),(m:people) RETURN properties(n), properties(m)
will create a Cartesian product,
i.e. if you have five movies and ten people, you will get a result of all 50 combinations.
If you aren't looking for a Cartesian product, you would have to define a specific pattern or restrict the MATCH clause further.
If you want just the individual properties without combining them, consider UNION.
MATCH (n:Movie)
RETURN properties(n) as `Properties`
UNION ALL
MATCH (m:Person)
RETURN properties(m) as `Properties`
Why am I using aliases for a seemingly simple query? To avoid this:
All sub queries in an UNION must have the same column names (line 3, column 1 (offset: 39))
For working with lists:
The collect function lets you create/construct a list from the results while UNWIND expands a list into a sequence of rows.
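As a rough illustration of those list tools, using the :movies and :people labels from the question (the column names are made up for the example):

```cypher
// One row per node, no Cartesian product: match either label, then read its properties.
MATCH (k)
WHERE k:movies OR k:people
RETURN labels(k) AS labels, properties(k) AS props;

// collect() gathers the maps into a single list; UNWIND turns the list back into rows.
MATCH (k)
WHERE k:movies OR k:people
WITH collect(properties(k)) AS allProps
UNWIND allProps AS props
RETURN props;

// Closest to the original attempt: a list comprehension defines k and applies
// properties() to each element. This still pairs every movie with every person.
MATCH (n:movies), (m:people)
RETURN [k IN [n, m] | properties(k)] AS props;
```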

properties() takes only one argument. You can try
MATCH (n:movies),(m:people) RETURN properties(n) as prop_n, properties(m) as prop_m
or, if you still want the movies returned when there are no people nodes,
MATCH (n:movies) optional match (m:people) RETURN properties(n) as prop_n, properties(m) as prop_m
As for your second attempt,
MATCH (n:movies),(m:people)
RETURN properties(k) in [n,m]
you get the error because k is never defined. Also, according to the docs, properties() takes "An expression that returns a relationship, a node, or a map" as its argument, so passing a list this way is not supported.

Neo4j indices slow when querying across 2 labels

I've got a graph where each node has label either A or B, and an index on the id property for each label:
CREATE INDEX ON :A(id);
CREATE INDEX ON :B(id);
In this graph, I want to find the node(s) with id "42", but I don't know the label a priori. To do this I am executing the following query:
MATCH (n {id:"42"}) WHERE (n:A OR n:B) RETURN n;
But this query takes 6 seconds to complete. However, doing either of:
MATCH (n:A {id:"42"}) RETURN n;
MATCH (n:B {id:"42"}) RETURN n;
takes only ~10ms.
Am I not formulating my query correctly? What is the right way to formulate it so that it takes advantage of the installed indices?

Here is one way to use both indices. result will be a collection of matching nodes.
OPTIONAL MATCH (a:B {id:"42"})
OPTIONAL MATCH (b:A {id:"42"})
RETURN
(CASE WHEN a IS NULL THEN [] ELSE [a] END) +
(CASE WHEN b IS NULL THEN [] ELSE [b] END)
AS result;
You should use PROFILE to verify that the execution plan for your neo4j environment uses the NodeIndexSeek operation for both OPTIONAL MATCH clauses. If not, you can use the USING INDEX clause to give a hint to Cypher.
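A sketch of what such hints could look like, assuming the two indexes from the question exist. The list comprehension in RETURN is just a shorter way of dropping the nulls and is not part of the original answer.

```cypher
// Each hint asks the planner to use the label/property index for that variable.
OPTIONAL MATCH (a:B {id: "42"})
USING INDEX a:B(id)
OPTIONAL MATCH (b:A {id: "42"})
USING INDEX b:A(id)
// Keep only the variables that actually matched.
RETURN [x IN [a, b] WHERE x IS NOT NULL] AS result;
```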

You should use UNION to make sure that both indexes are used. In your question you almost had the answer.
MATCH (n:A {id:"42"}) RETURN n
UNION
MATCH (n:B {id:"42"}) RETURN n
This will work. To check your query, put PROFILE or EXPLAIN before the query statement to see whether the indexes are used.
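For instance, prefixing the whole UNION query with PROFILE runs it and shows the plan. If both indexes are used, each branch should show a NodeIndexSeek operator rather than a label scan or all-nodes scan (the exact plan output depends on the Neo4j version).

```cypher
// PROFILE executes the query and reports the operators actually used;
// EXPLAIN shows the plan without executing it.
PROFILE
MATCH (n:A {id: "42"}) RETURN n
UNION
MATCH (n:B {id: "42"}) RETURN n;
```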

Indexes are defined on, and used via, a node label and property, and to use them you need to form your query the same way. That means queries without a label will scan all nodes, which gives the results you got.

What is the difference between multiple MATCH clauses and a comma in a Cypher query?

In the Cypher query language for Neo4j, what is the difference between one MATCH clause immediately following another like this:
MATCH (d:Document{document_ID:2})
MATCH (d)--(s:Sentence)
RETURN d,s
Versus the comma-separated patterns in the same MATCH clause? E.g.:
MATCH (d:Document{document_ID:2}),(d)--(s:Sentence)
RETURN d,s
In this simple example the result is the same. But are there any "gotchas"?

There is a difference: comma-separated matches are actually considered part of the same pattern. So, for instance, the guarantee that each relationship appears only once in the resulting path is upheld here.
Separate MATCHes are separate operations whose paths don't form a single pattern and which don't have these guarantees.

I think it's better to explain with an example where there is a difference.
Let's say we have the "Movie" database that is provided by the official Neo4j tutorials,
and there are 10 :WROTE relationships in total between :Person and :Movie nodes:
MATCH (:Person)-[r:WROTE]->(:Movie) RETURN count(r); // returns 10
1) Let's try the next query with two MATCH clauses:
MATCH (p:Person)-[:WROTE]->(m:Movie) MATCH (p2:Person)-[:WROTE]->(m2:Movie)
RETURN p.name, m.title, p2.name, m2.title;
As expected, you will see 10*10 = 100 records in the result.
2) Let's try the query with one MATCH clause and two patterns:
MATCH (p:Person)-[:WROTE]->(m:Movie), (p2:Person)-[:WROTE]->(m2:Movie)
RETURN p.name, m.title, p2.name, m2.title;
Now you will see 90 records are returned.
That's because in this case records where p = p2 and m = m2 with the same relationship between them (:WROTE) are excluded.
For example, there IS a record in the first case (two MATCH clauses)
p.name m.title p2.name m2.title
"Aaron Sorkin" "A Few Good Men" "Aaron Sorkin" "A Few Good Men"
while there is NO such record in the second case (one MATCH, two patterns).
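One way to check that relationship uniqueness alone explains the gap is to keep the two MATCH clauses but exclude the reused relationship by hand. This is only a sketch against the same Movie dataset; the variables r1 and r2 are introduced here for illustration. With 10 :WROTE relationships it removes the 10 rows where both patterns use the same relationship, leaving the same 90 rows as the single-MATCH version.

```cypher
// Two separate MATCH clauses, with relationship uniqueness enforced explicitly.
MATCH (p:Person)-[r1:WROTE]->(m:Movie)
MATCH (p2:Person)-[r2:WROTE]->(m2:Movie)
WHERE r1 <> r2
RETURN p.name, m.title, p2.name, m2.title;
```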

There are no differences between these provided that the clauses are not linked to one another.
If you did this:
MATCH (a:Thing), (b:Thing) RETURN a, b;
That's the same as:
MATCH (a:Thing) MATCH (b:Thing) RETURN a, b;
Because (and only because) a and b are independent. If a and b were linked by a relationship, then the meaning of the query could change.

In a more generic way, "The same relationship cannot be returned more than once in the same result record." [see 1.5. Cypher Result Uniqueness in the Cypher manual]
Both MATCH-after-MATCH, and single MATCH with comma-separated pattern should logically return a Cartesian product. Except, for comma-separated pattern, we must exclude those records for which we already added the relationship(s).
In Andy's answer, this is why repetitions of the same movie were excluded in the second case: the second pattern of the single MATCH would have reused the same :WROTE relationship as the first pattern.

If a part of a query contains multiple disconnected patterns, this will build a Cartesian product between all those parts. This may produce a large amount of data and slow down query processing. While occasionally intended, it may often be possible to reformulate the query to avoid this cross product, perhaps by adding a relationship between the different parts or by using OPTIONAL MATCH.
In short, there is no difference between these two queries, but use them very carefully.

In a more generic way, "The same relationship cannot be returned more than once in the same result record." [see 1.5. Cypher Result Uniqueness in the Cypher manual]
How about this statement?
MATCH p1=(v:player)-[e1]->(n)
MATCH p2=(n:team)<-[e2]-(m)
WHERE e1=e2
RETURN e1,e2,p1,p2

Neo4j cypher query efficiency and syntax

I am attempting to query an ontology of health represented as an acyclic, directed graph in Neo4j v2.1.5. The database consists of 2 million nodes and 5 million edges/relationships. The following query identifies all nodes subsumed by a disease concept and caused by a particular bacteria or any of the bacteria subtypes as follows:
MATCH p = (a:ObjectConcept{disease}) <-[:ISA*]- (b:ObjectConcept),
q=(c:ObjectConcept{bacteria})<-[:ISA*]-(d:ObjectConcept)
WHERE NOT (b)-->()--(c) AND NOT (b)-->()-->(d)
RETURN distinct b.sctid, b.FSN
This query runs in < 1 second and returns the correct answers. However, adding one additional parameter adds substantial time (20 minutes). Example:
MATCH p = (a:ObjectConcept{disease}) <-[:ISA*]- (b:ObjectConcept),
q=(c:ObjectConcept{bacteria})<-[:ISA*]-(d:ObjectConcept),
t=(e:ObjectConcept{bacteria})<-[:ISA*]-(f:ObjectConcept)
WHERE NOT (b)-->()--(c)
AND NOT (b)-->()-->(d)
AND NOT (b)-->()-->(e)
AND NOT (b)-->()-->(f)
RETURN distinct b.sctid, b.FSN
I am new to Cypher coding, but I have to imagine there is a better way to write this query to be more efficient. How would Collections improve this?
Thanks

I already answered that on the Google group:
Hi Scott,
I presume you created indexes or constraints for :ObjectConcept(name)?
> I am working with an acyclic, directed graph (an ontology) that models
> human health and am needing to identify certain diseases (example:
> Pneumonia) that are infectious but NOT caused by certain bacteria
> (staph or streptococcus). All concepts are Nodes defined as
> ObjectConcepts. ObjectConcepts are connected by relationships such as
> [ISA], [Pathological_process], [Causative_agent], etc.
> The query requires:
> a) Identification of all concepts subsumed by the concept Pneumonia as follows:
> MATCH p = (a:ObjectConcept{Pneumonia}) <-[:ISA*]- (b:ObjectConcept)
This already returns a number of paths, potentially millions. Can you check that with
MATCH p = (a:ObjectConcept{Pneumonia}) <-[:ISA*]- (b:ObjectConcept) return count(*)
> b) Identification of all concepts subsumed by Genus Staph and Genus Strep (including the concept Genus Staph and Genus Strep) as follows. Note:
> with b MATCH (b) q = (c:ObjectConcept{Strep})<-[:ISA*]-(d:ObjectConcept), h = (e:ObjectConcept{Staph})<-[:ISA*]-(f:ObjectConcept)
This is then the cross product of the paths from "p", "q" and "h", e.g. if all 3 of them return 1000 paths, you're at 1bn paths!
> c) Identify all nodes(p) that do not have a causative agent of Strep (i.e., nodes(q)) or Staph (nodes(h)) as follows:
> with b,c,d,e,f MATCH (b),(c),(d),(e),(f) WHERE (b)--()-->(c) OR (b)-->()-->(d) OR (b)-->()-->(e) OR (b)-->()-->(f) RETURN distinct b.Name;
You don't need the WITH or even the MATCH (b),(c),(d),(e),(f).
What connections are there between b and the other nodes? Do you have concrete ones? The first pattern is also missing one direction.
The WHERE clause can be a problem; in general, this query is perhaps better expressed as a UNION of simpler matches,
e.g.
MATCH (a:ObjectConcept{Pneumonia}) <-[:ISA*]- (b:ObjectConcept)-->()-->(c:ObjectConcept{name:Strep}) RETURN b.name
UNION
MATCH (a:ObjectConcept{Pneumonia}) <-[:ISA*]- (b:ObjectConcept)-->()-->(e:ObjectConcept{name:Staph}) RETURN b.name
UNION
MATCH (a:ObjectConcept{Pneumonia}) <-[:ISA*]- (b:ObjectConcept)-->()-->(d:ObjectConcept)-[:ISA*]->(c:ObjectConcept{name:Strep}) return b.name
UNION
MATCH (a:ObjectConcept{Pneumonia}) <-[:ISA*]- (b:ObjectConcept)-->()-->(d:ObjectConcept)-[:ISA*]->(c:ObjectConcept{name:Staph}) return b.name
Another option would be to use the shortestPath() function to find one or all shortest path(s) between Pneumonia and the bacteria with certain rel-types and direction.
Perhaps you can share the dataset and the expected result.

The query was successfully accomplished using UNION as follows:
MATCH p = (a:ObjectConcept{sctid:233604007}) <-[:ISA*]- (b:ObjectConcept),
q = (c:ObjectConcept{sctid:58800005})<-[:ISA*]-(d:ObjectConcept)
WHERE NOT (b)-->()--(c) AND NOT (b)-->()-->(d)
RETURN distinct b
UNION
MATCH p = (a:ObjectConcept{sctid:233604007}) <-[:ISA*]- (b:ObjectConcept),
t = (e:ObjectConcept{sctid:65119002}) <-[:ISA*]- (f:ObjectConcept)
WHERE NOT (b)-->()-->(e) AND NOT (b)-->()-->(f)
RETURN distinct b
The query runs in under 20 seconds vs. 20 minutes by reducing the cardinality of the objects being queried.
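Another way to keep cardinality down is to avoid binding the bacteria subtrees as separate paths at all, and instead count, per candidate b, how many two-hop links reach either bacterium or one of its subtypes. The following is an untested sketch, not the author's method. It assumes the sctid values from the query above, treats both hops as directed (the original mixes (b)-->()--(c) and (b)-->()-->(d)), and uses [:ISA*0..] so that the bacterium concept itself is included. Unlike the UNION above, it excludes b if it reaches either bacterium, which matches the chained AND NOT conditions of the original single query.

```cypher
// 1. Candidates: every concept subsumed by the disease concept.
MATCH (b:ObjectConcept)-[:ISA*]->(:ObjectConcept {sctid: 233604007})
WITH DISTINCT b
// 2. Look for a two-hop link from b to either bacterium or any of its subtypes.
OPTIONAL MATCH (b)-->()-->(x:ObjectConcept)-[:ISA*0..]->(root:ObjectConcept)
WHERE root.sctid IN [58800005, 65119002]
// 3. Keep only the candidates with no such link.
WITH b, count(x) AS hits
WHERE hits = 0
RETURN b.sctid, b.FSN;
```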
