Removing a node from the Cypher query result in Neo4j - neo4j

I have the following cypher code:
MATCH (n)
WHERE toLower(n.name) STARTS WITH toLower('ja')
RETURN n
This case-insensitive query returns all the nodes which their names start with the substring "ja". For example if I execute this in my db it will return ["Javier", "Jacinto", "Jasper", "Jacob"]
I need this script to also remove the unwanted nodes on this list, for example let's say that an array containing ["Jasper, Javier"] is sent to the data access layer indicating that those two nodes shouldn't be returned, leaving the final query result as follows: ["Jacinto", "Jacob"]
How can I perform this?

If you know before making the query which items should be excluded you can say:
MATCH (n)
WHERE toLower(n.name) STARTS WITH toLower('ja')
AND NOT (toLower(n.name) IN ['jasper', 'javier'])
RETURN n

Related

Neo4j count Query

match(m:master_node:Application)-[r]-(k:master_node:Server)-[r1]-(n:master_node)
where (m.name contains '' and (n:master_node:DeploymentUnit or n:master_node:Schema))
return distinct m.name,n.name
Hi,I am trying to get total number of records for the above query.How I change the query using count function to get the record count directly.
Thanks in advance
The following query uses the aggregating funtion COUNT. Distinct pairs of m.name, n.name values are used as the "grouping keys".
MATCH (m:master_node:Application)--(:master_node:Server)--(n:master_node)
WHERE EXISTS(m.name) AND (n:DeploymentUnit OR n:Schema)
RETURN m.name, n.name, COUNT(*) AS cnt
I assume that m.name contains '' in your query was an attempt to test for the existence of m.name. This query uses the EXISTS() function to test that more efficiently.
[UPDATE]
To determine the number of distinct n and m pairs in the DB (instead of the number of times each pair appears in the DB):
MATCH (m:master_node:Application)--(:master_node:Server)--(n:master_node)
WHERE EXISTS(m.name) AND (n:DeploymentUnit OR n:Schema)
WITH DISTINCT m.name AS n1, n.name AS n2
RETURN COUNT(*) AS cnt
Some things to consider for speeding up the query even further:
Remove unnecessary label tests from the MATCH pattern. For example, can we omit the master_node label test from any nodes? In fact, can we omit all label testing for any nodes without affecting the validity of the result? (You will likely need a label on at least one node, though, to avoid scanning all nodes when kicking off the query.)
Can you add a direction to each relationship (to avoid having to traverse relationships in both directions)?
Specify the relationship types in the MATCH pattern. This will filter out unwanted paths earlier. Once you do so, you may also be able to remove some node labels from the pattern as long as you can still get the same result.
Use the PROFILE clause to evaluate the number of DB hits needed by different Cypher queries.
You can find examples of how to use count in the Neo4j docs here
In your case the first example where:
count(*)
Is used to return a count of each returned item should work.

Neo4J cypher: remove loops from flattened resultset

This is an extension of the following question:
I have a data lineage related graph in Neo4J with variable length path containing intermediate nodes (tables) as an array:
match p=(s)-[r:airflow_loads_to*]->(t)
where s.database_name='hive'
and s.schema_name='test'
and s.name="source_table"
return s.name, [n in nodes(p) | n.name] as arrayOfName,t.name
Now, this resultset contains loops that I want to omit. I can 'remove' these loops without the flattening ArrayOfName result by running:
match p=(s)-[r:airflow_loads_to*]->(t)
where s.database_name='hive'
and s.schema_name='test'
and s.name="source_table"
return s.name, min(size(r)) as pathlen, t.name
order by pathlen
However, when I add ArrayOfName back, the resultset contains all rows again.
I guess this means chaining the results somehow, using the pathlen as a filter or preventing that loops exist in the path p at all.
But I am stuck on how to accomplish this...
did you try
p=shortestPath((s)-[r:airflow_loads_to*]->(t))
because it seems that is what you need

Neo4j cypher queries returning different count result

I want a query that starting from a node, it counts the possible end nodes given relation type:
For example this query:
MATCH (start:typeA{my_id:"abc"})-[:rel]->(l:typeB) return count(l)
works great and returns a proper number, i.e., 500. The same happens with:
MATCH p=(start:BusStop{StopCode:"0247"})-[:CAN_BOARD]->(:Leg) return count(p)
However if I do:
MATCH (start:typeA{my_id:"abc"}) return count((start)-[:rel]->(:typeB))
returns 1.
What is the difference between this query and the previous ones?
The result of a path expression (as used in your last query) is a list of paths. This is different than the result when the same path pattern is used in a MATCH clause.
You would have gotten 500 if you changed your last query to use SIZE() instead of COUNT():
MATCH (start:typeA{my_id:"abc"}) return SIZE((start)-[:rel]->(:typeB))

Including vars in Neo4j WITH statement changes query output

I'm trying to find the number of nodes of a certain kind in my database that are connected to more than one other node of another kind. In my case, it's place nodes connected to several name nodes. I have a query that works:
MATCH rels=(p:Place)-[c:Called]->(n:Name)
WITH p,count(n) as counts
WHERE counts > 1
RETURN p;`
However, that only returns the place nodes, and ideally I'd like it to return all the nodes and edges involved. I've found a question on returning variables from before the WITH, but if I include any of the other variables I've defined, the query returns no responses, i.e. this query returns nothing:
MATCH rels=(p:Place)-[c:Called]->(n:Name)
WITH p, count(n) as counts, rels
WHERE counts > 1
RETURN p;
I don't know how to return the information that I want without changing the results of the query. Any help would be much appreciated
The reason your second query returns nothing is because its WITH clause specifies as aggregation "grouping keys" both p and rels. Since each rels path has only a single n value, counts would always be 1.
Something like this might work for you:
MATCH path=(p:Place)-[:Called]->(:Name)
WITH p, COLLECT(path) as paths
WHERE SIZE(paths) > 1
RETURN p, paths;
This returns each matching Place node and all its paths.
Try this:
MATCH (p:Place)-[c:Called]->(n:Name)
WHERE size((p)-[:Called]->(:Name)) > 1
WITH p,count(n) as counts, collect(n) AS names, collect(c) AS calls
RETURN p, names, calls, counts ORDER BY counts DESC;
This query makes use of Cypher's collect() function to create lists of the names and called relationships for each place that has more than Called relationship with a Name node.

Multiple match with where clause in Neo4j Cypher gives error "Cannot match on a pattern containing only already bound identifiers"

We are using neo4j-community-2.1.2. Right now we have only 3 nodes Of Job label in the database And we do Schema indexing on all fields that are used in this query . Total DB hits approx 40
Query is ->
PROFILE match (job1:Job) where (job1.jobType="Adhoc" or job1.jobType="Virtual") AND (job1.mode="Free" or job1.mode="Paid") with collect(job1) as jobs1
match (job2:Job)-[REQUIRED_SKILL]-(skill:Skill) where skill.name="Neo4j" and (job2 in jobs1) with collect(job2) as jobs2
match (job3:Job)-[REQUIRED_SKILL]-(skill:Skill) where skill.name="Java" and (job3 IN jobs2) with collect(job3) as jobs3 return jobs3
So we try to do something like that
match (job1:Job) where (job1.jobType="Adhoc" or job1.jobType="Virtual")
match (job1) where (job1.mode="Free" or job1.mode="Paid") with collect(job1) as jobs1 return jobs1
Because result of first match goes to next match . So that in next filter there is only need to filter less number of nodes But we get this exception
Cannot match on a pattern containing only already bound identifiers (line 2, column 1)
"match (job1) where (job1.mode="Free" or job1.mode="Paid") with collect(job1) as jobs1 return jobs1"
Optimize this Query
You cannot match job1 twice, once it is matched you can use the same instance again (using WITH), or in this case, you can filter on both conditions using AND. Also your query would be simpler by replacing OR with IN inclusion test, like this:
match (job1:Job)
where job1.jobType in ["Adhoc", "Virtual"]
and job1.mode in ["Free", "Paid"]
return collect(job1) as jobs1

Resources