Neo4j Subgraph Projection using a string inside string query - neo4j

While working on a project, I was trying to reduce the number of variables so that the graph is easier to visualize when creating embeddings and checking whether they work.
I realized there is a projection and a subgraph projection. I could certainly create a new Neo4j graph instead, but that seems like a slow solution.
Following the tutorial, they have
CALL gds.graph.project(
'apps_undir',
['App', 'Genre'],
{Genre_Category: {orientation: 'UNDIRECTED'}}
)
then something like
CALL gds.beta.graph.project.subgraph(
'subapps',
'apps_undir',
"n:App OR (n:Genre AND n.name = 'Action' OR n.name = 'RPG')",
'*'
)
I realize this isn't python, but it's the idea I'm trying to express. With the string query as 'n:App OR (n:Genre AND n.name = Action OR n.name = RPG)' I get the error:
Failed to invoke procedure gds.beta.graph.project.subgraph: Caused by: org.neo4j.gds.beta.filter.expression.SemanticErrors: Semantic errors while parsing expression:
Invalid variable `Action`. Only `n` is allowed for nodes
Invalid variable `RPG`. Only `n` is allowed for nodes
Unknown property `name`.
Unknown property `name`.
The error produced is
"Neo.ClientError.Statement.SyntaxError
Invalid input 'subgraph': expected"
It isn't great that subgraph is only beta functionality, and apparently all node variables need to be n. What I want is the actual subgraph, so that I can run an embedding on it.
If it helps, the data was taken from a Steam database scrape from 2016; a few CSV rows are below:
appid;Genre
8890;RPG
8890;Strategy
10530;Action
10530;RPG
15540;Indie
15560;Action
15620;Strategy

There are a couple of problems with your workflow. When you project a graph in GDS with the following command, it doesn't include any node properties by default.
CALL gds.graph.project(
'apps_undir',
['App', 'Genre'],
{Genre_Category: {orientation: 'UNDIRECTED'}}
)
There are ways to include node properties in graph projections; however, string properties are not supported. Therefore, you cannot project the name property, which appears to be a string. To achieve what you want, you should probably use a Cypher projection.
CALL gds.graph.project.cypher('subapps',
'MATCH (n) WHERE n:App OR (n:Genre AND n.name IN ["Action", "RPG"]) RETURN id(n) AS id',
'MATCH (s)-[:Genre_Category]-(t) RETURN id(s) AS source, id(t) AS target',
{validateRelationships: false})
A couple of pointers for the Cypher projection. To define the relationships I have used the (s)-[:Genre_Category]-(t) pattern. Notice the lack of relationship direction: because the pattern has no direction, each relationship is matched in both directions, so the relationships are projected as "undirected". You need to set the validateRelationships parameter to false because you filter nodes in the node projection but not in the relationship projection.
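To see what such a projection keeps, here is a small Python sketch that applies the same filter to the sample CSV rows from the question. It only mimics the selection logic and does not talk to Neo4j; the variable names are made up for illustration.

```python
# Sample rows from the question's CSV (appid;Genre).
rows = [
    (8890, "RPG"), (8890, "Strategy"), (10530, "Action"), (10530, "RPG"),
    (15540, "Indie"), (15560, "Action"), (15620, "Strategy"),
]
wanted_genres = {"Action", "RPG"}  # mirrors n.name IN ["Action", "RPG"]

# Node projection: every App, plus only the wanted Genre nodes.
app_nodes = {appid for appid, _ in rows}
genre_nodes = {genre for _, genre in rows if genre in wanted_genres}

# Relationship projection: the relationship query returns every App-Genre
# pair; pairs whose Genre was filtered out are skipped instead of raising
# an error (the effect of disabling relationship validation).
kept_rels = [(a, g) for a, g in rows if a in app_nodes and g in genre_nodes]

print(sorted(app_nodes))
print(sorted(genre_nodes))
print(kept_rels)
```

Note that apps whose only genres were filtered out (15540 and 15620 in this sample) remain in the projection as isolated nodes, which may matter for the embedding step.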

Related

Neo4J Matching Nodes Based on Multiple Relationships

I had another thread about this where someone suggested to do
MATCH (p:Person {person_id: '123'})
WHERE ANY(x IN $names WHERE
EXISTS((p)-[:BELONGS]-(:Face)-[:CORRESPONDS]-(:Image)-[:HAS_ACCESS_TO]-(:Dias {group_name: x})))
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
This does what I need it to, returns nodes that fit the criteria without returning the relationships, but now I need to include another param that is a list.
....(:Dias {group_name: x, second_name: y}))
I'm unsure of the syntax. Here's what I tried:
WHERE ANY(x IN $names and y IN $names_2 WHERE..
This gives me a syntax error :/
Since the ANY() function can only iterate over a single list, it would be difficult to continue to use that for iteration over 2 lists (but still possible, if you create a single list with all possible x/y combinations) AND also be efficient (since each combination would be tested separately).
However, the new existential subquery syntax introduced in Neo4j 4.0 will be very helpful for this use case (I assume the 2 lists are passed as the parameters names1 and names2):
MATCH (p:Person {person_id: '123'})
WHERE EXISTS {
MATCH (p)-[:BELONGS]-(:Face)-[:CORRESPONDS]-(:Image)-[:HAS_ACCESS_TO]-(d:Dias)
WHERE d.group_name IN $names1 AND d.second_name IN $names2
}
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
By the way, here are some more tips:
If it is possible to specify the direction of each relationship in your query, that would help to speed up the query.
If it is possible to remove any node labels from a (sub)query and still get the same results, that would also be faster. There is an exception, though: if the (sub)query has no variables that are already bound to a value, then you would normally want to specify the node label for the one node that would be used to kick off that (sub)query (you can do a PROFILE to see which node that would be).
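The difference between testing every x/y combination and testing each list separately can be sketched in plain Python. The dictionaries below are hypothetical stand-ins for Dias nodes, and names1 and names2 play the role of the two query parameters; this is only an illustration of the logic, not of how Neo4j evaluates the query.

```python
from itertools import product

names1 = ["a", "b"]
names2 = ["x", "y"]
# Hypothetical Dias nodes, reduced to the two properties being filtered on.
dias_nodes = [
    {"group_name": "a", "second_name": "y"},
    {"group_name": "c", "second_name": "x"},
]

def matches_by_combinations(node):
    # One equality test per (x, y) pair: len(names1) * len(names2) checks.
    return any(
        node["group_name"] == x and node["second_name"] == y
        for x, y in product(names1, names2)
    )

def matches_by_membership(node):
    # Two independent membership tests, like the two IN predicates.
    return node["group_name"] in names1 and node["second_name"] in names2

by_combinations = [matches_by_combinations(d) for d in dias_nodes]
by_membership = [matches_by_membership(d) for d in dias_nodes]
print(by_combinations, by_membership)
```

Both functions select the same nodes; the membership version simply avoids enumerating every pair.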

Cypher query fails with variable length paths when trying to find all paths with unique node occurences

I have a highly interconnected graph where, starting from a specific node, I want to find all nodes connected to it regardless of the relationship type, direction, or length. What I am trying to do is filter out paths that include a node more than once. But what I get is:
Neo.DatabaseError.General.UnknownError: key not found: UNNAMED27
I have managed to create a much simpler database
in neo4j sandbox and get the same message again using the following data:
CREATE (n1:Person { pid:1, name: 'User1'}),
(n2:Person { pid:2, name: 'User2'}),
(n3:Person { pid:3, name: 'User3'}),
(n4:Person { pid:4, name: 'User4'}),
(n5:Person { pid:5, name: 'User5'})
With the following relationships:
MATCH (n1{pid:1}),(n2{pid:2}),(n3{pid:3}),(n4{pid:4}),(n5{pid:5})
CREATE (n1)-[r1:RELATION]->(n2),
(n5)-[r2:RELATION]->(n2),
(n1)-[r3:RELATION]->(n3),
(n4)-[r4:RELATION]->(n3)
The Cypher Query that causes this issue in the above model is
MATCH p= (n:Person{pid:1})-[*0..]-(m)
WHERE ALL(c IN nodes(p) WHERE 1=size(filter(d in nodes(p) where c.pid = d.pid)) )
return m
Can anybody see what is wrong with this query?
The error seems like a bug to me. There is a closed neo4j issue that seems similar, but it was supposed to be fixed in version 3.2.1. You should probably create a new issue for it, since your comments state you are using 3.2.5.
Meanwhile, this query should get the results you seem to want:
MATCH p=(:Person{pid:1})-[*0..]-(m)
WITH m, NODES(p) AS ns
UNWIND ns AS n
WITH m, ns, COUNT(DISTINCT n) AS cns
WHERE SIZE(ns) = cns
return m
You should strongly consider putting a reasonable upper bound on your variable-length path search, though. If you do not do so, then with any reasonable DB size your query is likely to take a very long time and/or run out of memory.
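As an illustration of the uniqueness test and of the upper bound, here is a small Python sketch over the sample data from the question. The helper names and the max_hops value are assumptions made for this example; the traversal never reuses an edge within one path, and the node check mirrors SIZE(ns) = COUNT(DISTINCT n).

```python
# Undirected edges from the question's sample data (pid pairs).
edges = [(1, 2), (5, 2), (1, 3), (4, 3)]

def has_unique_nodes(path):
    # True when no node appears more than once in the path.
    return len(path) == len(set(path))

def paths_from(start, max_hops):
    """Enumerate paths that never reuse an edge, up to max_hops long."""
    found = []

    def walk(path, used):
        found.append(path)
        if len(path) - 1 == max_hops:
            return  # upper bound reached, stop expanding this path
        for i, (a, b) in enumerate(edges):
            if i in used:
                continue
            if a == path[-1]:
                walk(path + [b], used | {i})
            elif b == path[-1]:
                walk(path + [a], used | {i})

    walk([start], frozenset())
    return found

end_nodes = sorted(
    {p[-1] for p in paths_from(1, max_hops=4) if has_unique_nodes(p)}
)
print(end_nodes)
```

On this small tree-shaped sample every path already has unique nodes; the filter only starts to matter once the graph contains cycles.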
When finding paths, Cypher will never traverse the same relationship twice in a single path. So MATCH (a:Start)-[*]-(b) RETURN DISTINCT b will return all nodes connected to a. (DISTINCT can also affect query performance. Use PROFILE on your version of Neo4j to see whether it matters and which is better.)
NOTE: This works starting with the Neo4j 3.2 Cypher planner. For previous versions of the Cypher planner, the only performant way to do this is with APOC, or to add a -[:connected_to]-> relationship from the start node to all children so that the path doesn't have to be explored.

cypher match query merge result set

I'm new to Cypher (CQL). Can anyone help me write a query that finds the list of followers, with a flag showing whether I also follow each of them? I'm trying it this way:
MATCH (n:users_master)-[r:FOLLOW]->(m:users_master)
OPTIONAL MATCH (n:users_master)<-[r2:FOLLOW]-(m:users_master)
CASE when EXISTS(r2) THEN n.flag= 1 ELSE n.flag=0 END
where id(m)=35
RETURN n
Notice that I'd also like to add a virtual property flag to the result set, like:
{"updated_at":"12/26/2016, 3:45:38 PM",
"created_on":"12/26/2016, 3:45:38 PM",
"last_name":"john",
"first_name":"john",
"email":"new#test",
"facebook_id":"12341",
"status":"Active",
"id":35,
"flag":1
}
The EXISTS() function can be used to check for the existence of a pattern, and in your case can replace your OPTIONAL MATCH.
Also, variables in your patterns aren't needed if you aren't going to be using them, so you shouldn't need them on your relationships at all.
MATCH (n:users_master)-[:FOLLOW]->(m:users_master)
WHERE id(m)=35
RETURN n, EXISTS( (n)<-[:FOLLOW]-(m) ) as flag
'flag' will be a separate column with a boolean on whether or not the follow is reciprocal.
As far as adding a 'virtual property', in Neo4j 3.1+, you can use Map Projection to add custom values to the map projection of returned nodes.
You could use this query with map projection to include the flag in the return of node properties:
MATCH (n:users_master)-[:FOLLOW]->(m:users_master)
WHERE id(m)=35
RETURN n {.*, flag: EXISTS( (n)<-[:FOLLOW]-(m) ) }
EDIT
The map projection used in the query above was introduced and only works in Neo4j 3.1.x and up.
For versions 3.0.x, I don't believe there are many options for extracting all node properties to a map and adding a new value to that map before returning (SET clause is reserved for nodes, not maps).
You may need to install APOC procedures for a workaround, as it provides several helper procedures and functions, including map operations.
This should work after the relevant version of APOC is added to your Neo4j instance:
MATCH (n:users_master)-[:FOLLOW]->(m:users_master)
WHERE id(m)=35
RETURN apoc.map.setKey(properties(n), 'flag', EXISTS( (n)<-[:FOLLOW]-(m) )) as n
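The idea behind both variants (copy all node properties into a map, then add a computed flag without changing the stored node) can be sketched in plain Python. The users and follows data below are hypothetical and only serve the illustration.

```python
# Hypothetical data: user properties keyed by id, and (follower, followed) pairs.
users = {
    35: {"first_name": "john", "status": "Active"},
    41: {"first_name": "ann", "status": "Active"},
    52: {"first_name": "raj", "status": "Active"},
}
follows = {(41, 35), (52, 35), (35, 41)}

def followers_with_flag(user_id):
    result = []
    for follower, followed in sorted(follows):
        if followed != user_id:
            continue
        # Copy all properties and add the computed flag, like n {.*, flag: ...}.
        row = {
            **users[follower],
            "id": follower,
            "flag": (user_id, follower) in follows,
        }
        result.append(row)
    return result

rows = followers_with_flag(35)
print(rows)
```

The flag exists only in the returned rows; the stored user properties are left untouched, which is what makes it a "virtual" property.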

Getting relationships from all node's in Neo4j

I am trying to write a query in Neo4j.
I would like the query result to contain the same information that is shown when AUTO-COMPLETE is ON in Neo4j.
For example, suppose a query that creates 3 nodes, as shown below.
create (david:Person {name: 'david'}), (mike:Person {name: 'mike'}), (book:Book {title:'book'}), (david)-[:KNOWS]->(mike), (david)-[:WRITE]->(book), (mike)-[:WRITE]->(book)
[Two screenshots: the result with Auto-complete on, and the result with Auto-complete off]
The figures show the result of the query. I would like to obtain all relationships between the related nodes, starting from the 'book' node.
I used the query shown below.
match (book:Book)-[r]-(person) return book, r, person
Whether AUTO-COMPLETE is ON or OFF, I expect to obtain all of the nodes' relationships, including "David knows Mike", but the query does not return them.
I have studied a lot of the syntax documentation on the Neo4j website, and I still find it very difficult, so I am posting this to ask for your help.
You have to return all the data that you need yourself explicitly. It would be bad for Neo4j to automatically return all the relationships for a super node with thousands of relationships for example, as it would mean lots of I/O, possibly for nothing.
MATCH (book:Book)-[r]-(person)-[r2]-()
RETURN book, r, person, collect(r2) AS r2
Thanks to InverseFalcon, this is my query that works.
MATCH p = (book:Book)-[r]-(person:Person)
UNWIND nodes(p) as allnodes WITH COLLECT(ID(allnodes)) AS ALLID
MATCH (a)-[r2]-(b)
WHERE ID(a) IN ALLID AND ID(b) IN ALLID
WITH DISTINCT r2
RETURN startNode(r2), r2, endNode(r2)
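The two steps of that query (collect the matched nodes, then keep every relationship whose endpoints are both in that set) can be sketched in plain Python with the data from the question. This is a simplified illustration: it ignores labels and represents each relationship as a (start, type, end) tuple.

```python
# The three nodes and relationships created in the question.
relationships = [
    ("david", "KNOWS", "mike"),
    ("david", "WRITE", "book"),
    ("mike", "WRITE", "book"),
]

# Step 1: collect the nodes on any one-hop path touching the book.
matched = set()
for start, _, end in relationships:
    if "book" in (start, end):
        matched.update((start, end))

# Step 2: keep every relationship whose two endpoints are both in that set.
connecting = [r for r in relationships if r[0] in matched and r[2] in matched]
print(connecting)
```

The KNOWS relationship between david and mike is picked up in step 2 even though it does not touch the book node.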

Create node and relationship given parent node

I am creating a word tree but when I execute this cypher query:
word = "MATCH {} MERGE {}-[:contains]->(w:WORD {{name:'{}'}}) RETURN w"
.format(parent_node, parent_node, locality[i])
where parent_node has a type Node
It throws this error:
py2neo.cypher.error.statement.InvalidSyntax: Can't create `n8823` with properties or labels here. It already exists in this context
formatted query looks like this:
'MATCH (n8823:HEAD {name:"sanjay"}) MERGE (n8823:HEAD {name:"sanjay"})-[:contains]->(w:WORD {name:\'colony\'}) RETURN w'
The formatted query is broken and won't work, but I also don't see how that could be what the formatted query actually looks like. When you do your string format you pass the same parameter (parent_node) twice so the final string should repeat whatever that parameter looks like. It doesn't, and instead has two different patterns for the match and merge clauses.
Your query should look something like
MATCH (n8823:Head {name: "sanjay"})
MERGE (n8823)-[:CONTAINS]->(w:Word {name: "colony"})
RETURN w
It's probably a bad idea to do string formatting on a Node object. It is better either to use property values from your node object in the Cypher query to match the right node (and then use only the variable bound to the matched node in the MERGE clause), or to use the methods of the node object to do the merge.
Although the MERGE clause is able to bind identifiers (like n8823), Cypher unfortunately does not allow MERGE to re-bind an identifier that had already been bound -- even if it would not actually change the binding. (On the other hand, the MATCH clause does allow "rebinding" to the same binding.) Simply re-using a bound identifier is OK, though.
So, the workaround is to change your Cypher query to re-use the bound identifier. Also, the recommended way to dynamically specify query data without changing the overall structure of a query is to use "query parameters". For py2neo, code along these lines should work for you (note that the parent_name variable would contain a name string, like "sanjay"):
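from py2neo import Graph

graph = Graph()
cypher = graph.cypher
# The query text stays fixed; {a} and {b} are query parameters,
# so they must not be wrapped in quotes or extra braces.
results = cypher.execute(
    "MATCH (foo:HEAD {name:{a}}) MERGE (foo)-[:contains]->(w:WORD {name:{b}}) RETURN w",
    a=parent_name, b=locality[i])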
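As a driver-independent sketch of the same idea, the query text can be kept as a constant while only the parameter map changes per call. The build_merge_word helper and the parameter names are hypothetical, and the $name placeholders assume the parameter syntax of current Neo4j versions; how the pair is finally executed depends on the driver or py2neo version in use.

```python
# Query text is a constant; only the parameter values change per call.
# HEAD, WORD and contains are the names used in the question.
MERGE_WORD_QUERY = (
    "MATCH (parent:HEAD {name: $parent_name}) "
    "MERGE (parent)-[:contains]->(w:WORD {name: $word}) "
    "RETURN w"
)

def build_merge_word(parent_name, word):
    """Return the (query, parameters) pair to hand to a driver's run call."""
    return MERGE_WORD_QUERY, {"parent_name": parent_name, "word": word}

query, params = build_merge_word("sanjay", "colony")
print(query)
print(params)
```

Because no value is ever formatted into the query string, the bound variable is simply reused in MERGE and quoting problems in the data cannot break the query.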
