I've built a simple graph of one node type and two relationship types: IS and ISNOT. IS relationships means that the node pair belongs to the same group, and obviouslly ISNOT represents the not belonging rel.
When I have to get the groups of related nodes I run the following query:
"MATCH (a:Item)-[r1:IS*1..20]-(b:Item) RETURN a,b"
So this returns a lot of a is b results and I added some code to group them after.
What I'd like is to group them modifying the query above, but given my rookie level I haven't yet figured it out. What I'd like is to get one row per group like:
(node1, node3, node5)
(node2,node4,node6)
(node7,node8)
I assume what you call groups are nodes present in a path where all these nodes are connected with a :IS relationship.
I think this query is what you want :
MATCH p=(a:Item)-[r1:IS*1..20]-(b:Item)
RETURN nodes(p) as nodes
Where p is a path describing your pattern, then you return all the nodes present in the path in a collection.
Note that, a simple graph (http://console.neo4j.org/r/ukblc0) :
(item1)-[:IS]-(item2)-[:IS]-(item3)
will return already 6 paths, because you use undericted relationships in the pattern, so there are two possible paths between item1 and item2 for eg.
Related
Let's consider a trivial graph with a directed relationship:
CREATE
(`0` :Car {value:"Ford"})
, (`1` :Car {value:"Subaru"})
, (`0`)-[:`DOCUMENT` {value:"DOC-1"}]->(`1`);
The following query MATCH (n1:Car)-[r:DOCUMENT]-(n2:Car) RETURN * returns:
╒══════════════════╤══════════════════╤═════════════════╕
│"n1" │"n2" │"r" │
╞══════════════════╪══════════════════╪═════════════════╡
│{"value":"Subaru"}│{"value":"Ford"} │{"value":"DOC-1"}│
├──────────────────┼──────────────────┼─────────────────┤
│{"value":"Ford"} │{"value":"Subaru"}│{"value":"DOC-1"}│
└──────────────────┴──────────────────┴─────────────────┘
The graph defines only a Ford->Subaru relationship, why are two relationships?
How to interpret the reversed one (line 1; not specified in the CREATE) statement?
Note: This is a follow-up to Convert multiple relationships between 2 nodes to a single one with weight asked by me earlier. I solved my problem, but I'm not convinced my answer is the best solution.
Your MATCH statement here doesn't specify the direction, therefore there are two possible paths that will match the pattern (remember that the ordering of nodes in the path is important and distinguishes paths from each other), thus your two answers.
If you specify the direction of the relationship instead you'll find there is only one possible path that matches:
MATCH (n1:Car)-[r:DOCUMENT]->(n2:Car)
RETURN *
As for the question of why we get two paths back when we omit the direction, remember that paths are order-sensitive: two paths that have the same elements but with a different order of the elements are different paths.
To help put this into perspective, consider the following two queries:
# Query 1
MATCH (n1:Car)-[r:DOCUMENT]-(n2:Car)
WHERE n1.value = 'Ford'
RETURN *
╒══════════════════╤══════════════════╤═════════════════╕
│"n1" │"n2" │"r" │
╞══════════════════╪══════════════════╪═════════════════╡
│{"value":"Ford"} │{"value":"Subaru"}│{"value":"DOC-1"}│
└──────────────────┴──────────────────┴─────────────────┘
# Query 2
MATCH (n1:Car)-[r:DOCUMENT]-(n2:Car)
WHERE n1.value = 'Subaru'
RETURN *
╒══════════════════╤══════════════════╤═════════════════╕
│"n1" │"n2" │"r" │
╞══════════════════╪══════════════════╪═════════════════╡
│{"value":"Subaru"}│{"value":"Ford"} │{"value":"DOC-1"}│
└──────────────────┴──────────────────┴─────────────────┘
Conceptually (and also used by the planner, in absence of indexes), to get to each of the above results you start off with the results of the full match as in your description, then filter to the only one which meets the given criteria.
The results above would not be consistent with the original directionless match query if that original query only returned a single row instead of two.
Additional information from the OP
It will take a while to wrap my head around it, but it does work this way and here's a piece of documentation to confirm it's by design:
When your pattern contains a bound relationship, and that relationship pattern doesn’t specify direction, Cypher will try to match the relationship in both directions.
MATCH (a)-[r]-(b)
WHERE id(r)= 0
RETURN a,b
This returns the two connected nodes, once as the start node, and once as the end node.
Having this query working in Cypher (Neo4j):
MATCH p=(g:Node)-[:FOLLOWED_BY *2..2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
RETURN p
which returns all possible paths belonging a specific group (group is just a property to classify nodes), I am struggling to get a query that returns the paths in common between both collection of paths. It would be something like this:
MATCH p=(g:Node)-[:FOLLOWED_BY *2..2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
MATCH p=(g3:Node)-[:FOLLOWED_BY *2..2]->(g4:Node)
WHERE g3.group=15 AND g4.group=15
RETURN INTERSECTION(path1, path2)
Of course I made that up. The goal is to get all the paths in common between both queries.
The start/end nodes of your 2 MATCHes have different groups, so they can never find common paths.
Therefore, when you ask for "paths in common", I assume you actually want to find the shared middle nodes (between the 2 sets of 3-node paths). If so, this query should work:
MATCH p1=(g:Node)-[:FOLLOWED_BY *2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
WITH COLLECT(DISTINCT NODES(p1)[1]) AS middle1
MATCH p2=(g3:Node)-[:FOLLOWED_BY *2]->(g4:Node)
WHERE g3.group=15 AND g4.group=15 AND NODES(p2)[1] IN middle1
RETURN DISTINCT NODES(p2)[1] AS common_middle_node;
I have database with entities person (name,age) and project (name).
can I query the database in cypher that specifies me it is person or project?
for example consider I have these two instances for each :
Node (name = Alice, age= 20)
Node (name = Bob, age = 31)
Node (name = project1)
Node (name = project2)
-I want to know, is there any way that I just say project1 and it tells me that this is a project.
-or I query Alice and it says me this is a person?
Thanks
So your use case is to search things by name, and those things can be of several types instead of a single type.
Just to note, in general, this is not what Neo4j is built for. Typically in Neo4j queries you know the type of the thing you're searching for, and you're exploring relationships between that thing (or things) to figure out associations or data derived from that.
That said, there are ways to do this, though it's worth going through the rest of your use cases and seeing if Neo4j is really the best tool for what you're trying to do
Whenever you're querying by a property, you either want a unique constraint on the label/property, or an index on the label/property. Note that you need a combination of a label and a property for this; you cannot blindly ask for a node with a property without specifying a label and get good performance, as it will have to do a scan of all nodes in your database (there are some older manual indexes in Neo4j, but I'm not sure if these will continue to be supported; the schema indexes are recommended by the developers).
There is a workaround to this, as Neo4j allows multiple labels on the same node. If you only expect to query certain types by name (for example, only projects and people), you might create a :Named label, and set that label on all :Project and :Person nodes (and any other labels where it should apply). You can then create an index on :Named.name. That way your query would be something like:
MATCH (n:Named)
WHERE n.name = 'blah'
WITH LABELS(n) as types
WITH FILTER(type in types WHERE type <> 'Named') as labels
RETURN labels
Keep in mind that you haven't specified if a name should be unique among node types, so it could be possible for a :Person or a :Project or multiple :Persons to have the same name, unsure how that affects what should happen on your end. If every named thing ought to have a unique name, you should create a unique constraint on :Named.name (though again, it's on you to ensure that every node you create that ought to be :Named has the :Named label applied on creation).
You should use node labels (like Person and Project) to represent node "types".
For example, to create a person and a project:
CREATE (:Person {name: 'Alice', age: 20})
CREATE (:Project {name: 'project1'})
To find the project(s) named 'Fred':
MATCH (p:Project {name: 'Fred'})
RETURN p;
To get a collection of the labels of node n, you can invoke the LABELS(n) function. You can then look in that collection to see if the label you are looking for is in there. For example, if your Cypher query somehow obtains a node n, then this snippet would return n if and only if it has the Person label:
.
.
.
WHERE 'Person' IN LABELS(n)
RETURN n;
[UPDATED]
If you want to find all nodes with the name property value of "Fred":
MATCH (n {name: 'Fred'})
...
If you want to find all relationships with the name property value of "Fred":
MATCH ()-[r {name: 'Fred'})-()
...
If you want to match both in a single query, you have many ways to do that, depending on your exact use case. For example, if you want a cartesian product of the matching nodes and relationships:
OPTIONAL MATCH (n {name: 'Fred'})
OPTIONAL MATCH ()-[r {name: 'Fred'})-()
...
How can we add a relationship to the query.
Say A-[C01]-B-[C02]-D and A-[C01]-B-[C03]-E
C01 C02 C03 are relationship codes I want to get output
B E
because I want only nodes that can be reached unbroken by C01 or C03
How can I get this result in Cypher?
You may want to clarify, what you're asking for seems like a very simple case of matching. You may want to provide some more info, such as node labels and how you're matching to your start nodes, since without these we have to make things up for example code.
MATCH (a:Thing)
WHERE a.ID = 123
WITH a
MATCH (a)-[:C01|C03*]->(b:Thing)
RETURN b
The key here is specifying multiple relationship types to traverse, using * for multiplicity, so it will match on all nodes that can be reached by any chain of those relationships.
What I'm looking for
With variable length relationships (see here in the neo4j manual), it is possible to have a variable number of relationships with a certain label between two nodes.
# Cypher
match (g1:Group)-[:sub_group*]->(g2:Group) return g1, g2
I'm looking for the same thing with nodes, i.e. a way to query for two nodes with a variable number of nodes in between, but with a label condition on the nodes rather than the relationships:
# Looking for something like this in Cypher:
match (g1:Group)-->(:Group*)-->(g2:Group) return g1, g2
Example
I would use this mechanism, for example, to find all (direct or indirect) members of a group within a group structure.
# Looking for somthing like this in Cypher:
match (group:Group)-->(:Group*)-->(member:User) return member
Take, for example, this structure:
group1:Group
|-------> group2:Group -------> user1:User
|-------> group3:Group
|--------> page1:Page -----> group4:Group -----> user2:User
In this example, user1 is a member of group1 and group2, but user2 is only member of group4, not member of the other groups, because a non-Group labeled node is in between.
Abstraction
A more abstract pattern would be a kind of repeat operator |...|* in Cypher:
# Looking for repeat operator in Cypher:
match (g1:Group)|-[:is_subgroup_of]->(:Group)|*-[:is_member_of]->(member:User)
return member
Does anyone know of such a repeat operator? Thanks!
Possible Solution
One solution I've found, is to use a condition on the nodes using where, but I hope, there is a better (and shorter) soluation out there!
# Cypher
match path = (member:User)<-[*]-(g:Group{id:1})
where all(node in tail(nodes(path)) where ('Group' in labels(node)))
return member
Explanation
In the above query, all(node in tail(nodes(path)) where ('Group' in labels(node))) is one single where condition, which consists of the following key parts:
all: ALL(x in coll where pred): TRUE if pred is TRUE for all values in
coll
nodes(path): NODES(path): Returns the nodes in path
tail(): TAIL(coll): coll except first element–––I'm using this, because the first node is a User, not a Group.
Reference
See Cypher Cheat Sheet.
How about this:
MATCH (:Group {id:1})<-[:IS_SUBGROUP_OF|:IS_MEMBER_OF*]-(u:User)
RETURN DISTINCT u
This will:
find all subtrees of the group with ID 1
only traverse the relationships IS_GROUP_OF and IS_MEMBER_OF in incoming direction (meaning sub-groups or users that belong to group with ID or one of its sub-groups)
only return nodes which have a IS_MEMBER_OF relationship to a group in the subtree
and discard duplicate results (users who belong to more than one of the groups in the tree would otherwise appear multiple times)
I know this relies on relationships types rather than node labels, but IMHO this is a more graphy approach.
Let me know if this would work or not.