Losing node properties while using apoc.nodes.group() - neo4j

I have a set of nodes which look like this:
dalle {
"ident": "A-1-1-1",
"networkId": 1,
"numberId": 1,
"floor": 1,
"room": 1,
"building": "A",
"buildingId": 1
}
I want to group my nodes, so I run this command:
CALL apoc.nodes.group(['dalle'], ['building', 'floor', 'room'])
YIELD nodes, relationships
RETURN nodes, relationships
The result I got is really nice except for one detail: I lose some properties. My nodes are now:
{
"floor": 3,
"count_*": 1,
"building": "C",
"room": 1
}
Why do I lose properties?
I tried to update the nodes to set some properties back like this:
CALL apoc.nodes.group(['dalle'], ['building', 'floor', 'room'])
YIELD nodes, relationships
FOREACH(n IN nodes | SET n.ident=n.building+n.floor)
RETURN nodes, relationships
but it changes nothing in my query result.
thanks !

You should read the documentation for the apoc.nodes.group procedure to see if it is actually the right thing for you to be using.
That procedure returns virtual nodes and relationships. Since virtual nodes and relationships are not actually in the DB, the SET clause does not work on them.
If you want more properties to show up in the virtual nodes, then you'd have to add them to the properties list you pass to the procedure. For example: ['building', 'floor', 'room', 'ident'].
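To see why the extra properties disappear, here is a minimal pure-Python sketch of the grouping idea. It is not APOC's implementation, only an illustration under the assumption that each group keeps the grouping keys plus a count. The `group_nodes` helper and the second sample node ("A-1-1-2") are hypothetical.

```python
from collections import defaultdict

def group_nodes(nodes, group_keys):
    """Group dict-like nodes by group_keys, keeping only those keys plus a count."""
    counts = defaultdict(int)
    for node in nodes:
        key = tuple(node[k] for k in group_keys)
        counts[key] += 1
    grouped = []
    for key, count in counts.items():
        summary = dict(zip(group_keys, key))
        summary["count_*"] = count  # mirrors the "count_*" property seen in the question
        grouped.append(summary)
    return grouped

# Hypothetical sample data: the first node is from the question, the second is invented.
dalles = [
    {"ident": "A-1-1-1", "networkId": 1, "numberId": 1, "floor": 1,
     "room": 1, "building": "A", "buildingId": 1},
    {"ident": "A-1-1-2", "networkId": 1, "numberId": 2, "floor": 1,
     "room": 1, "building": "A", "buildingId": 1},
]

# Grouping by room: ident is not a grouping key, so it is not in the result.
by_room = group_nodes(dalles, ["building", "floor", "room"])

# Adding ident to the key list keeps it, but also splits the group per ident.
by_ident = group_nodes(dalles, ["building", "floor", "room", "ident"])
```

Note the trade-off the sketch makes visible: a property added to the grouping list also becomes part of the grouping key, so nodes with different `ident` values no longer end up in the same group.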

Related

Cypher: how do I check that at least one of the nodes in a set matches a given property?

I have a data model in neo4j where a Person node may be "merged" with another — not literally merged, just a relation in the form:
(a:Person)-[:MERGED]-(other:Person)
And, of course, other can be merged with someone else, in a potentially endless path.
I have a query to return a list of persons, with the 'merged' persons — that is, anyone in the :MERGED path — embedded as a property.
MATCH (a:Person)
CALL {
WITH a
MATCH path = (a)-[:MERGED*]-(other)
RETURN COLLECT(other{.label}) as b
}
RETURN a{.label, merged_items:b}
This returns, for example, something like:
{
"label": "John Smith",
"merged_items": [
{
"label": "Toby Jones"
},
{
"label": "Seamus McGibbon"
},
{
"label": "Aaron Drew"
}
]
}
for each of the Persons in this chain of merges (so actually the full result has four items, with each of the connected people being a — this is precisely what I want).
Now, I want to be able to filter the results by the Person.label, but any one of the Persons in the chain could match (either a OR any of the others).
Any idea how I might go about this?
I've tried a lot of different things (any(), for example) but can't get it to work.
The syntax for any() is WHERE any(e IN list WHERE predicate(e))
So in your case, this should work.
WITH COLLECT(other{.label}) as b
WHERE any(e IN b WHERE e.label = a.label)
RETURN b
You could in principle already apply it to the path before you collect. The tail(list) is so that it excludes a which would be the first node of the path.
MATCH ...
WHERE any(n in tail(nodes(path)) WHERE n.label = a.label)
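For readers less familiar with the `any()` and `tail()` semantics used above, this small Python sketch mimics them on a plain list of labels. The `chain_matches` helper and the `wanted` parameter are hypothetical names, and the path is just the sample data from the question laid out in order.

```python
def tail(items):
    """Everything except the first element, like Cypher's tail(list)."""
    return items[1:]

def chain_matches(path_labels, wanted, include_start=False):
    """True if any node on the path (optionally including the start node) has the wanted label."""
    candidates = path_labels if include_start else tail(path_labels)
    return any(label == wanted for label in candidates)

# The start node (a) comes first, followed by the merged persons.
path = ["John Smith", "Toby Jones", "Seamus McGibbon", "Aaron Drew"]

hit_in_chain = chain_matches(path, "Toby Jones")                      # matches a merged person
start_excluded = chain_matches(path, "John Smith")                    # start node is skipped by tail()
start_included = chain_matches(path, "John Smith", include_start=True)
```

To match either a or any of the others, as the question asks, the check would run over the whole path rather than its tail.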

Neo4j - return nodes and relationships as a nested object/map

I'm trying to essentially get all of the data under a specific high level node and return it as a nested JSON. I've been messing around with it for a while, and I could probably make something work with a lot of OPTIONAL MATCHES and WITH, but it looks gigantic, and it's not clean.
In addition, I run into issues with duplicates. In the schema below, if x1 has multiple x1 nodes, and multiple y1 child nodes, then I get duplicated x1 nodes in my result set. Which is not ideal.
Should I just run separate queries, and combine in code? Should I try to go down this path of writing a massive query and try to combine in a single JSON on the fly?
I'll define my schema first, so I have something like
(Movie)-[rel1]-(Actor)-[x]
where [x] can be:
[x1]-(node1)-[y1]
[y1]-(something)
[x2]-(node2)
[y1]-(something)
[x3]-(node2)
[y1]-(something)
[x4]-(node2)
[x5]-(node2)
and what I want is something like:
{
Movie:
...movie properties,
Actor: [
x1: [
{
...x1 properties,
y1: [
{...y1 properties},
{...y1 properties}
]
},
{ another x1 node with properties, and its children of y1's}
]
x2... (like above)
x3... (like above)
]
}
This is the query I've been playing around with. I've tried a few different iterations.
OPTIONAL MATCH (Movie)-[rel1]->(Actor)-[x1]->(node1)-[y1]-(something1)
WITH Movie, Actor, { node1: COLLECT(node1), something1: COLLECT(something1)} as node1
// continue writing same query as above, with x2 and y2, etc
RETURN {
Movie: Movie,
Actor: COLLECT({
Actor: Actor,
node1: node1, // this returns duplicates
})
}

Neo4j: Cypher query returns wrong json result

I have a problem with my cypher query.
Situation explained:
A user is able to connect to other CONTACT nodes, but he can also connect to EVENT nodes. Other users can also connect to these event nodes. We expect to retrieve the nodes we are connected to (CONTACT & EVENT) but we also need to retrieve the event nodes of the CONTACT nodes that we are connected to.
This is the graph we want to see when we retrieve the connected nodes from the bottom center CONTACT node:
But we receive this json output:
{
"_type": "Node",
"_id": 1,
"nodeType": "EVENT",
"nodeId": 1,
"connected_with": [
{
"_type": "Node",
"_id": 0,
"nodeType": "CONTACT",
"nodeId": 1
},
{
"_type": "Node",
"_id": 2,
"nodeType": "CONTACT",
"nodeId": 2,
"connected_with": [
{
"_type": "Node",
"_id": 0,
"nodeType": "CONTACT",
"nodeId": 1
}
]
}
]
}
We want to go 2 levels deep, meaning we want to see
contacts that we are connected to but also contacts we
"met" at an event hence the reason we want to go 2 levels deep.
We currently have this cypher query running but as previously mentioned, it's not working.
MATCH path = (n:Node {nodeId: 1})<-[:CONNECTED_WITH*]-(nodes)
WITH collect(path) as paths
CALL apoc.convert.toTree(paths) yield value as json
RETURN json
Any help would be appreciated!
Your results seem to match what you say you want, except that it is in tree form (which you asked for).
You state that you do not "see" what you expected (presumably in the neo4j Browser). This is because the results you asked for are not plain nodes, relationships, and/or paths.
Try this, instead (note also the upper bound of 2 on the depth of the variable-length path pattern):
MATCH path = (n:Node {nodeId: 1})<-[:CONNECTED_WITH*..2]-(nodes)
RETURN path
Aside: Having just a single node label, Node, with a nodeType property that specifies the exact "type" of node is not generally the right way to model things. It makes it harder to understand the DB, tends to complicate your code, and makes it harder to take advantage of indexing. You probably want to have separate labels (say, Person and Event). You may also want to have different relationship types as well.
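As an illustration of what the upper bound of 2 does, the following Python sketch walks incoming connections breadth-first and stops after two hops. It is a simplification: it tracks visited nodes, whereas Cypher enumerates paths without reusing a relationship. The `within_hops` helper and the sample graph are hypothetical.

```python
from collections import deque

def within_hops(incoming, start, max_hops):
    """Return {node: hop count} for nodes that reach `start` in 1..max_hops hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == max_hops:
            continue  # do not expand past the depth limit
        for source in incoming.get(current, []):
            if source not in seen:
                seen[source] = seen[current] + 1
                queue.append(source)
    del seen[start]  # the start node itself is not part of the result
    return seen

# incoming[x] lists the nodes that have a CONNECTED_WITH relationship pointing at x.
incoming = {
    "event1": ["contact1", "contact2"],
    "contact2": ["contact3"],
    "contact3": ["contact4"],
}

reachable = within_hops(incoming, "event1", 2)
```

Here `contact4` is three hops away from `event1`, so it is left out, which is the effect `*..2` has on the variable-length pattern.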

Neo4j query node property.

I have a database with the entities person (name, age) and project (name).
Can I write a Cypher query that tells me whether a node is a person or a project?
For example, consider these instances:
Node (name = Alice, age= 20)
Node (name = Bob, age = 31)
Node (name = project1)
Node (name = project2)
- I want to know whether I can just ask for project1 and be told that it is a project.
- Or query Alice and be told that it is a person.
Thanks
So your use case is to search things by name, and those things can be of several types instead of a single type.
Just to note, in general, this is not what Neo4j is built for. Typically in Neo4j queries you know the type of the thing you're searching for, and you're exploring relationships between that thing (or things) to figure out associations or data derived from that.
That said, there are ways to do this, though it's worth going through the rest of your use cases and seeing if Neo4j is really the best tool for what you're trying to do.
Whenever you're querying by a property, you either want a unique constraint on the label/property, or an index on the label/property. Note that you need a combination of a label and a property for this; you cannot blindly ask for a node with a property without specifying a label and get good performance, as it will have to do a scan of all nodes in your database (there are some older manual indexes in Neo4j, but I'm not sure if these will continue to be supported; the schema indexes are recommended by the developers).
There is a workaround to this, as Neo4j allows multiple labels on the same node. If you only expect to query certain types by name (for example, only projects and people), you might create a :Named label, and set that label on all :Project and :Person nodes (and any other labels where it should apply). You can then create an index on :Named.name. That way your query would be something like:
MATCH (n:Named)
WHERE n.name = 'blah'
WITH LABELS(n) as types
WITH [type IN types WHERE type <> 'Named'] AS labels
RETURN labels
Keep in mind that you haven't specified whether a name should be unique among node types, so a :Person and a :Project, or multiple :Persons, could share the same name; I'm unsure how that affects what should happen on your end. If every named thing ought to have a unique name, you should create a unique constraint on :Named.name (though again, it's on you to ensure that every node you create that ought to be :Named has the :Named label applied on creation).
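The shared-label idea can be pictured with a small Python sketch: one lookup by name, then the helper label is dropped to reveal the actual type. This only models the logic, not Neo4j's index; `build_name_index` and `types_for` are hypothetical helpers, and the nodes are the sample data from the question.

```python
from collections import defaultdict

nodes = [
    {"labels": ["Person", "Named"], "name": "Alice", "age": 20},
    {"labels": ["Person", "Named"], "name": "Bob", "age": 31},
    {"labels": ["Project", "Named"], "name": "project1"},
    {"labels": ["Project", "Named"], "name": "project2"},
]

def build_name_index(nodes):
    """Map each name to the label lists of all :Named nodes carrying that name."""
    index = defaultdict(list)
    for node in nodes:
        if "Named" in node["labels"]:
            index[node["name"]].append(node["labels"])
    return index

def types_for(index, name):
    """Return, per matching node, its labels without the helper label 'Named'."""
    return [[label for label in labels if label != "Named"]
            for labels in index.get(name, [])]

index = build_name_index(nodes)
project_types = types_for(index, "project1")
person_types = types_for(index, "Alice")
```

Because names are not guaranteed unique, `types_for` returns one entry per matching node rather than a single type.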
You should use node labels (like Person and Project) to represent node "types".
For example, to create a person and a project:
CREATE (:Person {name: 'Alice', age: 20})
CREATE (:Project {name: 'project1'})
To find the project(s) named 'Fred':
MATCH (p:Project {name: 'Fred'})
RETURN p;
To get a collection of the labels of node n, you can invoke the LABELS(n) function. You can then look in that collection to see if the label you are looking for is in there. For example, if your Cypher query somehow obtains a node n, then this snippet would return n if and only if it has the Person label:
.
.
.
WHERE 'Person' IN LABELS(n)
RETURN n;
[UPDATED]
If you want to find all nodes with the name property value of "Fred":
MATCH (n {name: 'Fred'})
...
If you want to find all relationships with the name property value of "Fred":
MATCH ()-[r {name: 'Fred'}]-()
...
If you want to match both in a single query, you have many ways to do that, depending on your exact use case. For example, if you want a cartesian product of the matching nodes and relationships:
OPTIONAL MATCH (n {name: 'Fred'})
OPTIONAL MATCH ()-[r {name: 'Fred'}]-()
...

Create multiple nodes and relationships in several Cypher statements

I want to create multiple neo4j nodes and relationships in one Cypher transaction. I'm using py2neo, which allows issuing multiple Cypher statements in one transaction.
I thought I'd add a statement for each node and relationship I create:
tx.append('CREATE (n:Label { prop: val })')
tx.append('CREATE (m:Label { prop: val2 })')
Now I want to create a relationship between the two created nodes:
tx.append('CREATE (n)-[:REL]->(m)')
This doesn't work as expected. No relationship is created between the first two nodes, since there's no n or m in the context of the last statement (there is a new relationship between two new nodes - four nodes are created in total)
Is there a way around this? Or should I combine all the calls to CREATE (around 100,000 per logical transaction) in one statement?
It just hurts my brain thinking about such a statement, because I'll need to store everything on one big StringIO, and I lose the ability to use Cypher query parameters - I'll need to serialize dictionaries to text myself.
UPDATE:
The actual graph layout is more complicated than that. I have multiple relationship types, and each node is connected to at least two other nodes, while some nodes are connected to hundreds of nodes.
You don't need multiple queries. You can use a single CREATE to create each relationship and its related nodes:
tx.append('CREATE (:Label { prop: val })-[:REL]->(:Label { prop: val2 })')
Do something like this:
rels = [(1,2), (3,4), (5,6)]
query = """
CREATE (n:Label {prop: {val1} }),
(m:Label {prop: {val2} }),
(n)-[:REL]->(m)
"""
tx = graph.cypher.begin()
for val1, val2 in rels:
tx.append(query, val1=val1, val2=val2)
tx.commit()
And if your data is large enough consider doing this in batches of 5000 or so.
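A minimal sketch of the batching step, assuming the relationship pairs fit in a list. The `batches` helper is hypothetical, and the sample pairs are generated only for illustration. The database calls are shown as comments because they need a live connection; they reuse the `graph.cypher.begin()`, `tx.append` and `tx.commit()` calls from the snippet above.

```python
def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical sample data: 6000 (val1, val2) pairs.
rels = [(n, n + 1) for n in range(1, 12001, 2)]

batch_sizes = []
for batch in batches(rels, 5000):
    # With a live database, each batch would get its own transaction:
    #   tx = graph.cypher.begin()
    #   for val1, val2 in batch:
    #       tx.append(query, val1=val1, val2=val2)
    #   tx.commit()
    batch_sizes.append(len(batch))
```

Committing per batch keeps each transaction small, at the cost of no longer having a single all-or-nothing transaction for the whole import.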
