Creating unique relationship property in a FOREACH in neo4j

I am trying to accomplish something that is quite simple in Python but not so simple in Neo4j.
I'd appreciate any comments and suggestions to improve the procedure!
Within a Python script, I am trying to create a relationship, along with a property on it, for every pair of nodes. From a data analysis (not a CSV file), I ended up with a dataframe with three columns, as follows:
name1 name2 points
===========================
Jack Sara 0.3
Jack Sam 0.4
Jack Jill 0.2
Mike Jack 0.4
Mike Sara 0.5
...
From this point, I would like to create all the nodes for the people (Jack, Sara, Sam, Mike, etc.) as well as their relationships, each with a property named points.
First I tried to match all the nodes and then use "FOREACH" to set the relationship property one at a time.
tx = graph.cypher.begin()
qs2 = "MATCH (p1:person {name:"Jack"}), (p2:person)
WHERE p2.name IN ["Sara","Jill","Mike"]
FOREACH (r IN range(10) |
CREATE (p1)-[:OWES TO {score:{score_list}[r]}]->(p2))"
The above statement does not do what I expected. Instead of matching one node to another, it matches every node in p2 and creates the relationships for each pair, resulting in multiple copies of the same information.
Is there a notation to indicate one node at a time? If you think there is a better approach than the above, please share it with me. Thank you!

The easiest approach would be to export the data to a CSV file and then use the LOAD CSV command in Cypher.
LOAD CSV WITH HEADERS FROM <url> AS csvLine
MATCH (p1:Person {name:csvLine.name1}), (p2:Person {name:csvLine.name2})
CREATE (p1)-[:OWES_TO {score: toFloat(csvLine.points)}]->(p2)
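Since the data lives in a dataframe rather than a file, here is a minimal sketch of producing the CSV that the query above expects, using only the standard library. The rows variable and the rows_to_csv helper are illustrative names, not part of the question. Note that LOAD CSV reads every field as a string, so points arrives as "0.3" unless it is converted with toFloat() in the query.

```python
import csv
import io

# Rows as they might come out of the dataframe (values taken from the question).
rows = [
    ("Jack", "Sara", 0.3),
    ("Jack", "Sam", 0.4),
    ("Jack", "Jill", 0.2),
]


def rows_to_csv(rows):
    """Return CSV text whose header matches the names used in the LOAD CSV query."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name1", "name2", "points"])  # header row for WITH HEADERS
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(rows)
print(csv_text)
```

Writing csv_text to a file that the Neo4j server can read (or serving it from a URL) is left out, because the location depends on your setup.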
If you cannot use that approach, you can send a parameterized Cypher statement to the transactional HTTP endpoint. The parameter is a single-element map containing an array of your data structure. At the HTTP level, the request body would look like:
{
"statements": [
{
"parameters": {
"data": [
{
"name1": "Jack", "name2": "Sara", "points": 0.3
},
{
"name1": "Jack", "name2": "Sam", "points": 0.4
},
{
"name1": "Jack", "name2": "Jill", "points": 0.2
} // ...
]
},
"statement": "UNWIND {data} AS row
MATCH (p1:Person {name:row.name1}), (p2:Person {name:row.name2})
CREATE (p1)-[:OWES_TO {score: row.points}]->(p2)"
}
]
}
Update regarding the comment below:
Q: How can I create the parameters from Python?
A: Use the Python json module:
import json
json.dumps({'data':[{'name1':'Jack', 'name2':'Sara', 'points':0.3},{'name1':'Jack', 'name2':'Sam', 'points':0.4}]})
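To go from the rows to the full request body shown earlier, something along these lines should work. It is a sketch under assumptions: rows and build_payload are illustrative names, and the statement uses the older {data} parameter syntax from this answer (newer Neo4j versions expect $data instead). With pandas, df.to_dict("records") would give the same list of dictionaries.

```python
import json

# Rows as they might come out of the dataframe (values taken from the question).
rows = [("Jack", "Sara", 0.3), ("Jack", "Sam", 0.4), ("Jack", "Jill", 0.2)]

# One statement; the whole list travels as the single "data" parameter.
statement = (
    "UNWIND {data} AS row "
    "MATCH (p1:Person {name: row.name1}), (p2:Person {name: row.name2}) "
    "CREATE (p1)-[:OWES_TO {score: row.points}]->(p2)"
)


def build_payload(rows):
    """Build the request body for the transactional endpoint as a dict."""
    data = [{"name1": a, "name2": b, "points": p} for a, b, p in rows]
    return {"statements": [{"statement": statement, "parameters": {"data": data}}]}


body = json.dumps(build_payload(rows))
print(body)
```

Because UNWIND turns each list element into its own row, every MATCH sees exactly one name pair, which avoids the duplicated relationships described in the question.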

Related

How to Create One to Many Relationships From JSON with Neo4j Cypher

I would like to create one to many relationships from JSON items in a file. Specifically, each JSON item contains an author and the id of books they have published. I have author nodes and book nodes that already exist in the database.
The data looks like:
{"id": "1", "name": "Dr. Suess", "books": [{"i": "100", "i": "101"}]}
{"id": "2", "name": "Shell Silverstein", "books": [{"i": "200", "i": "201"}]}
I am trying to import the nodes with the following code:
CALL apoc.load.json('file:/data.txt') YIELD value AS q
MATCH (a:Author {id:q.id})
UNWIND q.books as books
WITH a, books
MATCH (b:Books {id:books.i})
CREATE (a)-[:AUTHORED]->(b)
However, this is importing a fraction of the nodes I am expecting. Any suggestions on how to approach this problem would be greatly appreciated!
If you say that not all of the authors and books are imported, it means that the two MATCH statements do not find what they are looking for.
One possible scenario is that the IDs are stored as integers in the database, but you are now trying to match them as strings. With the information provided, it is hard to say more.
I would change the MATCH clauses into MERGE to see if that is the problem:
CALL apoc.load.json('file:/data.txt') YIELD value AS q
MERGE (a:Author {id:q.id})
UNWIND q.books as books
WITH a, books
MERGE (b:Books {id:books.i})
CREATE (a)-[:AUTHORED]->(b)
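Another possible cause, which is an assumption on my part, is the shape of the sample data itself: each books entry is a single object that repeats the key "i". Many JSON parsers keep only one value per key. The sketch below shows the behavior of Python's json module, which keeps the last one. How apoc.load.json treats duplicate keys is not verified here, so treat this as something to check rather than a diagnosis. The reshaped variable is a hypothetical alternative layout with one object per book.

```python
import json

# A line copied from the question's sample data.
line = '{"id": "1", "name": "Dr. Suess", "books": [{"i": "100", "i": "101"}]}'

record = json.loads(line)
books = record["books"]
print(books)  # the duplicate key "i" collapses to a single entry

# Hypothetical reshaped layout: one object per book, so no key repeats.
reshaped = {"id": "1", "name": "Dr. Suess", "books": [{"i": "100"}, {"i": "101"}]}
book_ids = [book["i"] for book in reshaped["books"]]
print(book_ids)
```

If the real file has the same layout, UNWIND q.books would yield one row per author instead of one per book, which would also explain seeing only a fraction of the expected relationships.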

How to transform an UNWIND query to FOREACH in Neo4J Cypher?

I have the following Neo4J Cypher query:
UNWIND $mentionsRelations as mention
MATCH (u:User{name:mention.from})
RETURN u.uid;
The params are:
{
"mentionsRelations": [
{
"from": "a",
"to": "b"
},
{
"from": "c",
"to": "d"
}
]
}
Is it possible to rewrite it to get the same results using the FOREACH query?
I know it doesn't accept the MATCH clause, but I'm just curious whether there's a workaround to get the same results.
Basically, what I want is to iterate through mentionsRelations using FOREACH and then output any matches it finds in the database.
Thanks!
This is not currently possible. FOREACH is only for write operations and can't be used just to match nodes.
Is there some reason UNWIND won't work for you?
You can also do a match based upon list membership, which should work similarly, though you'd have to extract out the values to use for the property match:
WITH [mention in $mentionsRelations | mention.from] as froms
MATCH (u:User)
WHERE u.name in froms
RETURN u.uid;

Neo4j Join from CSV

I have the following sample nodes:
{
"name": "host_1",
"id": 0
}
{
"name": "host_2",
"id": 1
}
Then I have connections/authentications between those nodes in a CSV file.
{
"src_id": "291",
"dest_id": "162"
}
{
"src_id": "291",
"dest_id": "257"
}
I am trying to build the relationships (authentications between hosts) with the CSV file, but I'm having trouble getting the query finalized before I can create the relationship.
Is there a way to make an alias for a match similar to a SQL join?
LOAD CSV WITH HEADERS FROM "file:///redteam_connections.csv" AS row
MATCH (n:nodes {id: toInteger(row.dest_id)}), (n:nodes {id: toInteger(row.src_id)})
I'd like to make an alias such as
(n:nodes {id: toInteger(row.dest_id)}) AS dest_node, (n:nodes {id: toInteger(row.src_id)}) AS src_node
RETURN src_node.name, dest_node.name
Based on my research, this doesn't appear to be possible. Any suggestions would be appreciated. Is it a limitation, or a problem with the structure of my dataset?
The problem you're running into is you're using the same variable, n, to refer to both nodes, so that isn't going to work. If you want to use src_node and dest_node as variables, you can:
LOAD CSV WITH HEADERS FROM "file:///redteam_connections.csv" AS row
MATCH (destNode:nodes {id: toInteger(row.dest_id)}), (srcNode:nodes {id: toInteger(row.src_id)})
CREATE (destNode)-[:AUTHENTICATION]->(srcNode)
You definitely want to add an index on :nodes(id) so your lookups are fast, and you may want to reconsider the :nodes label. By convention, labels tend to be capitalized and singular (plural is usually used for when you actually collect() items into a list), so :Node would be more appropriate here.
If your CSV is large, I also recommend you use periodic commit to allow batching and prevent blowing your heap.
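Periodic commit is handled on the Cypher side. If you would rather batch from a client script, a rough alternative is to split the rows into chunks and send each chunk as a list parameter to an UNWIND query. The sketch below only shows the chunking. The chunked helper, the batch size, and the placeholder rows are all illustrative, and the step that actually sends each batch to Neo4j is left out because it depends on your driver.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Placeholder rows in the same shape as the CSV (ids are made up for illustration).
rows = [{"src_id": str(i), "dest_id": str(i + 1)} for i in range(5)]

batches = list(chunked(rows, 2))
for batch in batches:
    # Each batch would be passed as one list parameter to a query starting with UNWIND.
    print(len(batch), batch)
```

Smaller batches keep each transaction's memory footprint down at the cost of more round trips, so the right size is something to measure on your own data.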

Neo4j: Improve cypher performance

I have the following Cypher query, in which I have to UNWIND around 100 items. My problem is that the query takes a long time to execute (about 3-4 minutes).
My Query:
CREATE (pl:List {id: {id}, title: {title} })
WITH pl as pl
MATCH (b:Units {id: {bId} })
MERGE (b)-[rpl:UNITS_LIST]->(pl)
WITH pl as pl
UNWIND {Ids} as Id
MATCH (p:Infos {id: Id})
WITH p as cn, pl as pl
SET cn :Data
WITH pl as pl, cn as cn
MERGE (pl)-[cnpt:DATA_LIST { email: cn.email } ]->(cn)
RETURN pl
Sample Data
List:
{
id: 'some-unique-id',
name: 'some-name'
}
Ids ( Ids should be around 100 ):
[ 'some-info-id-01','some-info-id-03' ]
Infos (Neo4j DB):
[
{ id: 'some-info-id-01', others: 'some-others-data-01' },
{ id: 'some-info-id-02', others: 'some-others-data-02' },
{ id: 'some-info-id-03', others: 'some-others-data-03' }
]
Any suggestions to improve this Cypher query?
PS: I'm running this query from my Node.js app.
This query looks like it should be pretty fast if you have proper indexes in place.
You should have these in place:
CREATE INDEX ON :Infos(id);
CREATE INDEX ON :Units(id);
CREATE INDEX ON :List(id);
With those indexes, the query should be fast because mostly you're looking up nodes by those IDs and then doing very small things on top of that. Even with 100 IDs that's not that hard of a query.
The counterpoint is that if you don't have your ID fields indexed, neo4j will have to look through most/all of them to figure out which items to match. The more data you have, the slower this query will get.
If you have these things indexed and you're still seeing very slow performance, you need to EXPLAIN the query and post the plan for further feedback.

Neo4j match Relationship parameters satisfying a certain schema

Using cypher, is there any way to match a path where the relationships satisfy a certain input schema generically?
I know I can do something like
parameters: {
"age": 20
}
MATCH (n)-[r:MY_RELATION]-() WHERE $age>18 AND $age<24 ...
when I want to match only relations satisfying the schema { "type": "integer", "minimum": 19, "maximum": 23 }.
But then I have to specify the min and max range within the relationship. What if I want to match strings against a schema, or even more complex types like an address with subparameters, etc.?
Is there a generic way to do it?
edit:
I'll try and state the question more clearly. What I want is a graph and a (parameterized) query to traverse that graph, such that:
I get all the nodes in the traversal
the relationships in the graph impose certain constraints on the query parameters (like a minimum on an integer)
the traversal only follows edges where the constraint is met
I need to place constraints on integers, like min/max, but also on strings, like a pattern, etc.
edit 2:
What I want may not even be possible.
I want all of the information about the constraint to reside in the edge, including the parameter to test against. So I would want something along the lines of
parameters: { "age": 20, "location": "A" }
MATCH (n)-[r]-()
WHERE r.testtype='integer' AND getParameterByName(r.testparamname) < r.max
OR r.testtype='string' AND getParameterByName(r.testparamname)=r.allowedStringValue
Of course, according to the Neo4j documentation on parameters, it should not be possible to dynamically load a parameter via a name that resides in the DB.
There may yet be some workaround?
[UPDATED]
Original answer:
Your question is not stated very clearly, but I'll try to answer anyway.
I think something like this is what you want:
parameters: {
"minimum": 19,
"maximum": 23
}
MATCH (n)-[r:MY_RELATION]-() WHERE $maximum >= r.age >= $minimum
...
There is no need to specify a "type" parameter. Just make sure your parameter values are of the appropriate type.
New answer (based on updated question):
Suppose the parameters are specified this way (where test indicates the type of test):
parameters: {
"age": 20,
"test": "age_range"
}
Then you could do this (where r would contain the properties test, min, and max):
MATCH (n)-[r:MY_RELATION]-(m)
WHERE r.test = $test AND r.min <= $age <= r.max
RETURN n, r, m;
Or, if you do not need all the relationships to be of the same type, this should also work and may be easier to visualize (where r would be of, say, type "age_range", and contain the properties min and max):
MATCH (n)-[r]-(m)
WHERE TYPE(r) = $test AND r.min <= $age <= r.max
RETURN n, r, m;
To help you decide which approach to use, you should profile the two approaches with your code and some actual data to see which is faster for you.
Even Newer answer (based on edit 2 in question)
The following parameter and query should do what you want. Square brackets can be used to dynamically specify the name of a property.
parameters: {
"data": {
"age": 20,
"location": "A"
}
}
MATCH (n)-[r]-()
WHERE r.testtype='integer' AND $data[r.testparamname] < r.max
OR r.testtype='string' AND $data[r.testparamname]=r.allowedStringValue
...
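To make the logic of that WHERE clause easier to follow, here is a small Python sketch of the same per-relationship test. It is only a simulation: the edge_matches function and the sample edges are made up for illustration, and a missing key is treated as a non-match, which roughly mirrors how a null comparison filters out a row in Cypher. AND binds more tightly than OR, so the clause reads as (integer test) OR (string test).

```python
def edge_matches(rel, data):
    """Return True if the parameter named by the edge satisfies the edge's constraint."""
    value = data.get(rel["testparamname"])  # plays the role of $data[r.testparamname]
    if value is None:
        return False  # no such parameter, so the edge is not followed
    if rel["testtype"] == "integer":
        return value < rel["max"]
    if rel["testtype"] == "string":
        return value == rel["allowedStringValue"]
    return False


# Parameters taken from the answer above.
data = {"age": 20, "location": "A"}

# Hypothetical relationship properties, for illustration only.
edges = [
    {"testtype": "integer", "testparamname": "age", "max": 23},
    {"testtype": "integer", "testparamname": "age", "max": 18},
    {"testtype": "string", "testparamname": "location", "allowedStringValue": "A"},
    {"testtype": "string", "testparamname": "location", "allowedStringValue": "B"},
]

matches = [edge_matches(edge, data) for edge in edges]
print(matches)
```

The point of the design is that each edge names the parameter it constrains, so new constraint types only require a new testtype branch and no change to the parameters.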
Does this solution meet your requirements?
Considering the following small sample data set
MERGE (p1:Person {name: 'P 01'})
MERGE (p2:Person {name: 'P 02'})
MERGE (p3:Person {name: 'P 03'})
MERGE (p1)-[:MY_RELATION { minimum: 19, maximum: 23 }]->(p2)
MERGE (p2)-[:MY_RELATION { minimum: 19, maximum: 20 }]->(p3)
This query will only return the nodes and relationship where the supplied parameter fits the relationship constraints (e.g. $age = 21 should only return a single row). It is basically the inverse of @cybersam's proposal.
MATCH (s:Person)-[r:MY_RELATION]->(e:Person)
WHERE r.minimum <= $age <= r.maximum
RETURN *
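As a quick check of the claim about $age = 21, the same range test can be worked through in plain Python against the sample data above. This is only a simulation of the WHERE clause, not something run against Neo4j, and the matching function is an illustrative name.

```python
# The two relationships from the sample data set: (start, end, properties).
relationships = [
    ("P 01", "P 02", {"minimum": 19, "maximum": 23}),
    ("P 02", "P 03", {"minimum": 19, "maximum": 20}),
]


def matching(age):
    """Return the (start, end) pairs whose constraint range contains age."""
    return [
        (start, end)
        for start, end, props in relationships
        if props["minimum"] <= age <= props["maximum"]
    ]


print(matching(21))  # only the P 01 -> P 02 relationship allows 21
print(matching(20))  # both relationships allow 20
```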

Resources