Neo4j: MATCH and MERGE throwing Neo.ClientError.Statement.SyntaxError
I was trying to run a query in Neo4j to make a relationship between a recipe and ingredients:
MATCH (spongeCake:Cake {name: "Sponge Cake"}),
(white:Flour {name: "white"}),
(egg:Ingredient {name: "egg"}),
(butter:Ingredient {name: "butter"}),
(milk:Ingredient {name: "milk"}),
(sugar:Ingredient {name: "sugar"}),
(brown:Flour {name: "brown"}),
MERGE (spongeCake)-[r:CONTAINS {quantity: 4, unit: "medium"}]->(egg),
(spongeCake)-[r:CONTAINS {quantity: 50, unit: "grams"}]->(brown),
(spongeCake)-[r:CONTAINS {quantity: 255, unit: "grams"}]->(sugar),
(spongeCake)-[r:CONTAINS {quantity: 25, unit: "grams"}]->(milk),
(spongeCake)-[r:CONTAINS {quantity: 300, unit: "grams"}]->(white),
(spongeCake)-[r:CONTAINS {quantity: 45, unit: "grams"}]->(butter);
For some reason MERGE is giving me a lot of trouble and I am getting the following error:
Invalid input 'MERGE': expected "(", "allShortestPaths" or "shortestPath" (line 9, column 1 (offset: 250))
"MERGE (spongeCake)-[r:CONTAINS {quantity: 4, unit: "medium"}]->(egg)"
^
How can I do this correctly?
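Below is the syntax for what you want to achieve. Three things were wrong in your query:
There is a comma after the last MATCH pattern, right before MERGE. The parser then expects another pattern, which is what triggers the syntax error.
The many comma-separated patterns in one MATCH can create cartesian products, so I used one MATCH per node.
MERGE takes a single pattern, so the comma-separated patterns in MERGE will not work. I used one MERGE per relationship and dropped the repeated variable r.
Learn the syntax well. Good luck!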
MATCH (spongeCake:Cake {name: "Sponge Cake"})
MATCH (white:Flour {name: "white"})
MATCH (egg:Ingredient {name: "egg"})
MATCH (butter:Ingredient {name: "butter"})
MATCH (milk:Ingredient {name: "milk"})
MATCH (sugar:Ingredient {name: "sugar"})
MATCH (brown:Flour {name: "brown"})
MERGE (spongeCake)-[:CONTAINS {quantity: 4, unit: "medium"}]->(egg)
MERGE (spongeCake)-[:CONTAINS {quantity: 50, unit: "grams"}]->(brown)
MERGE (spongeCake)-[:CONTAINS {quantity: 255, unit: "grams"}]->(sugar)
MERGE (spongeCake)-[:CONTAINS {quantity: 25, unit: "grams"}]->(milk)
MERGE (spongeCake)-[:CONTAINS {quantity: 300, unit: "grams"}]->(white)
MERGE (spongeCake)-[:CONTAINS {quantity: 45, unit: "grams"}]->(butter)
RETURN spongeCake
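If the ingredient list comes from application code, the same "one MATCH per node, one MERGE per relationship" shape can be generated instead of typed by hand. The following is only a sketch: the helper name build_recipe_query and the parameter names ($cake, $name0, $qty0, ...) are made up for illustration, and it assumes a Neo4j version that accepts $param syntax.

```python
def build_recipe_query(cake_name, items):
    """Return (query, params): one MATCH per node, one MERGE per relationship.

    items is a list of (label, name, quantity, unit) tuples. Labels are
    inserted into the query text, so they must come from trusted code,
    never from user input. All values travel as parameters.
    """
    params = {"cake": cake_name}
    matches = ["MATCH (cake:Cake {name: $cake})"]
    merges = []
    for i, (label, name, quantity, unit) in enumerate(items):
        params[f"name{i}"] = name
        params[f"qty{i}"] = quantity
        params[f"unit{i}"] = unit
        matches.append(f"MATCH (n{i}:{label} {{name: $name{i}}})")
        merges.append(
            f"MERGE (cake)-[:CONTAINS {{quantity: $qty{i}, unit: $unit{i}}}]->(n{i})"
        )
    # No commas between clauses: every line is a complete clause.
    return "\n".join(matches + merges + ["RETURN cake"]), params


items = [
    ("Ingredient", "egg", 4, "medium"),
    ("Flour", "brown", 50, "grams"),
]
query, params = build_recipe_query("Sponge Cake", items)
print(query)
```

The returned query and params can then be handed to whatever driver call you already use to run Cypher.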
Related
Neo4j Cypher: Check if node with relationships to a list of node IDs exists
I have the following node structure:
(:Patch)-[:INCLUDES]->(:Roster)-[:HAS]->(:PublicList)-[:INCLUDES]->(u:Unit)
Then I have an array of :Unit ids:
[197, 196, 19, 20, 191, 171, 3, 174, 194, 185]
I would like to check whether a :PublicList that has the :INCLUDES relationship to all the :Unit ids in the list already exists. I tried writing a COUNT and MATCH query like this, but this just seems like an error-prone long-winded approach:
MATCH (p:Patch)-[:INCLUDES]->(r:Roster)-[:HAS]-(d:PublicList)
WITH COLLECT(d) as drafts
UNWIND drafts as draft
WITH draft
UNWIND [197, 196, 19, 20, 191, 171, 3, 174, 194, 185] as unitID
MATCH (draft)-[:INCLUDES]->(u:Unit)
WHERE id(u) = unitID
WITH count(DISTINCT u) as draftUnits
WITH COLLECT(draftUnits) as matchCounts
RETURN matchCounts
Can someone help me write this so it returns a boolean if a :PublicList has an :INCLUDES relationship to all the IDs in the list?
I suggest first matching the units, putting them into a collection, and then using the ALL predicate to check that the PublicList has a connection to all units.
MATCH (n:Unit)
WHERE id(n) IN [197, 196, 19, 20, 191, 171, 3, 174, 194, 185]
WITH collect(n) AS units
MATCH (p:Patch)-[:INCLUDES]->(r:Roster)-[:HAS]-(d:PublicList)
WHERE ALL(x IN units WHERE (d)-[:INCLUDES]->(x))
RETURN count(*) AS matchCount
If you want to return each PublicList along with a boolean that says whether it matches all of them, you can adjust it slightly like this:
MATCH (n:Unit)
WHERE id(n) IN [197, 196, 19, 20, 191, 171, 3, 174, 194, 185]
WITH collect(n) AS units
MATCH (p:Patch)-[:INCLUDES]->(r:Roster)-[:HAS]-(d:PublicList)
RETURN d, ALL(x IN units WHERE (d)-[:INCLUDES]->(x)) as matchAll
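To make the ALL check concrete, here is a small Python model of what it computes. It is a sketch only: the dictionary includes and the list names draftA and draftB are invented sample data standing in for the (:PublicList)-[:INCLUDES]->(:Unit) relationships.

```python
def lists_including_all(includes, unit_ids):
    """Map each list name to True when it includes every id in unit_ids."""
    wanted = set(unit_ids)
    # "wanted <= units" is the subset test, the set analogue of ALL(x IN ... WHERE ...).
    return {name: wanted <= units for name, units in includes.items()}


# Hypothetical data: which unit ids each public list includes.
includes = {
    "draftA": {197, 196, 19, 20, 191, 171, 3, 174, 194, 185},
    "draftB": {197, 196, 19},
}
unit_ids = [197, 196, 19, 20, 191, 171, 3, 174, 194, 185]

match_all = lists_including_all(includes, unit_ids)  # per-list boolean, like matchAll
exists = any(match_all.values())                     # single boolean for "does one exist?"
print(match_all, exists)
```

The per-list dictionary corresponds to the second query (one boolean per PublicList), and the any() call corresponds to asking whether matchCount is greater than zero in the first.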
Your query looks good but can be improved. To fix it, use u.id = unitID in the WHERE clause instead of id(u) = unitID. The latter calls the internal id() function, which returns the identifier the database itself assigns to each node, while the former reads an ordinary property named id.
Neo4j Cypher query execution plan optimization
I have the following Cypher query: MATCH (dg:DecisionGroup {id: -2})-[rdgd:CONTAINS]->(childD:Decision:Profile ) MATCH (childD)-[:EMPLOYMENT_AS]->(root2:Employment ) WHERE root2.id IN ([1]) WITH DISTINCT childD, dg, rdgd MATCH path3=(root3:Location )-[:CONTAINS*0..]->(descendant3:Location) WHERE (descendant3.id IN ([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]) OR root3.id IN ([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35])) UNWIND nodes(path3) AS pathNode3 WITH childD, dg, rdgd, COLLECT(DISTINCT pathNode3) AS pathNodes3 MATCH (childD)-[:LOCATED_IN]->(pathNode3) WHERE pathNode3 IN pathNodes3 WITH DISTINCT childD, dg, rdgd WHERE (childD.`active` = true) AND (childD.`experienceMonths` >= 129) AND ( (childD.`minSalaryUsd` <= 8883) OR (childD.`minHourlyRateUsd` <= 126) ) MATCH (childD)-[criterionRelationship8:HAS_VOTE_ON]->(c:Criterion {id: 2}) WHERE (criterionRelationship8.`properties.experienceMonths` >= 1) WITH DISTINCT childD, dg, rdgd MATCH (childD)-[criterionRelationship10:HAS_VOTE_ON]->(c:Criterion {id: 36}) WHERE (criterionRelationship10.`avgVotesWeight` >= 1.0) AND (criterionRelationship10.`properties.experienceMonths` >= 1) WITH DISTINCT childD, dg, rdgd MATCH (childD)-[criterionRelationship13:HAS_VOTE_ON]->(c:Criterion {id: 4}) WHERE (criterionRelationship13.`properties.experienceMonths` >= 0) WITH DISTINCT childD, dg, rdgd MATCH (childD)-[criterionRelationship15:HAS_VOTE_ON]->(c:Criterion {id: 22}) WHERE (criterionRelationship15.`avgVotesWeight` >= 1.0) AND (criterionRelationship15.`properties.experienceMonths` >= 1) WITH DISTINCT childD, dg, rdgd OPTIONAL MATCH (childD)-[ru:CREATED_BY]->(u:User) WITH childD, u, ru, dg, rdgd OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion) WHERE c.id IN [2, 36, 4, 22] WITH c, childD, u, ru, dg, rdgd, (vg.avgVotesWeight * (CASE WHEN c IS NOT NULL 
THEN coalesce({`22`:1.2236918603185925, `2`:2.9245935245152226, `36`:0.2288013749943646, `4`:3.9599506966378435}[toString(c.id)], 1.0) ELSE 1.0 END)) as weight, vg.totalVotes as totalVotes WITH childD, u, ru , dg, rdgd , toFloat(sum(weight)) as weight, toInteger(sum(totalVotes)) as totalVotes ORDER BY weight DESC , childD.createdAt DESC SKIP 0 LIMIT 20 WITH * OPTIONAL MATCH (childD)-[rup:UPDATED_BY]->(up:User) RETURN rdgd, ru, u, rup, up, childD AS decision, weight, totalVotes, [ (c1)<-[vg1:HAS_VOTE_ON]-(childD) WHERE c1.id IN [2, 36, 4, 22] | {criterion: c1, relationship: vg1} ] AS weightedCriteria This query is automatically generated by my Cypher query builder. Right now on 1000 Profiles the query executes ~8 seconds. Looks like this part of the query causes most of the issues: MATCH (childD)-[:EMPLOYMENT_AS]->(root2:Employment ) WHERE root2.id IN ([1]) WITH DISTINCT childD, dg, rdgd MATCH path3=(root3:Location )-[:CONTAINS*0..]->(descendant3:Location) WHERE (descendant3.id IN ([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]) OR root3.id IN ([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35])) UNWIND nodes(path3) AS pathNode3 WITH childD, dg, rdgd, COLLECT(DISTINCT pathNode3) AS pathNodes3 MATCH (childD)-[:LOCATED_IN]->(pathNode3) WHERE pathNode3 IN pathNodes3 WITH DISTINCT childD, dg, rdgd Is there a way to optimize this? 
This is PROFILE output: UPDATED I reimplemented initial part of the query to the following: WITH [] as ceNodeList MATCH (root2:Employment ) WHERE root2.id IN ([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]) WITH ceNodeList, root2, COLLECT(root2) AS listRoot2 WITH apoc.coll.unionAll(ceNodeList, listRoot2) AS ceNodeList WITH apoc.coll.toSet(ceNodeList) as ceNodeList MATCH (root3:Location ) WHERE root3.id IN ([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73]) WITH ceNodeList, root3, COLLECT(root3) AS listRoot3 OPTIONAL MATCH (root3)-[:CONTAINS*0..]->(descendant3:Location) OPTIONAL MATCH (ascendant3:Location)-[:CONTAINS*0..]->(root3) WITH ceNodeList, listRoot3, COLLECT( DISTINCT ascendant3) AS listAscendant3, COLLECT( DISTINCT descendant3) AS listDescendant3 WITH listRoot3, listAscendant3, apoc.coll.unionAll(ceNodeList, apoc.coll.unionAll(listDescendant3, apoc.coll.unionAll(listRoot3, listAscendant3))) AS ceNodeList WITH apoc.coll.toSet(ceNodeList) as ceNodeList UNWIND ceNodeList AS ceNode WITH DISTINCT ceNode MATCH (dg:DecisionGroup {id: -2})-[rdgd:CONTAINS]->(childD:Decision:Profile ) -[:REQUIRES]->(ceNode) WITH DISTINCT childD, dg, rdgd, collect(ceNode) as ceNodes WITH childD, dg, rdgd, ceNodes, reduce(ceNodeLabels = [], n IN ceNodes | ceNodeLabels + labels(n)) as ceNodeLabels WHERE all(x IN ['Employment', 'Location'] WHERE x IN ceNodeLabels) WITH childD, dg, rdgd return count(childD) Now it works several times faster, but still not perfect. Is there something I may do in order to improve this? 
UPDATED1 WITH [] as ceNodeList MATCH (root2:Location ) WHERE root2.id IN ([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]) WITH ceNodeList, root2 OPTIONAL MATCH (root2)-[:CONTAINS*0..]->(descendant2:Location) OPTIONAL MATCH (ascendant2:Location)-[:CONTAINS*0..]->(root2) WITH ceNodeList, COLLECT(root2) AS listRoot2, COLLECT( DISTINCT ascendant2) AS listAscendant2, COLLECT( DISTINCT descendant2) AS listDescendant2 WITH apoc.coll.union(ceNodeList, apoc.coll.union(listDescendant2, apoc.coll.union(listRoot2, listAscendant2))) AS ceNodeList WITH ceNodeList MATCH (root3:Employment ) WHERE root3.id IN ([101, 102, 103, 104, 105]) WITH ceNodeList, COLLECT(root3) AS listRoot3 WITH apoc.coll.union(ceNodeList, listRoot3) AS ceNodeList WITH ceNodeList UNWIND ceNodeList as seNode WITH collect(seNode.id) as seNodeIds with apoc.coll.toSet(seNodeIds) as seNodeIds MATCH (dg:DecisionGroup {id: -2})-[rdgd:CONTAINS]->(childD:Profile ) -[:REQUIRES]->(ceNode) WHERE ceNode.id in seNodeIds WITH DISTINCT childD, dg, rdgd, collect(ceNode) as ceNodes WITH childD, dg, rdgd, ceNodes, reduce(ceNodeLabels = [], n IN ceNodes | ceNodeLabels + labels(n)) as ceNodeLabels WHERE all(x IN ['Employment', 'Location'] WHERE x IN ceNodeLabels) WITH childD, dg, rdgd
Try this:
WITH [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35] AS ids
WITH reduce(idsMap = {}, x IN ids | apoc.map.setEntry(idsMap, toString(x), true)) AS idsMap
MATCH (dg:DecisionGroup {id: -2})-[rdgd:CONTAINS]->(childD:Decision:Profile )
MATCH (childD)-[:EMPLOYMENT_AS]->(root2:Employment )
WHERE root2.id = 1
WITH DISTINCT childD, dg, rdgd, idsMap
MATCH (descendant3:Location)
WHERE apoc.map.get(idsMap, toString(descendant3.id), false) = true
MATCH path3=(root3:Location )-[:CONTAINS*0..]->(descendant3)
WHERE apoc.map.get(idsMap, toString(root3.id), false) = true
UNWIND nodes(path3) AS pathNode3
WITH childD, dg, rdgd, COLLECT(DISTINCT pathNode3) AS pathNodes3
MATCH (childD)-[:LOCATED_IN]->(pathNode3)
WHERE pathNode3 IN pathNodes3
WITH DISTINCT childD, dg, rdgd
WHERE (childD.`active` = true) AND (childD.`experienceMonths` >= 129) AND ( (childD.`minSalaryUsd` <= 8883) OR (childD.`minHourlyRateUsd` <= 126) )
MATCH (childD)-[criterionRelationship8:HAS_VOTE_ON]->(c:Criterion {id: 2})
WHERE (criterionRelationship8.`properties.experienceMonths` >= 1)
WITH DISTINCT childD, dg, rdgd
MATCH (childD)-[criterionRelationship10:HAS_VOTE_ON]->(c:Criterion {id: 36})
WHERE (criterionRelationship10.`avgVotesWeight` >= 1.0) AND (criterionRelationship10.`properties.experienceMonths` >= 1)
WITH DISTINCT childD, dg, rdgd
MATCH (childD)-[criterionRelationship13:HAS_VOTE_ON]->(c:Criterion {id: 4})
WHERE (criterionRelationship13.`properties.experienceMonths` >= 0)
WITH DISTINCT childD, dg, rdgd
MATCH (childD)-[criterionRelationship15:HAS_VOTE_ON]->(c:Criterion {id: 22})
WHERE (criterionRelationship15.`avgVotesWeight` >= 1.0) AND (criterionRelationship15.`properties.experienceMonths` >= 1)
WITH DISTINCT childD, dg, rdgd
OPTIONAL MATCH (childD)-[ru:CREATED_BY]->(u:User)
WITH childD, u, ru, dg, rdgd
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion)
WHERE c.id IN [2, 36, 4, 22]
WITH c, childD, u, ru, dg, rdgd, (vg.avgVotesWeight * (CASE WHEN c IS NOT NULL THEN coalesce({`22`:1.2236918603185925, `2`:2.9245935245152226, `36`:0.2288013749943646, `4`:3.9599506966378435}[toString(c.id)], 1.0) ELSE 1.0 END)) as weight, vg.totalVotes as totalVotes
WITH childD, u, ru , dg, rdgd , toFloat(sum(weight)) as weight, toInteger(sum(totalVotes)) as totalVotes
ORDER BY weight DESC , childD.createdAt DESC
SKIP 0 LIMIT 20
WITH *
OPTIONAL MATCH (childD)-[rup:UPDATED_BY]->(up:User)
RETURN rdgd, ru, u, rup, up, childD AS decision, weight, totalVotes, [ (c1)<-[vg1:HAS_VOTE_ON]-(childD) WHERE c1.id IN [2, 36, 4, 22] | {criterion: c1, relationship: vg1} ] AS weightedCriteria
Here, I have created a map from the given ids and then used it instead of the IN operator.
Update: I think your new query can be simplified a bit. We can replace the combination of apoc.coll.unionAll and apoc.coll.toSet with a single call to apoc.coll.union. Try this:
MATCH (root2:Employment)
WHERE root2.id IN ([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16])
WITH COLLECT(root2) AS ceNodeList
MATCH (root3:Location)
WHERE root3.id IN ([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73])
WITH ceNodeList, root3, COLLECT(root3) AS listRoot3
OPTIONAL MATCH (root3)-[:CONTAINS*0..]-(descendant3:Location)
WITH ceNodeList, listRoot3, COLLECT(DISTINCT descendant3) AS listDescendant3
WITH apoc.coll.union(ceNodeList, apoc.coll.union(listDescendant3, listRoot3)) AS ceNodeList
UNWIND ceNodeList AS ceNode
WITH DISTINCT ceNode
MATCH (dg:DecisionGroup {id: -2})-[rdgd:CONTAINS]->(childD:Decision:Profile)-[:REQUIRES]->(ceNode)
WITH DISTINCT childD, dg, rdgd, collect(ceNode) as ceNodes
WITH childD, dg, rdgd, ceNodes, reduce(ceNodeLabels = [], n IN ceNodes | ceNodeLabels + labels(n)) as ceNodeLabels
WHERE all(x IN ['Employment', 'Location'] WHERE x IN ceNodeLabels)
WITH childD, dg, rdgd
return count(childD)
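The simplification in the update rests on one identity: a distinct union of two lists gives the same elements as concatenating them and then removing duplicates. A minimal Python model of that identity follows. The three functions only imitate the behaviour described for apoc.coll.unionAll, apoc.coll.toSet and apoc.coll.union; they are not the APOC implementations, and result ordering is deliberately not modelled.

```python
def union_all(first, second):
    """Concatenate two lists, keeping duplicates (models unionAll)."""
    return list(first) + list(second)


def to_set(values):
    """Drop duplicates, keeping the first occurrence of each value (models toSet)."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


def union(first, second):
    """Distinct union in one step (models union)."""
    return to_set(union_all(first, second))


# Hypothetical node ids standing in for the collected Employment and Location nodes.
employment = [1, 2, 3]
locations = [3, 4, 4, 5]

two_step = to_set(union_all(employment, locations))
one_step = union(employment, locations)
print(two_step, one_step)
```

Comparing the two results as sets is the safe check, since the order of the returned list should not be relied on.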
Neo4j Cypher group by a column in a list of rows for aggregation
I have the following Neo4j Cypher query:
MATCH (v:Vacancy {deleted: false})-[vv:HAS_VOTE_ON]->(c:Criterion)<-[vp:HAS_VOTE_ON]-(p:Profile {id: 703, deleted: false})
WHERE vv.avgVotesWeight > 0 AND vv.avgVotesWeight <= vp.avgVotesWeight
WITH v, p
MATCH (v)-[vv1:HAS_VOTE_ON]->(cv:Criterion)
OPTIONAL MATCH (p)-[vp1:HAS_VOTE_ON]->(cv)
WITH v.id as vacancyId, cv.id as criterionId, coalesce(vv1.`properties.skillCoefficient`, 1.0) as vacancyCriterionCoefficient, coalesce(vp1.avgVotesWeight, 0) as profileCriterionVoteWeight, coalesce(vp1.totalVotes, 0) as profileCriterionTotalVotes
RETURN vacancyId, criterionId, vacancyCriterionCoefficient, profileCriterionVoteWeight, profileCriterionTotalVotes
The result rows were posted as a picture. Now, for each Vacancy (rows with the same vacancyId) I need to calculate totalProfileCriterionVoteWeight as the SUM over all criteria of the following formula:
vacancyCriterionCoefficient * profileCriterionVoteWeight
For this purpose, I need to group the rows by vacancyId somehow. Could you please show how this is possible with Cypher?
You can replace your last line with:
WITH distinct(vacancyId) as vacancyId, sum(vacancyCriterionCoefficient * profileCriterionVoteWeight) as totalProfileCriterionVoteWeight
RETURN vacancyId, totalProfileCriterionVoteWeight
For the data shown in the picture, this will return:
╒═══════════╤═════════════════════════════════╕
│"vacancyId"│"totalProfileCriterionVoteWeight"│
╞═══════════╪═════════════════════════════════╡
│704        │22                               │
├───────────┼─────────────────────────────────┤
│706        │16                               │
└───────────┴─────────────────────────────────┘
Explanation: Cypher has no GROUP BY clause. The non-aggregated expressions in a WITH or RETURN (here vacancyId) act as the grouping key, and an aggregating function such as sum is evaluated once per group. The distinct is therefore not strictly required here; sum is the only "accumulator" we need.
In order to test it, I used this sample data:
MERGE (a:Node{vacancyId:704, criterionId: 6907, vacancyCriterionCoefficient: 1, profileCriterionVoteWeight: 1, profileCriterionTotalVotes: 1})
MERGE (b:Node{vacancyId:704, criterionId: 6909, vacancyCriterionCoefficient: 3, profileCriterionVoteWeight: 5, profileCriterionTotalVotes: 1})
MERGE (c:Node{vacancyId:704, criterionId: 6908, vacancyCriterionCoefficient: 2, profileCriterionVoteWeight: 3, profileCriterionTotalVotes: 1})
MERGE (d:Node{vacancyId:706, criterionId: 6909, vacancyCriterionCoefficient: 1, profileCriterionVoteWeight: 5, profileCriterionTotalVotes: 1})
MERGE (e:Node{vacancyId:706, criterionId: 6908, vacancyCriterionCoefficient: 3, profileCriterionVoteWeight: 3, profileCriterionTotalVotes: 1})
MERGE (f:Node{vacancyId:706, criterionId: 6907, vacancyCriterionCoefficient: 2, profileCriterionVoteWeight: 1, profileCriterionTotalVotes: 1})
And this query:
MATCH (n)
WITH n.vacancyId as vacancyId, n.criterionId as criterionId, n.vacancyCriterionCoefficient as vacancyCriterionCoefficient, n.profileCriterionVoteWeight as profileCriterionVoteWeight, n.profileCriterionTotalVotes as profileCriterionTotalVotes
WITH distinct(vacancyId) as vacancyId, sum(vacancyCriterionCoefficient * profileCriterionVoteWeight) as totalProfileCriterionVoteWeight
//return vacancyId, criterionId, vacancyCriterionCoefficient, profileCriterionVoteWeight, profileCriterionTotalVotes
RETURN vacancyId, totalProfileCriterionVoteWeight
This produces the results above.
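As a cross-check of the grouping, the same aggregation can be replayed in plain Python over the six sample rows from the answer. This is only a sketch of the arithmetic, not of how Neo4j executes the query; the tuple layout (vacancyId, criterionId, coefficient, vote weight) is chosen here for brevity.

```python
from collections import defaultdict

# (vacancyId, criterionId, vacancyCriterionCoefficient, profileCriterionVoteWeight)
rows = [
    (704, 6907, 1, 1),
    (704, 6909, 3, 5),
    (704, 6908, 2, 3),
    (706, 6909, 1, 5),
    (706, 6908, 3, 3),
    (706, 6907, 2, 1),
]

# Group by vacancyId and sum coefficient * weight within each group.
totals = defaultdict(int)
for vacancy_id, _criterion_id, coefficient, vote_weight in rows:
    totals[vacancy_id] += coefficient * vote_weight

print(dict(totals))  # {704: 22, 706: 16}
```

For vacancy 704 the terms are 1*1 + 3*5 + 2*3 = 22, and for 706 they are 1*5 + 3*3 + 2*1 = 16, matching the table.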
Neo4j query concerning two elements
I need help trying to do a query in Neo4j that I can't seem to figure out. The query is to return all cakes that contain both the ingredients: Milk and Cream. Below is a snippet of a cake node and the ingredients (there are more ingredients and cakes, but I didn't post them here as they are all formatted the same):
(brownies:Cake {name: "Brownies"}),
(brownies)-[:CONTAINS {quantity: 50, unit: "grams"}]->(white),
(brownies)-[:CONTAINS {quantity: 250, unit: "grams"}]->(selfraising),
(brownies)-[:CONTAINS {quantity: .5, unit: "grams"}]->(salt),
(brownies)-[:CONTAINS {quantity: 125, unit: "grams"}]->(sugar),
(brownies)-[:CONTAINS {quantity: 250, unit: "grams"}]->(cocoa),
(brownies)-[:CONTAINS {quantity: 125, unit: "grams"}]->(lemonade),
(brownies)-[:CONTAINS {quantity: 125, unit: "grams"}]->(cola),
(brownies)-[:GARNISHED_WITH {how: "chopped on top"}]->(cherry),
(brownies)-[:GARNISHED_WITH {how: "chopped on top"}]->(orange),
(limeJuice:Ingredient {name: "lime juice"}),
(cranberryJuice:Ingredient {name: "cranberry juice"}),
(lemonJuice:Ingredient {name: "lemon juice"}),
(orangeJuice:Ingredient {name: "orange juice"}),
(tomatoJuice:Ingredient {name: "tomato juice"}),
(lemonade:Ingredient {name: "lemonade"}),
(soda:Ingredient {name: "soda water"}),
(spice:Ingredient {name: "spice water"}),
(cola:Ingredient {name: "cola"}),
Neo4j seems to have trouble identifying ingredients, but I'm not entirely sure that my query is formatted correctly regardless. Here is what I have so far:
MATCH(x:Cake)-[:CONTAINS]-> (Ingredient: "milk" or "cream") Return x
Your Ingredient node check is problematic. It needs to be more like:
MATCH (x:Cake)-[:CONTAINS]->(i:Ingredient)
WHERE i.name IN ['milk', 'cream']
RETURN x
Note that IN matches a cake that contains either ingredient, not necessarily both.
Here is one way to get the cakes that contain ALL the ingredients from a list:
MATCH (cake:Cake)
WHERE ALL(x IN ['milk', 'cream'] WHERE (cake)-[:CONTAINS]->(:Ingredient{name: x}))
RETURN cake
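The difference between the two answers is "contains either" versus "contains both". A short Python sketch over invented data shows how the result sets diverge; the cake names and ingredient sets below are hypothetical and are not taken from the asker's graph.

```python
# Hypothetical cakes and their ingredient names.
cakes = {
    "Sponge Cake": {"egg", "butter", "milk", "sugar"},
    "Cream Cake": {"milk", "cream", "sugar"},
    "Brownies": {"cocoa", "sugar"},
}
wanted = {"milk", "cream"}

# i.name IN ['milk', 'cream']: at least one wanted ingredient is present.
either = sorted(name for name, ingredients in cakes.items() if ingredients & wanted)

# ALL(x IN ['milk', 'cream'] WHERE ...): every wanted ingredient is present.
both = sorted(name for name, ingredients in cakes.items() if wanted <= ingredients)

print(either)  # ['Cream Cake', 'Sponge Cake']
print(both)    # ['Cream Cake']
```

Only the subset test answers the original question, which asks for cakes containing milk and cream together.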
py2neo unique nodes with unique relations given timestamp
I am trying to create a graph that stores time based iterations between nodes. I would like the nodes to be unique and relationships between nodes to be unique given the timestamp property. My first attempt creates 2 nodes and 1 relationship which is not what I want.
from py2neo import neo4j, node, rel
graph_db = neo4j.GraphDatabaseService()
graph_db.get_or_create_index(neo4j.Node, "node_index")
batch = neo4j.WriteBatch(graph_db)
# a TALKED_TO b at timestamp 0
batch.get_or_create_indexed_node('node_index', 'name', 'a', {'name': 'a'})
batch.get_or_create_indexed_node('node_index', 'name', 'b', {'name': 'b'})
batch.get_or_create_indexed_relationship('rel_index', 'type', 'TALKED_TO', 0, 'TALKED_TO', 1, {"timestamp": 0})
# a TALKED_TO b at timestamp 1
batch.get_or_create_indexed_node('node_index', 'name', 'a', {'name': 'a'})
batch.get_or_create_indexed_node('node_index', 'name', 'b', {'name': 'b'})
batch.get_or_create_indexed_relationship('rel_index', 'type', 'TALKED_TO', 3, 'TALKED_TO', 4, {"timestamp": 1})
# a TALKED_TO b at timestamp 2
batch.get_or_create_indexed_node('node_index', 'name', 'a', {'name': 'a'})
batch.get_or_create_indexed_node('node_index', 'name', 'b', {'name': 'b'})
batch.get_or_create_indexed_relationship('rel_index', 'type', 'TALKED_TO', 6, 'TALKED_TO', 7, {"timestamp": 0})
results = batch.submit()
print results
#[Node('http://localhost:7474/db/data/node/2'),
#Node('http://localhost:7474/db/data/node/3'),
#Relationship('http://localhost:7474/db/data/relationship/0'),
#Node('http://localhost:7474/db/data/node/2'),
#Node('http://localhost:7474/db/data/node/3'),
#Relationship('http://localhost:7474/db/data/relationship/0'),
#Node('http://localhost:7474/db/data/node/2'),
#Node('http://localhost:7474/db/data/node/3'),
#Relationship('http://localhost:7474/db/data/relationship/0')]
My second attempt creates 2 nodes and 0 relations, not sure why it fails to create any relationships.
from py2neo import neo4j, node, rel
graph_db = neo4j.GraphDatabaseService()
graph_db.get_or_create_index(neo4j.Node, "node_index")
batch = neo4j.WriteBatch(graph_db)
# a TALKED_TO b at timestamp 0
batch.get_or_create_indexed_node('node_index', 'name', 'a', {'name': 'a'})
batch.get_or_create_indexed_node('node_index', 'name', 'b', {'name': 'b'})
batch.create(rel(0, 'TALKED_TO', 1, {"timestamp": 0}))
# a TALKED_TO b at timestamp 1
batch.get_or_create_indexed_node('node_index', 'name', 'a', {'name': 'a'})
batch.get_or_create_indexed_node('node_index', 'name', 'b', {'name': 'b'})
batch.create(rel(3, 'TALKED_TO', 4, {"timestamp": 1}))
# a TALKED_TO b at timestamp 2
batch.get_or_create_indexed_node('node_index', 'name', 'a', {'name': 'a'})
batch.get_or_create_indexed_node('node_index', 'name', 'b', {'name': 'b'})
batch.create(rel(6, 'TALKED_TO', 7, {"timestamp": 0}))
results = batch.submit()
print results
#[Node('http://localhost:7474/db/data/node/2'),
#Node('http://localhost:7474/db/data/node/3'),
#None]
So how do I achieve what is depicted in the image below?
Okay, so I think I figured it out, but I'm not sure if it's efficient. Does anyone know a better way than the following?
# Create nodes a and b if they do not exist.
query = """MERGE (p:Person { name: {name} }) RETURN p"""
cypher_query = neo4j.CypherQuery(neo4j_graph, query)
result = cypher_query.execute(name='a')
result = cypher_query.execute(name='b')
# Create a relationship between a and b if it does not exist with the given timestamp value.
query = """
MATCH (a:Person {name: {a}}), (b:Person {name: {b}})
MERGE (a)-[r:TALKED_TO {timestamp: {timestamp}}]->(b)
RETURN r
"""
cypher_query = neo4j.CypherQuery(neo4j_graph, query)
result = cypher_query.execute(a='a', b='b', timestamp=0)
result = cypher_query.execute(a='a', b='b', timestamp=1)
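The reason MERGE gives the wanted result is that it matches on the whole pattern, including the timestamp property, and only creates what is missing. The toy model below imitates that behaviour in memory so the expected counts can be checked without a database. TinyGraph and its methods are invented for this illustration and have nothing to do with the py2neo API.

```python
class TinyGraph:
    """In-memory stand-in for get-or-create semantics on nodes and relationships."""

    def __init__(self):
        self.nodes = set()
        self.rels = set()

    def merge_node(self, name):
        # Adding to a set is a no-op when the node already exists.
        self.nodes.add(name)

    def merge_rel(self, start, rel_type, end, timestamp):
        # The timestamp is part of the identity, so a new timestamp
        # yields a new relationship and a repeated one does not.
        self.rels.add((start, rel_type, end, timestamp))


graph = TinyGraph()
for timestamp in (0, 1, 0):  # the last call repeats timestamp 0
    graph.merge_node("a")
    graph.merge_node("b")
    graph.merge_rel("a", "TALKED_TO", "b", timestamp)

print(len(graph.nodes), len(graph.rels))  # 2 2
```

Three rounds of calls leave two nodes and two relationships, one per distinct timestamp, which is the uniqueness the question asks for.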