Neo4j: how to delete nodes recursively from some start node

In my Neo4j database I have the following entities:
@NodeEntity
public class Product {
    private final static String CONTAINS = "CONTAINS";
    private final static String DEFINED_BY = "DEFINED_BY";
    private final static String VOTED_FOR = "VOTED_FOR";
    private final static String PARENT = "PARENT";
    private final static String CREATED_BY = "CREATED_BY";
    @GraphId
    private Long id;
    @RelatedTo(type = PARENT, direction = Direction.INCOMING)
    private Product parent;
    @RelatedTo(type = CONTAINS, direction = Direction.OUTGOING)
    private Set<Product> childProducts = new HashSet<>();
    @RelatedTo(type = DEFINED_BY, direction = Direction.INCOMING)
    private Set<Criterion> criterias = new HashSet<>();
    @RelatedTo(type = VOTED_FOR, direction = Direction.INCOMING)
    private Set<Vote> votes = new HashSet<>();
    @RelatedTo(type = CREATED_BY, direction = Direction.OUTGOING)
    private User user;
}
@NodeEntity
public class Criterion {
    private final static String CREATED_BY = "CREATED_BY";
    private final static String DEFINED_BY = "DEFINED_BY";
    @GraphId
    private Long id;
    @RelatedTo(type = DEFINED_BY, direction = Direction.OUTGOING)
    private Product owner;
    @RelatedTo(type = CREATED_BY, direction = Direction.OUTGOING)
    private User user;
}
@NodeEntity
public class Vote {
    private static final String VOTED_ON = "VOTED_ON";
    private final static String VOTED_FOR = "VOTED_FOR";
    private static final String CREATED_BY = "CREATED_BY";
    @GraphId
    private Long id;
    @RelatedTo(type = VOTED_FOR, direction = Direction.OUTGOING)
    private Product product;
    @RelatedTo(type = VOTED_ON, direction = Direction.OUTGOING)
    private Criterion criterion;
    @RelatedTo(type = CREATED_BY, direction = Direction.OUTGOING)
    private User user;
}
Product is a composite entity and can contain child Products.
Starting from some Product node in the hierarchy, I need to delete all Votes on this Product, and then recursively delete all child Products together with the Criteria defined by these nodes and their Votes. User nodes must not be deleted.
I have tried this Cypher query:
MATCH (p:Product)-[r:CONTAINS*]-(e) WHERE id(p) = {productId} FOREACH (rel IN r| DELETE rel) DELETE e
but it deletes only the Products. It does not delete the Votes on the start node, nor the child Criteria and Votes. Please help me with a correct Cypher query. Thanks.

I'd split that up into two queries. The first one recursively collects the product hierarchy downwards, and the second one deletes one product node and its direct environment.
Getting the product hierarchy is simple:
MATCH (p:Product)-[:CONTAINS*]->(childProduct)
WHERE id(p) = {productId}
RETURN id(childProduct) as id
To delete one product we need to delete all relationships of that product node. Additionally, all related nodes need to be deleted as well, unless they are users (you want to keep them) or products (think of parent products, which should be kept too). We also need to make sure the target nodes are no longer connected to anything, which is why there is a second OPTIONAL MATCH:
MATCH (p:Product)
WHERE id(p) = {productId}
OPTIONAL MATCH (p)-[r]-(t)
DELETE r,p
WITH t
OPTIONAL MATCH (t)-[r2:VOTED_ON|:CREATED_BY]->()
WHERE none(x in labels(t) WHERE x in ["User", "Product"])
DELETE t,r2
I did not test this query myself since you didn't provide a test graph, so take it as an idea and modify it until it works.
update
In a chat we found that this Cypher statement solves the problem. Note that the Product label has been replaced by the Decision label in the model:
MATCH (p:Decision)
WHERE p.name = "NoSQL"
WITH p
OPTIONAL MATCH (p)-[r]-(t)
DELETE p,r
WITH t,r
OPTIONAL MATCH (t)-[r2:VOTED_ON|:CREATED_BY|:VOTED_FOR]-()
WITH t, r2,r
WHERE none(x in labels(t) WHERE x in ["User", "Decision"])
DELETE t,r2
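The two-step idea above (collect the product hierarchy, then remove each product together with its direct non-User, non-Product neighbours) can be sketched on a small in-memory graph. This is only an illustration of which nodes end up selected for deletion, not Neo4j or SDN code: the ProductGraph class, its methods and the single-label-per-node simplification are all assumptions made for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for the graph; not the Neo4j API. */
final class ProductGraph {
    // node id -> label (one label per node, a simplification)
    private final Map<Long, String> labels = new HashMap<>();
    // CONTAINS edges: parent product id -> child product ids
    private final Map<Long, Set<Long>> contains = new HashMap<>();
    // every other relationship, stored without direction
    private final Map<Long, Set<Long>> neighbours = new HashMap<>();

    void addNode(long id, String label) {
        labels.put(id, label);
    }

    void addContains(long parent, long child) {
        contains.computeIfAbsent(parent, k -> new HashSet<>()).add(child);
    }

    void addRelationship(long a, long b) {
        neighbours.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        neighbours.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    /** Step 1: the start product plus every product reachable via CONTAINS. */
    Set<Long> productHierarchy(long startId) {
        Set<Long> seen = new LinkedHashSet<>();
        Deque<Long> stack = new ArrayDeque<>();
        stack.push(startId);
        while (!stack.isEmpty()) {
            long current = stack.pop();
            if (seen.add(current)) {
                for (long child : contains.getOrDefault(current, Set.of())) {
                    stack.push(child);
                }
            }
        }
        return seen;
    }

    /** Step 2: each product plus its direct neighbours that are neither User nor Product. */
    Set<Long> nodesToDelete(long startId) {
        Set<Long> doomed = new TreeSet<>();
        for (long product : productHierarchy(startId)) {
            doomed.add(product);
            for (long other : neighbours.getOrDefault(product, Set.of())) {
                String label = labels.get(other);
                if (!"User".equals(label) && !"Product".equals(label)) {
                    doomed.add(other);
                }
            }
        }
        return doomed;
    }
}
```

With a start product that has a parent, a child, a Vote and a User attached, nodesToDelete returns the start product, the child, the Vote and the child's Criteria, while the parent product and the User are left alone.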

Related

Neo4j SDN4 entity inheritance and indexes

I have the following Cypher query:
PROFILE MATCH (childD:Decision)
WITH childD
ORDER BY childD.createDate
DESC SKIP 0 LIMIT 10
MATCH (childD:Decision)-[ru:CREATED_BY]->(u:User)
OPTIONAL MATCH (childD:Decision)-[rup:UPDATED_BY]->(up:User)
RETURN ru, u, rup, up, childD AS decision, [ (childD)-[rdt:BELONGS_TO]->(t:Tag) | t ] AS tags
Right now, on my Neo4j database (~23k Decision nodes), this query takes ~50 ms, and I can't tell whether it uses the index on the childD.createDate field.
This is the PROFILE output:
These are my SDN 4 entities:
@NodeEntity
public abstract class BaseEntity implements BaseEntityVisitable {
    private static final String CREATED_BY = "CREATED_BY";
    private static final String UPDATED_BY = "UPDATED_BY";
    @GraphId
    private Long graphId;
    @Index(unique = false)
    private Date createDate;
    @Relationship(type = CREATED_BY, direction = Relationship.OUTGOING)
    private User createUser;
    @Index(unique = false)
    private Date updateDate;
    @Relationship(type = UPDATED_BY, direction = Relationship.OUTGOING)
    private User updateUser;
    ....
}
@NodeEntity
public class Decision extends BaseEntity {
    private static final String BELONGS_TO = "BELONGS_TO";
    private static final String CONTAINS = "CONTAINS";
    private static final String DEFINED_BY = "DEFINED_BY";
    @Index(unique = true)
    private Long id;
    @Index(unique = false)
    private String name;
    ....
}
This is :schema output:
Indexes
ON :BaseEntity(createDate) ONLINE
ON :BaseEntity(updateDate) ONLINE
ON :Decision(lowerName) ONLINE
ON :Decision(name) ONLINE
ON :Decision(totalChildDecisions) ONLINE
ON :Decision(totalViews) ONLINE
ON :Decision(id) ONLINE (for uniqueness constraint)
Please note that the createDate index is set on :BaseEntity and not on :Decision.
How can I check whether this index is used (or not) for this part of the query: ORDER BY childD.createDate
I think you're confusing an index with a sorting order. There is no reason whatsoever that this query would use an index as you're not giving it any value to search the index with. It could be that the index-implementation has the dates in order, but there's no rule that says this has to be so (and obviously the query is not using an index to sort the Decision nodes).
Hope this helps.
Regards,
Tom

Neo4j Cypher find entity by exact collection of associated nodes(ids)

In my Neo4j/SDN4 project I have the following node entity:
@NodeEntity
public class Nomination extends Commentable {
    private final static String CONTAINS = "CONTAINS";
    private final static String DEFINED_BY = "DEFINED_BY";
    private String name;
    @Relationship(type = CONTAINS, direction = Relationship.OUTGOING)
    private Set<Criterion> criteria = new HashSet<>();
    ...
}
I need to implement a Cypher query that finds a Nomination by the exact collection of its associated criteria (by criterion ids).
Right now I have the following query:
MATCH (n:Nomination)-[:CONTAINS]->(c:Criterion) WHERE id(n) = {nominationId} AND id(c) IN {criterionIds} RETURN n
but that is not enough, because the Nomination can contain fewer criteria than were provided in {criterionIds}, and I need an exact match (the order of the criteria doesn't matter).
How can I reimplement this query to do this?
Use COLLECT and then the ALL function.
https://neo4j.com/docs/developer-manual/current/cypher/functions/predicates/#functions-all
MATCH (n:Nomination)-[:CONTAINS]->(c:Criterion)
WHERE id(n) = {nominationId}
WITH n,COLLECT(id(c)) AS foundCritIds
WHERE ALL (id IN {criterionIds} WHERE id in foundCritIds)
RETURN n
Here's an alternative approach. You may want to PROFILE each to see which works best for you:
MATCH (c:Criterion)
WHERE id(c) in {criterionIds}
WITH COLLECT(c) as criterion
WITH criterion, head(criterion) as firstC
MATCH (firstC)<-[:CONTAINS]-(n:Nomination)
WHERE SIZE((n)-[:CONTAINS]->(:Criterion)) = SIZE(criterion)
AND ALL(crit in criterion[1..] WHERE (n)-[:CONTAINS]->(crit))
RETURN n
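The difference between the two checks used in these answers can be shown with plain sets. As far as I can tell, the ALL predicate in the first query only verifies that every requested id is among the found ones, so a Nomination with extra criteria would still pass, while the second query adds a SIZE comparison that makes the match exact. The sketch below is an assumption-laden illustration in plain Java; the CriteriaMatch class and its method names are hypothetical and not part of SDN.

```java
import java.util.Collection;
import java.util.HashSet;

/** Hypothetical helper contrasting "contains all" with "exact match". */
final class CriteriaMatch {
    /** Mirrors ALL(id IN criterionIds WHERE id IN foundCritIds): every requested id is present. */
    static boolean containsAll(Collection<Long> found, Collection<Long> requested) {
        return new HashSet<>(found).containsAll(requested);
    }

    /** Exact match, ignoring order and duplicates: both sides hold the same ids. */
    static boolean exactMatch(Collection<Long> found, Collection<Long> requested) {
        return new HashSet<>(found).equals(new HashSet<>(requested));
    }
}
```

If the exact semantics are needed with the first query, one option would be to also compare the size of the collected ids with the size of {criterionIds}, in the same spirit as the SIZE check in the second query.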

Neo4j Cypher query with null or not null value

In my Spring Data Neo4j project I have the following entities:
@NodeEntity
public class Decision extends Commentable {
    private final static String CONTAINS = "CONTAINS";
    private final static String DEFINED_BY = "DEFINED_BY";
    private String name;
    @Relationship(type = DEFINED_BY, direction = Relationship.INCOMING)
    private Set<CriterionGroup> criterionGroups = new HashSet<>();
    @Relationship(type = DEFINED_BY, direction = Relationship.INCOMING)
    private Set<Criterion> criteria = new HashSet<>();
    ...
}
@NodeEntity
public class Criterion extends Authorable {
    private final static String CONTAINS = "CONTAINS";
    private final static String DEFINED_BY = "DEFINED_BY";
    private String name;
    @Relationship(type = CONTAINS, direction = Relationship.INCOMING)
    private CriterionGroup group;
    @Relationship(type = DEFINED_BY, direction = Relationship.OUTGOING)
    private Decision owner;
    ...
}
@NodeEntity
public class CriterionGroup extends Authorable {
    private final static String DEFINED_BY = "DEFINED_BY";
    private final static String CONTAINS = "CONTAINS";
    private String name;
    @Relationship(type = DEFINED_BY, direction = Relationship.OUTGOING)
    private Decision owner;
    @Relationship(type = CONTAINS, direction = Relationship.OUTGOING)
    private Set<Criterion> criteria = new HashSet<>();
    ...
}
I have the following SDN repository method:
@Query("MATCH (d:Decision)<-[:DEFINED_BY]-(c:Criterion) WHERE id(d) = {decisionId} and c.name = {name} RETURN c")
Criterion findCriterionDefinedByDecisionByName(@Param("decisionId") Long decisionId, @Param("name") String name);
With this query I can get the Criterion that belongs to a Decision and has a specific name.
A Criterion in my domain model may (or may not) belong to a CriterionGroup.
I need to extend this query with one more condition that checks the CriterionGroup associated with this Criterion. In other words, I need to return the Criterion with a specific {name} for a specific {decisionId} that belongs to the provided {criterionGroupId}. If {criterionGroupId} is null, I need to find the Criterion that does not belong to any CriterionGroup.
I need something like this:
@Query("MATCH (d:Decision)<-[:DEFINED_BY]-(c:Criterion)...????????...... WHERE id(d) = {decisionId} and c.name = {name} RETURN c")
Criterion findCriterionDefinedByDecisionByName(@Param("decisionId") Long decisionId, @Param("name") String name, @Param("criterionGroupId") Long criterionGroupId);
Please help me write this query.
This should work, and shouldn't be too slow either unless you have tens of thousands of CriterionGroups per Criterion:
MATCH (d:Decision)<-[:DEFINED_BY]-(c:Criterion)
WHERE id(d) = {decisionId}
AND c.name = {name}
OPTIONAL MATCH (c)<-[:CONTAINS]-(cg:CriterionGroup)
WITH c, extract(g IN collect(cg) | id(g)) AS cgIds
WHERE CASE
WHEN {criterionGroupId} IS NULL THEN size(cgIds) = 0
ELSE {criterionGroupId} IN cgIds
END
RETURN c
Alternatively, you could have 2 methods in your Repository, to manage each case directly, using
MATCH (d:Decision)<-[:DEFINED_BY]-(c:Criterion)
WHERE id(d) = {decisionId}
AND c.name = {name}
AND NOT (c)<-[:CONTAINS]-(:CriterionGroup)
RETURN c
when criterionGroupId is null, and
MATCH (d:Decision)<-[:DEFINED_BY]-(c:Criterion),
(c)<-[:CONTAINS]-(cg:CriterionGroup)
WHERE id(d) = {decisionId}
AND c.name = {name}
AND id(cg) = {criterionGroupId}
RETURN c
otherwise.
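The condition that all of these queries encode is a two-way predicate on the groups that contain a Criterion. A minimal Java sketch of that predicate, useful mainly for checking the intended semantics in a unit test, could look like this. The GroupFilter class and the matches method are hypothetical names introduced for the illustration; the real filtering still happens in Cypher.

```java
import java.util.Set;

/** Hypothetical helper mirroring the CASE expression in the first query. */
final class GroupFilter {
    /**
     * @param groupIdsOfCriterion ids of the CriterionGroups that contain the Criterion
     * @param criterionGroupId    the requested group id, or null for "no group at all"
     */
    static boolean matches(Set<Long> groupIdsOfCriterion, Long criterionGroupId) {
        return criterionGroupId == null
                ? groupIdsOfCriterion.isEmpty()                    // must not belong to any group
                : groupIdsOfCriterion.contains(criterionGroupId);  // must belong to the given group
    }
}
```

With the two-method variant, the same null check would presumably live in the calling service, which picks the repository method depending on whether criterionGroupId is null.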

How to classify the attributes according to the attribute

I have an entity class like this:
@NodeEntity
public class Patent {
    @GraphId
    private Long patentId;
    private String patentName;
    //2016-02-01
    private String authorizedTime;
    private String patentNumber;
    @Fetch
    @RelatedTo(type = "authorizedPerson")
    private Set<Researcher> authorizedPersons = new HashSet<Researcher>();
    private String createTime;
    private String description;
I want to get a result like this:
year total
2013 6
2014 7
I tried this Cypher query:
match (n1:Patent)
with collect( DISTINCT subString(n1.authorizedTime,0,4)) as coll,subString(n1.authorizedTime,0,4) as val
return coll, reduce(s=0, val IN coll | s + 1) as numByY ;
But it did not work.
How can I group the nodes by an attribute (here, the year of authorizedTime) and count them?
Thanks a lot!
This query should return each year and the number of nodes with that year, ordered by year:
MATCH (n1:Patent)
RETURN LEFT(n1.authorizedTime, 4) AS year, COUNT(*) AS total
ORDER BY year;
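For comparison, the same grouping can be expressed in plain Java on values that have already been loaded. This is only a sketch of what the query computes, assuming authorizedTime strings in the yyyy-MM-dd form shown in the entity comment; the YearTotals class and countByYear method are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical helper: counts patents per year from their authorizedTime strings. */
final class YearTotals {
    static Map<String, Long> countByYear(List<String> authorizedTimes) {
        return authorizedTimes.stream()
                .collect(Collectors.groupingBy(
                        time -> time.substring(0, 4), // first four characters are the year
                        TreeMap::new,                 // keeps the years sorted, like ORDER BY year
                        Collectors.counting()));
    }
}
```

Doing the aggregation in Cypher, as in the query above, avoids loading every Patent node into the application, so the Java version is mostly useful for verifying the expected result.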

Neo4j Cypher delete query

I have the following Neo4j Cypher query for deleting Decision entities:
MATCH (d:Decision)
WHERE id(d) IN {decisionsIds}
OPTIONAL MATCH (d)-[r]-(t)
DELETE d, r
WITH t, r
WHERE NOT (id(t) IN {decisionsIds})
OPTIONAL MATCH (t)-[r2:VOTED_ON|:CREATED_BY|:VOTED_FOR]-()
WHERE r2 <> r
WITH t, r2
WHERE none(x in labels(t) WHERE x in ['User', 'Decision'])
DELETE t, r2
Previously I had a Vote entity with VOTED_ON and VOTED_FOR relationships to the Criterion and Decision entities. Vote also had a CREATED_BY relationship to the User entity.
Everything worked fine.
Today I changed this schema and introduced a new VoteGroup entity.
Now VoteGroup, instead of Vote, holds the VOTED_ON and VOTED_FOR relationships to the Criterion and Decision entities:
@NodeEntity
public class VoteGroup extends BaseEntity {
    private static final String VOTED_ON = "VOTED_ON";
    private final static String VOTED_FOR = "VOTED_FOR";
    private final static String CONTAINS = "CONTAINS";
    @GraphId
    private Long id;
    @RelatedTo(type = VOTED_FOR, direction = Direction.OUTGOING)
    private Decision decision;
    @RelatedTo(type = VOTED_ON, direction = Direction.OUTGOING)
    private Criterion criterion;
    @RelatedTo(type = CONTAINS, direction = Direction.OUTGOING)
    private Set<Vote> votes = new HashSet<>();
    private double avgVotesWeight;
    private long totalVotesCount;
    .....
}
The Vote entity now looks like this:
@NodeEntity
public class Vote extends BaseEntity {
    private final static String CONTAINS = "CONTAINS";
    private final static String CREATED_BY = "CREATED_BY";
    @GraphId
    private Long id;
    @RelatedTo(type = CONTAINS, direction = Direction.INCOMING)
    private VoteGroup group;
    @RelatedTo(type = CREATED_BY, direction = Direction.OUTGOING)
    private User author;
    private double weight;
    ....
}
Please help me change the query above so that it also deletes Votes. After my schema changes it deletes VoteGroups (which is fine) but doesn't delete Votes. I need to delete the Votes and the relationships between Votes and Users as well.
UPDATED
The following new query works now (at least all my tests pass):
MATCH (d:Decision)
WHERE id(d) IN {decisionsIds}
OPTIONAL MATCH (d)-[r]-(t)
DELETE d, r
WITH t, r
WHERE NOT (id(t) IN {decisionsIds})
OPTIONAL MATCH (t)-[r2:VOTED_ON|:CREATED_BY|:VOTED_FOR]-()-[r3:CONTAINS]-(t2)
WHERE r2 <> r
WITH t, r2, t2, r3
WHERE none(x in labels(t) WHERE x in ['User', 'Decision'])
DELETE t, r2, t2, r3
but I'm still not sure that this query is 100% correct. In any case, I'll add a bunch of tests to check everything.
Could you please validate this query as well? In particular, I'm not sure that it deletes all the relationships and leaves no garbage in the database.
Looks pretty complicated.
Can't you just specify the path and delete the whole path?
MATCH (d:Decision) WHERE id(d) IN {decisionsIds}
OPTIONAL MATCH path = (d)-[r1:VOTED_ON|:CREATED_BY|:VOTED_FOR]->(t)<-[r2:VOTED_ON|:CREATED_BY|:VOTED_FOR]-()
DELETE path
// or
DELETE d,t,r1,r2
