It appears that I can't add a relationship unless there is already some data in some Entities that obey that relationship. Is this correct? I want to be able to set up my relationships and Labels first and then populate with data and have the data just use the relationships.
I am using:
MATCH (from:this_label),(to:that_label)
WHERE from.id = to.uuid
CREATE (from)-[:hasARelationship]->(to);
Basically, I want to be able to define a bunch of relationships on nodes of a certain label, even if nodes of that type do not yet exist. Then, when data for those nodes comes into the database, it will hook up the relationships automatically.
It may be helpful to distinguish between the responsibilities of enforcing a constraint and fulfilling a constraint.
Neo4j allows for indices and constraints associated with labels. Indices and constraints created for a label are used to index and constrain the nodes that have that label. As of version 2.2.5, there is only one type of constraint: a uniqueness constraint for a single property. There has been talk about adding constraints for combinations of properties, and for relationships, but I don't know the status of these conversations.
The Neo4j schema constraints enforce something, but they will not fulfill, in the sense of changing your operations on the database to satisfy the constraint. If there were constraints enforcing that a node with label A may only be created if it has a relationship of type R to a node with label B, they would block your operation if it did not satisfy the constraint, but they would not satisfy it for you.
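To make the distinction concrete, here is a toy, in-memory sketch in Python. It is not Neo4j's API: ToyGraph, ConstraintViolation and create_a are hypothetical names invented for the illustration, and the rule ("an :A node needs an :R relationship to a :B node") is the hypothetical constraint from the paragraph above. The same write is either blocked (enforcing) or altered so that it passes (fulfilling).

```python
class ConstraintViolation(Exception):
    """Raised when a write breaks the rule and nothing repairs it."""


class ToyGraph:
    """In-memory stand-in for a graph store; hypothetical, not Neo4j's API."""

    def __init__(self, fulfil=False):
        self.nodes = []   # each node is a (label, properties) tuple
        self.rels = []    # each relationship is (from_index, type, to_index)
        self.fulfil = fulfil

    def _add_node(self, label, props=None):
        self.nodes.append((label, props or {}))
        return len(self.nodes) - 1

    def create_a(self, props, b_index=None):
        """Create an :A node; the rule says it needs an :R relationship to a :B node."""
        if b_index is None:
            if not self.fulfil:
                # Enforcing: block the operation and leave the graph untouched.
                raise ConstraintViolation("an :A node needs an :R relationship to a :B node")
            # Fulfilling: alter the operation so that it satisfies the rule.
            b_index = self._add_node("B")
        a_index = self._add_node("A", props)
        self.rels.append((a_index, "R", b_index))
        return a_index
```

With ToyGraph(fulfil=False), calling create_a({"name": "x"}) raises ConstraintViolation and stores nothing. With ToyGraph(fulfil=True), the same call first adds a :B node and then the :A node and the :R relationship. A schema constraint would behave like the first case; a client application or an extension is what would give you the second.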
The best way to achieve this is a) to satisfy this requirement in your client application, or b) to create an extension for Neo4j. For an extension example, consider neo4j-uuid by Stefan Armbruster. It listens to transactions (using what's called a TransactionEventHandler) and makes sure that any node that is created in the database has a UUID. This extension satisfies what could only be enforced by a corresponding Neo4j schema constraint (there are other differences, e.g., the constraint would be limited to the scope of a label).
A way to achieve your intention could be to create an extension of one of two kinds: either one which listens to what you write to the database and satisfies your constraint, altering your operations if necessary; or one which provides an invocation target in the server (a RESTful endpoint) that you can invoke whenever you want to create a node with a particular label. The extension would then create the node and other elements necessary to fulfill your schema. A downside to the former could be the overhead of listening to all your operations. A downside to the latter could be that it breaks your flow of interaction with the database by introducing a separate type of invocation (e.g., if you normally execute Cypher statements and have to pause to issue a separate POST request and interpret the response before continuing).
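If I understand you correctly, you want to use MERGE instead of CREATE, so that the relationship is only created when it does not already exist. MERGE does not accept a WHERE clause, so keep the MATCH and WHERE to find the nodes:
MATCH (from:this_label),(to:that_label)
WHERE from.id = to.uuid
MERGE (from)-[:hasARelationship]->(to);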
If you are trying to create relationships without nodes, I don't think that is possible in Neo4j. In fact, it wouldn't be possible in any graph in general.
It does not make sense to pre-populate your DB with relationships that connect to dummy nodes. Among the many reasons are these:
You would not be able to make any meaningful queries involving such relationships
Trying to fill in the dummy nodes later on with actual data may be a complex endeavor
It is very easy to create relationships right when they are needed. Neo4j is a "schemaless" DB (except when you define uniqueness constraints, as @jjaderberg mentions). You can create a relationship of any type connecting nodes with any labels (or no labels) at any time. To keep things organized, you may choose to write your DB client code and Cypher queries to conform to your own conceptual "schema", but Neo4j has no such requirement.
Related
In Neo4j, is there a way of enforcing that a node of a label X is not connected to a node of label Y?
For example, if someone tried to run a query such as:
MERGE (:X)-[:SOME_RELATIONSHIP]->(:Y)
would there be a way to guarantee that such a query would fail?
Thank you!
Neo4j's constraints don't currently support relationship existence or restriction, so you'd need to put in some extra work.
If you have APOC Procedures, you could register a trigger that checks whether a relationship being created connects two nodes of those labels, and use apoc.util.validate() to raise an error, which fails the transaction and rolls it back.
If you want to do this without APOC, it's a bit more work, as you'll need to create a TransactionEventHandler, and then a kernel extension to load your event handler. Here's a blog entry on this approach.
Hi, how can I translate this SQL query into a Cypher query?
SELECT n.enginetype, n.Rocket20, n.Yearlong, n.DistanceOn
FROM TIMETAB AS n
JOIN PLANEAIR AS p ON (n.tailnum = p.tailNum)
If I need to create any relationships or anything else before running that query, please explain how to do that too. Thanks.
Here's a good guide for comparing SQL with Cypher and showing the equivalent Cypher for some SQL queries.
If we were to translate this directly, we'd use :PLANEAIR and :TIMETAB node labels (though I'd recommend using better names for these), and we'll need a relationship between them. Let's call it :RELATION.
Joins in SQL tend to be replaced with relationships between nodes, so we'll need to create these patterns in your graph:
(:PLANEAIR)-[:RELATION]->(:TIMETAB)
There are several ways to get your data into the graph, usually through LOAD CSV. The general approach is to MERGE your :PLANEAIR and :TIMETAB nodes with some id or unique property (maybe TailNum?), use ON CREATE SET ... after the MERGE to add the rest of the properties to the node when it's created, and then MERGE the relationship between the nodes.
The MERGE section of the developers manual should be helpful here, though I'd recommend reading through the entire dev manual anyway.
With this in place, the Cypher equivalent query is:
MATCH (p:PLANEAIR)-[:RELATION]->(n:TIMETAB)
RETURN n.Rocket20,p.enginetype, n.year, n.distance
Now this is just a literal translation of your SQL query. You may want to reconsider your model, however, as I'm not sure how much value there is in keeping time-related data for a plane separate from its node. You may just want to have all of the :TIMETAB properties on the :PLANEAIR node and do away with the :TIMETAB nodes completely. Of course your queries and use cases should guide how to model that data best.
EDIT
As far as creating the relationship between :PLANEAIR and :TIMETAB nodes (and again, I recommend using better labels for these, and maybe even keeping all time-related properties on a :Plane node instead of a separate one), provided you already have those nodes created, you'll need to do a joining match, but it will help to have unique constraints on :PLANEAIR(tailNum) and :TIMETAB(tailNum) (or indexes, if this isn't supposed to be a unique property):
CREATE CONSTRAINT ON (p:PLANEAIR)
ASSERT p.tailNum IS UNIQUE
CREATE CONSTRAINT ON (n:TIMETAB)
ASSERT n.tailNum IS UNIQUE
Now we're ready to create the relationships
MATCH (p:PLANEAIR)
MATCH (n:TIMETAB)
WHERE p.tailNum = n.tailNum
CREATE (p)-[:RELATION]->(n)
REMOVE n.tailNum
Now that the relationships are created, and :TIMETAB tailNum property removed, we can drop the unique constraint on :TIMETAB(tailNum), since the relationship to :PLANEAIR is all we need.
DROP CONSTRAINT ON (n:TIMETAB)
ASSERT n.tailNum IS UNIQUE
I recently discovered that a race condition exists when executing concurrent MERGE statements. Specifically, duplicate nodes can be created in the scenario where a node is created after the MATCH step but before the CREATE step of a given MERGE.
This can be worked around in some instances using unique constraints on the merged nodes; however, this falls short in scenarios where:
There is no single unique property to enforce (e.g. pairs of properties need to be unique but individual ones don't).
Trying to merge relationships and paths.
Does using CREATE UNIQUE solve this problem (or do the same pitfalls exist)? If so, is it the only option? It feels like the usefulness of MERGE is fairly heavily diminished when it effectively can't guarantee the uniqueness of the path or node being merged...
When MERGE statements are executed concurrently, these situations may occur. Basically, each transaction gets a view of the graph at the first point of reading, and won't see updates made after that point (with some variations). The main exception to this is uniquely constrained nodes, where Neo4j will initialise a fresh reader from the index when reading, regardless of what was previously read in the transaction.
A workaround could be to create a 'dummy' property and a unique constraint on it and one of the node labels. In Neo4j 2.2.5, this should work to get around your problem.
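The interleaving described in the question can be reproduced with a small Python simulation. This is only a toy under stated assumptions: merge_concurrently and toy_merge are hypothetical names, the list stands in for the graph, the barrier deliberately forces every "match" to finish before any "create", and the lock stands in loosely for the effect of a uniqueness constraint. It says nothing about how Neo4j implements either.

```python
import threading


def merge_concurrently(keys, use_lock):
    """Run one toy 'MERGE' per key in its own thread and return the stored nodes."""
    nodes = []                               # the shared 'graph'
    barrier = threading.Barrier(len(keys))   # holds every thread between match and create
    lock = threading.Lock()                  # loose stand-in for a uniqueness constraint

    def toy_merge(key):
        if use_lock:
            # Match and create happen as one atomic step.
            with lock:
                if key not in nodes:
                    nodes.append(key)
            return
        found = key in nodes   # MATCH step
        barrier.wait()         # every thread has matched before anyone creates
        if not found:
            nodes.append(key)  # CREATE step, based on a stale read

    threads = [threading.Thread(target=toy_merge, args=(key,)) for key in keys]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return nodes
```

Calling merge_concurrently(["a", "a"], use_lock=False) stores the node twice, because both threads matched before either created. With use_lock=True the second thread sees the first thread's node and creates nothing.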
To clarify, let's assume that I have a relationship type: "connection." Connections have a property called "typeOfConnection," which can take on values in the domain:
{"GroupConnection", "FriendConnection", "BlahConnect"}.
When I query, I may want to qualify connection with one of these types. While there are not many types, there will be millions of connections with each property type.
Do I need to put an index on connection.typeOfConnection in order to ensure that all connections will not be traversed?
If so, I have been unable to find a simple cypher statement to do this. I've seen some stuff in the documentation describing how to do this in Java, but I'm interacting with Neo using Py2Neo, so it would be wonderful if there was a cypher way to do this.
This is a mixed-granularity property graph data model. Totally fine, but you need to replace your relationship qualifiers with intermediate nodes. To do this, replace each relationship with one typed node and two relationships so that you can perform indexing.
Your model has a graph with a coarse-grained granularity. The opposite extreme is referred to as fine-grained granularity, which is the foundation of the RDF model. With a property graph, if you want this kind of coarse-grained model, you'll need to use nodes in place of relationships and label those nodes by their type.
For instance, let's assume you have:
MATCH (thing1:Thing { id: 1 })-->(group:Connection { type: "group" }),
(group)-->(thing2:Thing)
RETURN thing2
Then you can index on the label Connection by property type.
CREATE INDEX ON :Connection(type)
This allows you the flexibility of not typing your relationships if your application requires dynamic types of connections that prevent you from using a fine-grained granularity.
Whatever you do, don't work around your issue by dynamically generating typed relationships in your Cypher queries. This will prevent your query templates from being cached and decrease performance. Either type all your relationships or go with the intermediate node I've recommended above.
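Since you are calling Neo4j from Python, here is a minimal sketch of the difference. The function names are hypothetical, nothing here talks to a database, and the $name parameter syntax assumes Neo4j 3.0 or later (older releases wrote parameters as {name}). With the intermediate node, the connection type travels as a parameter and the query text never changes; with dynamically generated relationship types, every type produces a different query string.

```python
# One fixed query text for every connection type (intermediate-node model).
TEMPLATE = (
    "MATCH (thing1:Thing {id: $id})-->(c:Connection {type: $type}), "
    "(c)-->(thing2:Thing) RETURN thing2"
)


def parameterised(thing_id, connection_type):
    """Return (query text, parameters); only the parameters vary per call."""
    return TEMPLATE, {"id": thing_id, "type": connection_type}


def dynamically_typed(thing_id, connection_type):
    """Anti-pattern: the relationship type is spliced into the query text."""
    query = (
        "MATCH (thing1:Thing {id: $id})-[:%s]->(thing2:Thing) RETURN thing2"
        % connection_type
    )
    return query, {"id": thing_id}
```

parameterised(1, "group") and parameterised(1, "friend") return the same query text with different parameters, whereas dynamically_typed(1, "GROUP") and dynamically_typed(1, "FRIEND") return two distinct query strings, each of which has to be handled as a new query.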
I am writing a server plugin for Neo4j. The plugin receives a Cypher query and executes it. Currently, my implementation uses a CypherExecutor.
I now need to further constrain the results. (For example, imagine that the results need to be filtered by ACLs.)
One approach is to filter the results after executing the query. I'd rather not do this, for performance reasons as well as other limitations (for example, any aggregate results would be wrong.)
I considered adding the constraints to the query itself. I've looked at the command.AbstractQuery subclasses produced using the CypherParser. That object model is immutable.
I am wondering whether I will need to resort to cloning Neo4j's ExecutionEngine and CypherCompiler, just to extend the ExecutionPlanBuilder... I would like to avoid this option if at all possible.
Any recommendations about how this can be done?
In my case, I am simply trying to simulate multiple isolated graphs. I am OK with how this might be modeled -- whether I add a 'tenantId' to each node, or maintain a tenant node and add (:Tenant)<-[:scopedTo]-(n) relationships to every node.