Ensure unique nodes with neo4jclient - neo4jclient

Is there a way to ensure uniqueness when creating a node with neo4jclient?
This link on transactions shows how to do it using Java and transactions, but I don't see any transaction support in Neo4jClient. I was able to do it using an explicit Cypher query string, something like this:
"start n=node:node_auto_index(name={id})
with count(*) as c
where c=0
create x={name:{id}}
return c"
But this is obviously a hack. Is there a better way?

Transaction support will come with Neo4j 2.0 and a later version of Neo4jClient. This issue is tracking the work: https://bitbucket.org/Readify/neo4jclient/issue/91/support-cypher-transactions-integrated
That doesn't give you uniqueness though...
Neo4j doesn't have unique indexes or anything else to auto-enforce this idea. (I expect we'll see this with Neo4j 2.0 labels in the future, but not yet.)
You need to either a) know that what you're creating is unique, or b) check first.
You seem to be taking the B route.
Transactions allow you to do the check then the create within a single transactional action, but still multiple calls over the wire.
The Cypher text that you've written out is actually preferred: you do the check and create in a single statement. I'm intrigued to know why you think this is a hack.
You can execute this statement via Neo4jClient with something like:
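var id = 123;
// "c" is the match count; a row (with c = 0) comes back only when the node was created.
var results = graphClient.Cypher
    .Start(new { n = Node.ByIndexLookup("node_auto_index", "name", id) })
    .With("count(*) as c")
    .Where("c=0")
    .Create("x={0}", new MyType { name = id })
    .Return<long>("c")
    .Results;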
Some of the With and Where statements would be nice if they were cleaner, but it's functional for now.
There's also Cypher's CREATE UNIQUE clause which might cover your scenario too.

Related

Auto increment property in Neo4j

As far as I understand it the IDs given by Neo4j (ID(node)) are unstable and behave somewhat like row numbers in SQL. Since IDs are mostly used for relations in SQL and these are easily modeled in Neo4j, there doesn't seem to be much use for IDs, but then how do you solve retrieval of specific nodes? Having a REST API which is supposed to have unique routes for each node (e.g. /api/concept/23) seems like a pretty standard case for web applications.
But despite it being so fundamental, the only viable ways I found were either via language-specific frameworks or via an unconnected node which maintains the increments:
// get unique id
MERGE (id:UniqueId{name:'Person'})
ON CREATE SET id.count = 1
ON MATCH SET id.count = id.count + 1
WITH id.count AS uid
// create Person node
CREATE (p:Person{id:uid,firstName:'Gabriel',lastName:'Smith'})
RETURN p AS person
Source: http://www.neo4j.org/graphgist?8012859
Is there really not a simpler way and if not, is there a particular reason for it? Is my approach an anti-pattern in the context of Neo4j?
Neo4j internal IDs are a bit more stable than SQL row IDs; for example, they will never change during a transaction.
Still, exposing them for external use is not recommended. I know there are intentions within Neo to implement such a feature.
Basically, people tend to use two solutions for this:
Using a UUID generator at the application level, such as https://packagist.org/packages/rhumsaa/uuid for PHP, and adding a label/uuid unique constraint on all nodes.
Using a very handy Neo4j plugin like https://github.com/graphaware/neo4j-uuid that adds uuid properties on the fly, which removes the burden of handling it at the application level and makes it easier to manage the persistence state of your node objects.
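As a rough illustration of the first option, here is a minimal Python sketch (Python is used only for illustration; the answer itself points to a PHP package). It relies on the standard library's uuid module. The function name new_person_params, the Person property names, and the query text are assumptions made for this example, and the {param} placeholder style simply follows the Cypher snippets in this thread.

```python
import uuid

# Hypothetical query text; a uniqueness constraint on Person.uuid is assumed to exist.
CREATE_PERSON = (
    "CREATE (p:Person {uuid: {uuid}, firstName: {firstName}, lastName: {lastName}}) "
    "RETURN p"
)

def new_person_params(first_name, last_name):
    """Build the parameter map for CREATE_PERSON with an application-generated UUID."""
    return {
        "uuid": str(uuid.uuid4()),  # random (version 4) UUID, generated client-side
        "firstName": first_name,
        "lastName": last_name,
    }

params = new_person_params("Gabriel", "Smith")
print(params["uuid"])  # could back a route such as /api/concept/<uuid>
```

The query text and the parameter map would then be handed to whichever driver or REST client the application already uses.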
I agree with Pavel Niedoba.
I came up with this without a UniqueId node:
MATCH (a:Person)
WITH a ORDER BY a.id DESC LIMIT 1
CREATE (n:Person {id: a.id+1})
RETURN n
It requires a first Node with an id field though.
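The limitation can be seen in a small Python model of the query (plain Python, not Neo4j code; next_person_id is a hypothetical helper). With no existing Person nodes the MATCH yields no rows, so nothing gets created. The model makes the empty case explicit by falling back to 0.

```python
def next_person_id(existing_ids):
    """Return the highest existing id plus one, or 1 when there are no ids yet."""
    return max(existing_ids, default=0) + 1

print(next_person_id([1, 2, 7]))  # 8
print(next_person_id([]))         # 1
```

Two writers running this at the same time could still compute the same value, so a uniqueness constraint on the id property would likely remain a useful safeguard.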

Neo4J API Rest: Create UNIQUE relationship or return FAIL without indexing

I was wondering whether it is possible to create a relationship or return fail if it exists through a Cypher query via REST. Besides, I do not want to create any kind of index.
This is my use case: User can like a comment only once. So I want to create the relationship (User)-[:LIKES]->(Comment) or return fail if it exists, using a Cypher query via REST.
My approach is to use CREATE UNIQUE and RETURN some kind of code that I will interpret in my back-end to know if I have to send 409 Conflict to the back-end's client. But this approach seems messy...
Any idea? Thanks.
If you are willing to put a property in your LIKES relationship you could do something like this.
WITH timestamp() AS now
MERGE (user)-[like:LIKES]->(comment)
ON CREATE SET like.created_at = timestamp()
RETURN like.created_at >= now
If the query returns true, you know the like was created; otherwise it already existed and you can handle it accordingly.
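On the back-end side, interpreting that boolean could look something like the sketch below. It is written in Python purely for illustration, and status_for_like is a hypothetical helper; the 409 Conflict response is the one mentioned in the question, while 201 Created for the success case is an assumption.

```python
from http import HTTPStatus

def status_for_like(created):
    """Translate the boolean returned by the query into an HTTP status."""
    return HTTPStatus.CREATED if created else HTTPStatus.CONFLICT

print(int(status_for_like(True)))   # 201
print(int(status_for_like(False)))  # 409
```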

How to use "MATCH.. CREATE UNIQUE.." with neography

I'm trying to write an importer script that will take a MySQL resultset and add nodes and relationships in my Neo4J DB. To simplify things, my resultset has the following fields:
application_id
amount
application_date
account_id
I want to create a node with Application label with the application_id, amount, application_date fields and another node with Account label with account_id field and a relationship between them.
My SQL query is from the applications table so I'm not afraid of dups there, but an account_id can appear more than once and I obviously don't want to create multiple nodes for that.
I'm using neography (but willing to switch if there is something simpler). What would be the best (easiest) way to achieve that?
My script will drop the database before it starts so no leftovers to take care of.
Should I create an index before and use create_unique_node?
I think I can do what I want in Cypher using "MATCH .. CREATE UNIQUE.."; what's the equivalent of that in neography? I don't understand how index_name gets into the equation...
Should I or should I not define the constraint on Account?
Many thanks,
It's my first time with graph DBs so apologies if I miss a concept here..
From what you describe, it looks like you should use the constraints that come with Neo4j 2.0 : http://docs.neo4j.org/chunked/milestone/query-constraints.html#constraints-create-uniqueness-constraint
CREATE CONSTRAINT ON (account:ACCOUNT) ASSERT account.account_id IS UNIQUE
Then you can use the MATCH .. CREATE UNIQUE clauses for all your inserts. You can use neography to submit the cypher queries, see the examples here: https://github.com/maxdemarzi/neography/wiki/Scripts-and-queries
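Since the script starts from an empty database, duplicate accounts can only come from the result set itself, so a complementary option is to deduplicate in the importer before sending anything. This is a client-side technique, not the CREATE UNIQUE approach described above, and it is sketched in Python rather than with neography. The plan_import function and the sample rows are made up for illustration.

```python
def plan_import(rows):
    """Split result-set rows into unique accounts and per-row applications."""
    seen = set()
    accounts = []
    applications = []
    for row in rows:
        account_id = row["account_id"]
        if account_id not in seen:  # first time this account appears
            seen.add(account_id)
            accounts.append({"account_id": account_id})
        applications.append({
            "application_id": row["application_id"],
            "amount": row["amount"],
            "application_date": row["application_date"],
            "account_id": account_id,  # used to link the application to its account
        })
    return accounts, applications

# Made-up sample rows standing in for the MySQL result set.
rows = [
    {"application_id": 1, "amount": 100, "application_date": "2014-01-01", "account_id": 10},
    {"application_id": 2, "amount": 250, "application_date": "2014-01-02", "account_id": 10},
    {"application_id": 3, "amount": 75, "application_date": "2014-01-03", "account_id": 11},
]
accounts, applications = plan_import(rows)
print(len(accounts), len(applications))  # 2 3
```

Keeping the uniqueness constraint in place as well would still guard against mistakes in the importer.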

efficiency of where clause in cypher vs match

I'm trying to find 10 posts that were not LIKED by user "mike" using Cypher. Will putting a WHERE clause with a NOT relationship be more efficient than matching with an optional relationship and then checking whether that relationship is null in the WHERE clause? Specifically, I want to make sure it won't do the equivalent of a full table scan and that this is a scalable query.
Here's what I'm using
START user=node:node_auto_index(uname="mike"),
posts=node:node_auto_index("postId:*")
WHERE not (user-[:LIKES]->posts)
RETURN posts SKIP 20 LIMIT 10;
Or can I do something where I filter on a MATCH optional relationship
START user=node:node_auto_index(uname="mike"),
posts=node:node_auto_index("postId:*")
MATCH user-[r?:LIKES]->posts
WHERE r IS NULL
RETURN posts SKIP 100 LIMIT 10;
Some quick tests on the console seem to show faster performance in the 2nd approach. Am I right to assume the 2nd query is faster? And, if so why?
I think that in the first query the engine runs through all the postId nodes and checks the condition not (user-[:LIKES]->posts) for each of them,
whereas in the second example (assuming you use at least v1.9.02) the engine picks up only the post nodes that aren't connected to the user. This is an optimization in which the engine does not go through all the postId nodes.
If possible, always use the MATCH clause in your queries instead of WHERE, and try to avoid the asterisk in declarations such as START n=node:index('name:*').

How do I provide multiple queries in Neo4j Cypher?

I want to use the results from the first query in the second query. I am not sure how to do this in Cypher?
Current code,
START user1=node:USER_INDEX(USER_INDEX = "userA")
MATCH user1-[r1:ACCESSED]->docid1<-[r2:ACCESSED]-user2, user2-[r3:ACCESSED]->docid2
WHERE r2.Topic=r3.Topic
RETURN distinct docid2.Label;
I want to have different conditions checked in the WHERE clause for the same docid2 set of nodes and accumulate the results and perform order by based on a date field.
I am not able to provide multiple match and return within the same transaction.
That is why I am trying to have two different Cypher scripts and combine them in a third query. Is this possible in Cypher?
Or is there any option to write custom functions and invoke them?
Do we have stored Cypher scripts like Stored Gremlin scripts?
As Michael mentioned in the comment, you can use the "with" statement to stream results into further statements. Unfortunately, you can't start another statement after the "where" clause. Multiple return statements would be kind of illogical, but you can do multiple things in a single query, e.g.:
START x=node:node_auto_index(key="x")
WITH count(x) AS exists
START y=node:node_auto_index(key="y")
WHERE exists = 0
CREATE (n {key:"y"})<-[:rel]-y
RETURN n, y
This would check if the "x" node exists and if it doesn't, proceed to create it and add a couple of parameters.
If you wish to do more sophisticated things on result sets, your best options are either batch requests or the Java API...

Resources