What is the difference between randomUUID and GraphAware UUID in Neo4j? - neo4j

I am currently using the GraphAware UUID library to generate UUIDs in Neo4j. I later found out that Neo4j has a built-in randomUUID() function for generating UUIDs. Which one should I use? Will randomUUID() create a unique ID on the server?

They both call java.util.UUID#randomUUID(), but that is where the similarity between them ends.
Cypher's built-in randomUUID() is a function that you have to call manually in each Cypher query where you want to assign a UUID.
The neo4j-uuid module is a set of extensions to Neo4j that transparently assigns UUIDs (or other types of IDs, depending on the configured ID generator) to nodes and relationships and ensures the IDs can't be changed or deleted. It also maintains an explicit index for the nodes and relationships. See the readme for the full feature set.
If your use case is simply to assign a UUID to (some) nodes or relationships, then use the built-in function. If you can take advantage of the other features of the neo4j-uuid module, use that.
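As a rough illustration of the kind of value both options end up producing: java.util.UUID#randomUUID() returns a random (version 4) UUID. The sketch below uses Python's uuid.uuid4(), which generates the same kind of value. It is a stand-in for illustration only, not the Neo4j or GraphAware code, and new_id is a hypothetical helper name.

```python
import uuid

def new_id() -> str:
    """Return a random (version 4) UUID as a 36-character string."""
    return str(uuid.uuid4())

# Generate a batch and inspect the properties a random UUID is expected to have.
ids = [new_id() for _ in range(1000)]
all_unique = len(set(ids)) == len(ids)           # collisions are astronomically unlikely
versions = {uuid.UUID(i).version for i in ids}   # every value reports version 4
```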

For manual use cases, where you create the UUID yourself in a Cypher query, they're functionally identical (GraphAware implemented this first, I think; we got to it later). Yes, in both cases the IDs will be created on the server and will be unique.
I believe GraphAware's UUID module covers more than just this: it automatically assigns UUIDs to newly created nodes and relationships and adds extra validation on top of that.

Related

Assumptions regarding Node ID strings in Neo4j - cypher

In my recent question, Modeling conditional relationships in neo4j v.2 (cypher), the answer has led me to another question regarding my data model and the Cypher syntax to represent it. Let's say that in my model there is a node CLT1 that I'll call the Source node. CLT1 has relationships to 286 other Target nodes. This is a model of a target node:
CREATE
(Abnormally_high:Label1:Label2:Label3:Label4:Label5:Label6:Label7:Label8:Label9:Label10
{Prop1:'x',Prop2:'y',Prop3:'z'})
Key point: I am assuming that the string after the CREATE clause is the ID of this target node. The ID is significant because its content has domain-specific meaning and is queryable. In this case it is the phrase "Abnormally_high".
I made this assumption based on the movie database example.
CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
CREATE (Carrie:Person {name:'Carrie-Anne Moss', born:1967})
The first strings after CREATE definitely have domain-specific meaning!
In my earlier post I discuss Problem 2. I find that Problem 2 arises because, among the 286 target nodes, there are many instances where at least one other Target node shares the identical ID. In this instance, the ID is "Abnormally_high". The other Target nodes may differ in the value of any of Label1 - Label10 or the associated properties.
Apparently, Cypher doesn't like that. In Problem 2, I was discussing ways to deal with the fact that Cypher doesn't like using the same node ID multiple times, even though the labels or properties are different.
My problem is my assumptions about the Target node ID.
Am I right?
I am now thinking that I could instead use this....
CREATE (CLT1_target_1:Label1:Label2:Label3:Label4:Label5:Label6:Label7:Label8:Label9:Label10
{name:'Abnormally_high',Prop2:'y',Prop3:'z'})
If indeed the first string after the CREATE clause is an ID, then all I have to do is use a unique target node identifier, like CLT1_target_1, and increment up to CLT1_target_286. If I do this, then I can have the name as a property and change whatever label or property I want.
Do I have this right?
You are wrong. In Cypher, a node name (like "Abnormally_high") is just a variable name that exists for the lifetime of the query (and sometimes not even that long). The node name used in a Cypher query is never persisted in any way, and can be any arbitrary string.
Also, in Neo4j, the term "ID" has a specific meaning. Neo4j automatically assigns a (currently) unique integer ID to each new node. You have no control over the ID value assigned to a node, and when a node is deleted, Neo4j can reassign its ID to a new node.
You should read the Neo4j manual (available at docs.neo4j.org), especially the section on Cypher, to get a better understanding.

Has anyone used Neo4j node IDs as foreign keys to other databases for large property sets?

I am building a large graph database that has a significant set of metadata about each node (thousands of properties per node). I am currently determining which metadata should become a node within Neo4j, which should become a property of a node, and which should be housed in a separate database.
My thinking is to use the metadata in 3 ways:
1 - If the property is shared between many nodes, make that property its own node and create an edge to it.
2 - If the property is important for traversing the graph, but not "highly" shared, add it as a node property (which could also be indexed within Neo4j if needed).
3 - If the metadata strictly describes that node, store it in a separate NoSQL database, with the Neo4j node ID becoming the foreign key to the other database.
While this seems like the most efficient use of the graph database, it seems like a pain to have different property types and to have to determine which type a property is before using it (likely via a property lookup key-value store). It would also likely mean that I would need an easy way to promote a property from 3 to 2 to 1 when it becomes highly shared or is needed for efficient traversal.
Has anyone taken this approach? Any thoughts to share, or things to avoid?
Never store a Neo4j node ID in an external system. The node ID is basically an offset in the respective store file. If you delete a node, its ID might be reused when new nodes are created.
The right approach is to have a "good" identifier (e.g. a UUID) as a node property and put that into Neo4j's index. That UUID is then safe to store in third-party systems.
Some time ago I created an unmanaged extension that adds a UUID to each new node and prevents manual changes to these UUIDs: https://github.com/sarmbruster/neo4j-uuid.
Update (2013-08-21)
I've blogged about UUIDs with Neo4j.
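The ID-reuse hazard described in this answer can be sketched with a toy in-memory store. TinyStore and its methods are hypothetical names invented for this illustration. The store only imitates the behaviour described above (freed internal IDs get handed out again) and is not how Neo4j is implemented.

```python
import uuid

class TinyStore:
    """Hypothetical in-memory store that recycles freed integer IDs."""

    def __init__(self):
        self.nodes = {}       # internal id -> properties
        self.free_ids = []    # ids released by deletions
        self.next_id = 0

    def create(self, **props):
        # Reuse a freed id if one exists, otherwise allocate a fresh one.
        if self.free_ids:
            node_id = self.free_ids.pop()
        else:
            node_id = self.next_id
            self.next_id += 1
        # Every node also gets a stable uuid property.
        self.nodes[node_id] = {"uuid": str(uuid.uuid4()), **props}
        return node_id

    def delete(self, node_id):
        del self.nodes[node_id]
        self.free_ids.append(node_id)

    def find_by_uuid(self, value):
        # Linear scan for brevity; a real database would use an index.
        return next((p for p in self.nodes.values() if p["uuid"] == value), None)

store = TinyStore()
alice_id = store.create(name="Alice")
alice_uuid = store.nodes[alice_id]["uuid"]   # the value an external system should keep

store.delete(alice_id)
bob_id = store.create(name="Bob")            # receives Alice's old internal id

stale_hit = store.nodes[alice_id]["name"]    # following the old internal id finds Bob
uuid_hit = store.find_by_uuid(alice_uuid)    # looking up Alice's uuid finds nothing
```

An external system that kept the internal ID now silently points at the wrong node, while one that kept the UUID correctly learns that the node is gone.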

optimistic concurrency with neo4j using the REST API

Is there any way to implement optimistic concurrency when updating and creating Neo4j nodes using the REST API? I'd like to create a user node with a unique name only if that name doesn't already exist. I don't want two users to accidentally overwrite each other if they pick the same username at the same time.
Additionally, I could have something like an incrementing version number on the node to check for concurrent changes. In SQL I would normally have an update with a where clause that checks the id and version number. Is there something similar I can do with Cypher that would be easy to implement and wouldn't require me to type all the property names out in a long query?
You could try a unique index: http://docs.neo4j.org/chunked/stable/rest-api-unique-indexes.html
Cypher "CREATE UNIQUE" syntax may also be a help: http://docs.neo4j.org/chunked/stable/query-create-unique.html
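The version-number check mentioned in the question can be sketched independently of Neo4j. The in-memory VersionedStore below is a hypothetical stand-in, not the REST API or Cypher. It only shows the compare-then-write logic, with whole property maps passed in so that no property names need to be spelled out individually.

```python
import threading

class VersionConflict(Exception):
    """Raised when the stored version no longer matches the caller's."""

class VersionedStore:
    """Hypothetical in-memory stand-in for a database with versioned records."""

    def __init__(self):
        self._records = {}               # key -> (version, properties)
        self._lock = threading.Lock()    # makes check-and-write atomic

    def create(self, key, props):
        with self._lock:
            if key in self._records:
                raise VersionConflict(f"{key!r} already exists")
            self._records[key] = (1, dict(props))

    def read(self, key):
        version, props = self._records[key]
        return version, dict(props)

    def update(self, key, expected_version, props):
        # Same idea as: UPDATE ... WHERE id = ? AND version = ?
        with self._lock:
            current_version, _ = self._records[key]
            if current_version != expected_version:
                raise VersionConflict(
                    f"expected v{expected_version}, found v{current_version}")
            self._records[key] = (current_version + 1, dict(props))
            return current_version + 1

store = VersionedStore()
store.create("alice", {"email": "a@example.com"})

# Two clients read the same version, then both try to write.
v_a, props_a = store.read("alice")
v_b, props_b = store.read("alice")
new_version = store.update("alice", v_a, {**props_a, "email": "new@example.com"})

try:
    store.update("alice", v_b, {**props_b, "email": "other@example.com"})
    second_write_succeeded = True
except VersionConflict:
    second_write_succeeded = False   # the stale writer must re-read and retry
```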

Unique value in neo4j Nodes

How can I specify that a value must be unique in Neo4j?
For example, I want to store user data in nodes, so the username should be unique. Is there a way to do this like in SQL (defining username as a unique property)?
For this you will need to use a node index and the uniqueness features available within the API to ensure that only one node is filed under each key-value pair. If you're working with Neo4j embedded then have a look at:
http://api.neo4j.org/1.8/org/neo4j/graphdb/index/Index.html#putIfAbsent(T, java.lang.String, java.lang.Object)
http://api.neo4j.org/1.8/org/neo4j/graphdb/index/UniqueFactory.html
For the REST interface, you may have uniqueness support already in the library that you are using or, if you are not using a library, this page should help:
http://docs.neo4j.org/chunked/milestone/rest-api-unique-indexes.html
As node structures are not enforced in the same way that record structures are enforced in most RDBMSs, there is no direct equivalent to the UNIQUE KEY feature that you mention. Index uniqueness should however give you the same end result.
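The put-if-absent idea behind index uniqueness can be sketched as follows. UniqueIndex and get_or_create_user are hypothetical names for this illustration. The class only mimics the semantics described above (at most one node filed under each key-value pair) and is not the Neo4j Index API.

```python
import threading

class UniqueIndex:
    """Hypothetical index that files at most one node under each key-value pair."""

    def __init__(self):
        self._entries = {}
        self._lock = threading.Lock()

    def put_if_absent(self, node, key, value):
        """Index `node` under (key, value) unless an entry already exists.

        Returns the previously indexed node, or None if `node` was added.
        """
        with self._lock:
            existing = self._entries.get((key, value))
            if existing is None:
                self._entries[(key, value)] = node
            return existing

def get_or_create_user(index, username):
    # Build a candidate node, then let the index decide whether it wins.
    candidate = {"username": username}
    existing = index.put_if_absent(candidate, "username", username)
    created = existing is None
    return (candidate if created else existing), created

index = UniqueIndex()
first, first_created = get_or_create_user(index, "neo")
second, second_created = get_or_create_user(index, "neo")   # returns the first node
```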
Hope this helps
Nige

Neo4j: possible to create an alternative to node ids based on integer increments?

Neo4j's node IDs tend to be assigned as incrementing integers. I can see this causing issues in an application that needs to merge multiple databases. Is it possible to configure the database to use another format, such as UUIDs, to identify each node?
What I have done before is set a property on each node to store a GUID and create a GUID index using the IndexService. I have then worked with that index to retrieve nodes based on the GUID rather than the internal Neo4j-generated IDs.
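The merge concern raised in the question can be illustrated with two toy databases. build_db is a hypothetical helper and the dictionaries merely stand in for databases. This is a sketch of why overlapping integer IDs collide on merge while GUID properties do not, not Neo4j code.

```python
import uuid

def build_db(names):
    """Hypothetical database: internal integer id -> node properties."""
    return {i: {"guid": str(uuid.uuid4()), "name": n} for i, n in enumerate(names)}

db_a = build_db(["Keanu", "Carrie"])
db_b = build_db(["Laurence", "Hugo"])

# Both databases use ids 0 and 1, so merging on them silently overwrites nodes.
merged_by_id = {**db_a, **db_b}

# Re-keying on the GUID property keeps every node.
merged_by_guid = {n["guid"]: n for db in (db_a, db_b) for n in db.values()}
names_by_guid = sorted(n["name"] for n in merged_by_guid.values())
```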
No, it's not.
[Stack Overflow requires 30 chars]
Here is a neo4j extension that adds uuid properties to each node.
https://github.com/sarmbruster/neo4j-uuid
A quote from the author on why you should use a UUID if you are dealing with multiple databases:
... node.getId() is a bad choice since after deletion of a node its id might be recycled.

Resources