Neo4j SDN 4 emulate sequence object(not UUID) - neo4j

Is it possible in Neo4j or SDN4 to create/emulate something similar to a PostgreSQL sequence database object?
I need this thread safe functionality in order to be able to ask it for next, unique Long value. I'm going to use this value as a surrogate key for my entities.
UPDATED
I don't want to go with UUID because I have to expose these IDs within my web application url parameters and in case of UUID my urls look awful. I want to go with a plain Long values for IDs like StackOverflow does, for example:
stackoverflow.com/questions/42228501/neo4j-sdn-4-emulate-sequence-objectnot-uuid

This can be done with user procedures and functions. As an example:
package sequence;
import org.neo4j.procedure.*;
import java.util.concurrent.atomic.AtomicInteger;
public class Next {
private static AtomicInteger sequence = new AtomicInteger(0);
#UserFunction
public synchronized Number next() {
return sequence.incrementAndGet();
}
}
The problem of this example is that when the server is restarted the counter will be set to zero.
So it is necessary to keep the last value of the counter. This can be done using these examples:
https://maxdemarzi.com/2015/03/25/triggers-in-neo4j/
https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/master/src/main/java/apoc/trigger/Trigger.java

No. As far as I'm aware there isn't any similar functionality to sequences or auto increment identifiers in Neo4j. This question has also been asked a few times in the past.
The APOC project might be worth checking out for this though. There seems to be a request to add it.

If your main interest is in having a way to generate unique IDs, and you do not care if the unique IDs are strings, then you should consider using the APOC facilities for generating UUIDs.
There is an APOC function that generates a UUID, apoc.create.uuid. In older versions of APOC, this is a procedure that must be invoked using the CALL syntax. For example, to create and return a single Foo node with a new UUID:
CREATE (f:Foo {uuid: apoc.create.uuid()})
RETURN f;
There is also an APOC procedure, apoc.create.uuids(count), that generates a specified number of UUIDs. For example, to create and return 5 Foo nodes with new UUIDs:
CALL apoc.create.uuids(5) YIELD uuid
CREATE (f:Foo {uuid: uuid})
RETURN f;

The most simplest way in Neo4j is to disable ids reuse and use node Graph ID like sequencer.
https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/
Table A.83. dbms.ids.reuse.types.override
Description: Specified names of id types (comma separated) that should be reused. Currently only 'node' and 'relationship' types are supported.
Valid values: dbms.ids.reuse.types.override is a list separated by "," where items are one of NODE, RELATIONSHIP
Default value: [RELATIONSHIP, NODE]

Related

Auto increment id Neo4j to retrieve elements in insert order

Recently, I am experimenting Neo4j. I like the idea but I am facing a problem that I have never faced with relational databases.
I want to perform these inserts and then return them exactly in the insertion order.
Insert elements:
create(p1:Person {name:"Marc"})
create(p2:Person {name:"John"})
create(p3:Person {name:"Paul"})
create(p4:Person {name:"Steve"})
create(p5:Person {name:"Andrew"})
create(p6:Person {name:"Alice"})
create(p7:Person {name:"Bob"})
While to return them:
match(p:Person) return p order by id(p)
I receive the elements in the following order:
Paul
Andrew
Marc
John
Steve
Alice
Bob
I note that these elements are not returned respecting the query insertion order (through the id function).
In fact the id of my elements are the following:
Marc: 18221
John: 18222
Paul: 18208
Steve: 18223
Andrew: 18209
Alice: 18224
Bob: 18225
How does the Neo4j id function work? I read that it generates an auto incremental id but it seems a little strange his mechanism. How do I return items respecting the query insertion order? I thought about creating a timestamp attribute for each node but I don't think it's the best choice
If you're looking to generate sequence numbers in Neo4j then you need to manage this yourself using a strategy that works best in your application.
In ours we maintain sequence numbers in key/value pair nodes where Scope is the application name given to the sequence number range, and Value is the last sequence number used. When we generate a node of a given type, such as Product, then we increment the sequence number and assign it to our new node.
MERGE (n:Sequence {Scope: 'Product'})
SET n.Value = COALESCE(n.Value, 0) + 1
WITH n.Value AS seq
CREATE (product:Product)
SET product.UniqueId = seq
With this you can create as many sequence numbers you need just by creating sequence nodes with unique scope names.
For more examples and tests see the AutoInc.Neo4j project https://github.com/neildobson-au/AutoInc/blob/master/src/AutoInc.Neo4j/Neo4jUniqueIdGenerator.cs
The id of Neo4j is maintained internally, which your business code should not depend on.
Generally it's auto incrementally, but if there is delete operation, you may reuse the deleted id according to the Reuse Policy of Neo4j Server.

Property values can only be of primitive types or arrays thereof in Neo4J Cypher query

I have the following params set:
:params "userId":"15229100-b20e-11e3-80d3-6150cb20a1b9",
"contextNames":[{"uid":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","name":"zhora"}],
"statements":[{"text":"oranges apples bananas","concepts":["orange","apple","banana"],
"mentions":[],"timestamp":15481867295710000,"name":"# banana","uid":"34232870-1e7f-11e9-8609-a7f6b478c007",
"uniqueconcepts":[{"name":"orange","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000},{"name":"apple","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000},{"name":"banana","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000}],"uniquementions":[]}],"timestamp":15481867295710000,"conceptsRelations":[{"from":"orange","to":"apple","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710000,"uid":"apoc.create.uuid()","gapscan":"2","weight":3},{"from":"apple","to":"banana","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710002,"uid":"apoc.create.uuid()","gapscan":"2","weight":3},{"from":"orange","to":"banana","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710002,"uid":"apoc.create.uuid()","gapscan":4,"weight":2}],"mentionsRelations":[]
Then when I make the following query:
MATCH (u:User {uid: $userId})
UNWIND $contextNames as contextName
MERGE (context:Context {name:contextName.name,by:u.uid,uid:contextName.uid})
ON CREATE SET context.timestamp=$timestamp
MERGE (context)-[:BY{timestamp:$timestamp}]->(u)
WITH u, context
UNWIND $statements as statement
CREATE (s:Statement {name:statement.name, text:statement.text, uid:statement.uid, timestamp:statement.timestamp})
CREATE (s)-[:BY {context:context.uid,timestamp:s.timestamp}]->(u)
CREATE (s)-[:IN {user:u.id,timestamp:s.timestamp}]->(context)
WITH u, s, context, statement
FOREACH (conceptName in statement.uniqueconcepts |
MERGE (c:Concept {name:conceptName}) ON CREATE SET c.uid=apoc.create.uuid()
CREATE (c)-[:BY {context:context.uid,timestamp:s.timestamp,statement:s.suid}]->(u)
CREATE (c)-[:OF {context:context.uid,user:u.uid,timestamp:s.timestamp}]->(s)
CREATE (c)-[:AT {user:u.uid,timestamp:s.timestamp,context:context.uid,statement:s.uid}]->(context) )
WITH u, s
UNWIND $conceptsRelations as conceptsRelation MATCH (c_from:Concept{name: conceptsRelation.from}) MATCH (c_to:Concept{name: conceptsRelation.to})
CREATE (c_from)-[:TO {context:conceptsRelation.context,statement:conceptsRelation.statement,user:u.uid,timestamp:conceptsRelation.timestamp, uid:apoc.create.uuid(), gapscan:conceptsRelation.gapscan, weight: conceptsRelation.weight}]->(c_to)
RETURN DISTINCT s.uid
But when I run it, I get this error:
Neo.ClientError.Statement.TypeError
Property values can only be of primitive types or arrays thereof
Anybody knows why it's coming up? My params seem to be set correctly, I didn't see they couldn't be used in this way... Thanks!
Looks like the problem is here:
...
FOREACH (conceptName in statement.uniqueconcepts |
MERGE (c:Concept {name:conceptName})
...
uniqueconcepts in your parameter is a list of objects, not a list of strings, so when attempting to MERGE conceptName, it errors out as conceptName isn't a primitive type (or array or primitive types). I think you'll want to use uniqueConcept instead of conceptName, and in your MERGE use name:uniqueConcept.name. Check for other usages of the elements of statement.uniqueconcepts.
This answer is for other n00bs like me that are trying to put a composite datatype into a property without reading the friendly manual, and get the error above. Google points here, so I felt appropriate to add this answer.
Specifically, I wanted to store a list [(datetime, event), ...] of tuples into a property of a relation.
Potential encountered errors are:
Neo.ClientError.Statement.TypeError: Property values can only be of primitive types or arrays thereof
Neo.ClientError.Statement.TypeError: Neo4j only supports a subset of Cypher types for storage as singleton or array properties. Please refer to section cypher/syntax/values of the manual for more details.
The bottom line is well summarized in this forum post by a Neo4j staff member:
Neo4j doesn't allow maps as properties (no sub-properties allowed, basically), and though lists are allowed as properties, they cannot be lists of maps (or lists of lists for that matter).
Basically I was trying to bypass the natural functionality of the DB. There seem to be 2 workarounds:
Dig your heels in as suggested here, and store the property as e.g. a JSON string
Rethink the design, and model these kind of properties into the graph (i.e. being more specific with the nodes)
After a little rethinking I came up with a much simpler data model that didn't require composite properties in relations. Although option 1 may have its uses, when we have to insist against a well-designed system (which neo4j is), that is usually an indicator that we should change course.
Andres

Cypher query with literal map syntax & dynamic keys

I'd like to make a cypher query that generates a specific json output. Part of this output includes an object with a dynamic amount of keys relative to the children of a parent node:
{
...
"parent_keystring" : {
child_node_one.name : child_node_one.foo
child_node_two.name : child_node_two.foo
child_node_three.name : child_node_three.foo
child_node_four.name : child_node_four.foo
child_node_five.name : child_node_five.foo
}
}
I've tried to create a cypher query but I do not believe I am close to achieving the desired output mentioned above:
MATCH (n)-[relone:SPECIFIC_RELATIONSHIP]->(child_node)
WHERE n.id='839930493049039430'
RETURN n.id AS id,
n.name AS name,
labels(n)[0] AS type,
{
COLLECT({
child.name : children.foo
}) AS rel_two_representation
} AS parent_keystring
I had planned for children.foo to be a count of how many occurrences of each particular relationship/child of the parent. Is there a way to make use of the reduce function? Where a report would generate based on analyzing the array proposed below? ie report would be a json object where each key is a distinct RELATIONSHIP and the property value would be the amount of times that relationship stems from the parent node?
Thank you greatly in advance for guidance you can offer.
I'm not sure that Cypher will let you use a variable to determine an object's key. Would using an Array work for you?
COLLECT([child.name, children.foo]) AS rel_two_representation
I think, Neo4j Server API output by itself should be considered as any database output (like MySQL). Even if it is possible to achieve, with default functionality, desired output - it is not natural way for database.
Probably you should look into creating your own server plugin. This allows you to implement any custom logic, with desired output.

Adding index on neo4j node property value

I have imported freebase dump to neo4j. But currently i am facing issue with get queries because of size of db. While import i just created node index and indexed URI property to index for each node. For each node i am adding multiple properties like label_en, type_content_type_en.
props.put(URI_PROPERTY, subject.stringValue());
Long subjectNode = db.createNode(props);
tmpIndex.put(subject.stringValue(), subjectNode);
nodeIndex.add(subjectNode, props);
Now my cypher queries are like this. Which are timing out. I am unable to add index on label_en property. Can anybody help?
match (n)-[r*0..1]->(a) where n.label_en=~'Hibernate.*' return n, a
Update
BatchInserter db = BatchInserters.inserter("ttl.db", config);
BatchInserterIndexProvider indexProvider = new LuceneBatchInserterIndexProvider(db);
BatchInserterIndex index = indexProvider.nodeIndex("ttlIndex", MapUtil.stringMap("type", "exact"));
Question: When i have added node in nodeindex i have added with property URI
props.put(URI_PROPERTY, subject.stringValue());
Long subjectNode = db.createNode(props);
nodeIndex.add(subjectNode, props);
Later in code i have added another property to node(Named as label_en). But I have not added or updated nodeindex. So as per my understanding lucene does not have label_en property indexed. My graph is already built so i am trying to add index on label_en property of my node because my query is on label_en.
Your code sample is missing how you created your index. But I'm pretty sure what you're doing is using a legacy index, which is based on Apache Lucene.
Your Cypher query is using the regex operator =~. That's not how you use a legacy index; this seems to be forcing cypher to ignore the legacy index, and have the java layer run that regex on every possible value of the label_en property.
Instead, with Cypher you should use a START clause and use the legacy indexing query language.
For you, that would look something like this:
START n=node:my_index_name("label_en:Hibernate.*")
MATCH (n)-[r*0..1]->(a)
RETURN n, a;
Notice the string label_en:Hibernate.* - that's a Lucene query string that says to check that property name for that particular string. Cypher/neo4j is not interpreting that; it's passing it through to Lucene.
Your code didn't provide the name of your index. You'll have to change my_index_name above to whatever you named it when you created the legacy index.

Node identifiers in neo4j

I'm new to Neo4j - just started playing with it yesterday evening.
I've notice all nodes are identified by an auto-incremented integer that is generated during node creation - is this always the case?
My dataset has natural string keys so I'd like to avoid having to map between the Neo4j assigned ids and my own. Is it possible to use string identifiers instead?
Think of the node-id as an implementation detail (like the rowid of relational databases, can be used to identify nodes but should not be relied on to be never reused).
You would add your natural keys as properties to the node and then index your nodes with the natural key (or enable auto-indexing for them).
E..g in the Java API:
Index<Node> idIndex = db.index().forNodes("identifiers");
Node n = db.createNode();
n.setProperty("id", "my-natural-key");
idIndex.add(n, "id",n.getProperty("id"));
// later
Node n = idIndex.get("id","my-natural-key").getSingle(); // node or null
With auto-indexer you would enable auto-indexing for your "id" field.
// via configuration
GraphDatabaseService db = new EmbeddedGraphDatabase("path/to/db",
MapUtils.stringMap(
Config.NODE_KEYS_INDEXABLE, "id", Config.NODE_AUTO_INDEXING, "true" ));
// programmatic (not persistent)
db.index().getNodeAutoIndexer().startAutoIndexingProperty( "id" );
// Nodes with property "id" will be automatically indexed at tx-commit
Node n = db.createNode();
n.setProperty("id", "my-natural-key");
// Usage
ReadableIndex<Node> autoIndex = db.index().getNodeAutoIndexer().getAutoIndex();
Node n = autoIndex.get("id","my-natural-key").getSingle();
See: http://docs.neo4j.org/chunked/milestone/auto-indexing.html
And: http://docs.neo4j.org/chunked/milestone/indexing.html
This should help:
Create the index to back automatic indexing during batch import We
know that if auto indexing is enabled in neo4j.properties, each node
that is created will be added to an index named node_auto_index. Now,
here’s the cool bit. If we add the original manual index (at the time
of batch import) and name it as node_auto_index and enable auto
indexing in neo4j, then the batch-inserted nodes will appear as if
auto-indexed. And from there on each time you create a node, the node
will get indexed as well.**
Source : Identifying nodes with Custom Keys
According Neo docs there should be automatic indexes in place
http://neo4j.com/docs/stable/query-schema-index.html
but there's still a lot of limitations
Beyond all answers still neo4j creates its own ids to work faster and serve better. Please make sure internal system does not conflict between ids then it will create nodes with same properties and shows in the system as empty nodes.
the ID's generated are default and cant be modified by users. user can use your string identifiers as a property for that node.

Resources