Adding index on neo4j node property value

I have imported a Freebase dump into Neo4j, but my read queries are now failing because of the size of the database. During the import I created a single node index and indexed only the URI property of each node. Each node also gets several other properties, such as label_en and type_content_type_en.
props.put(URI_PROPERTY, subject.stringValue());
Long subjectNode = db.createNode(props);
tmpIndex.put(subject.stringValue(), subjectNode);
nodeIndex.add(subjectNode, props);
My Cypher queries look like the one below, and they time out. I have not been able to add an index on the label_en property. Can anybody help?
match (n)-[r*0..1]->(a) where n.label_en=~'Hibernate.*' return n, a
Update
BatchInserter db = BatchInserters.inserter("ttl.db", config);
BatchInserterIndexProvider indexProvider = new LuceneBatchInserterIndexProvider(db);
BatchInserterIndex index = indexProvider.nodeIndex("ttlIndex", MapUtil.stringMap("type", "exact"));
Question: when I added each node to the node index, I added it with only the URI property:
props.put(URI_PROPERTY, subject.stringValue());
Long subjectNode = db.createNode(props);
nodeIndex.add(subjectNode, props);
Later in the code I added another property, label_en, to the node, but I did not add it to the node index or update the index. So, as I understand it, Lucene has not indexed label_en. My graph is already built, so I am trying to add an index on the label_en property after the fact, because my query filters on label_en.

Your code sample is missing how you created your index. But I'm pretty sure what you're doing is using a legacy index, which is based on Apache Lucene.
Your Cypher query is using the regex operator =~. That's not how you use a legacy index; this seems to be forcing Cypher to ignore the legacy index and have the Java layer run that regex against every value of the label_en property.
Instead, with Cypher you should use a START clause and use the legacy indexing query language.
For you, that would look something like this:
START n=node:my_index_name("label_en:Hibernate*")
MATCH (n)-[r*0..1]->(a)
RETURN n, a;
Notice the string label_en:Hibernate* - that's a Lucene query string that says to check that property name for values starting with that particular string. Lucene's query syntax uses * as its wildcard rather than the regex .*. Cypher/Neo4j is not interpreting the string; it's passing it through to Lucene.
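Your original code didn't provide the name of your index, so change my_index_name above to whatever you named it when you created the legacy index; going by the Update, that is probably ttlIndex. Note that the lookup can only find nodes whose label_en value was actually added to that index.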
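If the search text comes from user input, it may help to build the Lucene query string in code instead of typing it into the Cypher statement. The following is only a sketch: LuceneQueries is a hypothetical helper, not part of Neo4j or Lucene, and the escaping is hand-rolled from the characters that Lucene's classic query syntax treats as special (whitespace is escaped as well, on the assumption that the value is indexed as a single exact term).

```java
/** Hypothetical helper: builds Lucene query strings for a legacy index lookup. */
final class LuceneQueries {
    // Characters that Lucene's classic query syntax treats as special.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    private LuceneQueries() {}

    /** Backslash-escapes special characters and whitespace in a literal term. */
    static String escape(String term) {
        StringBuilder out = new StringBuilder();
        for (char c : term.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    /** Builds a prefix query: the escaped prefix followed by Lucene's * wildcard. */
    static String prefixQuery(String field, String prefix) {
        return field + ":" + escape(prefix) + "*";
    }

    public static void main(String[] args) {
        // prints label_en:Hibernate*
        System.out.println(prefixQuery("label_en", "Hibernate"));
    }
}
```

The resulting string would then be passed as a query parameter, along the lines of START n=node:my_index_name({q}), rather than concatenated into the Cypher text.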

Related

Neo4j SDN 4 emulate sequence object(not UUID)

Is it possible in Neo4j or SDN4 to create/emulate something similar to a PostgreSQL sequence database object?
I need this thread safe functionality in order to be able to ask it for next, unique Long value. I'm going to use this value as a surrogate key for my entities.
UPDATED
I don't want to go with UUIDs because I have to expose these IDs in my web application's URL parameters, and with UUIDs my URLs look awful. I want to use plain Long values for IDs, like StackOverflow does, for example:
stackoverflow.com/questions/42228501/neo4j-sdn-4-emulate-sequence-objectnot-uuid

This can be done with user procedures and functions. As an example:
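package sequence;

import org.neo4j.procedure.*;
import java.util.concurrent.atomic.AtomicLong;

public class Next {
    // In-memory counter shared by all calls; a long, since the IDs are meant to be Long values.
    private static final AtomicLong sequence = new AtomicLong(0);

    @UserFunction
    public Number next() {
        // incrementAndGet is atomic, so no additional synchronization is needed.
        return sequence.incrementAndGet();
    }
}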
The problem with this example is that the counter is reset to zero when the server restarts.
So the last value of the counter has to be persisted. This can be done along the lines of these examples:
https://maxdemarzi.com/2015/03/25/triggers-in-neo4j/
https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/master/src/main/java/apoc/trigger/Trigger.java
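To illustrate just the "keep the last value" part, here is a minimal sketch that stores the counter in a plain file. FileBackedSequence and the file location are made up for the illustration; inside Neo4j you would more likely keep the value on a node and update it under a lock, and this sketch only protects against concurrent calls within a single JVM.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: a counter that survives restarts by writing its value to a file. */
final class FileBackedSequence {
    private final Path file;
    private long current;

    FileBackedSequence(Path file) {
        this.file = file;
        try {
            // Resume from the stored value if there is one, otherwise start at zero.
            byte[] bytes = Files.exists(file) ? Files.readAllBytes(file) : new byte[0];
            String stored = new String(bytes, StandardCharsets.UTF_8).trim();
            this.current = stored.isEmpty() ? 0L : Long.parseLong(stored);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the next value, writing it to the file before handing it out. */
    synchronized long next() {
        long candidate = current + 1;
        try {
            Files.write(file, Long.toString(candidate).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        current = candidate;
        return candidate;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sequence", ".txt");
        FileBackedSequence first = new FileBackedSequence(file);
        first.next();
        first.next();
        // A new instance stands in for a restarted server.
        FileBackedSequence afterRestart = new FileBackedSequence(file);
        System.out.println(afterRestart.next()); // prints 3
    }
}
```

Writing the value before returning it means a crash can at worst skip a number, never hand out the same one twice.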

No. As far as I'm aware there isn't any similar functionality to sequences or auto increment identifiers in Neo4j. This question has also been asked a few times in the past.
The APOC project might be worth checking out for this though. There seems to be a request to add it.

If your main interest is in having a way to generate unique IDs, and you do not care if the unique IDs are strings, then you should consider using the APOC facilities for generating UUIDs.
There is an APOC function that generates a UUID, apoc.create.uuid. In older versions of APOC, this is a procedure that must be invoked using the CALL syntax. For example, to create and return a single Foo node with a new UUID:
CREATE (f:Foo {uuid: apoc.create.uuid()})
RETURN f;
There is also an APOC procedure, apoc.create.uuids(count), that generates a specified number of UUIDs. For example, to create and return 5 Foo nodes with new UUIDs:
CALL apoc.create.uuids(5) YIELD uuid
CREATE (f:Foo {uuid: uuid})
RETURN f;

The simplest way in Neo4j is to disable ID reuse and use the node's graph ID as a sequence.
https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/
Table A.83. dbms.ids.reuse.types.override
Description: Specified names of id types (comma separated) that should be reused. Currently only 'node' and 'relationship' types are supported.
Valid values: dbms.ids.reuse.types.override is a list separated by "," where items are one of NODE, RELATIONSHIP
Default value: [RELATIONSHIP, NODE]

How do I use spring-data-neo4j with spatial indexes and cypher?

I want to use a spatial index in Neo4j with the spring-data-neo4j framework. Additionally I want to query the index using cypher. The database is embedded.
I'm a bit clueless as to how to get it to all wire together.
With a domain object like this,
@NodeEntity
class Junction {
    @GraphId Long id;
    @Indexed(indexType = IndexType.POINT, indexName = "junctionLocations") Point wkt;
}
SDN ought to be maintaining the index for me. It appears so, as I can do spatial queries using the repository:
interface JunctionGraph extends GraphRepository<Junction>, SpatialRepository<Junction> {}
with
junctionGraph.findWithinBoundingBox("junctionLocations", new Box(lowerBound.point, upperBound.point))
However, I understand that this spatial index configuration won't work for querying the index using Cypher (via a @Query in the repository). I think that this is because each node (or at least a proxy for the node) needs to be manually added to the spatial index. That means that adding this to JunctionGraph:
@Query("START n=node:junctionLocations('withinDistance:[{0}, {1}, {2}]') MATCH n-[*]->(i:Item) return i")
Collection<Item> getItemsWithin(double lat, double lon, double radius)
doesn't work.
Does anyone have a working recipe? It appears to be a bit black magic to me, and I'm unsure what the best way to proceed within SDN is.

It works; you just have to build the whole query string outside the annotation and pass it in as a parameter, because placeholders inside string constants are not substituted.
@Query("START n=node:junctionLocations({0}) MATCH n-[*]->(i:Item) return i")
Collection<Item> getItemsWithin(String query)
You have to do the replacement yourself, e.g. using String.format:
String.format("withinDistance:[%f, %f, %f]",lat,lon,radius)
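One thing to watch with String.format is that %f follows the JVM's default locale, so on a machine with, say, a German locale the numbers come out with a decimal comma and the index query breaks. A small sketch of a locale-independent version (SpatialQueries is a hypothetical helper name, not part of SDN):

```java
import java.util.Locale;

/** Hypothetical helper that builds the withinDistance query string. */
final class SpatialQueries {
    private SpatialQueries() {}

    static String withinDistance(double lat, double lon, double radius) {
        // Locale.ROOT keeps '.' as the decimal separator regardless of the default locale.
        return String.format(Locale.ROOT, "withinDistance:[%f, %f, %f]", lat, lon, radius);
    }

    public static void main(String[] args) {
        // prints withinDistance:[52.500000, 13.400000, 10.000000]
        System.out.println(withinDistance(52.5, 13.4, 10.0));
    }
}
```

The result would then be passed to the repository method, e.g. junctionGraph.getItemsWithin(SpatialQueries.withinDistance(lat, lon, radius)).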

how to get solr results in given order specified in query

I have built a query to submit to Solr in the following format:
id:95154 OR id:68209 OR id:89482 OR id:94233 OR id:112481 OR id:93843
I want to get the records back in the order given in the query: the document with id 95154 first, then id 68209, and so on. That is not happening right now; the last id, 93843, comes back first, and sometimes the order is random. I am using Solr in Grails 2.1, and my Solr version is 1.4.0. Here is a sample of how I am getting documents from Solr:
def server = solrService.getServer('provider')
SolrQuery sponsorSolrQuery = new SolrQuery(solarQuery)
def queryResponse = server.query(sponsorSolrQuery);
documentsList = queryResponse.getResults()

As @injecteer mentions, there is nothing built-in to Lucene to consider the sequence of clauses in a boolean query, but:
You are able to apply boosts to each term, and as long as the field is a basic field (meaning, not a TextField), the boosts will apply cleanly to give you a decent sort by score.
id:95154^6 OR id:68209^5 OR id:89482^4 OR id:94233^3 OR id:112481^2 OR id:93843
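If the id list is dynamic, the boosts can be generated rather than typed by hand. A rough sketch, with BoostedIdQuery as a made-up helper name; it gives the first id the highest boost and leaves the last one unboosted, matching the query above:

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: turns an ordered id list into a boosted OR query. */
final class BoostedIdQuery {
    private BoostedIdQuery() {}

    static String build(List<String> ids) {
        StringBuilder query = new StringBuilder();
        int n = ids.size();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                query.append(" OR ");
            }
            query.append("id:").append(ids.get(i));
            // Earlier ids get larger boosts; the last id keeps the default boost.
            int boost = n - i;
            if (boost > 1) {
                query.append('^').append(boost);
            }
        }
        return query.toString();
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("95154", "68209", "89482", "94233", "112481", "93843");
        // prints id:95154^6 OR id:68209^5 OR id:89482^4 OR id:94233^3 OR id:112481^2 OR id:93843
        System.out.println(build(ids));
    }
}
```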

There's no such thing in Lucene (and, I strongly assume, not in Solr either). In Lucene you can sort the results based on the contents of the documents' fields, but not on the order of clauses in a query.
That means that you have to sort the results yourself:
documentsList = queryResponse.getResults()
def sortedByIdOrder = solarQueryAsList.collect{ id -> documentsList.find{ it.id == id } }
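The same idea can be written in plain Java. This is a sketch rather than the code from the answer: RequestedOrder is a hypothetical helper, it uses a map lookup instead of scanning the result list once per id, and it skips ids that Solr did not return instead of leaving nulls in the list. With SolrJ you would presumably pass something like doc -> doc.getFieldValue("id") as the id extractor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: reorders search results to follow the order of the requested ids. */
final class RequestedOrder {
    private RequestedOrder() {}

    static <T, K> List<T> reorder(List<K> requestedIds, List<T> docs, Function<T, K> idOf) {
        // Index the returned documents by id for constant-time lookup.
        Map<K, T> byId = new HashMap<>();
        for (T doc : docs) {
            byId.put(idOf.apply(doc), doc);
        }
        List<T> ordered = new ArrayList<>();
        for (K id : requestedIds) {
            T doc = byId.get(id);
            if (doc != null) { // skip ids that were not returned
                ordered.add(doc);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList("95154", "68209", "93843");
        List<String> returned = Arrays.asList("93843", "95154", "68209");
        // prints [95154, 68209, 93843]
        System.out.println(reorder(requested, returned, Function.<String>identity()));
    }
}
```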

How do I use an index that has dots in the name?

I'm just starting to learn the Cypher query language and GraphDb in general. I've created some indexes using the class name of my nodes like:
"com.acme.node.SomeNodeType"
I can't for the life of me figure out how to reference this index in Cypher. I found this thread but using ` didn't work for me.
So I guess I have 2 questions:
Is it possible to use an index with dots in the name?
If so, how do I specify the name in the query?

Can you try quoting the index name with backticks, like this?
start n = node:`my.index`('name:test') return n

Node identifiers in neo4j

I'm new to Neo4j - just started playing with it yesterday evening.
I've noticed that all nodes are identified by an auto-incremented integer that is generated during node creation - is this always the case?
My dataset has natural string keys so I'd like to avoid having to map between the Neo4j assigned ids and my own. Is it possible to use string identifiers instead?

Think of the node-id as an implementation detail (like the rowid of relational databases, can be used to identify nodes but should not be relied on to be never reused).
You would add your natural keys as properties to the node and then index your nodes with the natural key (or enable auto-indexing for them).
E.g. in the Java API:
Index<Node> idIndex = db.index().forNodes("identifiers");
Node n = db.createNode();
n.setProperty("id", "my-natural-key");
idIndex.add(n, "id", n.getProperty("id"));
// later
Node found = idIndex.get("id", "my-natural-key").getSingle(); // node or null
With auto-indexer you would enable auto-indexing for your "id" field.
// via configuration
GraphDatabaseService db = new EmbeddedGraphDatabase("path/to/db",
    MapUtil.stringMap(
        Config.NODE_KEYS_INDEXABLE, "id", Config.NODE_AUTO_INDEXING, "true"));
// programmatic (not persistent)
db.index().getNodeAutoIndexer().startAutoIndexingProperty("id");
// Nodes with property "id" will be automatically indexed at tx-commit
Node n = db.createNode();
n.setProperty("id", "my-natural-key");
// Usage
ReadableIndex<Node> autoIndex = db.index().getNodeAutoIndexer().getAutoIndex();
Node found = autoIndex.get("id", "my-natural-key").getSingle();
See: http://docs.neo4j.org/chunked/milestone/auto-indexing.html
And: http://docs.neo4j.org/chunked/milestone/indexing.html

This should help:
Create the index to back automatic indexing during batch import.
We know that if auto indexing is enabled in neo4j.properties, each node that is created will be added to an index named node_auto_index. Now, here's the cool bit. If we add the original manual index (at the time of batch import) and name it as node_auto_index and enable auto indexing in neo4j, then the batch-inserted nodes will appear as if auto-indexed. And from there on each time you create a node, the node will get indexed as well.
Source : Identifying nodes with Custom Keys

According to the Neo4j docs, there should be automatic indexes in place:
http://neo4j.com/docs/stable/query-schema-index.html
but there are still a lot of limitations.

Beyond the other answers: Neo4j still creates its own IDs so that it can work faster. Make sure your own identifiers do not conflict with the internal ones; otherwise it will create nodes with the same properties, which show up in the system as empty nodes.

The generated IDs are assigned by default and can't be modified by users. You can store your string identifiers as a property on the node.
