Should we update indexes after node update in neo4jphp? - neo4j

According to this manual https://github.com/jadell/neo4jphp/wiki/Indexes we should worry about adding and removing nodes to indexes by ourselves.
OK, I'm adding nodes to indexes after creating them. But should I also update the indexes when I change some of the node's properties?

Neo4j has two indexing systems: The Legacy Indexes and Indexes.
Legacy indexes
This is a stand-alone indexing service that Neo4j ships with, and it gives you very little for free: it does not keep up to date with changes you make to the graph, other than lazily removing items that you've deleted from the graph.
If you want something in a legacy index, you must manually put it in there, and if you want it to reflect a change in the graph, you must manually update the index.
The sole reason these indexes remain, other than for backwards compatibility, is that they support complex indexes like geo-spatial indexing and rich full text indexing functionality. These are not yet supported by the new Indexes.
Read more about legacy indexes here: http://docs.neo4j.org/chunked/stable/indexing.html
Indexes
These were added in 2.0.0 and work the same way indexes do in relational databases: they are an optimization that you can introduce, and they are automatically kept in sync with the "primary" data, in our case with changes to the graph.
An Index is defined on a combination of a Label and a Property Key, and subsequent lookups on that Label/Property key combination will (if the query planner determines this is the most efficient thing to do) use that index.
Read more about indexes here: http://docs.neo4j.org/chunked/stable/graphdb-neo4j-schema.html
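As a minimal sketch of what "automatically kept in sync" means in practice, assume a hypothetical :User label with an email property (neither appears in the question). In the Cypher syntax of Neo4j 2.x/3.x, the index is declared once and later property updates need no index bookkeeping:

```cypher
// Hypothetical label and property, for illustration only.
// Declare the index once; Neo4j populates it in the background.
CREATE INDEX ON :User(email);

// A later property change requires no manual index maintenance.
MATCH (u:User {email: 'old@example.com'})
SET u.email = 'new@example.com';

// The planner may choose the index for this lookup.
MATCH (u:User {email: 'new@example.com'})
RETURN u;
```

Each statement is meant to be run separately. With a legacy index, the SET step would have to be accompanied by a manual remove and re-add of the index entry.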

If you are using legacy indexes (described by @jakewins), unless you have auto-indexing turned on for the fields being indexed, yes, you must manually remove and re-add the nodes when the property values change.

Related

Indices in Neo4j - questions and doubts

The only indices I know about are indices on properties (created for particular labels, i.e. node types). I have some doubts, however.
Are there indices on edges/relationships?
I often read that Neo4j leverages a Lucene index. Is it still used? What is its purpose?
Are there any indices other than indices on properties?
Thanks in advance,
Neo4j has two indexing systems.
The more modern one is referred to as "schema indexes", and these are the ones that are automatic and apply to properties of a given label for quick lookup by those properties when the given properties and label are provided within a query. This does not currently support indexing of relationship properties. These started out based on lucene, but we've gradually replaced the implementation with our own native indexing solution. Discussion of these, as well as any noteworthy information and limitations, can be found in our index configuration documentation.
The other indexing system is an older manual system called "explicit indexes", previously known as "manual indexes". This is also based on lucene, but these are not automatic -- it is up to the user to manually add or remove entries to the index and keep them up to date when data in the database changes. This makes usage and maintenance cumbersome, and we recommend avoiding this system if possible.
Built-in procedures are the means to create and lookup using explicit indexes, as these are never used automatically under the hood (as opposed to schema indexes). APOC Procedures also offers various means of interfacing with explicit indexes.
The main reason one would use explicit indexes is because you are able to create an index on relationships for properties and get fast lookup when querying the index. This also allows for a full text lookup across multiple labels and properties, provided the index has been configured in such a way.
Separate from all of these, it should be noted that usage of labels is itself a kind of index, as it provides quick access to all nodes with the given label.

Is this syntax not right for executing an APOC query?

call apoc.index.nodes('Product', 'name:iPhone*') yield node return node
In my graph I have 'iPhone X' and 'iPhone Plus', but this query doesn't return anything. I also have an index on 'name' property of Product.
Indexes
ON :Product(name) ONLINE
apoc.index.nodes is one of the APOC procedures for "manual indexes", which are also confusingly referred to in various docs as "legacy indexes" and "explicit indexes". Such indexes use the Apache Lucene library and are NOT the same as the standard neo4j indexes that most people use, and the way you create/update/use such indexes is also not standard.
For example, you cannot create a "manual index" via a Cypher CREATE INDEX clause. And neo4j Browser's :schema command will not show any manual indexes.
If you will only be searching :Product(name) via manual indexes, then you should drop your standard index for :Product(name), since it will not be needed but will add overhead (time and space) to your DB.
One way to create/update/use manual indexes is through the special APOC procedures. The APOC documentation for manual indexes provides a good amount of information about how to add nodes and relationships to such indexes, and how to search using them.
As an example, before you can use the query in your question, you first have to add all the :Product(name) values to the Product manual index. If you want to add them all at once, you can use the following query (and since it has to return something, it just returns a count of the number of Products):
MATCH (p:Product)
CALL apoc.index.addNode(p, ['name'])
RETURN count(*)
[UPDATED]
Manual indexing is typically only used for partial and fuzzy text search use cases. When you just need exact value matching, standard indexes are recommended, especially since they require much less effort on your part. The reason manual indexes are called "manual" is because the responsibility for maintaining them falls entirely on your shoulders. That is, your node/relationship/property addition/removal/update queries would normally have to add/remove/update any relevant manual index entries as well. Note that when you update a property that is manually indexed, you have to remove the old index entry and then add the new entry.
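The remove-then-add step for an updated property could look roughly like the sketch below. It assumes the APOC 3.x manual-index procedures apoc.index.removeNodeByName and apoc.index.addNode are installed and that both are void procedures (so no YIELD is needed); the new name value is made up for illustration.

```cypher
// Sketch only: assumes APOC 3.x manual-index procedures are available.
MATCH (p:Product {name: 'iPhone X'})
// Drop the stale entry from the 'Product' manual index.
CALL apoc.index.removeNodeByName('Product', p)
// Change the property ('iPhone XS' is a hypothetical new value).
SET p.name = 'iPhone XS'
WITH p
// Re-add the node so the index reflects the new name.
CALL apoc.index.addNode(p, ['name'])
RETURN p.name
```

Doing all three steps in one query keeps the index update in the same transaction as the property change.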

Neo4j index on relationship property

Is it possible in Neo4j to create an index on a relationship property? Right now I am facing very poor performance on comparison/filtering operations on relationship property values. This is an example of my issue: Neo4j Cypher count query performance optimizaztion
In neo4j 3.3.x, there are now built-in procedures for explicit indexes, which include the ability to create "explicit" indexes for relationships.
"Explicit" indexes are not the same as the normal "schema" indexes that you are already aware of (which are automatically maintained for you once you create an index or uniqueness constraint). They are called "explicit" because you have to write code to add nodes or relationships to such indexes, and you also have to write code to get nodes or relationships from such indexes. But, it might be worth the effort in some cases.
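A rough sketch of what that code might look like with the built-in db.index.explicit.* procedures of Neo4j 3.3+ follows. The relationship type :RATED, the string property status, and the index name 'ratings' are hypothetical and not taken from the question.

```cypher
// Get or create a relationship explicit index (the name is hypothetical).
CALL db.index.explicit.forRelationships('ratings');

// Add existing relationships to the index, keyed by a string property.
MATCH ()-[r:RATED]->()
WHERE exists(r.status)
CALL db.index.explicit.addRelationship('ratings', r, 'status', r.status)
YIELD success
RETURN count(*);

// Look relationships up through the index with a Lucene query string.
CALL db.index.explicit.searchRelationships('ratings', 'status:active')
YIELD relationship
RETURN relationship;
```

Each statement is meant to be run separately. Entries are not updated when r.status changes, so write queries have to remove and re-add them explicitly.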

Embedded automatic full text indexing completely removed from Neo4j as of 3.0.0?

I'm moving from Neo4j 2.2.* to (still prerelease) 3.0.0 and all of a sudden it seems that configuration parameters
node_auto_indexing=true
relationship_auto_indexing=true
node_keys_indexable=some_node_property
relationship_keys_indexable=some_rel_property
are gone and no longer available. This is sad because I need full-text indexing (namely, fuzzy search queries and range searches). I had been happily using it since 2.0.0 and naively hoped that the new Lucene 5.5 would make my life better with 3.0.0.
Is this functionality completely removed? START clause is still here in Cypher, neo4j-shell still has command which allows manipulating "legacy" FT indices so my question is:
how do I populate my FT index without using Java or another external programming language?
case 1: I import some bunch of "static" data into the graph which will rarely be updated (consider dictionary) and need to arrange FTS on those once, and manually perform complete reindex on occasional updates of the dataset;
case 2: nodes and relationships with specific properties automagically get indexed upon creation or upon assignment of a new value to the property with specific name, near-realtime, as it used to be before.
New schema indexes are cool in 3.0.0 and range searches are implemented, but a) they work only on properties of nodes, no relationships, b) they don't allow full-text, fuzzy queries, and AFAIK regular expression matching does not use index.
Thanks for your suggestions!
WBR, Andrii
Andrii,
only the default config parameters have been removed, not the functionality.
What is the actual use case you are using the FTS indexes (on rels) for?
In 3.0 you can still use the START clause, but using stored procedures you can add nodes and relationships explicitly to indexes. And you can use similar procedures to query your indexes even more efficiently, e.g. by passing in start and end nodes.
See (WIP): https://github.com/jexp/neo4j-apoc-procedures#manual-indexes

Neo4j auto increase schema index

It is recommended not to use Neo4j's internal id because it may change, but rather to create our own identifier. So to identify my users, I plan to create a user_id property on the nodes labelled User and put an index on it. However, I cannot figure out a way to make it auto-increment.
After some searching, I noticed there are two kinds of indexes in Neo4j, the schema index and the legacy index. Could anyone explain the difference between them to me? And is there a way to make my user_id index auto-increment?
Schema indices are tied to labels, e.g. :User. A label by itself already gives you quick access to all nodes carrying it, and you can also create indices on the properties of those labels if you wish. There's no need to specify which index you're using in a query, as this is done automatically.
Legacy indices are the node indices that were around prior to Neo4j 2.0. They're a traditional index where you can specify what you're indexing and which properties they apply to, but they're only used in START statements, which are optional (and on their way to deprecation).
For more detail, have a look here (http://docs.neo4j.org/chunked/stable/graphdb-neo4j-schema.html) and here (http://docs.neo4j.org/chunked/stable/indexing.html).
As for auto-incrementing, I'm unaware of any such functionality for user-defined index keys.
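One workaround sometimes used is a dedicated counter node that is incremented whenever a user is created. The sketch below is an assumption-laden illustration, not a built-in feature: the :Counter label and its name/value properties are hypothetical, and a uniqueness constraint on :User(user_id) is added as a safety net (it also creates the schema index).

```cypher
// Enforce uniqueness; this also backs user_id with a schema index.
CREATE CONSTRAINT ON (u:User) ASSERT u.user_id IS UNIQUE;

// Hypothetical counter node: created on first use, incremented afterwards.
MERGE (c:Counter {name: 'user_id'})
ON CREATE SET c.value = 1
ON MATCH SET c.value = c.value + 1
WITH c.value AS next_id
CREATE (u:User {user_id: next_id})
RETURN u.user_id;
```

Each statement is meant to be run separately. Updating the counter node takes a write lock on it, so concurrent user creation is serialized on that node, which may or may not be acceptable for a given workload.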
HTH
