I am using node.js to connect to the neo4j database. Whenever I have to set an index for a node, I do it manually by going to the neo4j browser (localhost:7474).
CREATE INDEX ON :user(username)
First of all, to be clear: is this an automatic index? That is, are any changes or additions to :user nodes automatically reflected in the index? Let me know if I am wrong.
If so, how does full-text indexing work in neo4j? Is it the same process, with neo4j maintaining the index automatically? For example, does the following create a full-text index, or do we need to do something else?
CREATE INDEX ON :user(aboutme)
I built my own nodejs adapter to connect to neo4j, so I only have access to Cypher queries at the moment. To create an index, I can use only Cypher or the browser (7474). So what is the proper way to create an automatic full-text index, preferably from the browser itself? And how do I access it from Cypher (or do I have to access it explicitly? Does neo4j automatically figure out which index to use?). The online documentation and tutorials are a bit complicated for beginners :/
(I want to be able to do text search on the :user(aboutme) property)
If you want a full-text index (one that can match on parts of a string rather than only the full, exact string), that isn't currently supported by the new Cypher automatic indexes (like CREATE INDEX ON :user(username)). To do that, you need to use what are called the legacy indexes. These use Lucene under the covers and are much more powerful, but I believe they will eventually go away once the same functionality is supported by the new indexes.
Personally, for full-text search I prefer using something like Elasticsearch, as it is easier to set up and use.
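For the legacy-index route mentioned above, here is a rough sketch in Python of what creating and querying a full-text legacy index could look like. It assumes a Neo4j 2.x server with the legacy REST API at http://localhost:7474, and the index name user_fulltext is made up for illustration. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumptions: a local Neo4j 2.x server exposing the legacy REST API.
BASE_URL = "http://localhost:7474"
INDEX_NAME = "user_fulltext"  # hypothetical index name


def fulltext_index_request(name=INDEX_NAME, base_url=BASE_URL):
    """Build (but do not send) the POST that creates a legacy full-text node index."""
    payload = {"name": name, "config": {"type": "fulltext", "provider": "lucene"}}
    return urllib.request.Request(
        url=base_url + "/db/data/index/node",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def fulltext_cypher(name, prop, term):
    """Return a legacy-index Cypher lookup that passes a Lucene query string."""
    return 'START n=node:%s("%s:%s") RETURN n' % (name, prop, term)


if __name__ == "__main__":
    req = fulltext_index_request()
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    print(fulltext_cypher(INDEX_NAME, "aboutme", "graph*"))
    # To actually create the index: urllib.request.urlopen(req)
```

Note that, unlike CREATE INDEX ON, a legacy index is not populated automatically: nodes have to be added to it explicitly (or through auto-indexing configuration), and Cypher only uses it when the query names it with START.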
I would like to build an in-memory Lucene index from a collection of node properties and then search against that index.
These search transactions will be happening in parallel, so I need to be able to construct a separate search index for each transaction. It seems this would not be possible with native (manual) Neo4j indexes, since they are "global", hence the use of a memory-based search index. Am I mistaken?
This is working great for our use case. A couple of notes:
Use MemoryIndex rather than RAMDirectory.
Be sure to specify "provided" in your dependencies for Lucene.
You can only use Lucene 5.5.0, which is the version used natively by Neo4j.
I want to play with neo4j and spatial indexes. I can't find any documentation that demonstrates how to do this through Cypher, only through the REST API.
Is it possible to create spatial indexes through Cypher, say in the neo4j web console?
There is currently no way to create a spatial index using Cypher. You can use either the Java API or a REST call; see the docs at http://neo4j-contrib.github.io/spatial/#rest-api-create-a-spatial-index for details. Since the Neo4j browser lets you send HTTP POST requests, you can type the following there:
:POST /db/data/index/node {"name":"geom", "config":
{"provider":"spatial", "geometry_type":"point", "lat":"lat", "lon":"lon"}
}
Alternatively you can use the index command within neo4j-shell.
Update for Neo4j 3.0
Neo4j Spatial for 3.0 provides stored procedures to manage the spatial index, so everything can be done through Cypher. See https://github.com/neo4j-contrib/spatial/blob/master/src/main/java/org/neo4j/gis/spatial/procedures/SpatialProcedures.java.
Note: this version is not yet released, so you have to build from source yourself.
I want to search within content without getting false results.
For example, if a user searches for 'br', I don't want the results to include matches on <br>, <P>, or other HTML elements.
Simply, you must strip the tags before you search. However, that would mean not being able to query the database directly. Rather, you'd have to pull all the objects first, and then query the collection in memory.
If you're going to be doing a lot of this or have large collections of objects (where pulling all of them for the initial query would be a performance drag), then you should look into a true search solution. I've been working with Elasticsearch, which in my opinion is just about the best out there. It's easy to set up, easy to use, and has third-party .NET integration through the NuGet package NEST.
With a true search solution, you can index your content fields, stripped of HTML, and then run your queries on the index instead of directly on your database. You'll also get powerful advanced features such as faceting, which would be difficult or impossible to do directly with Entity Framework.
Alternatively, if you can't adopt a full search solution and it's unacceptable to query everything up front (which it pretty much always is), then your only other option is to create a companion field for each HTML content property and always save an HTML-stripped copy of the text there. Then use that field for your search queries.
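The companion-field idea is independent of Entity Framework, so here is a minimal sketch of it in Python rather than C#. The field names content_html and content_text and the save_article helper are hypothetical; the point is only that the stripped copy is written whenever the HTML is saved, and searches run against the stripped copy.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only text nodes, discarding tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        # Join chunks with a space and collapse whitespace left by removed tags.
        # Caveat: inline tags inside a word ("<b>H</b>ello") become "H ello".
        return " ".join(" ".join(self._chunks).split())


def strip_html(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def save_article(record, html):
    """Store the raw HTML plus a plain-text companion field (hypothetical names)."""
    record["content_html"] = html
    record["content_text"] = strip_html(html)
    return record


if __name__ == "__main__":
    row = save_article({}, "<p>Hello<br>world</p>")
    print(row["content_text"])
    # A search for "br" should hit the raw HTML but not the companion field.
    print("br" in row["content_html"], "br" in row["content_text"])
```

A real implementation would probably also want to skip the contents of script and style elements, which this sketch treats as ordinary text.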
My Neo4j index has over 1.4M entries, and my queries are running very slowly. I have cached most of the database, but I have now found that a lot of disk reads of the Lucene index are taking place.
According to this article, the following code will help with caching the index:
Index<Node> index = graphDb.index().forNodes( "actors" );
((LuceneIndex<Node>) index).setCacheCapacity( "name", 300000 );
Is there any way I can do this via Neo4jClient? I have got as far as:
var indexes = _graphClient.GetIndexes(IndexFor.Node);
var index = indexes.ElementAt(0);
But this does not give me an option to set the cache capacity. Any thoughts on how I can set the cache parameters via Neo4jClient, or otherwise reduce the index lookup time? TIA.
Neo4jClient works via the REST API. The behaviour you are describing is from the native Java API and is not exposed via the REST API. There is no way to do this via Neo4jClient or any other REST-based driver. You may be able to do it via config instead.
Using a default index, one can do nodeIndex.get("message", "Hello") for exact matches, or nodeIndex.query("message", "Hel*") for approximate Lucene-based queries. This works correctly for me from Java.
But how do I do approximate queries through the webadmin Data Browser interface? Exact matches work fine, such as:
node:index:nodeIndex:message:"Hello"
but I can't see how to do the wildcard queries. The syntax is shown in the pop-up help panel as:
node:index:[index]:[query]
but I don't know what to put for the [query] part, and can't find any examples of this in the manual or the wiki. Have tried the following without success:
node:index:nodeIndex:"message:Hel*"
node:index:nodeIndex:message:"Hel*"
node:index:nodeIndex:"Hel*"
node:index:nodeIndex:Hel*
This should work:
node:index:nodeIndex:message:Hel*
The query is message:Hel*, so you just append it. More complex queries are also possible.
See the Lucene syntax guide.
node:index:nodeIndex:message:Hel* OR message:Wor*
Issue created. https://github.com/neo4j/community/issues/138
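To make the "just append the Lucene query" rule from the answer above concrete, here is a small sketch in Python that assembles such query strings. The helper names (clause, any_of, data_browser_query) are invented for illustration and are not part of Neo4j or Lucene.

```python
def clause(field, term):
    """Return one Lucene `field:term` clause; wildcards (*, ?) are kept as-is."""
    if any(ch.isspace() for ch in term):
        # Multi-word terms need quoting as a phrase. As far as I know, the
        # classic Lucene parser does not expand wildcards inside quotes.
        term = '"%s"' % term.replace('"', '\\"')
    return "%s:%s" % (field, term)


def any_of(*clauses):
    """Combine clauses with OR."""
    return " OR ".join(clauses)


def data_browser_query(index, lucene_query):
    """Use the form from the help panel: node:index:[index]:[query]."""
    return "node:index:%s:%s" % (index, lucene_query)


if __name__ == "__main__":
    q = any_of(clause("message", "Hel*"), clause("message", "Wor*"))
    print(data_browser_query("nodeIndex", q))
```

The wildcard goes in unquoted, which is the difference between the working form and the quoted attempts listed in the question.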