Use different analyzer depending on node property in Neo4J

Use different analyzer depending on node property in Neo4J - neo4j

I am creating a single database to store nodes with content in several languages.
I have a model like:
(Book {title, summary, text})<-[WRITES {date}]-(Author {name, lang})
I would like to be able to perform full text search on Book's titles and text in several language.
I have tried to simply create several index with different analyzer in a very stupid way:
CALL db.index.fulltext.createNodeIndex("searchEN",["Book"],["title", "summary", "text"], {analyzer: "english")
CALL db.index.fulltext.createNodeIndex("searchFR",["Book"],["title", "summary", "text"], {analyzer: "french")
But when I try to create the index for french I get this error:
neobolt.exceptions.ClientError: There already exists an index NODE:label[0](property[1], property[9], property[11]).
A solution I would like would to limit a search in English to books that are written by an English speaking author without having to create a new node type like EnglishBook. I want to avoid it because other node types of the schema can share connections with books of different language.
For instance I still want to be able to do:
MATCH (p: Publisher)-[r: PUBLISHES]->(b: Book)
RETURN p, r, b

Related

Property values can only be of primitive types or arrays thereof in Neo4J Cypher query

I have the following params set:
:params "userId":"15229100-b20e-11e3-80d3-6150cb20a1b9",
"contextNames":[{"uid":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","name":"zhora"}],
"statements":[{"text":"oranges apples bananas","concepts":["orange","apple","banana"],
"mentions":[],"timestamp":15481867295710000,"name":"# banana","uid":"34232870-1e7f-11e9-8609-a7f6b478c007",
"uniqueconcepts":[{"name":"orange","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000},{"name":"apple","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000},{"name":"banana","suid":"34232870-1e7f-11e9-8609-a7f6b478c007","timestamp":15481867295710000}],"uniquementions":[]}],"timestamp":15481867295710000,"conceptsRelations":[{"from":"orange","to":"apple","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710000,"uid":"apoc.create.uuid()","gapscan":"2","weight":3},{"from":"apple","to":"banana","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710002,"uid":"apoc.create.uuid()","gapscan":"2","weight":3},{"from":"orange","to":"banana","context":"94e71bf0-1e7d-11e9-8f33-4f0c99ea0da1","statement":"34232870-1e7f-11e9-8609-a7f6b478c007","user":"15229100-b20e-11e3-80d3-6150cb20a1b9","timestamp":15481867295710002,"uid":"apoc.create.uuid()","gapscan":4,"weight":2}],"mentionsRelations":[]
Then when I make the following query:
MATCH (u:User {uid: $userId})
UNWIND $contextNames as contextName
MERGE (context:Context {name:contextName.name,by:u.uid,uid:contextName.uid})
ON CREATE SET context.timestamp=$timestamp
MERGE (context)-[:BY{timestamp:$timestamp}]->(u)
WITH u, context
UNWIND $statements as statement
CREATE (s:Statement {name:statement.name, text:statement.text, uid:statement.uid, timestamp:statement.timestamp})
CREATE (s)-[:BY {context:context.uid,timestamp:s.timestamp}]->(u)
CREATE (s)-[:IN {user:u.id,timestamp:s.timestamp}]->(context)
WITH u, s, context, statement
FOREACH (conceptName in statement.uniqueconcepts |
MERGE (c:Concept {name:conceptName}) ON CREATE SET c.uid=apoc.create.uuid()
CREATE (c)-[:BY {context:context.uid,timestamp:s.timestamp,statement:s.suid}]->(u)
CREATE (c)-[:OF {context:context.uid,user:u.uid,timestamp:s.timestamp}]->(s)
CREATE (c)-[:AT {user:u.uid,timestamp:s.timestamp,context:context.uid,statement:s.uid}]->(context) )
WITH u, s
UNWIND $conceptsRelations as conceptsRelation MATCH (c_from:Concept{name: conceptsRelation.from}) MATCH (c_to:Concept{name: conceptsRelation.to})
CREATE (c_from)-[:TO {context:conceptsRelation.context,statement:conceptsRelation.statement,user:u.uid,timestamp:conceptsRelation.timestamp, uid:apoc.create.uuid(), gapscan:conceptsRelation.gapscan, weight: conceptsRelation.weight}]->(c_to)
RETURN DISTINCT s.uid
But when I run it, I get this error:
Neo.ClientError.Statement.TypeError
Property values can only be of primitive types or arrays thereof
Anybody knows why it's coming up? My params seem to be set correctly, I didn't see they couldn't be used in this way... Thanks!

Looks like the problem is here:
...
FOREACH (conceptName in statement.uniqueconcepts |
MERGE (c:Concept {name:conceptName})
...
uniqueconcepts in your parameter is a list of objects, not a list of strings, so when attempting to MERGE conceptName, it errors out as conceptName isn't a primitive type (or array or primitive types). I think you'll want to use uniqueConcept instead of conceptName, and in your MERGE use name:uniqueConcept.name. Check for other usages of the elements of statement.uniqueconcepts.

This answer is for other n00bs like me that are trying to put a composite datatype into a property without reading the friendly manual, and get the error above. Google points here, so I felt appropriate to add this answer.
Specifically, I wanted to store a list [(datetime, event), ...] of tuples into a property of a relation.
Potential encountered errors are:
Neo.ClientError.Statement.TypeError: Property values can only be of primitive types or arrays thereof
Neo.ClientError.Statement.TypeError: Neo4j only supports a subset of Cypher types for storage as singleton or array properties. Please refer to section cypher/syntax/values of the manual for more details.
The bottom line is well summarized in this forum post by a Neo4j staff member:
Neo4j doesn't allow maps as properties (no sub-properties allowed, basically), and though lists are allowed as properties, they cannot be lists of maps (or lists of lists for that matter).
Basically I was trying to bypass the natural functionality of the DB. There seem to be 2 workarounds:
Dig your heels in as suggested here, and store the property as e.g. a JSON string
Rethink the design, and model these kind of properties into the graph (i.e. being more specific with the nodes)
After a little rethinking I came up with a much simpler data model that didn't require composite properties in relations. Although option 1 may have its uses, when we have to insist against a well-designed system (which neo4j is), that is usually an indicator that we should change course.
Andres

Example of full-text search across multiple fields in Neo4j?

I've seen some simple examples text searching STARTS WITH name such as:
http://www.jexp.de/blog/html/full-text-and-spatial-search-in-neo4j-3.html
https://blog.knoldus.com/2016/12/11/neo4j-with-scala-neo4j-vs-elasticsearch/
But I'm looking for something more along the lines of full-text search across multiple fields: title, content:
https://www.digitalocean.com/community/tutorials/how-to-use-full-text-search-in-postgresql-on-ubuntu-16-04
Can I see an example of how this should be done with Neo4j?

You can do this using the APOC Neo4j procedure library. Let's say you have node labels Book and Author and you want to make a full text query across :Book(title), :Book(content), and :Author(name) and :Author(address). First, use apoc.index.addAllNodes to create an index called bookIndex and specify the labels and properties to include in the index:
CALL apoc.index.addAllNodes('bookIndex',{
Book: ["title","content"],
Author: ["name","address"]
})
Then, to search the index:
CALL apoc.index.search('bookIndex', 'River Runs Through It')
You can use this with more complex graph queries as well:
CALL apoc.index.search('bookIndex, 'River Runs Through It')
YIELD node AS book
MATCH (book)-[:IN_GENRE]->(g:Genre)
RETURN g
Lucene query syntax is used so you can do fuzzy search, required components of the string, etc: 'Norman Maclean~' or 'Norman~ +Maclean'
See the APOC docs for more info.

Grails 3 - return list in query result from HQL query

I have a domain object:
class Business {
String name
List subUnits
static hasMany = [
subUnits : SubUnit,
]
}
I want to get name and subUnits using HQL, but I get an error
Exception: org.springframework.orm.hibernate4.HibernateQueryException: not an entity
when using:
List businesses = Business.executeQuery("select business.name, business.subUnits from Business as business")
Is there a way I can get subUnits returned in the result query result as a List using HQL? When I use a left join, the query result is a flattened List that duplicates name. The actual query is more complicated - this is a simplified version, so I can't just use Business.list().

I thought I should add it as an answer, since I been doing this sort of thing for a while and a lot of knowledge that I can share with others:
As per suggestion from Yariash above:
This is forward walking through a domain object vs grabbing info as a flat list (map). There is expense involved when having an entire object then asking it to loop through and return many relations vs having it all in one contained list
#anonymous1 that sounds correct with left join - you can take a look at 'group by name' added to end of your query. Alternatively when you have all the results you can use businesses.groupBy{it.name} (this is a cool groovy feature} take a look at the output of the groupBy to understand what it has done to the
But If you are attempting to grab the entire object and map it back then actually the cost is still very hefty and is probably as costly as the suggestion by Yariash and possibly worse.
List businesses = Business.executeQuery("select new map(business.name as name, su.field1 as field1, su.field2 as field2) from Business b left join b.subUnits su ")
The above is really what you should be trying to do, left joining then grabbing each of the inner elements of the hasMany as part of your over all map you are returning within that list.
then when you have your results
def groupedBusinesses=businesses.groupBy{it.name} where name was the main object from the main class that has the hasMany relation.
If you then look at you will see each name has its own list
groupedBusinesses: [name1: [ [field1,field2,field3], [field1,field2,field3] ]
you can now do
groupedBusinesses.get(name) to get entire list for that hasMany relation.
Enable SQL logging for above hql query then compare it to
List businesses = Business.executeQuery("select new map(b.name as name, su as subUnits) from Business b left join b.subUnits su ")
What you will see is that the 2nd query will generate huge SQL queries to get the data since it attempts to map entire entry per row.
I have tested this theory and it always tends to be around an entire page full of query if not maybe multiple pages of SQL query created from within HQL compared to a few lines of query created by first example.

Postgresql text searching, matching multiple words

I don't know the name for this kind of search, but I see that it's getting pretty common.
Let's say I have records with the following file names:
'order_spec.rb', 'order.sass', 'orders_controller_spec.rb'
If I search with the following string 'oc' I would like the result to return 'orders_controller_spec.rb' due to match the o in orders and the c in controller.
If the string is 'os' then I'd like all 3 to match, 'order_spec.rb', 'order.sass', 'orders_controller_spec.rb'.
If the string is 'oco' then I'd like 'orders_controller_spec.rb'
What is the name for this kind of search and how would I go about getting this done in Postgresql?

This is a called a subsequence search. One simple way to do it in Postgres is to use the LIKE operator (or several of the other options in those docs) and fill the spaces between your letters with a wildcard, which for LIKE is %. To match anything with an o followed by an s in the words column, that would look like this:
SELECT * FROM table WHERE words LIKE '%o%s%';
This is a relatively expensive search, but you can improve performance with a varchar_pattern_ops or text_pattern_ops index to support faster pattern matching.
CREATE INDEX pattern_index ON table (words varchar_pattern_ops);

Can Neo4j be effectively used to show a collection of nodes in a sortable and filterable table?

I realise this may not be ideal usage, but apart from all the graphy goodness of Neo4j, I'd like to show a collection of nodes, say, People, in a tabular format that has indexed properties for sorting and filtering
I'm guessing the Type of a node can be stored as a Link, say Bob -> type -> Person, which would allow us to retrieve all People
Are the following possible to do efficiently (indexed?) and in a scalable manner?
Retrieve all People nodes and display all of their names, ages, cities of birth, etc (NOTE: some of this data will be properties, some Links to other nodes (which could be denormalised as properties for table display's and simplicity's sake)
Show me all People sorted by Age
Show me all People with Age < 30
Also a quick how to do the above (or a link to some place in the docs describing how) would be lovely
Thanks very much!
Oh and if the above isn't a good idea, please suggest a storage solution which allows both graph-like retrieval and relational-like retrieval

if you want to operate on these person nodes, you can put them into an index (default is Lucene) and then retrieve and sort the nodes using Lucene (see for instance How do I sort Lucene results by field value using a HitCollector? on how to do a custom sort in java). This will get you for instance People sorted by Age etc. The code in Neo4j could look like
Transaction tx = neo4j.beginTx();
idxManager = neo4j.index()
personIndex = idxManager.forNodes('persons')
personIndex.add(meNode,'name',meNode.getProperty('name'))
personIndex.add(youNode,'name',youNode.getProperty('name'))
tx.success()
tx.finish()
'*** Prepare a custom Lucene query context with Neo4j API ***'
query = new QueryContext( 'name:*' ).sort( new Sort(new SortField( 'name',SortField.STRING, true ) ) )
results = personIndex.query( query )
For combining index lookups and graph traversals, Cypher is a good choice, e.g.
START people = node:people_index(name="E*") MATCH people-[r]->() return people.name, r.age order by r.age asc
in order to return data on both the node and the relationships.

Sure, that's easily possible with the Neo4j query language Cypher.
For example:
start cat=node:Types(name='Person')
match cat<-[:IS_A]-person-[born:BORN]->city
where person.age > 30
return person.name, person.age, born.date, city.name
order by person.age asc
limit 10
You can experiment with it in our cypher console.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart