Say you have some nodes in your model that may go by multiple alternative names, but all the names refer to the same object.
For example, you may want to query the "World" node by the name "World" in one context, but in a different context find the same node just as quickly by the name "Global".
Is it optimal to store this information as a string array property called aliases, like this?
If you add World to your aliases, you can use the legacy node_auto_index to index that aliases field,
which will index each value individually, and then query it with
Start n=node:node_auto_index(aliases="Global")
return n
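To make "index each value individually" concrete, here is a minimal Python sketch (not Neo4j code) of an index that stores one entry per element of an array property. The node data and the build_alias_index helper are hypothetical and exist only for illustration; the real auto index is backed by Lucene.

```python
from collections import defaultdict


def build_alias_index(nodes):
    """Map every alias string to the ids of the nodes that carry it."""
    index = defaultdict(set)
    for node_id, props in nodes.items():
        # One index entry per array element, not one entry for the whole array.
        for alias in props.get("aliases", []):
            index[alias].add(node_id)
    return index


# Hypothetical sample data: node id -> properties.
nodes = {
    1: {"name": "World", "aliases": ["World", "Global"]},
    2: {"name": "Moon", "aliases": ["Moon", "Luna"]},
}

alias_index = build_alias_index(nodes)

# Both names resolve to the same node without scanning every node.
print(alias_index["World"], alias_index["Global"])
```

Because each alias gets its own entry, a lookup by "Global" costs the same as a lookup by "World".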
I think you could use Lucene for that.
You could index the same property several times with different names.
You can then query the index in the way you want through Java APIs or Cypher.
For instance:
START n = node:myIndex(myProperty="ALIAS_1"),
m = node:myIndex(myProperty="ALIAS_2")
[...]
We are using Neo4j version 4.1.1,
and we have a graph that represents a structure of objects.
We support translations through translation nodes: the relationship between an object and a translation node carries the object's name and description.
for example:
(n:object)-[r:Translation]-(:ru)
means that relationship r holds the name and description of object n in Russian.
To search by name and description, we created a full-text index like this:
CALL db.index.fulltext.createRelationshipIndex("TranslationRelationshipIndex",["Translation"],["Name","Description"], { eventually_consistent: "true" })
We also support searching for items through this index, and some of our names look like "UFO41.SI01V03":
CALL db.index.fulltext.queryRelationships('TranslationRelationshipIndex', '*FO41.SI0*') YIELD relationship, score
but for names like the one shown above ([0-9.*]) no results are returned,
while results are returned for a name like "ab.or".
Does anyone know how to make this work? I've tried all 46 analyzers available.
I know we can solve it just using match()-[r]-() where r.Name contains "<string>"
but we prefer a more efficient index-using solution to this problem.
Stay safe!
And thanks in advance.
P.S. If needed, I can supply a few lines to recreate it locally; just ask.
The analyzer probably treats a word like ab.or differently from ab.or123, keeping it as a single token in the first case and splitting it into two tokens in the second.
No existing analyzer will really fit your needs; you would have to create your own.
You can, however, replace the . in your query with a simple AND, for example:
CALL db.index.fulltext.queryNodes('Test', replace("*FO41.SI0*", ".", " AND "))
This will return the results you're looking for.
Resources for creating your own analyzer:
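The same rewrite can also be done on the client before the string is passed to the query as a parameter. A minimal Python sketch, assuming only the literal substitution described above; the function name dots_to_and is hypothetical:

```python
def dots_to_and(term: str) -> str:
    """Replace each '.' with ' AND ', mirroring the Cypher replace() call above."""
    return term.replace(".", " AND ")


query_string = dots_to_and("*FO41.SI0*")
print(query_string)  # prints: *FO41 AND SI0*
```

Note that this turns one pattern into two independent terms, so it can also match entries where both parts occur but not next to each other.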
https://graphaware.com/neo4j/2019/09/06/custom-fulltext-analyzer.html
https://neo4j.com/docs/java-reference/current/extending-neo4j/full-text-analyzer-provider/
I need to do something unusual. I have a query that can't be changed. It's a MATCH query for getting a record:
MATCH (j:journal) WHERE j.id in [12] RETURN j.`id` AS ID, j.`language` AS LANGUAGE
And I have a node that has an array property; for example, it can be created like this: create (j:journal {id:12, language:["English", "Polish"]})
So, is there any way to display this node as two records with the same id but different language fields? Like the following:
ID | LANGUAGE
12 | English
12 | Polish
The important thing is that the MATCH query can't be changed at all.
But the node can be changed.
I know that I can add the UNWIND keyword for the language field in the source query, but there is a requirement not to.
I couldn't find anything like this in the documentation or on the internet. I'm not sure it's even possible (but the consumer wants it), and I don't have much experience with Neo4j.
I understand that it may sound weird, but I need to understand whether it can be implemented this way.
Thanks in advance.
If you can change the DB, you can change it so that each journal node contains a single language (as a scalar value, not in a list). However, this change might break any other queries that you might have.
If this conversion is acceptable, here is a query that should: (a) convert existing journal nodes to have a scalar language value, and (b) create new journal nodes as necessary for the remaining language values. The nodes that are spawned from an original journal node will share the same properties (except for language).
MATCH (j:journal)
WITH j, j.language[1..] AS langs
SET j.language = j.language[0]
WITH j, langs
UNWIND langs AS lang
CREATE (k:journal)
SET k = j, k.language = lang
If a node's language property had N values, you will end up with N nodes, each with the same properties -- except for the language property, which will contain a different language value (as a string). For efficiency, the original node is reused.
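To illustrate the intended end state, here is a small Python sketch that models the same transformation on plain dictionaries. It is not Neo4j code, the split_by_language name and the sample data are hypothetical, and unlike the Cypher above it builds new records instead of reusing the original node.

```python
def split_by_language(journals):
    """Return one record per (journal, language) pair."""
    result = []
    for node in journals:
        for lang in node["language"]:
            clone = dict(node)        # copy all properties, like SET k = j
            clone["language"] = lang  # then overwrite language with a scalar
            result.append(clone)
    return result


# Hypothetical sample matching the journal node from the question.
journals = [{"id": 12, "language": ["English", "Polish"]}]
rows = split_by_language(journals)

for row in rows:
    print(row["id"], "|", row["language"])
```

After a split like this, the unchanged MATCH query would return one row per language, because each language now lives on its own node.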
I need to build a graph DB with a massive number of nodes and relationships. Every node should hold a list of string values, and I need to be able to query all the nodes connected to a starting node that have a given value in their list.
For example, I might have a node with a list of ["dog", "cat", "bird"], and I might need to query all nodes that have the value "dog" in their list.
Now my question is this: what would be the more efficient solution for that list in Neo4j?
Hold the values as an actual list, and search for the value inside that list during the query?
Or...
Instead of using a list property, implement the list as separate properties and use HAS(n.property) to find all the nodes with a property?
Some other solution?
What would be the most efficient way (for lots of queries)?
Thanks!
Implement the list as separate properties; that is the efficient way to handle large amounts of data.
Then you can access the data, e.g.
MATCH (n:node) where n.property = {SearchValue} return n;
Storing the values as a list is not a good idea.
I have the following two Cypher calls that I'd like to combine into one:
start r=relationship:link("key:\"foo\" and value:\"bar\"") return r.guid
This returns a relationship that contains a guid that I need based on a key value pair (in this case key:foo and value:bar).
Let's assume r.guid above returns 12345.
I then need all the property relationships for the object in question, based on the returned guid and a property type key:
start r=relationship:properties("to:\"12345\" and key:\"baz\"") return r
This returns several relationships which have the values I need, in this case all property types baz that belong to guid 12345.
How do I combine these two calls into one? I'm sure it's simple, but I'm stumbling.
The answer I've gotten is that there is no way to perform an index lookup in the middle of a Cypher query, or to use a variable you have declared to perform the lookup.
Perhaps a later version of Cypher will support it; this ability should be standard, especially given the dense node issue and the suggested solution of indexing.
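Until then, one workaround is to run the two lookups as separate calls from client code and feed the guid from the first into the second. The Python sketch below shows only that control flow. The run_query callable is a hypothetical stand-in for whatever client executes an index lookup, the stub and helper names are made up, and the query strings simply copy the shape used in the question.

```python
def link_lookup(key, value):
    """Lucene string for the first lookup, copied from the question's shape."""
    return 'key:"{}" and value:"{}"'.format(key, value)


def properties_lookup(guid, key):
    """Lucene string for the second lookup, built from the returned guid."""
    return 'to:"{}" and key:"{}"'.format(guid, key)


def fetch_properties(run_query, link_key, link_value, prop_key):
    """Two round trips: find the guid(s) first, then query by each guid."""
    results = []
    for link in run_query("link", link_lookup(link_key, link_value)):
        query = properties_lookup(link["guid"], prop_key)
        results.extend(run_query("properties", query))
    return results


def stub_run_query(index_name, lucene_string):
    """Hypothetical stand-in for a real client; returns canned rows."""
    if index_name == "link":
        return [{"guid": "12345"}]
    return [{"index": index_name, "query": lucene_string}]


print(fetch_properties(stub_run_query, "foo", "bar", "baz"))
```

The cost is one extra round trip per guid, which is usually acceptable when the first lookup returns a single relationship.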
I'm playing around with Neo4j, and I was wondering: is it common to have a type property on nodes that specifies what type of node it is? I've tried searching for this practice, and I've seen some people use name for a purpose like this, but I was wondering whether it is considered good practice or whether indexes would be the more practical method.
An example would be a "User" node, which would have type: user; this way, if the index was bad, I would be able to do an all-node scan and look for nodes of type user.
Labels were added in Neo4j 2.0. They solve this problem.
You can create nodes with labels:
CREATE (me:American {name: "Emil"}) RETURN me;
You can match on labels:
MATCH (n:American)
WHERE n.name = 'Emil'
RETURN n
You can set any number of labels on a node:
MATCH (n)
WHERE n.name='Emil'
SET n :Swedish:Bossman
RETURN n
You can remove any number of labels from a node:
MATCH (n { name: 'Emil' })
REMOVE n:Swedish
Etc...
True, it does depend on your use case.
If you add a type property and then wish to find all users, you're in potential trouble, because you have to examine that property on every node to get to the users. In that case, an index would probably do better, but not in cases where you need to query for all users with conditions and relations not available in the index (unless, of course, your index is the source of the "start").
If you have graphs like mine, where a relation type implies two different node types like A-(knows)-(B) and A or B can be a User or a Customer, then it doesn't work.
So your use case is really important: it's easy to model graphs generically, but important to "tune" the model to your usage pattern.
IMHO you shouldn't have to put a type property on the node. Instead, a common way to reference all nodes of a specific "type" is to connect all user nodes to a node called "Users" maybe. That way starting at the Users node, you can very easily find all user nodes. The "Users" node itself can be indexed so you can find it easily, or it can be connected to the reference node.
I think it's really up to you. Some people like indexed type attributes, but I find that they're mostly useful when you have other indexed attributes to narrow down the number of index hits (search for all users over age 21, for example).
That said, as @Luanne points out, most of us try to solve the problem in-graph first. Another way to do that (and the more natural way, in my opinion) is to use the relationship type to infer a practical node type, i.e. "A - (knows) -> B", so A must be a user or some other thing that can "know", and B must be another user, a topic, or some other object that can "be known".
For client APIs, modeling the element type as a property makes it easy to instantiate the right domain object in your client-side code so I always include a type property on each node/vertex.
The "type" var name is commonly used for this, but in some languages like Python, "type" is the name of a built-in, so I use "element_type" in Bulbs ( http://bulbflow.com/quickstart/#models ).
This is not needed for edges/relationships because they already contain a type (the label) -- note that Neo4j also uses the keyword "type" instead of label for relationships.
I'd say it's common practice. As an example, this is exactly how Spring Data Neo4j knows which entity type a certain node has. Each node has a "type" property that contains the qualified class name of the entity. These properties are automatically indexed in the "types" index, so nodes can be looked up really fast. You could implement your use case exactly like this.
Labels have recently been added to Neo4j 2.0 ( http://docs.neo4j.org/chunked/milestone/graphdb-neo4j-labels.html ). They are still under development at the moment, but they address this exact problem.