Neo4j: caption value not displayed on a node?

I am trying to create two nodes and relate them, but the caption does not appear on the second node, "water".

You can really only pick one field per label to use as the identifier (caption) in the Neo4j Browser, as typically you'd expect there to be only one main human-readable identifier for a given thing in a data model. Here you've got two concepts (both labeled 'Medicine' in your model) that are identified in very different ways, which isn't wrong but is probably unhelpful.
For example, a Person label might have a name field, a Car label might have a model field and so on - you wouldn't typically have a mix of the two, but the two labels might co-exist in the same graph and you could set the browser up to use name on Person nodes and model on Car nodes.
Your data model's a bit odd to my eyes which might be the cause of the confusion - why are m1 and m2 both Medicines, when m1 seems to really be a symptom of a disease? The node labels in the model ideally would be the nouns or concepts of your domain, and the relationships the verbs that relate them.
For example, I personally would model what you've put as an example as follows:
CREATE (s: Symptom { name: 'Fever' })
CREATE (m: Medicine { name: 'Water', chemicalName: 'Dihydrogen oxide' })
CREATE (m)-[:TREATS]->(s)
RETURN m, s
You could elect to use chemicalName as the caption for your Medicine nodes, or just the plain name field as you see fit, but they would all then render consistently in the graph visualisation.

Related

How can I mitigate having bidirectional relationships in a family tree, in Neo4j?

I am running into this wall regarding bidirectional relationships.
Say I am attempting to create a graph that represents a family tree. The problem here is that:
* Timmy can be Suzie's brother, but
* Suzie can not be Timmy's brother.
Thus, it becomes necessary to model this in 2 directions:
(Sure, technically I could say SIBLING_TO and leave only one edge, but I'm not sure what the vocabulary would be when I try to connect a grandma to a grandson.)
When it's all said and done, I'm pretty sure there's no way around the fact that the direction matters in this example.
I was reading this blog post, regarding common Neo4j mistakes. The author states that this bidirectionality is not the most efficient way to model data in Neo4j and should be avoided.
And I am starting to agree. I set up a mock set of 2 families:
and I found that a lot of the queries I was attempting to run were very, very slow. This is because of the 'all connected to all' nature of the graph, at least within each respective family.
My question is this:
1) Am I correct to say that bidirectionality is not ideal?
2) If so, is my example of a family tree representable in any other way...and what is the 'best practice' in the many situations where my problem may occur?
3) If it is not possible to represent the family tree in another way, is it technically possible to still write queries in some manner that gets around the problem of 1) ?
Thanks for reading this and for your thoughts.
Storing redundant information (your bidirectional relationships) in a DB is never a good idea. Here is a better way to represent a family tree.
To indicate "siblingness", you only need a single relationship type, say SIBLING_OF, and you only need to have a single such relationship between 2 sibling nodes.
To indicate ancestry, you only need a single relationship type, say CHILD_OF, and you only need to have a single such relationship from a child to each of its parents.
You should also have a node label for each person, say Person. And each person should have a unique ID property (say, id), and some sort of property indicating gender (say, a boolean isMale).
With this very simple data model, here are some sample queries:
To find Person 123's sisters (note that the pattern does not specify a relationship direction):
MATCH (p:Person {id: 123})-[:SIBLING_OF]-(sister:Person {isMale: false})
RETURN sister;
To find Person 123's grandfathers (note that this pattern specifies that matching paths must have a depth of 2):
MATCH (p:Person {id: 123})-[:CHILD_OF*2..2]->(gf:Person {isMale: true})
RETURN gf;
To find Person 123's great-grandchildren:
MATCH (p:Person {id: 123})<-[:CHILD_OF*3..3]-(ggc:Person)
RETURN ggc;
To find Person 123's maternal uncles:
MATCH (p:Person {id: 123})-[:CHILD_OF]->(:Person {isMale: false})-[:SIBLING_OF]-(maternalUncle:Person {isMale: true})
RETURN maternalUncle;
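To try the queries above, a small data set along these lines might help. This is only a sketch: the ids, genders and family structure are invented for illustration, and only the label, property names and relationship types come from the model described above.

```cypher
// Hypothetical sample data; ids and family structure are made up.
CREATE (grandpa:Person {id: 1, isMale: true})
CREATE (mom:Person {id: 2, isMale: false})
CREATE (uncle:Person {id: 3, isMale: true})
CREATE (child:Person {id: 123, isMale: true})
// One CHILD_OF relationship from each child to each known parent
CREATE (mom)-[:CHILD_OF]->(grandpa)
CREATE (uncle)-[:CHILD_OF]->(grandpa)
CREATE (child)-[:CHILD_OF]->(mom)
// A single SIBLING_OF relationship per sibling pair; queries ignore its direction
CREATE (mom)-[:SIBLING_OF]->(uncle);
```

Against this data, the grandfathers query should match the node with id 1 and the maternal uncles query the node with id 3.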
I'm not sure if you are aware that it's possible to query bidirectionally (that is, to ignore the direction). So you can do:
MATCH (a)-[:SIBLING_OF]-(b)
and since I'm not matching a direction it will match both ways. This is how I would suggest modeling things.
Generally you only want to make multiple relationships if you actually want to store different state. For example, a KNOWS relationship could apply only one way, because person A might know person B, but B might not know A. Similarly, you might have a LIKES relationship with a value property showing how much A likes B, and there might be different strengths of "liking" in the two directions.
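As a rough sketch of that last case, two LIKES relationships in opposite directions can each carry their own value. The names and numbers here are hypothetical, chosen only to show the shape.

```cypher
// Hypothetical people and strengths, for illustration only
CREATE (a:Person {name: 'A'})
CREATE (b:Person {name: 'B'})
// Each direction stores different state, so two relationships are justified
CREATE (a)-[:LIKES {value: 9}]->(b)
CREATE (b)-[:LIKES {value: 3}]->(a);

// Here direction matters: this reads how much A likes B, not the reverse
MATCH (:Person {name: 'A'})-[r:LIKES]->(:Person {name: 'B'})
RETURN r.value;
```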

Modeling arrows/relationships as nodes in Neo4j

A relationship (arrow) in Neo4j cannot have more than one type/label (see here, and here). I have a data model in which edges need labels and (probably) properties. If I decide to use Neo4j (instead of OrientDB, which supports labeled arrows), I think I would then have two options to model an arrow, say f, between two nodes A and B:
1) encode an arrow f as a span, say A<--f-->B, such that f is also a node and --> and <-- are arrows.
or
2) encode an arrow f as A --> f -->B, such that f is a node again and two --> are arrows.
Though this seems to add unnecessary complexity to my data model, there does not seem to be any other option at the moment if I want to use Neo4j. So I am trying to see which of the above encodings would fit my queries better (queries are the core of my system). To do so, I need to resort to examples. So I have two questions:
First Question:
part1) I have nodes labeled Person and father, and there are arrows between them like Person<-[:sr]-father-[:tr]->Person in order to model who is the father of whom (tr is the father of sr). For a given person p1, how can I get all of his ancestors?
part2) If I had a Person-[:sr]->father-[:tr]->Person structure instead for modeling the father relationship, what would the same query look like?
This is answered here when father is considered as a simple relationship (instead of being encoded as a node)
Second Question:
part1) I have nodes labeled as A nodes with the property p1 for each. I want to query A nodes, get those elements that p1<5, then create the following structure: for each a1 in the query result I create qa1<-[:sr]-isA-[:tr]->a1 such that isA and qa1 are nodes.
part2) What if I wanted to create qa1-[:sr]->isA-[:tr]->qa1 instead?
This question is answered here when isA is considered as a simple arrow (instead of being modeled as a node).
First, some terminology; relationships don't have labels, they only have types. And yes, one type per relationship.
Second, relative to modeling, I think the direction of the relationship isn't always super important, since with Neo4j you can traverse it both ways easily. So the difference between A-->f-->B and A<--f-->B I think should be entirely driven by what makes sense semantically for your domain, nothing else. So your options (1) and (2) at the top seem the same to me in terms of overall complexity, which brings me to point #3:
Your main choice is between making a complex relationship into a node (which I think we're calling f here) or keeping it as a relationship. Making "a relationship into a node" is called reification and I think it's considered a fairly standard practice to accommodate a number of modeling issues. It does add complexity (over a simple relationship) but adds flexibility. That's a pretty standard engineering tradeoff everywhere.
So with all of that said, for your first question I wouldn't recommend an intermediate node at all. :father is a very simple relationship, and I don't see why you'd ever need more than one label on it. So for question one, I would pick "neither of the options you list" and would instead model it as (personA)-[:father]->(personB). That's simpler. You'd query that by saying
MATCH (personA { name: "Bob"})-[:father]->(bobsDad) RETURN bobsDad
Yes, you could model this as (personA)-[:sr]->(fatherhood)-[:tr]->(personB) but I don't see how this gains you much. As for the relationship direction, again it doesn't matter for performance or query, only for semantics of whatever :tr and :sr are supposed to mean.
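For the "all ancestors" part of the question, a variable-length pattern over the simple relationship is one way to go. This is a sketch that assumes the same direction as the query above (the arrow points from a person to their father) and the same hypothetical name property; since only :father is modeled, it follows the paternal line only.

```cypher
// Follow one or more :father relationships outward from Bob
MATCH (p { name: "Bob" })-[:father*1..]->(ancestor)
RETURN DISTINCT ancestor
```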
I have nodes labeled as A nodes with the property p1 for each. I want to query A nodes, get those elements that p1<5, then create the following structure: for each a1 in the query result I create qa1<-[:sr]-isA-[:tr]->a1 such that isA and qa1 are nodes.
That's this:
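MATCH (aNode:A)
WHERE aNode.p1 < 5
WITH aNode
// Placeholder criteria: adjust to select the qa1 node you actually want
MATCH (qa1 { label: "some qa1 node" })
// isA is unbound here, so CREATE makes a new blank isA node for each row
CREATE (qa1)<-[:sr]-(isA)-[:tr]->(aNode);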
Note that you'll need to adjust the criteria for qa1 and also specify something meaningful for isA.
What if I wanted to create qa1-[:sr]->isA-[:tr]->qa1 instead?
It should be trivial to modify that query above, just change the direction of the arrows, same query.

Modeling conditional relationships in neo4j v.2 (cypher)

I have two related problems I need help with.
Problem 1: How do I model a conditional relationship?
I want my data to indicate that when test CLT1's "Result" property = "High", CLT1 has relationship to Disease A. If I take a node-centric approach, I imagine that the code might look something like...
(CLT 1 {Result: "High"}) -[:INDICATES] -> (Disease A)
Further, when CLT1's "Result" property = "Low", CLT1 has a relationship to Disease B
(CLT 1 {Result: "Low"}) -[:INDICATES] -> (Disease B)
Alternatively, if I take a relationship-centric approach, the code might look like this...
(CLT 1) -[:INDICATES {Result: "High"}] -> (Disease A)
(CLT 1) -[:INDICATES {Result: "Low"} ] -> (Disease B)
Problem 2
I have found that, as I model my data, I end up with one node with a unique name but with either different labels or different properties. The thing is that I want these nodes to be distinguishable. However, they are not, as they look the same to Cypher.
I can either give them multiple properties, labels or different names. The diversity has to be for each different class... in labels or properties (1+n labels, properties) or in different names.
Problem 2 relates to Problem 1 in that I can't model the conditional relationship or distinguish the same node (CLT1) by its labels or properties. I may have to resolve it by making the query-able "condition" in the relationship.
Do I have this right? Do I have any other options?
For your first question, I'd take the relationship-centric approach as this kind of represents the inference of the information leading from your result-node to the disease.
Should work pretty well in modeling and querying too.
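A minimal sketch of that relationship-centric model might look like the following. The Test and Disease labels and the name and result property names are assumptions made for illustration; only INDICATES and the High/Low values come from the question.

```cypher
// Hypothetical labels and property names; adjust to your own model
CREATE (t:Test {name: 'CLT1'})
CREATE (a:Disease {name: 'Disease A'})
CREATE (b:Disease {name: 'Disease B'})
// The condition lives on the relationship, not on the test node
CREATE (t)-[:INDICATES {result: 'High'}]->(a)
CREATE (t)-[:INDICATES {result: 'Low'}]->(b);

// Which disease does a high CLT1 result indicate?
MATCH (:Test {name: 'CLT1'})-[:INDICATES {result: 'High'}]->(d:Disease)
RETURN d.name;
```

This keeps a single CLT1 node, so there is no need to distinguish several copies of it by label or property.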
For your second question, that's what node labels are for: they represent different roles a node can play, each with different relevant properties and relationships.
So you could do MATCH (p:Person {name:"Jose"}) and treat it differently from MATCH (d:Developer {name:"Jose"}), i.e. look at other properties and relationships.
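As a small sketch of that idea, a single node can carry both labels and be matched through either one. The data is hypothetical.

```cypher
// One node, two roles
CREATE (:Person:Developer {name: 'Jose'});

// Matching via either label finds the same node; labels() lists all of its roles
MATCH (d:Developer {name: 'Jose'})
RETURN labels(d);
```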

Representing an item in an inventory

I am new to Neo4j and I need some advice from the more experienced Neo4j developers.
In which situation does it make sense for an inventory system to represent individual items as a path through their properties instead of a node with the same properties?
To make myself clear:
Let's say we have an eyeglass lens. This item has properties like its SPHERE power, its CYLINDER power and an AXIS, among others.
There is a finite set of SPHERE powers but also of CYLINDER power and AXIS. The combination of those makes an item (lens).
Does it make sense to represent a lens like this:
MATCH (lens:Lens)-[:`-2.00`]-(sph:Sphere {power:'-2.00'})-[:`-0.50`]-(cyl:Cylinder {power:'-0.50'})-[:`90`]-(ax:Axis {degree:'90'})
RETURN lens.brand_name, lens.price
Please note that the above item(lens) can be available from different manufacturers and with different brand names and list prices so "lens" will represent all individual brands that can match with the above query and will have as properties the brand name and price, at least.
Let's say you have a piece of data ("SPHERE"). When should it be a property of the lens node, and when should it be its own node, via relation?
Do you need to relate multiple lenses to the same sphere? This argues it should be its own node, so that multiple lenses can link to the same sphere.
Do you need to assert extra properties about the sphere value? (Like who measured it, or when?) This argues you should make it a separate node.
Do you need to store properties about the relationship? If the relationship is any more complicated than simple "HAS A" you might want a relationship between two nodes, so you can store properties on the relationship.
Any of those cases would argue you should store that piece of data as a separate node, and then relate it by relationship.
On the other hand, if it's a simple primitive data type (float), with a simple "HAS-A" relationship to the parent (i.e. a lens HAS-A sphere measurement) and you have no need for extra metadata, then it should be a node property.
I'm not an optometrist but I think this latter situation is your case, I'm just trying to give you a more general answer. "Sphere" should probably be a node property, but the cases above are how to think about the issue more generally for future data items.
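With the property approach, the lookup from the question could be sketched like this. The sphere, cylinder and axis property names and the sample brand and price are assumptions; brand_name and price come from the question.

```cypher
// Hypothetical lens stored as a single node with plain properties
CREATE (:Lens {brand_name: 'ExampleBrand', price: 25.0,
               sphere: -2.00, cylinder: -0.50, axis: 90});

// Find every brand offering this prescription
MATCH (lens:Lens {sphere: -2.00, cylinder: -0.50, axis: 90})
RETURN lens.brand_name, lens.price;
```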
In your special domain, with finite ranges and discrete values for each of the parameters, it absolutely makes sense to model the properties of a lens as value nodes. The resulting index graph seems not to be too large, and quite balanced (no supernodes).
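A sketch of that value-node variant, assuming hypothetical HAS_SPHERE, HAS_CYLINDER and HAS_AXIS relationship types and one shared node per discrete value:

```cypher
// Each lens links to shared value nodes rather than storing the values itself
MATCH (lens:Lens)-[:HAS_SPHERE]->(:Sphere {power: -2.00}),
      (lens)-[:HAS_CYLINDER]->(:Cylinder {power: -0.50}),
      (lens)-[:HAS_AXIS]->(:Axis {degree: 90})
RETURN lens.brand_name, lens.price;
```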

Neo4j node property type

I'm playing around with neo4j, and I was wondering, is it common to have a type property on nodes that specify what type of Node it is? I've tried searching for this practice, and I've seen some people use name for a purpose like this, but I was wondering if it was considered a good practice or if indexes would be the more practical method?
An example would be a "User" node, which would have type: user, this way if the index was bad, I would be able to do an all-node scan and look for types of user.
Labels were added in Neo4j 2.0. They fix this problem.
You can create nodes with labels:
CREATE (me:American {name: "Emil"}) RETURN me;
You can match on labels:
MATCH (n:American)
WHERE n.name = 'Emil'
RETURN n
You can set any number of labels on a node:
MATCH (n)
WHERE n.name='Emil'
SET n :Swedish:Bossman
RETURN n
You can delete any number of labels on a node:
MATCH (n { name: 'Emil' })
REMOVE n:Swedish
Etc...
True, it does depend on your use case.
If you add a type property and then wish to find all users, then you're in potential trouble as you've got to examine that property on every node to get to the users. In that case, the index would probably do better- but not in cases where you need to query for all users with conditions and relations not available in the index (unless of course, your index is the source of the "start").
If you have graphs like mine, where a relation type implies two different node types like A-(knows)-(B) and A or B can be a User or a Customer, then it doesn't work.
So your use case is really important- it's easy to model graphs generically, but important to "tune" it as per your usage pattern.
IMHO you shouldn't have to put a type property on the node. Instead, a common way to reference all nodes of a specific "type" is to connect all user nodes to a node called "Users" maybe. That way starting at the Users node, you can very easily find all user nodes. The "Users" node itself can be indexed so you can find it easily, or it can be connected to the reference node.
I think it's really up to you. Some people like indexed type attributes, but I find that they're mostly useful when you have other indexed attributes to narrow down the number of index hits (search for all users over age 21, for example).
That said, as @Luanne points out, most of us try to solve the problem in-graph first. Another way to do that (and the more natural way, in my opinion) is to use the relationship type to infer a practical node type, i.e. "A - (knows) -> B", so A must be a user or some other thing that can "know", and B must be another user, a topic, or some other object that can "be known".
For client APIs, modeling the element type as a property makes it easy to instantiate the right domain object in your client-side code so I always include a type property on each node/vertex.
The "type" var name is commonly used for this, but in some languages like Python, "type" is the name of a built-in that is best not shadowed, so I use "element_type" in Bulbs ( http://bulbflow.com/quickstart/#models ).
This is not needed for edges/relationships because they already contain a type (the label) -- note that Neo4j also uses the keyword "type" instead of label for relationships.
I'd say it's common practice. As an example, this is exactly how Spring Data Neo4j knows which entity type a certain node is. Each node has a "type" property that contains the qualified class name of the entity. These properties are automatically indexed in the "types" index, so nodes can be looked up really fast. You could implement your use case exactly like this.
Labels have recently been added to Neo4j 2.0 ( http://docs.neo4j.org/chunked/milestone/graphdb-neo4j-labels.html ). They are still under development at the moment, but they address this exact problem.

Resources