Do labels and properties have an id value in Neo4j? - neo4j

I know that nodes and relationships have an integral value that identifies them. Is the same true of labels and/or properties? Is it sufficient to identify a node labeling by giving a node id and a string? That is, is it possible to assign the same label to the same node/relationship more than once?

All nodes and relationships have IDs and can be looked up by their IDs. You can do it like this:
MATCH (n) WHERE id(n) = 5 RETURN n;
IDs should be thought of as an internal implementation detail; don't rely on them to have any particular value or to be consistent or ordered, just rely on them to uniquely identify each node.
In general, IMHO it's good practice to assign your own meaningful identifier to nodes, probably indexed, that you can use to find your nodes, in the same way you'd assign a primary key to a relational database record.
It is possible to assign a label to more than one node; labels should be thought of more as classes of nodes, sort of like entities in an ERD. Typically labels would be things like Person, Company, Job, etc. Labels don't have much to do with identifiers though.

Do labels and properties have an id value in Neo4j?
No
Is it sufficient to identify a node labeling by giving a node id and a string?
If by this you mean that you have the node id and a label value in a String variable then yes, it is. It is also sufficient to just use the ID.
MATCH (n) WHERE ID(n) = 1234 RETURN n
Better:
MATCH (n:YourLabel) WHERE ID(n) = 1234 RETURN n
As FrobberOfBits intimated it is also preferable to ignore (withint reason) the internal IDs that Neo attaches to Nodes/Relationships in your external interactions. This is in part to do with Neo recycling it's internal identifiers (So if you create a Node it gets assigned ID 1, delete that Node, the next created Node could be assigned ID 1), and inpart to do with exposing the internals of the system. Instead you should probably attach meaningful identifiers or UUIDS where required.
Using your own identifier:
MATCH (n:YourLabel{uid:1234}) RETURN n
To make lookups fast, index them:
CREATE INDEX ON :YourLabel(uid)
Is it possible to assign the same label to the same node/relationship more than once?
Nodes can have as many labels as you want to assign them (including none), which is handy when you want to maintain a hierarchy of Node "types". Having a label makes them faster to lookup as Neo has a hint of where to start.
Relationships can only have a single type and that type is immutable. i.e If you want to change from type HAS_A to HAD_A you cannot just change the type, you must delete and re-add the relationship.
A node can be related to another node as many times as you want, using the same relationship type or different relationship types. Performancewise it is better to have different relationship types than to use properties on relationships as the lookup is faster.
CREATE (p:Person{name:"Dave"}), (m:Pet), (d:Pet),
(p)-[:HAS_PET{type:"Cat"}]->(m),
(p)-[:HAS_PET{type:"Dog"}]->(d)
Is fine, as is:
REATE (p:Person{name:"Dave"}), (m:Pet), (d:Pet),
(p)-[:HAS_CAT]->(m),
(p)-[:HAS_DOG]->(d)
Which is now faster if you wanted to do a query for matching all dogs. If Dave's dog gets reassigned you cannot do:
MATCH (p:Person{name:"Dave"})-[rel:HAS_DOG]->()
SET rel:HAS_CAT
But you could do:
MATCH (p:Person{name:"Dave"})<-[rel:HAS_PET{type:"Dog"}]-()
SET rel.type = "Cat"
But I've probably missed the point of the multiple-assignment question.

Not sure what you're aiming for:
yes, property-names, rel-types and labels have internal id's they are not stored as strings
you can identify nodes by label + property + value, one if you have a uniqueness constraint otherwise multiple
you can identify nodes by label (many)

Related

How to move all neo4j relationships with all their labels and properties from one node to another?

Suppose you've got two nodes that represent the same thing, and you want to merge those two nodes. Both nodes can have any number of relations with other nodes.
The basics are fairly easy, and would look something like this:
MATCH (a), (b) WHERE a.id == b.id
MATCH (b)-[r]->()
CREATE (a)-[s]->()
SET s = PROPERTIES(r)
DELETE DETACH b
Only I can't create a relation without a type. And Cypher doesn't support variable labels either. I'd love to be able to do something like
CREATE (a)-[s:{LABELS(r)}]->(o)
but that doesn't work. To create the relation, you need to know the type of the relation, and in this case I really don't.
Is there a way to dynamically assign types to relationships, or am I going to have to query the types of the old relation, and then string concat new queries with the proper types? That's not impossible, but a lot slower and more complex. And this could potentially match a lot of elements and even more relationships, so having to generate a separate query for every instance is going to slow things down quite a lot.
Or is there a way to change the target of the old relationship? That would probably be the fastest, but I'm not aware of any way to do that.
I think you need to take a look at APOC, especially apoc.create.relationship which enable creating relationships with dynamic type.
Adapting your example, you should end up with something along the line of (not tested):
MATCH (a), (b) WHERE a.id == b.id
MATCH (b)-[r]->(n)
CALL apoc.create.relationship(a, type(r), properties(r), n)
DETACH DELETE b
NB
relationships have TYPE and not label
the proper cypher statement to delete relationships attached to a node and the node itself is DETACH DELETE (and not DELETE DETACH)
Related resource: https://markhneedham.com/blog/2016/10/30/neo4j-create-dynamic-relationship-type/
The APOC procedure apoc.refactor.mergeNodes should be very helpful. That procedure is very powerful, and you need to read the documentation to understand how to configure it to do what you want in your specific situation.
Here is a simple example that shows how to use the procedure's default configuration to merge nodes with the same id:
MATCH (node:Foo)
WITH node.id AS id, COLLECT(node) AS nodes
WHERE SIZE(nodes) > 1
CALL apoc.refactor.mergeNodes(nodes, {}) YIELD node
RETURN node
In this example, I specified an arbitrary Foo label to avoid accidentally merging unwanted nodes. Doing so also helps to speed up the query if you have a lot of nodes with other labels (since they will not need to be scanned for the id property).
The aggregating function COLLECT is used to collect a list of all the nodes with the same id. After checking the size of the list, it is passed to the procedure.

NEO4J - Matching a path where middle node might exist or not

I have the following graph:
I would look to get all contractors and subcontractors and clients, starting from David.
So I thought of a query likes this:
MATCH (a:contractor)-[*0..1]->(b)-[w:works_for]->(c:client) return a,b,c
This would return:
(0:contractor {name:"David"}) (0:contractor {name:"David"}) (56:client {name:"Sarah"})
(0:contractor {name:"David"}) (1:subcontractor {name:"John"}) (56:client {name:"Sarah"})
Which returns the desired result. The issue here is performance.
If the DB contains millions of records and I leave (b) without a label, the query will take forever. If I add a label to (b) such as (b:subcontractor) I won't hit millions of rows but I will only get results with subcontractors:
(0:contractor {name:"David"}) (1:subcontractor {name:"John"}) (56:client {name:"Sarah"})
Is there a more efficient way to do this?
link to graph example: https://console.neo4j.org/r/pry01l
There are some things to consider with your query.
The relationship type is not specified- is it the case that the only relationships from contractor nodes are works_for and hired? If not, you should constrain the relationship types being matched in your query. For example
MATCH (a:contractor)-[:works_for|:hired*0..1]->(b)-[w:works_for]->(c:client)
RETURN a,b,c
The fact that (b) is unlabelled does not mean that every node in the graph will be matched. It will be reached either as a result of traversing the works_for or hired relationships if specified, or any relationship from :contractor, or via the works_for relationship.
If you do want to label it, and you have a hierarchy of types, you can assign multiple labels to nodes and just use the most general one in your query. For example, you could have a label such as ExternalStaff as the generic label, and then further add Contractor or SubContractor to distinguish individual nodes. Then you can do something like
MATCH (a:contractor)-[:works_for|:hired*0..1]->(b:ExternalStaff)-[w:works_for]->(c:client)
RETURN a,b,c
Depends really on your use cases.

neo4j: CYPHER query all properties of a node

We are evaluating Neo4J for future projects.  Currently just experimenting with learning Cypher and its capabilities.  But one thing that I think should be very straightforward has so far eluded me.  I want to be able to see all properties and their values for any given Node.  In SQL that would be something like:
select * from TableX where ID = 12345;
I have looked through the latest Neo4J docs and numerous Google searches but so far I am coming up empty.  I did find the keys() function that will return the property names in a string list, but that is marginally useful at best.  What I want is a query that will return prop names and the corresponding values like:
name     :  "Lebron"
city     :  "Cleveland"
college  :  "St. Vincent–St. Mary High School"
You may want to reread the Neo4j docs.
Returning the node itself will include the properties map for the node, which is typically the way you will get all properties (keys and values) for the node.
MATCH (n)
WHERE id(n) = 12345
RETURN n
If you explicitly just want the properties but without the metadata related to the node itself, returning properties(n) (assuming n is a node variable) will return the node's properties.
MATCH (n)
WHERE id(n) = 12345
RETURN properties(n) as props
With regard to how columns (variables) work, these are always explicit, so you don't have a way to dynamically get the columns corresponding to the properties of a node. You will instead need to use the approaches above, where the variable corresponds to either a node (where you can get at the properties map through the structure) or a properties map.
The main difference between this approach and select * in SQL is that Neo4j has no table schema, so you can use whatever properties you want on nodes of the same type, and those can differ between nodes of the same type, so there is no common structure to reference which will provide the properties for a node of a given label (you would need to scan all nodes of that label and accumulate the distinct properties to do so).

Limitation on Selection of nodes using labels

I want to keep restriction in cypher in selecting alike nodes without knowing the property values.
Let say, I have few nodes with BUYER as labels. And I don't know anything more than that regarding the database. And I wanted to see the list of properties for the BUYER nodes. And, all BUYER nodes have same set of properties. Then, I did this
My Approach:
MATCH (n:Buyer)
with keys(n) as each_node_keys
UNWIND each_node_keys as all_keys
RETURN DISTINCT(all_keys)
In my approach I can clearly see that, first line of query, MATCH(n:Buyer) is selecting all the nodes, iterating all the nodes, collecting all the properties and then filtering. Which is not a good idea.
In order to overcome this, I wanted to LIMIT the nodes we are selecting,
like instead of selecting all the nodes, How can I restrict it to select only one node and since I don't know any property values, I cannot filter using the property. Once I pick a node then I should not pick further nodes. How can I do that.
If as you said all Buyer nodes have the same property keys, you can just limit the MATCH for one node :
MATCH (n:Buyer)
WITH n LIMIT 1
RETURN keys(n)

identifying node types

I have a number of nodes of different types by that I mean nodes that have different properties on them. For instance, I have a number of nodes that all they have is a property of fileName and uploadDate. If I want to check against all file names do I just need to do
START n=node(*) WHERE has(n.File) RETURN n;
Is this the best practice (i.e. querying a flattened out database). Thanks!
Your query scans all nodes, this will become slower as your data set grows.
For identifying nodes of a certain type, there are two common approaches:
Type attribute
Set a property named 'type' (or '_type_' f.e. if you like to mark it as a system property) with the value describing your type, e.g. 'File'.
Then you can lookup nodes through an index like that:
start n=node:node_auto_index(type='File') return n;
Type nodes
Connect nodes of a certain type to 'type' node and query over relationships:
start type_node=node:node_auto_index(name='File')
match type_node<-[:IS_A]-file
return file;
(The Beer Graph on this page http://www.neo4j.org/learn/try is an example for this.)

Resources