I am trying to dump some data to Neo4j. Some of my node names (in the format chosen for dumping) contain numbers, which have to be exported as node names.
I encounter the following error when the node name or label starts with a number.
Neo.ClientError.Statement.InvalidSyntax
MERGE (1:User {name: "u1"})
Is this because Neo4j internally has a unique ID? How do we circumvent this problem?
I believe these are just the syntax rules Neo4j uses. Also keep in mind that the thing you are referring to as the node name (1, in your example) is actually a variable name, and only persists for the duration of the query (or until it leaves scope if not carried over in a WITH clause to the next part of the query).
From the developer documentation:
Variable names are case sensitive, and can contain underscores and
alphanumeric characters (a-z, 0-9), but must always start with a
letter...The same rules apply to property names.
While I didn't see anything about label names, they appear to follow the same syntax rules.
Property values, of course, can be anything you want.
You described the limitation as a "problem", so I'm guessing there's a perceived issue with this in your import, likely around the confusion between variables and what you called node names. If that's so, then please add some more details to your description, and I can add on to my answer accordingly.
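If the dump is generated by a script, one possible workaround is to backtick-quote any identifier that breaks the naming rule quoted above, since Cypher accepts escaped names such as `1` inside backticks. The sketch below is only an illustration under that assumption; quote_identifier and merge_statement are hypothetical helper names, not part of any Neo4j driver. Keep in mind that the quoted name is still just a query variable, so a number that must survive in the database belongs in a property.

```python
import re

# Plain Cypher identifiers: a letter first, then letters, digits, or underscores.
_PLAIN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def quote_identifier(name: str) -> str:
    """Return name unchanged if it is a plain identifier, else backtick-quote it."""
    if not name:
        raise ValueError("identifier must not be empty")
    if _PLAIN.fullmatch(name):
        return name
    # A backtick inside a quoted name is escaped by doubling it.
    return "`" + name.replace("`", "``") + "`"


def merge_statement(variable: str, label: str) -> str:
    """Build a MERGE statement text; the name value is left as the $name parameter."""
    return f"MERGE ({quote_identifier(variable)}:{quote_identifier(label)} {{name: $name}})"


statement = merge_statement("1", "User")
print(statement)
```

The property value is left as a parameter ($name) on purpose, so only identifiers ever need quoting.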
Related
I believe this code demonstrates a bug in this software. But given my lack of experience
with Neo4j, maybe something is coded incorrectly. I would like to know if the unexpected output is due to a bug or how I should change my code
to get the correct output.
Thanks for your help.
my error text
MATCH (T8) matches every node in the database and assigns them to a variable named T8.
I believe you want to use MATCH (:T8) to match every node in the database with the label T8.
To delete every node with a T8 label you can use
MATCH (t:T8) DETACH DELETE (t)
The DETACH portion of the DELETE first deletes any relationships in or out of the node and then deletes the node.
I think you may have intended to use node labels for your nodes. You only used variables, which just have values within a statement execution. Variable values are not stored in the DB, whereas labels are.
Therefore, in your queries, there is nothing stored in the DB that distinguishes nodes that were referenced using the variable name T7 from nodes referenced using the variable name T8.
Here is an example of a Cypher node pattern with a variable name, foo, but no label:
(foo)
And here is an example with the same variable name and also a label, Bar:
(foo:Bar)
Notice that label names must be preceded by a colon (:) within a Cypher node pattern.
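To make the variable/label distinction concrete, here is a rough in-memory analogy in Python. It is not how Neo4j works internally; the nodes list and the match function are made up for illustration. The point it shows is that a bare variable filters nothing, while a label does.

```python
# Hypothetical stand-in for a graph: each node stores labels and properties.
# Variable names from earlier queries are not stored anywhere.
nodes = [
    {"labels": {"T7"}, "properties": {"name": "a"}},
    {"labels": {"T8"}, "properties": {"name": "b"}},
    {"labels": set(), "properties": {"name": "c"}},
]


def match(all_nodes, label=None):
    """Mimic MATCH (x) when label is None, and MATCH (x:Label) otherwise."""
    return [n for n in all_nodes if label is None or label in n["labels"]]


every_node = match(nodes)       # like MATCH (T8): the variable name filters nothing
t8_nodes = match(nodes, "T8")   # like MATCH (:T8): only nodes carrying the T8 label

print(len(every_node), len(t8_nodes))
```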
I need to do something unusual. I have a query that cannot be changed. It is a MATCH query for fetching a record:
MATCH (j:journal) WHERE j.id in [12] RETURN j.`id` AS ID, j.`language` AS LANGUAGE
And I have a node that contains an array as a property. For example, it can be created like this: CREATE (j:journal {id: 12, language: ["English", "Polish"]})
So, is there any way to display this node as two records with the same id but different language fields? Like the following:
ID | LANGUAGE
12 | English
12 | Polish
The important thing is that the MATCH query can't be changed at all.
But the node can be changed.
I know that I can add the UNWIND keyword for the language field in the source query, but there is a requirement not to.
I didn't find anything like that in the documentation or on the internet. I'm not sure if it's even possible (but the consumer wants it). I just don't have much experience with Neo4j.
I understand that it may sound weird, but I need to understand whether it can be implemented this way.
Thanks in advance.
If you can change the DB, you can change it so that each journal node contains a single language (as a scalar value, not in a list). However, this change might break any other queries that you might have.
If this conversion is acceptable, here is a query that should: (a) convert existing journal nodes to have a scalar language value, and (b) create new journal nodes as necessary for the remaining language values. The nodes that are spawned from an original journal node will share the same properties (except for language).
MATCH (j:journal)
WITH j, j.language[1..] AS langs
SET j.language = j.language[0]
WITH j, langs
UNWIND langs AS lang
CREATE (k:journal)
SET k = j, k.language = lang
If a node's language property had N values, you will end up with N nodes, each with the same properties -- except for the language property, which will contain a different language value (as a string). For efficiency, the original node is reused.
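As a sanity check on what the conversion is meant to produce, the same transformation can be sketched on a plain Python dict. This only models the intended result, not the Cypher execution; split_by_language is a hypothetical name.

```python
def split_by_language(node: dict) -> list:
    """Return one copy of the node per language, each with a scalar language value."""
    return [{**node, "language": lang} for lang in node["language"]]


journal = {"id": 12, "language": ["English", "Polish"]}
rows = split_by_language(journal)

for row in rows:
    print(row["id"], "|", row["language"])
```

Each resulting row corresponds to one journal node after the conversion, which is why the unchanged MATCH query would then return one record per language.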
I am going over this YouTube tutorial, "Using LOAD CSV in the Real World".
The tutorial shows how to take a CSV, where each row is a complaint made against some bank, and model it as a Neo4j dictionary.
When doing so, the narrator sets Properties on the Complaint node:
CREATE (complaint:Complaint {id: line.`Complaint ID`})
SET complaint.year= TOINT(date[2]),
complaint.month= TOINT(date[0]),
complaint.day = TOINT(date[1])
I'm confused about a small point -- what makes this date information more of a 'Property' than a Label?
Could this be modeled instead where the node has this information encapsulated as Labels instead of Properties? At what point do you need one of these and not the other?
Labels and properties are very different things.
A property belongs to a node or a relationship, and has a name and a value.
A node label is similar in concept to a "class name", and has no value.
So, it does not make any sense to talk about putting a date value in a "label". You can only put a value in a property.
Note, however, that people often use a label name (e.g., "Foo") as a shorthand for "node that has the Foo label". For example, they may say "store the date in Foo" when they actually mean "store the date in the appropriate property of a node with the label Foo". Perhaps this is what is causing the confusion.
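A small Python analogy may help here. It assumes the tutorial's date variable comes from splitting a string shaped like "MM/DD/YYYY" on "/", which is a guess based on the indexes used in the question. The complaint_node function and the sample values are made up for illustration.

```python
def complaint_node(complaint_id: str, date_received: str) -> dict:
    """Model a node: labels are bare tags, properties are name/value pairs."""
    date = date_received.split("/")  # assumed format: MM/DD/YYYY
    return {
        "labels": {"Complaint"},  # a label has a name but no value
        "properties": {
            "id": complaint_id,
            "year": int(date[2]),
            "month": int(date[0]),
            "day": int(date[1]),
        },
    }


node = complaint_node("1234", "07/29/2013")
print(node["labels"], node["properties"])
```

There is simply no slot in the labels set where 2013 could go, whereas a property always pairs a name with a value.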
As cybersam pointed out in his answer, labels cannot contain values. They are just... labels. Like a tag. Taking this in a slightly different direction:
A long, long time ago, in a version far, far away, Neo4j didn't have labels. So, if you wanted to identify a particular type of node (say... a Person)... you'd likely include a property+value such as nodeType = 'Person'. And then you'd include a filter in your queries, such as:
WHERE node.nodeType = 'Person'
Labels make such a property type obsolete, and are also indexable. Further, you may have multiple labels on a node (which would require your legacy nodeType property to be an array, and not as efficient to search).
So: Labels for tagging/indexing. Properties for holding values.
"Node labels serve as an anchor point for a query. By specifying a label, we are specifying a subset of one or more nodes with which to start a query. Using a label helps to reduce the amount of data that is retrieved." https://graphacademy.neo4j.com/
In my recent question, Modeling conditional relationships in neo4j v.2 (cypher), the answer has led me to another question regarding my data model and the Cypher syntax to represent it. Let's say that in my model there is a node CLT1 that I'll call the Source node. CLT1 has relationships to 286 other Target nodes. This is a model of a target node:
CREATE
(Abnormally_high:Label1:Label2:Label3:Label4:Label5:Label6:Label7:Label8:Label9:Label10
{Pro1:'x',Prop2:'y',Prop3:'z'})
Key point: I am assuming that the string after the CREATE clause is the ID of this target node. The ID is significant because its content has domain-specific meaning and is queryable. In this case it is the phrase "Abnormally_high".
I made this assumption based on the movie database example.
CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
CREATE (Carrie:Person {name:'Carrie-Anne Moss', born:1967})
The first strings after CREATE definitely have domain-specific meaning!
In my earlier post I discuss Problem 2. Problem 2 arises because, among the 286 target nodes, there are many instances where at least one other Target node shares the identical ID. In this instance, the ID is "Abnormally_high". The other Target nodes may differ in the value of any of Label1 through Label10 or in the associated properties.
Apparently, Cypher doesn't like that. In Problem 2, I was discussing ways to deal with the fact that Cypher doesn't like the same node ID being used multiple times, even though the labels or properties are different.
My problem is my assumptions about the Target node ID.
AM I RIGHT?
I am now thinking that I could instead use this....
CREATE (CLT1_target_1:Label1:Label2:Label3:Label4:Label5:Label6:Label7:Label8:Label9:Label10
{name:'Abnormally_high',Prop2:'y',Prop3:'z'})
If indeed the first string after the CREATE clause is an ID, then all I have to do is put a unique target node identifier.... like CLT1_target_1 and increment up to CLT1_target_286. If I do this, then I can have the name as a property and change whatever label or property I want.
Do I have this right?
You are wrong. In Cypher, a node name (like "Abnormally_high") is just a variable name that exists for the lifetime of the query (and sometimes not even that long). The node name used in a Cypher query is never persisted in any way, and can be any valid identifier.
Also, in neo4j, the term "ID" has a specific meaning. The neo4j DB will automatically assign a (currently) unique integer ID to each new node. You have no control over the ID value assigned to a node. And when a node is deleted, neo4j can reassign its ID to a new node.
You should read the neo4j manual (available at docs.neo4j.org), especially the section on Cypher, to get a better understanding.
START names = node(*),
target=node:node_auto_index(target_name="TARGET_1")
MATCH names
WHERE NOT names-[:contains]->()
AND HAS (names.age)
AND (names.qualification =~ ".*(?i)B.TECH.*$"
OR names.qualification =~ ".*(?i)B.E.*$")
CREATE UNIQUE (names)-[r:contains{type:"declared"}]->(target)
RETURN names.name,names,names.qualification
Iam consisting of nearly 1,80,000 names nodes, i had iterated the above process to create unique relationships above 100 times by changing the target. its taking too much amount of time.How can i resolve it..
i build the query with java and iterated.iam using neo4j 2.0.0.5 and java 1.7 .
I edited your cypher query because I think I understand it, but I can barely read the rest of your question. If you edit it with white spaces and punctuation it might be easier to understand what you are trying to do. Until then, here are some thoughts about your query being slow.
You bind all the nodes in the graph, that's typically pretty slow.
You bind all the nodes in the graph twice. First you bind universally in your start clause: names=node(*), and then you bind universally in your match clause: MATCH names, and only then you limit your pattern. I don't quite know what the Cypher engine makes of this (possibly it gets a migraine and goes off to make a pot of coffee). It's unnecessary, you can at least drop the names=node(*) from your start clause. Or drop the match clause, I suppose that could work too, since you don't really do anything there, and you will still need a start clause for as long as you use legacy indexing.
You are using Neo4j 2.x, but you use legacy indexing instead of labels, at least in this query. Without knowing your data and model it's hard to know what the difference would be for performance, but it would certainly make it much easier to write (and read) your queries. So, that's a different kind of slow. It's likely that if you had labels and label indices, the query performance would improve.
So, first try removing one of the universal bindings of nodes, then use the 2.x schema tools to structure your data. You should be able to write queries like
MATCH (target:Target)
WHERE target.target_name="TARGET_1"
WITH target
MATCH (names:Name)
WHERE NOT (names)-[:contains]->()
AND HAS (names.age)
AND (names.qualification =~ ".*(?i)B.TECH.*$"
OR names.qualification =~ ".*(?i)B.E.*$")
CREATE UNIQUE (names)-[r:contains{type:"declared"}]->(target)
RETURN names.name,names,names.qualification
I have no idea if such a query would be fast on your data, however. If you put the "Name" label on all your nodes, then MATCH (names:Name) will still bind all nodes in the database, so it'll probably still be slow.
P.S. The relationships you create have a TYPE called contains, and you give them a property called type with value declared. Maybe you have a good reason, but that's potentially very confusing.
Edit:
Reading through your question and my answer again I no longer think that I understand even your cypher query. (Why are you returning both the bound nodes and properties of those nodes?) Please consider posting sample data on console.neo4j.org and explain in more detail what your model looks like and what you are trying to do. Let me know if my answer meets your question at all or I'll consider removing it.