Neo4j index failed immediately without error

NOTE: this is not the same question as "Neo4j Index creation fails".
We have two types of nodes in our DB, both of which have a string property called text. For node A this property is not unique; for node B it is unique. (None of our Cypher code explicitly creates any boundary/uniqueness conditions on anything.)
We want to have two indexes:
create index on :A(text)
create index on :B(text)
(I am currently running these Cypher queries separately from the browser interface.)
The index on :B(text) takes a minute and works fine (4 million nodes), but the index on :A(text) (1 million nodes) fails immediately. I have also tried it with the Python driver at the lowest debug level and looked through /var/log/neo4j/.., but I can't find any indication of why this is failing. Judging by the documentation, the fact that :A(text) is not unique should not be a problem. Does anyone have a hunch about what could be wrong?
EDIT:
Neo4j Version: 3.4.8
Longest :A(text) string: 17 chars (individual words)
Longest :B(text) string: 53 chars (short phrases)
EDIT: I realize "fails immediately" can be a bit of a vague term, so here is exactly what happens when using the browser interface...
create index on :B(text)
returns
Added 1 index, completed after 1 ms.
immediately after
:schema
returns
Indexes
ON :B(text) POPULATING
after a while
:schema
returns
Indexes
ON :B(text) ONLINE
now for A
create index on :A(text)
returns
Added 1 index, completed after 1 ms.
immediately after
:schema
returns
Indexes
ON :B(text) ONLINE
ON :A(text) FAILED

Related

Simple Neo4j query is very slow on large database

I have a Neo4J database with the following properties:
Array Store 8.00 KiB
Logical Log 16 B
Node Store 174.54 MiB
Property Store 477.08 MiB
Relationship Store 3.99 GiB
String Store Size 174.34 MiB
Total Store Size 5.41 GiB
There are 12M nodes and 125M relationships.
So you could say this is a pretty large database.
My OS is Windows 10 64-bit, running on an Intel i7-4500U CPU @ 1.80GHz with 8GB of RAM.
This isn't a complete powerhouse, but it's a decent machine and in theory the total store could even fit in RAM.
However when I run a very simple query (using the Neo4j Browser)
MATCH (n {title:"A clockwork orange"}) RETURN n;
I get a result:
Returned 1 row in 17445 ms.
I also sent a POST request with the same query to http://localhost:7474/db/data/cypher; this took 19 seconds.
something like this:
http://localhost:7474/db/data/node/15000
is however executed in 23ms...
And I can confirm there is an index on title:
Indexes
ON :Page(title) ONLINE
So anyone have ideas on why this might be running so slow?
Thanks!
This has to scan all nodes in the db - if you re-run your query using n:Page instead of just n, it'll use the index on those nodes and you'll get better results.
To expand this a bit more - INDEX ON :Page(title) is only for nodes with a :Page label, and in order to take advantage of that index your MATCH() needs to specify that label in its search.
If a MATCH() is specified without a label, the query engine has no "clue" what you're looking for so it has to do a full db scan in order to find all the nodes with a title property and check its value.
That's why
MATCH (n {title:"A clockwork orange"}) RETURN n;
is taking so long - it has to scan the entire db.
If you tell the MATCH() you're looking for a node with a :Page label and a title property -
MATCH (n:Page {title:"A clockwork orange"}) RETURN n;
the query engine knows you're looking for nodes with that label, it also knows that there's an index on that label it can use - which means it can perform your search with the performance you're looking for.
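The difference between the two MATCH forms can be sketched with a toy in-memory model. This is only an illustration, not how Neo4j stores data: the `nodes` list, the `page_title_index` dictionary and both function names are hypothetical, and the extra nodes are made up for the example.

```python
from collections import defaultdict

# Toy store: every node has a set of labels and a property dict (made-up data).
nodes = [
    {"id": 0, "labels": {"Page"}, "props": {"title": "A clockwork orange"}},
    {"id": 1, "labels": {"Page"}, "props": {"title": "Brazil"}},
    {"id": 2, "labels": {"Category"}, "props": {"title": "Films"}},
]

# Stand-in for INDEX ON :Page(title): only :Page nodes are indexed.
page_title_index = defaultdict(list)
for node in nodes:
    if "Page" in node["labels"] and "title" in node["props"]:
        page_title_index[node["props"]["title"]].append(node["id"])


def match_without_label(title):
    """Like MATCH (n {title: ...}): every node in the store is inspected."""
    visited = 0
    hits = []
    for node in nodes:
        visited += 1
        if node["props"].get("title") == title:
            hits.append(node["id"])
    return hits, visited


def match_with_label(title):
    """Like MATCH (n:Page {title: ...}): a single lookup in the label index."""
    return page_title_index.get(title, [])
```

Both functions find the same node here, but the unlabelled version touches every node in the store, which is what gets expensive at 12M nodes.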

Neo4j Cypher Unknown Error

So I created a date dimension from this article
a link
I modified it and added a datestamp property to the Day node, which is Month/Day/Year (string).
I added indexes on Year.year, Month.month, Day.day and Day.datestamp.
When I run this query:
MATCH p=(day2:Day {datestamp:'1/1/2015'})-[:NEXT*]->(day {day:2})
return length(p)
limit 5
It takes 1667 ms to execute
When I modify the query to this:
MATCH p=(day2:Day {datestamp:'1/1/2015'})-[:NEXT*]->(day {datestamp:'1/2/2015'})
return length(p)
After it runs for about a minute, it ends in the Unknown Error message.
My schema is:
Indexes
ON :Day(day) ONLINE
ON :Day(datestamp) ONLINE
ON :Month(month) ONLINE
ON :Year(year) ONLINE
No constraints
Any ideas what I'm doing wrong?
I think I figured it out.
It looks like the first query, which runs in 1667 ms, only completes because of limit 5: it finds 5 records and stops further execution.
The other query keeps going and going until it runs out of juice.
I think the solution in this case is a constraint indicating that datestamp is unique, which should prevent further execution.
Still interesting, considering there are about 2600+ records connected with HAS_NEXT, so traversing those relationships shouldn't take this long to find out that only 1 record matches that query.
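The effect of limit 5 described above can be sketched with a lazy generator. This is a toy model of one linear chain, not Neo4j's actual execution; `expand_chain`, the counters and the filters are hypothetical names, and the chain length of 2600 is taken from the rough figure mentioned above.

```python
from itertools import islice


def expand_chain(chain_length, end_filter, counter):
    """Lazily walk a linear chain and yield the hop count of each matching end node."""
    for hops in range(1, chain_length):
        counter["expanded"] += 1  # one more step of the variable-length expansion
        if end_filter(hops):
            yield hops


# With a limit: consumption stops as soon as five matches have been produced.
with_limit = {"expanded": 0}
first_five = list(islice(expand_chain(2600, lambda h: h % 2 == 0, with_limit), 5))

# Without a limit: the whole chain is expanded even though only one node matches.
no_limit = {"expanded": 0}
all_hits = list(expand_chain(2600, lambda h: h == 1, no_limit))
```

In this model the limited run stops after 10 expansion steps, while the unlimited run walks all 2599 steps for a single hit. The real cost in Neo4j depends on the graph shape and the query planner.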

How do i check for a label in neo4j 2.1.2 when using a legacy index?

I just upgraded to Neo4j 2.1.2 from 2.0.1 and some of my cypher-queries stopped working.
I am using a self-defined Lucene index to find the start nodes, then navigate via a typed relationship (Partner_PartnerMeta) to a typed node (PartnerTyp). After that I just return a subset of these nodes.
My query previously used to check for the type of startnode (PartnerMeta). Since 2.1.2 the query
START partnermeta = node:PartnerTyp_Meta("Namen:wilhelm*")
MATCH (partner:PartnerTyp)-[:Partner_PartnerMeta]->(partnermeta:PartnerMeta)
RETURN DISTINCT partner SKIP 0 LIMIT 10
results in
Cannot add labels or properties on a node which is already bound (line 2, column 52)
"MATCH (partner:PartnerTyp)-[:Partner_PartnerMeta]->(partnermeta:PartnerMeta)"
^
This error can be suppressed by omitting the ":PartnerMeta" part of the query. As the type of the node returned from the index hasn't been checked yet, I would like to verify that it is of the type "PartnerMeta" (maybe I am being too paranoid).
My question is:
Is there a possibility to check for the type of node after the usage of START in combination with a legacy index?
This is a regression in Cypher 2.1.2 which will be fixed. It was an attempt to avoid invalid combinations of label checks.
For now, can you try:
START partnermeta = node:PartnerTyp_Meta("Namen:wilhelm*")
MATCH (partner:PartnerTyp)-[:Partner_PartnerMeta]->(partnermeta)
WHERE partnermeta:PartnerMeta
RETURN DISTINCT partner SKIP 0 LIMIT 10

Neo4j Cypher query fails and return with an Unknown Error

I'm trying to build a Cypher query to test if a specific structure exists so I can relate dates to it.
Running Neo4j 2.1.0-M01 on a Linux server, but the same issue occurred with Neo4j 2.0.1
We're starting with a clean database, 0 nodes.
First I'm running this MATCH query to prove that it runs.
Obviously this query is not going to return any nodes. But after creating the nodes, it will fail with an 'unknown error'.
It seems like a bug to me, since a query with fewer nodes will return. Does anyone have suggestions how to rewrite this query for now?
Sorry for the large amount of code in this post.
Thanks,
-Edwin
Cypher Query:
MATCH (c:Cluster{cluster_name:'mycluster',cluster_uuid:'7bd4f66d-5faf-11db-8d0d-000e0cba569c'})
,(sc1:Controller{serialnumber:'7000610071',system_id:'1873784171',hostname:'node01',node_uuid:'7cd70205-66ae-11e0-a4a9-0deba859517d'})
,(sc1)-[:IS_PART_OF_CLUSTER]->(c)
,(sc2:Controller{serialnumber:'7000606111',system_id:'1873778118',hostname:'node02',node_uuid:'b954f0a1-6682-11e0-b8a0-517da924923d'})
,(sc2)-[:IS_PART_OF_CLUSTER]->(c)
,(sc3:Controller{serialnumber:'7000561878',system_id:'1873772083',hostname:'node03',node_uuid:'ac293586-6690-11e0-b8a0-517da924923d'})
,(sc3)-[:IS_PART_OF_CLUSTER]->(c)
,(sc4:Controller{serialnumber:'800000075807',system_id:'1873784143',hostname:'node04',node_uuid:'e8d6c7e5-663e-11e0-b8a0-517da924923d'})
,(sc4)-[:IS_PART_OF_CLUSTER]->(c)
,(sc5:Controller{serialnumber:'7000477261',system_id:'1873745662',hostname:'node05',node_uuid:'1d1ecc64-728c-11e0-bf0e-3d25383f7ed3'})
,(sc5)-[:IS_PART_OF_CLUSTER]->(c)
,(sc6:Controller{serialnumber:'7000477273',system_id:'1873745654',hostname:'node06',node_uuid:'140fb0f9-728c-11e0-afeb-49fcf0b6e6c3'})
,(sc6)-[:IS_PART_OF_CLUSTER]->(c)
,(sc7:Controller{serialnumber:'7000474908',system_id:'1873745665',hostname:'node07',node_uuid:'edbf9c62-728b-11e0-bf0e-3d25383f7ed3'})
,(sc7)-[:IS_PART_OF_CLUSTER]->(c)
,(sc8:Controller{serialnumber:'7000474910',system_id:'1873745695',hostname:'node08',node_uuid:'20dbfe67-7832-11e0-afeb-49fcf0b6e6c3'})
,(sc8)-[:IS_PART_OF_CLUSTER]->(c)
,(sc9:Controller{serialnumber:'7000609864',system_id:'1873802756',hostname:'node09',node_uuid:'a8b75397-6690-11e0-b8a0-517da924923d'})
,(sc9)-[:IS_PART_OF_CLUSTER]->(c)
,(sc10:Controller{serialnumber:'7000610021',system_id:'1873791030',hostname:'node10',node_uuid:'f6cf0705-6670-11e0-b8a0-517da924923d'})
,(sc10)-[:IS_PART_OF_CLUSTER]->(c)
,(sc11:Controller{serialnumber:'7000610057',system_id:'1873784128',hostname:'node11',node_uuid:'551b1bf8-663d-11e0-b8a0-517da924923d'})
,(sc11)-[:IS_PART_OF_CLUSTER]->(c)
,(sc12:Controller{serialnumber:'7000609981',system_id:'1873778164',hostname:'node12',node_uuid:'f6062d6c-663e-11e0-9b53-cd0ece6aa2ce'})
,(sc12)-[:IS_PART_OF_CLUSTER]->(c)
,(sc13:Controller{serialnumber:'7000610033',system_id:'1873778186',hostname:'node13',node_uuid:'ed0e61c5-6670-11e0-b07f-933da0385fdc'})
,(sc13)-[:IS_PART_OF_CLUSTER]->(c)
,(sc14:Controller{serialnumber:'7000610069',system_id:'1873784175',hostname:'node14',node_uuid:'8623ea28-66ae-11e0-ab9d-5fdc7f30dee7'})
,(sc14)-[:IS_PART_OF_CLUSTER]->(c)
,(sc15:Controller{serialnumber:'7000606109',system_id:'1873778197',hostname:'node15',node_uuid:'b4349a83-6682-11e0-ab9d-5fdc7f30dee7'})
,(sc15)-[:IS_PART_OF_CLUSTER]->(c)
,(sc16:Controller{serialnumber:'7000610045',system_id:'1873784157',hostname:'node16',node_uuid:'67d80db8-663d-11e0-9b53-cd0ece6aa2ce'})
,(sc16)-[:IS_PART_OF_CLUSTER]->(c)
,(sc17:Controller{serialnumber:'7001246085',system_id:'2014175904',hostname:'node19',node_uuid:'1c792588-ff4e-11db-94fe-3b91dd7dd242'})
,(sc17)-[:IS_PART_OF_CLUSTER]->(c)
,(sc18:Controller{serialnumber:'7001246097',system_id:'2014176797',hostname:'node20',node_uuid:'3ae9e7ae-ff44-11db-864d-af2af862bba3'})
,(sc18)-[:IS_PART_OF_CLUSTER]->(c)
RETURN c,sc1,sc2,sc3,sc4,sc5,sc6,sc7,sc8,sc9,sc10,sc11,sc12,sc13,sc14,sc15,sc16,sc17,sc18
Rows returned: 0
Cypher Query for creating the nodes:
CREATE (c:Cluster{cluster_name:'mycluster',cluster_uuid:'7bd4f66d-5faf-11db-8d0d-000e0cba569c'})
,(sc1:Controller{serialnumber:'7000610071',system_id:'1873784171',hostname:'node01',node_uuid:'7cd70205-66ae-11e0-a4a9-0deba859517d'})
,(sc1)-[:IS_PART_OF_CLUSTER]->(c)
,(sc2:Controller{serialnumber:'7000606111',system_id:'1873778118',hostname:'node02',node_uuid:'b954f0a1-6682-11e0-b8a0-517da924923d'})
,(sc2)-[:IS_PART_OF_CLUSTER]->(c)
,(sc3:Controller{serialnumber:'7000561878',system_id:'1873772083',hostname:'node03',node_uuid:'ac293586-6690-11e0-b8a0-517da924923d'})
,(sc3)-[:IS_PART_OF_CLUSTER]->(c)
,(sc4:Controller{serialnumber:'800000075807',system_id:'1873784143',hostname:'node04',node_uuid:'e8d6c7e5-663e-11e0-b8a0-517da924923d'})
,(sc4)-[:IS_PART_OF_CLUSTER]->(c)
,(sc5:Controller{serialnumber:'7000477261',system_id:'1873745662',hostname:'node05',node_uuid:'1d1ecc64-728c-11e0-bf0e-3d25383f7ed3'})
,(sc5)-[:IS_PART_OF_CLUSTER]->(c)
,(sc6:Controller{serialnumber:'7000477273',system_id:'1873745654',hostname:'node06',node_uuid:'140fb0f9-728c-11e0-afeb-49fcf0b6e6c3'})
,(sc6)-[:IS_PART_OF_CLUSTER]->(c)
,(sc7:Controller{serialnumber:'7000474908',system_id:'1873745665',hostname:'node07',node_uuid:'edbf9c62-728b-11e0-bf0e-3d25383f7ed3'})
,(sc7)-[:IS_PART_OF_CLUSTER]->(c)
,(sc8:Controller{serialnumber:'7000474910',system_id:'1873745695',hostname:'node08',node_uuid:'20dbfe67-7832-11e0-afeb-49fcf0b6e6c3'})
,(sc8)-[:IS_PART_OF_CLUSTER]->(c)
,(sc9:Controller{serialnumber:'7000609864',system_id:'1873802756',hostname:'node09',node_uuid:'a8b75397-6690-11e0-b8a0-517da924923d'})
,(sc9)-[:IS_PART_OF_CLUSTER]->(c)
,(sc10:Controller{serialnumber:'7000610021',system_id:'1873791030',hostname:'node10',node_uuid:'f6cf0705-6670-11e0-b8a0-517da924923d'})
,(sc10)-[:IS_PART_OF_CLUSTER]->(c)
,(sc11:Controller{serialnumber:'7000610057',system_id:'1873784128',hostname:'node11',node_uuid:'551b1bf8-663d-11e0-b8a0-517da924923d'})
,(sc11)-[:IS_PART_OF_CLUSTER]->(c)
,(sc12:Controller{serialnumber:'7000609981',system_id:'1873778164',hostname:'node12',node_uuid:'f6062d6c-663e-11e0-9b53-cd0ece6aa2ce'})
,(sc12)-[:IS_PART_OF_CLUSTER]->(c)
,(sc13:Controller{serialnumber:'7000610033',system_id:'1873778186',hostname:'node13',node_uuid:'ed0e61c5-6670-11e0-b07f-933da0385fdc'})
,(sc13)-[:IS_PART_OF_CLUSTER]->(c)
,(sc14:Controller{serialnumber:'7000610069',system_id:'1873784175',hostname:'node14',node_uuid:'8623ea28-66ae-11e0-ab9d-5fdc7f30dee7'})
,(sc14)-[:IS_PART_OF_CLUSTER]->(c)
,(sc15:Controller{serialnumber:'7000606109',system_id:'1873778197',hostname:'node15',node_uuid:'b4349a83-6682-11e0-ab9d-5fdc7f30dee7'})
,(sc15)-[:IS_PART_OF_CLUSTER]->(c)
,(sc16:Controller{serialnumber:'7000610045',system_id:'1873784157',hostname:'node16',node_uuid:'67d80db8-663d-11e0-9b53-cd0ece6aa2ce'})
,(sc16)-[:IS_PART_OF_CLUSTER]->(c)
,(sc17:Controller{serialnumber:'7001246085',system_id:'2014175904',hostname:'node19',node_uuid:'1c792588-ff4e-11db-94fe-3b91dd7dd242'})
,(sc17)-[:IS_PART_OF_CLUSTER]->(c)
,(sc18:Controller{serialnumber:'7001246097',system_id:'2014176797',hostname:'node20',node_uuid:'3ae9e7ae-ff44-11db-864d-af2af862bba3'})
,(sc18)-[:IS_PART_OF_CLUSTER]->(c)
RETURN c,sc1,sc2,sc3,sc4,sc5,sc6,sc7,sc8,sc9,sc10,sc11,sc12,sc13,sc14,sc15,sc16,sc17,sc18
19 nodes created, 18 relationships.
Now when I run the first query again, it takes one minute and will eventually return with 'Unknown error'.
Cypher Query: the same MATCH query as shown above.
Returns Unknown Error
Your MATCH query is way too complicated. There's no need to specify every node. The following query returns the cluster and its related controllers:
MATCH (c:Cluster{cluster_name:'mycluster',cluster_uuid:'7bd4f66d-5faf-11db-8d0d-000e0cba569c'})<-[:IS_PART_OF_CLUSTER]-(sc:Controller)
WITH c, collect(sc) as controllers
RETURN c as cluster, controllers
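What `collect(sc)` does in that query can be sketched as a plain grouping step. This is a minimal illustration with hypothetical data: the pairs below stand in for IS_PART_OF_CLUSTER relationships, and `othercluster`, `node99` and the function name are made up for the example.

```python
from collections import defaultdict

# (controller hostname, cluster name) pairs standing in for
# (:Controller)-[:IS_PART_OF_CLUSTER]->(:Cluster) relationships.
is_part_of_cluster = [
    ("node01", "mycluster"),
    ("node02", "mycluster"),
    ("node03", "mycluster"),
    ("node99", "othercluster"),  # hypothetical second cluster
]


def controllers_by_cluster(relationships):
    """Group controllers per cluster, like WITH c, collect(sc) AS controllers."""
    grouped = defaultdict(list)
    for controller, cluster in relationships:
        grouped[cluster].append(controller)
    return dict(grouped)


result = controllers_by_cluster(is_part_of_cluster)
```

One pattern plus an aggregation returns every controller of the cluster, so the query no longer has to name each of the 18 controllers individually.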

neo4j REST 'Server got itself in trouble'

I am running a very basic test to check my understanding and evaluate neo4j REST server (neo4j-community-1.8.M07). I am using Neo4j Python REST Client.
Each test iteration starts with random strings for the source node name and the destination node name. The names contain only the letters a..z and the digits 0..9 (oddly enough, I never got it to fail when using A..Z and 0..9). A name may be from 1 to 36 characters long, and no character repeats within it. I create 36 nodes, where the 1st node name is only one character long and the 36th node name has 36 characters. Then I create relations between all nodes. The name of each relation is the concatenation of the source node name and the destination node name. The final graph has 37 nodes (1 reference node and 36 nodes with names from 1 to 36 non-repeating characters) and 1260 relations. Before each test iteration I clear the graph, so that it has only one node (the reference node).
The problem is that after several successful iterations neo4j REST server crashes:
Error [500]: Internal Server Error. Server got itself in trouble.
Invalid data sent
The query that crashes the system can be different - here is an example of a query_string that caused a problem:
START n_from=node:index_faqts(node_name="h"),
      n_to=node:index_faqts(node_name="hg2b8wpj04ms")
CREATE UNIQUE n_from-[r:`hhg2b8wpj04ms`]->n_to
RETURN r
It is executed with:
self.cypher_extension.execute_query( query_string )
I spent a lot of time trying to find a trend, but in vain. If I did something wrong with the queries none of the tests would ever work. I have observed crashes for number of successful test cycles between 5 and 25 rounds.
What might be causing neo4j REST server to crash?
P.S. Some details...
The nodes are created like this:
...
self.index_faqts[ "node_name" ][ p_str_node_name ] = self.gdb.nodes.create( **p_dict_node_attributes )
...
Just in case - before issuing the query to create a new relation I check the graph to make sure that the source and the destination nodes exist. That check never failed.
You are using too many relationship types; currently the limit is 32k. This might be patched in Neo4j if you have a valid use case.
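A rough back-of-the-envelope check shows how quickly this test would reach that limit. The sketch rests on assumptions that are not confirmed above: that "32k" means 32 * 1024, that every relation name in an iteration is a new, distinct relationship type, and that types are not reclaimed when the graph is cleared. The function names are hypothetical.

```python
# Assumption: "32k" is read as 32 * 1024 relationship types.
REL_TYPE_LIMIT = 32 * 1024


def types_per_iteration(node_count):
    """One distinct relationship type per ordered pair of different nodes."""
    return node_count * (node_count - 1)


def full_iterations_before_limit(node_count, limit=REL_TYPE_LIMIT):
    """How many complete test iterations fit under the relationship-type limit."""
    return limit // types_per_iteration(node_count)
```

With 36 nodes this gives 1260 new types per iteration, matching the 1260 relations in the question, and about 26 full iterations before the limit is reached. That is in the same range as the 5 to 25 successful rounds reported, assuming some types already existed from earlier runs. Using one fixed relationship type and storing the name as a property would avoid creating new types.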
