How can I revert syntax from Cypher 2.0.1 to Cypher 2.0.0 for my Neo4j queries? - neo4j

I am using a series of nested FOREACH loops in a query that functions properly on a number of installations of Neo4j with matching datasets that we've used for testing. With the recent change to Cypher 2.0.1, my query doesn't work.
My initial instinct is to replace the /var/lib/neo4j/lib/neo4j-cypher-2.0.0.jar files, but I don't want to screw things up. Any thoughts?
Sample graph: http://console.neo4j.org/?id=ktrcwx
Here's the Query (my emphasis "**" indicates the point where error occurs):
$ MATCH (total:Recipe)
> WITH count(DISTINCT total) AS tots, timestamp() AS time
> MATCH (ia:Ingredient)<-[:HAS_INGREDIENT]-(recab:Recipe)-[recHasB:HAS_INGREDIENT]->(ib:Ingredient)
> WHERE id(ia)=5167
> WITH DISTINCT ib AS idB, count(DISTINCT recab) AS recAB , count(DISTINCT recHasB) AS recB, tots, time
> MATCH (i:Ingredient)<-[:HAS_INGREDIENT]-(r:Recipe)
> WHERE id(i)=5167
> WITH [i, count(DISTINCT r.id), idB, recAB, recB, tots, time] AS c
> FOREACH (row IN c |
> FOREACH (i1 in **c[0] |**
> FOREACH (recA in c[1] |
> FOREACH (i2 in c[2]|
> FOREACH (recAB in c[3] |
> FOREACH (recB in c[4] |
> FOREACH (totalRec in c[5] |
> CREATE (i1:Ingredient )-[pm1:PMI]->(i2: Ingredient)
> SET startNode(pm1).pmiTime = c[6], endNode(pm1).pmiTime = c[6], pm1.weight = log( (totalRec*recAB) /(recA*recB) ), pm1.pmiTime= c[6]
> CREATE (i1:Ingredient )<-[pm2:PMI]-(i2: Ingredient)
> SET startNode(pm2).pmiTime = c[6], endNode(pm2).pmiTime = c[6], pm2.weight = log( (totalRec*recAB) /(recA*recB) ), pm2.pmiTime= c[6]
> )
> )
> )
> )
> )
> )
> );
Here's the Error:
SyntaxException: Type mismatch: expected Collection<T> but was Any (line 10, column 25)" FOREACH (i1 in c[0] |"
Here's the Functioning Classpath:
Neo4j Server is running at pid 4347
NEO4J_HOME: /var/lib/neo4j
NEO4J_SERVER_PORT: 7474
NEO4J_INSTANCE: /var/lib/neo4j
JAVA_HOME:
JAVA_OPTS: -server -XX:+DisableExplicitGC -Dorg.neo4j.server.properties=conf/neo4j-server.properties -Djava.util.logging.config.file=conf/logging.properties -Dlog4j.configuration=file:conf/log4j.properties -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
CLASSPATH: /var/lib/neo4j/lib/concurrentlinkedhashmap-lru-1.3.1.jar:/var/lib/neo4j/lib/geronimo-jta_1.1_spec-1.1.1.jar:/var/lib/neo4j/lib/lucene-core-3.6.2.jar:/var/lib/neo4j/lib/neo4j-cypher-2.0.0.jar:/var/lib/neo4j/lib/neo4j-cypher-commons-2.0.0.jar:/var/lib/neo4j/lib/neo4j-cypher-compiler-1.9-2.0.0.jar:/var/lib/neo4j/lib/neo4j-cypher-compiler-2.0-2.0.0.jar:/var/lib/neo4j/lib/neo4j-graph-algo-2.0.0.jar:/var/lib/neo4j/lib/neo4j-graph-matching-2.0.0.jar:/var/lib/neo4j/lib/neo4j-jmx-2.0.0.jar:/var/lib/neo4j/lib/neo4j-kernel-2.0.0.jar:/var/lib/neo4j/lib/neo4j-lucene-index-2.0.0.jar:/var/lib/neo4j/lib/neo4j-shell-2.0.0.jar:/var/lib/neo4j/lib/neo4j-udc-2.0.0.jar:/var/lib/neo4j/lib/org.apache.servicemix.bundles.jline-0.9.94_1.jar:/var/lib/neo4j/lib/parboiled-core-1.1.6.jar:/var/lib/neo4j/lib/parboiled-scala_2.10-1.1.6.jar:/var/lib/neo4j/lib/scala-library-2.10.3.jar:/var/lib/neo4j/lib/server-api-2.0.0.jar:/var/lib/neo4j/system/lib/asm-3.1.jar:/var/lib/neo4j/system/lib/bcprov-jdk16-140.jar:/var/lib/neo4j/system/lib/commons-beanutils-1.8.0.jar:/var/lib/neo4j/system/lib/commons-beanutils-core-1.8.0.jar:/var/lib/neo4j/system/lib/commons-collections-3.2.1.jar:/var/lib/neo4j/system/lib/commons-compiler-2.6.1.jar:/var/lib/neo4j/system/lib/commons-configuration-1.6.jar:/var/lib/neo4j/system/lib/commons-digester-1.8.1.jar:/var/lib/neo4j/system/lib/commons-io-1.4.jar:/var/lib/neo4j/system/lib/commons-lang-2.4.jar:/var/lib/neo4j/system/lib/commons-logging-1.1.1.jar:/var/lib/neo4j/system/lib/jackson-core-asl-1.9.7.jar:/var/lib/neo4j/system/lib/jackson-jaxrs-1.9.7.jar:/var/lib/neo4j/system/lib/jackson-mapper-asl-1.9.7.jar:/var/lib/neo4j/system/lib/janino-2.6.1.jar:/var/lib/neo4j/system/lib/javax.servlet-3.0.0.v201112011016.jar:/var/lib/neo4j/system/lib/jcl-over-slf4j-1.6.1.jar:/var/lib/neo4j/system/lib/jersey-core-1.9.jar:/var/lib/neo4j/system/lib/jersey-multipart-1.9.jar:/var/lib/neo4j/system/lib/jersey-server-1.9.jar:/var/lib/neo4j/system/lib/jetty-http-9.0.5.v20130815.jar:/var/lib/
neo4j/system/lib/jetty-io-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-security-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-server-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-servlet-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-util-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-webapp-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-xml-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jsr311-api-1.1.2.r612.jar:/var/lib/neo4j/system/lib/logback-access-1.0.9.jar:/var/lib/neo4j/system/lib/logback-classic-1.0.9.jar:/var/lib/neo4j/system/lib/logback-core-1.0.9.jar:/var/lib/neo4j/system/lib/mimepull-1.6.jar:/var/lib/neo4j/system/lib/neo4j-browser-2.0.0.jar:/var/lib/neo4j/system/lib/neo4j-server-2.0.0.jar:/var/lib/neo4j/system/lib/neo4j-server-2.0.0-static-web.jar:/var/lib/neo4j/system/lib/rhino-1.7R3.jar:/var/lib/neo4j/system/lib/rrd4j-2.0.7.jar:/var/lib/neo4j/system/lib/slf4j-api-1.6.2.jar:/var/lib/neo4j/conf/
Here's the Malfunctioning Classpath:
Neo4j Server is running at pid 1361
NEO4J_HOME: /var/lib/neo4j
NEO4J_SERVER_PORT: 7474
NEO4J_INSTANCE: /var/lib/neo4j
JAVA_HOME:
JAVA_OPTS: -server -XX:+DisableExplicitGC -Dorg.neo4j.server.properties=conf/neo4j-server.properties -Djava.util.logging.config.file=conf/logging.properties -Dlog4j.configuration=file:conf/log4j.properties -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled
CLASSPATH: /var/lib/neo4j/lib/concurrentlinkedhashmap-lru-1.3.1.jar:/var/lib/neo4j/lib/geronimo-jta_1.1_spec-1.1.1.jar:/var/lib/neo4j/lib/lucene-core-3.6.2.jar:/var/lib/neo4j/lib/neo4j-cypher-2.0.1.jar:/var/lib/neo4j/lib/neo4j-cypher-commons-2.0.1.jar:/var/lib/neo4j/lib/neo4j-cypher-compiler-1.9-2.0.1.jar:/var/lib/neo4j/lib/neo4j-cypher-compiler-2.0-2.0.1.jar:/var/lib/neo4j/lib/neo4j-graph-algo-2.0.1.jar:/var/lib/neo4j/lib/neo4j-graph-matching-2.0.1.jar:/var/lib/neo4j/lib/neo4j-jmx-2.0.1.jar:/var/lib/neo4j/lib/neo4j-kernel-2.0.1.jar:/var/lib/neo4j/lib/neo4j-lucene-index-2.0.1.jar:/var/lib/neo4j/lib/neo4j-shell-2.0.1.jar:/var/lib/neo4j/lib/neo4j-udc-2.0.1.jar:/var/lib/neo4j/lib/org.apache.servicemix.bundles.jline-0.9.94_1.jar:/var/lib/neo4j/lib/parboiled-core-1.1.6.jar:/var/lib/neo4j/lib/parboiled-scala_2.10-1.1.6.jar:/var/lib/neo4j/lib/scala-library-2.10.3.jar:/var/lib/neo4j/lib/server-api-2.0.1.jar:/var/lib/neo4j/system/lib/asm-3.1.jar:/var/lib/neo4j/system/lib/bcprov-jdk16-140.jar:/var/lib/neo4j/system/lib/commons-beanutils-1.8.0.jar:/var/lib/neo4j/system/lib/commons-beanutils-core-1.8.0.jar:/var/lib/neo4j/system/lib/commons-collections-3.2.1.jar:/var/lib/neo4j/system/lib/commons-compiler-2.6.1.jar:/var/lib/neo4j/system/lib/commons-configuration-1.6.jar:/var/lib/neo4j/system/lib/commons-digester-1.8.1.jar:/var/lib/neo4j/system/lib/commons-io-1.4.jar:/var/lib/neo4j/system/lib/commons-lang-2.4.jar:/var/lib/neo4j/system/lib/commons-logging-1.1.1.jar:/var/lib/neo4j/system/lib/jackson-core-asl-1.9.7.jar:/var/lib/neo4j/system/lib/jackson-jaxrs-1.9.7.jar:/var/lib/neo4j/system/lib/jackson-mapper-asl-1.9.7.jar:/var/lib/neo4j/system/lib/janino-2.6.1.jar:/var/lib/neo4j/system/lib/javax.servlet-3.0.0.v201112011016.jar:/var/lib/neo4j/system/lib/jcl-over-slf4j-1.6.1.jar:/var/lib/neo4j/system/lib/jersey-core-1.9.jar:/var/lib/neo4j/system/lib/jersey-multipart-1.9.jar:/var/lib/neo4j/system/lib/jersey-server-1.9.jar:/var/lib/neo4j/system/lib/jetty-http-9.0.5.v20130815.jar:/var/lib/
neo4j/system/lib/jetty-io-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-security-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-server-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-servlet-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-util-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-webapp-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jetty-xml-9.0.5.v20130815.jar:/var/lib/neo4j/system/lib/jsr311-api-1.1.2.r612.jar:/var/lib/neo4j/system/lib/logback-access-1.0.9.jar:/var/lib/neo4j/system/lib/logback-classic-1.0.9.jar:/var/lib/neo4j/system/lib/logback-core-1.0.9.jar:/var/lib/neo4j/system/lib/mimepull-1.6.jar:/var/lib/neo4j/system/lib/neo4j-browser-2.0.1.jar:/var/lib/neo4j/system/lib/neo4j-server-2.0.1.jar:/var/lib/neo4j/system/lib/neo4j-server-2.0.1-static-web.jar:/var/lib/neo4j/system/lib/rhino-1.7R3.jar:/var/lib/neo4j/system/lib/rrd4j-2.0.7.jar:/var/lib/neo4j/system/lib/slf4j-api-1.6.2.jar:/var/lib/neo4j/conf/
Notes on intended outcome:
For every row in the collection, the query should create 2 relations: (i1)-[:PMI]->(i2) and (i2)-[:PMI]->(i1). The weight of the [:PMI] relations is the math in the Log() function. The graph is (:Ingredient) and (:Recipe) nodes. This query will create a relationship between (i1:Ingredient) and every (i2:Ingredient) that occurs in the recipe containing (i1). This allows me to understand probability of ingredient pairings.
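To make the weight formula concrete, here is a minimal Python sketch of the calculation inside log(). The function and argument names are illustrative only (they mirror totalRec, recAB, recA and recB from the query) and are not part of Neo4j. It assumes floating-point division is what is intended.

```python
import math


def pmi_weight(total_rec, rec_ab, rec_a, rec_b):
    """Weight of a PMI relationship between two ingredients.

    total_rec: total number of recipes
    rec_ab:    recipes containing both ingredients
    rec_a:     recipes containing the first ingredient
    rec_b:     recipes containing the second ingredient
    """
    # Python's "/" is always floating-point division (an assumption about
    # the intended behavior of the query's formula).
    return math.log((total_rec * rec_ab) / (rec_a * rec_b))


# Hypothetical counts: 100 recipes, 5 shared, 10 with A, 20 with B.
weight = pmi_weight(100, 5, 10, 20)
```

A positive weight means the two ingredients appear together more often than independent occurrence would predict; a weight of 0 means exactly as often.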

The error says FOREACH expects a collection, but c[0] is a single value. Can you wrap your non-collection items in [ ] and see if that solves the problem? For example: FOREACH (i1 IN [c[0]] |
Ok, here's an attempt at a slight rewrite that compiles. It would be great if you could post some example data on console.neo4j.org if it doesn't work:
MATCH (total:Recipe)
WITH count(DISTINCT total) AS tots, timestamp() AS time
MATCH (ia:Ingredient)<-[:HAS_INGREDIENT]-(recab:Recipe)-[recHasB:HAS_INGREDIENT]->(ib:Ingredient)
WHERE id(ia)=5167
WITH DISTINCT ib AS idB, count(DISTINCT recab) AS recAB , count(DISTINCT recHasB) AS recB, tots, time
MATCH (i:Ingredient)<-[:HAS_INGREDIENT]-(r:Recipe)
WHERE id(i)=5167
WITH i, count(DISTINCT r.id) as recA, idB as i2, recAB, recB, tots, time
CREATE (i)-[pm1:PMI {pmiTime:time, weight:log( (tots*recAB) /(recA*recB) )}]->(i2)
CREATE (i)<-[pm2:PMI {pmiTime:time, weight:log( (tots*recAB) /(recA*recB) )}]-(i2)
SET i.pmiTime = time, pm1.pmiTime = time, i2.pmiTime = time, pm2.pmiTime = time

InfluxDB subquery in WHERE clause

Having some issues wrapping my brain around this one. I have two tables in InfluxDB 1.8.x, here's the relevant data layout
table a
-------------------------------------------
|time               |hostname|device_cache|
|6/14/2022 9:00:30PM|device1 |dm-4        |
|6/14/2022 9:00:30PM|device2 |dm-4        |
|6/14/2022 9:00:30PM|device3 |dm-8        |
-------------------------------------------
table b
-----------------------------------------------------
|time               |hostname|diskiodevice|diskiola1|
|6/14/2022 9:00:30PM|device1 |dm-0        |8        |
|6/14/2022 9:00:30PM|device1 |dm-4        |7        |
|6/14/2022 9:00:30PM|device3 |dm-3        |9        |
|6/14/2022 9:00:30PM|device2 |dm-2        |8        |
|6/14/2022 9:00:30PM|device3 |dm-8        |15       |
|6/14/2022 9:00:30PM|device2 |dm-4        |9        |
|6/14/2022 9:00:30PM|device3 |dm-3        |1        |
-----------------------------------------------------
So, what I am trying to do is get all the diskiola1 values for the diskiodevices from table b that are defined as device_cache items from table a for a particular hostname entry. Here's what I've tried:
SELECT max("diskiola1")
FROM "table b"
WHERE hostname = 'device1'
AND
time > now() - 10m
AND
"cache_device" IN
( Select distinct("device_cache") as "cache_device" FROM "table a" WHERE hostname = 'device1')
GROUP BY time(20s)
My goal is to have this as a time series in a graph to show the values of diskiola1 for a given host over a period of time for only the device_cache items. This data is given to me to work with, I really can't modify it unfortunately.
Anyone see where I'm going wrong? The error I receive is
ERR: error parsing query: found IN, expected ;
Unfortunately, InfluxQL doesn't support the IN operator and is not expected to in the foreseeable future (see details here). InfluxQL doesn't support JOIN operations either (see details here).
It seems that "table a" is more of a mapping table, while "table b" stores the actual time series data. I am assuming that, for "table a", hostname is a tag and device_cache is a field, and that, for "table b", hostname is a tag and diskiodevice and diskiola1 are fields. You could try enabling Flux and running the following sample code:
aDistinctDeviceCache = from(bucket: "yourDatabaseName/yourRetentionPolicyName")
    |> range(start: 2018-05-22T23:30:00Z, stop: 2018-05-23T00:00:00Z) // start and stop can be changed
    |> filter(fn: (r) => r._measurement == "table a" and r.hostname == "device1" and r._field == "device_cache")
    |> distinct()
bDevice1 = from(bucket: "yourDatabaseName/yourRetentionPolicyName")
    |> range(start: -10m)
    |> filter(fn: (r) => r._measurement == "table b" and r.hostname == "device1")
    |> rename(columns: {diskiodevice: "device_cache"})
maxDiskiola1ForDevice1 =
    join(tables: {aPlus: aDistinctDeviceCache, bPlus: bDevice1}, on: ["hostname", "device_cache"])
        |> window(every: 20s)
        |> max(column: "diskiola1") // Flux function arguments must be named
        |> yield()
This first grabs the distinct device_cache values from "table a", then renames the diskiodevice column of "table b" to device_cache so that the two tables can be joined on it in the last step.
Here are some more tips to convert your InfluxQL to Flux and convert your subqueries.
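To illustrate what the join is meant to select, here is a small in-memory Python sketch using the sample rows from the question (timestamps omitted). It is not InfluxDB code; the helper name cache_device_values is made up for illustration, and it only mirrors the "keep table b rows whose diskiodevice is a device_cache of the same host" logic.

```python
# Rows copied from the question's sample tables (timestamps omitted).
table_a = [
    {"hostname": "device1", "device_cache": "dm-4"},
    {"hostname": "device2", "device_cache": "dm-4"},
    {"hostname": "device3", "device_cache": "dm-8"},
]
table_b = [
    {"hostname": "device1", "diskiodevice": "dm-0", "diskiola1": 8},
    {"hostname": "device1", "diskiodevice": "dm-4", "diskiola1": 7},
    {"hostname": "device3", "diskiodevice": "dm-3", "diskiola1": 9},
    {"hostname": "device2", "diskiodevice": "dm-2", "diskiola1": 8},
    {"hostname": "device3", "diskiodevice": "dm-8", "diskiola1": 15},
    {"hostname": "device2", "diskiodevice": "dm-4", "diskiola1": 9},
    {"hostname": "device3", "diskiodevice": "dm-3", "diskiola1": 1},
]


def cache_device_values(host):
    """Return diskiola1 values of table b rows whose device is a cache device of host."""
    # Step 1: the "subquery" - distinct cache devices for this host in table a.
    cache_devices = {row["device_cache"] for row in table_a if row["hostname"] == host}
    # Step 2: the outer query - table b rows for the host, restricted to those devices.
    return [
        row["diskiola1"]
        for row in table_b
        if row["hostname"] == host and row["diskiodevice"] in cache_devices
    ]


device1_values = cache_device_values("device1")
```

For device1 only the dm-4 row of table b survives, which is the row the IN subquery in the question was trying to pick out.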

Cypher - Attempting to print all nodes to text o/p exceeds Java Heap space - Neo4j V 3.5

Attempting to print all the node properties of my large graph to a human-readable text file leads to an out-of-heap-space error, even though the heap has been made huge (256GB).
Example Cypher:
match (n:Entity) return n, n.links_to, n.links_from;
echo "match (n:Entity) return n, n.links_to, n.links_from;" | /home/user/neo4j-enterprise/bin/cypher-shell > all_node_links.out
Is there a more efficient/practical way of doing this for large graphs?
I do not wish to export to a CSV file as I wish to parse the resultant text file in bash shell/sed.
Just found an APOC procedure to solve this:
call apoc.export.csv.query('MATCH (n:Entity) Return n.name,n.links_from,n.links_to', '/home/user/all_node_links.csv', {} );

Find the position of string in a list in Neo4j, Cypher

How would I find the index of a string within a list? For example:
WITH split ("what is porsche",' ')
How would I find the position of 'porsche' as 3?
First of all, the position would be 2, as we generally start from 0 in CS.
This is a one-liner:
WITH split ("what is porsche",' ') AS spl
RETURN [x IN range(0,size(spl)-1) WHERE spl[x] = "porsche"][0]
Returns 2
WITH split ("what is porsche",' ') AS spl
RETURN [x IN range(0,size(spl)-1) WHERE spl[x] = "is"][0]
Returns 1
Cypher does not have a native IndexOf-like function. But you can install APOC Procedures and use the function apoc.coll.indexOf, like this:
WITH split ("what is porsche",' ') AS list
RETURN apoc.coll.indexOf(list, 'porsche')
The result will be:
╒════════════════════════════════════╕
│"apoc.coll.indexOf(list, 'porsche')"│
╞════════════════════════════════════╡
│2 │
└────────────────────────────────────┘
Note: The result is 2 because indexes start at 0.
Note 2: Remember to install the APOC Procedures release that matches the version of Neo4j you are using. Take a look at the version compatibility matrix.
EDIT:
One alternative approach without using APOC Procedures, using size(), reduce() and range() functions with CASE expression:
WITH split ("what is porsche",' ') AS list
WITH list, range(0, size(list) - 1) AS indexes
WITH reduce(acc=-1, index IN indexes |
    CASE WHEN list[index] = 'porsche' THEN index ELSE acc END
) AS reduction
RETURN reduction
If the value is not found, -1 is returned.
As Bruno says, APOC is the right call for this but if for some reason you wanted to find the position without APOC you could go through the following rigamarole...
WITH split("what is porsche",' ') AS porsche_strings
UNWIND range(0,size(porsche_strings)-1) AS idx
WITH CASE
WHEN porsche_strings[idx] = 'porsche' THEN idx + 1
END AS position
RETURN collect(position) AS positions
Another approach for implementing this in plain Cypher:
WITH 'porsche' AS needle, 'what is porsche' AS haystack
WITH needle, split(haystack, ' ') AS words
WITH needle, [i IN range(0, size(words)-1) | [i, words[i]]] AS word
WITH [w IN word WHERE w[1] = needle] AS res
RETURN coalesce(res[0][0], -1)
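The plain-Cypher answers above all do the same thing: walk the index positions of the list and report the first one whose element equals the search string, falling back to -1. As a language-neutral illustration, here is a minimal Python sketch of that logic; index_of is a made-up helper name, not a Cypher or APOC function.

```python
def index_of(words, needle):
    """Return the 0-based index of the first occurrence of needle, or -1."""
    for i, word in enumerate(words):
        if word == needle:
            return i
    # Same "not found" convention as the reduce() and coalesce() answers.
    return -1


words = "what is porsche".split(" ")
position = index_of(words, "porsche")
```

If a 1-based position is wanted, as in the original question, add 1 to any result that is not -1.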

how to delete key from list in redis matching a pattern?

Using the ruby redis client
I have a key that contains a list of values, which follow the pattern
campaign_id|telephone|query_id
There are thousands of these in an individual list. What I want to do is delete all the ones that have, for example, a query_id of 4 from that Redis list. Can you do this through some sort of pattern matching? Could someone please give me an example? I've been reading through other questions and am a bit lost.
You basically have one of two options: a) do it in your (RoR) application or b) do it in Redis.
I'm not a RoR expert so I can't advise on the how, but note that by taking the a) path you'll basically be moving the entire list to your application and doing the filtering there. The bigger your list is, the more time it will take to cross the network.
Option b) means that you'll be filtering the list right in Redis - this can be done simply and efficiently when you use Lua. Example:
$ cat dellistbyqueryid.lua
-- removes all list elements that conform to a given
-- query_id. Elements are stored as: 'campaign_id|telephone|query_id'
-- KEYS[1] - a list
-- ARGV[1] - a query_id
-- return: number of elements removed
local l = tonumber(redis.call('LLEN', KEYS[1]))
local n = 0
while l > 0 do
    -- inspect the tail element
    local curr = redis.call('LINDEX', KEYS[1], -1)
    local id = curr:match( '.*|.*|(.*)' )
    if id == ARGV[1] then
        -- matching element: drop it
        redis.call('RPOP', KEYS[1])
        n = n + 1
    else
        -- non-matching element: rotate it to the head so order is preserved
        redis.call('RPOPLPUSH', KEYS[1], KEYS[1])
    end
    l = l - 1
end
return n
Output:
$ redis-cli LPUSH list "foo|bar|1" "baz|qaz|2" "lua|redis|1"
(integer) 3
$ redis-cli --eval dellistbyqueryid.lua list , 1
(integer) 2
$ redis-cli LRANGE list 0 -1
1) "baz|qaz|2"
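For comparison, option a) boils down to the following filtering step once the list has been fetched into the application. This is a minimal Python sketch on a plain in-memory list, assuming the entries always have the form campaign_id|telephone|query_id; remove_by_query_id is a hypothetical helper, and fetching the list from Redis and writing it back are deliberately left out.

```python
def remove_by_query_id(entries, query_id):
    """Return (kept_entries, number_removed) for the given query_id."""
    wanted = str(query_id)
    # The query_id is the part after the last '|', matching the Lua
    # pattern '.*|.*|(.*)' used in the script above.
    kept = [entry for entry in entries if entry.rsplit("|", 1)[-1] != wanted]
    return kept, len(entries) - len(kept)


# Same three elements as in the redis-cli example above.
entries = ["lua|redis|1", "baz|qaz|2", "foo|bar|1"]
kept, removed = remove_by_query_id(entries, 1)
```

The trade-off is the one described in the answer: this version has to move the whole list over the network, while the Lua script filters it inside Redis.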

cypher query BadInputException

In my graph there are approximately 196 000 C nodes, 600 000 A nodes, and 800 000 S nodes. 99% of C's are connected to a single A (with each A having anywhere from 0 - 20 Cs related), and all A's are connected to a single S.
I am running the following query
MATCH (c:C)<-[d:D]-(:A)<-[:u]-(s:S)
WITH s, d, c,
CASE WHEN c.start - 1 - 20000 < 0
THEN 0
ELSE c.start - 1 - 20000 END AS start
RETURN s.r, c.type, d.index,
substring(s.se, start, c.end-start + 1 + 20000);
It runs for around 2.5 hours, and then I get this response:
{
"message" : "The statement has been closed.",
"exception" : "BadInputException",
"fullname" : "org.neo4j.server.rest.repr.BadInputException",
"stacktrace" : [ "org.neo4j.server.rest.repr.RepresentationExceptionHandlingIterable.exceptionOnHasNext(RepresentationExceptionHandlingIterable.java:50)", "org.neo4j.helpers.collection.ExceptionHandlingIterable$1.hasNext(ExceptionHandlingIterable.java:46)", "org.neo4j.helpers.collection.IteratorWrapper.hasNext(IteratorWrapper.java:42)", "org.neo4j.server.rest.repr.ListRepresentation.serialize(ListRepresentation.java:71)", "org.neo4j.server.rest.repr.Serializer.serialize(Serializer.java:75)", "org.neo4j.server.rest.repr.MappingSerializer.putList(MappingSerializer.java:61)", "org.neo4j.server.rest.repr.CypherResultRepresentation.serialize(CypherResultRepresentation.java:83)", "org.neo4j.server.rest.repr.MappingRepresentation.serialize(MappingRepresentation.java:41)", "org.neo4j.server.rest.repr.OutputFormat.assemble(OutputFormat.java:215)", "org.neo4j.server.rest.repr.OutputFormat.formatRepresentation(OutputFormat.java:147)", "org.neo4j.server.rest.repr.OutputFormat.response(OutputFormat.java:130)", "org.neo4j.server.rest.repr.OutputFormat.ok(OutputFormat.java:67)", "org.neo4j.server.rest.web.CypherService.cypher(CypherService.java:101)", "java.lang.reflect.Method.invoke(Method.java:606)", "org.neo4j.server.rest.transactional.TransactionalRequestDispatcher.dispatch(TransactionalRequestDispatcher.java:139)", "org.neo4j.server.rest.security.SecurityFilter.doFilter(SecurityFilter.java:112)", "java.lang.Thread.run(Thread.java:745)" ],
"cause" : {
"message" : "The statement has been closed.",
"exception" : "NotInTransactionException",
I am just running this query via curl as follows
curl -g -H Accept:application/json -H Content-Type:application/json -X POST -d '{ "query":"MATCH (c:C)<-[d:D]-(:a)<-[:u]-(s:S) WITH s, d, c, CASE WHEN c.start - 1 - 20000 < 0 THEN 0 ELSE c.start - 1 - 20000 END AS start RETURN s.r, c.type, d.index, substring(s.se, start, c.end-start + 1 + 20000);", "params" : {} }' localhost:7474/db/data/cypher -o data.json
I have added "limit 3;" to the query and it does run and return expected results.
Have I not properly optimized the query? I have read about query optimization and can't see anything I could improve on, although I bet there is. I can not find much documentation on solving that exception either.
Any help would be great! Thanks
Edit: fixed typo
Edit: I reran the same query with an additional "WHERE c.prop = 'x'" to limit the initial C matching and it then returned an OutOfMemory Exception. I then did some more reading and came across this from Michael's post here. My query is now running and I think it is working. (There is a lot of data and it is downloading it to a file that is increasing in size.)
So you're trying to match a LOT of different paths, and I think you probably are doing more computation than is necessary. You might want to try this reformulation:
MATCH (c:C)
WITH c, CASE WHEN c.start - 20001 < 0
    THEN 0 ELSE c.start - 20001 END AS start
MATCH (c)<-[d:D]-(:A)<-[:u]-(s:S)
WITH c, start, s, d
RETURN s.r, c.type, d.index, substring(s.se, start, c.end - start + 20001);
My thought here is that C is the label with the fewest nodes. So start the match there and do your math computation first, then base the subsequent matches off of that. Otherwise you re-match c many extra times, depending on how many of the other nodes there are. You could further break this down based on the next-most-selective label, A, with an additional WITH clause. I think this will help.
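The arithmetic in the query (clamp the start at 0, then take the region plus 20000 characters on each side) can be sketched in Python as follows. The function name flanked_window and the flank parameter are illustrative assumptions; the formula itself is copied from the original query, and Cypher's substring(original, start, length) is mirrored with a slice.

```python
FLANK = 20000  # flank size used in the question's query


def flanked_window(sequence, c_start, c_end, flank=FLANK):
    """Mirror substring(s.se, start, c.end - start + 1 + flank) from the query."""
    # Clamp the start at 0, as the CASE expression does.
    start = max(0, c_start - 1 - flank)
    length = c_end - start + 1 + flank
    # Cypher's substring takes (start, length); a Python slice takes (start, end).
    return sequence[start:start + length]


# Tiny hypothetical example with a flank of 2 instead of 20000.
window = flanked_window("abcdefghij", 5, 6, flank=2)
```

Because this depends only on properties of c and s, it can be computed once per C node before the path is expanded, which is what the reformulated query does.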
Which Neo4j version are you using?
I think you're creating billions and billions of paths.
To look at the cardinalities:
(c:C*196k)<-[d:D*1..20]-(:A*600k)<-[:u*1..1]-(s:S*800k)
Profile your statement. I think it makes sense to have it start from a C and follow the path from there to the single A and the single S.
You can use USING SCAN c:C to force Cypher to start with a label scan of the C nodes, which should give you 196k paths.
Each of those C nodes would then be matched along its single-node path.
So try @FrobberOfBits's suggestion, along with profiling and limiting the first WITH, to see if the correct data is returned.
See: http://neo4j.com/docs/stable/query-using.html#using-hinting-a-label-scan
