Neo4j browser interface stops working or reconnecting - neo4j

The problem is the same no matter if I am using Safari or Chrome.
After running several times the same query shown below, I am getting the error: Disconnected from Neo4j. Please check if the cord is unplugged.
I am able to SSH to the server and run the query from the shell.
This query was the subject of another issue open earlier and the someone optimize it to the form is presented below. So is not a mater of a not optimized query, seems to be something about the browser interface.
What is wrong here?
MATCH (p:Publisher)-[r:PUBLISHED]->(w:Woka)<-[s:AUTHORED]-(a:Author)
MATCH (l:Language)-[t:USED]->(w)-[u:INCLUDED]->(b:Bisac)
WHERE (a.author_name = 'Camus, Albert')
WITH p,r,w,s,a,l,t,u,b
OPTIONAL MATCH (d:Description)-[v:HAS_DESCRIPTION]-(w)
RETURN w, p, a, l, b, d, r, s, t, u, v;
More details: when the browser dies in one computer, dies also in the second computer trying to connect to same database.
Also other commands i.e.
$ rails console
or
$ rails s -d
to start the rails server no longer works.
If I am restarting the Neo4j db server all are working for a little bit and frozen after that.
Below is the execution plan of the query:
neo4j-sh (?)$ EXPLAIN MATCH (p:Publisher)-[r:PUBLISHED]->(w:Woka)<-[s:AUTHORED]-(a:Author{author_name: 'Camus, Albert'}), (l:Language)-[t:USED]->(w)-[u:INCLUDED]->(b:Bisac)
WITH p,r,w,s,a,l,t,u,b
OPTIONAL MATCH (d:Description)-[v:HAS_DESCRIPTION]-(w)
RETURN w, p, a, l, b, d, r, s, t, u, v;
+--------------------------------------------+
| No data returned, and nothing was changed. |
+--------------------------------------------+
73 ms
Compiler CYPHER 2.2
Planner COST
OptionalExpand(All)
|
+Filter(0)
|
+Expand(All)(0)
|
+Filter(1)
|
+Expand(All)(1)
|
+Filter(2)
|
+Expand(All)(2)
|
+Filter(3)
|
+Expand(All)(3)
|
+NodeUniqueIndexSeek
+---------------------+---------------+---------------------------------+-----------------------------+
| Operator | EstimatedRows | Identifiers | Other |
+---------------------+---------------+---------------------------------+-----------------------------+
| OptionalExpand(All) | 5 | a, b, d, l, p, r, s, t, u, v, w | (w)-[v:HAS_DESCRIPTION]-(d) |
| Filter(0) | 5 | a, b, l, p, r, s, t, u, w | b:Bisac |
| Expand(All)(0) | 5 | a, b, l, p, r, s, t, u, w | (w)-[u:INCLUDED]->(b) |
| Filter(1) | 4 | a, l, p, r, s, t, w | l:Language |
| Expand(All)(1) | 4 | a, l, p, r, s, t, w | (w)<-[t:USED]-(l) |
| Filter(2) | 4 | a, p, r, s, w | p:Publisher |
| Expand(All)(2) | 4 | a, p, r, s, w | (w)<-[r:PUBLISHED]-(p) |
| Filter(3) | 4 | a, s, w | w:Woka |
| Expand(All)(3) | 4 | a, s, w | (a)-[s:AUTHORED]->(w) |
| NodeUniqueIndexSeek | 1 | a | :Author(author_name) |
+---------------------+---------------+---------------------------------+-----------------------------+
Total database accesses: ?
neo4j-sh (?)$
Here is a snapshot from top (before having the browser frozen):
top - 14:59:36 up 46 days, 17:03, 2 users, load average: 2.66, 4.58, 3.75
Tasks: 116 total, 2 running, 114 sleeping, 0 stopped, 0 zombie
%Cpu(s): 97.5 us, 0.8 sy, 0.0 ni, 1.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.2 st
KiB Mem: 15666128 total, 3858028 used, 11808100 free, 169612 buffers
KiB Swap: 0 total, 0 used, 0 free. 2144784 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10260 neo4j 20 0 14.348g 1.388g 195316 S 196.9 9.3 1:57.55 java
9879 ubuntu 20 0 23680 1656 1116 R 0.3 0.0 0:00.88 top
1 root 20 0 33508 2236 860 S 0.0 0.0 0:12.25 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.55 ksoftirqd/0
4 root 20 0 0 0 0 S 0.0 0.0 0:30.10 kworker/0:0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root 20 0 0 0 0 S 0.0 0.0 0:39.08 rcu_sched
8 root 20 0 0 0 0 R 0.0 0.0 0:47.50 rcuos/0
9 root 20 0 0 0 0 S 0.0 0.0 1:00.72 rcuos/1

What is the spec of your computer?
How much data does your query return?
MATCH (p:Publisher)-[r:PUBLISHED]->(w:Woka)<-[s:AUTHORED]-(a:Author)
WHERE (a.author_name = 'Camus, Albert')
MATCH (l:Language)-[t:USED]->(w)-[u:INCLUDED]->(b:Bisac)
WITH p,r,w,s,a,l,t,u,b
OPTIONAL MATCH (d:Description)-[v:HAS_DESCRIPTION]-(w)
RETURN w, p, a, l, b, d, r, s, t, u, v;
Also what does your visual query plan look like? Please prefix your query with PROFILE save as png and share it.

The browser interface of this product Neo4j needs a major overhaul. There is no way to use this interface for serious design, modelling and development.
I executed the following stress tests a from Ruby on Rails console. No errors about disconnect, network etc. All run successfully while any of these queries frozen the browser after 5, 6, 7 executions and even if the result set is limited to 25 records. More than that, I executed all of them while the browser interface was still frozen showing that network disconnect error.
(1..1000).each do |n|
q = "MATCH (p:Publisher)-[r:PUBLISHED]->(w:Woka)<-[s:AUTHORED]-(a:Author)
WHERE (a.author_name = 'Freud, Sigmund')
MATCH (l:Language)-[t:USED]->(w)-[u:INCLUDED]->(b:Bisac)
WITH p,r,w,s,a,l,t,u,b
OPTIONAL MATCH (d:Description)-[v:HAS_DESCRIPTION]-(w)
RETURN w, p, a, l, b, d, r, s, t, u, v;"
r = Neo4j::Session.current.query(q)
print n, "\t", r.count, "\t", Time.now, "\n"
end
(1..1000).each do |n|
q = "MATCH (p:Publisher)-[r:PUBLISHED]->(w:Woka)<-[s:AUTHORED]-(a:Author)
WHERE (a.author_name = 'Einstein, Albert')
MATCH (l:Language)-[t:USED]->(w)-[u:INCLUDED]->(b:Bisac)
WITH p,r,w,s,a,l,t,u,b
OPTIONAL MATCH (d:Description)-[v:HAS_DESCRIPTION]-(w)
RETURN w, p, a, l, b, d, r, s, t, u, v;"
r = Neo4j::Session.current.query(q)
print n, "\t", r.count, "\t", Time.now, "\n"
end
(1..1000).each do |n|
q = "MATCH (p:Publisher)-[r:PUBLISHED]->(w:Woka)<-[s:AUTHORED]-(a:Author)
WHERE (a.author_name = 'Freud, Sigmund')
MATCH (l:Language)-[t:USED]->(w)-[u:INCLUDED]->(b:Bisac)
WITH p,r,w,s,a,l,t,u,b
OPTIONAL MATCH (d:Description)-[v:HAS_DESCRIPTION]-(w)
RETURN w, p, a, l, b, d, r, s, t, u, v;"
r = Neo4j::Session.current.query(q)
print n, "\t", r.count, "\t", Time.now, "\n"
end

Related

How to optimize the following neo4j Cypher query

I am new to cypher and have the below query to find mistmaches between 2 source types(for example). I believe syntactically the query looks fine but it takes 1 minute to run on data set of just 1,00,000 nodes. I am not using relations still. Can someone please help in optimizing the query? Thanks.
MATCH (VW_OXSS41:VW_OrderXStatusSummary4{SourceTypeID: "1"})
WHERE apoc.date.parse(VW_OXSS41.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss'))>=apoc.date.parse("2020-02-10",'s',('yyyy-MM-dd')) AND apoc.date.parse(VW_OXSS41.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss'))<=apoc.date.parse("2020-02-16",'s',('yyyy-MM-dd'))
WITH VW_OXSS41.IdentifierValue as X
MATCH (VW_OXSS42:VW_OrderXStatusSummary4{SourceTypeID: "2"})
WHERE apoc.date.parse(VW_OXSS42.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss'))>=apoc.date.parse("2020-02-10",'s',('yyyy-MM-dd')) AND apoc.date.parse(VW_OXSS42.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss'))<=apoc.date.parse("2020-02-16",'s',('yyyy-MM-dd'))
WITH apoc.coll.disjunction(COLLECT(X), COLLECT(VW_OXSS42.IdentifierValue)) as XX
UNWIND (XX) as YY
The updated query and the error:-
WITH apoc.date.parse("2020-02-20",'s',('yyyy-MM-dd')) AS a, apoc.date.parse("2020-02-25",'s',('yyyy-MM-dd')) AS b
MATCH (x:VW_OrderXStatusSummary4 {SourceTypeID: "2"})
WHERE a <= apoc.date.parse(x.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss')) <= b
WITH a, b, COLLECT(x.IdentifierValue) AS X
MATCH (y:VW_OrderXStatusSummary4 {SourceTypeID: "1"})
WHERE a <= apoc.date.parse(y.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss')) <= b
WITH X, COLLECT(y.IdentifierValue) AS Y
UNWIND apoc.coll.subtract(X,Y) AS XX
MATCH (z:VW_OrderXStatusSummary4 {SourceTypeID: "2"})
WHERE a <= apoc.date.parse(z.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss')) <= b
RETURN XX AS MISMATCHES,MAX(z.TimeStamp);
Variable `a` not defined (line 10, column 7 (offset: 551))
"WHERE a <= apoc.date.parse(z.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss')) <= b"
Solved the above error like this:-
WITH apoc.date.parse("2020-02-21",'s',('yyyy-MM-dd')) AS a, apoc.date.parse("2020-02-25",'s',('yyyy-MM-dd')) AS b
MATCH (x:VW_OrderXStatusSummary4 {SourceTypeID: "2"})
WHERE a <= apoc.date.parse(x.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss')) <= b
WITH a, b, COLLECT(x.IdentifierValue) AS X
MATCH (y:VW_OrderXStatusSummary4 {SourceTypeID: "1"})
WHERE a <= apoc.date.parse(y.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss')) <= b
WITH X, COLLECT(y.IdentifierValue) AS Y
UNWIND apoc.coll.subtract(X,Y) AS XX
WITH XX, apoc.date.parse("2020-02-20",'s',('yyyy-MM-dd')) AS a, apoc.date.parse("2020-02-25",'s',('yyyy-MM-dd')) AS b
MATCH (z:VW_OrderXStatusSummary4 {SourceTypeID: "2"})
WHERE a <= apoc.date.parse(z.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss')) <= b
AND XX = z.IdentifierValue
RETURN XX AS MISMATCHES,MAX(z.TimeStamp);
With the correct expected output as:-
+---------------------------------------------+
| MISMATCHES | TIMESTAMP |
+---------------------------------------------+
| "W2002201453550218" | "2020-02-21 12:00:16" |
| "W2002201453550222" | "2020-02-21 12:00:16" |
| "W2002201453550223" | "2020-02-21 09:30:36" |
| "W2002201453550224" | "2020-02-21 12:00:16" |
| "W2002201453550226" | "2020-02-21 12:00:16" |
| "W2002201453550227" | "2020-02-21 12:00:16" |
| "W2002201453550237" | "2020-02-21 12:00:16" |
| "3011WOS002978598" | "2020-02-21 10:00:54" |
| "3011WOS002978595" | "2020-02-21 13:00:57" |
| "0010000000006183" | "2020-02-21 16:00:41" |
| "W2002181111547439" | "2020-02-21 04:00:34" |
| "11" | "2020-02-21 16:00:41" |
| "10112787861P1458" | "2020-02-21 10:00:54" |
+---------------------------------------------+
Wondering if there's a better approach?
You need to avoid making a cartesian product between the results of your two MATCH clauses. Let's say the two MATCH clauses would normally return N and M nodes, respectively, when executed in their own queries. Because your query combines those two MATCH clauses in the way that it does, your second MATCH clause is actually performing N*M matches (and producing N*M result rows).
You need to make sure you have created an index on :VW_OrderXStatusSummary4(SourceTypeID). That will optimize the lookups performed by the MATCH clauses.
You can simplify your Cypher code to avoid duplicated function calls.
After creating the index indicated above, try this:
WITH apoc.date.parse("2020-02-10",'s',('yyyy-MM-dd')) AS a, apoc.date.parse("2020-02-16",'s',('yyyy-MM-dd')) AS b
MATCH (x:VW_OrderXStatusSummary4 {SourceTypeID: "1"})
WHERE a <= apoc.date.parse(x.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss')) <= b
WITH a, b, COLLECT(x.IdentifierValue) AS X
MATCH (y:VW_OrderXStatusSummary4 {SourceTypeID: "2"})
WHERE a <= apoc.date.parse(y.TimeStamp,'s',('yyyy-MM-dd HH:mm:ss')) <= b
WITH X, COLLECT(y.IdentifierValue) AS Y
UNWIND apoc.coll.disjunction(X, Y) AS YY
...
Performing the COLLECT(x.IdentifierValue) operation in the first WITH clause causes it to return all the x nodes in a single result row (instead of N result rows). This allows the second MATCH to avoid a cartesian product issue.

Make a path where next node is not the previous node?

I have ~1.5 M nodes in a graph, that are structured like this (picture)
I run a Cypher query that performs calculations on each relationship traversed:
WITH 1 AS startVal
MATCH x = (c:Currency)-[r:Arb*2]->(m)
WITH x, REDUCE(s = startVal, e IN r | s * e.rate) AS endVal, startVal
RETURN EXTRACT(n IN NODES(x) | n) as Exchanges,
extract ( e IN relationships(x) | startVal * e.rate) AS Rel,
endVal, endVal - startVal AS Profit
ORDER BY Profit DESC LIMIT 5
The problem is it returns the path ("One")->("hop")->("One"), which is useless for me.
How can I make it not choose the previously walked node as the next node (i.e. "One"->"hop"->"any_other_node_but_not_"one")?
I have read that NODE_RECENT should address my issue. However, there was no example on how to specify the length of recent nodes in RestAPI or APOC procedures.
Is there a Cypher query for my case?
Thank you.
P.S. I am extremely new (less than 2 month) to Neo4j and coding. So my apologies if there is an obvious simple solution.
I don't know if I understood your question completely, but I believe that you problem can be solved putting a WHERE clause on the MATCH to prevent the not desired relationship be matched, like this:
WITH 1 AS startVal
MATCH x = (c:Currency)-[r:Arb*2]->(m)
WHERE NOT (m)-[:Arb]->(c)
WITH x, REDUCE(s = startVal, e IN r | s * e.rate) AS endVal, startVal
RETURN EXTRACT(n IN NODES(x) | n) as Exchanges,
extract ( e IN relationships(x) | startVal * e.rate) AS Rel,
endVal, endVal - startVal AS Profit
ORDER BY Profit DESC LIMIT 5
Try inserting this clause after your MATCH clause, to filter out cases where c and m are the same:
WHERE c <> m
[EDITED]
That is:
WITH 1 AS startVal
MATCH x = (c:Currency)-[r:Arb*2]->(m)
WHERE c <> m
WITH x, REDUCE(s = startVal, e IN r | s * e.rate) AS endVal, startVal
RETURN EXTRACT(n IN NODES(x) | n) as Exchanges,
extract ( e IN relationships(x) | startVal * e.rate) AS Rel,
endVal, endVal - startVal AS Profit
ORDER BY Profit DESC LIMIT 5;
After using this query to create test data:
CREATE
(c:Currency {name: 'One'})-[:Arb {rate:1}]->(h:Account {name: 'hop'})-[:Arb {rate:2}]->(t:Currency {name: 'Two'}),
(t)-[:Arb {rate:3}]->(h)-[:Arb {rate:4}]->(c)
the above query produces these results:
+-----------------------------------------------------------------------------------------+
| Exchanges | Rel | endVal | Profit |
+-----------------------------------------------------------------------------------------+
| [Node[8]{name:"Two"},Node[7]{name:"hop"},Node[6]{name:"One"}] | [3,4] | 12 | 11 |
| [Node[6]{name:"One"},Node[7]{name:"hop"},Node[8]{name:"Two"}] | [1,2] | 2 | 1 |
+-----------------------------------------------------------------------------------------+

How to use average function in neo4j with collection

I want to calculate covariance of two vectors as collection
A=[1, 2, 3, 4]
B=[5, 6, 7, 8]
Cov(A,B)= Sigma[(ai-AVGa)*(bi-AVGb)] / (n-1)
My problem for covariance computation is:
1) I can not have a nested aggregate function
when I write
SUM((ai-avg(a)) * (bi-avg(b)))
2) Or in another shape, how can I extract two collection with one reduce such as:
REDUCE(x= 0.0, ai IN COLLECT(a) | bi IN COLLECT(b) | x + (ai-avg(a))*(bi-avg(b)))
3) if it is not possible to extract two collection in oe reduce how it is possible to relate their value to calculate covariance when they are separated
REDUCE(x= 0.0, ai IN COLLECT(a) | x + (ai-avg(a)))
REDUCE(y= 0.0, bi IN COLLECT(b) | y + (bi-avg(b)))
I mean that can I write nested reduce?
4) Is there any ways with "unwind", "extract"
Thank you in advanced for any help.
cybersam's answer is totally fine but if you want to avoid the n^2 Cartesian product that results from the double UNWIND you can do this instead:
WITH [1,2,3,4] AS a, [5,6,7,8] AS b
WITH REDUCE(s = 0.0, x IN a | s + x) / SIZE(a) AS e_a,
REDUCE(s = 0.0, x IN b | s + x) / SIZE(b) AS e_b,
SIZE(a) AS n, a, b
RETURN REDUCE(s = 0.0, i IN RANGE(0, n - 1) | s + ((a[i] - e_a) * (b[i] - e_b))) / (n - 1) AS cov;
Edit:
Not calling anyone out, but let me elaborate more on why you would want to avoid the double UNWIND in https://stackoverflow.com/a/34423783/2848578. Like I said below, UNWINDing k length-n collections in Cypher results in n^k rows. So let's take two length-3 collections over which you want to calculate the covariance.
> WITH [1,2,3] AS a, [4,5,6] AS b
UNWIND a AS aa
UNWIND b AS bb
RETURN aa, bb;
| aa | bb
---+----+----
1 | 1 | 4
2 | 1 | 5
3 | 1 | 6
4 | 2 | 4
5 | 2 | 5
6 | 2 | 6
7 | 3 | 4
8 | 3 | 5
9 | 3 | 6
Now we have n^k = 3^2 = 9 rows. At this point, taking the average of these identifiers means we're taking the average of 9 values.
> WITH [1,2,3] AS a, [4,5,6] AS b
UNWIND a AS aa
UNWIND b AS bb
RETURN AVG(aa), AVG(bb);
| AVG(aa) | AVG(bb)
---+---------+---------
1 | 2.0 | 5.0
Also as I said below, this doesn't affect the answer because the average of a repeating vector of numbers will always be the same. For example, the average of {1,2,3} is equal to the average of {1,2,3,1,2,3}. It is likely inconsequential for small values of n, but when you start getting larger values of n you'll start seeing a performance decrease.
Let's say you have two length-1000 vectors. Calculating the average of each with a double UNWIND:
> WITH RANGE(0, 1000) AS a, RANGE(1000, 2000) AS b
UNWIND a AS aa
UNWIND b AS bb
RETURN AVG(aa), AVG(bb);
| AVG(aa) | AVG(bb)
---+---------+---------
1 | 500.0 | 1500.0
714 ms
Is significantly slower than using REDUCE:
> WITH RANGE(0, 1000) AS a, RANGE(1000, 2000) AS b
RETURN REDUCE(s = 0.0, x IN a | s + x) / SIZE(a) AS e_a,
REDUCE(s = 0.0, x IN b | s + x) / SIZE(b) AS e_b;
| e_a | e_b
---+-------+--------
1 | 500.0 | 1500.0
4 ms
To bring it all together, I'll compare the two queries in full on length-1000 vectors:
> WITH RANGE(0, 1000) AS aa, RANGE(1000, 2000) AS bb
UNWIND aa AS a
UNWIND bb AS b
WITH aa, bb, SIZE(aa) AS n, AVG(a) AS avgA, AVG(b) AS avgB
RETURN REDUCE(s = 0, i IN RANGE(0,n-1)| s +((aa[i]-avgA)*(bb[i]-avgB)))/(n-1) AS
covariance;
| covariance
---+------------
1 | 83583.5
9105 ms
> WITH RANGE(0, 1000) AS a, RANGE(1000, 2000) AS b
WITH REDUCE(s = 0.0, x IN a | s + x) / SIZE(a) AS e_a,
REDUCE(s = 0.0, x IN b | s + x) / SIZE(b) AS e_b,
SIZE(a) AS n, a, b
RETURN REDUCE(s = 0.0, i IN RANGE(0, n - 1) | s + ((a[i] - e_a) * (b[i
] - e_b))) / (n - 1) AS cov;
| cov
---+---------
1 | 83583.5
33 ms
[EDITED]
This should calculate the covariance (according to your formula), given your sample inputs:
WITH [1,2,3,4] AS aa, [5,6,7,8] AS bb
UNWIND aa AS a
UNWIND bb AS b
WITH aa, bb, SIZE(aa) AS n, AVG(a) AS avgA, AVG(b) AS avgB
RETURN REDUCE(s = 0, i IN RANGE(0,n-1)| s +((aa[i]-avgA)*(bb[i]-avgB)))/(n-1) AS covariance;
This approach is OK when n is small, as is the case with the original sample data.
However, as #NicoleWhite and #jjaderberg point out, when n is not small, this approach will be inefficient. The answer by #NicoleWhite is an elegant general solution.
How do you arrive at collections A and B? The avg function is an aggregating function and cannot be used in the REDUCE context, nor can it be applied to collections. You should calculate your average before you get to that point, but exactly how to do that best depends on how you arrive at the two collections of values. If you are at a point where you have individual result items that you then collect to get A and B, that's the point when you could use avg. For example:
WITH [1, 2, 3, 4] AS aa UNWIND aa AS a
WITH collect(a) AS aa, avg(a) AS aAvg
RETURN aa, aAvg
and for both collections
WITH [1, 2, 3, 4] AS aColl UNWIND aColl AS a
WITH collect(a) AS aColl, avg(a) AS aAvg
WITH aColl, aAvg,[5, 6, 7, 8] AS bColl UNWIND bColl AS b
WITH aColl, aAvg, collect(b) AS bColl, avg(b) AS bAvg
RETURN aColl, aAvg, bColl, bAvg
Once you have the two averages, let's call them aAvg and bAvg, and the two collections, aColl and bColl, you can do
RETURN REDUCE(x = 0.0, i IN range(0, size(aColl) - 1) | x + ((aColl[i] - aAvg) * (bColl[i] - bAvg))) / (size(aColl) - 1) AS covariance
Thank you so much Dears, however I wonder which one is most efficient
1) Nested unwind and range inside reduce -> #cybersam
2) nested Reduce -> #Nicole White
3) Nested With (reset query by with) -> #jjaderberg
BUT Important Issue is :
Why there is an error and difference between your computations and real and actual computations.
I mean your covariance equals to = 1.6666666666666667
But in real world covariance equals to = 1.25
please check: https://www.easycalculation.com/statistics/covariance.php
Vector X: [1, 2, 3, 4]
Vector Y: [5, 6, 7, 8]
I think this differences is because that some computation do not consider (n-1) as divisor and instead of (n-1) , just they use n. Therefore when we grow divisor from n-1 to n the result will be diminished from 1.6 to 1.25.

Neo4j Divide ( / ) by Zero ( 0 )

In neo4j I am querying
MATCH (n)-[t:x{x:"1a"}]->()
WHERE n.a > 1 OR n.b > 1 AND toFloat(n.a) / (n.a+n.b) * 100 < 90
RETURN DISTINCT n, toFloat(n.a) / (n.a + n.b) * 100
ORDER BY toFloat(n.a) / (n.a + n.b) * 100 DESC
LIMIT 10
but I got / by zero error.
Since I declared one of n.a or n.b should be 1, if both zero it should skip that row and I shouldn't get this error. This looks like a logic issue in Neo4j. There is no problem when I delete AND toFloat(n.a)/(n.a+n.b)*100 < 90 from WHERE clause. But I want the results only lower than 90. How can I overcome this?
Can either of n.a or n.b be negative? I was able to reproduce this with:
WITH -2 AS na, 2 AS nb
WHERE (na > 1 OR nb > 1) AND toFloat(na)/(na+nb)*100 < 90
RETURN na, nb
And I get: / by zero
Perhaps try changing your WHERE clause to:
WITH -2 AS na, 2 AS nb
WHERE (na + nb > 0) AND toFloat(na)/(na+nb)*100 < 90
RETURN na, nb
And I get: zero rows.
It seems the second condition, toFloat(na) / (na + nb) * 100 < 90, is tested before the first. Look at the Filter(1) operator in this execution plan:
+--------------+---------------+------+--------+--------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
+--------------+---------------+------+--------+--------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection | 1 | 3 | 0 | anon[111], anon[138], n, toFloat(n.a)/(n.a + n.b)* 100 | anon[111]; anon[138] |
| Top | 1 | 3 | 0 | anon[111], anon[138] | { AUTOINT6}; |
| Distinct | 0 | 3 | 24 | anon[111], anon[138] | anon[111], anon[138] |
| Filter(0) | 0 | 3 | 6 | anon[29], n, t | t.x == { AUTOSTRING0} |
| Expand(All) | 1 | 3 | 6 | anon[29], n, t | ( n#7)-[t:x]->() |
| Filter(1) | 1 | 3 | 34 | n | (Ors(List(n#7.a > { AUTOINT1}, Multiply(Divide(ToFloatFunction( n#7.a),Add( n#7.a, n#7.b)),{ AUTOINT3}) < { AUTOINT4})) AND Ors(List( n#7.a > { AUTOINT1}, n.b > { AUTOINT2}))) |
| AllNodesScan | 4 | 4 | 5 | n | |
+--------------+---------------+------+--------+--------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
You can get around this by force breaking the filter into two clauses.
MATCH (n)-[t:x { x:"1a" }]->()
WHERE n.a > 1 OR n.b > 1
WITH n
WHERE toFloat(n.a) / (n.a + n.b) * 100 < 90
RETURN DISTINCT n, toFloat(n.a) / (n.a + n.b) * 100
ORDER BY toFloat(n.a) / (n.a + n.b) * 100 DESC
LIMIT 10
I found this behavior surprising, but as I think about it I suppose it isn't wrong for the execution engine to rearrange the filter in this way. There may be the assumption that the condition will abandon early on failing the first declared condition, but Cypher is exactly that: declarative. So we express the "what", not the "how", and in terms of the "what" A and B is equivalent to B and A.
Here is the query and a sample graph, you can check if it translates to your actual data:
http://console.neo4j.org/r/f6kxi5

Need documentation for GPUImageColorMatrixFilter. Where can I find documentation for what the matrix means?

I would like to implement a bandpass filter using GPUImageColorMatrixFilter. Basically, blue would equal floor(blue - (k*red)) and both red and green would just end up being zero. Where can I find documentation indicating what the columns and rows of the matrix mean?
My intuition would suggest that the 4x4 matrix is following the standard RGBA order and judging by the examples (see for instance GPUImageSepiaFilter) it looks like I'm right.
For instance, this is the identity GPUMatrix4x4
R G B A
| 1 0 0 0 | red
| 0 1 0 0 | green
| 0 0 1 0 | blue
| 0 0 0 1 | alpha
Let's name each coefficient
R G B A
| a b c d | red
| e f g h | green
| i j k l | blue
| m n o p | alpha
Applying the matrix to a RGBA color will result in the following R'G'B'A' color where the components are computed as
R' = a*R + b*G + c*B + d*A
G' = e*R + f*G + g*B + h*A
B' = i*R + j*G + k*B + l*A
A' = m*R + n*G + o*B + p*A
which is nothing but the following matrix multiplication
| a b c d | |R| |R'|
| e f g h | x |G| = |G'|
| i j k l | |B| |B'|
| m n o p | |A| |A'|

Resources