What am I doing wrong in writing this Snowflake function with variables? - stored-procedures

Another user was helping me with this problem, but I'm having trouble executing it. I'm getting this error:
syntax error line 3 at position 8 unexpected 'time'. syntax error line 3 at position 23 unexpected ':'. (line 3)
It seems I'm declaring the variables wrong. I'm sure of that, because when I comment out "time", I get an error on "curDay" instead.
Here is the function I'm trying to execute:
CREATE OR REPLACE FUNCTION DB_BI_DEV.RAW_CPMS_AAR.cfn_GetShiftIDFromDateTime (dateTime TIMESTAMP_NTZ(9), shiftCalendarID int)
RETURNS table (shiftID int)
AS
$$
DECLARE
time time TIME(:dateTime);
curDay int;
prvDay int;
shiftID int;
BEGIN
SELECT TOP 1
ID,
DATEDIFF( day, BeginDate, :dateTime ) % PeriodInDays + 1,
( :curDay + PeriodInDays - 2 ) % PeriodInDays + 1
INTO :shiftCalendarID, :curDay, :prvDay
FROM RAW_CPMS_AAR.ShiftCalendar
WHERE ID = :shiftCalendarID
OR ( :shiftCalendarID IS NULL
AND Name = 'Factory'
AND BeginDate <= :dateTime )
ORDER BY BeginDate DESC;
SELECT ID into :shiftID
FROM Shift
WHERE ShiftCalendarID = #shiftCalendarID
AND ( ( FromDay = :curDay AND FromTimeOfDay <= :time AND TillTimeOfDay > :time )
OR ( FromDay = :curDay AND FromTimeOfDay >= TillTimeOfDay AND FromTimeOfDay <= :time )
OR ( FromDay = :prvDay AND FromTimeOfDay >= TillTimeOfDay AND TillTimeOfDay > :time )
);
END;
$$

This may need some small changes, but should be close to what you require:
CREATE OR REPLACE FUNCTION DB_BI_DEV.RAW_CPMS_AAR.cfn_GetShiftIDFromDateTime (dateTime TIMESTAMP_NTZ(9), shiftCalendarID int)
RETURNS table (shiftID int)
AS
$$
WITH T0 (ShiftCalendarID, CurDay, PrvDay)
AS (
SELECT TOP 1
ID AS ShiftCalendarID,
DATEDIFF( day, BeginDate, :dateTime ) % PeriodInDays + 1 AS CurDay,
( CurDay + PeriodInDays - 2 ) % PeriodInDays + 1 AS PrvDay
FROM RAW_CPMS_AAR.ShiftCalendar
WHERE ID = :shiftCalendarID
OR ( :shiftCalendarID IS NULL
AND Name = 'Factory'
AND BeginDate <= :dateTime )
ORDER BY BeginDate DESC
),
T1 (TimeValue)
AS (
SELECT TIME_FROM_PARTS(
EXTRACT(HOUR FROM :datetime),
EXTRACT(MINUTE FROM :datetime),
EXTRACT(SECOND FROM :datetime))
)
SELECT ID as shiftID
FROM Shift, T0, T1
WHERE Shift.ShiftCalendarID = T0.ShiftCalendarID
AND ( ( FromDay = T0.CurDay AND FromTimeOfDay <= T1.TimeValue AND TillTimeOfDay > T1.TimeValue )
OR ( FromDay = T0.CurDay AND FromTimeOfDay >= TillTimeOfDay AND FromTimeOfDay <= T1.TimeValue )
OR ( FromDay = T0.PrvDay AND FromTimeOfDay >= TillTimeOfDay AND TillTimeOfDay > T1.TimeValue )
);
$$

To add on to Dave Weldon's solution: in UDF bodies, parameters are referenced without a leading colon. Instead of :shiftCalendarID write shiftCalendarID, and instead of :dateTime write dateTime. The colon prefix is needed in Snowflake Scripting stored procedures, where it marks a variable or argument that is bound into a SQL statement, but this is not the case with User Defined Functions.

Related

Processing aborted due to error Snowflake

I've encountered this error:
Execution error in stored procedure: SQL execution internal error: Processing aborted due to error at Snowflake.execute
when running this script:
CREATE OR REPLACE PROCEDURE DATES_TABLE (INITIALDATE VARCHAR, FINALDATE VARCHAR)
RETURNS VARCHAR
LANGUAGE JAVASCRIPT
EXECUTE AS CALLER
AS
$$
var DATESDIFF = (Date.parse(formatDate(FINALDATE)) - Date.parse(formatDate(INITIALDATE)))/ (1000 * 3600 * 24);
snowflake.execute(
{
sqlText: ` CREATE OR REPLACE TEMPORARY TABLE TEMP_DATE_RANGE AS SELECT DATE FROM (
SELECT
CAST(DATEADD (DAY, DatesDiff.n, :1) AS DATE) AS DATE
FROM (
SELECT
ROW_NUMBER() OVER (ORDER BY 1) - 1
FROM
TABLE (generator (rowcount => :3))) DatesDiff (n)
); `,
binds: [formatDate(INITIALDATE), formatDate(FINALDATE), DATESDIFF]
}
);
function formatDate(date) {
var d = new Date(date),
month = '' + (d.getMonth() + 1),
day = '' + d.getDate(),
year = d.getFullYear();
if (month.length < 2)
month = '0' + month;
if (day.length < 2)
day = '0' + day;
return [year, month, day].join('-');
}
$$
;
CALL DATES_TABLE('2021-04-01','2021-05-24');
When run outside of the stored procedure, the same statement creates a table with the dates in the given range.
Any idea why this is happening and how to fix it?
The problem is in binding a variable to TABLE (generator (rowcount => :3)), as Snowflake expects a constant there.
Instead, you could do something like:
SELECT ROW_NUMBER() OVER (ORDER BY 1) - 1 AS rn
FROM TABLE (generator (rowcount => 1000))
QUALIFY rn < :2
I did some cleanup, and this works:
CREATE OR REPLACE PROCEDURE DATES_TABLE (INITIALDATE VARCHAR, FINALDATE VARCHAR)
RETURNS VARCHAR
LANGUAGE JAVASCRIPT
EXECUTE AS CALLER
AS
$$
var DATESDIFF = (Date.parse(formatDate(FINALDATE)) - Date.parse(formatDate(INITIALDATE)))/ (1000 * 3600 * 24);
snowflake.execute(
{
sqlText: `
CREATE OR REPLACE TEMPORARY TABLE TEMP_DATE_RANGE AS
SELECT CAST(DATEADD(DAY, DatesDiff.rn, :1) AS DATE) AS DATE
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY 1) - 1 AS rn
FROM TABLE (generator (rowcount => 1000))
QUALIFY rn < :2
)
;`
, binds: [formatDate(INITIALDATE), DATESDIFF]
}
);
function formatDate(date) {
var d = new Date(date),
month = '' + (d.getMonth() + 1),
day = '' + d.getDate(),
year = d.getFullYear();
if (month.length < 2)
month = '0' + month;
if (day.length < 2)
day = '0' + day;
return [year, month, day].join('-');
}
$$
;
CALL DATES_TABLE('2021-04-01','2021-05-24');
select * from TEMP_DATE_RANGE;
For a shorter way of generating a sequence of dates, see my answer to https://stackoverflow.com/a/66449068/132438.
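To make the "fixed row count plus filter" idea concrete, here is a minimal sketch in plain JavaScript that mirrors what the SQL above does. The helper name dateRange and the MAX_ROWS constant are my own assumptions for illustration, not part of the Snowflake API, and the sketch treats the inputs as UTC dates.

```javascript
// Hypothetical helper, not part of the Snowflake API: mirrors the SQL above in plain JavaScript.
const MS_PER_DAY = 1000 * 3600 * 24;
const MAX_ROWS = 1000; // plays the role of generator(rowcount => 1000)

function dateRange(initialDate, finalDate) {
  // Date-only ISO strings ('YYYY-MM-DD') are parsed as UTC midnight.
  const start = Date.parse(initialDate);
  const end = Date.parse(finalDate);
  const datesDiff = (end - start) / MS_PER_DAY;
  if (datesDiff > MAX_ROWS) {
    throw new RangeError(`range of ${datesDiff} days exceeds ${MAX_ROWS} rows`);
  }
  const dates = [];
  for (let rn = 0; rn < MAX_ROWS; rn++) {   // fixed-size sequence of row numbers
    if (rn < datesDiff) {                   // same filter as QUALIFY rn < :2
      dates.push(new Date(start + rn * MS_PER_DAY).toISOString().slice(0, 10));
    }
  }
  return dates;
}

const range = dateRange('2021-04-01', '2021-05-24');
console.log(range.length, range[0], range[range.length - 1]);
// prints: 53 2021-04-01 2021-05-23
```

Note that with a strict "less than" filter the final date itself is not part of the result. If it should be included, the comparison would need to be "less than or equal". The constant also caps the longest range that can be produced, so it has to be chosen large enough for the expected inputs.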

Store the result of sql and process it in informix

We have a view which contains 2 columns: pattern_start_time, pattern_end_time.
The select query in the function converts these to minutes, and we use that result to compute the unused shift coverage. The function is created successfully, but the processing does not happen and we get the error below:
SQLError[IX000]:Routine (my_list) cant be resolved.
Also, please suggest how to loop up to the length of the result.
CREATE function myshifttesting(orgid int) returning int;
DEFINE my_list LIST( INTEGER not null );
DEFINE my_list1 LIST( INTEGER not null );
define i, j, sub, sub1 int;
define total int;
TRACE ON;
TRACE 'my testing starts';
INSERT INTO TABLE( my_list )
select
((extend(current, year to second) + (dots.v_shift_coverage.pattern_start_time - datetime(00:00) hour to minute) - current)::interval minute(9) to minute)::char(10)::INTEGER
from
dots.v_shift_coverage
where
org_guid = orgid;
INSERT INTO TABLE( my_list1 )
select
((extend(current, year to second) + (dots.v_shift_coverage.pattern_end_time - datetime(00:00) hour to minute) - current)::interval minute(9) to minute)::char(10)::INTEGER
from
dots.v_shift_coverage
where
org_guid = orgid;
let sub = 0;
let sub1 = 0;
let total = 0;
for j = 0 to 4
if (my_list(j) < my_list1(j))
then
if (my_list(j + 1) > my_list1(j))
then
let sub = sub + my_list(j + 1) - my_list1(j);
end if;
end if;
end for
if (my_list(0) > my_list1(4))
then
let sub1 = my_list(0) - my_list1(4);
end if;
let total = sub + sub1;
return total;
end function;
The error that you are receiving is because my_list(j) is not valid Informix syntax to access a LIST element. Informix is interpreting my_list(j) as a call to a function named my_list.
You can use a temporary table to "emulate" an array with your logic, something like this:
CREATE TABLE somedata
(
letter1 CHAR( 2 ),
letter2 CHAR( 2 )
);
INSERT INTO somedata VALUES ( 'a1', 'a2' );
INSERT INTO somedata VALUES ( 'b1', 'b2' );
INSERT INTO somedata VALUES ( 'c1', 'c2' );
INSERT INTO somedata VALUES ( 'd1', 'd2' );
INSERT INTO somedata VALUES ( 'e1', 'e2' );
DROP FUNCTION IF EXISTS forloop;
CREATE FUNCTION forloop()
RETURNING CHAR( 2 ) AS letter1, CHAR( 2 ) AS letter2;
DEFINE number_of_rows INTEGER;
DEFINE iterator INTEGER;
DEFINE my_letter1 CHAR( 2 );
DEFINE my_letter2 CHAR( 2 );
-- Drop temp table if it already exists in the session
DROP TABLE IF EXISTS tmp_data;
CREATE TEMP TABLE tmp_data
(
tmp_id SERIAL,
tmp_letter1 CHAR( 2 ),
tmp_letter2 CHAR( 2 )
);
-- Insert rows into the temp table, serial column will be the access key
INSERT INTO tmp_data
SELECT 0,
d.letter1,
d.letter2
FROM somedata AS d
ORDER BY d.letter1;
-- Get total rows of temp table
SELECT COUNT( * )
INTO number_of_rows
FROM tmp_data;
FOR iterator = 1 TO number_of_rows
SELECT d.tmp_letter1
INTO my_letter1
FROM tmp_data AS d
WHERE d.tmp_id = iterator;
-- Check if not going "out of range"
IF iterator < number_of_rows THEN
SELECT d.tmp_letter2
INTO my_letter2
FROM tmp_data AS d
WHERE d.tmp_id = iterator + 1;
ELSE
-- iterator + 1 is "out of range", return to the beginning
SELECT d.tmp_letter2
INTO my_letter2
FROM tmp_data AS d
WHERE d.tmp_id = 1;
END IF;
RETURN my_letter1, my_letter2 WITH RESUME;
END FOR;
END FUNCTION;
-- Running the function
EXECUTE FUNCTION forloop();
-- Results
letter1 letter2
a1 b2
b1 c2
c1 d2
d1 e2
e1 a2
5 row(s) retrieved.
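The access pattern of the function above (take letter1 from row i and letter2 from row i + 1, wrapping around to the first row at the end) can be sketched on an ordinary array. This is only an illustration of the logic in JavaScript, with a made-up helper name pairWithNext; it is not Informix code.

```javascript
// Sample rows copied from the somedata table above.
const rows = [['a1', 'a2'], ['b1', 'b2'], ['c1', 'c2'], ['d1', 'd2'], ['e1', 'e2']];

// Pair column 1 of row i with column 2 of row i + 1.
// The modulo makes the last row wrap around to the first one,
// which is what the ELSE branch in the SPL function does.
function pairWithNext(rows) {
  return rows.map(([letter1], i) => [letter1, rows[(i + 1) % rows.length][1]]);
}

for (const [letter1, letter2] of pairWithNext(rows)) {
  console.log(letter1, letter2);
}
```

The temp table with a SERIAL column plays the role of the array index here: tmp_id = iterator is rows[i], and tmp_id = iterator + 1 is rows[i + 1].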

How to have sum function of a field of MB and GB values in MVC

My field values are like:
810.9 MB
1.2 GB
395.1 MB
982.3 MB
7.7 GB
149.4 MB
10.0 GB
429.1 MB
3.1 GB
and I want to sum this column in GB in my ASP.NET MVC controller, but I have no idea how to do this.
You could try something like this:
-- This is your "raw" input - just all the strings in your example
DECLARE @input TABLE (Measure VARCHAR(50))
INSERT INTO @input ( Measure )
VALUES ('810.9 MB'), ('1.2 GB'), ( '395.1 MB'), ( '982.3 MB'), ( '7.7 GB'), ( '149.4 MB'), ( '10.0 GB'), ( '429.1 MB'), ( '3.1 GB')
-- Now declare a separate table that contains (1) the raw value, (2) the contained *numerical* value, and (3) the unit of measure
DECLARE @Storage TABLE (Measure VARCHAR(50), NumValue DECIMAL(20,4), Unit VARCHAR(10))
-- Fill your raw input into that "working table"
INSERT INTO @Storage (Measure, NumValue, Unit)
SELECT
Measure,
NumMeasure = CAST(SUBSTRING(Measure, 1, CHARINDEX(' ', Measure)) AS DECIMAL(20, 2)),
Unit = SUBSTRING(Measure, CHARINDEX(' ', Measure) + 1, 9999)
FROM
@input
SELECT * FROM @Storage
-- when you select from that "working" table, you can now easily *SUM* the numerical values,
-- and show them on screen whichever way you want - as "xxx MB" or "yyyy GB" or whatever - up to you
SELECT
SUM(CASE Unit
WHEN 'MB' THEN NumValue * 1000000
WHEN 'GB' THEN NumValue * 1000000000
ELSE NumValue
END),
CAST(SUM(CASE Unit
WHEN 'MB' THEN NumValue * 1000000
WHEN 'GB' THEN NumValue * 1000000000
ELSE NumValue
END) / 1000000000.0 AS VARCHAR(25)) + ' GB'
FROM
@Storage
Update:
If you want to do this in C# code, try this:
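// running total in bytes
decimal totalBytes = 0.0m;
foreach(var item in list)
{
// split "item" into two parts
string[] parts = item.Split(' ');
// parts[0] should be a decimal value, parts[1] the unit
decimal numValue = 0.0m;
if (parts.Length == 2 && decimal.TryParse(parts[0], out numValue))
{
decimal convertedValue = 0.0m;
if(parts[1] == "MB")
{
convertedValue = numValue * 1000000;
}
else if (parts[1] == "GB")
{
convertedValue = numValue * 1000000000;
}
// add this item's size in bytes to the total
totalBytes += convertedValue;
}
}
// express the total in GB
decimal totalGb = totalBytes / 1000000000m;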

Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery

declare @ProductDetails as table(ProductName nvarchar(200),ProductDescription nvarchar(200),
Brand nvarchar(200),
Categry nvarchar(200),
Grop nvarchar(200),
MRP decimal,SalesRate decimal,CurrentQuantity decimal,AvailableQty decimal)
declare @AvailableQty table(prcode nvarchar(100),Aqty decimal)
declare @CloseStock table(pcode nvarchar(100),
Cqty decimal)
insert into @CloseStock
select PCODE ,
0.0
from producttable
insert into @AvailableQty
select PCODE ,
0.0
from producttable
--Current Qty
--OpenQty
update @CloseStock set Cqty=((OOQTY+QTY+SRRQTY+PYQTY)-(STQTY+PRRQTY))
from
(
select PC.PCODE as PRODUCTCODE,
--Opening
(select case when SUM(PU.Quantity)is null then 0 else SUM(PU.Quantity) end as Q from ProductOpeningYearEnd PU
where PC.PCODE=PU.ProductName) as OOQTY,
--Purchase
(select case when SUM(PU.quantity)is null then 0 else SUM(PU.quantity) end as Q from purchase PU
where PC.PCODE=PU.prdcode ) as QTY,
--Sales
(select case when SUM(ST.QUANTITY)is null then 0 else SUM(ST.QUANTITY)end as Q2 from salestable ST
where PC.PCODE=ST.PRODUCTCODE and ST.status!='cancel' )as STQTY,
--Physical Stock
(select case when SUM(PS.Adjustment)is null then 0 else SUM(PS.Adjustment)end as Q3 from physicalstock PS
where PC.PCODE=PS.PCODE )as PYQTY,
--Sales Return
(select case when SUM(SR.quantity)is null then 0 else SUM(SR.quantity)end as Q3 from salesreturn SR
where PC.PCODE=SR.prdcode )as SRRQTY,
--Purchase Return
(select case when SUM(PR.quantity)is null then 0 else SUM(PR.quantity)end as Q3 from purchasereturn PR
where PC.PCODE=PR.prdcode )as PRRQTY
from producttable PC
group by PC.PCODE
)t
where PCODE=t.PRODUCTCODE
--Available
update @AvailableQty set Aqty=((CCqty-GIQty)+(GOQty))
--((OOQTY+QTY+SRRQTY+PYQTY)-(STQTY+PRRQTY))
from
(
select PC.PCODE as PRODUCTCODE,
--GoodsIn
(select case when SUM(GI.quantity)is null then 0 else SUM(GI.quantity) end as Q from goodsin GI
where PC.PCODE=GI.productcode) as GIQty,
--GoodsOut
(select case when SUM(GUT.quantity)is null then 0 else SUM(GUT.quantity) end as Q from goodsout GUT
where PC.PCODE=GUT.productcode ) as GOQty,
--Current Stock
(select CS.Cqty as Q from @CloseStock CS
where PC.PCODE=CS.pcode ) as CCqty
from producttable PC
group by PC.PCODE
)t
where prcode=t.PRODUCTCODE
insert into @ProductDetails
select PCODE,[DESCRIPTION],BRAND,CATEGORY,DEPARTMENT,MRP,SALERATE,0,0
from producttable
update @ProductDetails set CurrentQuantity=pcqty,AvailableQty=acqty
from
(
select pt.ProductName as pn,cs.Cqty as pcqty,ac.Aqty as acqty from @ProductDetails pt
inner join @CloseStock cs on pt.ProductName=cs.pcode
inner join @AvailableQty ac on pt.ProductName=ac.prcode
)t
where ProductName=t.pn
select * from @ProductDetails
end
This does not work when the pcode field in producttable contains a symbol such as -, . or &. I want to allow such symbols in the pcode field.
Please help me: how can I allow any symbol in the query?
The problem is with this code:
update @AvailableQty set Aqty=((CCqty-GIQty)+(GOQty))
from
(
select PC.PCODE as PRODUCTCODE,
--GoodsIn
(select case when SUM(GI.quantity)is null then 0 else SUM(GI.quantity) end as Q from goodsin GI
where PC.PCODE=GI.productcode) as GIQty,
--GoodsOut
(select case when SUM(GUT.quantity)is null then 0 else SUM(GUT.quantity) end as Q from goodsout GUT
where PC.PCODE=GUT.productcode ) as GOQty,
--Current Stock
(select CS.Cqty as Q from @CloseStock CS
where PC.PCODE=CS.pcode ) as CCqty
from producttable PC
group by PC.PCODE
)t
where prcode=t.PRODUCTCODE
The problem here isn't with the symbols you're using, it's that you are assigning the value of a subquery to a single column in the result set. For example:
(select case when SUM(PR.quantity)is null then 0 else SUM(PR.quantity)end as Q3 from purchasereturn PR
where PC.PCODE=PR.prdcode )as PRRQTY
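Note that this is allowed only if the subquery returns a single value; otherwise, SQL Server cannot tell which of the values should be assigned to the column. A subquery that selects only an aggregate such as SUM with no GROUP BY always returns exactly one row, so the first place to look is the CCqty subquery, which has no aggregate and returns one row for every matching pcode.
If you expect a subquery to return multiple values and you just want an arbitrary one, use TOP 1 in the subquery so that it returns only one value. Otherwise, you'll have to debug each subquery to figure out which one returns multiple rows and causes the error.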

How to verify that specific paths do not exist in cypher query

I would like to get nodes which do not have a specific relation (a relation with specific properties).
The graph contains entity nodes (the n's), which occur at specific lines (line_nr) in files (the f).
The current query I have is as follows:
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
, p4=(f)<-[right4?:OCCURS]-(n4)
, p7=(f)<-[right7?:OCCURS]-(n7)
WHERE ( (
( n4.text? =~ "nonreachablenodestextregex" AND (p4 = null OR left.line_nr < right4.line_nr - 0 OR left.line_nr > right4.line_nr + 0 OR ID(left) = ID(right4) ) ) )
AND (
( n7.text? =~ "othernonreachablenodestextregex" AND (p7 = null OR left.line_nr < right7.line_nr - 0 OR left.line_nr > right7.line_nr + 0 OR ID(left) = ID(right7) ) ) ) )
WITH n, left, f, count(*) as group_by_cause
RETURN ID(left) as occ_id,
n.text as ent_text,
substring(f.text, ABS(left.file_offset-1), 2 + LENGTH(n.text) ) as occ_text,
f.path as file_path,
left.line_nr as occ_line_nr,
ID(f) as file_id
Instead of a new path in the MATCH clause, I thought it would also be possible to have:
NOT ( (f)<-[right4:OCCURS]-(n4) )
But, I do not want to exclude the existence of any path, but specific paths.
As an alternative solution, I thought to include additional start nodes (as I have an index on the not to be reachable node), to remove the text comparison in the WHERE clause. This however doesn't return anything if there are no nodes in neo4j matching the wildcard.
start n=node:entities("text:*")
, n4=node:entities("text:nonreachablenodestextwildcard")
, n7=node:entities("text:othernonreachablenodestextwildcard")
MATCH p=(n)-[left:OCCURS]->(f)
, p4=(f)<-[right4?:OCCURS]-(n4)
, p7=(f)<-[right7?:OCCURS]-(n7)
WHERE ( (
( (p4 = null
OR left.line_nr < right4.line_nr - 0
OR left.line_nr > right4.line_nr + 0
OR ID(left) = ID(right4) ) ) )
AND (
( (p7 = null
OR left.line_nr < right7.line_nr - 0
OR left.line_nr > right7.line_nr + 0
OR ID(left) = ID(right7) ) )
) )
Old Update:
As mentioned in the answers, I could use the predicate functions to construct an inner query. I therefore updated the query to:
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
WHERE ( (
(NONE(path in (f)<-[:OCCURS]-(n4)
WHERE
(LAST(nodes(path))).text =~ "nonreachablenodestextRegex"
AND FIRST(r4 in rels(p)).line_nr <= left.line_nr
AND FIRST(r4 in rels(p)).line_nr >= left.line_nr
)
) )
AND (
(NONE(path in (f)<-[:OCCURS]-(n7)
WHERE
(LAST(nodes(path))).text =~ "othernonreachablenodestextRegex"
AND FIRST(r7 in rels(p)).line_nr <= left.line_nr
AND FIRST(r7 in rels(p)).line_nr >= left.line_nr
)
) )
)
WITH n, left, f, count(*) as group_by_cause
RETURN ....
This gives me a java.lang.OutOfMemoryError:
java.lang.OutOfMemoryError: Java heap space
at java.util.regex.Pattern.compile(Pattern.java:1432)
at java.util.regex.Pattern.<init>(Pattern.java:1133)
at java.util.regex.Pattern.compile(Pattern.java:823)
at scala.util.matching.Regex.<init>(Regex.scala:38)
at scala.collection.immutable.StringLike$class.r(StringLike.scala:226)
at scala.collection.immutable.StringOps.r(StringOps.scala:31)
at org.neo4j.cypher.internal.parser.v1_9.Base.ignoreCase(Base.scala:31)
at org.neo4j.cypher.internal.parser.v1_9.Base.ignoreCases(Base.scala:49)
at org.neo4j.cypher.internal.parser.v1_9.Base$$anonfun$ignoreCases$1.apply(Base.scala:49)
at org.neo4j.cypher.internal.parser.v1_9.Base$$anonfun$ignoreCases$1.apply(Base.scala:49)
at scala.util.parsing.combinator.Parsers$Parser.p$3(Parsers.scala:209)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:163)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:183)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:163)
(The last 6 lines are repeated a few more times)
Solution
The previous update probably contains a syntax error somewhere. I got it working with a slightly different query, as follows:
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
WHERE (
(NONE ( path in (f)<-[:OCCURS]-()
WHERE
ANY(n4 in nodes(path)
WHERE ID(n4) <> ID(n)
AND n4.type = 'ENTITY'
AND n4.text =~ "a regex expr"
)
AND ALL(r4 in rels(path)
WHERE r4.line_nr <= left.line_nr + 0
AND r4.line_nr >= left.line_nr - 0
)
)
) )
AND
NONE ( ...... )
WITH n, left, f, count(*) as group_by_cause
RETURN ...
It is slow, however: on the order of seconds (>10) for a small graph with
4 entity nodes and 6 :OCCURS relations in total, all pointing to a single destination f node, with line_nr values between 0 and 3.
Performance Update
The following is about twice as fast:
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
, p4=(f)<-[right4?:OCCURS]-(n4)
, p7=(f)<-[right7?:OCCURS]-(n7)
WHERE
( n4.text? =~ "regex1"
AND (p4 = null
OR left.line_nr < right4.line_nr - 0
OR left.line_nr > right4.line_nr + 0
OR ID(left) = ID(right4)
)
)
AND
( n7.text? =~ "regex2"
AND (p7 = null .....)
)
WITH n, left, f, count(*) as group_by_cause
RETURN ....
I think instead of the optional relationships you should use the pattern predicates in WHERE as you noted. The pattern expressions actually return a collection of paths, so you could do collection predicates like (ALL, NONE, ANY, SINGLE)
WHERE NONE(path in (f)<-[:OCCURS]-(n4) WHERE
ALL(r in rels(p) : r.line_nr = 42 ))
see: http://docs.neo4j.org/chunked/milestone/query-function.html#_predicates
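To show what the NONE predicate is meant to filter, here is a minimal sketch of the same logic on in-memory data in JavaScript. The occurrence records, the field names and the helper withoutNearby are assumptions made up for illustration. Each record stands for one OCCURS relationship, and the id comparison plays the role of ID(left) = ID(right4) in the question.

```javascript
// Hypothetical in-memory data: each occurrence links an entity to a file at a line.
const occurrences = [
  { id: 1, entity: 'foo', file: 'a.txt', lineNr: 0 },
  { id: 2, entity: 'bar', file: 'a.txt', lineNr: 0 },
  { id: 3, entity: 'foo', file: 'a.txt', lineNr: 2 },
  { id: 4, entity: 'baz', file: 'a.txt', lineNr: 3 },
];

// Keep an occurrence only if NONE of the other occurrences in the same file
// has an entity text matching `forbidden` within `window` lines of it.
function withoutNearby(occs, forbidden, window = 0) {
  return occs.filter(left =>
    !occs.some(right =>
      right.id !== left.id &&                               // skip the occurrence itself
      right.file === left.file &&                           // same destination file node
      forbidden.test(right.entity) &&                       // like n4.text =~ "regex"
      Math.abs(right.lineNr - left.lineNr) <= window));     // line_nr within the window
}

const kept = withoutNearby(occurrences, /^bar$/);
console.log(kept.map(o => o.id)); // [ 2, 3, 4 ]
```

Occurrence 1 is dropped because "bar" occurs on the same line of the same file. Occurrence 2 is kept because an occurrence is never compared with itself. The regex is anchored because =~ in Cypher matches the whole string.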

Resources