How to verify that specific paths do not exist in cypher query - neo4j

I would like to get nodes which do not have a specific relation (a relation with specific properties).
The graph contains entity nodes (the n's), which occur at specific lines (line_nr) in files (the f).
The current query I have is as follows:
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
, p4=(f)<-[right4?:OCCURS]-(n4)
, p7=(f)<-[right7?:OCCURS]-(n7)
WHERE ( (
( n4.text? =~ "nonreachablenodestextregex" AND (p4 = null OR left.line_nr < right4.line_nr - 0 OR left.line_nr > right4.line_nr + 0 OR ID(left) = ID(right4) ) ) )
AND (
( n7.text? =~ "othernonreachablenodestextregex" AND (p7 = null OR left.line_nr < right7.line_nr - 0 OR left.line_nr > right7.line_nr + 0 OR ID(left) = ID(right7) ) ) ) )
WITH n, left, f, count(*) as group_by_cause
RETURN ID(left) as occ_id,
n.text as ent_text,
substring(f.text, ABS(left.file_offset-1), 2 + LENGTH(n.text) ) as occ_text,
f.path as file_path,
left.line_nr as occ_line_nr,
ID(f) as file_id
Instead of a new path in the MATCH clause, I thought it would also be possible to have:
NOT ( (f)<-[right4:OCCURS]-(n4) )
However, I do not want to exclude every such path, only specific paths.
As an alternative solution, I thought to include additional start nodes (as I have an index on the not to be reachable node), to remove the text comparison in the WHERE clause. This however doesn't return anything if there are no nodes in neo4j matching the wildcard.
start n=node:entities("text:*")
, n4=node:entities("text:nonreachablenodestextwildcard")
, n7=node:entities("text:othernonreachablenodestextwildcard")
MATCH p=(n)-[left:OCCURS]->(f)
, p4=(f)<-[right4?:OCCURS]-(n4)
, p7=(f)<-[right7?:OCCURS]-(n7)
WHERE ( (
( (p4 = null
OR left.line_nr < right4.line_nr - 0
OR left.line_nr > right4.line_nr + 0
OR ID(left) = ID(right4) ) ) )
AND (
( (p7 = null
OR left.line_nr < right7.line_nr - 0
OR left.line_nr > right7.line_nr + 0
OR ID(left) = ID(right7) ) )
) )
Old Update:
As mentioned in the answers, I could use the predicate functions to construct an inner query. I therefore updated the query to:
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
WHERE ( (
(NONE(path in (f)<-[:OCCURS]-(n4)
WHERE
(LAST(nodes(path))).text =~ "nonreachablenodestextRegex"
AND FIRST(r4 in rels(p)).line_nr <= left.line_nr
AND FIRST(r4 in rels(p)).line_nr >= left.line_nr
)
) )
AND (
(NONE(path in (f)<-[:OCCURS]-(n7)
WHERE
(LAST(nodes(path))).text =~ "othernonreachablenodestextRegex"
AND FIRST(r7 in rels(p)).line_nr <= left.line_nr
AND FIRST(r7 in rels(p)).line_nr >= left.line_nr
)
) )
)
WITH n, left, f, count(*) as group_by_cause
RETURN ....
This gives me a java.lang.OutOfMemoryError:
java.lang.OutOfMemoryError: Java heap space
at java.util.regex.Pattern.compile(Pattern.java:1432)
at java.util.regex.Pattern.<init>(Pattern.java:1133)
at java.util.regex.Pattern.compile(Pattern.java:823)
at scala.util.matching.Regex.<init>(Regex.scala:38)
at scala.collection.immutable.StringLike$class.r(StringLike.scala:226)
at scala.collection.immutable.StringOps.r(StringOps.scala:31)
at org.neo4j.cypher.internal.parser.v1_9.Base.ignoreCase(Base.scala:31)
at org.neo4j.cypher.internal.parser.v1_9.Base.ignoreCases(Base.scala:49)
at org.neo4j.cypher.internal.parser.v1_9.Base$$anonfun$ignoreCases$1.apply(Base.scala:49)
at org.neo4j.cypher.internal.parser.v1_9.Base$$anonfun$ignoreCases$1.apply(Base.scala:49)
at scala.util.parsing.combinator.Parsers$Parser.p$3(Parsers.scala:209)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:163)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:183)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:163)
(The last 6 lines are repeated a few more times)
Solution
The previous update probably contains a syntax error somewhere. I got it working in a slightly different way, as follows:
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
WHERE (
(NONE ( path in (f)<-[:OCCURS]-()
WHERE
ANY(n4 in nodes(path)
WHERE ID(n4) <> ID(n)
AND n4.type = 'ENTITY'
AND n4.text =~ "a regex expr"
)
AND ALL(r4 in rels(path)
WHERE r4.line_nr <= left.line_nr + 0
AND r4.line_nr >= left.line_nr - 0
)
)
) )
AND
NONE ( ...... )
WITH n, left, f, count(*) as group_by_cause
RETURN ...
It is, however, slow: on the order of seconds (>10) for a small graph with
4 entity nodes and 6 :OCCURS relations in total, all pointing to a single destination f node, with line_nr values between 0 and 3.
Performance Update
The following is about twice as fast:
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
, p4=(f)<-[right4?:OCCURS]-(n4)
, p7=(f)<-[right7?:OCCURS]-(n7)
WHERE
( n4.text? =~ "regex1"
AND (p4 = null
OR left.line_nr < right4.line_nr - 0
OR left.line_nr > right4.line_nr + 0
OR ID(left) = ID(right4)
)
)
AND
( n7.text? =~ "regex2"
AND (p7 = null .....)
)
WITH n, left, f, count(*) as group_by_cause
RETURN ....

I think that instead of the optional relationships you should use pattern predicates in WHERE, as you noted. Pattern expressions actually return a collection of paths, so you can apply the collection predicates (ALL, NONE, ANY, SINGLE) to them:
WHERE NONE(path in (f)<-[:OCCURS]-(n4) WHERE
ALL(r in rels(path) WHERE r.line_nr = 42 ))
see: http://docs.neo4j.org/chunked/milestone/query-function.html#_predicates
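To make the intent of the NONE(...) predicate concrete, here is a minimal Python sketch of the same filter applied to plain in-memory data. It is not Cypher and not part of the answer above; the Occurrence class, its field names and the window parameter are assumptions chosen to mirror ID(left), n.text, f.path, left.line_nr and the "+ 0 / - 0" offsets from the question.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """Hypothetical stand-in for one (n)-[left:OCCURS]->(f) match."""
    occ_id: int   # plays the role of ID(left)
    entity: str   # plays the role of n.text
    file: str     # plays the role of f.path
    line_nr: int  # plays the role of left.line_nr


def occurrences_without_neighbour(occurrences, forbidden_regex, window=0):
    """Keep occurrences for which NO other occurrence in the same file,
    within `window` lines, belongs to an entity matching `forbidden_regex`."""
    # Cypher's =~ must match the whole string, hence fullmatch.
    pattern = re.compile(forbidden_regex)
    kept = []
    for left in occurrences:
        blocked = any(
            right.occ_id != left.occ_id
            and right.file == left.file
            and pattern.fullmatch(right.entity) is not None
            and abs(right.line_nr - left.line_nr) <= window
            for right in occurrences
        )
        if not blocked:
            kept.append(left)
    return kept
```

Each further NONE(...) clause in the query corresponds to calling the function again on the previous result with another regex. The sketch compares every pair of occurrences, which is fine for illustration but says nothing about how Neo4j evaluates the predicate.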


What am I doing wrong in writing this Snowflake function with variables?

Another user was helping me with this problem, but I'm having trouble executing it. I'm getting this error:
syntax error line 3 at position 8 unexpected 'time'. syntax error line 3 at position 23 unexpected ':'. (line 3)
It seems I'm declaring the variables wrong. I'm fairly sure of that, because when I comment out "time", I get an error for "curDay" instead.
Here is the function I'm trying to execute:
CREATE OR REPLACE FUNCTION DB_BI_DEV.RAW_CPMS_AAR.cfn_GetShiftIDFromDateTime (dateTime TIMESTAMP_NTZ(9), shiftCalendarID int)
RETURNS table (shiftID int)
AS
$$
DECLARE
time time TIME(:dateTime);
curDay int;
prvDay int;
shiftID int;
BEGIN
SELECT TOP 1
ID,
DATEDIFF( day, BeginDate, :dateTime ) % PeriodInDays + 1,
( :curDay + PeriodInDays - 2 ) % PeriodInDays + 1
INTO :shiftCalendarID, :curDay, :prvDay
FROM RAW_CPMS_AAR.ShiftCalendar
WHERE ID = :shiftCalendarID
OR ( :shiftCalendarID IS NULL
AND Name = 'Factory'
AND BeginDate <= :dateTime )
ORDER BY BeginDate DESC;
SELECT ID into :shiftID
FROM Shift
WHERE ShiftCalendarID = @shiftCalendarID
AND ( ( FromDay = :curDay AND FromTimeOfDay <= :time AND TillTimeOfDay > :time )
OR ( FromDay = :curDay AND FromTimeOfDay >= TillTimeOfDay AND FromTimeOfDay <= :time )
OR ( FromDay = :prvDay AND FromTimeOfDay >= TillTimeOfDay AND TillTimeOfDay > :time )
);
END;
$$
This may need some small changes, but should be close to what you require:
CREATE OR REPLACE FUNCTION DB_BI_DEV.RAW_CPMS_AAR.cfn_GetShiftIDFromDateTime (dateTime TIMESTAMP_NTZ(9), shiftCalendarID int)
RETURNS table (shiftID int)
AS
$$
WITH T0 (ShiftCalendarID, CurDay, PrvDay)
AS (
SELECT TOP 1
ID AS ShiftCalendarID,
DATEDIFF( day, BeginDate, :dateTime ) % PeriodInDays + 1 AS CurDay,
( CurDay + PeriodInDays - 2 ) % PeriodInDays + 1 AS PrvDay
FROM RAW_CPMS_AAR.ShiftCalendar
WHERE ID = :shiftCalendarID
OR ( :shiftCalendarID IS NULL
AND Name = 'Factory'
AND BeginDate <= :dateTime )
ORDER BY BeginDate DESC
),
T1 (TimeValue)
AS (
SELECT TIME_FROM_PARTS(
EXTRACT(HOUR FROM :datetime),
EXTRACT(MINUTE FROM :datetime),
EXTRACT(SECOND FROM :datetime))
)
SELECT ID as shiftID
FROM Shift, T0, T1
WHERE Shift.ShiftCalendarID = T0.ShiftCalendarID
AND ( ( FromDay = T0.CurDay AND FromTimeOfDay <= T1.TimeValue AND TillTimeOfDay > T1.TimeValue )
OR ( FromDay = T0.CurDay AND FromTimeOfDay >= TillTimeOfDay AND FromTimeOfDay <= T1.TimeValue )
OR ( FromDay = T0.PrvDay AND FromTimeOfDay >= TillTimeOfDay AND TillTimeOfDay > T1.TimeValue )
);
$$
To add on to Dave Weldon's solution - in UDF bodies, there are no colons in front of parameters. Instead of :shiftCalendarID it is shiftCalendarID and :dateTime is just dateTime. Colons are needed in Stored Procedures, because the parameters are treated as string literal constants, but this is not the case with User Defined Functions.
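The day arithmetic and the three shift conditions in the query are easy to get wrong, so here is a small Python sketch of the same logic for checking expected values by hand. It is only an illustration, not Snowflake code; the function names shift_days and shift_matches are made up, and it assumes dateTime is not earlier than BeginDate (Python and SQL treat the remainder of a negative number differently).

```python
from datetime import date, datetime, time


def shift_days(begin_date: date, moment: datetime, period_in_days: int):
    """Return (cur_day, prv_day) as 1-based positions in the shift rotation."""
    # DATEDIFF(day, BeginDate, dateTime)
    elapsed = (moment.date() - begin_date).days
    cur_day = elapsed % period_in_days + 1
    # The day before cur_day, wrapping from day 1 back to the last day.
    prv_day = (cur_day + period_in_days - 2) % period_in_days + 1
    return cur_day, prv_day


def shift_matches(from_day, from_time, till_time, cur_day, prv_day, now: time) -> bool:
    """Mirror of the three OR branches in the WHERE clause."""
    # Shift starts and ends on the current day.
    same_day = from_day == cur_day and from_time <= now < till_time
    # Shift starts today and runs past midnight; we are before midnight.
    overnight_start = from_day == cur_day and from_time >= till_time and from_time <= now
    # Shift started yesterday and runs past midnight; we are after midnight.
    overnight_tail = from_day == prv_day and from_time >= till_time and till_time > now
    return same_day or overnight_start or overnight_tail
```

For example, with a 7-day period and a calendar beginning on 2024-01-01, a timestamp on 2024-01-11 falls on day 4 of the rotation and the previous day is day 3.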

User Defined Function in neo4j

I have the following function that I found in this Reference.
I want to use this function to find the minimum dominating set. How can I execute this in neo4j?
function CYPHERGreedyAlgorithm
MATCH(h)
SET h.whiteness = 1
SET h.blackness = 0
WITH h
OPTIONAL MATCH(j)
WHERE NOT (j)-->()
SET j.blackness = 1
SET j.whiteness = 0
WITH j
MATCH (n)-->(m)
WHERE n.blackness <> 1
WITH collect(m) as neighbourhood, n
WITH reduce(totalweight = n.whiteness, j in neighbourhood | totalweight + j.whiteness) as weightings, n
WITH n, weightings
ORDER BY weightings desc limit 1
MATCH (n1)-->(m1)
WHERE n1.blackness <> 1
WITH collect(m1) as neighbourhood, n1
WITH reduce(totalweight = n1.whiteness, j in neighbourhood | totalweight + j.whiteness) as weightings, n1
WITH n1, weightings
ORDER BY weightings desc limit 1
MATCH(n1)-->(m1)
WHERE m1.blackness <> 1
SET n1.blackness = 1
SET n1.whiteness = 0
SET m1.whiteness = 0
WITH n1
MATCH (k)
WHERE k.whiteness = 1
RETURN count(distinct(k)) as countOfRemainingWhiteNodes
end function

How can I get the second shortest path?

I created a big graph database. The graph contains relationships between organizations. I would like to get the shortest path between two nodes.
I filter the relationship types with the following query:
MATCH (o:Organization) WHERE Id(o) = 23806112
MATCH (o1:Organization) WHERE Id(o1) = 24385058
MATCH p = allShortestPaths((o)-[*1..10]-(o1)) WHERE
ALL (r IN RELATIONSHIPS(p) WHERE
(type(r) = 'OrganizationFounderOrganization' AND r.DateFrom <= datetime('2019-03-13') )
OR (type(r) = 'OrganizationFounderPerson' AND r.DateFrom <= datetime('2019-03-13') )
OR (type(r) = 'OrganizationChief' AND r.DateFrom <= datetime('2019-03-13') )
OR (type(r) = 'OrganizationManagingCompany' AND r.DateFrom <= datetime('2019-03-13') )
OR (type(r) = 'OrganizationPhone')
OR (type(r) = 'OrganizationAddress' AND NOT EXISTS(r.DateTo) )
)
RETURN p SKIP 0 LIMIT 30
The filters on the relationship types cannot be combined, because each type may need different filters.
The execution plan for this query:
http://joxi.ru/E2pNPDDH9Rpger
After I get the paths, I filter their nodes by other conditions (capital, status). If the paths are not the right ones, I run the next query, which additionally excludes the bad nodes.
MATCH (o:Organization) WHERE Id(o) = 23806112
MATCH (o1:Organization) WHERE Id(o1) = 24385058
MATCH p = allShortestPaths((o)-[*1..10]-(o1)) WHERE
ALL (r IN RELATIONSHIPS(p) WHERE
(type(r) = 'OrganizationFounderOrganization' AND r.DateFrom <= datetime('2019-03-13') )
OR (type(r) = 'OrganizationFounderPerson' AND r.DateFrom <= datetime('2019-03-13') )
OR (type(r) = 'OrganizationChief' AND r.DateFrom <= datetime('2019-03-13') )
OR (type(r) = 'OrganizationManagingCompany' AND r.DateFrom <= datetime('2019-03-13') )
OR (type(r) = 'OrganizationPhone')
OR (type(r) = 'OrganizationAddress' AND NOT EXISTS(r.DateTo) )
)
AND ALL (n IN NODES(p) WHERE NOT Id(n) IN [15665,1557884,7888953])
RETURN p SKIP 0 LIMIT 30
The execution plan for the second query:
http://joxi.ru/KAxWPDDcMJGyY2
While executing the second query, Neo4j becomes unresponsive and I have to restart the container.
Neo4j version 3.4.7
Store Sizes
Count Store 5.69 KiB
Label Store 16.02 KiB
Index Store 9.27 GiB
Schema Store 8.01 KiB
Array Store 8.01 KiB
Logical Log 16.53 MiB
Node Store 1.18 GiB
Property Store 20.33 GiB
Relationship Store 8.79 GiB
String Store 18.73 GiB
Total Store Size 58.56 GiB
ID Allocation
Node ID 84686407
Property ID 532508736
Relationship ID 276570526
Relationship Type ID 13
Container memory limit 122880mb
Processor 32 core

Find top 3 nodes with maximum relationships

The structure of my data base is:
( :node ) -[:give { money: some_int_value } ]-> ( :Org )
One node can have multiple relations.
I need to find top 3 nodes with the most number of relations :give with their property money holding: vx <= money <= vy
Using ORDER BY and LIMIT should solve your problem:
Match ( n:node ) -[r:give { money: some_int_value } ]-> ( :Org )
RETURN n
ORDER BY count(r) DESC //Order by the number of relations each node has
LIMIT 3 //We only want the top 3 nodes
Instead of using the label 'node', maybe use something more descriptive like Person for the label so the datamodel is more clear:
MATCH (p:Person)-[r:give]->(o:Org)
WITH count(r) AS num, sum(r.money) AS total, p
RETURN p, num, total ORDER BY num DESC LIMIT 3;
I'm not sure what you mean by "their property money holding: vx <= money <= vy". If you could clarify I can update my answer accordingly. You can calculate the total of the money properties using the sum() function.
Edit
To only include relationships whose money property is between 10 and 25 (inclusive):
MATCH (p:Person)-[r:give]->(o:Org)
WHERE r.money >= 10 AND r.money <= 25
WITH count(r) AS num, sum(r.money) AS total, p
RETURN p, num, total ORDER BY num DESC LIMIT 3;
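As a cross-check of what the filtered query computes, here is a minimal Python sketch of the same "filter, count per node, take the top 3" steps on plain tuples. The function name top_givers and the (giver, org, money) tuple layout are assumptions for illustration only, with low and high standing in for vx and vy.

```python
from collections import Counter


def top_givers(relationships, low, high, limit=3):
    """relationships: iterable of (giver, org, money) tuples.
    Count, per giver, the relationships with low <= money <= high
    and return the `limit` givers with the highest counts."""
    counts = Counter(
        giver for giver, _org, money in relationships if low <= money <= high
    )
    # most_common sorts by count, descending, like ORDER BY num DESC LIMIT 3.
    return counts.most_common(limit)
```

Nodes with no relationship inside the range never enter the counter, which matches the Cypher query: the WHERE clause removes those rows before count(r) is computed.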

Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery

declare @ProductDetails as table(ProductName nvarchar(200),ProductDescription nvarchar(200),
Brand nvarchar(200),
Categry nvarchar(200),
Grop nvarchar(200),
MRP decimal,SalesRate decimal,CurrentQuantity decimal,AvailableQty decimal)
declare @AvailableQty table(prcode nvarchar(100),Aqty decimal)
declare @CloseStock table(pcode nvarchar(100),
Cqty decimal)
insert into @CloseStock
select PCODE ,
0.0
from producttable
insert into @AvailableQty
select PCODE ,
0.0
from producttable
--Current Qty
--OpenQty
update @CloseStock set Cqty=((OOQTY+QTY+SRRQTY+PYQTY)-(STQTY+PRRQTY))
from
(
select PC.PCODE as PRODUCTCODE,
--Opening
(select case when SUM(PU.Quantity)is null then 0 else SUM(PU.Quantity) end as Q from ProductOpeningYearEnd PU
where PC.PCODE=PU.ProductName) as OOQTY,
--Purchase
(select case when SUM(PU.quantity)is null then 0 else SUM(PU.quantity) end as Q from purchase PU
where PC.PCODE=PU.prdcode ) as QTY,
--Sales
(select case when SUM(ST.QUANTITY)is null then 0 else SUM(ST.QUANTITY)end as Q2 from salestable ST
where PC.PCODE=ST.PRODUCTCODE and ST.status!='cancel' )as STQTY,
--Physical Stock
(select case when SUM(PS.Adjustment)is null then 0 else SUM(PS.Adjustment)end as Q3 from physicalstock PS
where PC.PCODE=PS.PCODE )as PYQTY,
--Sales Return
(select case when SUM(SR.quantity)is null then 0 else SUM(SR.quantity)end as Q3 from salesreturn SR
where PC.PCODE=SR.prdcode )as SRRQTY,
--Purchase Return
(select case when SUM(PR.quantity)is null then 0 else SUM(PR.quantity)end as Q3 from purchasereturn PR
where PC.PCODE=PR.prdcode )as PRRQTY
from producttable PC
group by PC.PCODE
)t
where PCODE=t.PRODUCTCODE
--Available
update @AvailableQty set Aqty=((CCqty-GIQty)+(GOQty))
--((OOQTY+QTY+SRRQTY+PYQTY)-(STQTY+PRRQTY))
from
(
select PC.PCODE as PRODUCTCODE,
--GoodsIn
(select case when SUM(GI.quantity)is null then 0 else SUM(GI.quantity) end as Q from goodsin GI
where PC.PCODE=GI.productcode) as GIQty,
--GoodsOut
(select case when SUM(GUT.quantity)is null then 0 else SUM(GUT.quantity) end as Q from goodsout GUT
where PC.PCODE=GUT.productcode ) as GOQty,
--Current Stock
(select CS.Cqty as Q from @CloseStock CS
where PC.PCODE=CS.pcode ) as CCqty
from producttable PC
group by PC.PCODE
)t
where prcode=t.PRODUCTCODE
insert into @ProductDetails
select PCODE,[DESCRIPTION],BRAND,CATEGORY,DEPARTMENT,MRP,SALERATE,0,0
from producttable
update @ProductDetails set CurrentQuantity=pcqty,AvailableQty=acqty
from
(
select pt.ProductName as pn,cs.Cqty as pcqty,ac.Aqty as acqty from @ProductDetails pt
inner join @CloseStock cs on pt.ProductName=cs.pcode
inner join @AvailableQty ac on pt.ProductName=ac.prcode
)t
where ProductName=t.pn
select * from @ProductDetails
end
This does not work when the pcode field in producttable contains a symbol such as -, . or &. I want to allow such symbols in the pcode field.
Please help me: how can I allow any symbol in the query?
(The problem is with this code:)
update @AvailableQty set Aqty=((CCqty-GIQty)+(GOQty))
from
(
select PC.PCODE as PRODUCTCODE,
--GoodsIn
(select case when SUM(GI.quantity)is null then 0 else SUM(GI.quantity) end as Q from goodsin GI
where PC.PCODE=GI.productcode) as GIQty,
--GoodsOut
(select case when SUM(GUT.quantity)is null then 0 else SUM(GUT.quantity) end as Q from goodsout GUT
where PC.PCODE=GUT.productcode ) as GOQty,
--Current Stock
(select CS.Cqty as Q from @CloseStock CS
where PC.PCODE=CS.pcode ) as CCqty
from producttable PC
group by PC.PCODE
)t
where prcode=t.PRODUCTCODE
The problem here isn't with the symbols you're using, it's that you are assigning the value of a subquery to a single column in the result set. For example:
(select case when SUM(PR.quantity)is null then 0 else SUM(PR.quantity)end as Q3 from purchasereturn PR
where PC.PCODE=PR.prdcode )as PRRQTY
Note that this is allowed only if the subquery returns only a single value; otherwise, we don't know which of the values should be assigned to the column.
If you expect your subqueries to return multiple values and you just want an arbitrary one, use TOP 1 in the subquery to only return 1 value. Otherwise, you'll have to debug each subquery to figure out which returns multiple results and is causing the issue.
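One way to debug this is to check, for each table a scalar subquery reads from, whether its join key occurs more than once. The sketch below runs that check from Python against an in-memory SQLite database purely so that it is runnable; SQLite is not SQL Server, and the close_stock table and its rows are made-up sample data. The GROUP BY ... HAVING COUNT(*) > 1 query itself is standard SQL and should carry over.

```python
import sqlite3


def duplicated_keys(conn, table, key_column):
    """Return (key, count) rows for keys that occur more than once.
    Table and column names are interpolated into the SQL text, so pass
    only trusted, hard-coded identifiers (hypothetical names here)."""
    query = (
        f"SELECT {key_column}, COUNT(*) FROM {table} "
        f"GROUP BY {key_column} HAVING COUNT(*) > 1 "
        f"ORDER BY {key_column}"
    )
    return conn.execute(query).fetchall()


# Made-up sample data: the code 'A-1' appears twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE close_stock (pcode TEXT, cqty REAL)")
conn.executemany(
    "INSERT INTO close_stock VALUES (?, ?)",
    [("A-1", 5.0), ("B&2", 3.0), ("A-1", 7.0)],
)

print(duplicated_keys(conn, "close_stock", "pcode"))
```

Any key this reports would make a subquery such as the "Current Stock" one return several rows for a single product, regardless of which symbols the code contains.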
