Cypher query condition is always returning true - neo4j

MATCH (c:OBJ {dummy:false})
where [{param} is null or [(c)-[]->(p:PRO {dummy:false}) WHERE p.val ={param} | true ] ]
return c
This is a simplified part of a much bigger query. But the
[{param} is null or [(c)-[]->(p:PRO {dummy:false}) WHERE p.val = {param} | true] ]
part always returns true, even when p.val = {param} is false. What did I do wrong here? The syntax looks fine to me.

A pattern comprehension will always return a list, even if it is empty.
So, the following will never evaluate to false or NULL:
[(c)-[]->(p:PRO {dummy:false}) WHERE p.val = {param} | true]
This query (which tests the size of the inner comprehension result) will probably work for you:
MATCH (c:OBJ {dummy:false})
WHERE $param IS NULL OR SIZE([(c)-->(p:PRO {dummy:false}) WHERE p.val = $param | true]) > 0
RETURN c

There are a couple of things going on here. Note cybersam's answer, as it relates to the pattern comprehension within.
However, the bigger problem is that the WHERE clause will ultimately be evaluating a list, NOT a boolean!
By using square brackets for the entirety of your WHERE clause, that means you're creating a list, and its contents will be whatever is inside. The stuff inside will evaluate to a boolean, so this means that there are two possibilities for how this will turn out:
WHERE [true]
or
WHERE [false]
and in either of these cases, these are NOT booleans, but single-element lists, where the only element in the list is a boolean!
Using WHERE on a list like this will ALWAYS evaluate to true. No matter what is in the list. No matter how many elements are in the list. No matter if the list is empty. If the list itself exists, it's going to be evaluated as true, and the WHERE clause succeeds.
So to fix all this, do NOT use square brackets as some means to compartmentalize boolean logic. Use parentheses instead, and only if you really need them:
MATCH (c:OBJ {dummy:false})
WHERE ( {param} is null or (c)-->(:PRO {dummy:false, val:{param}}) )
RETURN c
In the above, the parentheses around the WHERE condition aren't really needed, but you can use them if you like.
This should also be a better way of expressing the predicate you want. In this case we don't need the pattern comprehension at all. It's enough to say that we either need the {param} parameter to be null, or this pattern (where one of the properties is the {param} parameter) must exist.
If you really do end up needing a pattern comprehension, follow cybersam's advice and make sure you're testing whether the list resulting from the comprehension is empty or non-empty.

Related

Query against a Postgres array column type

TL;DR: I'm wondering what the pros and cons are of using @> {as_champion, whatever} versus IN ('as_champion', 'whatever'), or whether they are even equivalent. Details below:
I'm working with Rails and using Postgres' array column type, but having to use raw SQL for my query as the Rails finder methods don't play nicely with it. I found a way that works, but I'm wondering what the preferred method is:
The roles column on the Memberships table is my array column. It was added via Rails like so:
add_column :memberships, :roles, :text, array: true
When I examine the table, it shows the type as: text[] (not sure if that is truly how Postgres represents an array column or if that is Rails shenanigans).
To query against it I do something like:
Membership.where("roles @> ?", '{as_champion, whatever}')
From the fine Array Operators manual:
Operator: @>
Description: contains
Example: ARRAY[1,4,3] @> ARRAY[3,1]
Result: t (AKA true)
So @> treats its operand arrays as sets and checks if the right side is a subset of the left side.
IN is a little different and is used with subqueries:
9.22.2. IN
expression IN (subquery)
The right-hand side is a parenthesized subquery, which must return exactly one column. The left-hand expression is evaluated and compared to each row of the subquery result. The result of IN is "true" if any equal subquery row is found. The result is "false" if no equal row is found (including the case where the subquery returns no rows).
or with literal lists:
9.23.1. IN
expression IN (value [, ...])
The right-hand side is a parenthesized list of scalar expressions. The result is "true" if the left-hand expression's result is equal to any of the right-hand expressions. This is a shorthand notation for
expression = value1
OR
expression = value2
OR
...
So a IN b more or less means:
Is the value a equal to any of the values in the list b (which can be a query producing single element rows or a literal list).
Of course, you can say things like:
array[1] in (select some_array from ...)
array[1] in (array[1], array[2,3])
but the arrays in those cases are still treated like single values (that just happen to have some internal structure).
If you want to check if an array contains any of a list of values then @> isn't what you want. Consider this:
array[1,2] @> array[2,4]
4 isn't in array[1,2] so array[2,4] is not a subset of array[1,2].
If you want to check if someone has both roles then:
roles @> array['as_champion', 'whatever']
is the right expression but if you want to check if roles is any of those values then you want the overlaps operator (&&):
roles && array['as_champion', 'whatever']
Note that I'm using the "array constructor" syntax for the arrays everywhere. That's because it is much more convenient when working with a tool (such as ActiveRecord) that knows how to expand an array into a comma-delimited list when replacing a placeholder but doesn't fully understand SQL arrays.
Given all that, we can say things like:
Membership.where('roles @> array[?]', %w[as_champion whatever])
Membership.where('roles @> array[:roles]', :roles => some_ruby_array_of_strings)
and everything will work as expected. You're still working with little SQL snippets (as ActiveRecord doesn't have a full understanding of SQL arrays or any way of representing the @> operator) but at least you won't have to worry about quoting problems. You could probably go through AREL to manually add @> support but I find that AREL quickly devolves into an incomprehensible and unreadable mess for all but the most trivial uses.
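To make the difference between the "contains" check and the "overlaps" check concrete, here is a rough in-memory Ruby analogue. The method names contains_all? and overlaps? are made up for illustration, and this only mimics the set semantics on plain Ruby arrays; it says nothing about how Postgres treats NULL elements or uses indexes.

```ruby
# In-memory analogue of the two array checks (illustration only, not SQL).

# Like the "contains" operator: every wanted value must appear in roles.
def contains_all?(roles, wanted)
  (wanted - roles).empty?
end

# Like the "overlaps" operator (&&): at least one wanted value appears in roles.
def overlaps?(roles, wanted)
  !(roles & wanted).empty?
end

roles = %w[as_champion editor]
wanted = %w[as_champion whatever]

puts contains_all?(roles, wanted) # "whatever" is missing from roles
puts overlaps?(roles, wanted)     # "as_champion" is shared
```

So a membership with only one of the two roles passes the overlap check but not the containment check.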

Mnesia Errors case_clause in QLC query without a case clause

I have the following function for a hacky project:
% The Record variable is some known record with an associated table.
Query = qlc:q([Existing ||
Existing <- mnesia:table(Table),
ExistingFields = record_to_fields(Existing),
RecordFields = record_to_fields(Record),
ExistingFields == RecordFields
]).
The function record_to_fields/1 simply drops the record name and ID from the tuple so that I can compare the fields themselves. If anyone wants context, it's because I pre-generate a unique ID for a record before attempting to insert it into Mnesia, and I want to make sure that a record with identical fields (but different ID) does not exist.
This results in the following (redacted for clarity) stack trace:
{aborted, {{case_clause, {stuff}},
[{db, '-my_func/2-fun-1-',8, ...
Which points to the line where I declare Query, however there is no case clause in sight. What is causing this error?
(Will answer myself, but I appreciate a comment that could explain how I could achieve what I want)
EDIT: this wouldn't be necessary if I could simply mark certain fields as unique, and Mnesia had a dedicated insert/1 or create/1 function.
For your example, I think your solution is clearer anyway (although it seems you can pull the record_to_fields(Record) portion outside the comprehension so it isn't getting calculated over and over.)
Yes, QLC list comprehensions can only have generators and filters. But you can cheat a little by writing an assignment as a one-element generator. For instance, you can rewrite your expression as this:
RecordFields = record_to_fields(Record),
Query = qlc:q([Existing ||
Existing <- mnesia:table(Table),
ExistingFields <- [record_to_fields(Existing)],
ExistingFields == RecordFields
]).
As it turns out, the QLC DSL does not allow assignments, only generators and filters; as per the documentation (emphasis mine):
Syntactically QLCs have the same parts as ordinary list
comprehensions:
[Expression || Qualifier1, Qualifier2, ...]
Expression (the template)
is any Erlang expression. Qualifiers are either filters or generators.
Filters are Erlang expressions returning boolean(). Generators have
the form Pattern <- ListExpression, where ListExpression is an
expression evaluating to a query handle or a list.
Which means we cannot use variable assignments within a QLC query.
Thus my only option, insofar as I know, is to simply write out the query as:
Query = qlc:q([Existing ||
Existing <- mnesia:table(Table),
record_to_fields(Existing) == record_to_fields(Record)
]).

Intersection of 2 mongo queries

I want to emulate an "&" operator for searching elements in my mongo db.
There are 4 searchable fields: name, id, tags and negative_tags.
For a match to be true, any of these could match.
For instance, if I search a&b, "a" could be matched in any of the 4 fields and "b" as well. However, they both need to be matched.
I tried doing the following:
Model.or({:name.all => regexps}, {:id.all =>regexps}, {:tags.all => regexps}, {:negative_tags.all => regexps})
regexps is an array of regexp. For the example given it would be
[ /a/i, /b/i ]
However, this does not behave like I want, because it requires all the matches to happen on the same property.
My other try was to run a separate mongo query for each regexp and take the intersection of the sets.
Model.or({:name.in => one_regexp}, {:id.in => one_regexp}, {:tags.in => one_regexp}, {:negative_tags.in => one_regexp})
My problem is that I am not sure how to merge the two hashes. Mongoid lazily evaluates the queries and returns a Mongoid::Criteria object.
I'd like to know how I can do an intersection.
There are two distinct ways to handle this. Are you trying to have both regular expressions evaluate per field or can a be true for name and b be true for id?
If it is the latter, I would use a gem for this:
gem 'mongoid_search'
If it is the former, I'd simply join the array into a single regex:
one_regexp.collect {|regexp| "(?=.*#{regexp})" }.join
If what you want to do is to apply two regular expressions to each field, simply put both in non-consuming patterns and use one regular expression. This is known as a positive lookahead assertion, (?=...), combined with .* so that the terms can appear in either order.
/(?=.*a)(?=.*b)/
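As a rough sketch of both interpretations in plain Ruby (no Mongoid involved): all_terms_regexp and matches_across_fields? are hypothetical helper names, and the second one checks values already loaded in memory rather than building a database query.

```ruby
# Hypothetical helper: fold several regexps into one pattern that requires
# every term to appear somewhere in the same string, in any order.
# Note: "." does not match newlines here, so this assumes single-line values.
def all_terms_regexp(regexps)
  lookaheads = regexps.map { |re| "(?=.*#{re.source})" }.join
  # Keep case-insensitivity if any of the input patterns asked for it.
  flags = regexps.any?(&:casefold?) ? Regexp::IGNORECASE : 0
  Regexp.new(lookaheads, flags)
end

# Hypothetical in-memory check for the other interpretation: each term may
# match a different field, but every term must match at least one field.
def matches_across_fields?(field_values, regexps)
  regexps.all? do |re|
    field_values.any? { |value| value.to_s.match?(re) }
  end
end

combined = all_terms_regexp([/a/i, /b/i])
puts combined.match?("Banana") # contains both "a" and "b"
puts combined.match?("apple")  # has no "b"
```

The first helper corresponds to the single-regex approach above; the second only shows the "any field per term, all terms required" rule and would still need translating into a query.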

neo4j, how to exclude some path in all shortest path algorithm

I have a sort of city map where nodes are crosses and arcs are streets. I add an attribute "obstacle" at some streets almost randomly. Now I want to find some path from a point to another without having in this path any streets with this attribute. Is it possible?
This is the code I wrote; the problem is in the clause "street.obstacle IS NOT NULL":
MATCH path=allShortestPaths((source:Cross)-[street:Street*]->(destination: Cross))
WHERE source.id="49" AND destination.id="57" AND
street.obstacle IS NOT NULL
return path AS shortestPath,
reduce(LENGTH=0, n IN rels(path)| LENGTH + n.length) AS totalLength
Null doesn't apply to some boolean conditions. Read more in the docs here. The expression (NOT NULL) returns NULL, not false, because null is treated as a third option, neither true nor false. It's the absence of data really.
You might be looking for has(street.obstacle) instead, or possibly the EXISTS() function, depending on what you're trying to express. has() will tell you whether the property exists (regardless of what value it has).

How to check if an element is in a node.collection using Cypher?

I'm just beginning with Neo4j/Cypher. I have some nodes containing a property which is an array of integers. I want to check if a given number is in a node's collection and, if so, append this node to the results. My query looks like this:
MATCH (a) WHERE has(a.user_ids) and (13 IN a.user_ids) RETURN a
where 13 is the given user_id. It throws a syntax error:
Type mismatch: a already defined with conflicting type Node (expected Collection<Any>)
Any idea how I can accomplish that?
Thanks in advance.
You can try the predicate ANY, which returns true if any member of a collection matches some criterion.
MATCH (a) WHERE has(a.user_ids) and ANY(user_id IN a.user_ids WHERE user_id = 13) RETURN a
It looks a bit backwards now that I'm looking at it, but it should work.
Edit:
It was bugging me why your query didn't work and why my answer seemed backwards and indirect so I did a simple test. Basically, your original query works if you put the property reference in parentheses:
MATCH (a)
WHERE has(a.user_ids) and (13 IN (a.user_ids))
RETURN a
That's easier to read so that's what I should have answered. But I still couldn't see why the parentheses were necessary here, when they are not in other cases. They were not necessary inside the ANY() above, and if you 'detach' the collection from the node
MATCH (a)
WITH a.user_ids as user_ids, a
WHERE 13 IN user_ids
RETURN a
there's no problem. For some reason Cypher needs to be told to evaluate a.user_ids before IN, or it ignores user_ids and tries to evaluate 13 IN a. IN is listed as an operator in the documentation, but in this regard it works differently than other operators. For example
MATCH (a) RETURN 13 + a.user_ids
returns fine and
MATCH (a) RETURN 13 * a.user_ids
MATCH (a) RETURN 13 < a.user_ids
fail, but because a.user_ids is a collection, not because a is a node. It's probably not very important, it's easy enough to use parentheses, but it would be interesting to learn why they are necessary.
I also compared my answer to your original query with added parentheses to see if there was any performance drawback to the more indirect way. It turns out the execution plan is almost identical: 13 IN (a.user_ids) is refactored to use ANY() like in my answer.
My answer:
Filter(pred="any(user_id in Product(a,user_ids(6),true) where user_id == Literal(13))", _rows=1, _db_hits=8)
AllNodes(identifier="a", _rows=8, _db_hits=8)
Your query + ():
Filter(pred="any(-_-INNER-_- in Product(n,user_ids(6),true) where Literal(13) == -_-INNER-_-)", _rows=1, _db_hits=8)
AllNodes(identifier="n", _rows=8, _db_hits=8)
Finally, in your case you probably don't have to check for existence of property with has(). Absent properties and null are handled differently in 2.0 and if the property doesn't exist 13 IN (a.user_ids) will evaluate to false, so usually there is no reason to test for property existence before property evaluation for fear of the query breaking. The place to use has() would be when property existence is relevant in itself, and that would probably be a different property than the one evaluated, i.e. WHERE has(a.someProperty) AND 13 IN (a.someOtherProperty).
Since there is no performance difference, the more readable query is better, and since you, as far as I can see, don't really need to test for property existence, I think your query should be
MATCH (a)
WHERE 13 IN (a.user_ids)
RETURN a
