Cypher: which assignment operator - neo4j

I would appreciate some Cypher-specific theory for why there are, effectively, two different assignment operators in the language. I can get things to work, but feel like something is missing...
Let's use Neo4j's movie database with the following query:
match (kr:Person {name:"Keanu Reeves"}), (hw:Person{name:"Hugo Weaving"}), p=shortestPath((kr)-[*]-(hw)) return p
Sure, the query works, but here's the point of my question: 'kr', 'hw' and 'p' are all variables, right? But why is it that the former two are assigned with a colon, but the latter takes an equal sign?
Thanks.

It's important to note that the : used for nodes and relationships really doesn't have anything to do with variable assignment at all; it's instead associated with node labels and relationship types.
A node label and a relationship type always start with a :, even if there isn't a variable present at all. This helps differentiate a node label or relationship type from a variable (a variable will never begin with a :), and the : naturally acts as a divider between the label/type and the variable when both are present. It's also possible to have a variable on a node or relationship but omit the label/type; in that case no : will be present, which again reinforces that the : doesn't have anything to do with assignment.
In the context of a map {} (such as a properties map, including when it's inlined within a match on a node or relationship), the : is used for map key/value pairs; this is common syntax, also used in JSON.
Actual assignment (such as in SET clauses, and in your example of setting the path variable to a pattern within a match) uses =.
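To see the two symbols side by side, here is a minimal sketch against the stock movie graph (the seen property is made up purely for illustration):
// ':' introduces a label (Person, Movie) or a map key (name:); '=' assigns
MATCH p = (kr:Person {name: "Keanu Reeves"})-[:ACTED_IN]->(m:Movie)
SET m.seen = true   // '=' assigns a property value
RETURN p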

I do not think there is a deep theoretical reason for it. The original idea of Cypher was to provide an ASCII art-style language, where the MATCH part of the query resembles a graph pattern that you'd draw on a whiteboard.
In many ways, a graph instance is quite similar to a UML Object Diagram (and other common representations), where you would use name : type to denote an object's variable name and type (class) or just use : type for anonymous instances.
Now paths do not really fit into that picture. On a whiteboard, I'd just put the relevant part in a dashed/circled area and write p or p= next to it. Definitely not p:.
Note that it is possible to rephrase your query to a more compact form:
match p=shortestPath((kr:Person {name:"Keanu Reeves"})-[*]-(hw:Person {name:"Hugo Weaving"}))
return p
Here, using colons everywhere would look out of place; consider: p:shortestPath((kr:Person {name:"Keanu Reeves"})
Remark 1. If you try to use a variable to capture relationships of a variable length pattern, you will get a warning:
Warning. This feature is deprecated and will be removed in future versions.
Binding relationships to a list in a variable length pattern is deprecated. (org.neo4j.graphdb.impl.notification.NotificationDetail$Factory$2#1eb6644d)
MATCH (a)-[rs:REL*]->(b)
^
So you are better off using a path and the relationships function to get the same result:
MATCH p=(a)-[:REL*]->(b)
RETURN relationships(p)
Remark 2. I come from an OO background and have been writing Cypher for a few years, so it might just be me being used to the syntax -- it might seem odd to newcomers, especially those from other fields.
Remark 3. The openCypher project now provides a grammar specification, which gives you an insight into how a MATCH clause is parsed.

Basic error using solver in Z3-Python: it is returning [] as a model

Maybe I slept badly today, but I am really struggling with this simple query to Z3-Python:
from z3 import *
a = Bool('a')
b = Bool('b')
sss = Solver()
sss.add(Exists([a,b], True))
print(sss.check())
print(sss.model())
The check prints out sat, but the model is []. However, it should be printing some (any one) concrete assignment, such as a=True, b=True.
The same is happening if I change the formula to, say: sss.add(Exists([a,b], Not(And(a,b)))). Also tested sss.add(True). Thus, I am missing something really basic, so sorry for the basic nature of this question.
Other code works normally (even with optimizers instead of solvers), so it is not a problem with my environment (Colab).
Any help?
Note that there's a big difference between:
a = Bool('a')
b = Bool('b')
sss.add(Exists([a, b], True))
and
a = Bool('a')
b = Bool('b')
sss.add(True) # redundant, but to illustrate
In the first case, you're checking if the statement Exists a. Exists b. True is satisfiable, which it trivially is; but there is no model to display: the variables a and b are bound by the quantifier, and they don't play any role in model construction.
In the second case, a and b are part of the model, and hence will be displayed.
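A minimal runnable sketch contrasting the two cases (same variable declarations, two solvers):
from z3 import *

a, b = Bools('a b')

s1 = Solver()
s1.add(Exists([a, b], True))   # a and b are bound by the quantifier
print(s1.check(), s1.model())  # sat []

s2 = Solver()
s2.add(Or(a, b))               # a and b are free, top-level variables here
print(s2.check(), s2.model())  # sat, with an assignment such as [a = True]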
What's confusing is why you need to declare a and b in the first case. That is, why can't we just say:
sss.add(Exists([a, b], True))
without any a or b in the environment? After all, they are irrelevant as far as the problem is concerned. This is merely a peculiarity of the Python API; there's really no good reason other than this is how it is implemented.
You can see the generated SMTLib by adding a statement of the form:
print(sss.sexpr())
and if you do that for the above segments, you'll see the first one doesn't even declare the variables at all.
So, long story short, the formula exists a, b. True has no "model" variables and thus there's nothing to display. The only reason you declare them in z3py is because of an implementation trick that they use (so they can figure out the type of the variable), nothing more than that.
Here're some specific comments about your questions:
An SMT solver will only construct model-values for top-level declared variables. If you create "local" variables via exists/forall, they are not going to be displayed in models constructed. Note that you could have two totally separate assertions that talk about the "same" existentially quantified variable: the solver wouldn't even have a way of referring to that item. Rule of thumb: if you want to see the value in a model, it has to be declared at the top level.
Yes, this is the reason for the trick. So they can figure out what type those variables are. (And other bookkeeping.) There's just no syntax afforded by z3py to let you say something like Exists([Int(a), Int(b)], True) or some such. Note that this doesn't mean something like this cannot be implemented. They just didn't. (And is probably not worth it.)
No, you understood correctly. The reason you get [] is that there are absolutely no constraints on those a and b, and z3's model constructor will not assign any values because they are irrelevant. You can recover their values via the model_completion parameter. But the best way to experiment with this is to add some extra constraints at the top level. (It might be easier to play around if you make a and b Ints. At the top level, assert that a = 5, b = 12. At the "existential" level assert something else. You'll see that your model will only satisfy the top-level constraints. Interestingly, if you assert something that's unsatisfiable in your existential query, the whole thing will become unsat, which is another sign of how they are treated.) A minimal sketch of this appears after these remarks.
I said trivially true because your formula is Exists a. Exists b. True. There're no constraints, so any assignment to a and b satisfies it. It's trivial in this sense. (And all SMTLib logics work over non-empty domains, so you can always assign values freely.)
Quantified variables can always be alpha-renamed without changing the semantics. So, whenever there's collision between quantified names and top-level names, imagine renaming them to be unique. I think you'll be able to answer your own question if you think of it this way.
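To make the third remark concrete, here is a small sketch (the constraint a == 5 is arbitrary) showing an unconstrained top-level variable and the model_completion parameter:
from z3 import *

a, b = Ints('a b')
s = Solver()
s.add(a == 5)                            # only a is constrained at the top level
print(s.check())                         # sat
m = s.model()
print(m)                                 # [a = 5]; b is irrelevant, so it gets no entry
print(m.eval(b, model_completion=True))  # forces a concrete value for b, e.g. 0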
Extended discussion over comments is really not productive. Feel free to ask new questions if anything isn't clear.

Extracting from .bib file with Raku (previously aka Perl 6)

I have this .bib file for reference management while writing my thesis in LaTeX:
@article{garg2017patch,
title={Patch testing in patients with suspected cosmetic dermatitis: A retrospective study},
author={Garg, Taru and Agarwal, Soumya and Chander, Ram and Singh, Aashim and Yadav, Pravesh},
journal={Journal of Cosmetic Dermatology},
year={2017},
publisher={Wiley Online Library}
}
@article{hauso2008neuroendocrine,
title={Neuroendocrine tumor epidemiology},
author={Hauso, Oyvind and Gustafsson, Bjorn I and Kidd, Mark and Waldum, Helge L and Drozdov, Ignat and Chan, Anthony KC and Modlin, Irvin M},
journal={Cancer},
volume={113},
number={10},
pages={2655--2664},
year={2008},
publisher={Wiley Online Library}
}
@article{siperstein1997laparoscopic,
title={Laparoscopic thermal ablation of hepatic neuroendocrine tumor metastases},
author={Siperstein, Allan E and Rogers, Stanley J and Hansen, Paul D and Gitomirsky, Alexis},
journal={Surgery},
volume={122},
number={6},
pages={1147--1155},
year={1997},
publisher={Elsevier}
}
If anyone wants to know what a .bib file is, you can find it detailed here.
I'd like to parse this with Perl 6 to extract the key along with the title like this:
garg2017patch: Patch testing in patients with suspected cosmetic dermatitis: A retrospective study
hauso2008neuroendocrine: Neuroendocrine tumor epidemiology
siperstein1997laparoscopic: Laparoscopic thermal ablation of hepatic neuroendocrine tumor metastases
Can you please help me to do this, maybe in two ways:
Using basic Perl 6
Using a Perl 6 Grammar
TL;DR
A complete and detailed answer that does just exactly what @Suman asks.
An introductory general answer to "I want to parse X. Can anyone help?"
A one-liner in a shell
I'll start with terse code that's perfect for some scenarios[1], and which someone might write if they're familiar with shell and Raku basics and in a hurry:
> raku -e 'for slurp() ~~ m:g / "@article\{" (<-[,]>+) \, \s+
"title=\{" (<-[}]>+) \} / -> $/ { put "$0: $1\n" }' < derm.bib
This produces precisely the output you specified:
garg2017patch: Patch testing in patients with suspected cosmetic dermatitis: A retrospective study
hauso2008neuroendocrine: Neuroendocrine tumor epidemiology
siperstein1997laparoscopic: Laparoscopic thermal ablation of hepatic neuroendocrine tumor metastases
Same single statement, but in a script
Skipping shell escapes and adding:
Whitespace.
Comments.
► use tio.run to run the code below
for slurp() # "slurp" (read all of) stdin and then
~~ m :global # match it "globally" (all matches) against
/ '@article{' (<-[,]>+) ',' \s+ # a "nextgen regex" that uses (`(...)`) to
'title={' (<-[}]>+) '}' / # capture the article id and title and then
-> $/ { put "$0: $1\n" } # for each article, print "article id: title".
Don't worry if the above still seems like pure gobbledygook. Later sections explain the above while also introducing code that's more general, clean, and readable.[2]
Four statements instead of one
my \input = slurp;
my \pattern = rule { '@article{' ( <-[,]>+ ) ','
'title={' ( <-[}]>+ ) }
my \articles = input .match: pattern, :global;
for articles -> $/ { put "$0: $1\n" }
my declares a lexical variable. Raku supports sigils at the start of variable names. But it also allows devs to "slash them out" as I have done.
my \pattern ...
my \pattern = rule { '@article{' ( <-[,]>+ ) ','
'title={' ( <-[}]>+ ) }
I've switched the pattern syntax from / ... / in the original one-liner to rule { ... }. I did this to:
Eliminate the risk of pathological backtracking
Classic regexes risk pathological backtracking. That's fine if you can just kill a program that's gone wild, but click the link to read how bad it can get! 🤪 We don't need backtracking to match the .bib format.
Communicate that the pattern is a rule
If you write a good deal of pattern matching code, you'll frequently want to use rule { ... }. A rule eliminates any risk of the classic regex problem just described (pathological backtracking), and has another superpower. I'll cover both aspects below, after first introducing the adverbs corresponding to those superpowers.
Raku regexes/rules can be (often are) used with "adverbs". These are convenient shortcuts that modify how patterns are applied.
I've already used an adverb in the earlier versions of this code. The "global" adverb (specified using :global or its shorthand alias :g) directs the matching engine to consume all of the input, generating a list of as many matches as it contains, instead of returning just the first match.
While there are shorthand aliases for adverbs, some are used so repeatedly that it's a lot tidier to bundle them up into distinct rule declarators. That's why I've used rule. It bundles up two adverbs appropriate for matching many data formats like .bib files:
:ratchet (alias :r)
:sigspace (alias :s)
Ratcheting (:r / :ratchet) tells the compiler that when an "atom" (a sub-pattern in a rule that is treated as one unit) has matched, there can be no going back on that. If an atom further on in the pattern in the same rule fails, then the whole rule immediately fails.
This eliminates any risk of the "pathological backtracking" discussed earlier.
Significant space handling (:s / :sigspace) tells the compiler that literal spacing following an atom in the pattern indicates that a "token boundary" pattern, aka ws, should be appended to the atom.
Thus this adverb deals with tokenizing. Did you spot that I'd dropped the \s+ from the pattern compared to the original one in the one-liner? That's because :sigspace, which use of rule implies, takes care of that automatically:
say 'x#y x # y' ~~ m:g:s /x\#y/; # (「x#y」) <-- only one match
say 'x#y x # y' ~~ m:g /x \# y/; # (「x#y」) <-- only one match
say 'x#y x # y' ~~ m:g:s /x \# y/; # (「x#y」 「x # y」) <-- two matches
You might wonder why I've reverted to using / ... / to show these two examples. Turns out that while you can use rule { ... } with the .match method (described in the next section), you can't use rule with m. No problem; I just used :s instead to get the desired effect. (I didn't bother to use :r for ratcheting because it makes no difference for this pattern/input.)
To round out this dive into the difference between classic regexes (which can also be written regex { ... }) and rule rules, let me mention the other main option: token. The token declarator implies the :ratchet adverb, but not the :sigspace one. So it also eliminates the pathological backtracking risk of a regex (or / ... /) but, just like a regex, and unlike a rule, a token ignores whitespace used by a dev in writing out the rule's pattern.
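Here's a tiny sketch of that difference (t and r are made-up names; the leading dot in the calls merely suppresses capture noise in the output):
my token t { 'a' 'b' }     # whitespace in the pattern is ignored
my rule  r { 'a' 'b' }     # :sigspace: spacing implies a <.ws> boundary
say 'ab'  ~~ / <.t> /;     # 「ab」
say 'a b' ~~ / <.r> /;     # 「a b」
say 'ab'  ~~ / <.r> /;     # Nil -- default ws won't match between two word chars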
my \articles = input .match: pattern, :global
This line uses the method form (.match) of the m routine used in the one-liner solution.
The result of a match when :global is used is a list of Match objects rather than just one. In this case we'll get three, corresponding to the three articles in the input file.
for articles -> $/ { put "$0: $1\n" }
This for statement successively binds a Match object corresponding to each of the three articles in your sample file to the symbol $/ inside the code block ({ ... }).
Per the Raku doc on $/, "$/ is the match variable, so it usually contains objects of type Match." It also provides some other conveniences; we take advantage of one of these conveniences related to numbered captures:
The pattern that was matched earlier contained two pairs of parentheses;
The overall Match object ($/) provides access to these two Positional captures via Positional subscripting (postfix []), so within the for's block, $/[0] and $/[1] provide access to the two Positional captures for each article;
Raku aliases $0 to $/[0] (and so on) for convenience, so most devs use the shorter syntax. (A tiny demo follows this list.)
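For example, reusing the article id pattern from earlier:
'garg2017patch,rest' ~~ / (<-[,]>+) ',' /;
say $/[0];   # 「garg2017patch」
say $0;      # 「garg2017patch」 -- same capture, shorter syntax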
Interlude
This would be a good time to take a break. Maybe just a cuppa, or return here another day.
The last part of this answer builds up and thoroughly explains a grammar-based approach. Reading it may provide further insight into the solutions above and will show how to extend Raku's parsing to more complex scenarios.
But first...
A "boring" practical approach
I want to parse this with Raku. Can anyone help?
Raku may make writing parsers less tedious than with other tools. But less tedious is still tedious. And Raku parsing is currently slow.
In most cases, the practical answer when you want to parse well known formats and/or really big files is to find and use an existing parser. This might mean not using Raku at all, or using an existing Raku module, or using an existing non-Raku parser in Raku.
A suggested starting point is to search for the file format on modules.raku.org or raku.land. Look for a publicly shared parsing module already specifically packaged for Raku for the given file format. Then do some simple testing to see if you have a good solution.
At the time of writing there are no matches for 'bib'.
Even if you don't know C, there's almost certainly a 'bib' parsing C library already available that you can use. And it's likely to be the fastest solution. It's typically surprisingly easy to use an external library in your own Raku code, even if it's written in another programming language.
Using C libs is done using a feature called NativeCall. The doc I just linked may well be too much or too little, but please feel free to visit the freenode IRC channel #raku and ask for help. (Or post an SO question.) We're friendly folk. :)
If a C lib isn't right for a particular use case, then you can probably still use packages written in some other language such as Perl, Python, Ruby, Lua, etc. via their respective Inline::* language adapters.
The steps are:
Install a package (that's written in Perl, Python or whatever);
Make sure it runs on your system using a compiler of the language it's written for;
Install the appropriate Inline language adapter that lets Raku run packages in that other language;
Use the "foreign" package as if it were a Raku package containing exported Raku functions, classes, objects, values, etc.
(At least, that's the theory. Again, if you need help, please pop on the IRC channel or post an SO question.)
The Perl adapter is the most mature so I'll use that as an example. Let's say you use Perl's Text::BibTeX packages and now wish to use Raku with that package. First, set it up as it's supposed to be per its README. Then, in Raku, write something like:
use Text::BibTeX::BibFormat:from<Perl5>;
...
@blocks = $entry.format;
Explanation of these two lines:
The first line is how you tell Raku that you wish to load a Perl module.
(It won't work unless Inline::Perl5 is already installed and working. But it should be if you're using a popular Raku bundle. And if not, you should at least have the module installer zef so you can run zef install Inline::Perl5.)
The last line is just a mechanical Raku translation of the @blocks = $entry->format; line from the SYNOPSIS of the Perl package Text::BibTeX::BibFormat.
A Raku grammar / parser
OK. Enough "boring" practical advice. Let's now try to have some fun creating a grammar-based Raku parser good enough for the example from your question.
► use glot.io to run the code below
unit grammar bib;
rule TOP { <article>* }
rule article { '@article{' $<id>=<-[,]>+ ','
<kv-pairs>
'}'
}
rule kv-pairs { <kv-pair>* % ',' }
rule kv-pair { $<key>=\w* '={' ~ '}' $<value>=<-[}]>* }
With this grammar in place, we can now write something like:
die "Use CommaIDE?" unless bib .parsefile: 'derm.bib';
for $<article> -> $/ { put "$<id>: $<kv-pairs><kv-pair>[0]<value>\n" }
to generate exactly the same output as the previous solutions.
When a match or parse fails, by default Raku just returns Nil, which is, well, rather terse feedback.
There are several nice debugging options to figure out what's going on with a regex or grammar, but the best option by far is to use CommaIDE's Grammar-Live-View.
If you haven't already installed and used Comma, you're missing one of the best parts of using Raku. The features built in to the free version of Comma ("Community Edition") include outstanding grammar development / tracing / debugging tools.
Explanation of the 'bib' grammar
unit grammar bib;
The unit declarator is used at the start of a source file to tell Raku that the rest of the file declares a named package of code of a particular type.
The grammar keyword specifies a grammar. A grammar is like a class, but contains named "rules" -- not just named methods, but also named regexes, tokens, and rules. A grammar also inherits a bunch of general purpose rules from a base grammar.
rule TOP {
Unless you specify otherwise, parsing methods (.parse and .parsefile) that are called on a grammar start by calling the grammar's rule named TOP (declared with a rule, token, regex, or method declarator).
As a, er, rule of thumb, if you don't know if you should be using a rule, regex, token, or method for some bit of parsing, use a token. (Unlike regex patterns, tokens don't risk pathological backtracking.)
But I've used a rule. Like token patterns, rules also avoid the pathological backtracking risk. But, in addition, rules interpret some whitespace in the pattern to be significant, in a natural manner. (See this SO answer for precise details.)
rules are typically appropriate towards the top of the parse tree. (Tokens are typically appropriate towards the leaves.)
rule TOP { <article>* }
The space at the end of the rule (between the * and pattern closing }) means the grammar will match any amount of whitespace at the end of the input.
<article> invokes another named rule in this grammar.
Because it looks like one should allow for any number of articles per bib file, I appended a * (zero-or-more quantifier), yielding <article>*.
rule article { '@article{' $<id>=<-[,]>+ ','
<kv-pairs>
'}'
}
If you compare this article pattern with the ones I wrote for the earlier Raku rules based solutions, you'll see various changes:
Rule in original one-liner               | Rule in this grammar
Kept pattern as simple as possible.      | Introduced <kv-pairs> and closing }
No attempt to echo layout of your input. | Visually echoes your input.
<[...]> is the Raku syntax for a character class, like [...] in traditional regex syntax. It's more powerful, but for now all you need to know is that the - in <-[,]> indicates negation, i.e. the same as the ^ in the [^,] syntax of ye olde regex. So <-[,]>+ attempts a match of one or more characters, none of which are ,.
$<id>=<-[,]>+ tells Raku to attempt to match the quantified "atom" on the right of the = (i.e. the <-[,]>+ bit) and store the results at the key <id> within the current match object. The latter will be hung from a branch of the parse tree; we'll get to precisely where later.
rule kv-pairs { <kv-pair>* % ',' }
This pattern illustrates one of several convenient Raku regex features. It declares you want to match zero or more kv-pairs separated by commas.
(In more detail, the % regex infix operator requires that matches of the quantified atom on its left are separated by the atom on its right.)
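A standalone sketch of % in action (not part of the grammar itself):
say 'ab,cd,ef' ~~ / [\w+]+ % ',' /;   # 「ab,cd,ef」
say 'ab,cd,'   ~~ / [\w+]+ % ',' /;   # 「ab,cd」 -- a trailing separator isn't consumed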
rule kv-pair { $<key>=\w* '={' ~ '}' $<value>=<-[}]>* }
The new bit here is '={' ~ '}'. This is another convenient regex feature. The regex Tilde operator parses a delimited structure (in this case one with a ={ opener and } closer) with the bit between the delimiters matching the quantified regex atom on the right of the closer. This confers several benefits but the main one is that error messages can be clearer.
I could have used the ~ approach in the /.../ regex in the one-liner, and vice-versa. But I wanted this grammar solution to continue the progression toward illustrating "better practice" idioms.
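A standalone sketch of ~, including the clearer failure it gives you (the input strings are made up):
say '={red}' ~~ / '={' ~ '}' <-[}]>* /;   # 「={red}」
try '={red'  ~~ / '={' ~ '}' <-[}]>* /;   # dies: the error points at the missing '}'
say $!.message if $!;                     # show that clearer error message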
Constructing / deconstructing the parse tree
for $<article> -> $/ { put "$<id>: $<kv-pairs><kv-pair>[0]<value>\n" }
$<article>, $<id> etc. refer to named match objects that are stored somewhere in the "parse tree". But how did they get there? And exactly where is "there"?
Returning to the top of the grammar:
rule TOP {
If a .parse is successful, a single 'TOP' level match object is returned. (After a parse is complete the variable $/ is also bound to that top match object.) During parsing a tree will have been formed by hanging other match objects off this top match object, and then others hung off those, and so on.
Addition of match objects to a parse tree is done by adding either a single generated match object, or a list of them, to either a Positional (numbered) or Associative (named) capture of a "parent" match object. This process is explained below.
rule TOP { <article>* }
<article> invokes a match of the rule named article. An invocation of the rule <article> has two effects:
Raku tries to match the rule.
If it matches, Raku captures that match by generating a corresponding match object and adding it to the parse tree under the key <article> of the parent match object. (In this case the parent is the top match object.)
If the successfully matched pattern had been specified as just <article>, rather than as <article>*, then only one match would have been attempted, and only one value, a single match object, would have been generated and added under the key <article>.
But the pattern was <article>*, not merely <article>. So Raku attempts to match the article rule as many times as it can. If it matches at all, then a list of one or more match objects is stored as the value of the <article> key. (See my answer to "How do I access the captures within a match?" for a more detailed explanation.)
$<article> is short for $/<article>. It refers to the value stored under the <article> key of the current match object (which is stored in $/). In this case that value is a list of 3 match objects corresponding to the 3 articles in the input.
rule article { '@article{' $<id>=<-[,]>+ ','
Just as the top match object has several match objects hung off of it (the three captures of article matches that are stored under the top match object's <article> key), so too do each of those three article match objects have their own "child" match objects hanging off of them.
To see how that works, let's consider just the first of the three article match objects, the one corresponding to the text that starts "@article{garg2017patch,...". The article rule matches this article. As it's doing that matching, the $<id>=<-[,]>+ part tells Raku to store the match object corresponding to the id part of the article ("garg2017patch") under that article match object's <id> key.
Hopefully this is enough (quite possibly way too much!) and I can at last exhaustively (exhaustingly?) explain the last line of code, which, once again, was:
for $<article> -> $/ { put "$<id>: $<kv-pairs><kv-pair>[0]<value>\n" }
At the level of the for, the variable $/ refers to the top of the parse tree generated by the parse that just completed. Thus $<article>, which is shorthand for $/<article>, refers to the list of three article match objects.
The for then iterates over that list, binding $/ within the lexical scope of the -> $/ { ... } block to each of those 3 article match objects in turn.
The $<id> bit is shorthand for $/<id>, which inside the block refers to the <id> key within the article match object that $/ has been bound to. In other words, $<id> inside the block is equivalent to $<article><id> outside the block.
The $<kv-pairs><kv-pair>[0]<value> follows the same scheme, albeit with more levels and a positional child (the [0]) in the midst of all the key (named/associative) children.
(Note that there was no need for the article pattern to include a $<kv-pairs>=<kv-pairs> because Raku just presumes a pattern of the form <foo> should store its results under the key <foo>. If you wish to disable that, write a pattern with a non-alpha character as the first symbol. For example, use <.foo> if you want to have exactly the same matching effect as <foo> but just not store the matched input in the parse tree.)
Phew!
When the automatically generated parse tree isn't what you want
As if all the above were not enough, I need to mention one more thing.
The parse tree strongly reflects the tree structure of the grammar's rules calling one another from the top rule down to leaf rules. But the resulting structure is sometimes inconvenient.
Often one still wants a tree, but a simpler one, or perhaps some non-tree data structure.
The primary mechanism for generating exactly what you want from a parse, when the automatic results aren't suitable, is make. (This can be used in code blocks inside rules or factored out into Action classes that are separate from grammars.)
In turn, the primary use case for make is to generate a sparse tree of nodes hanging off the parse tree, such as an AST.
Footnotes
[1] Basic Raku is good for exploratory programming, spikes, one-offs, PoCs and other scenarios where the emphasis is on quickly producing working code that can be refactored later if need be.
[2] Raku's regexes/rules scale up to arbitrary parsing, as introduced in the latter half of this answer. This contrasts with past generations of regex which could not.[3]
[3] That said, ZA̡͊͠͝LGΌ ISͮ̂҉̯͈͕̹̘̱ TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ remains a great and relevant read. Not because Raku rules can't parse (X)HTML. In principle they can. But for a task as monumental as correctly handling full arbitrary in-the-wild XHTML I would strongly recommend you use an existing parser written expressly for that purpose. And this applies generally for existing formats; it's best not to reinvent the wheel. But the good news with Raku rules is that if you need to write a full parser, not just a bunch of regexes, you can do so, and it need not involve going insane!

DLV Rule is not safe

I am starting to work with DLV (Disjunctive Datalog) and I have a rule that is reporting a "Rule is not safe" error, when running the code. The rule is the following:
foo(R, 1) :- not foo(R, _)
I have read the manual and seen that "cyclic dependencies are disallowed". I guess this is why I am being reported the error, but I am not sure how this statement is so problematic to DLV. The final goal is to have some kind of initialization in case that the predicate has not been defined.
More precisely, if there is no occurrence of 'foo' with the parameter R (and anything else), then define it with parameters R and 1. Once it is defined, the rule shouldn't be triggered again. So, this is not a real recursion in my opinion.
Any comments on how to solve this issue are welcomed!
I have realised that I probably need another predicate to match the parameter R in the body of the rule. Something like this:
foo(R, 1) :- not foo(R, _), bar(R)
Since otherwise there would be no way to know whether there are no occurrences of foo(R, _). I don't know whether I made myself clear.
Anyway, this doesn't work either :(
To the particular "Rule is not safe" error: first of all, this has nothing to do with cyclic or acyclic dependencies. The same error message shows up for the non-cyclic program:
foo2(R, 1) :- not foo(R,_), bar(R).
The problem is that the program is actually not safe (http://www.dlvsystem.com/html/DLV_User_Manual.html#SAFETY). As mentioned in the section on negative rules (anchor #AEN375, I am only allowed to use 2 links in my answer):
Variables, which occur in a negated literal, must also occur in a positive literal in the body.
Observe that the _ is an anonymous variable. I.e., the program
foo(R,1) :- not foo(R,_), bar(R).
can be equivalently written as (and is equivalent to)
foo(R,1) :- not foo(R,X), bar(R).
Anonymous variables (DLV manual, anchor #AEN264 - at the end of the section) just allow us to avoid inventing names for variables that will only occur once within the rule (i.e. for variables that only express "there is some value, I absolutely do not care about it"), but they are variables nevertheless. And since negation with not is negation as failure and not "true negation" (or "strong negation" as it is also often called), none of the three safety conditions is satisfied by the rule.
A very rough and high-level intuition for safety is that it guarantees that every variable in the program can be assigned to some finite domain - as is now the case with R by adding bar(R). However, the same must also hold for the anonymous variable _.
To the actual problem of defining default values:
As pointed out by lambda.xy.x, the problem here is the Answer Set (or stable model) semantics of DLV: Trying to do it in one rule does not give any solution:
In order to get a safe program, we could replace the above problems e.g. by
foo(1,2). bar(1). bar(2).
tmp(R) :- foo(R,_).
foo(R,1) :- not tmp(R), bar(R).
This has no stable model:
Assume the answer is, as intended,
{foo(1,2), bar(1), bar(2), foo(2,1)}
However, this is not a valid model, since tmp(R) :- foo(R,_) would require it to contain tmp(2). But then, "not tmp(2)" is no longer true, and therefore having foo(2,1) in the model violates the required minimality of the model. (This is not exactly what is going on, more a rough intuition. More technical details could be found in any article on answer set programming, a quick Google search gave me this paper as one of the first results: http://www.kr.tuwien.ac.at/staff/tkren/pub/2009/rw2009-asp.pdf)
In order to solve the problem, it is therefore somehow necessary to "break the cycle". One possibility would be:
foo(1,2). bar(1). bar(2). bar(3).
tmp(R) :- foo(R,X), X!=1.
foo(R,1) :- bar(R), not tmp(R).
I.e., by explicitly stating that we want to add R into the intermediate atom only if the value is different from 1, having foo(2,1) in the model does not contradict tmp(2) not being part of the model as well. Of course, this no longer allows one to distinguish whether foo(R,1) is there as a default value or by input, but if this is not required ...
Another possibility would be to not use foo for the computation, but some foo1 instead. I.e. having
foo1(R,X) :- foo(R,X).
tmp(R) :- foo(R,_).
foo1(R,1) :- bar(R), not tmp(R).
and then just use foo1 instead of foo.

part after '_' in data returned is PascalCase even though I set camelCase as the default naming convention

I have this line of code before I create my entity manager:
breeze.NamingConvention.camelCase.setAsDefault();
and this query:
var query = entityQuery.from('vehicle')
.expand("engine, driveType")
.select("engine.friendlyEngineName, driveType.friendlyDrivetrain")
The data comes back like so:
driveType_FriendlyDriveTrain = "Rear Wheel"
engine_FriendlyEngineName = "6.2L V8"
Why are "FriendlyDriveTrain" and "FriendlyEngineName" PascalCase? This seems clearly wrong since I set camelCase before I created the EntityManager and the query. How do I make it so that the parts after the '_' are camelCase as well?
note: I made sure to remove any web api json formatting configurations so that breeze is the only one managing the translation.
edit: The weird thing is that this same query returns properties that are camelCase after the '_' when it hits the cache. So same query, two different results.
Please review the documentation on NamingConvention. If that doesn't clear it up, you can come back and ask a more specific question. Thanks.
Edit 2 Sept 2014 (morning)
Please also explain what you think Breeze should be doing with the "_" character. The Breeze convention treats it like any other identifier character (e.g., a-z,A-Z,0-9) and does nothing with it. It seems you want your convention to skip over leading underscores and lower the case of the first char thereafter if it is alpha. I think you also want to drop the underscores ... but I'm not sure.
That's cool. You can write your own NamingConvention that does what you want. And you can make it the default too. In fact, the noUnderscoreConvention shown on the NamingConvention documentation page is more than halfway home for you.
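For illustration only, here's a rough sketch of such a convention using the documented serverPropertyNameToClient / clientPropertyNameToServer hooks; it camel-cases every '_'-separated segment, and assumes your server names are PascalCase per segment on the way back:
var camelCaseSegments = new breeze.NamingConvention({
    name: "camelCaseSegments",
    serverPropertyNameToClient: function (serverPropertyName) {
        // lower-case the first char of each '_'-separated segment
        return serverPropertyName.split("_").map(function (seg) {
            return seg.charAt(0).toLowerCase() + seg.slice(1);
        }).join("_");
    },
    clientPropertyNameToServer: function (clientPropertyName) {
        // reverse mapping: upper-case the first char of each segment
        return clientPropertyName.split("_").map(function (seg) {
            return seg.charAt(0).toUpperCase() + seg.slice(1);
        }).join("_");
    }
});
camelCaseSegments.setAsDefault();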
I will say you threw me for a loop with your mix of periods and underscores in the select statement and returned data. I have no idea how you got the select to work or got that kind of result.
Edit 2 Sept 2014 (afternoon)
Ok ... I overlooked the fact that you are doing a projection!
The projection flattens dotted property path values (whether ComplexType or EntityType navigations). It creates projected property names for these "flattened properties" by using "_" as the separator character. Yes, it applies the applicable MetadataStore's naming convention to those "_" separated names and, yes, the Breeze camelCase convention doesn't do anything special with the underscore. So you get back flattened projection property names such as "engine_FriendlyEngineName".
You can call this a bug if you like. I don't. I don't because (a) it seems to me a matter of opinion as to what a camelCase convention should do in this situation; (b) it's been this way forever; and (c) I don't think it is important enough to risk breaking existing applications that might rely on the current behavior ... not when your remedy is so simple.
And that remedy is to write your own camelCase convention that handles these projected property names differently.
Please include it here when you write it so that others who share your perspective may benefit from it.
I'm not sure what you mean when you write: "this same query returns properties that are camelCase after the '_' when it hits the cache". What query is that? Not the one you wrote here.
Projection results don't "hit the cache" ... unless the projected property value is itself an entity ... and that is NOT the case here. What do you mean by "this same query"? I await your response.

Duh? help with f# option types

I am having a brain freeze on F#'s option types. I have 3 books and have read all I can, but I am not getting them.
Does someone have a clear and concise explanation and maybe a real world example?
TIA
Gary
Brian's answer has been rated as the best explanation of option types, so you should probably read it :-). I'll try to write a more concise explanation using a simple F# example...
Let's say you have a database of products and you want a function that searches the database and returns product with a specified name. What should the function do when there is no such product? When using null, the code could look like this:
Product p = GetProduct(name);
if (p != null)
Console.WriteLine(p.Description);
A problem with this approach is that you are not forced to perform the check, so you can easily write code that will throw an unexpected exception when product is not found:
Product p = GetProduct(name);
Console.WriteLine(p.Description);
When using option type, you're making the possibility of missing value explicit. Types defined in F# cannot have a null value and when you want to write a function that may or may not return value, you cannot return Product - instead you need to return option<Product>, so the above code would look like this (I added type annotations, so that you can see types):
let (p:option<Product>) = GetProduct(name)
match p with
| Some prod -> Console.WriteLine(prod.Description)
| None -> () // No product found
You cannot directly access the Description property, because the result of the search is not Product. To get the actual Product value, you need to use pattern matching, which forces you to handle the case when a value is missing.
Summary. To summarize, the purpose of option type is to make the aspect of "missing value" explicit in the type and to force you to check whether a value is available each time you work with values that may possibly be missing.
See http://msdn.microsoft.com/en-us/library/dd233245.aspx
The intuition behind the option type is that it "implements" a null-value. But in contrast to null, you have to explicitly require that a value can be null, whereas in most other languages, references can be null by default. There is a similarity to SQL's NULL/NOT NULL if you are familiar with those.
Why is this clever? It is clever because the language can assume that no output of any expression can ever be null. Hence, it can eliminate all null-pointer checks from the code, yielding a lot of extra speed. Furthermore, it frees the programmer from having to check for the null case everywhere, should he or she want to produce safe code.
For the few cases where a program does require a null value, the option type exists. As an example, consider a function which asks for a key inside an .ini file. The key returned is an integer, but the .ini file might not contain the key. In this case, it does make sense to return 'null' if the key is not to be found. None of the integer values are useful as a sentinel - the user might have entered exactly this integer value in the file. Hence, we need to 'lift' the domain of integers and give it a new value representing "no information", i.e., the null. So we wrap the 'int' into an 'int option'. Now, if there is no integer value we will get 'None' and if there is an integer value, we will get 'Some(N)' where N is the integer value in question.
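A minimal sketch of that .ini example (the Map stands in for the parsed file; all names are made up):
// Look up an integer key; the option type encodes "key may be absent"
let tryGetKey (key: string) (ini: Map<string, int>) : int option =
    ini.TryFind key

let ini = Map.ofList [ "timeout", 30 ]

match tryGetKey "timeout" ini with
| Some n -> printfn "timeout = %d" n   // prints: timeout = 30
| None   -> printfn "key not found"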
There are two beautiful consequences of the choice. One, we can use the general pattern match features of F# to discriminate the values in e.g., a case expression. Two, the framework of algebraic datatypes used to define the option type is exposed to the programmer. That is, if there were no option type in F# we could have created it ourselves!
