Can an array of atomic values be a node name in an XPath query? - saxon

For the query "PROJECT[1]/PROPOSAL[1]/SOLUTION[1]/UNIT[1]/distinct-values(LANDING_DOOR_FRAME_FINISH_FRONT/LANDING_DOOR_FRAME_FINISH_FRONT_VALUE)" this appears to work if distinct-values() returns exactly one value, but throws an exception otherwise. (By the way, this query is not my idea.)
Is it a bad idea to have an atomic value as a node name in a query? Or is it ok? And if ok, is it ok only if it returns exactly one value?
Calling Saxon from Java for this.

It's a perfectly valid query, whether or not distinct-values() returns exactly one value.
(If it fails, show us a repro: all the data we need to reproduce the problem, plus the error message).
But your question, about using atomic values as node names, suggests that you don't understand what the expression means. The values returned by distinct-values() don't have to be node names, and they are not used as node names.
These days I prefer to use the "!" operator when the RHS expression returns atomic values rather than nodes. It's equivalent, but clearer.
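To make the meaning concrete, here is a rough Python sketch of how such a path is evaluated: the steps on the left select a context node, and distinct-values() is then evaluated relative to that node, producing a sequence of zero or more atomic values. The sample XML data and the helper name distinct_finishes are invented for illustration, and the code only mimics the semantics with the standard library; it is not how Saxon works internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: element names come from the query, the data is invented.
SAMPLE = """
<PROJECT><PROPOSAL><SOLUTION><UNIT>
  <LANDING_DOOR_FRAME_FINISH_FRONT>
    <LANDING_DOOR_FRAME_FINISH_FRONT_VALUE>steel</LANDING_DOOR_FRAME_FINISH_FRONT_VALUE>
    <LANDING_DOOR_FRAME_FINISH_FRONT_VALUE>brass</LANDING_DOOR_FRAME_FINISH_FRONT_VALUE>
    <LANDING_DOOR_FRAME_FINISH_FRONT_VALUE>steel</LANDING_DOOR_FRAME_FINISH_FRONT_VALUE>
  </LANDING_DOOR_FRAME_FINISH_FRONT>
</UNIT></SOLUTION></PROPOSAL></PROJECT>
"""


def distinct_finishes(project):
    """Mimic PROPOSAL[1]/SOLUTION[1]/UNIT[1]/distinct-values(...) from the PROJECT element."""
    # Left-hand steps: select the context node (the first UNIT).
    unit = project.find("PROPOSAL[1]/SOLUTION[1]/UNIT[1]")
    if unit is None:
        return []  # no context node, so the result is an empty sequence
    # Right-hand step: collect the string values of the selected elements...
    values = [v.text for v in unit.findall(
        "LANDING_DOOR_FRAME_FINISH_FRONT/LANDING_DOOR_FRAME_FINISH_FRONT_VALUE")]
    # ...and drop duplicates. The values are data, never element names.
    return list(dict.fromkeys(values))


result = distinct_finishes(ET.fromstring(SAMPLE.strip()))
print(result)
```

The result here has two items, which is just as legitimate as one; the caller only has to be prepared to receive a sequence rather than a single value.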


How can I indicate an error during a parse operation?

Within the scripting language I am implementing, valid IDs can consist of a sequence of numbers, which means I have an ambiguous situation where "345" could be an integer, or could be an ID, and that's not known until runtime. Up until now, I've been handling every case as an ID and planning to handle the check for whether a variable has been declared under that name at runtime, but when I was improving my implementation of a particular bit of code, I found that there was a situation where an integer is valid, but any other sort of ID would not be. It seems like it would make sense to handle this particular case as a parsing error so that, e.g., the following bit of code that activates all picks with a spell level tag greater than 5 would be considered valid:
foreach pick in hero where spell.level? > 5
  pick.activate[]
nexteach
but the following which instead compares against an ID that can't be mistaken for an integer constant would be flagged as an error during parsing:
foreach pick in hero where spell.level? > threshold
  pick.activate[]
nexteach
I've considered separate tokens, ID and ID_OR_INTEGER, but that means having to handle that ambiguity everywhere I'm currently using an ID, which is a lot of places, including variable declarations, expressions, looping structures, and procedure calls.
Is there a better way to indicate a parsing error than to just print to the error log, and maybe set a flag?
I would think about it differently. If an ID is "just a number" and plain numbers are also needed, I would say any string of digits is a number, and a number might designate an ID in some circumstances.
For bare integer literals (like 345), I would have the tokenizer return maybe a NUMBER token, indicating it found an integer. In the parser, wherever you currently accept ID, change it to NUMBER, and call a lookup function to verify the "NUMBER" is a valid ID.
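A minimal sketch of that idea in Python (the scripting language's own implementation language is not stated, so this is only an illustration). The names Token, classify, parse_identifier, parse_integer_operand and ParseError are hypothetical, and raising an exception is just one possible way to signal a parse error instead of logging and setting a flag.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str  # "NUMBER" for a bare run of digits, otherwise "ID"
    text: str


class ParseError(Exception):
    """Raised by the parser; the caller decides how to report it."""


def classify(word):
    """Tokenizer side: any string made only of digits is a NUMBER."""
    kind = "NUMBER" if re.fullmatch(r"[0-9]+", word) else "ID"
    return Token(kind, word)


def parse_identifier(token, declared):
    """Parser side, wherever an ID is expected: a NUMBER is accepted too,
    as long as a lookup confirms it names a declared ID."""
    if token.text not in declared:
        raise ParseError(f"unknown identifier {token.text!r}")
    return token.text


def parse_integer_operand(token):
    """Parser side, where only an integer constant is valid."""
    if token.kind != "NUMBER":
        raise ParseError(f"expected an integer, found {token.text!r}")
    return int(token.text)


print(parse_integer_operand(classify("5")))
try:
    parse_integer_operand(classify("threshold"))
except ParseError as err:
    print("parse error:", err)
```

With this split, only the few grammar rules that require an integer need the stricter check; every other rule keeps calling the identifier path.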
I might have misunderstood your question. You start by talking about "345", but your second example has no integer strings.

Cypher: which assignment operator

I would appreciate some Cypher-specific theory for why there are, effectively, two different assignment operators in the language. I can get things to work, but feel like something is missing...
Let's use Neo4j's movie database with the following query:
match (kr:Person {name:"Keanu Reeves"}), (hw:Person{name:"Hugo Weaving"}), p=shortestPath((kr)-[*]-(hw)) return p
Sure, the query works, but here's the point of my question: 'kr', 'hw' and 'p' are all variables, right? But why is it that the former two are assigned with a colon, but the latter takes an equal sign?
Thanks.
It's important to note that the : used for nodes and relationships really doesn't have anything to do with variable assignment at all, it's instead associated with node labels and relationship types.
A node label and a relationship type always start with a :, even if there isn't a variable present at all. This helps differentiate a node label or relationship type from a variable (a variable will never begin with a :), and the : naturally acts as a divider between the variable and the label/type when both are present. It's also possible to have a variable on a node or relationship but omit the label or type; in that case no : will be present, which again reinforces that it doesn't have anything to do with assignment.
In the context of a map {} (such as a properties map, including when it's inlined within a match on a node or relationship), the : separates the key from the value in each key/value pair. This is common syntax, also used in JSON.
Actual assignment (such as in SET clauses, and in your example of setting the path variable to a pattern within a match) uses =.
I do not think there is a deep theoretical reason for it. The original idea of Cypher was to provide an ASCII art-style language, where the MATCH part of the query resembles a graph pattern that you'd draw on a whiteboard.
In many ways, a graph instance is quite similar to a UML Object Diagram (and other common representations), where you would use name : type to denote an object's variable name and type (class) or just use : type for anonymous instances.
Now paths do not really fit into that picture. On a whiteboard, I'd just put the relevant part in a dashed/circled area and write p or p= next to it. Definitely not p:.
Note that it is possible to rephrase your query to a more compact form:
match p=shortestPath((kr:Person {name:"Keanu Reeves"})-[*]-(hw:Person {name:"Hugo Weaving"}))
return p
Here, using colons everywhere would look out of place, think: p:shortestPath((kr:Person {name:"Keanu Reeves"})
Remark 1. If you try to use a variable to capture relationships of a variable length pattern, you will get a warning:
Warning. This feature is deprecated and will be removed in future versions.
Binding relationships to a list in a variable length pattern is deprecated. (org.neo4j.graphdb.impl.notification.NotificationDetail$Factory$2#1eb6644d)
MATCH (a)-[rs:REL*]->(b)
^
So it is better to use a path and the relationships function to get the same result:
MATCH p=(a)-[:REL*]->(b)
RETURN relationships(p)
Remark 2. I come from an OO background and have been writing Cypher for a few years, so it might just be me getting used to the syntax; it might be odd for newcomers, especially from different fields.
Remark 3. The openCypher project now provides a grammar specification, which gives you insight into how a MATCH clause is parsed.

comparing length of node values

In Cypher, how can I compare the lengths of two values using length and substring, like this:
length(n.VALUE)= length(SUBSTRING(INPUT_STRING, 0,length(n.VALUE)))
I always get this exception when using this syntax:
SubstringFunction expected to be of type Collection but it is of type String
The length function as it stands is meant to be used with collections. There is currently no way to get the length of a string in Cypher. I've started work on adding a bunch of new string functions like soundex and charindex, and I'll throw this one on the stack of things to do, but I probably won't get it finished for a couple more weeks (and it needs to go through acceptance and even then will only be available as M05+, probably).

In Dart can hashCode() method calls return different values on equal (==) Objects?

My immediate project is to develop a system of CheckSums for proving that two somewhat complex objects are (functionally) EQUAL, in the sense that they have the same values for the critical properties. (I have discovered that dates/times cannot be included, so for my purposes I can't use JSON on the bigger object - duh :) )
To do this calling the hashCode() method on selected strings seemed to be the way to go.
Upon implementing this, I note that in practice I am getting very different values on multiple runs of highest level objects that are functionally 'identical'.
There are a number of "nums" that I have not rounded, there are integers, bools, Strings and not much more.
I have 'always' thought that a hashCode on the same set of values would return the same number, am I missing something?
BTW the only context that I have found material on hashCode() has been with WebSockets.
Of course I can write my own String to a unique value but I want to understand if this is a problem with Dart or something else.
I can attempt to answer the question posed in the title: "Can hashCode() method calls return different values on equal (==) Objects?"
Short answer: hash codes for two objects must be the same if those two objects are equal (==).
If you override hashCode you must also override the == operator. Two objects that are equal, as defined by ==, must also have the same hash code.
However, hash codes do not have to be unique. That is, a perfectly valid hash code is the value 1. A good hash code, however, should be uniformly distributed.
From the docs for Object:
Hash codes are guaranteed to be the same for objects that are equal
when compared using the equality operator ==. Other than that there
are no guarantees about the hash codes. They will not be consistent
between runs and there are no distribution guarantees.
If a subclass overrides hashCode it should override the equality
operator as well to maintain consistency.
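Because hash codes are not guaranteed to be stable between runs, a checksum that has to be compared across runs is usually built from a content digest instead. The following is a minimal sketch in Python rather than Dart (Python's built-in hash() for strings is likewise randomized per process by default). The function names, the field names in the example, and the choice of SHA-256 with 6-digit rounding are assumptions for illustration only.

```python
import hashlib
import json


def normalize(value, digits=6):
    """Round floats so tiny representation differences do not change the checksum."""
    if isinstance(value, float):
        return round(value, digits)
    if isinstance(value, dict):
        return {key: normalize(item, digits) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(item, digits) for item in value]
    return value


def checksum(critical):
    """Digest of a canonical serialization of the critical properties only."""
    # sort_keys makes the output independent of insertion order.
    canonical = json.dumps(normalize(critical), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical objects: same critical values, different key order and float noise.
first = {"name": "study-1", "samples": 40, "mean": 0.1 + 0.2, "active": True}
second = {"active": True, "mean": 0.3, "samples": 40, "name": "study-1"}
print(checksum(first) == checksum(second))
```

Dates, times and other values that legitimately vary are simply left out of the dictionary passed to checksum.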
I found the immediate problem. The object stringify() method, at one level, was not getting called, but rather some stringify property that must exist in all objects (?).
With this fixed, everything is working exactly as I would expect, and multiple runs of our Statistical Studies are returning exactly the same CheckSum at the highest levels (based on some 5 levels of hierarchy).
Meanwhile, JSON.stringify has continued to fail, even on the most basic object. I have not been able to determine what is causing it to fail. Of course, the question is not how "stringify" is accomplished.
So, empirically at least, I believe it is true that "objects with equal properties" will return equal checkSums in Dart. It was decided to round nums; I don't know if this was causing a problem, but it is perhaps good to be aware of. And, of course, remember to beware of things like dates, times, or anything that could legitimately vary.
_swarmii
The doc linked by Seth Ladd now includes this:
They need not be consistent between executions of the same program and there are no distribution guarantees.
So, technically, the hashCode value of the same object can change between executions. That answers your question:
I have 'always' thought that a hashCode on the same set of values would return the same number, am I missing something?

Duh? help with f# option types

I am having a brain freeze on f#'s option types. I have 3 books and read all I can but I am not getting them.
Does someone have a clear and concise explanation and maybe a real world example?
TIA
Gary
Brian's answer has been rated as the best explanation of option types, so you should probably read it :-). I'll try to write a more concise explanation using a simple F# example...
Let's say you have a database of products and you want a function that searches the database and returns the product with a specified name. What should the function do when there is no such product? When using null (here in C#), the code could look like this:
Product p = GetProduct(name);
if (p != null)
    Console.WriteLine(p.Description);
A problem with this approach is that you are not forced to perform the check, so you can easily write code that will throw an unexpected exception when product is not found:
Product p = GetProduct(name);
Console.WriteLine(p.Description);
When using an option type, you're making the possibility of a missing value explicit. Types defined in F# cannot have a null value, and when you want to write a function that may or may not return a value, you cannot return Product; instead you need to return option<Product>, so the above code would look like this (I added type annotations, so that you can see the types):
let (p:option<Product>) = GetProduct(name)
match p with
| Some prod -> Console.WriteLine(prod.Description)
| None -> () // No product found
You cannot directly access the Description property, because the result of the search is not a Product. To get the actual Product value, you need to use pattern matching, which forces you to handle the case when a value is missing.
Summary. To summarize, the purpose of option type is to make the aspect of "missing value" explicit in the type and to force you to check whether a value is available each time you work with values that may possibly be missing.
See,
http://msdn.microsoft.com/en-us/library/dd233245.aspx
The intuition behind the option type is that it "implements" a null-value. But in contrast to null, you have to explicitly require that a value can be null, whereas in most other languages, references can be null by default. There is a similarity to SQL's NULL/NOT NULL if you are familiar with those.
Why is this clever? It is clever because the language can assume that no output of any expression can ever be null. Hence, it can eliminate all null-pointer checks from the code, yielding a lot of extra speed. Furthermore, it frees the programmer from having to check for the null case everywhere in order to produce safe code.
For the few cases where a program does require a null value, the option type exists. As an example, consider a function which asks for a key inside an .ini file. The key returned is an integer, but the .ini file might not contain the key. In this case, it does make sense to return 'null' if the key is not to be found. None of the integer values are useful - the user might have entered exactly this integer value in the file. Hence, we need to 'lift' the domain of integers and give it a new value representing "no information", i.e., the null. So we wrap the 'int' into an 'int option'. Now, if there is no integer value we will get 'None' and if there is an integer value, we will get 'Some(N)' where N is the integer value in question.
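The .ini example can be sketched roughly as follows. This is Python rather than F#, so it is only an analogy: Optional[int] plays the role of 'int option', but Python itself does not force the caller to handle the missing case (a static type checker such as mypy can flag an unchecked use). The function names and the settings dictionary are hypothetical stand-ins for a parsed .ini file.

```python
from typing import Mapping, Optional


def lookup_int(settings: Mapping[str, str], key: str) -> Optional[int]:
    """Return the integer stored under key, or None when the key is absent."""
    raw = settings.get(key)
    if raw is None:
        return None  # corresponds to None in F#
    return int(raw)  # corresponds to Some(N)


def describe(settings: Mapping[str, str], key: str) -> str:
    value = lookup_int(settings, key)
    # Both cases are handled explicitly, like the two arms of a pattern match.
    if value is None:
        return f"{key} is not set"
    return f"{key} = {value}"


print(describe({"timeout": "0"}, "timeout"))
print(describe({}, "timeout"))
```

Note that a stored value of 0 stays distinguishable from a missing key, which is exactly why no ordinary integer can serve as the "not found" marker.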
There are two beautiful consequences of the choice. One, we can use the general pattern match features of F# to discriminate the values in e.g., a case expression. Two, the framework of algebraic datatypes used to define the option type is exposed to the programmer. That is, if there were no option type in F# we could have created it ourselves!
