XQuery FLWOR Expression with Multiple for-loops returns unexpected empty tags - xml-parsing

Here is a short version of my question.
When using nested for-loops in a FLWOR expression, returning the variable from the outer loop behaves as expected, but returning the variable from the inner loop unexpectedly returns empty tags.
Here is a longer version of my question.
I have recently started learning XQuery. I've been using the XQilla command-line tool on Mac OS X 10.10.4 (Yosemite) to execute my XQuery scripts. In particular I've tried XQilla Version 2.3.0_2 which I installed with the MacPorts package manager and also XQilla version 2.3.2 which I manually installed from source.
While experimenting with various XPath and FLWOR expressions I came across an unexpected behavior which I've narrowed down to the simplified example which I am going to present here. The example consists of a single XML data file and two versions of a simple XQuery; the first query works as expected but the second query does not.
Here is my XML data file (test.xml):
<elements>
<element>
element data
</element>
</elements>
Here is my first XQuery script (test_1.xq):
for $e1 in doc("test.xml")//element
for $e2 in doc("test.xml")//element
return $e1
And here is my second XQuery script (test_2.xq):
for $e1 in doc("test.xml")//element
for $e2 in doc("test.xml")//element
return $e2
Both queries consist of the same redundant, nested for-loops, but the first query returns the variable from the outer (i.e. first) for-loop and the second query returns the variable from the inner (i.e. second) for-loop. I expected the two queries to have similar output in general, and identical output for this specific data file, however that appears not to be the case.
Here is a sample shell session:
$ xqilla test_1.xq
<element>
element data
</element>
$ xqilla test_2.xq
<element/>
So returning the variable from the inner loop only produces the empty tag; this was unexpected.
As a sanity-check I tried a third variation (test_3.xq):
for $e1 in doc("test.xml")//element
for $e2 in doc("test.xml")//element
where $e1=$e2
return $e2
And when I execute this third query I get:
$ xqilla test_3.xq
<element>
element data
</element>
I would have originally expected this query to work correctly, but in light of the unexpected behavior of test_2.xq I would think that test_3.xq would also fail. This only adds to my confusion.

Most probably a bug. An online XQuery tester http://www.xpathtester.com/xquery returns
<?xml version="1.0" encoding="UTF-8"?>
<element>
element data
</element>
as a result for both scripts (with the doc(...) part removed).
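To make the expected semantics concrete, here is a minimal sketch in Python rather than XQuery, using the standard xml.etree.ElementTree module as a stand-in for an XQuery processor (that substitution is an assumption for illustration only). The document from test.xml is inlined, and the two list comprehensions mirror test_1.xq and test_2.xq; with a single matching element, both nested iterations should yield the same full element.

```python
import xml.etree.ElementTree as ET

# The document from test.xml, inlined so the sketch is self-contained.
XML = """<elements>
<element>
element data
</element>
</elements>"""

root = ET.fromstring(XML)
elements = root.findall(".//element")

# Counterpart of test_1.xq: return the outer loop variable.
outer = [ET.tostring(e1, encoding="unicode")
         for e1 in elements for e2 in elements]

# Counterpart of test_2.xq: return the inner loop variable.
inner = [ET.tostring(e2, encoding="unicode")
         for e1 in elements for e2 in elements]

# Both iterations visit the same single (e1, e2) pair.
print(outer == inner)
```

This says nothing about why XQilla prints an empty tag; it only shows what a nested iteration over the same node sequence would normally produce.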

Lua Pattern matching only returning first match

I can't figure out how to get Lua to return ALL matches for a particular pattern match.
I have the following regex which works and is so basic:
.*\n
This just splits a long string per line.
The equivalent of this in Lua is:
.-\n
If you run the above in a regex website against the following text it will find three matches (if using the global flag).
Hello
my name is
Someone
If you do not use the global flag it will return only the first match. This is the behaviour of Lua; it's as if it does not have a global switch and will only ever return the first match.
The exact code I have is:
local test = {string.match(string_variable_here, ".-\n")}
If I run it on the above test for example, test will be a table with only one item (the first row). I even tried using capture groups but the result is the same.
I cannot find a way to make it return all occurrences of a match. Does anyone know if this is possible in Lua?
Thanks,
You can use string.gmatch(s, pattern) / s:gmatch(pattern):
This returns a pattern finding iterator. The iterator will search through the string passed looking for instances of the pattern you passed.
See the online Lua demo:
local a = "Hello\nmy name is\nSomeone\n"
for i in string.gmatch(a, ".-\n") do
print(i)
end
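Note that the .*\n regex is equivalent to the .-\n Lua pattern, not to .*\n: in Lua patterns . also matches newlines, so a greedy .* would run on to the last newline. The - quantifier in Lua patterns is the equivalent of the *? non-greedy ("lazy") quantifier.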
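The regex side of this comparison can be sketched in Python's re module (chosen here only for illustration; it is not part of the original question). re.search stands in for a non-global match, re.findall for the global flag, and the re.DOTALL flag is used to imitate the Lua behaviour where . also matches newlines.

```python
import re

text = "Hello\nmy name is\nSomeone\n"

# Without a "global" search only the first match comes back,
# much like string.match in Lua.
first = re.search(r".*\n", text).group(0)

# findall plays the role of the global flag (or of string.gmatch in Lua).
all_lines = re.findall(r".*\n", text)

# With DOTALL the dot also matches newlines, as it does in Lua patterns,
# so the greedy form swallows everything up to the last newline...
greedy = re.findall(r".*\n", text, flags=re.DOTALL)

# ...while the lazy form (the counterpart of Lua's ".-\n") still splits per line.
lazy = re.findall(r".*?\n", text, flags=re.DOTALL)

print(repr(first))
print(all_lines)
print(greedy)
print(lazy)
```

The greedy DOTALL case returns a single match covering the whole string, which is why the lazy quantifier is the one to use in the Lua pattern.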

Why we are unable to evaluate comprehension if we have defined it inside a rule body in OPA?

The following is my sample code: https://play.openpolicyagent.org/p/oyY1GOsYaf
Here when I try to evaluate names array, it is showing:
error occurred: 1:1: rego_unsafe_var_error: var names is unsafe
But when I define the same comprehension outside the allow rule definition : https://play.openpolicyagent.org/p/Xv0cF7FM8b, I am able to evaluate the selection
[
"smoke",
"dev"]
Could someone help me point out the difference? And if I want to define the comprehension inside the rule, is there any syntax I need to follow? Thanks in advance.
Note: I am getting the final output as expected in both cases, only issue is with the names array evaluation.
The way the Rego Playground generates a query when evaluating a selection is much more simplistic than one might assume. A query will be generated from your selected text, without taking into account where in the document that text was selected. This means that even if you select a local variable inside a rule body, the query will simply contain that variable name (names, in your case); which will be perceived as a reference to a top-level variable in the document's body, even though a rule-local variable was selected. This is why your first sample returns an error, as there is no top-level variable names in the document; whereas the second sample does, and therefore succeeds.
You can test this quirk by selecting and evaluating the word hello on line 3 here: https://play.openpolicyagent.org/p/n5OPoFnlhx.
package play

# hello

hello {
    m := input.message
    m == "world"
}
Even though it's just part of a comment, it'll evaluate just as if you had selected the rule name on line 5.

Lua io.write() adds unwanted material to output string

When I start an interactive Lua shell, io.write() adds unwanted material after the string I want it to print. print(), however, does not:
[user#manjaro lua]$ lua
Lua 5.4.2 Copyright (C) 1994-2020 Lua.org, PUC-Rio
> io.write('hello world')
hello worldfile (0x7fcc979d4520)
> print('hello world')
hello world
And when I use io.write() in a program it works fine too:
--hello.lua
io.write('hello world\n')
print ('hello world')
Output:
[user#manjaro lua]$ lua hello.lua
hello world
hello world
I'm using Manjaro Linux on a Dell desktop. Can anyone tell me what's going on here? Thanks in advance.
EDIT: I should add, perhaps, that the unwanted material is always something like this:
file (0x7f346234d520)
It's always 'file' followed by what looks like a large hexadecimal number in parentheses. The exact number stays constant within one shell session but varies between different shell sessions.
"file (0x7fcc979d4520)" (or whatever address) is the return value of the io.write call, with an implicit tostring.
The lua(1) man page says
In interactive mode, lua prompts the user, reads lines from the standard input, and executes them as they are read. If the line contains an expression or list of expressions, then the line is evaluated and the results are printed.
The trouble here is that io.write('hello world') could be either an expression or a statement. Since it's a valid expression, the interpreter outputs that unwanted return value.
As a workaround, try adding a semicolon:
> io.write('hello world\n');
hello world
Although Lua usually doesn't require a semicolon for each statement if it's at the end of a line, it does allow it. And, importantly here, it means the syntax can't be an expression, only a statement which calls the function. So the interpreter won't output the returned value.
You are just seeing the return value of io.write when you call io.write manually, interactively. When using the Lua, uh, shell, if you want to call it that, it almost always prints the return value of any function(s) you call.
file(blabblah) is the internal representation of the file you are writing to (probably just a hex memory address, but who knows?)

Dealing with read no parse error for lists

Hello, can someone please explain how to deal with failed computations (in our case, parsing) in Haskell when they are performed over a list, while still retrieving the successful elements?
The error I get is
main: Prelude.read: no parse and that stops the whole list from being processed.
I am using forM over a collection of Text, and for each element I am using read::String->Double to get the result value.
Currently the parsing fails at the first element and I cannot parse the remaining elements. How can I make single elements "fail-able" but still get partial results (for the elements of the list that could be parsed)?
Example :Input: ["asta","1.23","2.44"]
Desired Output:[1.23,2.44]
import Control.Monad (forM)
import qualified Data.Text as T

parseDfile :: [T.Text] -> IO [Maybe Double]
parseDfile [] = do
    return [Nothing]
parseDfile lines = forM lines $ \line ->
    do
        Prelude.putStrLn ("line value:" ++ T.unpack line)
        let value = (read :: String -> Double) . T.unpack $ line  -- fails here for first element
        print . show $ value
        return (Just value)
P.S. Do I have to define a method using the Maybe monad separately only for that one line of code?
The Text.Read library also has a function called readMaybe that returns a Maybe a instead of just an a like read does.
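If you're not sure whether a string can be parsed, you clearly want a Maybe a. You then need to deal with the Maybe, but Data.Maybe has functions that do exactly what you need: for example, mapMaybe applies a Maybe-returning function to each element and keeps only the Just results, so mapMaybe (readMaybe . T.unpack) turns your example input into [1.23,2.44] when the result type is [Double].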
For more complicated parsing you could look into the Haskell ParseLib which is really good. However it might be a little overkill if you're not trying to parse more than your example.

ANT: Reg regexp for extracting contents between slashes in property regex

I have the following strings as input for scheduler file
Z:\cnt_development\cnt\test\Test-cases-blr\v80-WM\scheduler\FRQ\AUTO\sml-hr454\SRISM.xml
Z:\cnt_development\cnt\test\Test-cases-blr\v80-WM\scheduler\FRQ\AUTO\sml-lr454\Swap_MUL.xml
Z:\cnt_development\cnt\test\Test-cases-blr\v80-WM\scheduler\FRQ\AUTO\sml-lr456\Swap_MU.xml
I need to extract the complete part from v80-WM
i.e. the regex must be able to select the following strings
v80-WM\scheduler\FRQ\AUTO\sml-hr454\SRISM.xml
v80-WM\scheduler\FRQ\AUTO\sml-lr454\Swap_MUL.xml
v80-WM\scheduler\FRQ\AUTO\sml-lr456\Swap_MU.xml
Currently I am using the following regex, which finds the last occurrence of "Q" in the above string and trims from there; I then use a workaround to construct the results mentioned above.
<echo message="runpART ... Scheduler File ${schedulerFile}"/>
<propertyregex property="cfg.arg" input="${schedulerFile}" regexp="([^Q]*).xml" select="\1" casesensitive="false"/>
Need help in extracting string from "v80-WM....xml".
Some inputs will be helpful
That's good. The v80-WM gives you a fixed "starting point"
Using this as your regular expression should do it.
^.*(v80-WM.*)
What it means:
^.* = match anything until you get to (the caret isn't really necessary, but I like making the reg exp more strict)
v80-WM
.* = then match the rest
The parens include the v80-WM name and everything that comes after so you don't have to reconstruct it.
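As a quick check of the pattern against the three sample paths, here is a small sketch using Python's re module as a stand-in for the regex engine behind propertyregex (an assumption for illustration; the pattern is simple enough that the behaviour is expected to match). Group 1 corresponds to select="\1" in the Ant task.

```python
import re

# Sample scheduler paths copied from the question.
paths = [
    r"Z:\cnt_development\cnt\test\Test-cases-blr\v80-WM\scheduler\FRQ\AUTO\sml-hr454\SRISM.xml",
    r"Z:\cnt_development\cnt\test\Test-cases-blr\v80-WM\scheduler\FRQ\AUTO\sml-lr454\Swap_MUL.xml",
    r"Z:\cnt_development\cnt\test\Test-cases-blr\v80-WM\scheduler\FRQ\AUTO\sml-lr456\Swap_MU.xml",
]

# Everything before v80-WM is consumed by ^.*; the group keeps the rest.
pattern = re.compile(r"^.*(v80-WM.*)")

extracted = [pattern.match(p).group(1) for p in paths]

for value in extracted:
    print(value)
```

Because ^.* is greedy, a path containing v80-WM more than once would be cut at the last occurrence, which is worth keeping in mind if the directory layout changes.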
