Construct an alphabetical sequence - xslt-2.0

This XPath expression:
for $n in 1 to 5 return $n
Returns
1 2 3 4 5
Is it possible to do something similar with alphabetic characters?

Yep:
for $n in 65 to 69 return fn:codepoints-to-string($n)
returns:
A
B
C
D
E
These are the ASCII/ISO-8859-1 codes for A to E; XPath code points are Unicode, which agrees with both in this range.
for $n in fn:string-to-codepoints('A') to fn:string-to-codepoints('E')
return fn:codepoints-to-string($n)
should work in any locale.

Or, in XPath 3.0 (XSLT 3.0):
((32 to 127) ! codepoints-to-string(.))[matches(., '[A-Z]')]
Here we don't know whether or not the wanted characters have adjacent character codes (and in many real cases they wouldn't).
A complete XSLT 3.0 transformation using this XPath 3.0 expression:
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:sequence select=
"((32 to 127) ! codepoints-to-string(.))[matches(., '[A-Z]')]
"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied (I am using Saxon-EE 9.4.0.6J) on any XML document (not used), the wanted, correct result is produced:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
If we know that the wanted characters have adjacent character codes, then:
(string-to-codepoints('A') to string-to-codepoints('Z')) ! codepoints-to-string(.)
Explanation:
Use of the new XPath 3.0 simple map operator !.

Related

How to write a parser in prolog to output a parse tree

I am writing a parser in prolog that should be able to parse this math formula:
a = 1 * 2 + (3 - 4) / 5;
and print a parse tree for it that should look like this:
PARSE TREE:
assignment
ident(a)
assign_op
expression
term
factor
int(1)
mult_op
term
factor
int(2)
add_op
expression
term
factor
left_paren
expression
term
factor
int(3)
sub_op
expression
term
factor
int(4)
right_paren
div_op
term
factor
int(5)
semicolon
I have a predicate that prints the parse tree when I run run('program1.txt', 'myparsetree1.txt'). It reads the math formula from the program1.txt file and writes the parse tree to the myparsetree1.txt file.
So far I have tried writing this grammar for the parser, but it isn't working: I keep getting existence and instantiation errors, mostly with letter_code and digit_code from the tokenizer, where Prolog complains about too few arguments, among other things.
/*Loads the tokenizer*/
:- [tokenizer].
parse(I) --> assign(I).
assign(assign(I,'=', Expr,';')) -->
letter_code(I), '=', expr(Expr), ';'.
expr(expr(Term, add_op, Expr)) -->
term(Term), add_op, expr(Expr).
expr(expr(Term, sub_op, Expr)) -->
term(Term), sub_op, expr(Expr).
expr(expr(Term)) --> term(Term).
term(term(Factor, mul_op, Term)) -->
factor(Factor), mul_op, term(Term).
term(term(Factor, div_op, Term)) -->
factor(Factor), div_op, term(Term).
term(term(Factor)) --> factor(Factor).
factor(factor('(', Expr, ')')) --> '(', expr(Expr), ')'.
factor(factor(Digit)) --> digit_code(Digit).
add_op --> ['+'].
sub_op --> ['-'].
mul_op --> ['*'].
div_op --> ['/'].
letter_code and digit_code are predicates from a separate file called tokenizer.pl
digit_code(Code):-
Code >= 48, /* 48 = '0' 57 = '9' */
Code =< 57.
letter_code(Code):-
Code >= 97, /* 97 = 'a' 122 = 'z' */
Code =< 122.
When I run the program I usually get an existence error for letter_code/3, and the same for digit_code: Prolog complains that there isn't a predicate with 3 arguments. I have tried changing the predicate to have three arguments instead, but then I get an instantiation error. This is what I did and what it results in:
letter_code(Code, Xs, Xs):-
Code >= 97,
Code =< 122.
| ?- run('program1.txt','myparsetree1.txt').
! Existence error in user:letter_code/1
! procedure user:letter_code/1 does not exist
! goal: user:letter_code(97)
//------------------------------------------------
letter_code(Code, Xs, Xs):-
Code >= 97,
Code =< 122.
letter_code(Code):-
Code >= 97,
Code =< 122.
| ?- run('program1.txt','myparsetree1.txt').
! Instantiation error in argument 1 of (>=)/2
! goal: _293>=97
Does anyone know how to resolve this? I hope I made it clearer than when I first posted this question.
In the DCG body, letter_code(C) is going to expand to a call to letter_code/3. Your implementation with three arguments doesn't remove anything from the list, so it is unlikely to have the effect you want. You probably want something like this instead:
letter_code(Code) -->
[Code],
{
Code >= 97,
Code =< 122
}.
DCG rules expand to predicates with two extra arguments; for instance, listing(letter_code//1) shows the expanded clause:
?- listing(letter_code//1).
letter_code(A, [A|C], B) :-
A>=97,
A=<122,
B=C.
true.
When you redefined letter_code/3 yourself with letter_code(Code, Xs, Xs) :- ... you should have made something that would compile but fail at runtime. So your issues with not being able to find predicates that you have clearly defined lie somewhere else.
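To illustrate what "removing something from the list" means, here is a minimal sketch in Python (not Prolog) of the state threading that a DCG rule performs. The function name mirrors the rule above; the (value, rest) return convention is an assumption made for this illustration, not how Prolog represents it.

```python
def letter_code(codes):
    """Mimic letter_code//1: consume one code if it lies in 'a'..'z'.

    Returns (code, remaining_codes) on success and None on failure.
    The two list positions (input and remainder) play the role of the
    two extra arguments that the DCG translation adds.
    """
    if codes and 97 <= codes[0] <= 122:
        return codes[0], codes[1:]
    return None


if __name__ == "__main__":
    print(letter_code([ord(c) for c in "a=1"]))  # (97, [61, 49])
    print(letter_code([ord("=")]))               # None
```

A version that returned the input list unchanged would never bind a code to compare against 97 and 122, which is the situation behind the instantiation error.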
In DCGs, you embed ordinary Prolog goals by wrapping them in curly braces:
parse(I) --> assign(I).
assign(assign(I,'=', Expr,';')) -->
{letter_code(I)}, '=', expr(Expr), ';'.
...
That is.

elimination of indirect left recursion

I'm having problems understanding an online explanation of how to remove the left recursion in this grammar. I know how to remove direct recursion, but I'm not clear how to handle the indirect. Could anyone explain it?
A --> B x y | x
B --> C D
C --> A | c
D --> d
The way I learned to do this is to replace one of the offending non-terminal symbols with each of its expansions. In this case, we first replace B with its only expansion, the sequence C D:
A --> B x y | x
B --> C D
becomes
A --> C D x y | x
Now, we do the same for non-terminal symbol C, which has two alternatives:
A --> C D x y | x
C --> A | c
becomes
A --> A D x y | c D x y | x
The only other remaining grammar rule is
D --> d
so you can also make that replacement, leaving your entire grammar as
A --> A d x y | c d x y | x
There is no indirect left recursion now, since there is nothing indirect at all.
Also see here.
To eliminate left recursion altogether (not merely indirect left recursion), introduce the A' symbol from your own materials (credit to OP for this clarification and completion):
A -> c d x y A' | x A'
A' -> d x y A' | epsilon
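The last step, removing direct left recursion, is mechanical enough to sketch in code. The following Python function is only an illustration under assumed conventions: a grammar rule is a list of alternatives, each alternative is a list of symbols, and an empty list stands for epsilon. The function name and the example grammar E -> E + T | T are hypothetical and are not taken from the question.

```python
def eliminate_direct_left_recursion(name, alternatives):
    """Rewrite  A -> A a1 | ... | b1 | ...  as
         A  -> b1 A' | ...
         A' -> a1 A' | ... | epsilon
    Each alternative is a list of symbols; [] stands for epsilon.
    """
    # Tails of the alternatives that start with the rule's own name.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    # Alternatives that do not start with the rule's own name.
    others = [alt for alt in alternatives if not alt or alt[0] != name]
    if not recursive:
        return {name: alternatives}  # nothing to do
    fresh = name + "'"
    return {
        name: [alt + [fresh] for alt in others],
        fresh: [alt + [fresh] for alt in recursive] + [[]],
    }


if __name__ == "__main__":
    rules = eliminate_direct_left_recursion("E", [["E", "+", "T"], ["T"]])
    for head, alts in rules.items():
        print(head, "->", " | ".join(" ".join(a) or "epsilon" for a in alts))
```

Indirect left recursion is handled by first substituting rules into each other, as above, until the recursion becomes direct.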
Response to naomik's comments
Yes, grammars have interesting properties, and you can characterize certain semantic capabilities in terms of constraints on grammar rules. There are transformation algorithms to handle certain types of parsing problems.
In this case, we want to remove left-recursion: one desirable property of a grammar is that the use of any rule must consume at least one input token (terminal symbol). Left-recursion opens a door to infinite recursion in the parser.
I learned these things in my "Foundations of Computing" and "Compiler Construction" classes many years ago. Instead of writing a parser to adapt to a particular grammar, we'd transform the grammar to fit the parser style we wanted.

How to transform a term into a list in Prolog

How can I transform a term like 3 * y * w * t^3 into a list such as List = [3, *, y,...], without using the following predicate:
t2l(Term, List) :-
t2l_(Term, List-X),
X = [].
t2l_(Term, [F|X]-X) :-
Term =.. [F],
!.
t2l_(Term, L1-L4) :-
Term =.. [F, A1, A2],
t2l_(A1, L1-L2),
L2 = [F|L3],
t2l_(A2, L3-L4).
Is there a simple way?
In Prolog, everything that can be expressed by pattern matching should be expressed by pattern matching.
In your case, this is difficult, because you cannot collectively distinguish the integers from other arising terms by pattern matching with the defaulty representation you are using.
In the following, I am not solving the task completely for you, but I am showing how you can solve it once you have a clean representation.
As always when describing a list in Prolog, consider using dcg notation:
term_to_list(y) --> [y].
term_to_list(w) --> [w].
term_to_list(t) --> [t].
term_to_list(i(I)) --> [I].
term_to_list(A * B) -->
term_to_list(A),
[*],
term_to_list(B).
term_to_list(A^B) -->
term_to_list(A),
[^],
term_to_list(B).
In this example, I am using i(I) to symbolically represent the integer I.
Sample query and result:
?- phrase(term_to_list(i(3)*y*w*t^i(3)), Ls).
Ls = [3, *, y, *, w, *, t, ^, 3].
I leave converting the defaulty representation to a clean one as an easy exercise.
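The DCG above is an in-order walk over the term. For readers more familiar with other languages, the same traversal can be sketched in Python; the tuple encoding (operator, left, right) is an assumption made for this illustration and sidesteps the representation issue discussed above, because Python can tell tuples from leaves by type.

```python
def term_to_list(term):
    """In-order walk: (op, left, right) tuples are operator nodes,
    anything else is treated as a leaf."""
    if isinstance(term, tuple):
        op, left, right = term
        return term_to_list(left) + [op] + term_to_list(right)
    return [term]


if __name__ == "__main__":
    # 3 * y * w * t^3, with * left-associative and ^ binding tighter
    term = ("*", ("*", ("*", 3, "y"), "w"), ("^", "t", 3))
    print(term_to_list(term))  # [3, '*', 'y', '*', 'w', '*', 't', '^', 3]
```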
Thanks mat for answering; I forgot to close the question. However, I have created a new predicate that solves the problem:
term_string(Term, X),
string_codes(X, AList),
ascii_to_list(AList, Y).
ascii_to_list([X | Xs], [Y | Out]) :-
X >= 48,
X =< 57,
!,
number_codes(Y, [X]),
ascii_to_list(Xs, Out).
ascii_to_list([X | Xs], [Y | Out]) :-
char_code(Y, X),
ascii_to_list(Xs, Out).
ascii_to_list([], []).

Bottleneck in math parser Haskell

I got the code below from the Wikibooks page here. It parses math expressions, and it works very well for the code I'm working on. There is one problem, though: when I start to add layers of brackets to my expression, the program slows down dramatically and at some point crashes my computer. It has something to do with the number of operators I have it check for: the more operators I have, the fewer brackets I can parse. Is there any way to get around or fix this bottleneck?
Any help is much appreciated.
import Text.ParserCombinators.ReadP
-- slower
operators = [("Equality",'='),("Sum",'+'), ("Product",'*'), ("Division",'/'), ("Power",'^')]
-- faster
-- ~ operators = [("Sum",'+'), ("Product",'*'), ("Power",'^')]
skipWhitespace = do
many (choice (map char [' ','\n']))
return ()
brackets p = do
skipWhitespace
char '('
r <- p
skipWhitespace
char ')'
return r
data Tree op = Apply (Tree op) (Tree op) | Branch op (Tree op) (Tree op) | Leaf String deriving Show
leaf = chainl1 (brackets tree
+++ do
skipWhitespace
s <- many1 (choice (map char "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.-[]" ))
return (Leaf s))
(return Apply)
tree = foldr (\(op,name) p ->
let this = p +++ do
a <- p +++ brackets tree
skipWhitespace
char name
b <- this
return (Branch op a b)
in this)
(leaf +++ brackets tree)
operators
readA str = fst $ last $ readP_to_S tree str
main = do loop
loop = do
-- ~ try this
-- ~ (a+b+(c*d))
str <- getLine
print $ last $ readP_to_S tree str
loop
This is a classic problem in backtracking (or parallel parsing, they are basically the same thing).... Backtracking grows (at worst) exponentially with the size of the input, so the time to parse something can suddenly explode. In practice backtracking works OK in language parsing for most input, but explodes with recursive infix operator notation. You can see why by considering how many possible ways this could be parsed (using made up & and % operators):
a & b % c & d
could be parsed as
a & (b % (c & d))
a & ((b % c) & d)
(a & (b % c)) & d
((a & b) % c) & d
(a & b) % (c & d)
The number of groupings grows exponentially with the number of operators (the counts are the Catalan numbers: 1, 2, 5, 14, ...). The solution to this is to add some operator precedence information earlier in the parse, and throw away all but the sensible cases.... You will need an extra stack to hold pending operators, but you can always go through infix operator expressions in linear time, O(n).
LR parser generators like yacc do this for you.... With a parser combinator you need to do it by hand. In Parsec, there is an Expr module (Text.Parsec.Expr) with a buildExpressionParser function that builds this for you.
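To make the idea of using precedence early concrete, here is a minimal precedence-climbing sketch in Python rather than Haskell. It is not the ReadP code from the question and not Parsec's implementation. The binding powers in PRECEDENCE and the choice of "^" as right-associative are assumptions for the illustration; the tokenizer silently skips whitespace and any character it does not recognise.

```python
import re

# Assumed binding powers (higher binds tighter); "^" is right-associative.
PRECEDENCE = {"=": 1, "+": 2, "*": 3, "/": 3, "^": 4}
RIGHT_ASSOC = {"^"}


def tokenize(text):
    # Names/numbers, brackets and operators; everything else is skipped.
    return re.findall(r"[A-Za-z0-9.]+|[()=+*/^]", text)


def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def atom():
        tok = advance()
        if tok == "(":
            node = expression(1)
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok == ")" or tok in PRECEDENCE:
            raise SyntaxError("unexpected token: " + tok)
        return tok

    def expression(min_prec):
        left = atom()
        # Each operator is examined once; there is no backtracking.
        while peek() in PRECEDENCE and PRECEDENCE[peek()] >= min_prec:
            op = advance()
            bump = 0 if op in RIGHT_ASSOC else 1
            right = expression(PRECEDENCE[op] + bump)
            left = (op, left, right)
        return left

    tree = expression(1)
    if peek() is not None:
        raise SyntaxError("unexpected token: " + peek())
    return tree


if __name__ == "__main__":
    print(parse("a+b+(c*d)"))  # ('+', ('+', 'a', 'b'), ('*', 'c', 'd'))
```

Because every token is consumed exactly once, adding more operators or more layers of brackets does not multiply the work.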

How to parse S-expression in Erlang?

I am implementing a client agent for the Robocup Soccer simulator in Erlang. The simulator sends sensory information to the client in the form of S-expressions, like this:
(see 15 ((f c) 2 0 0 0) ((f r t) 64.1 -32) ((f r b) 64.1 32) ((f g r b) 55.1 7)
((g r) 54.6 0) ((b) 2 0 -0 0) ((l r) 54.6 90))
(see 16 ((f r t) 72.2 -44) ((f r b) 54.1 20) ((f g r b) 52.5 -10) ((g r) 54.1 -17)
((l r) 51.4 -89))
The simulator sends this kind of sensor information in each cycle (100-200 msec).
The main format of the information is:
(see Time ObjInfo ObjInfo . . . )
The ObjInfos are of the format below:
(ObjName Distance Direction [DistChange DirChange [BodyFacingDir HeadFacingDir]])
where the objects are like:
(b) Ball, (g r) Right goal, (f ...) represents various flags.
What I want is to parse this information and store/update it in some database (record) to use for analysis.
The main difficulty I am facing is parsing this information.
Could you please suggest a way of doing this? (Does Erlang have any library for such work?)
Yecc and Leex are your friends: http://erlang.org/doc/apps/parsetools/index.html
Leex is a lexical analyzer generator for Erlang which will tokenize your data. Yecc is an LALR-1 parser generator that can parse your tokens into meaningful structures.
There's a good blog post by Relops, Leex And Yecc, detailing some of the basics.
If you load LFE (Lisp Flavoured Erlang) it contains a lisp scanner and parser. The modules you need are lfe_scan, lfe_parse and lfe_io which wraps the other two. The scanner is written using leex (source is lfe_scan.xrl) while the parser is hand written as there are some features of how yecc works which didn't quite fit.
The correct approach would be to just write a small LISP reader.
The quick and (very) dirty way (for initial testing ONLY): Substitute whitespace with a comma, "(" with "{" and ")" with "}". Then you have an erlang literal.
Have a look at erl_scan and erl_parse.
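To show how small such a reader can be, here is a sketch in Python rather than Erlang. It is only an illustration of the technique under assumed conventions: lists become Python lists, numeric atoms become int or float, and all other atoms stay strings. The names TOKEN, parse_atom and read_sexprs are hypothetical.

```python
import re

# A token is a single bracket or a run of non-space, non-bracket characters.
TOKEN = re.compile(r"[()]|[^\s()]+")


def parse_atom(tok):
    """Convert a token to int or float when possible, else keep the string."""
    for convert in (int, float):
        try:
            return convert(tok)
        except ValueError:
            pass
    return tok


def read_sexprs(text):
    """Return the list of all top-level S-expressions found in text."""
    stack = [[]]  # stack[-1] is the list currently being filled
    for tok in TOKEN.findall(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise SyntaxError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(parse_atom(tok))
    if len(stack) != 1:
        raise SyntaxError("missing ')'")
    return stack[0]


if __name__ == "__main__":
    message = "(see 16 ((f r t) 72.2 -44) ((l r) 51.4 -89))"
    print(read_sexprs(message))
```

The same explicit-stack structure carries over to Erlang as a recursive function over a token list, which is roughly what a leex/yecc pair would generate for this grammar.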
