Source
Id,name
1,Andrew
2,john
3,Robert
Target
detail
((1/Andrew)(2/john)(3/Robert))
Please provide a solution for the above scenario.
Thanks in advance.
In an Expression transformation, you can do the following:
ID (I) - ID
Name (I) - Name
v_EXP (V) - v_EXP||'('||ID||'/'||Name||')'
o_EXP (O) - v_EXP
Then link this Expression transformation to an Aggregator transformation with no group-by ports. The Aggregator passes on only the last row, in which o_EXP holds '(1/Andrew)(2/john)(3/Robert)'. Then push it through a second Expression transformation and do the following:
o_EXP (O) - '(' ||o_EXP || ')'
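The row-by-row behaviour described above can be simulated outside Informatica. The following Python sketch is only an illustration of the logic, not mapping code; the function name build_detail is made up for the example, and it assumes the variable port starts out as an empty string.

```python
# Simulation of the mapping logic; this is not Informatica code.
rows = [(1, "Andrew"), (2, "john"), (3, "Robert")]


def build_detail(rows):
    v_exp = ""  # variable port: keeps its value from one row to the next
    for row_id, name in rows:
        # v_EXP = v_EXP || '(' || ID || '/' || Name || ')'
        v_exp = v_exp + "(" + str(row_id) + "/" + name + ")"
    # An Aggregator without group-by ports passes on the last row only,
    # so only the fully accumulated string reaches the second expression.
    return "(" + v_exp + ")"


print(build_detail(rows))  # ((1/Andrew)(2/john)(3/Robert))
```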
Source -> SQ -> EXP -> Agg -> Target
In the Expression transformation, create the following ports:
ID(I) - ID
Name(I) - Name
V_exp(V) - v_O||'('||ID||'/'|| Name||')'
V_O (V) - V_exp
O_Result(O) - '('||v_exp||')'
Pass the output (O) port above to an Aggregator transformation and then to the target.
Related
I'm trying to use Menhir's incremental parsing API and introspection APIs in a generated parser. I want to, say, determine the semantic value associated with a particular LR(1) stack entry; i.e. a token that's been previously consumed by the parser.
Given an abstract parsing checkpoint, encapsulated in Menhir's type 'a env, I can extract a “stack element” from the LR automaton; it looks like this:
    type element =
      | Element: 'a lr1state * 'a * position * position -> element
The type element describes one entry in the stack of the LR(1) automaton. In a stack element of the form Element (s, v, startp, endp), s is a (non-initial) state and v is a semantic value. The value v is associated with the incoming symbol A of the state s. In other words, the value v was pushed onto the stack just before the state s was entered. Thus, for some type 'a, the state s has type 'a lr1state and the value v has type 'a ...
In order to do anything useful with the value v, one must gain information about the type 'a, by inspection of the state s. So far, the type 'a lr1state is abstract, so there is no way of inspecting s. The inspection API (§9.3) offers further tools for this purpose.
Okay, cool! So I go and dive into the inspection API:
The type 'a terminal is a generalized algebraic data type (GADT). A value of type 'a terminal represents a terminal symbol (without a semantic value). The index 'a is the type of the semantic values associated with this symbol ...
    type _ terminal =
      | T_A : unit terminal
      | T_B : int terminal
The type 'a nonterminal is also a GADT. A value of type 'a nonterminal represents a nonterminal symbol (without a semantic value). The index 'a is the type of the semantic values associated with this symbol ...
    type _ nonterminal =
      | N_main : thing nonterminal
Piecing these together, I get something like the following (where "command" is one of my grammar's nonterminals, and thus N_command is a string nonterminal):
    let current_command (env : 'a env) =
      let rec f i =
        match Interpreter.get i env with
        | None -> None
        | Some Interpreter.Element (lr1state, v, _startp, _endp) ->
          match Interpreter.incoming_symbol lr1state with
          | Interpreter.N Interpreter.N_command -> Some v
          | _ -> f (i + 1)
      in
      f 0
Unfortunately, this is puking up very confusing type-errors for me:
File "src/incremental.ml", line 110, characters 52-53:
Error: This expression has type string but an expression was expected of type
string
This instance of string is ambiguous:
it would escape the scope of its equation
This is a bit above my level! I'm pretty sure I understand why I can't do what I tried to do above; but I don't understand what my alternatives are. In fact, the Menhir manual specifically mentions this complexity:
This function can be used to gain access to the semantic value v in a stack element Element (s, v, _, _). Indeed, by case analysis on the symbol incoming_symbol s, one gains information about the type 'a, hence one obtains the ability to do something useful with the value v.
Okay, but that's what I thought I did, above: case-analysis by match'ing on incoming_symbol s, pulling out the case where v is of a single, specific type: string.
tl;dr: how do I extract the string payload from this GADT, and do something useful with it?
If your error reads
This instance of string is ambiguous:
it would escape the scope of its equation
it means the type checker cannot tell whether, outside the pattern-matching branch, the type of v should be string or some other type that is equal to string only inside the branch. You just need to add a type annotation where the value leaves the branch to remove this ambiguity:
| Interpreter.(N N_command) -> Some (v:string)
I have an AST containing a simple list of tokens...
and I simply want to group pairs of balanced parameters into nested trees.
I've been trying various rules but I can't quite get it...
bottomup : findParams;
findParams
: ^(LIST left+=expression* LPARAM inner? RPARAM right+=expression*)
-> ^(LIST $left* ^(PARAMS inner?) $right*);
inner : (left+=expression* LPARAM inner? RPARAM right+=expression*)
-> $left* ^(PARAMS inner?) $right*) | (a+=expression* -> $a*);
fragment expression = INT;
This is sort of like the Dyck language, but on a tree rather than on source text. Also, I can't debug pattern-matching tree grammars using remote debugging, which is a hindrance.
Your approach is on the right track, but you're mixing a top-down approach with a bottom-up one. Top-down is good for breaking things down: "this list is big, make it into some smaller ones." Bottom-up is good for breaking things out: "this is the simplest thing that could be a list, so I'll make it into one."
Here is a bottom-up solution to grouping your nodes:
    bottomup
        : exit_list
        ;

    exit_list
        : ^(LIST pre* LPAR reduced* RPAR post+=.*) -> ^(LIST pre* ^(LIST reduced*) $post*)
        ;

    pre : INT
        | LPAR
        | ^(LIST .*)
        ;

    reduced
        : INT
        | ^(LIST .*)
        ;
For each set of parentheses that contains no other parentheses, convert the contents of that set into a new list. This rule is repeated until there are no more parentheses.
Example:
Input
1(3(4))5
Baseline AST: (LIST 1 '(' 3 '(' 4 ')' ')' 5)
Final AST: (LIST 1 (LIST 3 (LIST 4)) 5)
Rule bottomup was recursively applied twice:
applied to (4): (LIST 1 '(' 3 '(' 4 ')' ')' 5) -> (LIST 1 '(' 3 (LIST 4) ')' 5)
applied to (3(4)): (LIST 1 '(' 3 (LIST 4) ')' 5) -> (LIST 1 (LIST 3 (LIST 4)) 5)
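The same bottom-up idea can be tried outside ANTLR. Below is a rough Python sketch, assuming the flat LIST node is a Python list whose parenthesis tokens are the strings "(" and ")"; the function names group_innermost and group_all are invented for the illustration. Each pass rewrites one pair of parentheses that contains no other parentheses, and the pass is repeated until nothing changes.

```python
def group_innermost(nodes):
    """Rewrite one innermost '(' ... ')' pair into a nested list.

    Returns (new_nodes, True) if a rewrite happened, else (nodes, False).
    """
    open_index = None
    for i, node in enumerate(nodes):
        if node == "(":
            open_index = i  # remember the most recent '('
        elif node == ")":
            if open_index is None:
                raise ValueError("unbalanced ')'")
            inner = nodes[open_index + 1:i]  # contains no parentheses
            return nodes[:open_index] + [inner] + nodes[i + 1:], True
    if open_index is not None:
        raise ValueError("unbalanced '('")
    return nodes, False


def group_all(nodes):
    """Apply group_innermost until no parentheses are left."""
    changed = True
    while changed:
        nodes, changed = group_innermost(nodes)
    return nodes


flat = [1, "(", 3, "(", 4, ")", ")", 5]
print(group_innermost(flat)[0])  # [1, '(', 3, [4], ')', 5]
print(group_all(flat))           # [1, [3, [4]], 5]
```

The two printed lines correspond to the two rule applications listed above.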
I am writing a lambda calculus interpreter in F#, but I am stuck on implementing beta-reduction (substituting the actual parameters for the formal parameters).
(lambda x.e)f
--> e[f/x]
example of usage:
(lambda n. n*2+3) 7
--> (n*2+3)[7/n]
--> 7*2+3
So I'd love to hear some suggestions in regards to how others might go about this. Any ideas would be greatly appreciated.
Thanks!
Assuming your representation of an expression looks like
    type expression = App of expression * expression
                    | Lambda of ident * expression
                    (* ... *)
, you have a function subst (x:ident) (e1:expression) (e2:expression) : expression which replaces all free occurrences of x with e1 in e2, and you want normal order evaluation, your code should look something like this:
    let rec eval exp =
        match exp with
        (* ... *)
        | App (f, arg) ->
            match eval f with
            | Lambda (x, e) -> eval (subst x arg e)
The subst function should work as follows:
For a function application it should call itself recursively on both subexpressions.
For lambdas it should call itself on the lambda's body expression unless the lambda's argument name is equal to the identifier you want to replace (in which case you can just leave the lambda be because the identifier can't appear freely anywhere inside it).
For a variable it should either return the variable unchanged or the replacement-expression depending on whether the variable's name is equal to the identifier.
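The three cases above can be sketched as follows. This is a Python illustration rather than F#, and the tuple encoding of terms and the name eval_term are assumptions made for the example. As in the description, bound variables are never renamed, so the sketch is only safe when the replacement has no free variables that the body binds.

```python
# Terms are plain tuples (an encoding assumed for this sketch):
#   ("var", name), ("lam", param, body), ("app", fn, arg)

def subst(x, replacement, term):
    """Replace the free occurrences of variable x in term with replacement."""
    kind = term[0]
    if kind == "var":
        return replacement if term[1] == x else term
    if kind == "lam":
        _, param, body = term
        if param == x:
            return term  # x is rebound here, so it cannot occur free inside
        return ("lam", param, subst(x, replacement, body))
    if kind == "app":
        _, fn, arg = term
        return ("app", subst(x, replacement, fn), subst(x, replacement, arg))
    raise ValueError(f"unknown term: {term!r}")


def eval_term(term):
    """Evaluate the function position first, then substitute the
    unevaluated argument (normal order), as in the eval function above."""
    if term[0] == "app":
        fn = eval_term(term[1])
        if fn[0] == "lam":
            _, param, body = fn
            return eval_term(subst(param, term[2], body))  # beta step
        return ("app", fn, term[2])
    return term


# (lambda x. lambda y. x) a b  -->  a
k = ("lam", "x", ("lam", "y", ("var", "x")))
print(eval_term(("app", ("app", k, ("var", "a")), ("var", "b"))))  # ('var', 'a')
```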
I've got a simple grammar. Actually, the grammar I'm using is more complex, but this is the smallest subset that illustrates my question.
    Expr   ::= Value Suffix
             | "(" Expr ")" Suffix
    Suffix ::= "->" Expr
             | "<-" Expr
             | Expr
             | epsilon
Value matches identifiers, strings, numbers, et cetera. The Suffix rule is there to eliminate left-recursion. This matches expressions such as:
a -> b (c -> (d) (e))
That is, a graph where a goes to both b and the result of (c -> (d) (e)), and c goes to d and e. I'm trying to produce an abstract syntax tree for these expressions, but I'm running into difficulty because all of the operators can accept any number of operands on each side. I'd rather keep the logic for producing the AST within the recursive descent parsing methods, since it avoids having to duplicate the logic of extracting an expression. My current strategy is as follows:
1. If a Value appears, push it to the output.
2. If a From or To appears:
   2.1. Output a separator.
   2.2. Get the next Expr.
   2.3. Create a Link node.
   2.4. Pop the first set of operands from output into the Link until a separator appears.
   2.5. Erase the separator discovered.
   2.6. Pop the second set of operands into the Link until a separator.
   2.7. Push the Link to the output.
If I run this through without obeying steps 2.3–2.7, I get a list of values and separators. For the expression quoted above, a -> b (c -> (d) (e)), the output should be:
A sep_1 B sep_2 C sep_3 D E
Applying the To rule would then yield:
A sep_1 B sep_2 (link from C to {D, E})
And subsequently:
(link from A to {B, (link from C to {D, E})})
The important thing to note is that sep_2, crucial to delimit the left-hand operands of the second ->, does not appear, so the parser believes that the expression was actually written:
a -> (b c -> (d) (e))
In order to solve this with my current strategy, I would need a way to produce a separator between adjacent expressions, but only if the current expression is a From or To expression enclosed in parentheses. If that's possible, then I'm just not seeing it and the answer ought to be simple. If there's a better way to go about this, however, then please let me know!
I haven't tried to analyze it in detail, but: "From or To expression enclosed in parentheses" starts to sound a lot like "context dependent", which recursive descent can't handle directly. To avoid context dependence you'll probably need a separate production for a From or To in parentheses vs. a From or To without the parens.
Edit: Though it may be too late to do any good, if my understanding of what you want to match is correct, I think I'd write it more like this:
    Graph :=
           | List Sep Graph
           ;
    Sep := "->"
         | "<-"
         ;
    List :=
          | Value List
          ;
    Value := Number
           | Identifier
           | String
           | '(' Graph ')'
           ;
It's hard to be certain, but I think this should at least be close to matching (only) the inputs you want, and should make it reasonably easy to generate an AST that reflects the input correctly.
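To see how a grammar of this shape yields the AST directly, here is a small recursive descent sketch in Python. It rests on several assumptions: the input is already split into tokens, the first alternatives are read as Graph := List and List := Value, and the Link type and function names are invented for the illustration. A parenthesized graph holding a single value is unwrapped to that value to keep the tree small.

```python
from typing import NamedTuple

SEPARATORS = ("->", "<-")


class Link(NamedTuple):
    sep: str       # "->" or "<-"
    left: tuple    # operands on the left of the separator
    right: object  # tuple of operands, or a nested Link


def parse_graph(tokens, pos=0):
    """Graph := List | List Sep Graph. Returns (tree, next_pos)."""
    left, pos = parse_list(tokens, pos)
    if pos < len(tokens) and tokens[pos] in SEPARATORS:
        sep = tokens[pos]
        right, pos = parse_graph(tokens, pos + 1)
        return Link(sep, left, right), pos
    return left, pos


def parse_list(tokens, pos):
    """List := Value | Value List. Returns (tuple_of_values, next_pos)."""
    values = ()
    while pos < len(tokens) and tokens[pos] not in SEPARATORS + (")",):
        value, pos = parse_value(tokens, pos)
        values = values + (value,)
    if not values:
        raise SyntaxError(f"expected a value at token {pos}")
    return values, pos


def parse_value(tokens, pos):
    """Value := atom | '(' Graph ')'."""
    if tokens[pos] != "(":
        return tokens[pos], pos + 1
    tree, pos = parse_graph(tokens, pos + 1)
    if pos >= len(tokens) or tokens[pos] != ")":
        raise SyntaxError(f"expected ')' at token {pos}")
    if not isinstance(tree, Link) and len(tree) == 1:
        tree = tree[0]  # "(d)" is just d
    return tree, pos + 1


def parse(tokens):
    tree, pos = parse_graph(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at {pos}")
    return tree


# a -> b (c -> (d) (e))
tokens = ["a", "->", "b", "(", "c", "->", "(", "d", ")", "(", "e", ")", ")"]
print(parse(tokens))
```

For the example expression this builds a link from a to b and a nested link, where the nested link goes from c to d and e. No separator bookkeeping is needed, because the parentheses are consumed by parse_value.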
Given an LL(1) grammar, what is an appropriate data structure or algorithm for producing an immutable concrete syntax tree in a functionally pure manner? Please feel free to write example code in whatever language you prefer.
My Idea
    symbol : either a token or a node
    result : success or failure
    token : a lexical token from source text
        value -> string : the value of the token
        type -> integer : the named type code of the token
        next -> token : reads the next token and keeps position of the previous token
        back -> token : moves back to the previous position and re-reads the token
    node : a node in the syntax tree
        type -> integer : the named type code of the node
        symbols -> linkedList : the symbols found at this node
        append -> symbol -> node : appends the new symbol to a new copy of the node
Here is an idea I have been thinking about. The main issue here is handling syntax errors.
I mean I could stop at the first error but that doesn't seem right.
    let program token =
        sourceElements (node nodeType.program) token

    let sourceElements node token =
        let (n, r) = sourceElement (node.append (node nodeType.sourceElements)) token
        match r with
        | success -> (n, r)
        | failure -> // ???

    let sourceElement node token =
        match token.value with
        | "function" ->
            functionDeclaration (node.append (node nodeType.sourceElement)) token
        | _ ->
            statement (node.append (node nodeType.sourceElement)) token
Please Note
I will be offering up a nice bounty to the best answer so don't feel rushed. Answers that simply post a link will have less weight over answers that show code or contain detailed explanations.
Final Note
I am really new to this kind of stuff so don't be afraid to call me a dimwit.
You want to parse something into an abstract syntax tree.
In the purely functional programming language Haskell, you can use parser combinators to express your grammar. Here is an example that parses a tiny expression language:
EDIT Use monadic style to match Graham Hutton's book
    -- import a library of *parser combinators*
    import Parsimony
    import Parsimony.Char
    import Parsimony.Error
    (+++) = (<|>)

    -- abstract syntax tree
    data Expr = I Int
              | Add Expr Expr
              | Mul Expr Expr
              deriving (Eq,Show)

    -- parse an expression
    parseExpr :: String -> Either ParseError Expr
    parseExpr = Parsimony.parse pExpr
        where

        -- grammar
        pExpr :: Parser String Expr
        pExpr = do
            a <- pNumber +++ parentheses pExpr  -- first argument
            do
                f <- pOp                        -- operation symbol
                b <- pExpr                      -- second argument
                return (f a b)
             +++ return a                       -- or just the first argument

        parentheses parser = do                 -- parse inside parentheses
            string "("
            x <- parser
            string ")"
            return x

        pNumber = do                            -- parse an integer
            digits <- many1 digit
            return . I . read $ digits

        pOp =                                   -- parse an operation symbol
             do string "+"
                return Add
            +++
             do string "*"
                return Mul
Here is an example run:
*Main> parseExpr "(5*3)+1"
Right (Add (Mul (I 5) (I 3)) (I 1))
To learn more about parser combinators, see for example chapter 8 of Graham Hutton's book "Programming in Haskell" or chapter 16 of "Real World Haskell".
Many parser combinator libraries can be used with different token types, as you intend to do. Token streams are usually represented as lists of tokens [Token].
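For readers who do not use Haskell, the same combinator idea can be sketched over a token sequence in Python. Everything here is an assumption made for illustration: tokens are plain strings, a parser is a function from (tokens, pos) to either (result, next_pos) or None, and the names Node, token, satisfy, seq, alt and lazy are not taken from any library. The tree keeps every token, including parentheses, so it is a concrete syntax tree, and it is built only from immutable tuples.

```python
from typing import NamedTuple


class Node(NamedTuple):
    kind: str
    children: tuple  # token strings or nested Node values


def token(expected):
    """Parser that accepts exactly one given token."""
    return satisfy(lambda tok: tok == expected)


def satisfy(predicate):
    """Parser that accepts one token for which predicate holds."""
    def parse(tokens, pos):
        if pos < len(tokens) and predicate(tokens[pos]):
            return tokens[pos], pos + 1
        return None
    return parse


def seq(kind, *parsers):
    """Run parsers in order and wrap their results in a new Node."""
    def parse(tokens, pos):
        children = ()
        for parser in parsers:
            result = parser(tokens, pos)
            if result is None:
                return None
            child, pos = result
            children = children + (child,)  # new tuple, nothing is mutated
        return Node(kind, children), pos
    return parse


def alt(*parsers):
    """Try each parser in turn and return the first success."""
    def parse(tokens, pos):
        for parser in parsers:
            result = parser(tokens, pos)
            if result is not None:
                return result
        return None
    return parse


def lazy(thunk):
    """Delay looking up a parser so that grammar rules can be recursive."""
    return lambda tokens, pos: thunk()(tokens, pos)


number = satisfy(str.isdigit)
operator = alt(token("+"), token("*"))
operand = alt(number, seq("parens", token("("), lazy(lambda: expr), token(")")))
expr = alt(seq("binary", operand, operator, lazy(lambda: expr)), operand)

tree, end = expr(["(", "5", "*", "3", ")", "+", "1"], 0)
print(tree)
```

Returning None on failure is the simplest choice; to report syntax errors instead of stopping silently, the failure value could carry the position and the expected token.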
Definitely check out the monadic parser combinator approach; I've blogged about it in C# and in F#.
Eric Lippert's blog series on immutable binary trees may be helpful. Obviously, you need a tree which is not binary, but it will give you the general idea.