How to remove the last text in parentheses (elegantly and simply) - F#

So I have some working code.
I want to remove the last parenthesized text from a string, as long as it really is "the last" (i.e. it is followed only by whitespace or full stops).
So
let x1 = remove "hello how are you. "
let x2 = remove "hello how are you. (remove me) "
let x3 = remove "hello how are you(don't remove me!)., "
goes to
val x1 : string = "hello how are you. "
val x2 : string = "hello how are you. "
val x3 : string = "hello how are you(don't remove me!)., "
I need to do this in a language that doesn't support regex, is (sort of) functional, and is pretty basic.
(It's XSLT 1.0.)
I don't really like hacking in XSLT, so I've hacked together a simple routine in F#.
let remove s =
    match (s : string).LastIndexOf '(' with
    | -1 ->
        s
    | lastOpen ->
        let rest = s.Substring lastOpen
        match rest.LastIndexOf ')' with
        | -1 ->
            s
        | lastClose ->
            let after = rest.Substring (lastClose + 1)
            if after.Replace(".","") |> String.IsNullOrWhiteSpace then
                s.Substring (0,lastOpen) + after
            else
                s
I can't use Regex (or write some sort of parser); it just needs to be a basic, as-simple-as-possible algorithm.
The above broadly works (I can see an edge case where it doesn't, but don't get too hung up on that).
Is there anything simpler (and less ugly)?
EDIT #1, after accepting a reply:
As a sobering thought, this is the XSLT version of a 4-line F# program!
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:variable name="foo">
<xsl:call-template name="removeParenthesisIf">
<xsl:with-param name="s" select="'foo bar dfsfsfffs)'"/>
<xsl:with-param name="maxCount" select="20"/>
</xsl:call-template>
</xsl:variable>
</xsl:template>
<xsl:template name="removeParenthesisIf">
<xsl:param name="s"/>
<xsl:param name="maxCount"/>
<xsl:choose>
<xsl:when test="substring($s,string-length($s)) = ')'">
<xsl:variable name="lastOpen">
<xsl:call-template name="lastCharIndex">
<xsl:with-param name="pText" select="$s"/>
<xsl:with-param name="pChar" select="'('"/>
</xsl:call-template>
</xsl:variable>
<xsl:choose>
<xsl:when test="($maxCount > string-length($s) - $lastOpen + 1) and not(string-length($s) = $lastOpen )">
<xsl:value-of select="substring($s,1,$lastOpen - 1)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$s"/>
</xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$s"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template name="lastCharIndex">
<xsl:param name="pText"/>
<xsl:param name="pChar" select="' '"/>
<xsl:variable name="vRev">
<xsl:call-template name="reverse">
<xsl:with-param name="pStr" select="$pText"/>
</xsl:call-template>
</xsl:variable>
<xsl:value-of select="string-length($pText) - string-length(substring-before($vRev, $pChar))"/>
</xsl:template>
<xsl:template name="reverse">
<xsl:param name="pStr"/>
<xsl:variable name="vLength" select="string-length($pStr)"/>
<xsl:choose>
<xsl:when test="$vLength = 1">
<xsl:value-of select="$pStr"/>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="vHalfLength" select="floor($vLength div 2)"/>
<xsl:variable name="vrevHalf1">
<xsl:call-template name="reverse">
<xsl:with-param name="pStr"
select="substring($pStr, 1, $vHalfLength)"/>
</xsl:call-template>
</xsl:variable>
<xsl:variable name="vrevHalf2">
<xsl:call-template name="reverse">
<xsl:with-param name="pStr"
select="substring($pStr, $vHalfLength+1)"/>
</xsl:call-template>
</xsl:variable>
<xsl:value-of select="concat($vrevHalf2, $vrevHalf1)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>

The simplest way of doing something along those lines that I can think of, using mainly the built-in string functions, is the following:
let remove (s:string) =
    if s.TrimEnd(' ').EndsWith(")") then
        let last = s.LastIndexOf('(')
        s.Remove(last, s.LastIndexOf(')') - last + 1)
    else s
There are two tricks:
First, we check whether the string ends with ) once all trailing spaces have been removed. This detects the case where there is something to remove.
Second, if that's the case, you can remove a substring from the middle using Remove. This assumes that the string is well-formed and actually has an opening parenthesis, so that's an extra check you may need to add.
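For comparison, here is a minimal Python sketch of the same two tricks, with the extra check for a missing opening parenthesis added. The function name `remove_last_parens` is made up for illustration, and Python is used only as neutral notation; it is not the asker's target language.

```python
def remove_last_parens(s: str) -> str:
    """Drop the last "(...)" group when only spaces follow it."""
    # Trick 1: ignoring trailing spaces, does the string end with ")"?
    if not s.rstrip(" ").endswith(")"):
        return s
    last_open = s.rfind("(")
    if last_open == -1:
        # The extra check mentioned above: no opening parenthesis at all.
        return s
    # Trick 2: cut out everything from the last "(" to the last ")".
    last_close = s.rfind(")")
    return s[:last_open] + s[last_close + 1:]
```

Like the F# version, this keeps whatever trailing spaces followed the closing parenthesis, and it treats only spaces (not full stops) as ignorable.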


Is there precedence to listing out tokens in a parser.mly file?

For example, suppose I have some tokens listed at the beginning of my parser like so:
%{
open Ast
%}
%token SEMI LPAREN RPAREN LBRACE RBRACE COMMA PLUS MINUS TIMES DIVIDE ASSIGN
%token NOT EQ NEQ LT LEQ GT GEQ AND OR
%token RETURN IF ELSE FOR WHILE INT BOOL FLOAT VOID
%token <int> LITERAL
%token <bool> BLIT
%token <string> ID FLIT
%token EOF
Is there any type of precedence associated with this? Or is this just a listing of the tokens with no particular importance to order?

JavaCC: treat white space like <OR>

I'm trying to build a simple grammar for search engine queries.
I've got this so far:
options {
STATIC=false;
MULTI=true;
VISITOR=true;
}
PARSER_BEGIN(SearchParser)
package com.syncplicity.searchservice.infrastructure.parser;
public class SearchParser {}
PARSER_END(SearchParser)
SKIP :
{
" "
| "\t"
| "\n"
| "\r"
}
<*> TOKEN : {
<#_TERM_CHAR: ~[ " ", "\t", "\n", "\r", "!", "(", ")", "\"", "\\", "/" ] >
| <#_QUOTED_CHAR: ~["\""] >
| <#_WHITESPACE: ( " " | "\t" | "\n" | "\r" | "\u3000") >
}
TOKEN :
{
<AND: "AND">
| <OR: "OR">
| <NOT: ("NOT" | "!")>
| <LBRACKET: "(">
| <RBRACKET: ")">
| <TERM: (<_TERM_CHAR>)+ >
| <QUOTED: "\"" (<_QUOTED_CHAR>)+ "\"">
}
/** Main production. */
ASTQuery query() #Query: {}
{
subQuery()
( <AND> subQuery() #LogicalAnd
| <OR> subQuery() #LogicalOr
| <NOT> subQuery() #LogicalNot
)*
{
return jjtThis;
}
}
void subQuery() #void: {}
{
<LBRACKET> query() <RBRACKET> | term() | quoted()
}
void term() #Term:
{
Token t;
}
{
(
t=<TERM>
)
{
jjtThis.value = t.image;
}
}
void quoted() #Quoted:
{
Token t;
}
{
(
t=<QUOTED>
)
{
jjtThis.value = t.image;
}
}
It looks like it works as I wanted, e.g. it can handle AND, OR, NOT/!, single terms and quoted text.
However, I can't make it treat whitespace between terms as an OR operator. E.g. hello world should be treated as hello OR world.
I've tried all the obvious solutions, like <OR: ("OR" | " ")>, removing " " from SKIP, etc., but it still doesn't work.
Perhaps you don't want whitespace treated as an OR; perhaps you just want the OR keyword to be optional. In that case you can use a grammar like this:
query --> subquery (<AND> subquery | (<OR>)? subquery | <NOT> subquery)*
However, this grammar treats NOT as an infix operator, and it doesn't reflect precedence. Usually NOT has precedence over AND, and AND over OR. Your main production should also look for an EOF. For that you can try:
query --> query0 <EOF>
query0 --> query1 ((<OR>)? query1)*
query1 --> query2 (<AND> query2)*
query2 --> <NOT> query2 | subquery
subquery --> <LBRACKET> query0 <RBRACKET> | <TERM> | <QUOTED>
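To make the precedence layering concrete, here is a rough recursive-descent sketch of the grammar above in Python (not JavaCC). The `Parser` class, the tuple-based tree and the regex tokenizer are all assumptions made for illustration; the tokenizer only approximates the token definitions from the question and silently skips characters it does not recognise.

```python
import re

# Brackets, "!", quoted text, or a run of term characters.
TOKEN_RE = re.compile(r'\(|\)|!|"[^"]+"|[^\s!()"\\/]+')
KEYWORDS = {"AND", "OR", "NOT"}


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def query(self):      # query --> query0 <EOF>
        tree = self.query0()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def query0(self):     # query0 --> query1 ((<OR>)? query1)*
        left = self.query1()
        while self.peek() is not None and self.peek() != ")":
            if self.peek() == "OR":
                self.take()   # the OR keyword is optional
            left = ("OR", left, self.query1())
        return left

    def query1(self):     # query1 --> query2 (<AND> query2)*
        left = self.query2()
        while self.peek() == "AND":
            self.take()
            left = ("AND", left, self.query2())
        return left

    def query2(self):     # query2 --> <NOT> query2 | subquery
        if self.peek() in ("NOT", "!"):
            self.take()
            return ("NOT", self.query2())
        return self.subquery()

    def subquery(self):   # subquery --> ( query0 ) | TERM | QUOTED
        tok = self.take()
        if tok == "(":
            tree = self.query0()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return tree
        if tok is None or tok in KEYWORDS or tok in (")", "!"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return ("TERM", tok)


def parse(text):
    return Parser(TOKEN_RE.findall(text)).query()
```

With this layering, `parse("hello world")` yields an OR node over the two terms, and AND binds tighter than an explicit or implied OR.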
Ok. Suppose you actually do want to require that any missing ORs be replaced by at least one space. Or to put it another way, if there is one or more white spaces where an OR would be permitted, then that white space is considered to be an OR.
As in my other solution, I'll treat NOT as a unary operator and give NOT precedence over AND and AND precedence over either sort of OR.
Change
SKIP : { " " | "\t" | "\n" | "\r" }
to
TOKEN : {<WS : " " | "\t" | "\n" | "\r" > }
Now use a grammar like this
query() --> query0() ows() <EOF>
query0() --> query1()
             ( LOOKAHEAD( ows() <OR> | ws() (<NOT> | <LBRACKET> | <TERM> | <QUOTED>) )
               ows() (<OR>)?
               query1()
             )*
query1() --> query2() (LOOKAHEAD(ows() <AND>) ows() <AND> query2())*
query2() --> ows() (<NOT> query2() | subquery())
subquery() --> <LBRACKET> query0() ows() <RBRACKET> | <TERM> | <QUOTED>
ows() --> (<WS>)*
ws() --> (<WS>)+

How to do a parser in Prolog?

I would like to write a parser in Prolog. It should be able to parse something like this:
a = 3 + (6 * 11);
For now I only have this grammar. It works, but I would like to improve it so that an id can be (a..z)+ and a digit can be (0..9)+.
parse(-ParseTree, +Program, []):-
parsor(+Program, []).
parsor --> [].
parsor --> assign.
assign --> id, [=], expr, [;].
id --> [a] | [b].
expr --> term, (add_sub, expr ; []).
term --> factor, (mul_div, term ; []).
factor --> digit | (['('], expr, [')'] ; []).
add_sub --> [+] | [-].
mul_div --> [*] | [/].
digit --> [0] | [1] | [2] | [3] | [4] | [5] | [6] | [7] | [8] | [9].
Secondly, I would like to store something in the ParseTree variable in order to print the ParseTree like this:
PARSE TREE:
assignment
    ident(a)
    assign_op
    expression
        term
            factor
                int(1)
            mult_op
            term
                factor
                    int(2)
        add_op
        expression
            term
                factor
                    left_paren
                    expression
                        term
                            factor
                                int(3)
                        sub_op
                        ...
And this is the predicate I'm going to use to print the ParseTree:
output_result(OutputFile,ParseTree):-
open(OutputFile,write,OutputStream),
write(OutputStream,'PARSE TREE:'),
nl(OutputStream),
writeln_term(OutputStream,0,ParseTree),
close(OutputStream).
writeln_term(Stream,Tabs,int(X)):-
write_tabs(Stream,Tabs),
writeln(Stream,int(X)).
writeln_term(Stream,Tabs,ident(X)):-
write_tabs(Stream,Tabs),
writeln(Stream,ident(X)).
writeln_term(Stream,Tabs,Term):-
functor(Term,_Functor,0), !,
write_tabs(Stream,Tabs),
writeln(Stream,Term).
writeln_term(Stream,Tabs1,Term):-
functor(Term,Functor,Arity),
write_tabs(Stream,Tabs1),
writeln(Stream,Functor),
Tabs2 is Tabs1 + 1,
writeln_args(Stream,Tabs2,Term,1,Arity).
writeln_args(Stream,Tabs,Term,N,N):-
arg(N,Term,Arg),
writeln_term(Stream,Tabs,Arg).
writeln_args(Stream,Tabs,Term,N1,M):-
arg(N1,Term,Arg),
writeln_term(Stream,Tabs,Arg),
N2 is N1 + 1,
writeln_args(Stream,Tabs,Term,N2,M).
write_tabs(_,0).
write_tabs(Stream,Num1):-
write(Stream,'\t'),
Num2 is Num1 - 1,
write_tabs(Stream,Num2).
writeln(Stream,Term):-
write(Stream,Term),
nl(Stream).
write_list(_Stream,[]).
write_list(Stream,[Ident = Value|Vars]):-
write(Stream,Ident),
write(Stream,' = '),
format(Stream,'~1f',Value),
nl(Stream),
write_list(Stream,Vars).
I hope someone will be able to help me. Thank you !
Here's an enhancement of your parser as written which can get you started. It's an elaboration of the notions that @CapelliC indicated.
parser([]) --> [].
parser(Tree) --> assign(Tree).
assign([assignment, ident(X), '=', Exp]) --> id(X), [=], expr(Exp), [;].
id(X) --> [X], { atom(X) }.
expr([expression, Term]) --> term(Term).
expr([expression, Term, Op, Exp]) --> term(Term), add_sub(Op), expr(Exp).
term([term, F]) --> factor(F).
term([term, F, Op, Term]) --> factor(F), mul_div(Op), term(Term).
factor([factor, int(N)]) --> num(N).
factor([factor, Exp]) --> ['('], expr(Exp), [')'].
add_sub(Op) --> [Op], { memberchk(Op, ['+', '-']) }.
mul_div(Op) --> [Op], { memberchk(Op, ['*', '/']) }.
num(N) --> [N], { number(N) }.
I might have a couple of niggles in here, but the key elements I've added to your code are:
Replaced digit with num which accepts any Prolog term N for which number(N) is true
Used atom(X) to identify a valid identifier
Added an argument to hold the result of parsing the given expression item
As an example:
| ?- phrase(parser(Tree), [a, =, 3, +, '(', 6, *, 11, ')', ;]).
Tree = [assignment,ident(a),=,[expression,[term,[factor,int(3)]],+,[expression,[term,[factor,[expression,[term,[factor,int(6)],*,[term,[factor,int(11)]]]]]]]]] ? ;
This may not be an ideal representation of the parse tree. It may need some adjustment per your needs, which you can do by modifying what I've shown a little. And then you can write a predicate which formats the parse tree as you like.
You could also consider, instead of a list structure, an embedded Prolog term structure as follows:
parser([]) --> [].
parser(Tree) --> assign(Tree).
assign(assignment(ident(X), '=', Exp)) --> id(X), [=], expr(Exp), [;].
id(X) --> [X], { atom(X) }.
expr(expression(Term)) --> term(Term).
expr(expression(Term, Op, Exp)) --> term(Term), add_sub(Op), expr(Exp).
term(term(F)) --> factor(F).
term(term(F, Op, Term)) --> factor(F), mul_div(Op), term(Term).
factor(factor(int(N))) --> num(N).
factor(factor(Exp)) --> ['('], expr(Exp), [')'].
add_sub(Op) --> [Op], { memberchk(Op, ['+', '-']) }.
mul_div(Op) --> [Op], { memberchk(Op, ['*', '/']) }.
num(N) --> [N], { number(N) }.
Which results in something like this:
| ?- phrase(parser(T), [a, =, 3, +, '(', 6, *, 11, ')', ;]).
T = assignment(ident(a),=,expression(term(factor(int(3))),+,expression(term(factor(expression(term(factor(int(6)),*,term(factor(int(11)))))))))) ? ;
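As a cross-check on the shape of that tree, here is a small Python sketch of the same grammar written as a recursive-descent parser. It takes an already tokenized list (integers for numbers, strings for everything else) and builds nested tuples that mirror the Prolog term. The function name `parse_assignment` and the tuple layout are assumptions for illustration only.

```python
def parse_assignment(tokens):
    """Parse tokens such as ['a', '=', 3, '+', 4, ';'] into nested tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {peek()!r}")
        pos += 1

    def expr():      # expr --> term | term add_sub expr
        nonlocal pos
        left = term()
        if peek() in ("+", "-"):
            op = tokens[pos]
            pos += 1
            return ("expression", left, op, expr())
        return ("expression", left)

    def term():      # term --> factor | factor mul_div term
        nonlocal pos
        left = factor()
        if peek() in ("*", "/"):
            op = tokens[pos]
            pos += 1
            return ("term", left, op, term())
        return ("term", left)

    def factor():    # factor --> num | '(' expr ')'
        nonlocal pos
        tok = peek()
        if isinstance(tok, int):
            pos += 1
            return ("factor", ("int", tok))
        expect("(")
        inner = expr()
        expect(")")
        return ("factor", inner)

    ident = peek()
    if not (isinstance(ident, str) and ident.isalpha()):
        raise SyntaxError(f"expected identifier, got {ident!r}")
    pos += 1
    expect("=")
    tree = ("assignment", ("ident", ident), "=", expr())
    expect(";")
    if pos != len(tokens):
        raise SyntaxError("unexpected tokens after ';'")
    return tree
```

Unlike the DCG, this version does not backtrack; it decides with one token of lookahead, which is enough for this grammar.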
A recursive rule for id//0, made a bit more generic:
id --> [First], {char_type(First,lower)}, id ; [].
Building the tree could be done 'by hand', augmenting each non terminal with the proper term, like
...
assign(assign(Id, Expr)) --> id(Id), [=], expr(Expr), [;].
...
id//0 could become id//1
id(id([First|Rest])) --> [First], {memberchk(First, [a,b])}, id(Rest) ; [], {Rest=[]}.
If you're going to code such parsers frequently, a rewrite rule can be easily implemented...
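The question also asked for identifiers of the form (a..z)+ and numbers of the form (0..9)+. One way to get there is to group characters into tokens before parsing, which is what the recursive id rule above does one character at a time. A hedged Python sketch of that step follows; the name `tokenize` and the exact symbol set are assumptions based on the grammar in the question.

```python
import re

# One token per match: lowercase identifier, unsigned integer, or a symbol.
TOKEN_RE = re.compile(r"\s*(?:(?P<id>[a-z]+)|(?P<int>[0-9]+)|(?P<sym>[=+\-*/();]))")


def tokenize(source):
    """Split source text into identifiers (str), integers (int) and symbols (str)."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if m.lastgroup == "int":
            tokens.append(int(m.group("int")))
        else:
            tokens.append(m.group(m.lastgroup))
        pos = m.end()
    return tokens
```

For example, `tokenize("a = 3 + (6 * 11);")` produces the same token list that is passed to `phrase/2` in the queries above, with numbers as integers.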

How to write this regex code snippet right?

I am trying to get a UIWebView to display some text with images.
The text has some links inside of it so for example:
"I once had a fish http://mysite.com/images/fish.jpg.
I also owned a little dog and a rooster http://mysite.com/images/dog.jpg"
would result in:
I once had a fish
----------------------------------
| |
|Fish Image From |
|http://mysite.com/images/fish.jpg|
| |
| |
----------------------------------
I also owned a little dog and a rooster
----------------------------------
| |
|dog Image From |
|http://mysite.com/images/dog.jpg|
| |
| |
----------------------------------
--------------------------------------
| |
|rooster Image From |
|http://mysite.com/images/rooster.jpg|
| |
| |
--------------------------------------
NSMutableString *mutableString = [[NSMutableString alloc] initWithFormat:@"<html><head></head><body>%@</body>",string];
NSString *pattern = @"http://.+\\.(?:jpg|jpeg|png|gif|bmp)";
NSString *replacement = @"<br /><img style=\"width:100%;height:auto;\" src=\"$1\"/><br />";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern
                                                                       options:0 error:NULL];
[regex replaceMatchesInString:mutableString
                      options:0
                        range:NSMakeRange(0, mutableString.length)
                 withTemplate:replacement];
But when I check on the result the image gets this: <br /><img style="width:100%;height:auto;" src=""/><br /> and the link is gone.
Where the image has to be :
Text...
<br /><img style="width:100%;height:auto;" src="http://mysite.com/images/fish.jpg"/><br />
Text...
<br /><img style="width:100%;height:auto;" src="http://mysite.com/images/dog.jpg"/><br />
<br /><img style="width:100%;height:auto;" src="http://mysite.com/images/rooster.jpg"/><br />
I think for $1 to work properly you need to put the matched pattern inside a group, i.e., inside (). How about using the following pattern:
(http://.+\\.(?:jpg|jpeg|png|gif|bmp))
Also, the .+ part of the pattern may eat the whole string. To make it less greedy it can be replaced with .+? or, more correctly, \w.*? to keep the effect of + in the original pattern.
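The effect of both suggestions (a capture group plus a non-greedy quantifier) can be checked quickly outside of Objective-C. The following Python sketch uses the standard `re` module, where the backreference is written `\1` rather than `$1`; the helper name `embed_images` is made up for illustration.

```python
import re

# Capture the whole URL in group 1; ".+?" stops at the first image extension.
IMAGE_URL = re.compile(r"(http://.+?\.(?:jpg|jpeg|png|gif|bmp))")
IMG_TAG = r'<br /><img style="width:100%;height:auto;" src="\1"/><br />'


def embed_images(text):
    """Replace every image URL in text with an <img> tag pointing at it."""
    return IMAGE_URL.sub(IMG_TAG, text)
```

Without the parentheses there is no group 1 to refer to, which matches the empty `src=""` seen in the question. With a greedy `.+`, two URLs on the same line would be swallowed as a single match.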

How can I transform a large discriminated union tree to a readable form?

The following type is clearly quite large so manually writing the code to convert this to a readable form would be tedious. I would like to know the simplest way to display the tree in a readable form.
type Element =
| Nil
| Token of Token
| Expression of Element * Element
| ExpressionNoIn of Element * Element
| AssignmentExpression of Element * AssignmentOperator * Element
| AssignmentExpressionNoIn of Element * AssignmentOperator * Element
| ConditionalExpression of Element * Element * Element
| ConditionalExpressionNoIn of Element * Element * Element
| LogicalORExpression of Element * Element
| LogicalORExpressionNoIn of Element * Element
| LogicalANDExpression of Element * Element
| LogicalANDExpressionNoIn of Element * Element
| BitwiseORExpression of Element * Element
| BitwiseORExpressionNoIn of Element * Element
| BitwiseXORExpression of Element * Element
| BitwiseXORExpressionNoIn of Element * Element
| BitwiseANDExpression of Element * Element
| BitwiseANDExpressionNoIn of Element * Element
| EqualityExpression of Element * EqualityOperator * Element
| EqualityExpressionNoIn of Element * EqualityOperator * Element
| RelationalExpression of Element * RelationalOperator * Element
| RelationalExpressionNoIn of Element * RelationalOperator * Element
| ShiftExpression of Element * BitwiseShiftOperator * Element
| AdditiveExpression of Element * AdditiveOperator * Element
| MultiplicativeExpression of Element * MultiplicativeOperator * Element
| UnaryExpression of UnaryOperator * Element
| PostfixExpression of Element * PostfixOperator
| MemberExpression of Element * Element
| Arguments of Element * Element
| ArgumentList of Element
| CallExpression of Element * Element
| NewExpression of NewOperator * Element
| LeftHandSideExpression of Element
| PrimaryExpression of Element
| ObjectLiteral of Element
| PropertyNameAndValueList of Element * Element
| PropertyAssignment of Element * Element * Element
| PropertyName of Element
| PropertySetParameterList of Element
| ArrayLiteral of Element * Element
| Elision of Element * Element
| ElementList of Element * Element * Element
| Statement of Element
| Block of Element
| StatementList of Element * Element
| VariableStatement of Element
| VariableDeclarationList of Element * Element
| VariableDeclarationListNoIn of Element * Element
| VariableDeclaration of Element * Element
| VariableDeclarationNoIn of Element * Element
| Initialiser of Element
| InitialiserNoIn of Element
| EmptyStatement
| ExpressionStatement of Element
| IfStatement of Element * Element * Element
| IterationStatement of Element * Element * Element * Element
| ContinueStatement of Element
| BreakStatement of Element
| ReturnStatement of Element
| WithStatement of Element * Element
| SwitchStatement of Element * Element
| CaseBlock of Element * Element * Element
| CaseClauses of Element * Element
| CaseClause of Element * Element
| DefaultClause of Element
| LabelledStatement of Element * Element
| ThrowStatement of Element
| TryStatement of Element * Element * Element
| Catch of Element * Element
| Finally of Element
| DebuggerStatement
| FunctionDeclaration of Element * Element * Element
| FunctionExpression of Element * Element * Element
| FormalParameterList of Element * Element
| FunctionBody of Element
| SourceElement of Element
| SourceElements of Element * Element
| Program of Element
Here is an example of how this might be displayed. (It is a bit different as I produced it a while ago.)
<Expression>
<AssignmentExpression>
<ConditionalExpression>
<LogicalORExpression>
<LogicalORExpression>
<LogicalANDExpression>
<BitwiseORExpression>
<BitwiseXORExpression>
<BitwiseANDExpression>
<EqualityExpression>
<EqualityExpression>
<RelationalExpression>
<ShiftExpression>
<AdditiveExpression>
<MultiplicativeExpression>
<MultiplicativeExpression>
<UnaryExpression>
<PostfixExpression>
<LeftHandSideExpression>
<NewExpression>
<MemberExpression>
<PrimaryExpression>
<TokenNode Value="i" Line="9" Column="13" />
</PrimaryExpression>
</MemberExpression>
</NewExpression>
</LeftHandSideExpression>
</PostfixExpression>
</UnaryExpression>
</MultiplicativeExpression>
<TokenNode Value="%" Line="9" Column="15" />
<UnaryExpression>
<PostfixExpression>
<LeftHandSideExpression>
<NewExpression>
<MemberExpression>
<PrimaryExpression>
<TokenNode Value="3" Line="9" Column="17" />
</PrimaryExpression>
</MemberExpression>
</NewExpression>
</LeftHandSideExpression>
</PostfixExpression>
</UnaryExpression>
</MultiplicativeExpression>
</AdditiveExpression>
</ShiftExpression>
</RelationalExpression>
</EqualityExpression>
<TokenNode Value="===" Line="9" Column="19" />
<RelationalExpression>
<ShiftExpression>
<AdditiveExpression>
<MultiplicativeExpression>
<UnaryExpression>
<PostfixExpression>
<LeftHandSideExpression>
<NewExpression>
<MemberExpression>
<PrimaryExpression>
<TokenNode Value="0" Line="9" Column="23" />
</PrimaryExpression>
</MemberExpression>
</NewExpression>
</LeftHandSideExpression>
</PostfixExpression>
</UnaryExpression>
</MultiplicativeExpression>
</AdditiveExpression>
</ShiftExpression>
</RelationalExpression>
</EqualityExpression>
</BitwiseANDExpression>
</BitwiseXORExpression>
</BitwiseORExpression>
</LogicalANDExpression>
</LogicalORExpression>
<TokenNode Value="||" Line="9" Column="25" />
<LogicalANDExpression>
<BitwiseORExpression>
<BitwiseXORExpression>
<BitwiseANDExpression>
<EqualityExpression>
<EqualityExpression>
<RelationalExpression>
<ShiftExpression>
<AdditiveExpression>
<MultiplicativeExpression>
<MultiplicativeExpression>
<UnaryExpression>
<PostfixExpression>
<LeftHandSideExpression>
<NewExpression>
<MemberExpression>
<PrimaryExpression>
<TokenNode Value="i" Line="9" Column="28" />
</PrimaryExpression>
</MemberExpression>
</NewExpression>
</LeftHandSideExpression>
</PostfixExpression>
</UnaryExpression>
</MultiplicativeExpression>
<TokenNode Value="%" Line="9" Column="30" />
<UnaryExpression>
<PostfixExpression>
<LeftHandSideExpression>
<NewExpression>
<MemberExpression>
<PrimaryExpression>
<TokenNode Value="5" Line="9" Column="32" />
</PrimaryExpression>
</MemberExpression>
</NewExpression>
</LeftHandSideExpression>
</PostfixExpression>
</UnaryExpression>
</MultiplicativeExpression>
</AdditiveExpression>
</ShiftExpression>
</RelationalExpression>
</EqualityExpression>
<TokenNode Value="===" Line="9" Column="34" />
<RelationalExpression>
<ShiftExpression>
<AdditiveExpression>
<MultiplicativeExpression>
<UnaryExpression>
<PostfixExpression>
<LeftHandSideExpression>
<NewExpression>
<MemberExpression>
<PrimaryExpression>
<TokenNode Value="0" Line="9" Column="38" />
</PrimaryExpression>
</MemberExpression>
</NewExpression>
</LeftHandSideExpression>
</PostfixExpression>
</UnaryExpression>
</MultiplicativeExpression>
</AdditiveExpression>
</ShiftExpression>
</RelationalExpression>
</EqualityExpression>
</BitwiseANDExpression>
</BitwiseXORExpression>
</BitwiseORExpression>
</LogicalANDExpression>
</LogicalORExpression>
</ConditionalExpression>
</AssignmentExpression>
</Expression>
If you want to write generic union-processing code that does not need to list all the union cases, then you'll probably need to use the F# reflection API. Here is a simple example.
The formatUnion function uses F# reflection. It assumes that the type parameter 'T is a union type and uses GetUnionFields to get a tuple containing information about the current case (including its name) and its arguments. It prints the current case name and iterates over all the arguments. If any of the arguments is a value of type 'T (meaning that the union is recursive), we recursively print information about that value:
let rec formatUnion indent (value:'T) = //'
    // Get name and arguments of the current union case
    let info, args = Reflection.FSharpValue.GetUnionFields(value, typeof<'T>) //'
    // Print current name (with some indentation)
    printfn "%s%s" indent info.Name
    for a in args do
        match box a with
        | :? 'T as v ->
            // Recursive use of the same union type..
            formatUnion (indent + " ") v
        | _ -> ()
The following example runs the function on a very simple union value:
type Element = | Nil | And of Element * Element | Or of Element * Element
formatUnion "" (And(Nil, Or(Nil, Nil)))
// Here is the expected output:
// And
//  Nil
//  Or
//   Nil
//   Nil
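The same idea (walk the fields of a node generically and recurse only into fields of the tree type) can be sketched in Python with dataclasses standing in for union cases. This is an analogy rather than a translation of the F# reflection API; the class names and `format_tree` are assumptions for illustration.

```python
from dataclasses import dataclass, fields


class Element:
    """Base class standing in for the union type."""


@dataclass
class Nil(Element):
    pass


@dataclass
class And(Element):
    left: Element
    right: Element


@dataclass
class Or(Element):
    left: Element
    right: Element


def format_tree(value, indent=""):
    """Return one line per node, indenting children by two spaces."""
    lines = [indent + type(value).__name__]
    for f in fields(value):
        child = getattr(value, f.name)
        # Recurse only into fields that are themselves tree nodes.
        if isinstance(child, Element):
            lines.extend(format_tree(child, indent + "  "))
    return lines
```

Calling `"\n".join(format_tree(And(Nil(), Or(Nil(), Nil()))))` gives the same nested listing as the F# example, without any per-case code.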
As a side note, I think that you could greatly simplify your discriminated union by having cases for BinaryOperator and UnaryOperator (with one additional parameter) instead of listing all the element types explicitly. Then you could probably implement the function directly, because it would be quite simple. Something like:
type BinaryOperator = LogicalOr | LogicalAnd | BitwiseOr // ...
type UnaryOperator = Statement | Block | Initializer // ...
type Element =
    | BinaryOperator of BinaryOperator * Element * Element
    | UnaryOperator of UnaryOperator * Element
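To illustrate why the simplified shape makes a hand-written printer short, here is a Python sketch with one node class per arity and the operator kind stored as data. The `Nil` leaf, the class names and `show` are assumptions added for the example; the F# snippet above only shows the two operator cases.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class BinaryOperator(Enum):
    LOGICAL_OR = "LogicalOr"
    LOGICAL_AND = "LogicalAnd"
    BITWISE_OR = "BitwiseOr"


class UnaryOperator(Enum):
    STATEMENT = "Statement"
    BLOCK = "Block"
    INITIALIZER = "Initializer"


@dataclass
class Nil:  # hypothetical leaf so that the tree can terminate
    pass


@dataclass
class Binary:
    op: BinaryOperator
    left: "Node"
    right: "Node"


@dataclass
class Unary:
    op: UnaryOperator
    operand: "Node"


Node = Union[Nil, Binary, Unary]


def show(node, indent=""):
    """Return one line per node; only three shapes need handling."""
    deeper = indent + "  "
    if isinstance(node, Binary):
        return [indent + node.op.value] + show(node.left, deeper) + show(node.right, deeper)
    if isinstance(node, Unary):
        return [indent + node.op.value] + show(node.operand, deeper)
    return [indent + "Nil"]
```

Adding a new operator then means adding an enum member, not a new case in every function that walks the tree.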
