Purpose of tick (apostrophe) in F# value names?

I'm going through a tutorial on Function Composition, and I keep seeing the ' operator used at the end of a value declaration.
I know that it means a generic when it precedes a parameter, but what does it mean when you see it like:
let add x y = x + y
let myFunc' = add 10
The only thing I can see is that the ' is just another character in the identifier. Is that right? Because if I use that same example, using myFunc gives a not defined error, where myFunc' does resolve.

Yes, as @Lee pointed out, ' is a valid identifier character.
By convention, though, a trailing ' marks the value as related or similar to the value with the same name without the '. The convention is borrowed from mathematics (F# being a functional language), where the mark is read as "prime": A is pronounced "ay" and A' is pronounced "ay-prime".
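As a minimal sketch of that convention (the names x, x', x'' and add' are made up for illustration and are not from the tutorial in the question), the tick is simply part of the name, so each binding below is a separate identifier:

```fsharp
let add x y = x + y

// The tick is part of the name: add and add' are two distinct identifiers.
let add' = add 10

let x = 1
let x' = x + 1     // "x prime": a value derived from x
let x'' = x' * 2   // ticks can be repeated

printfn "%d %d %d" (add' 5) x' x''   // prints "15 2 4"
```

This also explains the error in the question: only myFunc' was ever defined, so myFunc does not resolve.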

Yes, ' is a valid identifier character, although it cannot be the first one. The structure of identifiers is defined in the specification:
3.4 Identifiers and Keywords
ident-text = ident-start-char ident-char*
ident-char = letter-char | digit-char | connecting-char |
combining-char | formatting-char | ' | _

Difference between discriminated Union types in F#

I'm reading about F# and looking at people's source code and I sometimes see
type test =
    | typeone
    | typetwo
And sometimes I see
type test = typeone | typetwo
One of them has a leading pipe and the other doesn't. At first I thought one was an enum and the other a discriminated union, but I THINK they are the same. Can someone explain the difference, if there is any?
There is no difference. These notations are completely equivalent. The leading pipe character is optional.
Having this first pipe optional helps make the code look nicer in different circumstances. In particular, if my type has many cases, and each case has a lot of data, it makes sense to put them on separate lines. In this case, the leading pipe makes them look visually aligned, so that the reader perceives them as a single logical unit:
type Large =
    | Case1 of int * string
    | Case2 of bool
    | SomeOtherCase
    | FinalCase of SomeOtherType
On the other hand, if I only need two or three cases, I can put them on one line. In that case, the leading pipe only gets in the way, creating a feeling of clutter:
type QuickNSmall = One | Two | Three
There is no difference.
In the spec, the first | is optional.
The relevant bit of the spec is this:
union-type-cases := '|'opt union-type-case '|' ... '|' union-type-case
An enum would need to give explicit values to the cases, like
type test =
    | typeone = 1
    | typetwo = 2
As already mentioned, the leading | is optional.
The examples in the other answers do not show this, so it is worth adding that you can omit it even for a multi-line discriminated union (and include it when defining a single line union):
type Large =
      Case1 of int * string
    | Case2 of bool
    | SomeOtherCase
    | FinalCase of SomeOtherType
type QuickNSmall = | One | Two | Three
I think most people just find these ugly (myself included!) and so they are usually written the way you see in the other answers.
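To make the points from these answers concrete, here is a small sketch (all type and case names are invented for illustration). The first two types differ only in the optional leading pipe; the third becomes an enum because every case is given an explicit constant:

```fsharp
// Same union shape, written with and without the optional leading pipe.
type WithPipe = | A1 | B1
type WithoutPipe = A2 | B2

// Giving every case an explicit constant makes the type an enum instead.
type Color =
    | Red = 1
    | Green = 2

let describe value =
    match value with
    | A2 -> "first"
    | B2 -> "second"

printfn "%s %d" (describe B2) (int Color.Green)   // prints "second 2"
```

Note that enum cases are referenced with the type name (Color.Green) and convert to their underlying integer, whereas union cases carry no numeric value.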

How to remove ambiguity from this piece of Delphi grammar

I'm working on a Delphi Grammar in Rascal and I'm having some problems parsing its “record” type. The relevant section of Delphi code can look as follows:
record
private
a,b,c : Integer;
x : Cardinal;
end
Where the "private" can be optional, and the variable declaration lines can also be optional.
I tried to interpret this section using the rules below:
syntax FieldDecl = IdentList ":" Type
| IdentList ":" Type ";"
;
syntax FieldSection = FieldDecl
| "var" FieldDecl
| "class" "var" FieldDecl
;
syntax Visibility = "private" | "protected" | "public"| "published" ;
syntax VisibilitySectionContent = FieldSection
| MethodOrProperty
| ConstSection
| TypeSection
;
syntax VisibilitySection = Visibility? VisibilitySectionContent+
;
syntax RecordType = "record" "end"
| "record" VisibilitySection+ "end"
;
The problem is ambiguity. The entire text between “record” and “end” can be parsed as a single VisibilitySection, but every line on its own can also be a separate VisibilitySection.
I can change the rule VisibilitySection to
syntax VisibilitySection = Visibility
| VisibilitySectionContent
;
Then the grammar is no longer ambiguous, but the VisibilitySection becomes flat: the variable lines are no longer nested under an optional 'private' node, which is what I would prefer.
Any suggestions on how to solve this problem? What I would like to do is demand a longest (greedy) match on the VisibilitySectionContent+ symbol of VisibilitySection.
But changing
syntax VisibilitySection = Visibility? VisibilitySectionContent+
to
syntax VisibilitySection = Visibility? VisibilitySectionContent+ !>> VisibilitySectionContent
does not seem to work for this.
I also ran Rascal's ambiguity report tool, but it did not give me any insights.
Any thoughts?
Thanks
I can't check since you did not provide the full grammar, but I believe this should work to get your "longest match" behavior:
syntax VisibilitySection
= Visibility? VisibilitySectionContent+ ()
>> "public"
>> "private"
>> "published"
>> "protected"
>> "end"
;
In my mind this should remove the interpretation where your nested VisibilitySections are cut short. Now we only accept such sections if they are immediately followed by either the end of the record, or the next section. I'm curious to find out if it really works because it is always hard to predict the behavior of a grammar :-)
The () at the end of the rule (empty non-terminal) makes sure we can skip to the start of the next part before applying the restriction. This only works if you have a longest match rule on layout already somewhere in the grammar.
The VisibilitySectionContent+ in VisibilitySection should be VisibilitySectionContent (without the Kleene plus).
I’m guessing here, but your intention is probably to allow a number of sections/declarations within the record type, and any of those may or may not have a Visibility modifier. To avoid putting this optional Visibility in every section, you have created a VisibilitySectionContent nonterminal which basically models “things that can happen within the record type definition”, one thing per nonterminal, without worrying about visibility modifiers. In this case, you’re fine with one VisibilitySectionContent per VisibilitySection since there is explicit repetition when you refer to the VisibilitySection from the RecordType anyway.

How to resolve Xtext variables' names and keywords statically?

I have a grammar describing an assembler dialect. In the code section, the programmer can refer to registers from a fixed list and to defined variables. I also have a rule that matches both [reg0++413] and [myVariable++413]:
BinaryBiasInsideFetchOperation:
'['
v = (Register|[IntegerVariableDeclaration]) ( gbo = GetBiasOperation val = (Register|IntValue|HexValue) )?
']'
;
But when I try to compile it, Xtext throws a warning:
Decision can match input such as "'[' '++' 'reg0' ']'" using multiple alternatives: 2, 3. As a result, alternative(s) 3 were disabled for that input
Splitting the rule, I've noticed that
BinaryBiasInsideFetchOperation:
'['
v = Register ( gbo = GetBiasOperation val = (Register|IntValue|HexValue) )?
']'
;
BinaryBiasInsideFetchOperation:
'['
v = [IntegerVariableDeclaration] ( gbo = GetBiasOperation val = (Register|IntValue|HexValue) )?
']'
;
work well separately, but not at the same time. When I try to compile both of them, Xtext reports a number of errors saying that registers from the list could be processed ambiguously. So:
1) Am I right that the rule fragment v = (Register|[IntegerVariableDeclaration]) matches any IntegerVariable name, including an empty one, while v = [IntegerVariableDeclaration] matches only nonempty names?
2) Is it correct that when I try to compile the separate rules together, Xtext thinks that [IntegerVariableDeclaration] can clash with Register?
3) How can I resolve this ambiguity?
Edit: definitions
Register:
areg = ('reg0' | 'reg1' | 'reg2' | 'reg3' | 'reg4' | 'reg5' | 'reg6' | 'reg7' )
;
IntegerVariableDeclaration:
section = SectionServiceWord? name=ID ':' type = IntegerType ('[' size = IntValue ']')? ( value = IntegerVariableDefinition )? ';'
;
ID is the standard terminal that parses a single word, i.e. an identifier.
No, (Register|[IntegerVariableDeclaration]) can't match an empty name. In fact, [IntegerVariableDeclaration] is the same as [IntegerVariableDeclaration|ID]: the cross-reference is matched by the ID rule.
Yes, I think you can't split your rules.
I can't reproduce your problem (I would need the full grammar), but to solve it you should look at this article about Xtext grammar debugging:
Compile grammar in debug mode by adding the following line into your workflow.mwe2
fragment = org.eclipse.xtext.generator.parser.antlr.DebugAntlrGeneratorFragment {}
Open the generated ANTLR debug grammar with ANTLRWorks and check the diagram.
In addition to Fabien's answer, I'd like to add that a catch-all rule like
AnyId:
name = ID
;
instead of
(Register|[IntegerVariableDeclaration])
solves the problem. One then needs to check dynamically whether AnyId.name is a Register, a Variable, or something else such as a Constant.

How to create a parser which means any char not in ['(',')','{','}'], in PetitParserDart?

I want to define a parser which accepts any char except ['(', ')', '{', '}'] in PetitParserDart.
I tried:
char('(').not() & char(')').not() & char('{').not() & char('}')
I'm not sure if it's correct. Is there a simpler way to do this (something like chars('(){}').neg())?
This matches any character except the ones listed after the caret ^. It is the character class of all characters other than the listed ones:
pattern('^(){}');
This also works (note the .not() on the last character, and the any() to actually consume the character):
char('(').not() & char(')').not() & char('{').not() & char('}').not() & any()
And this one works as well:
anyIn('(){}').neg()
Which is equivalent to:
(anyIn('(){}').not() & any()).pick(1)
And another alternative is:
(char('(') | char(')') | char('{') | char('}')).neg()
Except for the second example, all examples return the parsed character (this can be easily fixed, but I wanted to stay close to your question). The first example is probably the easiest to understand, but depending on context you might prefer one of the alternatives.

What's with "Uppercase variable identifiers should not generally be used in patterns..."?

This compiler like:
let test Xf Yf = Xf + Yf
This compiler no like:
let test Xfd Yfd = Xfd + Yfd
Warning:
Uppercase variable identifiers should not generally be used in patterns, and may indicate a misspelt pattern name.
Maybe I'm not googling properly, but I haven't managed to track down anything which explains why this is the case for function parameters...
I agree that this warning message looks a bit mysterious, but there is a good motivation for it. According to the F# naming guidelines, cases of discriminated unions should be named using PascalCase, and the compiler is trying to make sure that you don't accidentally misspell the name of a case in pattern matching.
For example, if you have the following union:
type Side =
    | Left
    | Right
You could write the following function that prints "ok" when the argument is Left and "wrong!" otherwise:
let foo a =
    match a with
    | Lef -> printfn "ok"
    | _ -> printfn "wrong!"
There is a typo in the code (I wrote just Lef), but the code is still valid, because Lef can be interpreted as a new variable. The match therefore binds whichever side was passed to Lef and always runs the first case. The warning about uppercase identifiers helps to avoid this.
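A runnable sketch of the same pitfall, assuming warnings are not treated as errors (the function names buggy and corrected are made up for illustration, and the functions return strings instead of printing so the results are easy to compare):

```fsharp
type Side =
    | Left
    | Right

// Typo: Lef is not a union case, so it is a fresh variable that binds any value.
// The compiler warns about the uppercase identifier and the unreachable second rule.
let buggy side =
    match side with
    | Lef -> "ok"
    | _ -> "wrong!"

// Correct spelling: only Left takes the first branch.
let corrected side =
    match side with
    | Left -> "ok"
    | _ -> "wrong!"

printfn "%s" (buggy Right)       // prints "ok"
printfn "%s" (corrected Right)   // prints "wrong!"
```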
F# tries to enforce case rules for active patterns. Consider what this code does:
let f X =
    match X with
    | X -> 1
    | _ -> 2
This is quite confusing. Also, function parameters are similar to patterns; for example, you can write
let f (a,b,_) = a,b
I'm not quite sure why it is the third letter that triggers the warning, though.