I am having trouble understanding the rules for adding a lookahead to a core production during the construction of the DFA. To illustrate my confusion, I will be using an online parser generator that exposes all of its internal calculations (this tool).
(The format is: NONTERMINAL -> RULE, LOOKAHEADS, where the lookaheads are separated by forward slashes.)
Using this grammar as an example:
S -> E
E -> ( E )
E -> N O E
E -> N
N -> 1
N -> 2
N -> 3
O -> +
O -> -
Copying and pasting the above grammar into the LALR parser generator produces a DFA with 12 states (click the >>). My question, finally: why are the goto(0, N) kernel productions ( {[E -> N.O E, $/)]; [E -> N., $/)]} ) initialized with the ) terminal? Where does the ) come from? I would expect goto(0, N) to be {[E -> N.O E, $]; [E -> N., $]}. Equally, the kernel production in goto(0, ( ) has an 'extra' ).
As the DFA is being constructed, equal cores are merged (the core is the set of productions that introduce a new state by performing closure on that set). State 2 has the production [E -> .N, )], which when merged with [E -> N., $] produces the correct output, but there's no way for state 0 to have known about the lookahead ).
Thanks in advance, and sorry if this was a confusing and overly specific question, and for relying on an external website to demonstrate my issue.✌️
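The solution is to propagate any newly found lookaheads: whenever a kernel item gains a lookahead, follow its 'goto' transitions and add that lookahead to the corresponding kernel items of the target states, repeating until nothing changes.
The method is described in Section 4.7.5 of the Dragon Book, 2nd ed.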
(here: https://github.com/muthukumarse/books/blob/master/Dragon%20Book%20Compilers%20Principle%20Techniques%20and%20Tools%202nd%20Edtion.pdf)
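To see where the ) comes from, here is a rough Python sketch. It does not implement the propagation method from the answer; instead it takes the simpler (and slower) route of building the canonical LR(1) states and then merging states with equal cores. All function names are made up for this illustration, and the FIRST computation assumes the grammar has no empty rules, which holds for the grammar in the question.

```python
from collections import defaultdict

# Grammar from the question; "$" marks end of input.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("(", "E", ")")), ("E", ("N", "O", "E")), ("E", ("N",)),
    ("N", ("1",)), ("N", ("2",)), ("N", ("3",)),
    ("O", ("+",)), ("O", ("-",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def first_sets():
    """FIRST by fixed point; only rhs[0] matters because no rule is empty."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first[rhs[0]] if rhs[0] in NONTERMINALS else {rhs[0]}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first

FIRST = first_sets()

def closure(items):
    """LR(1) closure; an item is (lhs, rhs, dot, lookahead)."""
    items, work = set(items), list(items)
    while work:
        _, rhs, dot, la = work.pop()
        if dot == len(rhs) or rhs[dot] not in NONTERMINALS:
            continue
        rest = rhs[dot + 1:]
        if not rest:
            lookaheads = {la}                 # nothing follows: inherit
        elif rest[0] in NONTERMINALS:
            lookaheads = FIRST[rest[0]]
        else:
            lookaheads = {rest[0]}
        for lhs2, rhs2 in GRAMMAR:
            if lhs2 == rhs[dot]:
                for b in lookaheads:
                    item = (lhs2, rhs2, 0, b)
                    if item not in items:
                        items.add(item)
                        work.append(item)
    return frozenset(items)

def goto(state, symbol):
    moved = {(l, r, d + 1, la) for l, r, d, la in state
             if d < len(r) and r[d] == symbol}
    return closure(moved)

def canonical_lr1():
    start = closure({("S", ("E",), 0, "$")})
    states, seen = [start], {start}
    for state in states:                      # the list grows while we walk it
        for sym in {r[d] for _, r, d, _ in state if d < len(r)}:
            target = goto(state, sym)
            if target not in seen:
                seen.add(target)
                states.append(target)
    return states

def merge_by_core(states):
    """Union the lookaheads of LR(1) states that share the same core."""
    merged = defaultdict(lambda: defaultdict(set))
    for state in states:
        core = frozenset((l, r, d) for l, r, d, _ in state)
        for l, r, d, la in state:
            merged[core][(l, r, d)].add(la)
    return merged

def lookaheads_after_merge(item):
    """Lookaheads attached to `item` in whichever merged state contains it."""
    for items in merge_by_core(canonical_lr1()).values():
        if item in items:
            return items[item]
    return set()

print(sorted(lookaheads_after_merge(("E", ("N",), 1))))   # item E -> N .
```

In the canonical LR(1) automaton, [E -> N., $] is reached from state 0 and [E -> N., )] is reached from the state after "(". Both states have the same core, so the merged LALR state carries both lookaheads, which is the $/) the tool displays.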
I am working on simple list functions in Erlang to learn the syntax.
Everything was looking very similar to code I wrote for the Prolog version of these functions until I got to an implementation of 'intersection'.
The cleanest solution I could come up with:
myIntersection([],_) -> [];
myIntersection([X|Xs],Ys) ->
UseFirst = myMember(X,Ys),
myIntersection(UseFirst,X,Xs,Ys).
myIntersection(true,X,Xs,Ys) ->
[X|myIntersection(Xs,Ys)];
myIntersection(_,_,Xs,Ys) ->
myIntersection(Xs,Ys).
To me, this feels slightly like a hack. Is there a more canonical way to handle this? By 'canonical', I mean an implementation true to the spirit of Erlang's design.
Note: the essence of this question is conditional handling of user-defined predicate functions. I am not asking for someone to point me to a library function. Thanks!
I like this one:
inter(L1,L2) -> inter(lists:sort(L1),lists:sort(L2),[]).
inter([H1|T1],[H1|T2],Acc) -> inter(T1,T2,[H1|Acc]);
inter([H1|T1],[H2|T2],Acc) when H1 < H2 -> inter(T1,[H2|T2],Acc);
inter([H1|T1],[_|T2],Acc) -> inter([H1|T1],T2,Acc);
inter([],_,Acc) -> Acc;
inter(_,_,Acc) -> Acc.
it gives the exact intersection:
inter("abcd","efgh") -> []
inter("abcd","efagh") -> "a"
inter("abcd","efagah") -> "a"
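inter("agbacd","eafagha") -> "gaa"
Note that the accumulator is built by prepending, so the matches come out in descending order; wrap the final Acc in lists:reverse/1 if you want ascending order. If you want each value to appear only once, simply replace one of the lists:sort/1 calls with lists:usort/1.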
Edit
As @9000 says, one clause is useless:
inter(L1,L2) -> inter(lists:sort(L1),lists:sort(L2),[]).
inter([H1|T1],[H1|T2],Acc) -> inter(T1,T2,[H1|Acc]);
inter([H1|T1],[H2|T2],Acc) when H1 < H2 -> inter(T1,[H2|T2],Acc);
inter([H1|T1],[_|T2],Acc) -> inter([H1|T1],T2,Acc);
inter(_,_,Acc) -> Acc.
gives the same result, and
inter(L1,L2) -> inter(lists:usort(L1),lists:sort(L2),[]).
inter([H1|T1],[H1|T2],Acc) -> inter(T1,T2,[H1|Acc]);
inter([H1|T1],[H2|T2],Acc) when H1 < H2 -> inter(T1,[H2|T2],Acc);
inter([H1|T1],[_|T2],Acc) -> inter([H1|T1],T2,Acc);
inter(_,_,Acc) -> Acc.
removes any duplicate in the output.
If you know that there are no duplicate values in the input lists, I think that
inter(L1,L2) -> [X || X <- L1, Y <- L2, X == Y].
is the shortest solution, but it is much slower: about 1 second to evaluate the intersection of two lists of 10 000 elements, compared to 16 ms for the previous solution. Its O(n²) complexity is comparable to @David Varela's proposal. With two lists of 100 000 elements the ratio is 70 s compared to 280 ms, and I guess there is a very high risk of running out of memory with bigger lists.
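For readers who want to compare the two strategies outside Erlang, here is a small Python sketch of both: a sorted merge that walks the two lists in lockstep, and the quadratic nested scan that mirrors the list comprehension. The function names are made up for this illustration, and no timings are claimed for it.

```python
def merge_intersection(xs, ys):
    """Multiset intersection: sort both inputs, then walk them in lockstep."""
    a, b = sorted(xs), sorted(ys)
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:          # same head on both sides: keep it
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:         # smaller head cannot match anything later
            i += 1
        else:
            j += 1
    return out


def nested_intersection(xs, ys):
    """Quadratic version: compares every element of xs with every element of ys."""
    return [x for x in xs for y in ys if x == y]


print("".join(merge_intersection("agbacd", "eafagha")))
print("".join(nested_intersection("abcd", "efagh")))
```

Unlike the Erlang accumulator version, this sketch appends matches, so the merged result comes out in ascending order.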
The canonical way ("canonical" as in "SICP") is to use an accumulator.
myIntersection(A, B) -> myIntersectionInner(A, B, []).
myIntersectionInner([], _, Acc) -> Acc;
myIntersectionInner(_, [], Acc) -> Acc;
myIntersectionInner([A|As], Bs, Acc) ->
    case myMember(A, Bs) of
        true ->
            myIntersectionInner(As, Bs, [A|Acc]);
        false ->
            myIntersectionInner(As, Bs, Acc)
    end.
This implementation of course produces duplicates if the first input contains duplicates that also occur in the second. This can be fixed at the expense of calling myMember(A, Acc) and only adding A if the result is negative.
My apologies for the approximate syntax.
Although I appreciate the efficient implementations suggested, my intention was to better understand Erlang's implementation. As a beginner, I think @7stud's comment, particularly http://erlang.org/pipermail/erlang-questions/2009-December/048101.html, was the most illuminating. In essence, 'case' and pattern matching in functions use the same mechanism under the hood, although functions should be preferred for clarity.
In a real system, I would go with one of @Pascal's implementations, depending on whether 'intersect' did any heavy lifting.
Please explain how to design the context sensitive grammar of the above language.
I am new to context sensitive grammar.
Can this be a solution?
A -> aa
AA -> AAAA
AAAA -> AAAAAAAA
and so on
We get
A^i -> A^i.A^i , i>=1
and
A -> aa
The idea is to have a symbol that will 'track' across the sentential form and double everything:
S -> ERAE
RA -> AAR
RE -> LE | F
AL -> LA
EL -> ER
AF -> Fa
EF -> ε
This is just off the top of my head and may be wrong, but hopefully the idea comes through and you can give a proper answer.
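One way to sanity-check rules like these is to apply them mechanically. The sketch below is only an illustration (the `derive` helper and the length bound are assumptions, not part of the answer): it does a breadth-first search over sentential forms, treating each rule as a plain string rewrite, and collects the forms that consist only of the terminal a.

```python
from collections import deque

# The rules from the answer, written as (left-hand side, right-hand side).
# "RE -> LE | F" is split into two rules; epsilon is the empty string.
RULES = [
    ("S", "ERAE"),
    ("RA", "AAR"),
    ("RE", "LE"),
    ("RE", "F"),
    ("AL", "LA"),
    ("EL", "ER"),
    ("AF", "Fa"),
    ("EF", ""),
]


def derive(max_len):
    """Breadth-first search over sentential forms no longer than max_len."""
    seen = {"S"}
    queue = deque(["S"])
    words = set()
    while queue:
        form = queue.popleft()
        if form and set(form) == {"a"}:      # only terminals left
            words.add(form)
            continue
        for lhs, rhs in RULES:
            start = form.find(lhs)
            while start != -1:               # try every occurrence of lhs
                nxt = form[:start] + rhs + form[start + len(lhs):]
                if len(nxt) <= max_len and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                start = form.find(lhs, start + 1)
    return sorted(words, key=len)


print([len(w) for w in derive(12)])   # [2, 4, 8]
```

With the bound of 12 symbols the only words found have lengths 2, 4 and 8: each sweep of R doubles the A's, and F can only appear once a sweep has reached the right-hand E.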
I think the solution you gave is wrong, as it seems to propose an infinite number of rules - why not just have a rule for each possible string?
I'm currently constructing LR(1) states from the following grammar.
S->AS
S->c
A->aA
A->b
where A,S are nonterminals and a,b,c are terminals.
This is the construction of I0
I0: S' -> .S, epsilon
---------------
S -> .AS, epsilon
S -> .c, epsilon
---------------
S -> .AS, a
S -> .c, c
A -> .aA, a
A -> .b, b
And I1.
From S, I1: S' -> S., epsilon //DONE
And so on. But when I get to constructing I4...
From a, I4: A -> a.A, a
-----------
A -> .aA, a
A -> .b, b
The problem is
A -> .aA
When I attempt to construct the next state from a, I'm going to once again get the exact same content of I4, and this continues infinitely. A similar loop occurs with
S -> .AS
So, what am I doing wrong? There has to be some detail that I'm missing, but I've browsed my notes and my book and either can't find or just don't understand what's wrong here. Any help?
I'm pretty sure I figured out the answer. Obviously, states can point to each other, which eliminates the need to create a new one if its contents already exist. I'd still like it if someone could confirm this, though.
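That reading can be checked with a small sketch. The Python below is an illustration only (the helper names are invented, the FIRST sets are worked out by hand for this grammar, and "$" is used as the end marker where the question writes epsilon). It builds the LR(1) states and keeps an index of item sets already seen, so a goto that lands on an existing item set just becomes an edge to the old state.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("A", "S")), ("S", ("c",)),
           ("A", ("a", "A")), ("A", ("b",))]
NONTERMINALS = {"S'", "S", "A"}
# FIRST sets worked out by hand; valid because no rule is empty.
FIRST = {"S'": {"a", "b", "c"}, "S": {"a", "b", "c"}, "A": {"a", "b"}}


def closure(items):
    """LR(1) closure; an item is (lhs, rhs, dot, lookahead)."""
    items, work = set(items), list(items)
    while work:
        _, rhs, dot, la = work.pop()
        if dot == len(rhs) or rhs[dot] not in NONTERMINALS:
            continue
        rest = rhs[dot + 1:]
        if not rest:
            lookaheads = {la}
        elif rest[0] in NONTERMINALS:
            lookaheads = FIRST[rest[0]]
        else:
            lookaheads = {rest[0]}
        for lhs2, rhs2 in GRAMMAR:
            if lhs2 == rhs[dot]:
                for b in lookaheads:
                    item = (lhs2, rhs2, 0, b)
                    if item not in items:
                        items.add(item)
                        work.append(item)
    return frozenset(items)


def goto(state, symbol):
    moved = {(l, r, d + 1, la) for l, r, d, la in state
             if d < len(r) and r[d] == symbol}
    return closure(moved)


def build():
    start = closure({("S'", ("S",), 0, "$")})
    states, index, edges = [start], {start: 0}, {}
    for i, state in enumerate(states):        # the list grows while we walk it
        for sym in sorted({r[d] for _, r, d, _ in state if d < len(r)}):
            target = goto(state, sym)
            if target not in index:           # genuinely new item set
                index[target] = len(states)
                states.append(target)
            edges[(i, sym)] = index[target]   # otherwise point at the old one
    return states, edges


states, edges = build()
after_a = edges[(0, "a")]
print(len(states), edges[(after_a, "a")] == after_a)   # 8 True
```

The state reached on a has an edge back to itself on a, so the construction terminates with a finite number of states instead of looping.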
I've been trying to get into F# on and off for a while but I keep getting put off. Why?
Because no matter which 'beginners' resource I try to look at I see very simple examples that start using the operator ->.
However, I have yet to find one that provides a clear, simple explanation of what this operator means. It's as though it must be so obvious that it doesn't need explanation, even to complete newbies.
I must therefore be really dense or perhaps it's nearly 3 decades of previous experience holding me back.
Can someone please, explain it or point to a truly accessible resource that explains it?
'->' is not an operator. It appears in the F# syntax in a number of places, and its meaning depends on how it is used as part of a larger construct.
Inside a type, '->' describes function types as people have described above. For example
let f : int -> int = ...
says that 'f' is a function that takes an int and returns an int.
Inside a lambda ("thing that starts with 'fun' keyword"), '->' is syntax that separates the arguments from the body. For example
fun x y -> x + y + 1
is an expression that defines a two argument function with the given implementation.
Inside a "match" construct, '->' is syntax that separates patterns from the code that should run if the pattern is matched. For example, in
match someList with
| [] -> 0
| h::t -> 1
the stuff to the left of each '->' are patterns, and the stuff on the right is what happens if the pattern on the left was matched.
The difficulty in understanding may be rooted in the faulty assumption that '->' is "an operator" with a single meaning. An analogy might be "." in C#: if you have never seen any code before and try to analyze the "." operator based on looking at "obj.Method" and "3.14" and "System.Collections", you may get very confused, because the symbol has different meanings in different contexts. Once you know enough of the language to recognize these contexts, however, things become clear.
It basically means "maps to". Read it that way or as "is transformed into" or something like that.
So, from the F# in 20 minutes tutorial,
> List.map (fun x -> x % 2 = 0) [1 .. 10];;
val it : bool list
= [false; true; false; true; false; true; false; true; false; true]
The code (fun x -> x % 2 = 0) defines
an anonymous function, called a lambda
expression, that has a parameter x and
the function returns the result of "x
% 2 = 0", which is whether or not x is
even.
First question - are you familiar with lambda expressions in C#? If so the -> in F# is the same as the => in C# (I think you read it 'goes to').
The -> operator can also be found in the context of pattern matching
match x with
| 1 -> dosomething
| _ -> dosomethingelse
I'm not sure if this is also a lambda expression, or something else, but I guess the 'goes to' still holds.
Maybe what you are really referring to is the F# parser's 'cryptic' responses:
> let add a b = a + b
val add: int -> int -> int
This means (as most of the examples explain) that add is a 'val' that takes two ints and returns an int. To me this was totally opaque to start with. I mean, how do I know that add isn't a val that takes one int and returns two ints?
Well, the thing is that in a sense, it does. If I give add just one int, I get back an (int -> int):
> let inc = add 1
val inc: int -> int
This (currying) is one of the things that makes F# so sexy, for me.
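If the arrows are easier to read in a language without automatic currying, here is the same idea spelled out by hand in Python. It is only an analogy: Python does not curry functions for you, so the nesting that F# does implicitly is written explicitly, and the names mirror the add and inc from the session above.

```python
from typing import Callable


def add(a: int) -> Callable[[int], int]:
    """Curried add: takes one int and returns a function awaiting the second."""
    def add_a(b: int) -> int:
        return a + b
    return add_a


inc = add(1)        # like `let inc = add 1`, whose type is int -> int
print(inc(41))      # 42
print(add(2)(3))    # 5
```

Reading int -> int -> int as "takes an int and returns an (int -> int)" corresponds to the return annotation Callable[[int], int] here.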
For helpful info on F#, I have found that blogs are FAR more useful than any of the official 'documentation'. Here are some names to check out:
Dustin Campbell (that's diditwith.net, cited in another answer)
Don Syme ('the' man)
Tomasp.net (aka Tomas Petricek)
Andrew Kennedy (for units of measure)
Fsharp.it (famous for the Project Euler solutions)
http://lorgonblog.spaces.live.com/Blog (aka Brian)
Jomo Fisher
(a -> b) means "function from a to b". In a type annotation, it denotes a function type. For example, f : (int -> string) means that f refers to a function that takes an integer and returns a string. It is also used as a constructor of such values, as in
let f : (int -> int) = fun n -> n * 2
which creates a value which is a function from some number n to that same number multiplied by two.
There are plenty of great answers here already, I just want to add to the conversation another way of thinking about it.
' -> ' means function.
'a -> 'b is a function that takes an 'a and returns a 'b
('a * 'b) -> ('c * 'd) is a function that takes a tuple of type ('a, 'b) and returns a tuple of ('c, 'd). Such as int/string returns float/char.
Where it gets interesting is in the cascade case of 'a -> 'b -> 'c. This is a function that takes an 'a and returns a function ('b -> 'c); equivalently, you can read it as a function that takes an 'a and a 'b and returns a 'c.
So if you write:
let f x y z = ()
The type will be f : 'a -> 'b -> 'c -> unit, so if you only applied the first parameter, the result would be a partially applied function of type 'b -> 'c -> unit.
From Microsoft:
Function types are the types given to
first-class function values and are
written int -> int. They are similar
to .NET delegate types, except they
aren't given names. All F# function
identifiers can be used as first-class
function values, and anonymous
function values can be created using
the (fun ... -> ...) expression form.
Many great answers to this question, thanks people. I'd like to put here an editable answer that brings things together.
For those familiar with C#, understanding -> as being the same as the => of a lambda expression is a good first step. This usage:-
fun x y -> x + y + 1
Can be understood as the equivalent to:-
(x, y) => x + y + 1;
However, it's clear that -> has a more fundamental meaning, which stems from the concept that a function that takes two parameters, such as the above, can be reduced (is that the correct term?) to a series of functions each taking only one parameter.
Hence when the above is described like this:-
Int -> Int -> Int
It really helped to know that -> is right associative hence the above can be considered:-
Int -> (Int -> Int)
Aha! We have a function that takes Int and returns (Int -> Int) (a curried function?).
The explanation that -> can also appear as part of a type definition also helped. (Int -> Int) is the type of any function which takes an Int and returns an Int.
Also helpful is that -> appears in other syntax such as matching, but there it doesn't have the same meaning? Is that correct? I'm not sure it is. I suspect it has the same meaning but I don't have the vocabulary to express that yet.
Note the purpose of this answer is not to spawn further answers but to be collaboratively edited by you people to create a more definitive answer. Ultimately it would be good if all the uncertainties and fluff (such as this paragraph) were removed and better examples added. Let's try to keep this answer as accessible to the uninitiated as possible.
In the context of defining a function, it is similar to => from the lambda expression in C# 3.0.
F#: let f = fun x -> x*x
C#: Func<int, int> f = x => x * x;
The -> in F# is also used in pattern matching, where it means: if the expression matches the part between | and ->, then what comes after -> should be given back as the result:
let isOne x = match x with
| 1 -> true
| _ -> false
The nice thing about languages such as Haskell (it's very similar in F#, but I don't know the exact syntax -- this should help you understand ->, though) is that you can apply only parts of the argument, to create curried functions:
adder n x y = n + x + y
In other words: "give me three things, and I'll add them together". When you throw numbers at it, the compiler will infer the types of n, x and y. Say you write
adder 1 2 3
The type of 1, 2 and 3 is Int. Therefore:
adder :: Int -> Int -> Int -> Int
That is, give me three integers, and I will become an integer, eventually, or the same thing as saying:
five :: Int
five = 5
But, here's the nice part! Try this:
add5 = adder 5
As you remember, adder takes an int, an int, an int, and gives you back an int. However, that is not the entire truth, as you'll see shortly. In fact, add5 will have this type:
add5 :: Int -> Int -> Int
It will be as if you have "peeled off" one of the integers (the left-most), and glued it directly to the function. Looking closer at the function signature, we notice that the -> are right-associative, i.e.:
adder :: Int -> (Int -> (Int -> Int))
This should make it quite clear: when you give adder the first integer, it'll evaluate to whatever's to the right of the first arrow, or:
add5andtwomore :: Int -> (Int -> Int)
add5andtwomore = adder 5
Now you can use add5andtwomore instead of "adder 5". This way, you can apply another integer to get (say) "add5and7andonemore":
add5and7andonemore :: Int -> Int
add5and7andonemore = adder 5 7
As you see, add5and7andonemore wants exactly another argument, and when you give it one, it will suddenly become an integer!
> add5and7andonemore 9
=> ((add5andtwomore) 7) 9
=> ((adder 5) 7) 9
<=> adder 5 7 9
Substituting the parameters to adder (n x y) for (5 7 9), we get:
> adder 5 7 9 = 5 + 7 + 9
=> 5 + 7 + 9
=> 21
In fact, plus is also just a function that takes two ints and gives you back another int (and it associates to the left), so the above is really more like:
> 5 + 7 + 9
=> (+ (+ 5 7) 9)
=> (+ 12 9)
=> 21
There you go!