Get line number in yecc - parsing

I'm using yecc to parse my tokenized asm-like code. After providing code like "MOV [1], [2]\nJMP hello" and running the lexer, this is the token list I get:
[{:opcode, 1, :MOV}, {:register, 1, 1}, {:",", 1}, {:register, 1, 2},
{:opcode, 2, :JMP}, {:identifer, 2, :hello}]
When I parse this I'm getting
[%{operation: [:MOV, [:REGISTER, 1], [:REGISTER, 2]]},
%{operation: [:JMP, [:CONST, :hello]]}]
But I want every operation to carry its line number, so that I can produce meaningful errors later in the code.
So I changed my parser to this:
Nonterminals
code statement operation value.
Terminals
label identifer integer ',' opcode register address address_in_register line_number.
Rootsymbol code.
code -> line_number statement : [{get_line('$1'), '$2'}].
code -> line_number statement code : [{get_line('$1'), '$2'} | '$3'].
%code -> statement : ['$1'].
%code -> statement code : ['$1' | '$2'].
statement -> label : #{'label' => label('$1')}.
statement -> operation : #{'operation' => '$1'}.
operation -> opcode value ',' value : [operation('$1'), '$2', '$4'].
operation -> opcode value : [operation('$1'), '$2'].
operation -> opcode identifer : [operation('$1'), value('$2')].
operation -> opcode : [operation('$1')].
value -> integer : value('$1').
value -> register : value('$1').
value -> address : value('$1').
value -> address_in_register : value('$1').
Erlang code.
get_line({_, Line, _}) -> Line.
operation({opcode, _, OpcodeName}) -> OpcodeName.
label({label, _, Value}) -> Value.
value({identifer, _, Value}) -> ['CONST', Value];
value({integer, _, Value}) -> ['CONST', Value];
value({register, _, Value}) -> ['REGISTER', Value];
value({address, _, Value}) -> ['ADDRESS', Value];
value({address_in_register, _, Value}) -> ['ADDRESS_IN_REGISTER', Value].
(The commented-out rules are the old, working ones.)
Now I'm getting
{:error, {1, :assembler_parser, ['syntax error before: ', ['\'MOV\'']]}}
for the same input. How can I fix this?

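The new rules expect a line_number token in front of every statement, but the token list shown above contains none, so the parser fails on the very first token (MOV). My suggestion is to keep the line numbers inside the tokens, rather than adding separate tokens, and then change how you build the operations.
So I would suggest this: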
operation -> opcode value ',' value : [operation('$1'), line('$1'), '$2', '$4'].
operation -> opcode value : [operation('$1'), line('$1'), '$2'].
operation -> opcode identifer : [operation('$1'), line('$1'), value('$2')].
operation -> opcode : [operation('$1'), line('$1')].
where line/1 is a new helper in the Erlang code section:
line({_, Line, _}) -> Line.
Or even this if you want to mirror Elixir AST:
operation -> opcode value ',' value : {operation('$1'), meta('$1'), ['$2', '$4']}.
operation -> opcode value : {operation('$1'), meta('$1'), ['$2']}.
operation -> opcode identifer : {operation('$1'), meta('$1'), [value('$2')]}.
operation -> opcode : {operation('$1'), meta('$1'), []}.
with meta/1 defined in the Erlang code section:
meta({_, Line, _}) -> [{line, Line}].
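The helpers used in these semantic actions can be tried outside yecc. The following is a minimal sketch, assuming the 3-tuple token shape {Category, Line, Value} from the lexer output in the question. The module name token_meta and the build/2 wrapper are hypothetical; build/2 only stands in for the action of a grammar rule.

```erlang
-module(token_meta).
-export([line/1, meta/1, operation/1, build/2]).

%% Every value-carrying token has the shape {Category, Line, Value}.
%% Two-element tokens such as {',', 1} are not handled here.
line({_Category, Line, _Value}) -> Line.

%% Elixir-AST-style metadata: a keyword list holding the line.
meta(Token) -> [{line, line(Token)}].

operation({opcode, _Line, Name}) -> Name.

%% Hypothetical stand-in for the action of
%%   operation -> opcode value ',' value
%% Args are the already-built values, e.g. ['REGISTER', 1].
build(OpcodeToken, Args) ->
    {operation(OpcodeToken), meta(OpcodeToken), Args}.
```

For example, token_meta:build({opcode, 2, 'JMP'}, [['CONST', hello]]) returns {'JMP', [{line, 2}], [['CONST', hello]]}, so the line number travels with the operation without any extra token in the grammar.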

Related

How should Erlang filter the elements in the list, and add punctuation and []?

-module(solarSystem).
-export([process_csv/1, is_numeric/1, parseALine/2, parse/1, expandT/1, expandT/2,
parseNames/1]).
parseALine(false, T) ->
T;
parseALine(true, T) ->
T.
parse([Name, Colour, Distance, Angle, AngleVelocity, Radius, "1" | T]) ->
T;%Where T is a list of names of other objects in the solar system
parse([Name, Colour, Distance, Angle, AngleVelocity, Radius | T]) ->
T.
parseNames([H | T]) ->
H.
expandT(T) ->
T.
expandT([], Sep) ->
[];
expandT([H | T], Sep) ->
T.
% https://rosettacode.org/wiki/Determine_if_a_string_is_numeric#Erlang
is_numeric(L) ->
S = trim(L, ""),
Float = (catch erlang:list_to_float(S)),
Int = (catch erlang:list_to_integer(S)),
is_number(Float) orelse is_number(Int).
trim(A) ->
A.
trim([], A) ->
A;
trim([32 | T], A) ->
trim(T, A);
trim([H | T], A) ->
trim(T, A ++ [H]).
process_csv(L) ->
X = parse(L),
expandT(X).
The problem is that a main function will call the process_csv/1 function in my module, and L will be the content of a file, like this:
[["name "," col"," dist"," a"," angv"," r "," ..."],["apollo11 ","white"," 0.1"," 0"," 77760"," 0.15"]]
Or like this:
["planets ","earth","venus "]
Or like this:
["a","b"]
I need to display it as follows:
apollo11 =["white", 0.1, 0, 77760, 0.15,[]];
Planets =[earth,venus]
a,b
[[59],[97],[44],[98]]
My problem is that no matter what changes I make, it only shows part of the output, without the punctuation. I can't work out how to split the list.
In addition, because Erlang is a niche programming language, I can't find examples online.
So, can anyone help me? Thank you very much.
Also, I am restricted from using recursion.
I think the first problem is that it is hard to link what you are trying to achieve with what your code says thus far. Therefore, this feedback may not be exactly what you are looking for, but it might give you some ideas. Let's structure the problem into the common elements: (1) input, (2) process, and (3) output.
Input
You mentioned that L will be a file, but I assume it is a line in a file, where each line can be one of the three samples. The samples also do not follow a consistent pattern. To handle this, we can build a function that converts each line of the file into an Erlang term and passes the result to the next step.
Process
The question also does not mention the specific logic for parsing/processing the input. You also seem to care about the data type, so we will convert and display the result accordingly. Erlang, as a functional language, naturally works with lists, so in most cases we will need functions from the lists module.
Output
You didn't specifically mention where you want to display the result (an output file, screen/erlang shell, etc), so let's assume you just want to display it in the standard output/erlang shell.
Sample file content test1.txt (please note the dot at the end of each line)
[["name "," col"," dist"," a"," angv"," r "],["apollo11 ","white","0.1"," 0"," 77760"," 0.15"]].
["planets ","earth","venus "].
["a","b"].
How to run: solarSystem:process_file("/Users/macbook/Documents/test1.txt").
Sample Result:
(dev01#Macbooks-MacBook-Pro-3)3> solarSystem:process_file("/Users/macbook/Documents/test1.txt").
apollo11 = ["white",0.1,0,77760,0.15]
planets = ["earth","venus"]
a = ["b"]
Done processing 3 line(s)
ok
Module code:
-module(solarSystem).
-export([process_file/1]).
-export([process_line/2]).
-export([format_item/1]).
%% This is the main function; the input is the full path of the file.
%% How to call: solarSystem:process_file("file_full_path").
process_file(Filename) ->
    %% file:consult/1 converts the file content into Erlang terms.
    %% Each term in the file must be terminated by a dot (".").
    case file:consult(Filename) of
        {ok, Terms} ->
            %% Terms is a list, so each element is handled through the lists module.
            Ctr = lists:foldl(fun process_line/2, 0, Terms),
            io:format("Done processing ~p line(s) ~n", [Ctr]);
        {error, Reason} -> %% e.g. the file is not available
            io:format("Error converting file ~p due to '~p' ~n", [Filename, Reason])
    end.
process_line(Term, CtrIn) ->
    %% Assume there are only a few possible shapes of element. There are many ways
    %% to process the data as long as the input pattern is clear.
    %% We basically need to identify all possibilities and handle them accordingly.
    %% Of course there are smarter (dynamic) ways to handle them, but this may give you some ideas.
    case Term of
        %% 1. Handles this pattern -> [["name "," col"," dist"," a"," angv"," r "],["apollo11 ","white"," 0.1"," 0"," 77760"," 0.15"]]
        [[_, _, _, _, _, _], [Name | OtherParams]] ->
            %% At this point, Name = "apollo11 ", OtherParams = ["white"," 0.1"," 0"," 77760"," 0.15"]
            OtherParamsFmt = lists:map(fun format_item/1, OtherParams),
            %% Display the result on standard output
            io:format("~s = ~p ~n", [string:trim(Name), OtherParamsFmt]);
        %% 2. Handles this pattern -> ["planets ","earth","venus "]
        [Name | OtherParams] ->
            %% At this point, Name = "planets ", OtherParams = ["earth","venus "]
            OtherParamsFmt = lists:map(fun format_item/1, OtherParams),
            %% Display the result on standard output
            io:format("~s = ~p ~n", [string:trim(Name), OtherParamsFmt]);
        %% 3. Other cases
        _ ->
            %% Display a warning on standard output
            io:format("Unknown pattern ~p ~n", [Term])
    end,
    CtrIn + 1.
%% Trim the string, then convert it to a number if possible.
format_item(Str) ->
    StrTrim = string:trim(Str), %% first, trim it
    format_as_needed(StrTrim).
format_as_needed(Str) ->
    Float = (catch erlang:list_to_float(Str)),
    case Float of
        {'EXIT', _} -> %% It is not a float -> check if it is an integer
            Int = (catch erlang:list_to_integer(Str)),
            case Int of
                {'EXIT', _} -> %% It is not an integer -> return as is (string)
                    Str;
                _ -> %% It is an int
                    Int
            end;
        _ -> %% It is a float
            Float
    end.
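As a variation on format_as_needed/1, the same float / integer / string decision can be made without catch by using string:to_float/1 and string:to_integer/1, which return the unparsed remainder alongside the value. This is only a sketch; the module name parse_item and the function parse/1 are hypothetical and not part of the answer's module.

```erlang
-module(parse_item).
-export([parse/1]).

%% Convert a string to a float or an integer when the whole trimmed
%% string is numeric; otherwise return the trimmed string unchanged.
parse(Str) ->
    Trimmed = string:trim(Str),
    case string:to_float(Trimmed) of
        {Float, []} ->
            %% The entire string was consumed as a float.
            Float;
        _ ->
            case string:to_integer(Trimmed) of
                {Int, []} ->
                    %% The entire string was consumed as an integer.
                    Int;
                _ ->
                    %% Not numeric, or trailing characters remain.
                    Trimmed
            end
    end.
```

Requiring an empty remainder ([]) keeps the behaviour close to list_to_float/1 and list_to_integer/1, which also reject strings with trailing characters.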

Syntax error in a guard causing undefined function

I have no idea what the problem is.
This is the code:
solve_bdd(BddTree, ListVars) ->
findRes(BddTree, maps:from_list(ListVars++[{one, 1}, {zero, 0}])).
findRes(BddTree, Map) when is_record(BddTree, node)-> Val = maps:get(getName(BddTree)), Name = getName(BddTree),
if Name=='one' or Name=='zero' -> maps:get(getName(BddTree));
(Val==1 or Val=='true') -> findRes(getRight(BddTree), Map);
(Val==0 or Val=='false') -> findRes(getLeft(BddTree), Map);
true -> error
end;
findRes(_, _) -> error.
And the shell errors:
exf.erl:183: syntax error before: '=='
exf.erl:180: function findRes/2 undefined
exf.erl:21: Warning: function getRight/1 is unused
exf.erl:22: Warning: function getLeft/1 is unused
error
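When there are multiple conditions, you should wrap the operands of the or operator in parentheses. In Erlang, or binds more tightly than ==, so Name=='one' or Name=='zero' is read as Name == ('one' or Name) == 'zero', and chaining two comparison operators like that is a syntax error: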
1> false or false.
false
2> false == true or false == true.
* 1: syntax error before: '=='
2> (false == true) or (false == true).
false
Also, maps:get/1 (the function get in module maps taking 1 parameter), which you used:
maps:get( getName(BddTree) )
does not exist! You can use maps:get/2 or maps:get/3 instead.
Most of the time you can use a case expression instead of an if expression.
Also, it is sometimes better to use the orelse operator instead of or.
It's better not to handle everything! Instead of handling 0, 1 and boolean values alike, pick one representation and remove the unnecessary checks.
By convention in Erlang, function and record names are written in snake_case.
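The points about or, orelse and maps:get can be checked in a small module. This is a minimal sketch; the module name guard_demo and its function names are hypothetical and only illustrate the operators and the two existing arities of maps:get.

```erlang
-module(guard_demo).
-export([is_terminal_or/1, is_terminal_orelse/1, lookup/2, lookup/3]).

%% `or` binds tighter than `==`, so each comparison needs its own
%% parentheses.
is_terminal_or(Name) ->
    (Name == one) or (Name == zero).

%% `orelse` has lower precedence than `==` and short-circuits,
%% so no parentheses are required.
is_terminal_orelse(Name) ->
    Name == one orelse Name == zero.

%% maps:get/2 raises a {badkey, Key} error when Key is missing.
lookup(Key, Map) ->
    maps:get(Key, Map).

%% maps:get/3 returns Default instead of raising.
lookup(Key, Map, Default) ->
    maps:get(Key, Map, Default).
```

With Map = maps:from_list([{one, 1}, {zero, 0}]), guard_demo:lookup(one, Map) returns 1 and guard_demo:lookup(x, Map, undefined) returns undefined.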
BTW, your findRes/2 function could look like this:
%%% I don't know what you expect this function to do, so if it doesn't behave
%%% like your own version, try to fix it!
% findRes -> find_res
find_res(BddTree, Map) when is_record(BddTree, node) ->
    % It sounds like BddTree is a record. If by `get_name/1` you just want to
    % access one of its elements, you can simply write BddTree#node.<ELEMENT_NAME>
    % getName -> get_name
    case get_name(BddTree) of
        % you don't have to use the ' character for atoms:
        Name when Name == one orelse Name == zero ->
            % I think you've missed `Map`:
            maps:get(get_name(BddTree), Map);
        _ ->
            % I do not use 0 and 1 and just use the boolean type:
            find_res(
                % I think you've missed `Map`:
                case maps:get(get_name(BddTree), Map) of
                    Val when Val -> % when Val == true
                        % getRight -> get_right
                        get_right(BddTree);
                    _ -> % Assume false
                        get_left(BddTree)
                end,
                Map
            )
    end;
find_res(_, _) -> error.
And here is the above code without the comments:
find_res(BddTree, Map) when is_record(BddTree, node) ->
    case get_name(BddTree) of
        Name when Name == one orelse Name == zero ->
            maps:get(get_name(BddTree), Map);
        _ ->
            find_res(
                case maps:get(get_name(BddTree), Map) of
                    Val when Val -> % when Val == true
                        get_right(BddTree);
                    _ ->
                        get_left(BddTree)
                end,
                Map
            )
    end;
find_res(_, _) -> error.

type mismatch error for async chained operations

I previously got a very compact and comprehensive answer to my question.
I had it working for my custom type, but for some reason I had to change it to the string type, which is now causing type mismatch errors.
module AsyncResult =
    let bind (binder : 'a -> Async<Result<'b, 'c>>) (asyncFun : Async<Result<'a, 'c>>) : Async<Result<'b, 'c>> =
        async {
            let! result = asyncFun
            match result with
            | Error e -> return Error e
            | Ok x -> return! binder x
        }
    let compose (f : 'a -> Async<Result<'b, 'e>>) (g : 'b -> Async<Result<'c, 'e>>) = fun x -> bind g (f x)
    let (>>=) a f = bind f a
    let (>=>) f g = compose f g
Railway Oriented functions
let create (json: string) : Async<Result<string, Error>> =
    let url = "http://api.example.com"
    let request = WebRequest.CreateHttp(Uri url)
    request.Method <- "GET"
    async {
        try
            // http call
            return Ok "result"
        with :? WebException as e ->
            return Error {Code = 500; Message = "Internal Server Error"}
    }
test
type mismatch error for the AsyncResult.bind line
let chain = create
>> AsyncResult.bind (fun (result: string) -> (async {return Ok "more results"}))
match chain "initial data" |> Async.RunSynchronously with
| Ok data -> Assert.IsTrue(true)
| Error error -> Assert.IsTrue(false)
Error details:
EntityTests.fs(101, 25): [FS0001] Type mismatch. Expecting a '(string -> string -> Async<Result<string,Error>>) -> 'a' but given a 'Async<Result<'b,'c>> -> Async<Result<'d,'c>>' The type 'string -> string -> Async<Result<string,Error>>' does not match the type 'Async<Result<'a,'b>>'.
EntityTests.fs(101, 25): [FS0001] Type mismatch. Expecting a '(string -> string -> Async<Result<string,Error>>) -> 'a' but given a 'Async<Result<string,'b>> -> Async<Result<string,'b>>' The type 'string -> string -> Async<Result<string,Error>>' does not match the type 'Async<Result<string,'a>>'.
Edit
Curried or partial application
In the context of the above example, is the problem with curried functions? For instance, if the create function has this signature:
let create (token: string) (json: string) : Async<Result<string, Error>> =
and the chain is then built with the partially applied function:
let chain = create "token" >> AsyncResult.bind (fun (result: string) -> (async {return Ok "more results"}))
Edit 2
Is there a problem with following case?
signature
let create (token: Token) (entityName: string) (entityType: string) (publicationId: string) : Async<Result<string, Error>> =
test
let chain = create token >> AsyncResult.bind ( fun (result: string) -> async {return Ok "more results"} )
match chain "test" "article" "pubid" |> Async.RunSynchronously with
Update: At the front of the answer, even, since your edit 2 changes everything.
In your edit 2, you have finally revealed your actual code, and your problem is very simple: you're misunderstanding how the types work in a curried F# function.
When your create function looked like let create (json: string) = ..., it was a function of one parameter. It took a string, and returned a result type (in this case, Async<Result<string, Error>>). So the function signature was string -> Async<Result<string, Error>>.
But the create function you've just shown us is a different type entirely. It takes four parameters (one Token and three strings), not one. That means its signature is:
Token -> string -> string -> string -> Async<Result<string, Error>>
Remember how currying works: any function of multiple parameters can be thought of as a series of functions of one parameter, which return the "next" function in that chain. E.g., let add3 a b c = a + b + c is of type int -> int -> int -> int; this means that add3 1 returns a function that's equivalent to let add2 b c = 1 + b + c. And so on.
Now, keeping currying in mind, look at your function type. When you pass a single Token value to it as you do in your example (where it's called as create token), you get a function of type:
string -> string -> string -> Async<Result<string, Error>>
This is a function that takes a string, which returns another function that takes a string, which returns a third function which takes a string and returns an Async<Result<whatever>>. Now compare that to the type of the binder parameter in your bind function:
(binder : 'a -> Async<Result<'b, 'c>>)
Here, 'a is string, so is 'b, and 'c is Error. So when the generic bind function is applied to your specific case, it's looking for a function of type string -> Async<Result<'b, 'c>>. But you're giving it a function of type string -> string -> string -> Async<Result<string, Error>>. Those two function types are not the same!
That's the fundamental cause of your type error. You're trying to apply a function that returns a function that returns a function that returns a result of type X to a design pattern (the bind design pattern) that expects a function that returns a result of type X. What you need is the design pattern called apply. I have to leave quite soon so I don't have time to write you an explanation of how to use apply, but fortunately Scott Wlaschin has already written a good one. It covers a lot, not just "apply", but you'll find the details about apply in there as well. And that's the cause of your problem: you used bind when you needed to use apply.
Original answer follows:
I don't yet know for a fact what's causing your problem, but I have a suspicion. But first, I want to comment that the parameter names for your AsyncResult.bind are wrong. Here's what you wrote:
let bind (binder : 'a -> Async<Result<'b, 'c>>)
(asyncFun : Async<Result<'a, 'c>>) : Async<Result<'b, 'c>> =
(I moved the second parameter in line with the first parameter so it wouldn't scroll on Stack Overflow's smallish column size, but that would compile correctly if the types were right: since the two parameters are lined up vertically, F# would know that they are both belonging to the same "parent", in this case a function.)
Look at your second parameter. You've named it asyncFun, but there's no arrow in its type description. That's not a function, it's a value. A function would look like something -> somethingElse. You should name it something like asyncValue, not asyncFun. By naming it asyncFun, you're setting yourself up for confusion later.
Now for the answer to the question you asked. I think your problem is this line, where you've fallen afoul of the F# "offside rule":
let chain = create
         >> AsyncResult.bind (fun (result: string) -> (async {return Ok "more results"}))
Note the position of the >> operator, which is to the left of its first operand. Yes, the F# syntax appears to allow that in most situations, but I suspect that if you simply change that function definition to the following, your code will work:
let chain =
    create
        >> AsyncResult.bind (fun (result: string) -> (async {return Ok "more results"}))
Or, better yet, because it's good style to make the |> (and >>) operators line up with their first operand:
let chain =
    create
    >> AsyncResult.bind (fun (result: string) -> (async {return Ok "more results"}))
If you look carefully at the rules that Scott Wlaschin lays out in https://fsharpforfunandprofit.com/posts/fsharp-syntax/, you'll note that his examples where he shows exceptions to the "offside rule", he writes them like this:
let f g h =   g   // defines a new line at col 15
           >> h   // ">>" allowed to be outside the line
Note how the >> character is still to the right of the = in the function definition. I don't know exactly what the F# spec says about the combination of function definitions and the offside rule (Scott Wlaschin is great, but he's not the spec so he could be wrong, and I don't have time to look up the spec right now), but I've seen it do funny things that I didn't quite expect when I wrote functions with part of the function definition on the same line as the function, and the rest on the next line.
E.g., I once wrote something like this, which didn't work:
let f a = if a = 0 then
printfn "Zero"
else
printfn "Non-zero"
But then I changed it to this, which did work:
let f a =
    if a = 0 then
        printfn "Zero"
    else
        printfn "Non-zero"
I notice that in Snapshot's answer, he made your chain function be defined on a single line, and that worked for him. So I suspect that that's your problem.
Rule of thumb: If your function has anything after the = on the same line, make the function all on one line. If your function is going to be two lines, put nothing after the =. E.g.:
let f a b = a + b  // This is fine
let g c d =
    c * d  // This is also fine
let h x y = x
          + y  // This is asking for trouble
I would suspect that the error stems from a minor change in indentation: adding a single space to an F# program can change its meaning, and the F# compiler then reports phantom errors because it interprets the input differently. I just pasted your code in, added bogus classes, and removed some spaces, and now it is working just fine.
module AsyncResult =
    [<StructuralEquality; StructuralComparison>]
    type Result<'T,'TError> =
        | Ok of ResultValue:'T
        | Error of ErrorValue:'TError
    let bind (binder : 'a -> Async<Result<'b, 'c>>) (asyncFun : Async<Result<'a, 'c>>) : Async<Result<'b, 'c>> =
        async {
            let! result = asyncFun
            match result with
            | Error e -> return Error e
            | Ok x -> return! binder x
        }
    let compose (f : 'a -> Async<Result<'b, 'e>>) (g : 'b -> Async<Result<'c, 'e>>) = fun x -> bind g (f x)
    let (>>=) a f = bind f a
    let (>=>) f g = compose f g
open AsyncResult
open System // needed for Uri
open System.Net
type Assert =
    static member IsTrue (conditional:bool) = System.Diagnostics.Debug.Assert(conditional)
type Error = {Code:int; Message:string}
[<EntryPoint>]
let main args =
    let create (json: string) : Async<Result<string, Error>> =
        let url = "http://api.example.com"
        let request = WebRequest.CreateHttp(Uri url)
        request.Method <- "GET"
        async {
            try
                // http call
                return Ok "result"
            with :? WebException as e ->
                return Error {Code = 500; Message = "Internal Server Error"}
        }
    let chain = create >> AsyncResult.bind (fun (result: string) -> (async {return Ok "more results"}))
    match chain "initial data" |> Async.RunSynchronously with
    | Ok data -> Assert.IsTrue(true)
    | Error error -> Assert.IsTrue(false)
    0

Erlang: syntax error before: ","word"

I have the following functions:
search(DirName, Word) ->
    NumberedFiles = list_numbered_files(DirName),
    Words = make_filter_mapper(Word),
    Index = mapreduce(NumberedFiles, Words, fun remove_duplicates/3),
    dict:find(Word, Index).
list_numbered_files(DirName) ->
    {ok, Files} = file:list_dir(DirName),
    FullFiles = [ filename:join(DirName, File) || File <- Files ],
    Indices = lists:seq(1, length(Files)),
    lists:zip(Indices, FullFiles). % {Index, FileName} tuples
make_filter_mapper(MatchWord) ->
    fun (_Index, FileName, Emit) ->
        {ok, [Words]} = file:consult(FileName), %% <---- Line 20
        lists:foreach(fun (Word) ->
                          case MatchWord == Word of
                              true -> Emit(Word, FileName);
                              false -> false
                          end
                      end, Words)
    end.
remove_duplicates(Word, FileNames, Emit) ->
    UniqueFiles = sets:to_list(sets:from_list(FileNames)),
    lists:foreach(fun (FileName) -> Emit(Word, FileName) end, UniqueFiles).
However, when I call search(Path_to_Dir, Word) I get:
Error in process <0.185.0> with exit value:
{{badmatch,{error,{1,erl_parse,["syntax error before: ","wordinfile"]}}},
[{test,'-make_filter_mapper/1-fun-1-',4,[{file,"test.erl"},{line,20}]}]}
And I do not understand why. Any ideas?
The Words variable is matched against the content of the list, which may contain not just one term but many of them. Try matching {ok, Words} instead of {ok, [Words]}.
Besides the fact that file:consult/1 may return a list of several elements, so that you should replace {ok, [Words]} (which expects a list of exactly one element, Words) with {ok, Words}, the call is actually returning a syntax error, meaning that the file you are reading contains a syntax error.
Remember that the file should contain only valid Erlang terms, each of them terminated by a dot. The most common mistake is to forget a dot or to write a comma in its place.
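Both points can be reproduced with two small scratch files. The sketch below is only an illustration: the module name consult_demo, the function names, and the scratch file names are hypothetical, and it assumes the current directory is writable. It matches on {ok, Terms} rather than {ok, [Terms]} and turns the error tuple into a readable message with file:format_error/1.

```erlang
-module(consult_demo).
-export([read_terms/1, demo/0]).

%% Return {ok, Terms}, or {error, Message} with a readable description.
read_terms(FileName) ->
    case file:consult(FileName) of
        {ok, Terms} ->
            %% Terms is the list of ALL terms in the file, however many.
            {ok, Terms};
        {error, Reason} ->
            %% file:format_error/1 accepts both POSIX errors and the
            %% {Line, Module, Description} parse errors.
            {error, lists:flatten(file:format_error(Reason))}
    end.

%% Write one well-formed and one malformed file, then read both back.
%% The file names are hypothetical scratch files in the current directory.
demo() ->
    Good = "consult_demo_good.tmp",
    Bad = "consult_demo_bad.tmp",
    ok = file:write_file(Good, "alpha.\nbeta.\n"),
    ok = file:write_file(Bad, "alpha beta.\n"),   % missing dot after alpha
    Result = {read_terms(Good), read_terms(Bad)},
    ok = file:delete(Good),
    ok = file:delete(Bad),
    Result.
```

The first element of the result is {ok, [alpha, beta]}; the second is an {error, Message} tuple describing a syntax error on line 1, the same kind of error reported in the question.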

Having trouble getting Yecc and Leex to work

I'm trying to create a very simple DSL that takes a string formatted like
GET /endpoint controller.action1 |> controller.action2
And turn it to something along the lines of
{"GET", "/endpoint", [{controller.action1}, {controller.action2}]}
My Leex file is this:
Definitions.
Rules.
GET|PUT|POST|DELETE|PATCH : {token, {method, TokenLine, TokenChars}}.
/[A-Za-z_]+ : {token, {endpoint, TokenLine, TokenChars}}.
[A-Za-z0-9_]+\.[A-Za-z0-9_]+ : {token, {function, TokenLine, splitControllerAction(TokenChars)}}.
\|\> : {token, {pipe, TokenLine}}.
[\s\t\n\r]+ : skip_token.
Erlang code.
splitControllerAction(A) ->
[Controller, Action] = string:tokens(A, "."),
{list_to_atom(Controller), list_to_atom(Action)}.
And my Yecc file looks like this:
Nonterminals route actionlist elem.
Terminals function endpoint method pipe.
Rootsymbol route.
route -> method endpoint actionlist : {$1, $2, $3}.
actionlist -> elem : [$1].
actionlist -> elem 'pipe' actionlist : [$1 | $3].
elem -> function : $1.
Erlang code.
extract_token({_Token, _Line, Value}) -> _Token;
The output I'm getting with this:
2> {ok, Fart, _} = blah:string("GET /asdfdsf dasfadsf.adsfasdf |> adsfsdf.adsfdf").
{ok,[{method,1,"GET"},
{endpoint,1,"/asdfdsf"},
{function,1,{dasfadsf,adsfasdf}},
{pipe,1},
{function,1,{adsfsdf,adsfdf}}],
1}
3> blah_parser:parse(Fart).
{ok,{49,50,51}}
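It turns out you need to surround $1 with single quotes ('$1'); otherwise Erlang reads $1 as the character literal for "1", i.e. its ASCII value 49, which is why the result was {49,50,51}. The first rule should therefore read route -> method endpoint actionlist : {'$1', '$2', '$3'}. and the other rules need the same change.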
-Thomas Gebert.
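The difference between the two spellings can be seen in plain Erlang, independent of yecc. This is a small sketch with a hypothetical module name, char_literal_demo:

```erlang
-module(char_literal_demo).
-export([unquoted/0, quoted/0]).

%% $1, $2 and $3 are character literals, i.e. plain integers.
unquoted() ->
    {$1, $2, $3}.

%% Quoted, they are ordinary atoms; only yecc gives '$1' its special
%% meaning inside a grammar action.
quoted() ->
    {'$1', '$2', '$3'}.
```

char_literal_demo:unquoted() returns {49,50,51}, exactly the value the parser produced, while quoted() returns a tuple of three atoms.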
