It seems a bit like a trivial question, but I am stuck on parsing the end of file (EOF) using my own island grammar. I am using the new VS Code extension, by the way.
I've mostly been using the examples from the basic recipes and have a simple grammar with the following layout rules:
layout Whitespace = [\t-\n\r\ ]*;
lexical IntegerLiteral = [0-9]+ !>> [0-9];
lexical Comment = "%%" ![\n]* $;
Using this and some other rules, it parses some simple files, but it gives a parse error any time a file ends in a newline (newlines in between lines are no problem).
Am I missing something obvious?
Thanks!
It sounds a bit like your grammar is missing a start nonterminal. All grammar rules get whitespace in between their constituent symbols but not at the start or the end.
A start nonterminal is the exception:
start syntax Islands = Island+;
Islands parseIslands(loc input)
= parse(#start[Islands], input).top;
Passing the start nonterminal to parse will allow the file to start and end with whitespace, and using the .top field you can ignore that whitespace from the parse tree again by projecting out the middle Islands tree.
Island grammars tend to be a complex beast, so without sharing the full grammar and input string, it might be a bit hard to answer this question. But I'll share some generic feedback.
The layout production might be ambiguous if any other part of your language has optional parts. Rascal's parsing is non-greedy. So if you have:
lexical A = "a";
lexical B = "b";
lexical C = "c";
syntax A = A? B? C;
After fusing in the layouts, this becomes:
A` = A? Whitespace? B? Whitespace? C;
Now, since whitespace is not eating all characters, the grammar is ambiguous, as the parser can "bind" a whitespace between the A and B, or between the B and C. So in most cases, you want to make sure it's a greedy match by adding a follow restriction:
layout Whitespace = [\t-\n \r \ ]* !>> [\t-\n \r \ ];
Also, I fixed a bug: the layout definition didn't include a space as valid whitespace. Rascal allows spaces inside a character class (for readability), so to match a literal space you have to escape it as \ (a backslash followed by a space).
For the rest, it looks okay, but like I started with, island grammars are a bit harder to debug without both the full syntax, and what you want to have as water and what as island.
I'm curious about Prolog as a parser, so I'm making a little Lisp front-end. I have already made a tokenizer, which you can see here:
base_tokenize([], Buffer, [Buffer]).
base_tokenize([Char | Chars], Buffer, Tokens) :-
(Char = '(' ; Char = ')') ->
base_tokenize(Chars, '', Tail_Tokens),
Tokens = [Buffer, Char | Tail_Tokens];
Char = ' ' ->
base_tokenize(Chars, '', Tail_Tokens),
Tokens = [Buffer | Tail_Tokens];
atom_concat(Buffer, Char, New_Buffer),
base_tokenize(Chars, New_Buffer, Tokens).
filter_empty_blank([], []).
filter_empty_blank([Head | Tail], Result) :-
filter_empty_blank(Tail, Tail_Result),
((Head = [] ; Head = '') ->
Result = Tail_Result;
Result = [Head | Tail_Result]).
tokenize(Expr, Tokens) :-
atom_chars(Expr, Chars),
base_tokenize(Chars, '', Dirty_Tokens),
filter_empty_blank(Dirty_Tokens, Tokens).
I now have a new challenge: construct a parse tree from this. First, I tried making one without a grammar, but that turned out really messy. So I'm using DCGs. Wikipedia's page on it is not very clear - especially the portion Parsing with DCGs. Maybe someone can give me a clearer idea of how I would construct a tree? I was very happy to know that Prolog's lists are untyped, so it's a bit easier now that no sum types are needed. I'm just really confused about inputs to grammar clauses like sentence(s(NP,VP)) or verb(v(eats)) (on the Wiki), why the arguments have such abstruse names, and how I can get started with my parser without too much hassle.
expr --> [foo].
expr --> list.
seq --> expr, seq.
seq --> expr.
list --> ['('], seq, [')'].
Here is a beginning: Parsing a LISP list-of-atom, which at first is unstructured list-of-token:
List = [ '(', '(', foo, bar, ')', baz, ')' ].
First, just accept it.
Write down the grammar directly:
so_list --> ['('], so_list_content, [')'].
so_list_content --> [].
so_list_content --> so_atom, so_list_content.
so_list_content --> so_list, so_list_content.
so_atom --> [X], { \+ member(X,['(',')']),atom(X) }.
Add some test cases (is there plunit in GNU Prolog?)
:- begin_tests(accept_list).
test(1,[fail]) :- phrase(so_list,[]).
test(2,[true,nondet]) :- phrase(so_list,['(',')']).
test(3,[true,nondet]) :- phrase(so_list,['(',foo,')']).
test(4,[true,nondet]) :- phrase(so_list,['(',foo,'(',bar,')',')']).
test(5,[true,nondet]) :- phrase(so_list,['(','(',bar,')',foo,')']).
test(6,[fail]) :- phrase(so_list,['(',foo,'(',bar,')']).
:- end_tests(accept_list).
And so:
?- run_tests.
% PL-Unit: accept_list ...... done
% All 6 tests passed
true.
Cool. Looks like we can accept lists-of-tokens.
Now build a parse tree. This is done by growing a Prolog term through parameters of the "DCG predicates". The term (or multiple terms) in the head collect the terms (or multiple terms) appearing in the body into a larger structure, quite naturally. Once the terminal tokens are reached, the structure starts to fill up with actual content:
so_list(list(Stuff)) --> ['('], so_list_content(Stuff), [')'].
so_list_content([]) --> [].
so_list_content([A|Stuff]) --> so_atom(A), so_list_content(Stuff).
so_list_content([L|Stuff]) --> so_list(L), so_list_content(Stuff).
so_atom(X) --> [X], { \+ member(X,['(',')']),atom(X) }.
Yup, tests (move the expected Result out of the test head because the visual noise is too much)
:- begin_tests(parse_list).
test(1,[fail]) :-
phrase(so_list(_),[]).
test(2,[true(L==Result),nondet]) :-
phrase(so_list(L),['(',')']),
Result = list([]).
test(3,[true(L==Result),nondet]) :-
phrase(so_list(L),['(',foo,')']),
Result = list([foo]).
test(4,[true(L==Result),nondet]) :-
phrase(so_list(L),['(',foo,'(',bar,')',')']),
Result = list([foo,list([bar])]).
test(5,[true(L==Result),nondet]) :-
phrase(so_list(L),['(','(',bar,')',foo,')']),
Result = list([list([bar]),foo]).
test(6,[fail]) :-
phrase(so_list(_),['(',foo,'(',bar,')']).
:- end_tests(parse_list).
And so:
?- run_tests.
% PL-Unit: parse_list ...... done
% All 6 tests passed
true.
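The same pattern explains the Wikipedia-style clauses from the question, such as sentence(s(NP,VP)) or verb(v(eats)): the argument in the head is just the tree node being built, and the variables in it are filled in by the nonterminals in the body. Below is a minimal sketch of such a grammar; the vocabulary (the, cat, bat, eats) is purely illustrative and not taken from any particular source.

```prolog
% Toy grammar: each nonterminal carries one argument, the subtree it builds.
sentence(s(NP, VP))    --> noun_phrase(NP), verb_phrase(VP).
noun_phrase(np(D, N))  --> det(D), noun(N).
verb_phrase(vp(V, NP)) --> verb(V), noun_phrase(NP).

% Terminals: the tree leaf records which word was seen.
det(d(the))   --> [the].
noun(n(cat))  --> [cat].
noun(n(bat))  --> [bat].
verb(v(eats)) --> [eats].
```

Calling phrase(sentence(T), [the, cat, eats, the, bat]) should bind T to s(np(d(the), n(cat)), vp(v(eats), np(d(the), n(bat)))). The names s, np, vp and so on are ordinary functors chosen by the grammar writer; any other names would work equally well.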
I have to write parse(Tkns, T), which takes a mathematical expression in the form of a list of tokens and finds T, a term representing the abstract syntax, respecting order of operations and associativity.
For example,
?- parse( [ num(3), plus, num(2), star, num(1) ], T ).
T = add(integer(3), multiply(integer(2), integer(1))) ;
No
I've attempted to implement + and * as follows
parse([num(X)], integer(X)).
parse(Tkns, T) :-
( append(E1, [plus|E2], Tkns),
parse(E1, T1),
parse(E2, T2),
T = add(T1,T2)
; append(E1, [star|E2], Tkns),
parse(E1, T1),
parse(E2, T2),
T = multiply(T1,T2)
).
Which finds the correct answer, but also returns answers that do not follow associativity or order of operations.
ex)
parse( [ num(3), plus, num(2), star, num(1) ], T ).
also returns
multiply(add(integer(3), integer(2)), integer(1))
and
parse([num(1), plus, num(2), plus, num(3)], T)
returns the equivalent of both (1+2)+3 and 1+(2+3), when it should only return the former.
Is there a way I can get this to work?
Edit: more info: I only need to implement +, -, *, / and negate (-1, -2, etc.), and all numbers are integers. A hint was given that the code will be structured similarly to the grammar
<expression> ::= <expression> + <term>
| <expression> - <term>
| <term>
<term> ::= <term> * <factor>
| <term> / <factor>
| <factor>
<factor> ::= num
| ( <expression> )
Only with negate implemented as well.
Edit2: I found a grammar parser written in Prolog (http://www.cs.sunysb.edu/~warren/xsbbook/node10.html). Is there a way I could modify it to print a left-hand derivation of a grammar ("print" in the sense that the Prolog interpreter will output "T=[the correct answer]")?
Removing left recursion will drive you towards DCG based grammars.
But there is an interesting alternative way: implement bottom up parsing.
How hard is this in Prolog? Well, as Pereira and Shieber show in their wonderful book
'Prolog and Natural-Language Analysis', it can be really easy. From chapter 6.5:
Prolog supplies by default a top-down, left-to-right, backtrack parsing algorithm for DCGs.
It is well known that top-down parsing algorithms of this kind will loop on left-recursive rules (cf. the example of Program 2.3).
Although techniques are available to remove left recursion from context-free grammars, these techniques are not readily generalizable to DCGs, and furthermore they can increase grammar size by large factors.
As an alternative, we may consider implementing a bottom-up parsing method directly in Prolog. Of the various possibilities, we will consider here the left-corner method in one of its adaptations to DCGs.
For programming convenience, the input grammar for the left-corner DCG interpreter is represented in a slight variation of the DCG notation. The right-hand sides of rules are given as lists rather than conjunctions of literals. Thus rules are unit clauses of the form, e.g.,
s ---> [np, vp].
or
optrel ---> [].
Terminals are introduced by dictionary unit clauses of the form word(w,PT).
Consider finishing that section before proceeding (look up the free book by title in the info page).
Now let's try writing a bottom up processor:
:- op(150, xfx, ---> ).
parse(Phrase) -->
leaf(SubPhrase),
lc(SubPhrase, Phrase).
leaf(Cat) --> [Word], {word(Word,Cat)}.
leaf(Phrase) --> {Phrase ---> []}.
lc(Phrase, Phrase) --> [].
lc(SubPhrase, SuperPhrase) -->
{Phrase ---> [SubPhrase|Rest]},
parse_rest(Rest),
lc(Phrase, SuperPhrase).
parse_rest([]) --> [].
parse_rest([Phrase|Phrases]) -->
parse(Phrase),
parse_rest(Phrases).
% that's all! fairly easy, isn't it?
% the grammar starts here: replace it with your own, and don't worry about left recursion
e(sum(L,R)) ---> [e(L),sum,e(R)].
e(num(N)) ---> [num(N)].
word(N, num(N)) :- integer(N).
word(+, sum).
that for instance yields
phrase(parse(P), [1,+,3,+,1]).
P = e(sum(sum(num(1), num(3)), num(1)))
note the left recursive grammar used is e ::= e + e | num
Before fixing your program, look at how you identified the problem! You assumed that a particular sentence will have exactly one syntax tree, but you got two of them. So essentially, Prolog helped you to find the bug!
This is a very useful debugging strategy in Prolog: Look at all the answers.
Next is the specific way you encoded the grammar. In fact, you did something quite smart: you essentially encoded a left-recursive grammar, and nevertheless your program terminates for a list of fixed length! That's because you indicate within each recursion that there has to be at least one element in the middle serving as operator. So for each recursion there has to be at least one element. That is fine. However, this strategy is inherently very inefficient: for each application of the rule, it has to consider all possible partitions.
Another disadvantage is that you can no longer generate a sentence out of a syntax tree. That is, if you use your definition with:
?- parse(S, add(add(integer(1),integer(2)),integer(3))).
This does not work, for two reasons. The first is that the goals T = add(...,...) come too late. Simply put them at the beginning, in front of the append/3 goals. But much more interesting is that now append/3 does not terminate. Here is the relevant failure-slice (see the link for more on this).
parse([num(X)], integer(X)) :- false.
parse(Tkns, T) :-
   (  T = add(T1,T2),
      append(E1, [plus|E2], Tkns), false
      % parse(E1, T1),       % goals after false are irrelevant
      % parse(E2, T2)        % for non-termination
   ;  false
      % T = multiply(T1,T2),
      % append(E1, [star|E2], Tkns),
      % parse(E1, T1),
      % parse(E2, T2)
   ).
@DanielLyons already gave you the "traditional" solution, which requires all kinds of justification from formal languages. But I will stick to the grammar you encoded in your program, which, translated into DCGs, reads:
expr(integer(X)) --> [num(X)].
expr(add(L,R)) --> expr(L), [plus], expr(R).
expr(multiply(L,R)) --> expr(L), [star], expr(R).
When using this grammar with ?- phrase(expr(T),[num(1),plus,num(2),plus,num(3)]). it will not terminate. Here is the relevant slice:
expr(integer(X)) --> {false}, [num(X)].
expr(add(L,R)) --> expr(L), {false}, [plus], expr(R).
expr(multiply(L,R)) --> {false}, expr(L), [star], expr(R).
So it is this tiny part that has to be changed. Note that the rule "knows" that it wants one terminal symbol, alas, the terminal appears too late. If only it would occur in front of the recursion! But it does not.
There is a general way how to fix this: Add another pair of arguments to encode the length.
parse(T, L) :-
phrase(expr(T, L,[]), L).
expr(integer(X), [_|S],S) --> [num(X)].
expr(add(L,R), [_|S0],S) --> expr(L, S0,S1), [plus], expr(R, S1,S).
expr(multiply(L,R), [_|S0],S) --> expr(L, S0,S1), [star], expr(R, S1,S).
This is a very general method that is of particular interest if you have ambiguous grammars, or if you do not know whether or not your grammar is ambiguous. Simply let Prolog do the thinking for you!
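To see the length-argument technique at work, here is a small sketch that repeats the three rules above so it loads on its own and adds a helper, all_parses/2, to collect every syntax tree for a token list. The helper name is hypothetical and not part of the original answer.

```prolog
% Rules repeated from the answer: the extra list pair bounds the recursion depth
% by the length of the input, so the left-recursive rules terminate.
parse(T, L) :-
    phrase(expr(T, L, []), L).

expr(integer(X), [_|S], S)      --> [num(X)].
expr(add(L,R), [_|S0], S)       --> expr(L, S0, S1), [plus], expr(R, S1, S).
expr(multiply(L,R), [_|S0], S)  --> expr(L, S0, S1), [star], expr(R, S1, S).

% all_parses(+Tokens, -Trees): hypothetical helper, collects every tree found.
all_parses(Tokens, Trees) :-
    findall(T, parse(T, Tokens), Trees).
```

For [num(1), plus, num(2), plus, num(3)] this should report two trees, add(add(integer(1),integer(2)),integer(3)) and add(integer(1),add(integer(2),integer(3))), which makes the ambiguity of the grammar visible instead of looping.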
The correct approach is to use DCGs, but your example grammar is left-recursive, which won't work. Here's what would:
expression(T+E) --> term(T), [plus], expression(E).
expression(T-E) --> term(T), [minus], expression(E).
expression(T) --> term(T).
term(F*T) --> factor(F), [star], term(T).
term(F/T) --> factor(F), [div], term(T).
term(F) --> factor(F).
factor(N) --> num(N).
factor(E) --> ['('], expression(E), [')'].
num(N) --> [num(N)], { number(N) }.
The relationship between this and your sample grammar should be obvious, as should the transformation from left-recursive to right-recursive. I can't recall the details from my automata class about left-most derivations, but I think it only comes into play if the grammar is ambiguous, and I don't think this one is. Hopefully a genuine computer scientist will come along and clarify that point.
I see no point in producing an AST other than what Prolog would use. The code within parentheses on the left-hand side of the production is the AST-building code (e.g. the T+E in the first expression//1 rule). Adjust the code accordingly if this is undesirable.
From here, presenting your parse/2 API is quite trivial:
parse(L, T) :- phrase(expression(T), L).
Because we're using Prolog's own structures, the result will look a lot less impressive than it is:
?- parse([num(4), star, num(8), div, '(', num(3), plus, num(1), ')'], T).
T = 4* (8/ (3+1)) ;
false.
You can show a more AST-y output if you like using write_canonical/1:
?- parse([num(4), star, num(8), div, '(', num(3), plus, num(1), ')'], T),
write_canonical(T).
*(4,/(8,+(3,1)))
T = 4* (8/ (3+1)).
The part *(4,/(8,+(3,1))) is the result of write_canonical/1. And you can evaluate that directly with is/2:
?- parse([num(4), star, num(8), div, '(', num(3), plus, num(1), ')'], T),
Result is T.
T = 4* (8/ (3+1)),
Result = 8 ;
false.
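One caveat worth noting: with the right-recursive rules above, a chain such as 8 - 3 - 2 is parsed as 8-(3-2), whereas the question asks for left associativity. A common way to keep the grammar free of left recursion and still build left-leaning trees is to thread an accumulator through a "rest" nonterminal. The following is a sketch of that idea, not the answerer's code; the nonterminal names expression_rest//2 and term_rest//2 are made up for illustration, and unary negation is assumed to use the same minus token.

```prolog
parse(Tokens, T) :- phrase(expression(T), Tokens).

% expression: parse one term, then fold further "+ term" / "- term" to the left.
expression(E) --> term(T), expression_rest(T, E).

expression_rest(Acc, E) --> [plus],  term(T), expression_rest(Acc+T, E).
expression_rest(Acc, E) --> [minus], term(T), expression_rest(Acc-T, E).
expression_rest(E, E)   --> [].

% term: same scheme one precedence level down.
term(T) --> factor(F), term_rest(F, T).

term_rest(Acc, T) --> [star], factor(F), term_rest(Acc*F, T).
term_rest(Acc, T) --> [div],  factor(F), term_rest(Acc/F, T).
term_rest(T, T)   --> [].

factor(N)  --> [num(N)], { number(N) }.
factor(E)  --> ['('], expression(E), [')'].
factor(-F) --> [minus], factor(F).        % unary negation (assumed token: minus)
```

With this version, parse([num(8), minus, num(3), minus, num(2)], T) should give T = 8-3-2, which Prolog reads as (8-3)-2, and evaluating it with is/2 yields 3 rather than 7.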
I found this nice snippet for parsing lisp in Prolog (from here):
ws --> [W], { code_type(W, space) }, ws.
ws --> [].
parse(String, Expr) :- phrase(expressions(Expr), String).
expressions([E|Es]) -->
ws, expression(E), ws,
!, % single solution: longest input match
expressions(Es).
expressions([]) --> [].
% A number N is represented as n(N), a symbol S as s(S).
expression(s(A)) --> symbol(Cs), { atom_codes(A, Cs) }.
expression(n(N)) --> number(Cs), { number_codes(N, Cs) }.
expression(List) --> "(", expressions(List), ")".
expression([s(quote),Q]) --> "'", expression(Q).
number([D|Ds]) --> digit(D), number(Ds).
number([D]) --> digit(D).
digit(D) --> [D], { code_type(D, digit) }.
symbol([A|As]) -->
[A],
{ memberchk(A, "+/-*><=") ; code_type(A, alpha) },
symbolr(As).
symbolr([A|As]) -->
[A],
{ memberchk(A, "+/-*><=") ; code_type(A, alnum) },
symbolr(As).
symbolr([]) --> [].
However expressions uses a cut. I'm assuming this is for efficiency. Is it possible to write this code so that it works efficiently without cut?
I would also be interested in answers that involve Mercury's soft-cut / committed choice.
The cut is not used for efficiency, but to commit to the first solution (see the comment next to the !/0: "single solution: longest input match"). If you comment out the !/0, you get for example:
?- parse("abc", E).
E = [s(abc)] ;
E = [s(ab), s(c)] ;
E = [s(a), s(bc)] ;
E = [s(a), s(b), s(c)] ;
false.
It is clear that only the first solution, consisting of the longest sequence of characters that form a token, is desired in such cases. Given the example above, I therefore disagree with "false": expression//1 is ambiguous, because number//1 and symbolr//1 are. In Mercury, you could use the determinism declaration cc_nondet to commit to a solution, if any.
You are touching a quite deep problem here. At the place of the cut you have
added the comment "longest input match". But what you actually did was to commit
to the first solution which will produce the "longest input match" for the non-terminal ws//0 but not necessarily for expression//1.
Many programming languages define their tokens based on the longest input match. This often leads to very strange effects. For example, a number may be immediately
followed by a letter in many programming languages. That's the case for Pascal, Haskell,
Prolog and many other languages. E.g. if a>2then 1 else 2 is valid Haskell.
Valid Prolog: X is 2mod 3.
Given that, it might be a good idea to define a programming language such that it does not depend on such features at all.
Of course, you would then like to optimize the grammar. But I can only recommend to start with a definition that is unambiguous in the first place.
As for efficiency (and purity):
eos([],[]).
nows --> call(eos).
nows, [W] --> [W], { \+ code_type(W, space) }.
ws --> nows.
ws --> [W], {code_type(W, space)}, ws.
You could use a construct that has already found its place in Parsing Expression Grammars (PEGs) but which is also available in DCGs. Namely the negation of a DCG goal. In PEGs the exclamation mark (!) with an argument is used for negation, i.e. ! e. In DCG the negation of a DCG goal is expressed by the (\+) operator, which is already used for ordinary negation as failure in ordinary Prolog clauses and queries.
So let's first explain how (\+) works in DCGs. If you have a production rule of the form:
A --> B, \+C, D.
Then this is translated to:
A(I,O) :- B(I,X), \+ C(X,_), D(X,O).
This means an attempt is made to parse the C DCG goal, but without actually consuming the input list. Now this can be used to replace the cut, if desired, and it gives a slightly more declarative feeling. To explain the idea, let's assume that we have a grammar without ws//0. So the original clause set of expressions//1 would be:
expressions([E|Es]) --> expression(E), !, expressions(Es).
expressions([]) --> [].
With negation we can turn this into the following cut-less form:
expressions([E|Es]) --> expression(E), expressions(Es).
expressions([]) --> \+ expression(_).
Unfortunately, the above variant is quite inefficient, since an attempt to parse an expression is made twice: once in the first rule, and then again in the second rule for the negation. But you could do the following and only check for the negation of the beginning of an expression:
expressions([E|Es]) --> expression(E), expressions(Es).
expressions([]) --> \+ symbol(_), \+ number(_), \+ "(", \+ "'".
If you try negation, you will see that you get a relatively strict parser. This is important if you try to parse maximum prefix of input and if you want to detect some errors. Try that:
?- phrase(expressions(X),"'",Y).
You should get failure in the negation version which checks the first symbol of the expression. In the cut and in the cut free version you will get success with the empty list as a result.
But you could also deal in another way with errors, I have only made the error example to highlight a little bit how the negation version works.
In other settings, for example CYK parser, one can make the negation quite efficient, it can use the information which is already placed in the chart.
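As a small self-contained illustration of the (\+) lookahead described above, here is a sketch that uses it to get a longest-match run of digits without a cut. The names one_digit//1 and digit_run//1 are hypothetical, and the sketch assumes SWI-Prolog's code_type/2.

```prolog
% one_digit(-D): a single digit character code.
one_digit(D) --> [D], { code_type(D, digit) }.

% digit_run(-Ds): a run of digits. The second clause only applies when the
% next code is NOT a digit; \+ checks this without consuming any input.
digit_run([D|Ds]) --> one_digit(D), digit_run(Ds).
digit_run([D])    --> one_digit(D), \+ one_digit(_).
```

For example, after atom_codes('12a', Cs), the goal phrase(digit_run(Ds), Cs, Rest) should have exactly one solution, with Ds holding the codes of 12 and Rest the code of a. The shorter split (just 1) is rejected because a digit follows.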
Best Regards
I'm currently working on a recursive Prolog program to link routes together to create a basic GPS of the Birmingham area. At the moment I can get output as so:
Input
routeplan(selly_oak, aston, P).
Output
P = [selly_oak, edgbaston, ... , aston]
What I would like to do is have my program provide some sort of interface, so if I were to type in something along the lines of:
Route from selly_oak to aston
It would provide me with:
Go from selly_oak to edgbaston
Go from edgbaston to ...
Finally, Go from ... to aston.
Prolog is a powerful language so I assume this is easily possible, however many of the books I've taken out seem to skip over this part. As far as I am aware I have to use something along the lines of write() and read() although the details are unknown to me.
Could anyone here help a Prolog novice out with some basic examples or links to further information?
EDIT: A lot of these answers seem very complicated, where the solution should only be around 5-10 lines of code. Reading in a value isn't a problem as I can do something along the lines of:
find:-
write('Where are you? '),
read(X),
nl, write('Where do you want to go? '),
read(Y),
loopForRoute(X,Y).
I'd prefer it if the output could be written out using write() so a new line (nl) can be used, so that it displays like the output above.
If this were my input, how would I then arrange the top routeplan() to work with these inputs? Also, if I were to add the Lines for these stations as an extra parameter how would this then be implemented? All links are defined at the beginning of the file like so:
rlinks(selly_oak, edgbaston, uob_line).
rlinks(edgbaston, bham_new_street, main_line).
Therefore, with this information, it'd be good to be able to read the line as so.
Go from selly_oak to edgbaston using the uob_line
Go from edgbaston to ... using the ...
Finally, go from ... to aston using the astuni_line
A book which discusses such things in detail is Natural Language Processing for Prolog Programmers
by Michael A. Covington.
In general, what you need to do is:
1. Tokenize the input
2. Parse the tokens (e.g. with DCG) to get the input for routeplan/3
3. Call routeplan/3
4. Generate some English on the basis of the output of routeplan/3
Something like this (works in SWI-Prolog):
% Usage example:
%
% ?- query_to_response('Route from selly_oak to aston', Response).
%
% Response = 'go from selly_oak to edgbaston then go from edgbaston
% to aston then stop .'
%
query_to_response(Query, Response) :-
concat_atom(QueryTokens, ' ', Query), % simple tokenizer
query(path(From, To), QueryTokens, []),
routeplan(From, To, Plan),
response(Plan, EnglishTokens, []),
concat_atom(EnglishTokens, ' ', Response).
% Query parser
query(path(From, To)) --> ['Route'], from(From), to(To).
from(From) --> [from], [From], { placename(From) }.
to(To) --> [to], [To], { placename(To) }.
% Response generator
response([_]) --> [stop], ['.'].
response([From, To | Tail]) -->
goto(path(From, To)), [then], response([To | Tail]).
goto(path(From, To)) --> [go], from(From), to(To).
% Placenames
placename(selly_oak).
placename(aston).
placename(edgbaston).
% Mock routeplan/3
routeplan(selly_oak, aston, [selly_oak, edgbaston, aston]).
Hm, if I understand you correctly you just want to format the list nicely for printing out, no?
In SWI-Prolog this works:
output_string([A,B],StrIn,StrOut) :-
concat_atom([StrIn, 'Finally, Go from ', A, ' to ', B, '.'],StrOut),
write(StrOut).
output_string([A,B|Rest],StrIn,StrOut) :-
concat_atom([StrIn,'Go from ', A, ' to ', B, '.\n'],StrAB),
output_string([B|Rest],StrAB,StrOut).
then call with
output_string(P,'',_).
It's probably not very efficient, but it does the job. :)
For this sort of thing, I usually create shell predicates. So in your case...
guided:-
    write('Enter your start point'),nl,
    read(Start),
    write('Enter your destination'),nl,
    read(Dest),
    routeplan(Start, Dest, Route),
    print_route(Route).
And print_route/1 could be something recursive like this:
print_route([]).
print_route([[A,B,Method]|Tail]):-
    print_route(Tail),
    % write/1 rather than print/1, so the text is shown without quotes
    write('Go from '), write(A),
    write(' to '), write(B),
    write(' by '), write(Method), nl.
I've assumed that the 3rd variable of the routeplan/3 predicate is a list of lists. Also that it's built by adding to the tail. If it's not, it should be fairly easy to adapt. Ask in the comments.
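For the output format asked for in the question (a flat list of stations plus the line name for each leg), a sketch along the following lines might be enough. It assumes routeplan/3 returns a plain list of stations and that the rlinks/3 facts can be looked up in either direction; describe_route/1 and leg_line/3 are hypothetical names, and unknown_line is a made-up placeholder for a leg with no matching fact.

```prolog
% Facts copied from the question.
rlinks(selly_oak, edgbaston, uob_line).
rlinks(edgbaston, bham_new_street, main_line).

% describe_route(+Stations): print one line per leg; the last leg gets "Finally".
describe_route([From, To]) :-
    !,
    leg_line(From, To, Line),
    format("Finally, go from ~w to ~w using the ~w~n", [From, To, Line]).
describe_route([From, To | Rest]) :-
    leg_line(From, To, Line),
    format("Go from ~w to ~w using the ~w~n", [From, To, Line]),
    describe_route([To | Rest]).

% leg_line(+From, +To, -Line): find the line joining two stations, trying
% both directions; falls back to the placeholder unknown_line.
leg_line(From, To, Line) :-
    (   rlinks(From, To, L) -> Line = L
    ;   rlinks(To, From, L) -> Line = L
    ;   Line = unknown_line
    ).
```

It could then be called as routeplan(X, Y, P), describe_route(P) from the find/0 predicate in the question. Lists with fewer than two stations simply fail here, which may or may not be what is wanted.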
Here are a few predicates to read lines from a file/stream into a Prolog string:
%%% get_line(S, CL): CL is the string read up to the end of the line from S.
%%% If reading past end of file, returns 'end_of_file' in CL first, raises
%%% an exception second time.
%%% :- pred get_line(+stream, -list(int)).
get_line(S, CL) :-
peek_code(S, C),
( C = -1
-> get_code(S, _),
CL = end_of_file
; get_line(S, C, CL)).
get_line(_, -1, CL) :- !, CL = []. % leave end of file mark on stream
get_line(S, 0'\n, CL) :- !,
get_code(S, _),
CL = [].
get_line(S, C, [C|CL]) :-
get_code(S, _),
peek_code(S, NC),
get_line(S, NC, CL).
%% read_lines(L): reads lines from current input to L. L is a list of list
%% of character codes, newline characters are not included.
%% :- pred read_lines(-list(list(char))).
read_lines(L) :-
current_input(In),
get_line(In, L0),
read_lines(In, L0, L).
%% read_lines(F, L): reads lines from F to L. L is a list of list of character
%% codes, newline characters are not included.
%% :- pred read_lines(+atom, -list(list(char))).
read_lines(F, L) :-
    catch(open(F, read, S), _, fail),   % fail quietly if the file cannot be opened
    call_cleanup((get_line(S, L0),
                  read_lines(S, L0, L)),
                 close(S)).
read_lines(_, end_of_file, L) :- !, L = [].
read_lines(S, H, [H|T]) :-
get_line(S, NH),
read_lines(S, NH, T).
Then, take a look at DCGs for information on how to parse a string.
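As a rough sketch of that last step, once a line is available as a list of character codes, a DCG can pull the two place names out of a query such as "Route from selly_oak to aston". The example assumes SWI-Prolog (version 7 or later) and its library(dcg/basics); route_query//2, place_name//1 and parse_query/3 are hypothetical names.

```prolog
:- use_module(library(dcg/basics)).   % provides string_without//2

% route_query(-From, -To): recognises "Route from <From> to <To>".
route_query(From, To) -->
    "Route from ", place_name(From), " to ", place_name(To).

% place_name(-Atom): a non-empty run of codes up to the next space (code 32).
place_name(Atom) -->
    string_without([32], Codes),
    { Codes \== [], atom_codes(Atom, Codes) }.

% parse_query(+Line, -From, -To): Line is an atom holding the whole query.
parse_query(Line, From, To) :-
    atom_codes(Line, Codes),
    phrase(route_query(From, To), Codes).
```

With this, parse_query('Route from selly_oak to aston', F, T) should bind F = selly_oak and T = aston. A code list produced by get_line/2 above could be passed to phrase/2 directly instead of going through atom_codes/2.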