How do I create a DCG rule inverse to another in Prolog?

I am writing a Commodore BASIC interpreter in Prolog, and I am writing some DCGs to parse it. I have verified the DCGs below to work except for the variable one. My goal is this: for anything which isn't a boolean, integer, float, or a string, it's a variable. However, anything that I give it via phrase just results in no.
bool --> [true].
bool --> [false].
integer --> [1]. % how to match nums?
float --> [0.1].
string --> [Str], {atom_chars(Str, ['"' | Chars]), last(Chars, '"')}.
literal --> bool; integer; float; string.
variable --> \+ literal.
I ran a stack trace like this (with gprolog)
main :- trace, phrase(variable, [bar]).
Looking at this, I cannot figure out why variable fails, given that it fails for each case in literal. I'm guessing that the error is pretty simple, but I'm still stumped, so does anyone who's good at Prolog have an idea of what I'm doing wrong?
| ?- main.
The debugger will first creep -- showing everything (trace)
1 1 Call: phrase(variable,[bar]) ?
2 2 Call: variable([bar],_321) ?
3 3 Call: \+literal([bar],_348) ?
4 4 Call: literal([bar],_348) ?
5 5 Call: bool([bar],_348) ?
5 5 Fail: bool([bar],_348) ?
5 5 Call: integer([bar],_348) ?
5 5 Fail: integer([bar],_348) ?
5 5 Call: float([bar],_348) ?
5 5 Fail: float([bar],_348) ?
5 5 Call: string([bar],_348) ?
6 6 Call: atom_chars(bar,['"'|_418]) ?
6 6 Fail: atom_chars(bar,['"'|_418]) ?
5 5 Fail: string([bar],_348) ?
4 4 Fail: literal([bar],_348) ?
3 3 Exit: \+literal([bar],_348) ?
2 2 Exit: variable([bar],[bar]) ?
1 1 Fail: phrase(variable,[bar]) ?
(2 ms) no
{trace}

To expand a bit on the other answer, the key problem is that a DCG rule like \+ literal does not consume items from the input. It only checks that the next item, if any, is not a literal.
To actually consume an item, you need to use a list goal, similarly to how you use a goal [1] to consume a 1 element. So:
variable -->
\+ literal, % if there is a next element, it's not a literal
[_Variable]. % consume this next element, which is a variable
For example:
?- phrase(variable, [bar]).
true.
?- phrase((integer, variable, float), [I, bar, F]).
I = 1,
F = 0.1.
Having that singleton variable _Variable is a bit strange -- when you parse like this, you lose the name of the variable. When your parser is expanded a bit, you will want to use arguments to your DCG rules to communicate information out of the rules:
variable(Variable) -->
\+ literal,
[Variable].
For example:
?- phrase((integer, variable(Var1), float, variable(Var2)), [I, bar, F, foo]).
Var1 = bar,
Var2 = foo,
I = 1,
F = 0.1.
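The difference between checking and consuming can also be sketched outside Prolog. The Python model below is only an illustration and not how any Prolog system implements DCGs. It assumes each rule is a function that takes the remaining token list and returns what is left after a match, or None on failure. The names literal, not_literal, variable and phrase are hypothetical and chosen to mirror the rules above.

```python
# Illustrative model only: a "rule" maps the remaining tokens to the
# tokens left over after a match, or None when it does not match.

def literal(tokens):
    """Consume one token if it looks like a bool, 1, 0.1 or a quoted string."""
    if not tokens:
        return None
    tok = tokens[0]
    is_string = (isinstance(tok, str) and len(tok) >= 2
                 and tok[0] == '"' and tok[-1] == '"')
    if tok in ("true", "false") or tok == 1 or tok == 0.1 or is_string:
        return tokens[1:]
    return None

def not_literal(tokens):
    """Negative lookahead, like \\+ literal: succeeds but consumes nothing."""
    return tokens if literal(tokens) is None else None

def variable(tokens):
    """Lookahead first, then consume exactly one token and report it."""
    rest = not_literal(tokens)
    if rest is None or not rest:
        return None
    return rest[0], rest[1:]

def phrase(rule, tokens):
    """Like phrase/2: the rule must succeed and leave no tokens behind."""
    result = rule(tokens)
    if result is None:
        return False
    rest = result[1] if isinstance(result, tuple) else result
    return rest == []
```

In this model phrase(not_literal, ["bar"]) is False because the lookahead leaves ["bar"] unconsumed, which corresponds to the failing trace in the question, while phrase(variable, ["bar"]) is True.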

You can detect a string of integers like this (I've added an argument to collect the digits):
integer([H|T]) --> digit(H), integer(T).
integer([]) --> [].
digit(0) --> "0".
digit(1) --> "1".
...
digit(9) --> "9".
As for variable -- it needs to consume text, so you'd want something similar to integer above, but change digit(H) to something that recognizes a character that's part of a "variable".
If you want further clues (although sometimes using slightly advanced tricks): https://www.swi-prolog.org/pldoc/man?section=basics and the code is here: https://github.com/SWI-Prolog/swipl-devel/blob/master/library/dcg/basics.pl
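For comparison, here is a rough Python sketch of the same idea: collect digits from the front of the input and hand back whatever was not consumed. It assumes the input is a plain string of characters, and the function name digits is made up for this illustration. Like the first solution of the integer//1 rule above, it takes as many digits as it can, and it also accepts zero digits.

```python
def digits(chars):
    """Collect leading digit characters; return (digit_list, remaining_chars)."""
    collected = []
    position = 0
    while position < len(chars) and chars[position] in "0123456789":
        collected.append(int(chars[position]))
        position += 1
    # Everything from `position` onwards was not consumed.
    return collected, chars[position:]
```

A rule for variable names would have the same shape, with the digit test swapped for a test that accepts the characters allowed in a name.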

Properly matching a set of tokens against my BASIC grammar

I am working on writing a BASIC interpreter in Prolog. DCGs are a little tricky, which is why I am having trouble here today.
Here is my grammar.
bool --> [true].
bool --> [false].
is_num_char(AD) :- AD = '.'; (atom_codes(AD, [D]), D >= 48, D =< 57).
number --> [].
number --> is_num_char, number.
quotes_on_atom(S) :- atom_chars(S, ['"' | C]), last(C, '"').
string --> quotes_on_atom.
literal --> bool; number; string.
variable --> \+ literal.
assignment --> string, ['='], expr.
equal --> expr, ['=='], expr.
not_equal --> expr, ['!='], expr.
if --> [if], expr, [then], expr.
for_decl --> [for], assignment, [to], number, [step], number.
for_next --> [next], integer.
expr --> literal; variable; assignment;
equal; not_equal; if; for_decl; for_next.
Here is my main goal:
main :-
% expected: [for, [i, '=', 0], to, '5', step, '1']
trace, phrase(expr, [for, i, '=', '0', to, '5', step, '1']).
Here is the error I get:
uncaught exception: error(existence_error(procedure,is_num_char/2),number/0).
A stack trace revealed this:
The debugger will first creep -- showing everything (trace)
1 1 Call: phrase(expr,[for,i,=,'0',to,'5',step,'1']) ?
2 2 Call: expr([for,i,=,'0',to,'5',step,'1'],_335) ?
3 3 Call: literal([for,i,=,'0',to,'5',step,'1'],_335) ?
4 4 Call: bool([for,i,=,'0',to,'5',step,'1'],_335) ?
4 4 Fail: bool([for,i,=,'0',to,'5',step,'1'],_335) ?
4 4 Call: number([for,i,=,'0',to,'5',step,'1'],_335) ?
4 4 Exit: number([for,i,=,'0',to,'5',step,'1'],[for,i,=,'0',to,'5',step,'1']) ?
3 3 Exit: literal([for,i,=,'0',to,'5',step,'1'],[for,i,=,'0',to,'5',step,'1']) ?
2 2 Exit: expr([for,i,=,'0',to,'5',step,'1'],[for,i,=,'0',to,'5',step,'1']) ?
2 2 Redo: expr([for,i,=,'0',to,'5',step,'1'],[for,i,=,'0',to,'5',step,'1']) ?
3 3 Redo: literal([for,i,=,'0',to,'5',step,'1'],[for,i,=,'0',to,'5',step,'1']) ?
4 4 Redo: number([for,i,=,'0',to,'5',step,'1'],[for,i,=,'0',to,'5',step,'1']) ?
5 5 Call: is_num_char([for,i,=,'0',to,'5',step,'1'],_446) ?
5 5 Exception: is_num_char([for,i,=,'0',to,'5',step,'1'],_459) ?
4 4 Exception: number([for,i,=,'0',to,'5',step,'1'],_335) ?
3 3 Exception: literal([for,i,=,'0',to,'5',step,'1'],_335) ?
2 2 Exception: expr([for,i,=,'0',to,'5',step,'1'],_335) ?
1 1 Exception: phrase(expr,[for,i,=,'0',to,'5',step,'1']) ?
uncaught exception: error(existence_error(procedure,is_num_char/2),number/0)
{trace}
It seems that is_num_char is being passed the whole list of tokens, along with something else. I do not understand why this is happening, given that the number rule accepts only one argument. Additionally, it's odd that the token list unifies with number to begin with. It should be unified with for_decl instead. If you are knowledgeable about Prolog DCGs please let me know what I am doing wrong here.

Why the output differs in these two erlang expression sequence in shell?

Why do the following two sequences produce different results in the Erlang shell?
1> Total=15.
2> Calculate=fun(Number)-> Total=2*Number end.
3> Calculate(6).
exception error: no match of right hand side value 12
1> Calculate=fun(Number)-> Total=2*Number end.
2> Total=15.
3> Calculate(6).
12
In Erlang the = operator is both assignment and assertion.
If I do this:
A = 1,
A = 2,
my program will crash. I just told it that A = 1 which, when A is unbound (doesn't yet exist as a label) it now is assigned the value 1 forever and ever -- until the scope of execution changes. So then when I tell it that A = 2 it tries to assert that the value of A is 2, which it is not. So we get a crash on a bad match.
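To make the "assignment or assertion" behaviour concrete, here is a small Python model of it. This is only a sketch under stated assumptions: bindings live in a dictionary, the helper name match is invented for this example, and the error text merely imitates the shell message. Types are compared as well as values because an Erlang match treats 15 and 15.0 as different.

```python
def match(bindings, name, value):
    """Bind `name` if it is unbound; otherwise assert it already equals `value`."""
    if name not in bindings:
        bindings[name] = value          # first use: behaves like assignment
        return value
    bound = bindings[name]
    # Later uses: behaves like an assertion, and 15 must not match 15.0.
    if type(bound) is not type(value) or bound != value:
        raise ValueError("no match of right hand side value %r" % (value,))
    return value
```

With an empty dictionary, match(env, "A", 1) binds A, a second match(env, "A", 1) succeeds, and match(env, "A", 2) raises.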
Scope in Erlang is defined by two things:
Definition of the current function. This scope is absolute for the duration of the function definition.
Definition of the current lambda or list comprehension. This scope is local to the lambda but also closes over whatever values from the outer scope are referenced.
These scopes are always superseded at the time they are declared by whatever is in the outer scope. That is how we make closures with anonymous functions. For example, let's say I have a socket I want to send a list of data through. The socket is already bound to the variable name Socket in the head of the function, and we want to use a list operation to map the list of values to send to a side effect of being sent over that specific socket. I can close over the value of the socket within the body of a lambda, which has the effect of currying that value out of the more general operation of "sending some data":
send_stuff(Socket, ListOfMessages) ->
Send = fun(Message) -> ok = gen_tcp:send(Socket, Message) end,
lists:foreach(Send, ListOfMessages).
Each iteration of the list operation lists:foreach/2 can only accept a function of arity 1 as its first argument. We have created a closure that captures the value of Socket internally already (because that was already bound in the outer scope) and combines it with the unbound, inner variable Message. Note also that we are checking whether gen_tcp:send/2 worked each time within the lambda by asserting that the return value of gen_tcp:send/2 was really ok.
This is a super useful property.
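The same closure pattern exists in most languages with first-class functions. The Python sketch below mirrors send_stuff under some assumptions: RecordingConnection is a hypothetical stand-in for a socket that only records what it is given, and its send method returning "ok" is invented here to imitate the Erlang return value, so this is not the real Python socket API.

```python
class RecordingConnection:
    """Hypothetical stand-in for a socket; it only records what was sent."""

    def __init__(self):
        self.sent = []

    def send(self, message):
        self.sent.append(message)
        return "ok"


def send_stuff(connection, messages):
    def send(message):
        # `connection` is not a parameter of send; it is captured from the
        # enclosing call, just as Socket is captured by the Erlang fun.
        status = connection.send(message)
        if status != "ok":
            raise RuntimeError("send failed: %r" % (status,))

    for message in messages:
        send(message)
```

The inner function takes one argument, the message, and everything else it needs comes from the scope it was created in.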
So with that in mind, let's look at your code:
1> Total = 15.
2> Calculate = fun(Number)-> Total = 2 * Number end.
3> Calculate(6).
In the code above you've just assigned a value to Total, meaning you have created a label for that value (just like we had assigned Socket in the above example). Then later you are asserting that the value of Total is whatever the result of 2 * Number might be. That can never be true here: Total is the integer 15, which is odd, so no integer Number doubles to it, and 2 * 7.5 wouldn't cut it either, because the result would be 15.0, not 15.
1> Calculate = fun(Number)-> Total = 2 * Number end.
2> Total = 15.
3> Calculate(6).
In this example, though, you've got an inner variable called Total which does not close over any value declared in the outer scope. Later, you are declaring a label in the outer scope called Total, but by this time the lambda definition on the first line has been converted to an abstract function and the label Total as used there has been completely given over to the immutable space of the new function definition the assignment to Calculate represented. Thus, no conflict.
Consider what happens, for example, with trying to reference an inner value from a list comprehension:
1> A = 2.
2
2> [A * B || B <- lists:seq(1,3)].
[2,4,6]
3> A.
2
4> B.
* 1: variable 'B' is unbound
This is not what you would expect from, say, Python 2:
>>> a = 2
>>> a
2
>>> [a * b for b in range(1,4)]
[2, 4, 6]
>>> b
3
Incidentally, this has been fixed in Python 3:
>>> a = 2
>>> a
2
>>> [a * b for b in range(1,4)]
[2, 4, 6]
>>> b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'b' is not defined
(And I would provide a JavaScript example for comparison as well, but the scoping rules there are just so absolutely insane it doesn't even matter...)
In the first case you have bound Total to 15. In Erlang, variables are immutable, but in the shell when you write Total = 15. you do not really create the variable Total; the shell does its best to mimic the behavior you would have if you were running an application, and it stores the pair {"Total",15} in a table.
On the next line you define the fun Calculate. The parser finds the expression Total=2*Number, and it goes through its table to detect that Total was previously defined. The evaluation is turned into something equivalent to 15 = 2*Number.
So, in the third line, when you ask to evaluate Calculate(6), it goes to calculate and evaluates 15 = 2*6 and issues the error message
exception error: no match of right hand side value 12
In the second example, Total is not yet defined when you define the function. The function is stored without assignment (Total is not used anymore), at least no assignment to a global variable. So there is no conflict when you define Total, and no error when you evaluate Calculate(6).
The behavior would be exactly the same in a compiled module.
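The effect of the two orderings can be imitated with a short Python model. It is a sketch, not a description of how the Erlang shell is implemented: it assumes the fun takes a snapshot of the outer bindings at the moment it is created, and the name make_calculate and the dictionary of bindings are invented for this illustration.

```python
def make_calculate(shell_bindings):
    """Build Calculate from the bindings that exist at definition time."""
    captured = dict(shell_bindings)  # snapshot taken when the fun is created

    def calculate(number):
        result = 2 * number
        if "Total" in captured:
            bound = captured["Total"]
            # Total was already bound, so "Total = 2 * Number" is an
            # assertion. The match is exact: 15 does not match 15.0.
            if type(bound) is not type(result) or bound != result:
                raise ValueError(
                    "no match of right hand side value %r" % (result,))
        # Otherwise Total is a fresh variable local to this call.
        return result

    return calculate
```

If Total is bound before make_calculate is called, calculate(6) raises; if Total is added to the dictionary afterwards, the snapshot does not see it and calculate(6) returns 12.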
The variable 'Total' is already bound to the value 15, so you cannot use the same variable name Total in the second line. You should change it to another name, such as Total1 or Total2.

Definition of f(x) in Lua documentation

While trying to fully understand the solution to "Lua - generate sequence of numbers", I found section 4.3.4 of Programming in Lua unclear:
for i=1,f(x) do print(i) end
for i=10,1,-1 do print(i) end
The for loop has some subtleties that you should learn in order to
make good use of it. First, all three expressions are evaluated once,
before the loop starts. For instance, in the first example, f(x) is
called only once. Second, the control variable is a local variable
automatically declared by the for statement and is visible only inside
the loop. [...]
The first line of code doesn't work of course.
What is f(x) and where is it defined?
Unfortunately the documentation isn't available as a single page, making it a huge effort to search for the first occurrence. Searching for "lua f(x)" doesn't bear fruit either.
Explanation: now that I have received answers, I realize the problem was a misunderstanding. I incorrectly interpreted "f(x) is called only once" as "the line containing f(x) - for i=1,f(x) do print(i) end - will only return one value" and didn't pay enough attention to "all three expressions are evaluated once, before the loop starts".
This sentence clarifies it: expressions are evaluated once, before the loop starts.
Thus, f(x) is called only once is merely stating that the expressions will not be affected by potential changes in the loop.
For example, the following code (expressions are i=1 and x in the second line):
x=5
for i=1,x do
x = x - 1
print(i, x)
end
print(x)
will produce the following output:
1 4
2 3
3 2
4 1
5 0
0
and will not produce the following output:
1 4
2 3
3 2
2
f(x) is just a function which takes the argument x and returns a value that is used as the upper bound for the loop.
So for example, if the function f(x) calculates x² and you call it as f(3), it would return the value of 9. The resulting for loop would look like this:
for i=1, f(3) do print(i) end
which is exactly the same as
for i=1, 9 do print(i) end
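One way to convince yourself that the bound is computed only once is to count the calls. The sketch below does this in Python rather than Lua, on the assumption that the point carries over: Python's range also evaluates its arguments a single time before iteration begins. The names demo and f are only for illustration, with f squaring its argument as in the example above.

```python
def demo(x):
    """Return the visited loop values and how many times f was called."""
    calls = []

    def f(value):
        calls.append(value)      # record every call to f
        return value * value

    # f(x) is evaluated once to build the range, not once per iteration.
    visited = [i for i in range(1, f(x) + 1)]
    return visited, len(calls)
```

Calling demo(3) visits the values 1 through 9 and reports a single call to f.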

Read numbers following a keyword into an array in Fortran 90 from a text file

I have many text files of this format
....
<snip>
'FOP' 0.19 1 24 1 25 7 8 /
'FOP' 0.18 1 24 1 25 9 11 /
/
TURX
560231
300244
70029
200250
645257
800191
900333
600334
770291
300335
220287
110262 /
SUBTRACT
'TURX' 'TURY'/
</snip>
......
where the portions I snipped off contain other various data in various formats. The file format is inconsistent (machine generated), the only thing one is assured of is the keyword TURX which may appear more than once. If it appears alone on one line, then the next few lines will contain numbers that I need to fetch into an array. The last number will have a space then a forward slash (/). I can then use this array in other operations afterwards.
How do I "search" or parse a file of unknown format in fortran, and how do I get a loop to fetch the rest of the data, please? I am really new to this and I HAVE to use fortran. Thanks.
Fortran 95 / 2003 have a lot of string and file handling features that make this easier.
For example, this code fragment to process a file of unknown length:
use iso_fortran_env
character (len=100) :: line
integer :: ReadCode
ReadLoop: do
read (75, '(A)', iostat=ReadCode ) line
if ( ReadCode /= 0 ) then
if ( ReadCode == iostat_end ) then
exit ReadLoop
else
write ( *, '( / "Error reading file: ", I0 )' ) ReadCode
stop
end if
end if
! code to process the line ....
end do ReadLoop
Then the "process the line" code can contain several sections depending on a logical variable "Have_TURX". If Have_TURX is false you are "seeking": test whether the line contains "TURX". You could use a plain "==" if TURX is always at the start of the string, or for more generality you could use the intrinsic function "index" to test whether the string "line" contains TURX.
Once the program is in the mode where Have_TURX is true, you use "internal I/O" to read the numeric value from the string. Since the integers have varying lengths and are left-justified, the easiest way is to use "list-directed I/O". Combining these:
read (line, *) integer_variable
Then you could use the intrinsic function "index" again to test whether the string also contains a slash, in which case you change Have_TURX to false and end reading mode.
If you need to put the numbers into an array, it might be necessary to read the file twice, or to backspace the file, because you will have to allocate the array, and you can't do that until you know the size of the array. Or you could pop the numbers into a linked list, then when you hit the slash allocate the array and fill it from the linked list. Or if there is a known maximum number of values you could use a temporary array, then transfer the numbers to an allocatable output array. This assumes that you want the output argument of the subroutine to be an allocatable array of the correct length, and that it returns one group of numbers per call:
integer, dimension (:), allocatable, intent (out) :: numbers
allocate (numbers (1: HowMany) )
P.S. There is a brief summary of the language features at http://en.wikipedia.org/wiki/Fortran_95_language_features and the gfortran manual has a summary of the intrinsic procedures, from which you can see what built in functions are available for string handling.
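The seeking/reading logic described above is easier to see in a compact form. Here is a sketch of that two-state loop in Python, not Fortran, so it only illustrates the control flow. It assumes that TURX appears alone on its line, that every field after it is an integer, and that a slash ends the group. The function name extract_turx_blocks is invented for this example. Python lists grow on demand, so the allocation question discussed above does not come up here.

```python
def extract_turx_blocks(lines):
    """Return one list of integers per TURX group found in `lines`."""
    blocks = []
    current = None  # None means "seeking"; a list means "reading numbers"
    for line in lines:
        text = line.strip()
        if current is None:
            if text == "TURX":      # keyword alone on its line
                current = []
            continue
        has_slash = "/" in text
        # Drop the terminator, then convert every remaining field.
        # int() raises ValueError if a field is not an integer.
        for field in text.replace("/", " ").split():
            current.append(int(field))
        if has_slash:               # end of this group: back to seeking
            blocks.append(current)
            current = None
    return blocks
```

A line such as 'TURX' 'TURY'/ is ignored while seeking because it is not exactly TURX, which matches the requirement that the keyword stands alone on its line.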
I'll give you a nudge in the right direction so that you can finish your project.
Some basics:
Do/While, as you'll need some sort of loop structure to loop through the file and then over the numbers. There's no for loop in Fortran, so use this type.
Read, to read the strings.
To start you need something like this:
program readlines
implicit none
character (len=30) :: rdline
integer,dimension(1000) :: array
! This sets up a character string of length 30 and an integer array with 1000 elements
!
open(18,file='fileread.txt')
do
read(18,*) rdline
if (trim(rdline).eq.'TURX') exit !loop until the trimmed off portion matches TURX
end do
See this thread for way to turn your strings into integers.
Final edit: Looks like MSB has got most of what I just found out. The iostat argument of the read is the key to it. See this site for a sample program.
Here was my final way around it.
PROGRAM fetchnumbers
implicit none
character (len=50) ::line, numdata
logical ::is_numeric
integer ::I,iost,iost2,counter=0,number
integer, parameter :: long = selected_int_kind(10)
integer, dimension(1000)::numbers !Can the number of numbers be up to 1000?
open(20,file='inputfile.txt') !assuming file is in the same location as program
ReadLoop: do
read(20,*,iostat=iost) line !read data line by line
if (iost .LT. 0) exit !end of file reached before TURX was found
if (len_trim(line)==0) cycle ReadLoop !ignore empty lines
if (index(line, 'TURX').EQ.1) then !prepare to begin capturing
GetNumbers: do
read(20, *,iostat=iost2)numdata !read in the numbers one by one
if (iost2 /= 0) exit !end of file or read error while fetching numbers
if (.NOT.is_numeric(numdata)) exit !no more numbers to read
read (numdata,*) number !read string value into a number
if (counter >= size(numbers)) exit !array is full, stop before overflowing it
counter = counter + 1
numbers(counter)=number !storing data into array
end do GetNumbers
end if
end do ReadLoop
write(*,*) "Numbers are:"
do I=1,counter
write(*,'(I14)') numbers(I)
end do
END PROGRAM fetchnumbers
FUNCTION is_numeric(string)
IMPLICIT NONE
CHARACTER(len=*), INTENT(IN) :: string
LOGICAL :: is_numeric
REAL :: x
INTEGER :: e
is_numeric = .FALSE.
READ(string,*,IOSTAT=e) x
IF (e == 0) is_numeric = .TRUE.
END FUNCTION is_numeric

How do I format a PRINT or WRITE statement to overwrite the current line on the console screen?

I want to display the progress of a calculation done with a DO-loop, on the console screen. I can print out the progress variable to the terminal like this:
PROGRAM TextOverWrite_WithLoop
IMPLICIT NONE
INTEGER :: Number, Maximum = 10
DO Number = 1, MAXIMUM
WRITE(*, 100, ADVANCE='NO') REAL(Number)/REAL(Maximum)*100
100 FORMAT(TL10, F10.2)
! Calculations on Number
END DO
END PROGRAM TextOverWrite_WithLoop
The output of the above code on the console screen is:
10.00 20.00 30.00 40.00 50.00 60.00 70.00 80.00
90.00 100.00
All on the same line, wrapped only by the console window.
The ADVANCE='NO' argument and the TL10 (tab left so many spaces) edit descriptor work well to overwrite text on the same line, e.g. the output of the following code:
WRITE(*, 100, ADVANCE='NO') 100, 500
100 FORMAT(I3, 1X, TL4, I3)
Is:
500
Instead of:
100 500
Because of the TL4 edit descriptor.
From these two instances one can conclude that the WRITE statement cannot overwrite what has been written by another WRITE statement or by a previous execution of the same WRITE statement (as in a DO-loop).
Can this be overcome somehow?
I am using the FTN95 compiler on Windows 7 RC1. (The setup program of the G95 compiler bluescreens Windows 7 RC1, even though it works fine on Vista.)
I know about the question Supressing line breaks in Fortran 95 write statements, but it does not work for me, because the answer to that question means new output is added to the previous output on the same line, instead of new output overwriting the previous output.
Thanks in advance.
The following should be portable across systems by use of ACHAR(13) to encode the carriage return.
character*1 creturn
! CODE::
creturn = achar(13) ! generate carriage return
! other code ...
WRITE( * , 101 , ADVANCE='NO' ) creturn , i , npoint
101 FORMAT( a , 'Point number : ',i7,' out of a total of ',i7)
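The carriage-return approach is not specific to Fortran. As a rough illustration, the same trick looks like this in Python: write "\r" first so the cursor goes back to the start of the line, then redraw the text. The function names write_progress and show_progress are made up for this sketch, and whether the line is visibly overwritten still depends on the terminal honouring the carriage return.

```python
import sys


def write_progress(stream, current, total):
    """Return to column 0 with a carriage return, then redraw the line."""
    stream.write("\rPoint number : %7d out of a total of %7d" % (current, total))
    stream.flush()  # without a flush the update may sit in the buffer


def show_progress(total, stream=sys.stdout):
    """Redraw the same line once per step, then end it with a newline."""
    for current in range(1, total + 1):
        write_progress(stream, current, total)
    stream.write("\n")  # finish the line once the loop is done
```

Fixed-width fields (%7d here, i7 in the FORMAT above) matter: if a later message were shorter than an earlier one, leftover characters from the earlier one would remain on screen.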
There is no solution to this question within the scope of the Fortran standards. However, if your compiler understands backslash escapes in Fortran strings (GNU Fortran does if you use the option -fbackslash), you can write
write (*,"(A)",advance="no") "foo"
call sleep(1)
write (*,"(A)",advance="no") "\b\b\bbar"
call sleep(1)
write (*,"(A)",advance="no") "\b\b\bgee"
call sleep(1)
write (*,*)
end
This uses the backspace character (\b) to move back over previously written characters on that line, so that the new text overwrites them.
NB: if your compiler does not understand advance="no", you can use related non-standard tricks, such as using the $ specifier in the format string.
The following worked perfectly using g95 fortran:
NF = NF + 1
IF(MOD(NF,5).EQ.0) WRITE(6,42,ADVANCE='NO') NF, ' PDFs'//CHAR(13)
42 FORMAT(I6,A)
gave:
5 PDFs
leaving the cursor at the #1 position on the same line. On the next update,
the 5 turned into a 10. ASCII 13 (decimal) is a carriage return.
OPEN(6,CARRIAGECONTROL ='FORTRAN')
DO I=1,5
WRITE(6,'(1H+" ",I)') I
ENDDO
