How to properly implement POSTPONE in a Forth system? - forth

John Hayes' ANS Forth test suite contains the following definition:
: IFFLOORED [ -3 2 / -2 = INVERT ] LITERAL IF POSTPONE \ THEN ;
This is then used to conditionally define various words depending on whether we're using floored or symmetric division:
IFFLOORED : T/MOD >R S>D R> FM/MOD ;
So IFFLOORED acts like either a noop or a \ depending on the result of the expression. Fine. That's easily implementable on my threaded interpreter by doing this:
: POSTPONE ' , ; IMMEDIATE
...and now IFFLOORED works; the definition is equivalent to : IFFLOORED -1 IF ['] \ EXECUTE THEN ;.
Unfortunately, further down the test suite is the following code:
: GT1 123 ;
: GT4 POSTPONE GT1 ; IMMEDIATE
: GT5 GT4 ;
\ assertion here that the stack is empty
The same implementation doesn't work here. If POSTPONE compiles a reference to its word, then GT4 becomes the equivalent of : GT4 123 ;... but GT4 is immediate. So when GT5 is defined, 123 is pushed onto the compiler's stack and GT5 becomes a noop. But that's not right; the test suite expects calling GT5 to leave 123 on the stack. So for this to work, POSTPONE must generate code which generates code:
: POSTPONE ' LITERAL ['] , LITERAL ;
And, indeed, if I play with gForth, I see that POSTPONE actually works like this:
: GT1 123 ;
: GT4 POSTPONE GT1 ; IMMEDIATE
SEE GT4
<long number> compile, ;
But these two definitions are not compatible. If I use the second definition, the first test fails (because now IFFLOORED tries to compile \ rather than executing it). If I use the first definition, the second test fails (because GT4 pushes onto the compiler stack rather than compiling a literal push).
...but both tests pass in gForth.
So what's going on?

Let me answer here, as the question changed considerably. I am still not sure I understand the question, though :)
In your example, you define
: GT4 POSTPONE GT1 ; IMMEDIATE
What happens here, is the following:
: is executed, reading GT4 and creating the new word
POSTPONE's compilation semantics is executed, which is to compile the compilation semantics of GT1 - as you have seen in GForth.
; is executed, ending the definition
IMMEDIATE is executed, marking the last defined word as immediate.
POSTPONE is called only when compiling GT4, and it does not appear in the compiled code. So when later using this immediate word in the definition of GT5, the interpretation semantics of POSTPONE is not needed.
By the way, according to the standard, POSTPONE has only compilation semantics, and the interpretation semantics is undefined.
See also the POSTPONE tutorial in the GForth manual.
EDIT Examples of interpretation and compilation semantics:
: TEST1 ." interpretation" ; => ok
: TEST2 ." compilation" ; IMMEDIATE => ok
: TEST3 TEST1 TEST2 ; => compilation ok
TEST3 => interpretation ok
: TEST4 POSTPONE TEST1 ; IMMEDIATE => ok
: TEST5 TEST4 ; => ok
TEST5 => interpretation ok
: TEST6 POSTPONE TEST2 ; IMMEDIATE => ok
TEST6 => compilation ok
If you have any more questions, you can reference these tests.

The snippet you've quoted does the following things:
Evaluate -3/2 (at compile time), and check if it is -2.
If it is, store a 0 (false), otherwise store a -1 (true) in IFFLOORED, so when it is evaluated, it will put this value on the stack. (This is the effect of LITERAL.)
When evaluating IFFLOORED, after pushing the value on the stack, comes an IF - THEN expression. When the value is true, it means that we are not in a floored environment, so we want to comment out the rest of the line, and that is what \ does.
So here comes the tricky part - \ is IMMEDIATE, i.e., you cannot use it inside a colon definition, because it will comment out the rest of the line. You have to explicitly tell the compiler that you want to compile this function, and not execute it, which is what POSTPONE does.
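A minimal sketch of the same trick, assuming a standard system that loads source from a text file; `?SKIP-LINE`, `KEPT` and `DROPPED` are hypothetical names used only for illustration. Taking the flag from the stack makes it easier to see that the postponed `\` runs when the word executes, not when it is compiled:

```forth
\ Skip the rest of the current source line when flag is true.
: ?SKIP-LINE ( flag -- ) IF POSTPONE \ THEN ;

FALSE ?SKIP-LINE : KEPT ( -- n ) 1 ;      \ line is interpreted, KEPT is defined
TRUE  ?SKIP-LINE : DROPPED ( -- n ) 2 ;   \ line is skipped, DROPPED is never defined

KEPT .   \ prints 1
```

IFFLOORED does the same thing, except that its flag is computed once, when IFFLOORED is compiled, and baked in with LITERAL.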

The behavior of the word postpone in compilation state is to determine the compilation semantics of its parsed argument and append them to the current definition.
Compilation semantics for a word can be either special or ordinary (see section 3.4.3.3 Compilation semantics of Forth-2012). To work correctly, postpone must distinguish these cases and generate code according to a different pattern for each.
The problem with your implementations is that each is correct for only one case: the first for special (immediate) compilation semantics, the second for ordinary compilation semantics.
A standard compliant implementation is as follows:
: state-on ( -- ) 1 state ! ;
: state-off ( -- ) 0 state ! ;
: execute-compiling ( i*x xt -- j*x )
state @ if execute exit then
state-on execute state-off
;
: postpone ( "name" -- )
bl word find dup 0= -13 and throw 1 = ( xt flag-special )
swap lit, if ['] execute-compiling else ['] compile, then compile,
; immediate
See more details in my post How POSTPONE should work.
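To make the two patterns concrete, here is a hedged sketch of what a standard POSTPONE amounts to in each case, written out by hand. The names ending in `-STD` and `-BY-HAND` are hypothetical, and the hand-written form for the immediate case assumes a classic single-xt system where executing the xt of an immediate word performs its compilation semantics:

```forth
: GT1 ( -- n ) 123 ;

\ Ordinary word: postponing it means "compile a call to it later".
: GT4-STD     ( -- ) POSTPONE GT1 ; IMMEDIATE
: GT4-BY-HAND ( -- ) ['] GT1 COMPILE, ; IMMEDIATE

: GT5-STD     ( -- n ) GT4-STD ;
: GT5-BY-HAND ( -- n ) GT4-BY-HAND ;

GT5-STD . GT5-BY-HAND .   \ prints 123 123

\ Immediate word: postponing it means "execute it later".
: SKIP-STD     ( -- ) POSTPONE \ ;
: SKIP-BY-HAND ( -- ) ['] \ EXECUTE ;

SKIP-STD     this text is never interpreted
SKIP-BY-HAND neither is this
```

The first question's `' ,` definition corresponds to the second pattern and its "code which generates code" definition to the first, which is why neither one alone passes both tests.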

Related

Why is my parser showing errors for valid programs?

I can't figure out what's wrong with my parser. Here are the associated files:
parse.y
declarations: INTEGER_SIZE IDENTIFIER TERMINATOR {declare($1,$2);}
void yyerror(char *err){
printf("\n\nYYError on line %d: Error = %s\n", yylineno, err);
}
scan.l
[Xx]+ {yylval.size = strlen(yytext);
When running it against the valid program below it shows an error at line 3; when running any of the lines individually it shows an error on line 1 via the yyerror() function.
BEGINING.
XXX XY-1.
XXXX Y.
XXXX Z.
BODY.
PRINT “Please enter a number? ”.
INPUT Y.
MOVE 15 TO Z.
ADD Y TO Z.
PRINT XY-1;” + “;Y;”=”;Z.
END.
To run the files run the following commands:
yacc -d parser.y
lex lexer.l
gcc -o parser lex.yy.c y.tab.c -ll
This non-terminal is called declarations, from which one might think that it matches one or more declarations, or perhaps zero or more declarations:
declarations: INTEGER_SIZE IDENTIFIER TERMINATOR {declare($1,$2);}
But the rule matches exactly three tokens, which is to say one declaration. So when you give it an input with two declarations, it fails on the second one.
Similarly, your non-terminal called statements only matches a single statement, not several as might be expected from its name.
Grammars need to be explicit. If you want to match several declarations, you have to write that:
declarations: declaration
| declarations declaration
By the way, I have seen grammars before that were written in the belief that you have to put {;} at the end of a production. I'm curious where this idea comes from. Yacc and bison do not require that productions have an action, and anyway an empty action is {}, just as it is in C.

What exactly does the "DOES>" word do?

I was messing around and trying to understand it, so I wrote a simple word to test it:
: test ." compile time" DOES> ." runtime" ;
The problem is, this word doesn't behave in a consistent way at all. Its output seems to vary depending on a number of factors, like:
is this the first line to be interpreted?
are there other words defined after it?
Also, sometimes it doesn't print anything at all.
(Using Gforth)
This is your code:
: test ." compile time" DOES> ." runtime" ;
After entering that, I can use your word without the ambiguous behavior you are encountering:
CREATE def 12345 , test \ prints "compile time"
It prints compile time, because that's the behavior you compiled into test before DOES>.
Note: this is not actually running at compile time.
DOES> ends the compile-time part of test. When test runs, it also modifies the most recently defined word (here def) so that executing that word pushes its data field address onto the stack and then runs the behaviour found after DOES>.
Using the word I created, it has the instantiated behavior you defined, following the implicit behaviour of pushing the address:
def @ . \ prints runtime 12345
Forth 2012 note: Per the definition of DOES> in Forth 2012, this would cause ambiguous behavior if the last word was not defined with CREATE. However, Gforth allows any word definition to be modified.
I hope this example helps explain why, usually, CREATE is used within the definition that uses DOES>, but it is certainly not required.
ruvim's answer may be easier to understand in Gforth.
The code below defines a 'classic' defining word that creates variables initialised to the item on the stack at compile time. I hope the tracing statements in the definitions will help show what is happening
: var \ create: n <name> -- ; does>: -- addr ; initialised VARIABLE
create \ create a dictionary item for <name>.
." HERE at compile time: " HERE .
, \ place n at HERE in the dictionary
does> \ Push the HERE as at compile time to the stack
." Run time address on the stack:" dup .
;
var can now be used to define new words that have the run time action defined after DOES>.
10 var init10 \ Use var to define a new word init10
\ HERE at compile time: 135007328
init10 CR DUP . @ .
\ Run time address on the stack:135007328
\ 135007328 10 \ addr of init10, content of that address.
12 var init12 \ Use var to define a new word init12
\ HERE at compile time: 135007376
init12 CR DUP . @ .
\ Run time address on the stack:135007376
\ 135007376 12
100 init10 ! \ Store 100 in init10
\ Run time address on the stack:135007328
init10 @ .
\ Run time address on the stack:135007328 100
\ init10 now contains 100
Hopefully the answers will provide a framework to explore defining words and the action of DOES>.
Interactive play
In Gforth you can play with does> interpretively.
create foo 123 ,
foo @ . \ prints 123
does> ( addr -- ) @ . ;
foo \ prints 123
does> ( addr -- ) @ 1+ . ;
foo \ prints 124
' foo >body @ . \ prints 123
So does> just changes behavior of the last word, when this last word is defined via create. It's a mistake if you run does> when the last word was not defined via create.
Usage in practice
Usually does> is used to set a new behavior only once for a word defined via create. The ability to alter this behavior several times is just a side effect of the historical implementation, and it is almost never used in practice.
Alternative ways
In practice, the cases when does> is used, can be also implemented without does>.
For example, suppose we want to implement a word counter that creates a counter returning the next value each time it is called, used in the following way:
1 counter x1
x1 . \ prints 1
x1 . \ prints 2
x1 . \ prints 3
An implementation via create does>
: counter ( x0 "ccc" -- ) \ Run-Time: ( -- x )
create , does> ( addr -- x ) dup >r @ dup 1+ r> !
;
An implementation using a quotation
[undefined] lit, [if] : lit, ( x -- ) postpone literal ; [then]
[undefined] xt, [if] : xt, ( xt -- ) compile, ; [then]
: counter ( x0 "ccc" -- ) \ Run-Time: ( -- x )
align here >r , [: ( addr -- x ) dup >r @ dup 1+ r> ! ;] >r
: r> r> lit, xt, postpone ;
;
An implementation using a macro (code inlining) via the word ]]:
: counter ( x0 "ccc" -- ) \ Run-Time: ( -- x )
align here >r ,
: r> lit, ]] dup >r @ dup 1+ r> ! [[ postpone ;
;
A defining word whose children allocate memory can give them an action through DOES>, which receives the address of the memory block.
A simple example is CONSTANT, which can be defined as
: CONSTANT ( n -- ) CREATE , DOES> @ ;
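A quick way to try this without shadowing the built-in word is to use a different name. This is only a sketch; `MY-CONSTANT` and `ANSWER` are hypothetical names:

```forth
\ CREATE makes the new word and "," stores n in its data field;
\ the code after DOES> fetches n back whenever the new word runs.
: MY-CONSTANT ( n "name" -- ) CREATE , DOES> ( -- n ) @ ;

42 MY-CONSTANT ANSWER
ANSWER .   \ prints 42
```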

Error when appending string from word or variable

I'm trying to append two strings in gforth, but I get some scary looking error messages.
While s" foo" s" bar" append type cr works fine, as soon as I start storing strings in variables or creating them from words, I get errors. For instance:
: make-string ( -- s )
s" foo" ;
: append-print ( s s -- )
append type cr ;
make-string s" bar" append-print
Running it produces the following error:
$ gforth prob1.fs -e bye
gforth(41572,0x7fff79cc2310) malloc: *** error for object 0x103a551a0: pointer being realloc'd was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6.
I'm well versed in C, so it seems pretty clear that I'm using Forth incorrectly!
I suppose I need to learn something very basic about memory management in Forth.
Can anyone please explain what goes wrong here, and what I should do?
I also run into problems when I try to append a string that is stored in a variable:
variable foo
s" foo" foo !
foo s" bar " append type cr
This ends in a loop that I have to break:
$ gforth prob2.fs
foo��^C
in file included from *OS command line*:-1
prob2.fs:4: User interrupt
foo s" bar " append >>>type<<< cr
Backtrace:
$10C7C2E90 write-file
For reference, I'm using gforth 0.7.2 on Mac OS X. I would be very grateful for some good explanations on what's going on.
Update
I can see the definition of append:
see append
: append
>l >l >l >l @local0 @local1 @local3 + dup >l resize throw >l @local4 @local0 @local3 + @local5
move #local0 #local1 lp+!# 48 ; ok
So, it would seem I need to manage memory myself in Forth? If so, how?
Solution
Andreas Bombe provides the clue below. The final program that works would be
: make-string ( -- s )
s" foo" ;
: append-print
s+ type cr ;
make-string s" bar" append-print
Output is
$ gforth b.fs -e bye
foobar
append uses resize on the first string to make space for appending the second string. This requires that the first string be allocated on the heap.
When you compile a string with s" into a word, it gets allocated in the dictionary. If you try resize (directly or indirectly through append) on that pointer you will get the error you see.
Normally s" has undefined interpretation semantics. Gforth defines its interpretation semantics for convenience as allocating the string on the heap. That's why it works (in gforth) as long as you don't compile it.
Edit:
I've found the definition of append, it's part of libcc.fs (a foreign function interface builder as it seems) and not a standard word. This is the definition in the source, more readable than the see decompile:
: append { addr1 u1 addr2 u2 -- addr u }
addr1 u1 u2 + dup { u } resize throw { addr }
addr2 addr u1 + u2 move
addr u ;
Immediately before that is a definition of s+:
: s+ { addr1 u1 addr2 u2 -- addr u }
u1 u2 + allocate throw { addr }
addr1 addr u1 move
addr2 addr u1 + u2 move
addr u1 u2 +
;
As you can see this one allocates new memory space instead of resizing the first string and concatenates both strings into it. You could use this one instead. It is not a standard word however and just happens to be in your environment as an internal implementation detail of libcc.fs in gforth so you can't rely on it being available elsewhere.
Strings in Forth rarely warrant dynamic allocation, and certainly not in your example. You can get by nicely with buffers that you allocate yourself using ALLOT and some very simple words to manipulate them.
[ALLOT uses the data space (ANSI term) in an incremental fashion for adding words and buffers. It is not dynamic, you can't release an item without removing at the same time all items ALLOT-ted later. It is also simple. Do not confuse with ALLOCATE which is dynamic and is in a separate extension wordset]
You make a fundamental mistake in leaving out the specification of your append-buffer.
It doesn't work, and we don't know how it is supposed to work!
In ciforth, an example could be:
: astring S" foo" ;
CREATE buffer 100 ALLOT \ space for 100 chars
\ Put the first string in buffer and append the second string.
\ Also print the second string
: append-print ( s s -- )
type cr 2swap
buffer $!
buffer $+! ;
astring s" bar" append-print
bar OK \ answer
buffer $@ TYPE
foobar OK \ answer
Other Forths have other non-standard words to manipulate simple strings. An excursion through malloc land is really not necessary. In the gforth documentation you can look up 'place' and find an equivalent family of words.
Also nowadays (Forth 2012) you can have strings like so "foo".
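For systems without `$!` and `$+!`, the ALLOT-buffer approach can be sketched with standard words only. This is an illustration under stated assumptions, not a library: `CAT-BUF`, `BUF-PLACE` and `BUF-APPEND` are hypothetical names, the buffer holds a counted string (one length byte followed by the characters), one character is assumed to occupy one address unit, and there is no overflow check:

```forth
CREATE CAT-BUF 100 ALLOT            \ 1 count byte + up to 99 characters

\ Overwrite the buffer with the given string.
: BUF-PLACE ( c-addr u -- )
  DUP CAT-BUF C!                    \ store the new length
  CAT-BUF 1+ SWAP MOVE ;            \ copy the characters after the count byte

\ Append the given string to the buffer's current contents.
: BUF-APPEND ( c-addr u -- )
  CAT-BUF COUNT +                   \ address just past the current contents
  SWAP DUP >R MOVE                  \ copy the new characters there
  R> CAT-BUF C@ + CAT-BUF C! ;      \ update the length

: MAKE-STRING ( -- c-addr u ) S" foo" ;

MAKE-STRING BUF-PLACE
S" bar" BUF-APPEND
CAT-BUF COUNT TYPE                  \ prints foobar
```

Because the text is copied into data space that the program owns, it does not matter whether the source string was compiled into the dictionary or allocated on the heap.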

How to define VALUE and TO

I want to implement the Forth words VALUE and TO on an RPC/8 (an emulated computer in a Minecraft mod). My best attempts get me a set of words that work fine so long as I don't use them while compiling. More specifically, VALUE works, but TO does not.
: VALUE CREATE , DOES> @ ;
: TO ' 3 + ! ;
I have tried everything I can think of to get it working and my best attempt gets me this:
['] NameOfAValue 3 + !
Note that the processor is not a pure 6502 but a 65EL02, a custom variant of the 65816.
EDIT #1: Somehow I forgot the call to CREATE in value. It should have been there all along.
EDIT #2: I also got 3 and + switched around in TO... oops. It should have been the other way all along.
The simplest solution is
VARIABLE TO-MESSAGE \ 0 : FROM , 1 : TO .
: TO 1 TO-MESSAGE ! ;
: VALUE CREATE , DOES> TO-MESSAGE @ IF ! ELSE @ THEN
0 TO-MESSAGE ! ;
It uses only CORE words and is absolutely standard.
And it just works in interpret and compile mode, because there is no fishy look ahead in the input stream.
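A usage sketch of this flag-based approach, with the words renamed so they do not shadow a system's own VALUE and TO. `MY-VALUE`, `MY-TO`, `TO-FLAG`, `SPEED` and `SET-SPEED` are hypothetical names, and the flag is cleared explicitly because VARIABLE is not guaranteed to start at zero:

```forth
VARIABLE TO-FLAG   0 TO-FLAG !      \ nonzero: the next value access is a store

: MY-TO ( -- ) 1 TO-FLAG ! ;

: MY-VALUE ( n "name" -- )
  CREATE ,
  DOES> ( addr -- n | n addr -- )
    TO-FLAG @ IF ! ELSE @ THEN      \ store or fetch, depending on the flag
    0 TO-FLAG ! ;                   \ always reset the flag afterwards

5 MY-VALUE SPEED
SPEED .                             \ prints 5
7 MY-TO SPEED
SPEED .                             \ prints 7

: SET-SPEED ( n -- ) MY-TO SPEED ;  \ the same phrase works inside a definition
9 SET-SPEED
SPEED .                             \ prints 9
```

The trade-off is a flag test on every access to the value, in exchange for a TO that never has to parse the input stream.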
OK, after a lot of trial and error and much searching, I found something that should work but, because of two bugs in redFORTH, does not.
VALUE
\ Works fine, now to reset the value.
: VALUE \ n <name> --
CREATE ,
DOES> @
;
TO
\ Works if not compiling, LITERAL and POSTPONE are broken.
: TO
TIBWORD FIND 3 +
STATE @ IF
POSTPONE LITERAL
POSTPONE !
ELSE
!
THEN
; IMMEDIATE
Demo of bug in LITERAL
\ fails, very weird error.
: TESTLIT [ 42 ] LITERAL ;
\ TESTLIT Unknown Token: TESTLIT
\ FORGET TESTLIT Unknown Token: TESTLIT
\ WORDS TESTLIT COLD SORTMATCH ...
Demo of bug in POSTPONE
\ fails, postpone is directly equivalent to [']
: TESTPOST POSTPONE + ; IMMEDIATE
: TEST 2 2 TESTPOST . ;
\ . 1935
\ ' + . 1935
I'm off to file a bug report....
EDIT #1: After some more trial and error and not a little swearing (I'm not good with FORTH) I found a way to make it work.
: TO
TIBWORD FIND 3 +
STATE @ IF
(lit) (lit) , , \ store address
(lit) ! ,
ELSE
!
THEN
; IMMEDIATE
I'm not sure how your Forth handles interpreting versus compile time, but the definition of TO is trying to store a value to address 3. Seems fishy.

Create a Print Function

I'm learning Bison, and so far the only thing I have done is the rpcalc example. Now I want to implement a print function (like C's printf) with a syntax like print ("Something here");, but I don't know how to build the print function or how to handle that ; as the end of a line. Thanks for your help.
You first need to ask yourself:
What are the [sub-]parts of my 'print ("something");' syntax ?
Once you identify these parts, "simply" describe them in the form of grammar syntax rules, along with applicable production rules. And then let Bison generate the parser for you; that's about it.
To put you on your way:
The semicolon is probably an element you will use to separate statements (such as one "call" to print from another).
'print' itself is probably a keyword, or preferably a native function name of your language.
The print statement appears to take a literal string as [one of] its arguments. A literal string starts and ends with a double quote (and probably allows for escaped quotes within itself).
etc.
The bolded and italic expressions above are some of the entities (the 'symbols' in parser lingo) you'll likely need to define in the syntax for your language. For that you'll use Bison grammar rules, such as
stmt : print_stmt ';' | input_stmt ';'| some_other_stmt ';' ;
print_stmt : print '(' args ')'
{ printf( $3 ); }
;
args : arg ',' args;
...
Since the question asked about the semicolon, some of the confusion may come from its different uses: in the rules above, the quoted ';' belongs to your language's syntax, whereas the unquoted ; at the end of each grammar rule is part of Bison's own syntax.
Note: this is of course a simplistic implementation, aimed at showing the essentials. Also, the Bison syntax may be a tad off (been there / done it, but a long while back ;-) I then "met" ANTLR, never to return to Bison, although I do see how its lightweight and fully self-contained nature can make it appropriate in some cases).

Resources