Error when appending string from word or variable - forth

I'm trying to append two strings in gforth, but I get some scary looking error messages.
While s" foo" s" bar" append type cr works fine, as soon as I start storing strings in variables or creating them from words, I get errors. For instance:
: make-string ( -- s )
s" foo" ;
: append-print ( s s -- )
append type cr ;
make-string s" bar" append-print
Running it produces the following error:
$ gforth prob1.fs -e bye
gforth(41572,0x7fff79cc2310) malloc: *** error for object 0x103a551a0: pointer being realloc'd was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6.
I'm well versed in C, so it seems pretty clear that I'm using Forth incorrectly!
I suppose I need to learn something very basic about memory management in Forth.
Can anyone please explain what goes wrong here, and what I should do?
I also run into problems when I try to append a string that is stored in a variable:
variable foo
s" foo" foo !
foo s" bar " append type cr
This ends in a loop that I have to break:
$ gforth prob2.fs
foo��^C
in file included from *OS command line*:-1
prob2.fs:4: User interrupt
foo s" bar " append >>>type<<< cr
Backtrace:
$10C7C2E90 write-file
For reference, I'm using gforth 0.7.2 on Mac OS X. I would be very grateful for some good explanations on what's going on.
Update
I can see the definition of append:
see append
: append
>l >l >l >l #local0 #local1 #local3 + dup >l resize throw >l #local4 #local0 #local3 + #local5
move #local0 #local1 lp+!# 48 ; ok
So, it would seem I need to manage memory myself in Forth? If so, how?
Solution
Andreas Bombe provides the clue below. The final program that works would be
: make-string ( -- s )
s" foo" ;
: append-print
s+ type cr ;
make-string s" bar" append-print
Output is
$ gforth b.fs -e bye
foobar

append uses resize on the first string to make space for the second string. This requires that the first string be allocated on the heap.
When you compile a string with s" into a word, it gets allocated in the dictionary. If you try resize (directly or indirectly through append) on that pointer you will get the error you see.
Normally s" has undefined interpretation semantics. Gforth defines its interpretation semantics for convenience as allocating the string on the heap. That's why it works (in gforth) as long as you don't compile it.
Edit:
I've found the definition of append, it's part of libcc.fs (a foreign function interface builder as it seems) and not a standard word. This is the definition in the source, more readable than the see decompile:
: append { addr1 u1 addr2 u2 -- addr u }
addr1 u1 u2 + dup { u } resize throw { addr }
addr2 addr u1 + u2 move
addr u ;
Immediately before that is a definition of s+:
: s+ { addr1 u1 addr2 u2 -- addr u }
u1 u2 + allocate throw { addr }
addr1 addr u1 move
addr2 addr u1 + u2 move
addr u1 u2 +
;
As you can see, this one allocates new memory instead of resizing the first string, and copies both strings into it. You could use this one instead. However, it is not a standard word; it just happens to be in your environment as an internal implementation detail of libcc.fs in gforth, so you can't rely on it being available elsewhere.
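If you would rather keep using append, one option is to copy the compiled string to the heap first, so that resize receives a pointer that really came from allocate. The following is a minimal sketch, not part of gforth: heap-copy is a hypothetical helper name, built only from the standard words allocate, throw, and move. It assumes append is available as in the question's gforth 0.7.2 session.

```forth
\ heap-copy is a hypothetical helper, not a gforth word.
\ It copies a string into freshly ALLOCATEd memory so that RESIZE may be applied to the result.
: heap-copy ( c-addr u -- c-addr' u )
  dup allocate throw   \ c-addr u c-addr'   reserve u bytes on the heap
  swap 2dup 2>r        \ c-addr c-addr' u   R: c-addr' u
  move                 \ copy u bytes from c-addr to c-addr'
  2r> ;                \ c-addr' u

: make-string ( -- c-addr u )
  s" foo" ;            \ compiled string: lives in the dictionary

\ append (from libcc.fs, as in the question) now resizes heap memory.
make-string heap-copy s" bar" append type cr
```

Gforth also documents save-mem, which performs a similar copy. Memory obtained this way should eventually be released with free.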

Strings in Forth mostly don't warrant dynamic allocation, and certainly not in your example. You can get by nicely with buffers that you allocate yourself using ALLOT and some very simple words to manipulate them.
[ALLOT uses the data space (ANSI term) incrementally for adding words and buffers. It is not dynamic: you can't release an item without also removing all items ALLOT-ted after it. It is also simple. Do not confuse it with ALLOCATE, which is dynamic and belongs to a separate extension wordset.]
You make a fundamental mistake in leaving out the specification of your append-buffer.
It doesn't work, and we don't know how it is supposed to work!
In ciforth, an example could be:
: astring S" foo" ;
CREATE buffer 100 ALLOT \ space for 100 chars
\ Put the first string in buffer and append the second string.
\ Also print the second string.
: append-print ( s s -- )
2dup type cr 2swap
buffer $!
buffer $+! ;
astring s" bar" append-print
bar OK \ answer
buffer $@ TYPE
foobar OK \ answer
Other Forths have other non-standard words to manipulate simple strings. An excursion through malloc land is really not necessary. In the gforth documentation you can look up 'place' and find an equivalent family of words.
Also nowadays (Forth 2012) you can have strings like so "foo".
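For gforth, the buffer approach above might look like the sketch below. It assumes gforth's place and +place, which store and extend a counted string (a length byte followed by the characters), so the buffer needs room for the count byte and the total length must stay below 256. The name buf is chosen only for illustration.

```forth
create buf 100 allot      \ room for the count byte plus up to 99 characters

s" foo" buf place         \ store "foo" as a counted string in buf
s" bar" buf +place        \ append "bar" to the counted string in buf
buf count type cr         \ COUNT turns the counted string back into c-addr u
```

No heap memory is involved, so there is nothing to free or resize.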

Related

What exactly does the "DOES>" word do?

I was messing around and trying to understand it, so I wrote a simple word to test it:
: test ." compile time" DOES> ." runtime" ;
The problem is, this word doesn't behave in a consistent way at all. Its output seems to vary depending on a number of factors, like:
is this the first line to be interpreted?
are there other words defined after it?
Also, sometimes it doesn't print anything at all.
(Using Gforth)
This is your code:
: test ." compile time" DOES> ." runtime" ;
After entering that, I can use your word without the ambiguous behavior you are encountering:
CREATE def 12345 , test \ prints "compile time"
It prints compile time, because that's the behavior you compiled into test before DOES>.
Note: this is not actually running at compile time.
DOES> ends the compile-time part of test. When test runs, it also modifies the most recently defined word so that this word pushes its data field address onto the stack and then runs the behaviour found after DOES>.
Using the word I created, it has the instantiated behavior you defined, following the implicit behaviour of pushing the address:
def @ . \ prints runtime 12345
Forth 2012 note: Per the definition of DOES> in Forth 2012, this would cause ambiguous behavior if the last word was not defined with CREATE. However, Gforth allows any word definition to be modified.
I hope this example helps explain why, usually, CREATE is used within the definition that uses DOES>, but it is certainly not required.
ruvim's answer may be easier to understand in Gforth.
The code below defines a 'classic' defining word that creates variables initialised to the item on the stack at compile time. I hope the tracing statements in the definitions will help show what is happening
: var \ create: n <name> -- ; does>: -- addr ; initialised VARIABLE
create \ create a dictionary item for <name>.
." HERE at compile time: " HERE .
, \ place n at HERE in the dictionary
does> \ Push the HERE as at compile time to the stack
." Run time address on the stack:" dup .
;
var can now be used to define new words that have the run time action defined after DOES>.
10 var init10 \ Use var to define a new word init10
\ HERE at compile time: 135007328
init10 CR DUP . @ .
\ Run time address on the stack:135007328
\ 135007328 10 \ addr of init10, content of that address.
12 var init12 \ Use var to define a new word init12
\ HERE at compile time: 135007376
init12 CR DUP . @ .
\ Run time address on the stack:135007376
\ 135007376 12
100 init10 ! \ Store 100 in init10
\ Run time address on the stack:135007328
init10 @ .
\ Run time address on the stack:135007328 100
\ init10 now contains 100
Hopefully the answers will provide a framework to explore defining words and the action of DOES>.
Interactive play
In Gforth you can play with does> interpretively.
create foo 123 ,
foo @ . \ prints 123
does> ( addr -- ) @ . ;
foo \ prints 123
does> ( addr -- ) @ 1+ . ;
foo \ prints 124
' foo >body @ . \ prints 123
So does> just changes behavior of the last word, when this last word is defined via create. It's a mistake if you run does> when the last word was not defined via create.
Usage in practice
Usually does> is used to set a new behavior only once for a word defined via create. The ability to alter this behavior several times is just a side effect of the historical implementation, and it is rarely used in practice.
Alternative ways
In practice, the cases where does> is used can also be implemented without does>.
For example, suppose we want to implement a word counter that creates a counter which returns the next value each time it is called, and that is used in the following way:
1 counter x1
x1 . \ prints 1
x1 . \ prints 2
x1 . \ prints 3
An implementation via create does>
: counter ( x0 "ccc" -- ) \ Run-Time: ( -- x )
create , does> ( addr -- x ) dup >r @ dup 1+ r> !
;
An implementation using a quotation
[undefined] lit, [if] : lit, ( x -- ) postpone literal ; [then]
[undefined] xt, [if] : xt, ( xt -- ) compile, ; [then]
: counter ( x0 "ccc" -- ) \ Run-Time: ( -- x )
align here >r , [: ( addr -- x ) dup >r @ dup 1+ r> ! ;] >r
: r> r> lit, xt, postpone ;
;
An implementation using a macro (code inlining) via the word ]]:
: counter ( x0 "ccc" -- ) \ Run-Time: ( -- x )
align here >r ,
: r> lit, ]] dup >r @ dup 1+ r> ! [[ postpone ;
;
A word that defines other words which allocate memory can be equipped with an action through DOES>, which receives the address of that memory block.
A simple example is CONSTANT, which can be defined as
: CONSTANT ( n -- ) CREATE , DOES> @ ;
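As a quick check of this pattern, here is a small sketch that uses a different name, my-constant (chosen here only to avoid redefining the standard word), and then uses it to define a word:

```forth
: my-constant ( n "name" -- )
  create ,              \ build a header for "name" and store n in its data field
  does> ( addr -- n )   \ run-time action of every word defined by my-constant
  @ ;                   \ fetch the stored value from the data field

42 my-constant answer   \ defines ANSWER
answer .                \ prints 42
```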

Perl6 string coercion operator ~ doesn't like leading zeros

I'm toying with Rakudo Star 2015.09.
If I try to stringify an integer with a leading zero, the compiler issues a warning:
> say (~01234).WHAT
Potential difficulties:
Leading 0 does not indicate octal in Perl 6.
Please use 0o123 if you mean that.
at <unknown file>:1
------> say (~0123<HERE>).WHAT
(Str)
I thought maybe I could help the compiler by assigning the integer value to a variable, but obtained the same result:
> my $x = 01234; say (~$x).WHAT
Potential difficulties:
Leading 0 does not indicate octal in Perl 6.
Please use 0o1234 if you mean that.
at <unknown file>:1
------> my $x = 01234<HERE>; say (~$x).WHAT
(Str)
I know this is a silly example, but is this by design? If so, why?
And how can I suppress this kind of warning message?
Is there a reason you have data with leading zeroes? I tend to run into this problem when I have a column of postal codes.
When they were first thinking about Perl 6, one of the goals was to clean up some consistency issues. We had 0x and 0b (I think by that time), but Perl 5 still had to look for the leading 0 to guess it would be octal. See Radix Markers in Synopsis 2.
But Perl 6 also has to care about what Perl 5 programmers are going to try to do and what they expect. Most people are going to expect a leading 0 to mean octal, but it doesn't. The warning is triggered by the literal you typed, not by how you use it. Perl 6 has lots of warnings about things that Perl 5 people would try to use, like foreach:
$ perl6 -e 'foreach @*ARGS -> $arg { say $arg }' 1 2 3
===SORRY!=== Error while compiling -e
Unsupported use of 'foreach'; in Perl 6 please use 'for' at -e:1
------> foreach⏏ @*ARGS -> $arg { say $arg }
To suppress that sort of warning, don't do what it's warning you about. The language doesn't want you to do that. If you need a string, start with a string '01234'. Or, if you want it to be octal, start with 0o. But, realize that stringifying a number will get you back the decimal representation:
$ perl6 -e 'say ~0o1234'
668

How to properly implement POSTPONE in a Forth system?

John Hayes' ANS Forth test suite contains the following definition:
: IFFLOORED [ -3 2 / -2 = INVERT ] LITERAL IF POSTPONE \ THEN ;
This is then used to conditionally define various words depending on whether we're using floored or symmetric division:
IFFLOORED : T/MOD >R S>D R> FM/MOD ;
So IFFLOORED acts like either a noop or a \ depending on the result of the expression. Fine. That's easily implementable on my threaded interpreter by doing this:
: POSTPONE ' , ; IMMEDIATE
...and now IFFLOORED works; the definition is equivalent to : IFFLOORED -1 IF ['] \ EXECUTE THEN ;.
Unfortunately, further down the test suite is the following code:
: GT1 123 ;
: GT4 POSTPONE GT1 ; IMMEDIATE
: GT5 GT4 ;
\ assertion here that the stack is empty
The same implementation doesn't work here. If POSTPONE compiles a reference to its word, then GT4 becomes the equivalent of : GT4 123 ;... but GT4 is immediate. So when GT5 is defined, 123 is pushed onto the compiler's stack and GT5 becomes a noop. But that's not right; the test suite expects calling GT5 to leave 123 on the stack. So for this to work, POSTPONE must generate code which generates code:
: POSTPONE ' LITERAL ['] , LITERAL ;
And, indeed, if I play with gForth, I see that POSTPONE actually works like this:
: GT1 123 ;
: GT4 POSTPONE GT1 ; IMMEDIATE
SEE GT4
<long number> compile, ;
But these two definitions are not compatible. If I use the second definition, the first test fails (because now IFFLOORED tries to compile \ rather than executing it). If I use the first definition, the second test fails (because GT4 pushes onto the compiler stack rather than compiling a literal push).
...but both tests pass in gForth.
So what's going on?
Let me answer here, as the question changed considerably. I am still not sure I understand the question, though :)
In your example, you define
: GT4 POSTPONE GT1 ; IMMEDIATE
What happens here, is the following:
: is executed, reading GT4 and creating the new word
POSTPONE's compilation semantics is executed, which is to compile the compilation semantics of GT1 - as you have seen in GForth.
; is executed, ending the definition
IMMEDIATE is executed, marking the last defined word as immediate.
POSTPONE is called only when compiling GT4, and it does not appear in the compiled code. So when later using this immediate word in the definition of GT5, the interpretation semantics of POSTPONE is not needed.
By the way, according to the standard, POSTPONE has only compilation semantics, and the interpretation semantics is undefined.
See also the POSTPONE tutorial in the GForth manual.
EDIT Examples of interpretation and compilation semantics:
: TEST1 ." interpretation" ; => ok
: TEST2 ." compilation" ; IMMEDIATE => ok
: TEST3 TEST1 TEST2 ; => compilation ok
TEST3 => interpretation ok
: TEST4 POSTPONE TEST1 ; IMMEDIATE => ok
: TEST5 TEST4 ; => ok
TEST5 => interpretation ok
: TEST6 POSTPONE TEST2 ; IMMEDIATE => ok
TEST6 => compilation ok
If you have any more questions, you can reference these tests.
The snippet you've quoted does the following things:
Evaluate -3/2 (at compile time), and check if it is -2.
If it is, store a 0 (false), otherwise store a -1 (true) in IFFLOORED, so when it is evaluated, it will put this value on the stack. (This is the effect of LITERAL.)
When evaluating IFFLOORED, after pushing the value on the stack, comes an IF - THEN expression. When the value is true, it means that we are not in a floored environment, so we want to comment out the rest of the line, and that is what \ does.
So here comes the tricky part - \ is IMMEDIATE, i.e., you cannot use it inside a colon definition, because it will comment out the rest of the line. You have to explicitly tell the compiler that you want to compile this function, and not execute it, which is what POSTPONE does.
The behavior of the word postpone in compilation state is to determine the compilation semantics of its parsed argument and append these semantics to the current definition.
Compilation semantics for a word can be either special or ordinary (see the section 3.4.3.3 Compilation semantics of Forth-2012). To work correctly, postpone should distinguish these cases and generate code according to the different patterns.
The problem with your implementations is that each is correct for only one case: the first for special (immediate) compilation semantics, the second for ordinary compilation semantics.
A standard compliant implementation is as follows:
: state-on ( -- ) 1 state ! ;
: state-off ( -- ) 0 state ! ;
: execute-compiling ( i*x xt -- j*x )
state @ if execute exit then
state-on execute state-off
;
: postpone ( "name" -- )
bl word find dup 0= -13 and throw 1 = ( xt flag-special )
swap lit, if ['] execute-compiling else ['] compile, then compile,
; immediate
See more details in my post How POSTPONE should work.
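To see the two cases side by side, the following sketch uses only standard words; squared, area, unless, and demo are illustrative names. POSTPONE applied to DUP and * (ordinary compilation semantics) makes the immediate word compile them later, while POSTPONE applied to IF (special compilation semantics) makes the immediate word perform IF's compile-time action later.

```forth
\ Ordinary compilation semantics: when SQUARED runs (while AREA is being compiled),
\ it appends DUP and * to AREA.
: squared ( -- ) postpone dup postpone * ; immediate
: area ( n -- n*n ) squared ;
3 area .          \ prints 9

\ Special compilation semantics: when UNLESS runs (while DEMO is being compiled),
\ it appends 0= and then performs the compile-time action of IF.
: unless ( C: -- orig ) postpone 0= postpone if ; immediate
: demo ( flag -- ) unless ." flag was false" then ;
0 demo            \ prints flag was false
1 demo            \ prints nothing
```

A POSTPONE that handles only one of the two cases would get either area or demo wrong, which matches the two failing tests described in the question.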

parsing input file in fortran

This is a continuation of my older thread.
I have a file produced by a different code, which I need to parse and use as my input.
A snippet from it looks like:
GLOBAL SYSTEM PARAMETER
NQ 2
NT 2
NM 2
IREL 3
*************************************
BEXT 0.00000000000000E+00
SEMICORE F
LLOYD F
NE 32 0
IBZINT 2
NKTAB 936
XC-POT VWN
SCF-ALG BROYDEN2
SCF-ITER 29
SCF-MIX 2.00000000000000E-01
SCF-TOL 1.00000000000000E-05
RMSAVV 2.11362995016878E-06
RMSAVB 1.25411205586140E-06
EF 7.27534671479201E-01
VMTZ -7.72451391270293E-01
*************************************
And so on.
Currently I am reading it line by line, as:
Program readpot
use iso_fortran_env
Implicit None
integer ::i,filestat,nq
character(len=120):: rdline
character(10)::key!,dimension(:),allocatable ::key
real,dimension(:),allocatable ::val
i=0
open(12,file="FeRh.pot_new",status="old")
readline:do
i=i+1
read(12,'(A)',iostat=filestat) rdline!(i)
if (filestat /= 0) then
if (filestat == iostat_end ) then
exit readline
else
write ( *, '( / "Error reading file: ", I0 )' ) filestat
stop
endif
end if
if (rdline(1:2)=="NQ") then
read(rdline(19:20),'(i)'),nq
write(*,*)nq
end if
end do readline
End Program readpot
So, I have to read every line, manually find the value column corresponding to the key, and write that. (For brevity, I have shown this for one value only.)
My question is: is this the proper way of doing this, or is there a simpler way? Kindly let me know.
If the file has no variability you scarcely need to parse it at all. Let's suppose that you have declared variables for all the interesting data items in the file and that those variables have the names shown on the lines of the file. For example
INTEGER :: nq , nt, nm, irel
REAL :: bext, scf_mix, scf_tol ! '-' not allowed in Fortran names
CHARACTER(len=48) :: label, text
LOGICAL :: semicore, lloyd
! Complete this as you wish
Then write a block of code like this
OPEN(12,file="FeRh.pot_new",status="old")
READ(12,*) ! Not interested in the 1st line
READ(12,*) label, nq
READ(12,*) label, nt
READ(12,*) label, nm
READ(12,*) label, irel
READ(12,*) ! Not interested in this line
READ(12,*) label, bext
READ(12,*) label, semicore
! Other lines to write
CLOSE(12)
Fortran's list-directed input understands blanks in lines to separate values. It will not read those blanks as part of a character variable. That behaviour can be changed but in your case you don't need to. Note that it will also understand the character F to mean .false. when read into a logical variable.
My code snippet just ignores the labels and lines of explanation. If you are of a nervous disposition you could process them, perhaps
IF (label/='NE') STOP
or whatever you wish.

How to check for EOF/EOL with Stream I/O in Fortran?

I would like to use FORTRAN streaming I/O to make a program that tells me how many lines a text-file has. The idea is to make something like this:
OPEN(UNIT=10,ACCESS='STREAM',FILE='testfile.txt')
nLines=0
bContinue=.TRUE.
DO WHILE (bContinue)
READ(UNIT=10) cCharacter
IF (cCharacter.EQ.{EOL-char}) nLines=nLines+1
IF (cCharacter.EQ.{EOF-char}) bContinue=.FALSE.
ENDDO
(I didn't include the variable declarations, but I think you get the idea of what they are; the only important clarification is that cCharacter has LEN=1.)
My problem is that I don't know how to check if the character I just read from the file is an end-of-line or end-of-file (the "ifs" in the code). When you read and print characters this way, you eventually get newlines in the same place you had them in the original text, so I think it does read and recognize them as "characters", somehow. Perhaps turning the characters into integers and comparing to the appropriate number? Or is there a more direct way?
(I know that you can use the register reading (EDIT: I meant record reading) to do a program that reads lines more easily and add an IOstatus to check for eof, but the "line counter" is just a useful example, the idea is to learn how to move in a more controlled way through a textfile)
Checking for a specific character as line terminator makes your program OS dependent. It would be better to use the facilities of the language so that your program is compiler and OS independent. Since lines are basically records, why do this with stream I/O? That requirement seems to turn an easy job into a hard one. If you can use regular I/O, here is an example program to count the lines in a text file.
EDIT: the code fragment was changed into a program to answer questions in the comments. With "line" as a character variable, when I test the program with gfortran and ifort I don't see a problem when the input file has empty or blank lines.
program test_lc
use, intrinsic :: iso_fortran_env
integer :: LineCount, Read_Code
character (len=200) :: line
open (unit=51, file="temp.txt", status="old", access='sequential', form='formatted', action='read' )
LineCount = 0
ReadLoop: do
read (51, '(A)', iostat=Read_Code) line
if ( Read_Code /= 0 ) then
if ( Read_Code == iostat_end ) then
exit ReadLoop ! end of file --> line count found
else
write ( *, '( / "read error: ", I0 )' ) Read_Code
stop
end if
end if
LineCount = LineCount + 1
write (*, '( I0, ": ''", A, "''" )' ) LineCount, trim (line)
if ( len_trim (line) == 0 ) write (*, '("The above is an empty or all blank line.")' )
end do ReadLoop
write (*, *) "found", LineCount, " lines"
end program test_lc
If you want to do further processing of the file, you can rewind it.
P.S.
The main reason that I have used Fortran Stream IO is to read files produced by other languages, e.g., C
Portable methods are provided to write new-line boundaries; I'm not aware of a portable method to test for such.

Resources