Word similar to "parse-word", but for pushing integers onto the stack - forth

I would like to implement a DSL for setting port numbers on a socket object.
I would like the DSL to follow this API for setting the host port number:
host: 8080
If this were a string operation (such as host: localhost) I could use parse-word. That's less than ideal though, since Forth is very good at parsing numbers, and re-inventing the wheel is a bad thing.
Are there any standard words in Forth that take the first item on the input string, parse it to a number and push it on the stack?

>NUMBER is an ANS word (in CORE) that turns strings into numbers, but it's cumbersome to use. Your Forth probably has a more flexible variant. Your Forth probably also supports syntax like #16 $10 %10000, which all evaluate to 16, regardless of BASE. So one way to do this:
: parse-num ( "number" -- n | d | r ) parse-word evaluate ;
Or with >NUMBER, and only returning a single-cell number:
: parse-num ( "number" -- n )
0. parse-word >number ( d c-addr u )
abort" not a number" drop
abort" double received where single-cell number expected" ;
Which aborts if the returned string isn't the empty string that would result if the entire output of PARSE-WORD was converted into a number, or if the high bits of the double aren't 0, which would be the case if a number not representable by a cell were entered. (NB. >NUMBER doesn't handle double-number syntax either. It will stop parsing 1. at the dot. It doesn't even handle negative numbers.)

Related

MQL4 StringToDouble alters the value of the variable?

MQL4 documentation states that the value limits for double type variables is:
"Minimal Positive Value" = 2.2250738585072014e-308
"Maximum Value" = 1.7976931348623158e+308
See https://docs.mql4.com/basis/types/double
Why does StringToDouble() alter the value converted?
Am I doing one thing while expecting a different result?
void OnStart() {
string s1 = "5554535251504900090807060504030201";
double d1 = StringToDouble(s1);
string s2 = DoubleToString(d1);
Print("s2<",s2,">");
printf("%099.8f",d1);
Print("s1<",s1,">");
return;<br>
}
Here's what I get when I run that code:
s1<5554535251504900090807060504030201>
d1<000000000000000000000000000000000000000000000000000000005554535251504899684469244159852544.00000000>
s2<5554535251504899684469244159852544>
5554535251504900090807060504030201 amounts to5.55454E+33.
Obviously, that doesn't even come remotely close to the 1.7976931348623158e+308 limit.
What am I missing here?
Q : "What am I missing here?"
The documented facts.
MQL4 uses no more than 4-bytes to store int.
MQL4 uses no more than 8-bytes to store double.
IEEE-754 standard defines the rest - how many bits from those 64 are reserved for: exponent ( -308, 0, +308 )
sign ( +, - ) and
the rest, for normalised form of the mantissa : 0.???????...????
Argument, that an actual number is far from either "edge" of < DBL_MIN, DBL_MAX > does explain nothing about the shallow-ness of the exact number reduced-precision representation ( see DBL_EPSILON ~ 2E-16 or DBL_DIG ~ 15-significant digits, or DBL_MANT_DIG ~ 53-bits, left from a 64-bit ( 8-Byte ) storage-cell for mantissa ).
There are many numbers, that simply cannot be stored exactly, using IEEE-754 floating point number representation.
Tons of literature explain this, so feel free to dig deeper, or may use another tools, that rely on infinite-(unlimited)-precision number representation, should your use-case requires that.

New lines in word definition using interpreter directives of Gforth

I am using the interpreter directives (non ANS standard) control structures of Gforth as described in the manual section 5.13.4 Interpreter Directives. I basically want to use the loop words to create a dynamically sized word containing literals. I came up with this definition for example:
: foo
[ 10 ] [FOR]
1
[NEXT]
;
Yet this produces an Address alignment exception after the [FOR] (yes, I know you should not use a for loop in Forth at all. This is just for an easy example).
In the end it turned out that you have to write loops as one-liners in order to ensure their correct execution. So doing
: foo [ 10 [FOR] ] 1 [ [NEXT] ] ;
instead works as intended. Running see foo yields:
: foo
1 1 1 1 1 1 1 1 1 1 1 ; ok
which is exactly what I want.
Is there a way to get new lines in the word definition? The words I would like to write are way more complex, and for a presentation I would need them better formatted.
It would really be best to use an immediate word instead. For example,
: ones ( n -- ) 0 ?do 1 postpone literal loop ; immediate
: foo ( -- ten ones ) [ 10 ] ones ;
With SEE FOO resulting in the same as your example. With POSTPONE, especially with Gforth's ]] .. [[ syntax, the repeated code can be as elaborate as you like.
A multiline [FOR] would need to do four things:
Use REFILL to read in subsequent lines.
Save the read-in lines, because you'll need to evaluate them one by one to preserve line-expecting parsing behavior (such as from comments: \ ).
Stop reading in lines, and loop, when you match the terminating [NEXT].
Take care to leave >IN right after the [NEXT] so that interpretation can continue normally.
You might still run into issues with some code, like code checking SOURCE-ID.
For an example of using REFILL to parse across multiple lines, here's code from a recent posting from CLF, by Gerry:
: line, ( u1 caddr2 u2 -- u3 )
tuck here swap chars dup allot move +
;
: <text>  ( "text" -- caddr u )
here 0
begin
refill
while
bl word count s" </text>" compare
while
0 >in ! source line, bl c, 1+
repeat then
;
This collects everything between <text> and a </text> that's on its own line, as with a HERE document, while also adding spaces. To save the individual lines for [FOR] in an easy way, I'd recommend leaving 0 as a sentinel on the data stack and then drop SAVE-MEM 'd lines on top of it.

string comparison against factors in Stata

Suppose I have a factor variable with labels "a" "b" and "c" and want to see which observations have a label of "b". Stata refuses to parse
gen isb = myfactor == "b"
Sure, there is literally a "type mismatch", since my factor is encoded as an integer and so cannot be compared to the string "b". However, it wouldn't kill Stata to (i) perform the obvious parse or (ii) provide a translator function so I can write the comparison as label(myfactor) == "b". Using decode to (re)create a string variable defeats the purpose of encoding, which is to save space and make computations more efficient, right?
I hadn't really expected the comparison above to work, but I at least figured there would be a one- or two-line approach. Here is what I have found so far. There is a nice macro ("extended") function that maps the other way (from an integer to a label, seen below as local labi: label ...). Here's the solution using it:
// sample data
clear
input str5 mystr int mynum
a 5
b 5
b 6
c 4
end
encode mystr, gen(myfactor)
// first, how many groups are there?
by myfactor, sort: gen ng = _n == 1
replace ng = sum(ng)
scalar ng = ng[_N]
drop ng
// now, which code corresponds to "b"?
forvalues i = 1/`=ng'{
local labi: label myfactor `i'
if "b" == "`labi'" {
scalar bcode = `i'
break
}
}
di bcode
The second step is what irks me, but I'm sure there's a also faster, more idiomatic way of performing the first step. Can I grab the length of the label vector, for example?
An example:
clear all
set more off
sysuse auto
gen isdom = 1 if foreign == "Domestic":`:value label foreign'
list foreign isdom in 1/60
This creates a variable called isdom and it will equal 1 if foreigns's value label is equal to "Domestic". It uses an extended macro function.
From [U] 18.3.8 Macro expressions:
Also, typing
command that makes reference to `:extended macro function'
is equivalent to
local macroname : extended macro function
command that makes reference to `macroname'
This explains one of the two : in the offered syntax. The other can be explained by
... to specify value labels directly in an expression, rather than through
the underlying numeric value ... You specify the label in double quotes
(""), followed by a colon (:), followed by the name of the value
label.
The quote is from Stata tip 14: Using value labels in expressions, by Kenneth Higbee, The Stata Journal (2004). Freely available at http://www.stata-journal.com/sjpdf.html?articlenum=dm0009
Edit
On computing the number of distinct observations, another way is:
by myfactor, sort: gen ng = _n == 1
count if ng
scalar sc_ng = r(N)
display sc_ng
But yours is fine. In fact, it is documented here: http://www.stata.com/support/faqs/data-management/number-of-distinct-observations/, along with more methods and comments.

TCP/IP Client / Server commands data

I have a Client/Server architecture (C# .Net 4.0) that send's command packets of data as byte arrays. There is a variable number of parameters in any command, and each paramater is of variable length. Because of this I use delimiters for the end of a parameter and the command as a whole. The operand is always 2 bytes and both types of delimiter are 1 byte. The last parameter_delmiter is redundant as command_delmiter provides the same functionality.
The command structure is as follow:
FIELD SIZE(BYTES)
operand 2
parameter1 x
parameter_delmiter 1
parameter2 x
parameter_delmiter 1
parameterN x
.............
.............
command_delmiter 1
Parameters are sourced from many different types, ie, ints, strings etc all encoded into byte arrays.
The problem I have is that sometimes parameters when encoded into byte arrays contain bytes that are the same value as a delimiter. For example command_delmiter=255.. and a paramater may have that byte inside of it.
There is 3 ways I can think of fixing this:
1) Encode the parameters differently so that they can never be the same value as a delimiter (255 and 254) Modulus?. This will mean that paramaters will become larger, ie Int16 will be more than 2 bytes etc.
2) Do not use delimiters at all, use count and length values at the start of the command structure.
3) Use something else.
To my knowledge, the way TCP/IP buffers work is that SOME SORT of delimiter has to be used to seperate 'commands' or 'bundles of data' as a buffer may contain multiple commands, or a command may span multiple buffers.. So this
BinaryReader / Writer seems like an obvious candidate, the only issue is that the byte array may contain multiple commands ( with parameters inside). So the byte array would still have to be chopped up in order to feel into the BinaryReader.
Suggestions?
Thanks.
The standard way to do this is to have the length of the message in the (fixed) first few bytes of a message. So you could have the first 4 bytes to denote the length of a message, read those many bytes for the content of the message. The next 4 bytes would be the length of the next message. A length of 0 could indicate end of messages. Or you could use a header with a message count.
Also, remember TCP is a byte stream, so don't expect a complete message to be available every time you read data from a socket. You could receive an arbitrary number of bytes at ever read.

Read numbers following a keyword into an array in Fortran 90 from a text file

I have many text files of this format
....
<snip>
'FOP' 0.19 1 24 1 25 7 8 /
'FOP' 0.18 1 24 1 25 9 11 /
/
TURX
560231
300244
70029
200250
645257
800191
900333
600334
770291
300335
220287
110262 /
SUBTRACT
'TURX' 'TURY'/
</snip>
......
where the portions I snipped off contain other various data in various formats. The file format is inconsistent (machine generated), the only thing one is assured of is the keyword TURX which may appear more than once. If it appears alone on one line, then the next few lines will contain numbers that I need to fetch into an array. The last number will have a space then a forward slash (/). I can then use this array in other operations afterwards.
How do I "search" or parse a file of unknown format in fortran, and how do I get a loop to fetch the rest of the data, please? I am really new to this and I HAVE to use fortran. Thanks.
Fortran 95 / 2003 have a lot of string and file handling features that make this easier.
For example, this code fragment to process a file of unknown length:
use iso_fortran_env
character (len=100) :: line
integer :: ReadCode
ReadLoop: do
read (75, '(A)', iostat=ReadCode ) line
if ( ReadCode /= 0 ) then
if ( ReadCode == iostat_end ) then
exit ReadLoop
else
write ( *, '( / "Error reading file: ", I0 )' ) ReadCode
stop
end if
end if
! code to process the line ....
end do ReadLoop
Then the "process the line" code can contain several sections depending on a logical variable "Have_TURX". If Have_TRUX is false you are "seeking" ... test whether the line contains "TURX". You could use a plain "==" if TURX is always at the start of the string, or for more generality you could use the intrinsic function "index" to test whether the string "line" contains TURX.
Once the program is in the mode Have_TRUX is true, then you use "internal I/O" to read the numeric value from the string. Since the integers have varying lengths and are left-justified, the easiest way is to use "list-directed I/O": combining these:
read (line, *) integer_variable
Then you could use the intrinsic function "index" again to test whether the string also contains a slash, in which case you change Have_TRUX to false and end reading mode.
If you need to put the numbers into an array, it might be necessary to read the file twice, or to backspace the file, because you will have to allocate the array, and you can't do that until you know the size of the array. Or you could pop the numbers into a linked list, then when you hit the slash allocate the array and fill it from the linked list. Or if there is a known maximum number of values you could use a temporary array, then transfer the numbers to an allocatable output array. This is assuming that you want the output argument of the subroutine be an allocatable array of the correct length, and the it returns one group of numbers per call:
integer, dimension (:), allocatable, intent (out) :: numbers
allocate (numbers (1: HowMany) )
P.S. There is a brief summary of the language features at http://en.wikipedia.org/wiki/Fortran_95_language_features and the gfortran manual has a summary of the intrinsic procedures, from which you can see what built in functions are available for string handling.
I'll give you a nudge in the right direction so that you can finish your project.
Some basics:
Do/While as you'll need some sort of loop
structure to loop through the file
and then over the numbers. There's
no for loop in Fortran, so use this
type.
Read
to read the strings.
To start you need something like this:
program readlines
implicit none
character (len=30) :: rdline
integer,dimension(1000) :: array
! This sets up a character array with 30 positions and an integer array with 1000
!
open(18,file='fileread.txt')
do
read(18,*) rdline
if (trim(rdline).eq.'TURX') exit !loop until the trimmed off portion matches TURX
end do
See this thread for way to turn your strings into integers.
Final edit: Looks like MSB has got most of what I just found out. The iostat argument of the read is the key to it. See this site for a sample program.
Here was my final way around it.
PROGRAM fetchnumbers
implicit none
character (len=50) ::line, numdata
logical ::is_numeric
integer ::I,iost,iost2,counter=0,number
integer, parameter :: long = selected_int_kind(10)
integer, dimension(1000)::numbers !Can the number of numbers be up to 1000?
open(20,file='inputfile.txt') !assuming file is in the same location as program
ReadLoop: do
read(20,*,iostat=iost) line !read data line by line
if (iost .LT. 0) exit !end of file reached before TURX was found
if (len_trim(line)==0) cycle ReadLoop !ignore empty lines
if (index(line, 'TURX').EQ.1) then !prepare to begin capturing
GetNumbers: do
read(20, *,iostat=iost2)numdata !read in the numbers one by one
if (.NOT.is_numeric(numdata)) exit !no more numbers to read
if (iost2 .LT. 0) exit !end of file reached while fetching numbers
read (numdata,*) number !read string value into a number
counter = counter + 1
Storeloop: do I =1,counter
if (I<counter) cycle StoreLoop
numbers(counter)=number !storing data into array
end do StoreLoop
end do GetNumbers
end if
end do ReadLoop
write(*,*) "Numbers are:"
do I=1,counter
write(*,'(I14)') numbers(I)
end do
END PROGRAM fetchnumbers
FUNCTION is_numeric(string)
IMPLICIT NONE
CHARACTER(len=*), INTENT(IN) :: string
LOGICAL :: is_numeric
REAL :: x
INTEGER :: e
is_numeric = .FALSE.
READ(string,*,IOSTAT=e) x
IF (e == 0) is_numeric = .TRUE.
END FUNCTION is_numeric

Resources