Read numbers following a keyword into an array in Fortran 90 from a text file - parsing

I have many text files of this format
....
<snip>
'FOP' 0.19 1 24 1 25 7 8 /
'FOP' 0.18 1 24 1 25 9 11 /
/
TURX
560231
300244
70029
200250
645257
800191
900333
600334
770291
300335
220287
110262 /
SUBTRACT
'TURX' 'TURY'/
</snip>
......
where the portions I snipped contain various other data in various formats. The file format is inconsistent (machine generated); the only thing one can rely on is the keyword TURX, which may appear more than once. If it appears alone on a line, then the next few lines contain numbers that I need to fetch into an array. The last number is followed by a space and then a forward slash (/). I can then use this array in other operations afterwards.
How do I "search" or parse a file of unknown format in Fortran, and how do I get a loop to fetch the rest of the data, please? I am really new to this and I HAVE to use Fortran. Thanks.

Fortran 95 / 2003 have a lot of string and file handling features that make this easier.
For example, this code fragment to process a file of unknown length:
use iso_fortran_env
character (len=100) :: line
integer :: ReadCode
ReadLoop: do
read (75, '(A)', iostat=ReadCode ) line
if ( ReadCode /= 0 ) then
if ( ReadCode == iostat_end ) then
exit ReadLoop
else
write ( *, '( / "Error reading file: ", I0 )' ) ReadCode
stop
end if
end if
! code to process the line ....
end do ReadLoop
Then the "process the line" code can contain several sections depending on a logical variable "Have_TURX". If Have_TURX is false you are "seeking": test whether the line contains "TURX". You could use a plain "==" if TURX is always alone on the line and starts in the first column, or for more generality you could use the intrinsic function "index" to test whether the string "line" contains TURX.
Once the program is in the mode where Have_TURX is true, you use "internal I/O" to read the numeric value from the string. Since the integers have varying lengths and are left-justified, the easiest way is to use "list-directed I/O". Combining these:
read (line, *) integer_variable
Then you could use the intrinsic function "index" again to test whether the string also contains a slash, in which case you change Have_TURX to false and end reading mode.
If you need to put the numbers into an array, it might be necessary to read the file twice, or to backspace the file, because you have to allocate the array and you can't do that until you know its size. Or you could put the numbers into a linked list, then when you hit the slash allocate the array and fill it from the linked list. Or, if there is a known maximum number of values, you could use a temporary array and then transfer the numbers to an allocatable output array. This assumes that you want the output argument of the subroutine to be an allocatable array of the correct length, and that it returns one group of numbers per call:
integer, dimension (:), allocatable, intent (out) :: numbers
allocate (numbers (1: HowMany) )
P.S. There is a brief summary of the language features at http://en.wikipedia.org/wiki/Fortran_95_language_features and the gfortran manual has a summary of the intrinsic procedures, from which you can see what built in functions are available for string handling.
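The seek-then-read logic described in this answer can be sketched compactly. The sketch below is in Python rather than Fortran, purely to show the two modes and the slash test in a few lines; the function name extract_turx_blocks and the sample lines are made up for illustration, and it assumes every non-empty line inside a block holds exactly one integer (anything else raises ValueError).

```python
def extract_turx_blocks(lines, keyword="TURX"):
    """Return one list of integers per keyword block found in `lines`."""
    blocks = []
    current = None  # None means "seeking"; a list means "reading numbers"
    for raw in lines:
        line = raw.strip()
        if current is None:
            # Seeking mode: only a line consisting of the keyword alone starts a block.
            if line == keyword:
                current = []
            continue
        # Reading mode: convert the text that precedes any slash.
        head, slash, _ = line.partition("/")
        if head.strip():
            current.append(int(head))
        if slash:
            # The slash ends the block; go back to seeking.
            blocks.append(current)
            current = None
    return blocks


sample = [
    "'FOP' 0.18 1 24 1 25 9 11 /",
    "/",
    "TURX",
    "560231",
    "300244",
    "110262 /",
    "SUBTRACT",
    "'TURX' 'TURY'/",
]
print(extract_turx_blocks(sample))  # [[560231, 300244, 110262]]
```

Collecting into a growable list sidesteps the "size not known in advance" problem; in Fortran the same role is played by the temporary array or linked list mentioned above.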

I'll give you a nudge in the right direction so that you can finish your project.
Some basics:
Do/While, as you'll need some sort of loop structure to loop through the file and then over the numbers. There's no for loop in Fortran, so use this type.
Read, to read the strings.
To start you need something like this:
program readlines
implicit none
character (len=30) :: rdline
integer, dimension(1000) :: array
integer :: ios
! This sets up a character string of length 30 and an integer array with 1000 elements
!
open(18, file='fileread.txt')
do
  read(18, '(A)', iostat=ios) rdline          ! read the whole line as text
  if (ios /= 0) exit                          ! stop at end of file or on a read error
  if (trim(adjustl(rdline)) == 'TURX') exit   ! loop until the line is exactly TURX
end do
See this thread for a way to turn your strings into integers.
Final edit: Looks like MSB has got most of what I just found out. The iostat argument of the read is the key to it. See this site for a sample program.

Here was my final way around it.
PROGRAM fetchnumbers
implicit none
character (len=50) :: line, numdata
logical :: is_numeric
integer :: I, iost, iost2, slashpos, number
integer :: counter = 0
integer, dimension(1000) :: numbers !assumes at most 1000 numbers in total
open(20,file='inputfile.txt') !assuming file is in the same location as program
ReadLoop: do
  read(20,'(A)',iostat=iost) line !read data line by line, as text
  if (iost /= 0) exit ReadLoop !end of file (or read error) before another TURX was found
  if (len_trim(line)==0) cycle ReadLoop !ignore empty lines
  if (adjustl(line) == 'TURX') then !keyword alone on its line: begin capturing
    GetNumbers: do
      read(20,'(A)',iostat=iost2) numdata !read the next line as text
      if (iost2 /= 0) exit ReadLoop !end of file reached while fetching numbers
      slashpos = index(numdata, '/') !a slash marks the last number of the block
      if (slashpos > 0) numdata(slashpos:) = ' ' !blank out the slash and anything after it
      if (len_trim(numdata) > 0) then
        if (.NOT.is_numeric(numdata)) exit GetNumbers !not a number: stop capturing
        if (counter == size(numbers)) stop 'numbers array is full'
        read (numdata,*) number !read string value into a number
        counter = counter + 1
        numbers(counter) = number !storing data into array
      end if
      if (slashpos > 0) exit GetNumbers !block finished
    end do GetNumbers
  end if
end do ReadLoop
write(*,*) "Numbers are:"
do I=1,counter
write(*,'(I14)') numbers(I)
end do
END PROGRAM fetchnumbers
FUNCTION is_numeric(string)
IMPLICIT NONE
CHARACTER(len=*), INTENT(IN) :: string
LOGICAL :: is_numeric
REAL :: x
INTEGER :: e
is_numeric = .FALSE.
READ(string,*,IOSTAT=e) x
IF (e == 0) is_numeric = .TRUE.
END FUNCTION is_numeric

Related

Lua string manipulation

I have a string such as '321#322#323#324#325'.
Here each number has 3 digits, but the length is not limited to 3; it could be anything.
There are 5 numbers in this string, but that count could also be anything.
The task is to get 321, 322, 323, 324, 325 and store them in a table so that any operation can be performed on them.
I have tried several string functions, like c = c:gsub('%W','') to eliminate the non-alphanumeric characters, but nothing helped.
function encrypter()--FUNCTION 14
c=' '
print('Please enter your message!')
local message=io.read()
lengthOfString=string.len(message)--Built-in function to get the length of a string.
newLine()
print('Please enter your key. Remember this key, otherwise your message wont be decrypted')
newLine()
key=io.read()
key=tonumber(key)
largeSpace()
print("Encrypted message is")
for s=1,lengthOfString do
--print(encryptionFormula(string.byte(message,s),key))
--string.byte is a built-in function that converts a character of a string to its ASCII value. The first argument is the string (or a variable holding it); the second is the position of the character in that string.
b=encryptionFormula(string.byte(message,s),key)
c=c..tostring(b)..'#'
--print(c)
b=0
end
print(c)
largeSpace()
print("Now share this message along with the key to the receiver. Don't share your key with anyone, if you don't want your message to be read.")
end
What you're looking for is string.gmatch().
local input = "123#546514#13#548#2315"
local numbers = {}
for number in string.gmatch(input, '%d+') do
  -- gmatch yields strings; tonumber makes them usable in arithmetic
  table.insert(numbers, tonumber(number))
end
-- Output the numbers
for index, number in ipairs(numbers) do
  print(index, number)
end
-- This prints:
-- 1 123
-- 2 546514
-- 3 13
-- 4 548
-- 5 2315
If you don't know how Lua patterns work, you can read about them in the reference manual, or you can have a look at Programming in Lua (the first edition is available for free on its website).

Parse array of unsigned integers in Julia 1.x.x

I am trying to open a binary file whose internal structure I partly know, and reinterpret it correctly in Julia. Let us say that I can already load it via:
arx=open("../axonbinaryfile.abf", "r")
databin=read(arx)
close(arx)
The data is loaded as an Array of UInt8, which I guess are bytes.
On the first 4 bytes I can perform a simple Char conversion and it works:
head=databin[1:4]
map(Char, head)
4-element Array{Char,1}:
'A'
'B'
'F'
' '
Then it happens that positions 13-16 hold a 32-bit integer waiting to be interpreted. How should I do that?
I have tried reinterpret() and using Int32 as a function, but to no avail.
You can use reinterpret(Int32, databin[13:16])[1]. The trailing [1] is needed because reinterpret returns an array (a reinterpreted view of the four bytes), not a scalar.
Now note that read supports type passing. So if you first read 12 bytes of data from your file, e.g. with read(arx, 12), and then run read(arx, Int32), you will get the desired number without having to do any conversions or vector allocation.
Finally observe that what conversion to Char does in your code is converting a Unicode number to a character. I am not sure if this is exactly what you want (maybe it is). For example if the first byte read in has value 200 you will get:
julia> Char(200)
'È': Unicode U+00c8 (category Lu: Letter, uppercase)
EDIT: one more comment. When you convert 4 bytes to Int32, be sure to check whether the value is stored as big-endian or little-endian (see the ENDIAN_BOM constant and the ntoh, hton, ltoh, htol functions).
Here it is. Use view to avoid copying the data.
julia> dat = UInt8[65,66,67,68,0,0,2,40];
julia> Char.(view(dat,1:4))
4-element Array{Char,1}:
'A'
'B'
'C'
'D'
julia> reinterpret(Int32, view(dat,5:8))
1-element reinterpret(Int32, view(::Array{UInt8,1}, 5:8)):
671219712
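The value 671219712 shown above is what the four bytes 0, 0, 2, 40 give when read in little-endian order, which is why the endianness remark matters. As a cross-check outside Julia, here is a small Python sketch using the standard struct module on the same eight bytes; the variable names are made up for illustration.

```python
import struct

# Same bytes as the Julia example: "ABCD" followed by 0, 0, 2, 40.
data = bytes([65, 66, 67, 68, 0, 0, 2, 40])

# Offset 4 is zero-based here; it corresponds to Julia's 1-based positions 5:8.
little = struct.unpack_from("<i", data, 4)[0]  # little-endian signed 32-bit
big = struct.unpack_from(">i", data, 4)[0]     # big-endian signed 32-bit

print(little)  # 671219712
print(big)     # 552
```

If the file format stores the field big-endian, the intended value would be 552 rather than 671219712, so it is worth confirming the byte order against the format's documentation.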

How to refactor string containing variable names into booleans?

I have an SPSS variable containing lines like:
|2|3|4|5|6|7|8|10|11|12|13|14|15|16|18|20|21|22|23|24|25|26|27|28|29|
Every line starts with pipe, and ends with one. I need to refactor it into boolean variables as the following:
var var1 var2 var3 var4 var5
|2|4|5| 0 1 0 1 1
I have tried to do it with a loop like:
loop # = 1 to 72.
compute var# = SUBSTR(var,2#,1).
end loop.
exe.
My code won't work with numbers of 2 or more digits, and it also won't place the values into their respective variables. So I tried nesting char.substr(var,char.rindex(var,'|') + 1) into another loop, with no luck, because it still doesn't let me recognize the variable number.
How can I do it?
This looks like a nice job for the DO REPEAT command. However the type conversion is somewhat tricky:
DO REPEAT var#i=var1 TO var72
/i=1 TO 72.
COMPUTE var#i = (CHAR.INDEX(var,CONCAT("|",LTRIM(STRING(i,F2.0)),"|"))>0).
END REPEAT.
Explanation: Let's go from the inside to the outside:
STRING(value,F2.0) converts the numeric value into a string of two characters (with a leading white space where the number consists of just one digit), e.g. 2 -> " 2".
LTRIM() removes the leading whitespaces, e.g. " 2" -> "2".
CONCAT() concatenates strings. In the above code it adds the "|" before and after the number, e.g. "2" -> "|2|"
CHAR.INDEX(stringvar,searchstring) returns the position at which the searchstring was found. It returns 0 if the searchstring wasn't found.
CHAR.INDEX(stringvar,searchstring)>0 returns a boolean value indicating if the searchstring was found or not.
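The reason for wrapping the number in pipes before searching is that a bare search for "2" would also match inside "12" or "21". A minimal Python sketch of the same idea follows; the function name flags_from_pipes is invented for illustration and is not part of SPSS.

```python
def flags_from_pipes(text, count):
    """Return a 0/1 list whose element i-1 tells whether "|i|" occurs in text."""
    return [1 if f"|{i}|" in text else 0 for i in range(1, count + 1)]


print(flags_from_pipes("|2|4|5|", 5))  # [0, 1, 0, 1, 1]
print(flags_from_pipes("|12|", 3))     # [0, 0, 0]
```

The first call reproduces the example row from the question; the second shows that "|12|" sets neither flag 1 nor flag 2, which is the case the delimiters protect against.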
It's easier to do the manipulations in Python than native SPSS syntax.
You can use SPSSINC TRANS extension for this purpose.
/* Example data*/.
data list free / TextStr (a99).
begin data.
"|2|3|4|5|6|7|8|10|11|12|13|14|15|16|18|20|21|22|23|24|25|26|27|28|29|"
end data.
/* defining function to achieve task */.
begin program.
def runTask(x):
numbers=map(int,filter(None,[i.strip() for i in x.lstrip('|').split("|")]))
answer=[1 if i in numbers else 0 for i in xrange(1,max(numbers)+1)]
return answer
end program.
/* Run job*/.
spssinc trans result = V1 to V30 type=0 /formula "runTask(TextStr)".
exe.

PostScript execution of nested procedures

(I'm back with yet another question :-) )
Given the following PostScript code:
/riverside { 5 pop } def
/star { 6 pop 2 {riverside} repeat } def
star
I'm wondering how nested procedures should be handled. (I'm creating my own interpreter).
When I execute the star procedure, halfway through it finds a name object (riverside), replaces it with an executable array containing the values from the riverside procedure, and executes them.
If I execute the repeat operator the interpreter crashes because there is only one item left on the stack.
Should I actually execute an executable array (=procedure) directly when I'm already in an executable array (=procedure), or should the executable arrays (=procedures) always be pushed on the (operand?/execution?) stack? Or only be executed by another operator?
How many times should this riverside be executed? (2 or 3 times?) I guess 2?
For your information: this is the situation that I have when I execute star on the 3rd line (see the ERROR):
% begin execute 3rd line (star)
% OP = operand stack
% EX = execution stack
% handle 6
OP: 6
EX: star
% handle pop (removes 6 from OP)
OP: -
EX: star
% handle 2
OP: 2
EX: star
% set the riverside executable array on the EX, execute the values
OP: 2
EX: star riverside
% repeat operator:
CRASH, only one item on the OP left, but repeat operator requires 2 operands.
OP: 5
EX:
% end
Please shine a light on this matter, because it is somewhat complex/confusing :-)
Update:
another code sample might be this one:
/starside
{ 72 0 lineto
currentpoint translate
-144 rotate } def
/star
{ moveto
currentpoint translate
4 {starside} repeat
closepath
gsave
.5 setgray fill
grestore
stroke } def
200 200 star
showpage
when the interpreter tokenizes /star { moveto ... if it encounters the nested {starside} how will that be treated? (+ what if there was {starside 5 2 mul pop} instead of only {starside} ?)
I believe you need to look at section 3.5.3 of the PLRM. Although this deals with a simple executable array, the concept is the same. When the token scanner encounters a '{' it starts to build an executable array. Until it reaches a matching '}' token the scanner simply stores what it encounters on the operand stack. When it encounters the matching '}' the objects are converted into an executable array (which is stored on the operand stack).
In the case of the scanner encountering an executable name, it stores the name on the operand stack. It does not execute the name, nor does it even perform lookup on it to retrieve the associated object.
So immediately before the execution of '}' in your example, the operand stack would contain two objects: the '{' opening array, and the executable name riverside. When you encounter the '}' the scanner creates the actual executable array and stores it on the operand stack. (Note, implementation details vary here.)
So immediately before the execution of 'repeat' you would have two objects on the stack, the counter and an executable array containing a single executable name.
You don't look up the name until the executable array containing the name is executed.
This might make it clearer:
%!
/test {(This is my initial string\n) print} def
2 {test} repeat
2 {test} /test {(This is my second string\n) print} def repeat
Notice that I've redefined 'test' after creating the executable array containing the executable name 'test', yet the execution uses the later definition of test. As you can see, it's vitally important not to do name lookup too early!
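For someone writing an interpreter, the same late-lookup rule can be modelled in a few lines. The following Python sketch is a toy model only, not a PostScript implementation: the Name class, the run and repeat helpers, and the dictionary standing in for the dictionary stack are all assumptions made for illustration.

```python
class Name:
    """An executable name stored inside a procedure, not yet looked up."""
    def __init__(self, key):
        self.key = key


def run(proc, names, out):
    """Execute a procedure: a list of strings to emit or Name objects to call."""
    for item in proc:
        if isinstance(item, Name):
            # Lookup happens here, at execution time, not when proc was built.
            run(names[item.key], names, out)
        else:
            out.append(item)


def repeat(count, proc, names, out):
    """Model of the repeat operator: it receives the procedure as an operand."""
    for _ in range(count):
        run(proc, names, out)


names = {"test": ["initial"]}
out = []

repeat(2, [Name("test")], names, out)  # like: 2 {test} repeat

proc = [Name("test")]                  # {test} is built first ...
names["test"] = ["second"]             # ... then test is redefined ...
repeat(2, proc, names, out)            # ... and only now is the name looked up

print(out)  # ['initial', 'initial', 'second', 'second']
```

The point to carry over is that the procedure stores the name itself; nothing is substituted for it when the enclosing procedure is scanned or when the outer procedure starts running.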

How do I format a PRINT or WRITE statement to overwrite the current line on the console screen?

I want to display the progress of a calculation done with a DO-loop, on the console screen. I can print out the progress variable to the terminal like this:
PROGRAM TextOverWrite_WithLoop
IMPLICIT NONE
INTEGER :: Number, Maximum = 10
DO Number = 1, MAXIMUM
WRITE(*, 100, ADVANCE='NO') REAL(Number)/REAL(Maximum)*100
100 FORMAT(TL10, F10.2)
! Calculations on Number
END DO
END PROGRAM TextOverWrite_WithLoop
The output of the above code on the console screen is:
10.00 20.00 30.00 40.00 50.00 60.00 70.00 80.00
90.00 100.00
All on the same line, wrapped only by the console window.
The ADVANCE='NO' argument and the TL10 (tab left so many spaces) edit descriptor work well to overwrite text within a single WRITE, e.g. the output of the following code:
WRITE(*, 100, ADVANCE='NO') 100, 500
100 FORMAT(I3, 1X, TL4, I3)
Is:
500
Instead of:
100 500
Because of the TL4 edit descriptor.
From these two instances one can conclude that the WRITE statement cannot overwrite what has been written by another WRITE statement or by a previous execution of the same WRITE statement (as in a DO-loop).
Can this be overcome somehow?
I am using the FTN95 compiler on Windows 7 RC1. (The setup program of the G95 compiler bluescreens Windows 7 RC1, even though it works fine on Vista.)
I know about the question Supressing line breaks in Fortran 95 write statements, but it does not work for me, because the answer to that question means new output is added to the previous output on the same line, instead of new output overwriting the previous output.
Thanks in advance.
The following should be portable across systems by use of ACHAR(13) to encode the carriage return.
character*1 creturn
! CODE::
creturn = achar(13) ! generate carriage return
! other code ...
WRITE( * , 101 , ADVANCE='NO' ) creturn , i , npoint
101 FORMAT( a , 'Point number : ',i7,' out of a total of ',i7)
There is no solution to this question within the scope of the Fortran standards. However, if your compiler understands backslash escapes in Fortran strings (GNU Fortran does if you use the option -fbackslash), you can write
write (*,"(A)",advance="no") "foo"
call sleep(1)
write (*,"(A)",advance="no") "\b\b\bbar"
call sleep(1)
write (*,"(A)",advance="no") "\b\b\bgee"
call sleep(1)
write (*,*)
end
This uses the backspace character (\b) to move back over previously written characters on that line so that they are overwritten.
NB: if your compiler does not understand advance="no", you can use related non-standard tricks, such as using the $ specifier in the format string.
The following worked perfectly using g95 fortran:
NF = NF + 1
IF(MOD(NF,5).EQ.0) WRITE(6,42,ADVANCE='NO') NF, ' PDFs'//CHAR(13)
42 FORMAT(I6,A)
gave:
5 PDFs
leaving the cursor at the #1 position on the same line. On the next update,
the 5 turned into a 10. ASCII 13 (decimal) is a carriage return.
OPEN(6,CARRIAGECONTROL ='FORTRAN')
DO I=1,5
WRITE(6,'(1H+" ",I)') I
ENDDO
