How to check for EOF/EOL with Stream I/O in Fortran?

I would like to use FORTRAN streaming I/O to make a program that tells me how many lines a text-file has. The idea is to make something like this:
OPEN(UNIT=10,ACCESS='STREAM',FILE='testfile.txt')
nLines=0
bContinue=.TRUE.
DO WHILE (bContinue)
READ(UNIT=10) cCharacter
IF (cCharacter.EQ.{EOL-char}) nLines=nLines+1
IF (cCharacter.EQ.{EOF-char}) bContinue=.FALSE.
ENDDO
(I didn't include the variable declarations, but I think you get the idea of what they are; the only important clarification is that cCharacter has LEN=1.)
My problem is that I don't know how to check if the character I just read from the file is an end-of-line or end-of-file (the "ifs" in the code). When you read and print characters this way, you eventually get newlines in the same place you had them in the original text, so I think it does read and recognize them as "characters", somehow. Perhaps turning the characters into integers and comparing to the appropriate number? Or is there a more direct way?
(I know that you can use the register reading (EDIT: I meant record reading) to do a program that reads lines more easily and add an IOstatus to check for eof, but the "line counter" is just a useful example, the idea is to learn how to move in a more controlled way through a textfile)

Checking for a specific character as the line terminator makes your program OS dependent. It would be better to use the facilities of the language so that your program is compiler and OS independent. Since lines are basically records, why do this with stream I/O? That requirement seems to turn an easy job into a hard one. If you can use regular I/O, here is an example program to count the lines in a text file.
EDIT: the code fragment was changed into a program to answer questions in the comments. With "line" as a character variable, when I test the program with gfortran and ifort I don't see a problem when the input file has empty or blank lines.
program test_lc
use, intrinsic :: iso_fortran_env
integer :: LineCount, Read_Code
character (len=200) :: line
open (unit=51, file="temp.txt", status="old", access='sequential', form='formatted', action='read' )
LineCount = 0
ReadLoop: do
read (51, '(A)', iostat=Read_Code) line
if ( Read_Code /= 0 ) then
if ( Read_Code == iostat_end ) then
exit ReadLoop ! end of file --> line count found
else
write ( *, '( / "read error: ", I0 )' ) Read_Code
stop
end if
end if
LineCount = LineCount + 1
write (*, '( I0, ": ''", A, "''" )' ) LineCount, trim (line)
if ( len_trim (line) == 0 ) write (*, '("The above is an empty or all blank line.")' )
end do ReadLoop
write (*, *) "found", LineCount, " lines"
end program test_lc
If you want to do further processing of the file, you can rewind it.
P.S.
The main reason that I have used Fortran stream I/O is to read files produced by other languages, e.g., C.
Portable methods are provided to write new-line boundaries; I'm not aware of a portable method to test for such.
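As an illustration of the character-by-character idea from the question, here is a minimal sketch in Python rather than Fortran. It assumes the file is read as raw bytes and that every line ends with a line feed (byte value 10), which also covers CR LF files because each line still ends in a line feed. End of file is not a character at all: it shows up as a read that returns nothing, much as a Fortran READ would presumably report it through IOSTAT rather than through the data. The function name count_lines is made up for this example.

```python
import io

def count_lines(stream):
    """Count line-feed bytes by reading one byte at a time until EOF."""
    n_lines = 0
    while True:
        byte = stream.read(1)      # returns b"" once the end of file is reached
        if byte == b"":
            break                  # EOF is a read status, not a character
        if byte == b"\n":          # line feed, byte value 10
            n_lines += 1
    return n_lines

if __name__ == "__main__":
    # An in-memory stream stands in for open("testfile.txt", "rb").
    print(count_lines(io.BytesIO(b"first\n\nthird\n")))
```

Note that a last line without a trailing line feed is not counted by this sketch.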


What has happened to 'tick' in ANS Forth?

As I remembered 'tick' from FIG-Forth, it could be used without abortion when a word wasn't in the wordlist:
' the_word
gave a reference to the word if it was in the word-list and gave 'false' otherwise.
Is it possible to construct something like that in ANS Forth to be used with [if], [then] and [else]?
I guess something like this:
: tick ( "name" -- xt|0 ) bl word find 0= if drop 0 then ;
The FIG-Forth document says:
Leaves the parameter field address of dictionary word nnnn. As a
compiler directive, executes in a colon-definition to compile the
address as a literal. If the word is not found after a search of
CONTEXT and CURRENT, an appropriate error message is given.
It is entirely possible, though, that the version of FIG-Forth you were using did not abide by the standard and returned false.

Read a single number from a text file and advance stream position in Julia

I understand that Julia has a complete set of low-level tools for interfacing with binary files on the one hand, and some powerful utilities such as readdlm to load text files containing rectangular data into Array structures on the other.
What I cannot discover in the standard library docs, however, is how to easily get input from less structured text files. In particular, what would be the Julia equivalent of the C++ idiom
some_input_stream >> a_variable_int_perhaps;
Given this is such a common usage scenario I am surprised something like this does not feature prominently in the standard library...
You can use readuntil http://docs.julialang.org/en/latest/stdlib/io-network/#Base.readuntil
shell> cat test.txt
1 2 3 4
julia> i,j = open("test.txt") do f
parse(Int, readuntil(f," ")), parse(Int, readuntil(f," "))
end
(1,2)
EDIT: To address comments
To get the last integer in an irregularly formatted ASCII file you could use split if you know the character preceding the integer (I've used a blank space here)
shell> cat test.txt
1.0, two five:$#!() + 4
last line 3
julia> i = open("test.txt") do f
parse(Int, split(readline(f), " ")[end])
end
4
As far as code length is concerned, the above examples are completely self contained and the file is opened and closed in an exception safe manner (i.e. wrapped in a try-finally block). To do the same in C++ would be quite verbose.
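For comparison, the "read one token and advance the stream position" behaviour of the C++ idiom can be sketched in a few lines. The example below is in Python, not Julia, and is only an illustration under stated assumptions: tokens are separated by arbitrary whitespace, and the helper names read_token and read_int are invented here and are not part of any standard library.

```python
import io

def read_token(stream):
    """Skip leading whitespace, then read characters up to the next
    whitespace character or end of file. Returns "" at end of file."""
    token = ""
    while True:
        ch = stream.read(1)
        if ch == "":               # end of file
            break
        if ch.isspace():
            if token:              # whitespace after a token ends it
                break
            continue               # whitespace before a token is skipped
        token += ch
    return token

def read_int(stream):
    """Read the next token and convert it; raises ValueError if it is not an integer."""
    return int(read_token(stream))

if __name__ == "__main__":
    stream = io.StringIO("1 2\n 3 4")   # stands in for an open text file
    print(read_int(stream), read_int(stream))
```

Each call consumes the token plus the single whitespace character that terminated it, so the next call continues where the previous one stopped.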

parsing input file in fortran

This is a continuation of my older thread.
I have a file produced by a different code that I need to parse and use as my input.
A snippet from it looks like:
GLOBAL SYSTEM PARAMETER
NQ 2
NT 2
NM 2
IREL 3
*************************************
BEXT 0.00000000000000E+00
SEMICORE F
LLOYD F
NE 32 0
IBZINT 2
NKTAB 936
XC-POT VWN
SCF-ALG BROYDEN2
SCF-ITER 29
SCF-MIX 2.00000000000000E-01
SCF-TOL 1.00000000000000E-05
RMSAVV 2.11362995016878E-06
RMSAVB 1.25411205586140E-06
EF 7.27534671479201E-01
VMTZ -7.72451391270293E-01
*************************************
And so on.
Currently I am reading it line by line, as:
Program readpot
use iso_fortran_env
Implicit None
integer ::i,filestat,nq
character(len=120):: rdline
character(10)::key!,dimension(:),allocatable ::key
real,dimension(:),allocatable ::val
i=0
open(12,file="FeRh.pot_new",status="old")
readline:do
i=i+1
read(12,'(A)',iostat=filestat) rdline!(i)
if (filestat /= 0) then
if (filestat == iostat_end ) then
exit readline
else
write ( *, '( / "Error reading file: ", I0 )' ) filestat
stop
endif
end if
if (rdline(1:2)=="NQ") then
read(rdline(19:20),'(i2)') nq
write(*,*)nq
end if
end do readline
End Program readpot
So I have to read every line, manually find the value column corresponding to the key, and write it out. (For brevity, I have shown this for one value only.)
My question is: is this the proper way of doing this, or is there a simpler way? Kindly let me know.
If the file has no variability you scarcely need to parse it at all. Let's suppose that you have declared variables for all the interesting data items in the file and that those variables have the names shown on the lines of the file. For example
INTEGER :: nq , nt, nm, irel
REAL :: bext, scf_mix, scf_tol ! '-' not allowed in Fortran names
CHARACTER(len=48) :: label, text
LOGICAL :: semicore, lloyd
! Complete this as you wish
Then write a block of code like this
OPEN(12,file="FeRh.pot_new",status="old")
READ(12,*) ! Not interested in the 1st line
READ(12,*) label, nq
READ(12,*) label, nt
READ(12,*) label, nm
READ(12,*) label, irel
READ(12,*) ! Not interested in this line
READ(12,*) label, bext
READ(12,*) label, semicore
! Other lines to write
CLOSE(12)
Fortran's list-directed input understands blanks in lines to separate values. It will not read those blanks as part of a character variable. That behaviour can be changed but in your case you don't need to. Note that it will also understand the character F to mean .false. when read into a logical variable.
My code snippet just ignores the labels and lines of explanation. If you are of a nervous disposition you could process them, perhaps
IF (label/='NE') STOP
or whatever you wish.
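If the order of the lines cannot be relied on, the alternative is to look values up by their label instead of by position. The following is a rough sketch of that idea in Python rather than Fortran. It assumes each data line is a key followed by one or more whitespace-separated values and that separator lines start with an asterisk; the names convert and parse_parameters are made up for this illustration.

```python
def convert(text):
    """Best-effort conversion: int, then float, then Fortran-style T/F, else str."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    if text in ("T", "F"):
        return text == "T"
    return text

def parse_parameters(lines):
    """Build a dict from 'KEY value [value ...]' lines.
    Separator lines (starting with '*') and lines without a value are skipped."""
    params = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("*"):
            continue
        values = [convert(field) for field in fields[1:]]
        params[fields[0]] = values[0] if len(values) == 1 else values
    return params

if __name__ == "__main__":
    sample = ["NQ 2", "*********", "BEXT 0.00000000000000E+00",
              "SEMICORE F", "NE 32 0", "XC-POT VWN"]
    print(parse_parameters(sample))
```

A multi-word header such as the first line of the file ends up as an ordinary entry keyed by its first word, so a caller would simply ignore keys it does not know.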

Lua source code manipulation: get innermost function() location for a given line

I've got a file with syntactically correct Lua 5.1 source code.
I've got a position (line and character offset) inside that file.
I need to get an offset in bytes to the closing parenthesis of the innermost function() body that contains that position (or figure out that the position belongs to the main chunk of the file).
I.e.:
local function foo()
^ result
print("bar")
^ input
end
local foo = function()
^ result
print("bar")
^ input
end
local foo = function()
return function()
^ result
print("bar")
^ input
end
end
...And so on.
How do I do that robustly?
EDIT: My original answer did not take into account the "innermost" requirement. I've since taken that into account
To make things "robust," there are a few considerations.
First of all, it's important that you skip over string and comment contents, to avoid incorrect output in situations like:
foo = function()
print(" function() ")
-- function()
print("bar")
^ input
end
This can be somewhat difficult, considering Lua's nested string and comment syntax. Consider, for example, a situation where the input begins in a nested string or comment:
foo = function()
print([[
bar = function()
print("baz")
^ input
end
]])
end
Consequently, if you want a completely robust system, it is not acceptable to only parse backwards until you hit the end of a function parameter list, because you may not have parsed backwards far enough to reach a [[ which would invalidate your match. It is therefore necessary to parse the entire file up to your position (unless you're okay with incorrect matches in these weird situations. If this is an editor plugin, these "incorrect" results may actually be desirable, because they would allow you to edit lua code which is stored in string literal form inside other lua code using the same plugin).
Because the particular syntax that you're trying to match doesn't have any kind of "nesting", a full-blown parser isn't needed. You will need to maintain a stack, however, to keep track of scope. With that in mind, all you need to do is step through the source file character-by-character from the beginning, applying the following logic:
Every time a " or ' is encountered, ignore the characters up to the closing " or '. Be careful to handle escapes like \" and \\
Every time a -- is encountered, ignore the characters up to the closing newline for the comment. Be careful to only do this if the comment is not a multiline comment.
Every time a multiline string opening symbol is encountered (such as [[, [=[, etc), or a multiline comment symbol is encountered (such as --[[ or --[=[, etc) ignore the characters up until the closing square brackets with the proper number of matching equals signs between them.
When a word boundary is encountered, check to see if the characters after it begin a block which ends with an end (for example, if, do, function, etc. DO NOT include repeat, which is closed by until). If so, push the position on the scope stack. Because every while and for loop contains its own do, count either the loop keyword or its do, not both, or the stack will be left with an extra entry. A "word boundary" in this case is any character which could not be used in a Lua identifier (this is to prevent matches in cases like abcfunction()). The beginning of the file is also considered a word boundary.
If a word boundary is encountered and it is followed by end, pop the top element of the stack. If the stack has no elements, complain about a syntax error.
When you finally step forward and reach your "input" position, pop elements from the stack until you find a function scope. Step forward from that position to the next ), ignoring )'s in comments (which could theoretically be found in an argument list if it spans multiple lines or contains inline --[[ ]] comments). That position is your result.
This should handle every case, including situations where the function syntactic sugar is used, like
function foo()
print("bar")
end
which you did not include in your example but which I imagine you still want to match.
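The scanning procedure described above can be sketched as follows. This is an illustrative Python version, not a tested tool, and it makes several assumptions: the input position is given as a single character offset rather than a line and column, the source is syntactically correct Lua 5.1, and blocks are tracked by the keywords function, if and do only (while and for are covered by their own do; repeat and until are ignored). All function names are invented for this example.

```python
import re

LONG_OPEN = re.compile(r"\[(=*)\[")
WORD = re.compile(r"[A-Za-z0-9_]+")
OPENERS = {"function", "if", "do"}   # `while` and `for` are covered by their `do`

def skip_long(src, i):
    """If a long bracket opens at i, return the index just past its close, else None."""
    m = LONG_OPEN.match(src, i)
    if not m:
        return None
    close = "]" + m.group(1) + "]"
    end = src.find(close, m.end())
    return len(src) if end < 0 else end + len(close)

def skip_opaque(src, i):
    """Return the index just past a comment or string starting at i, else None."""
    if src.startswith("--", i):
        end = skip_long(src, i + 2)
        if end is None:                      # single-line comment
            nl = src.find("\n", i)
            end = len(src) if nl < 0 else nl + 1
        return end
    if src[i] in "\"'":
        quote, j = src[i], i + 1
        while j < len(src) and src[j] != quote:
            j += 2 if src[j] == "\\" else 1  # step over escaped characters
        return j + 1
    return skip_long(src, i)

def innermost_function_paren(src, pos):
    """Offset of the ')' closing the parameter list of the innermost
    function containing pos, or None if pos is in the main chunk."""
    stack, i = [], 0
    while i < pos:
        end = skip_opaque(src, i)
        if end is not None:
            i = end
            continue
        m = WORD.match(src, i)               # whole words only, so `abcfunction` is ignored
        if m:
            word = m.group()
            if word in OPENERS:
                stack.append((word, m.end()))
            elif word == "end":
                if not stack:
                    raise SyntaxError("unbalanced 'end' at offset %d" % i)
                stack.pop()
            i = m.end()
        else:
            i += 1
    for word, start in reversed(stack):      # innermost scope first
        if word == "function":
            j = start
            while src[j] != ")":             # skip comments inside the parameter list
                j = skip_opaque(src, j) or j + 1
            return j
    return None
```

Matching whole alphanumeric words, rather than testing for a keyword at every boundary, is a simplification chosen here: numeric literals are consumed as one word and can never be mistaken for a keyword.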

Easiest way to remove Latex tag (but not its content)?

I am using TeXnicCenter to edit a LaTeX document.
I now want to remove a certain tag (say, \emph{blabla}) which occurs multiple times in my document, but not the tag's content (so in this example, I want to remove all emphasis).
What is the easiest way to do so?
A solution using another program that is easily available on Windows 7 would also be fine.
Edit: In response to regex suggestions, it is important that it can deal with nested tags.
Edit 2: I really want to remove the tag from the text file, not just disable it.
Using a regular expression do something like s/\\emph\{([^\}]*)\}/\1/g. If you are not familiar with regular expressions this says:
s -- replace
/ -- begin match section
\\emph\{ -- match \emph{
( -- begin capture
[^\}]* -- match any characters except (meaning up until) a close brace because:
[] a group of characters
^ means not or "everything except"
\} -- the close brace
and * means 0 or more times
) -- end capture, because this is the first (in this case only) capture, it is number 1
\} -- match end brace
/ -- begin replace section
\1 -- replace with captured section number 1
/ -- end regular expression, begin extra flags
g -- global flag, meaning do this every time the match is found not just the first time
This is Perl syntax, as that is what I am familiar with. The following Perl "one-liners" will accomplish two tasks:
perl -pe 's/\\emph\{([^\}]*)\}/\1/g' filename will "test" printing the file to the command line
perl -pi -e 's/\\emph\{([^\}]*)\}/\1/g' filename will change the file in place.
Similar commands may be available in your editor, but if not this will (should) work.
Crowley should have added this as an answer, but I will do that for him. If you replace all \emph{ with {, you should be able to do this without disturbing the other content. The content will still be in braces, but unless you have done some odd stuff it shouldn't matter.
The regex would be a simple s/\\emph\{/\{/g but the search and replace in your editor will do that one too.
Edit: Sorry, used the wrong brace in the regex, fixed now.
\renewcommand{\emph}[1]{#1}
Any reasonably advanced editor should let you do a search/replace using regular expressions, replacing \emph{bla} by bla, etc.
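Since the question asks for nested tags to be handled and for the braces to disappear as well, a small brace-matching pass is one option when a regular expression is not enough. The sketch below is in Python and rests on a few assumptions: the command is written with no space before its opening brace, braces in the document are balanced, and % comments contain no braces. The function name strip_command is invented for this example.

```python
def strip_command(text, name="emph"):
    """Remove every \\name{...} wrapper while keeping its content.
    Nested occurrences and other braced groups are handled with a stack."""
    marker = "\\" + name + "{"
    out = []
    closers = []    # one entry per open '{': True if it belongs to a stripped command
    i = 0
    while i < len(text):
        if text.startswith(marker, i):
            closers.append(True)            # drop "\name{" and remember its brace
            i += len(marker)
        elif text[i] == "\\" and i + 1 < len(text):
            out.append(text[i:i + 2])       # copy escapes such as \{ \} \\ verbatim
            i += 2
        elif text[i] == "{":
            closers.append(False)           # ordinary group: keep both braces
            out.append("{")
            i += 1
        elif text[i] == "}":
            if not (closers and closers.pop()):
                out.append("}")             # keep braces that close ordinary groups
            i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

if __name__ == "__main__":
    print(strip_command(r"a \emph{b \textbf{c \emph{d}} e} f"))
```

To apply it to a file, one would read the whole file into a string, call the function, and write the result back, ideally after making a backup copy.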
