NEXT SENTENCE and nested sentences - cobol

I have come across this bit of code and am wondering which line will be executed if x is smaller than 3.
IF (X < 3)
NEXT SENTENCE
ELSE
GO TO A010-DO-A.
GO TO B010-DO-B.
GO TO C010-DO-C.
I am not sure if the NEXT SENTENCE will notice the sentence nested in the ELSE block. When NEXT SENTENCE is executed will it skip over GO TO A010-DO-A. or GO TO B010-DO-B.?

Don't confuse the scope of statements and sentences in COBOL.
Sentences end with a period (or full stop if you are British). Next Sentence
goes to the next statement following the end of the current Sentence. In
your example that would be GO TO B010-DO-B
In general usage of NEXT SENTENCE in Cobol is depreciated - at least since
the introduction of scope terminators such as END-whatever (eg. END-IF)
which happend sometime around 1985! Please do not
use NEXT SENTENCE any more. You need to know what it is and what it does
in order to read legacy code, but please
avoid using it in any new code.
A better way to write the code in your example would be:
IF (X < 3)
CONTINUE
ELSE
GO TO A010-DO-A
END-IF
GO TO B010-DO-B
GO TO C010-DO-C
or...
IF (X >= 3)
GO TO A010-DO-A
END-IF
GO TO B010-DO-B
GO TO C010-DO-C
Notice all the periods (.) have been removed because
the scope terminator END-IF makes them redundant. Periods
are only needed at the end of procedures (ie. paragraphs/sections) and a few other places.
The CONTINUE statement is basically a no-op so has no affect other than being
a place holder to keep the syntax valid.
BTW... Best I can tell, the statement GO TO C010-DO-C is logically unreachable.

If X is less than 3
IF (X < 3)
NEXT SENTENCE
Otherwise, or in other words, if X is equal to or grater than 3
ELSE
GO TO A010-DO-A.

NEXT SENTENCE "branches" (a GO TO in whatever language is generated by the compiler) to the line of code following the next full-stop/period that is located physically after the NEXT SENTENCE statement. It is effectively a GO TO without needing a "label".
As has been said, it should not be used in new code.

Related

How to leave a section before the end without a goto statement

In cobol a section (similar to a function in c) can look like this:
abc section.
command a
command b
if a = 4
go to abc-end
end-if
command c
command d.
abc-end.
exit.
Until now, the only possibility for me to leave the section before the end (on a = 4), was with the command "goto".
Is there any other way to do it without goto?
Thanks in advance for any help!
From the draft, FCD 1.0(E) 2010-08-06
The EXIT PARAGRAPH and EXIT SECTION statements provide a means of exiting a
structured procedure without executing any of the following statements within
the procedure.
Like this for your example.
abc section.
command a
command b
if a NOT EQUAL TO 4
command c
command d
end-if
.
Like this for something invented but with names to help grasp the thing.
30D-UNPACK-CRATE SECTION.
PERFORM 30DA-COMMON-PER-CRATE
IF NOT STANDARD-CRATE
PERFORM 30DD-NON-STANDARD-CRATE
END-IF
.
or
30D-UNPACK-CRATE SECTION.
PERFORM 30DA-COMMON-PER-CRATE
IF SPECIAL-CRATE
PERFORM 30DD-NON-STANDARD-CRATE
END-IF
.
This uses 88-level Condition Names, so might be
88 STANDARD-CRATE VALUE "A" THRU "D" "J" "2".
88 SPECIAL-CRATE VALUE "X" "Z".
Again, just examples.
Apart from the fact that you are already using SECTIONs (and if you have a SECTION all the paragraphs before the next section belong to that, so you can't have isolated paragraphs) the above could either be a SECTION or a paragraph. No need for PERFORM .... THRU ... or the ... SECTION. Except for local standards... and, with SECTIONs, how the program is already coded.
EXIT PARAGRAPH and EXIT SECTION.
These may or may not be available on thee compiler that you are using.
Be aware that they are functionally equivalent to a GO TO, with some caveats, so replacing GO TOs by them will give a false sense of... I don't know what, but it'll be false.
PERFORM THE-FIRST
THE-FIRST SECTION.
TF-1.
some code
.
TF-2.
some code
.
TF-9.
some code
.
Different program:
PERFORM A-PARAGRAPH THRU AP-9
A-PARAGRAPH.
some code
.
AP-1.
some code
.
AP-9.
some code
.
Often you will find the final paragraph of a SECTION just contains EXIT (and be aware that EXIT generates no code, it is just a placeholder) and a similar situation with PERFORM ... THRU ....
Usually PERFORM ... THRU ... will only contain two paragraphs, but there is nothing except local standards which says it is so.
EXIT SECTION will "GO TO" an assumed CONTINUE statement immediately prior to the full-stop/period which terminates the SECTION.
EXIT PARAGRAPH will "GO TO" an assumed CONTINUE statement immediately prior to the full-stop/period which terminates the paragraph
If EXIT PARAGRAPH is used within a SECTION, or within the range of a PERFORM ... THRU ... which contains more than one paragraph excluding any that solely uses EXIT, it will probably not work as the author expects nor as the next reader expects.
THE-FIRST SECTION.
TF-1.
some code
IF condition
EXIT PARAGRAPH
END-IF
some more code
IF condition
EXIT SECTION
END-IF
some more code
<EXIT PARAGRAPH will arrive here>
.
TF-2.
some code
.
TF-9.
some code
<EXIT SECTION will arrive here>
.
A-PARAGRAPH.
some code
IF condition
EXIT PARAGRAPH (a - just descriptive, not syntax)
END-IF
some more code
<EXIT PARAGRAPH (a) will arrive here>
.
AP-1.
some code
IF condition
EXIT PARAGRAPH (b - just descriptive, not syntax)
END-IF
some more code
<EXIT PARAGRAPH (b) will arrive here>
.
AP-9.
some code
.
So, if using EXIT PARAGRAPH in a SECTION or in a PERFORM ... THRU ... with more than one paragraph containing other than a single EXIT, then you're creating a potentially worse situation than using a plain GO TO (where it is blindingly obvious what the intended destination of the branch is) whilst using a construct which looks to be "structured".

Haskell/Parsec: How do you use the functions in Text.Parsec.Indent?

I'm having trouble working out how to use any of the functions in the Text.Parsec.Indent module provided by the indents package for Haskell, which is a sort of add-on for Parsec.
What do all these functions do? How are they to be used?
I can understand the brief Haddock description of withBlock, and I've found examples of how to use withBlock, runIndent and the IndentParser type here, here and here. I can also understand the documentation for the four parsers indentBrackets and friends. But many things are still confusing me.
In particular:
What is the difference between withBlock f a p and
do aa <- a
pp <- block p
return f aa pp
Likewise, what's the difference between withBlock' a p and do {a; block p}
In the family of functions indented and friends, what is ‘the level of the reference’? That is, what is ‘the reference’?
Again, with the functions indented and friends, how are they to be used? With the exception of withPos, it looks like they take no arguments and are all of type IParser () (IParser defined like this or this) so I'm guessing that all they can do is to produce an error or not and that they should appear in a do block, but I can't figure out the details.
I did at least find some examples on the usage of withPos in the source code, so I can probably figure that out if I stare at it for long enough.
<+/> comes with the helpful description “<+/> is to indentation sensitive parsers what ap is to monads” which is great if you want to spend several sessions trying to wrap your head around ap and then work out how that's analogous to a parser. The other three combinators are then defined with reference to <+/>, making the whole group unapproachable to a newcomer.
Do I need to use these? Can I just ignore them and use do instead?
The ordinary lexeme combinator and whiteSpace parser from Parsec will happily consume newlines in the middle of a multi-token construct without complaining. But in an indentation-style language, sometimes you want to stop parsing a lexical construct or throw an error if a line is broken and the next line is indented less than it should be. How do I go about doing this in Parsec?
In the language I am trying to parse, ideally the rules for when a lexical structure is allowed to continue on to the next line should depend on what tokens appear at the end of the first line or the beginning of the subsequent line. Is there an easy way to achieve this in Parsec? (If it is difficult then it is not something which I need to concern myself with at this time.)
So, the first hint is to take a look at IndentParser
type IndentParser s u a = ParsecT s u (State SourcePos) a
I.e. it's a ParsecT keeping an extra close watch on SourcePos, an abstract container which can be used to access, among other things, the current column number. So, it's probably storing the current "level of indentation" in SourcePos. That'd be my initial guess as to what "level of reference" means.
In short, indents gives you a new kind of Parsec which is context sensitive—in particular, sensitive to the current indentation. I'll answer your questions out of order.
(2) The "level of reference" is the "belief" referred in the current parser context state of where this indentation level starts. To be more clear, let me give some test cases on (3).
(3) In order to start experimenting with these functions, we'll build a little test runner. It'll run the parser with a string that we give it and then unwrap the inner State part using an initialPos which we get to modify. In code
import Text.Parsec
import Text.Parsec.Pos
import Text.Parsec.Indent
import Control.Monad.State
testParse :: (SourcePos -> SourcePos)
-> IndentParser String () a
-> String -> Either ParseError a
testParse f p src = fst $ flip runState (f $ initialPos "") $ runParserT p () "" src
(Note that this is almost runIndent, except I gave a backdoor to modify the initialPos.)
Now we can take a look at indented. By examining the source, I can tell it does two things. First, it'll fail if the current SourcePos column number is less-than-or-equal-to the "level of reference" stored in the SourcePos stored in the State. Second, it somewhat mysteriously updates the State SourcePos's line counter (not column counter) to be current.
Only the first behavior is important, to my understanding. We can see the difference here.
>>> testParse id indented ""
Left (line 1, column 1): not indented
>>> testParse id (spaces >> indented) " "
Right ()
>>> testParse id (many (char 'x') >> indented) "xxxx"
Right ()
So, in order to have indented succeed, we need to have consumed enough whitespace (or anything else!) to push our column position out past the "reference" column position. Otherwise, it'll fail saying "not indented". Similar behavior exists for the next three functions: same fails unless the current position and reference position are on the same line, sameOrIndented fails if the current column is strictly less than the reference column, unless they are on the same line, and checkIndent fails unless the current and reference columns match.
withPos is slightly different. It's not just a IndentParser, it's an IndentParser-combinator—it transforms the input IndentParser into one that thinks the "reference column" (the SourcePos in the State) is exactly where it was when we called withPos.
This gives us another hint, btw. It lets us know we have the power to change the reference column.
(1) So now let's take a look at how block and withBlock work using our new, lower level reference column operators. withBlock is implemented in terms of block, so we'll start with block.
-- simplified from the actual source
block p = withPos $ many1 (checkIndent >> p)
So, block resets the "reference column" to be whatever the current column is and then consumes at least 1 parses from p so long as each one is indented identically as this newly set "reference column". Now we can take a look at withBlock
withBlock f a p = withPos $ do
r1 <- a
r2 <- option [] (indented >> block p)
return (f r1 r2)
So, it resets the "reference column" to the current column, parses a single a parse, tries to parse an indented block of ps, then combines the results using f. Your implementation is almost correct, except that you need to use withPos to choose the correct "reference column".
Then, once you have withBlock, withBlock' = withBlock (\_ bs -> bs).
(5) So, indented and friends are exactly the tools to doing this: they'll cause a parse to immediately fail if it's indented incorrectly with respect to the "reference position" chosen by withPos.
(4) Yes, don't worry about these guys until you learn how to use Applicative style parsing in base Parsec. It's often a much cleaner, faster, simpler way of specifying parses. Sometimes they're even more powerful, but if you understand Monads then they're almost always completely equivalent.
(6) And this is the crux. The tools mentioned so far can only do indentation failure if you can describe your intended indentation using withPos. Quickly, I don't think it's possible to specify withPos based on the success or failure of other parses... so you'll have to go another level deeper. Fortunately, the mechanism that makes IndentParsers work is obvious—it's just an inner State monad containing SourcePos. You can use lift :: MonadTrans t => m a -> t m a to manipulate this inner state and set the "reference column" however you like.
Cheers!

When to use dots in COBOL?

I'm completely new to COBOL, and I'm wondering:
There seems to be no difference between
DISPLAY "foo"
and
DISPLAY "foo".
What does the dot at the end of a line actually do?
When should I use/avoid it?
The period ends the "sentence." It can have an effect on your logic. Consider...
IF A = B
PERFORM 100-DO
SET I-AM-DONE TO TRUE.
...and...
IF A = B
PERFORM 100-DO.
SET I-AM-DONE TO TRUE
The period ends the IF in both examples. In the first, the I-AM-DONE 88-level is set conditionally, in the second it is set unconditionally.
Many people prefer to use explicit scope terminators and use only a single period, often on a physical line by itself to make it stand out, to end a paragraph.
I'm typing this from memory, so if anyone has corrections, I'd appreciate it.
Cobol 1968 required the use of a period to end a division name, procedure division paragraph name, or procedure division paragraph. Each data division element ended with a period.
There were no explicit scope terminators in Cobol 68, like END-IF. A period was also used to end scope. Cobol 1974 brought about some changes that didn't have anything to do with periods.
Rather than try to remember the rules for periods, Cobol programmers tended to end every sentence in a Cobol program with a period.
With the introduction of scope terminators in Cobol 1985, Cobol coders could eliminate most of the periods within a procedure division paragraph. The only periods required in the procedure division of a Cobol 85 program are the to terminate the PROCEDURE DIVISION statement, to terminate code (if any) prior to first paragraph / section header, to terminate paragraph / section header, to terminate a paragraph / section and to terminate a program (if no paragraphs / sections).
Unfortunately, this freaked out the Cobol programmers that coded to the Cobol 68 and 74 standard. To this day, many Cobol shops enforce a coding rule about ending every procedure division sentence with a period.
Where to use!
There are 2 forms to use point.
You can use POINT after every VERB in a SECTION.
EXAMPLE:
0000-EXAMPLE SECTION.
MOVE 0 TO WK-I.
PERFORM UNTIL WK-I GREATER THAN 100
DISPLAY WK-I
ADD 1 TO WK-I
END-PERFORM.
DISPLAY WK-I.
IF WK-I EQUAL ZEROS
DISPLAY WK-I
END-IF.
0000-EXAMEPLE-END. EXIT.
Note that we are using point after every VERB, EXCEPT inside a PERFORM, IF, ETC...
Another form to use is: USING ONLY ONE POINT AT THE END OF SECTION, like here:
0000-EXAMPLE SECTION.
MOVE 0 TO WK-I
PERFORM UNTIL WK-I GREATER THAN 100
DISPLAY WK-I
ADD 1 TO WK-I
END-PERFORM
DISPLAY WK-I
IF WK-I EQUAL ZEROS
DISPLAY WK-I
END-IF
. <======== point here!!!!!!! only HERE!
0000-EXAMEPLE-END. EXIT.
BUT, we ALWAYS have after EXIT and SECTION.....
When it is my choice, I use full-stop/period only where necessary. However, local standards often dictate otherwise: so be it.
The problems caused by full-stops/periods are in the accidental making of something unconditional when code "with" is copied into code "without" whilst coder's brain is left safely in the carpark.
One extra thing to watch for is (hopefully) "old" programs which use NEXT SENTENCE in IBM Mainframe Cobol. "NEXT SENTENCE" means "after the next full-stop/period" which, in "sparse full-stop/period" code is the end of the paragraph/section. Accident waiting to happen. Get a spec-change to allow "NEXT SENTENCE" to be changed to "CONTINUE".
Just tested that in my cobol 85 program by removing all of the periods in procedures and it worked fine.
example:
PROCEDURE DIVISION.
MAIN-PROCESS.
READ DISK-IN
AT END
DISPLAY "NO RECORDS ON INPUT FILE"
STOP RUN
ADD 1 TO READ-COUNT.
PERFORM PROCESS-1 UNTIL END-OF-FILE.
WRITE-HEADER.
MOVE HEADER-INJ-1 TO HEADER-OUT-1
WRITE HEADER-OUT-1.
CLOSE-FILES.
CLOSE DISK-IN
CLOSE DISK-OUT
DISPLAY "READ: " READ-COUNT
DISPLAY "WRITTEN: " WRITE-COUNT
SORT SORT-FILE ON ASCENDING SER-S
USING DISK-OUT
GIVING DISK-OUT
STOP RUN.

In Cobol, to test "null or empty" we use "NOT = SPACE [ AND/OR ] LOW-VALUE" ? Which is it?

I am now working in mainframe,
in some modules, to test
Not null or Empty
we see :
NOT = SPACE OR LOW-VALUE
The chief says that we should do :
NOT = SPACE AND LOW-VALUE
Which one is it ?
Thanks!
Chief is correct.
COBOL is supposed to read something like natural language (this turns out to be just
another bad joke).
Lets play with the following variables and values:
A = 1
B = 2
C = 3
An expression such as:
IF A NOT EQUAL B THEN...
Is fairly straight forward to understand. One is not equal to two so we will do
whatever follows the THEN. However,
IF A NOT EQUAL B AND A NOT EQUAL C THEN...
Is a whole lot harder to follow. Again one is not equal to two AND one is not
equal to three so we will do whatever follows the 'THEN'.
COBOL has a short hand construct that IMHO should never be used. It confuses just about
everyone (including me from time to time). Short hand expressions let you reduce the above to:
IF A NOT EQUAL B AND C THEN...
or if you would
like to apply De Morgans rule:
IF NOT (A EQUAL B OR C) THEN...
My advice to you is avoid NOT in exprssions and NEVER use COBOL short hand expressions.
What you really want is:
IF X = SPACE OR X = LOW-VALUE THEN...
CONTINUE
ELSE
do whatever...
END-IF
The above does nothing when the 'X' contains either spaces or low-values (nulls). It
is exactly the same as:
IF NOT (X = SPACE OR X = LOW-VALUE) THEN
do whatever...
END-IF
Which can be transformed into:
IF X NOT = SPACE AND X NOT = LOW-VALUE THEN...
And finally...
IF X NOT = SPACE AND LOW-VALUE THEN...
My advice is to stick to simple to understand longer and straight forward expressions
in COBOL, forget the short hand crap.
In COBOL, there is no such thing as a Java null AND it is never "empty".
For example, take a field
05 FIELD-1 PIC X(5).
The field will always contain something.
MOVE LOW-VALUES TO FIELD-1.
now it contains hexadimal zeros. x'0000000000'
MOVE HIGH-VALUES TO FIELD-1.
Now it contains all binary ones: x'FFFFFFFFFF'
MOVE SPACES TO FIELD-1.
Now each byte is a space. x'4040404040'
Once you declare a field, it points to a certain area in memory. That memory area must be set to something, even if you never modify it, it still will have what ever garbage it had before the program was loaded. Unless you initialize it.
05 FIELD-1 PIC X(6) VALUE 'BARUCH'.
It is worth noting that the value null is not always the same as low-value and this depends on the device architecture and its character set in use as determined by the manufacturer. Mainframes can have an entirely different collating sequence (low to high character code and symbol order) and symbol set compared to a device using linux or windows as you have no doubt seen by now. The shorthand used in Cobol for comparisons is sometimes used for boolean operations, like IF A GOTO PAR-5 and IF A OR C THEN .... and can be combined with comparisons of two variables or a variable and a literal value. The parser and compiler on different devices should deal with these situations in a standard (ANSI) method but this is not always the situation.
I agree with NealB. Keep it simple, avoid "short cuts", make it easy to understand without having to refer to the manual to check things out.
IF ( X EQUAL TO SPACE )
OR ( X EQUAL TO LOW-VALUES )
CONTINUE
ELSE
do whatever...
END-IF
However, why not put an 88 on X, and keep it really simple?:
88 X-HAS-A-VALUE-INDICATING-NULL-OR-EMPTY VALUE SPACE, LOW-VALUES.
IF X-HAS-A-VALUE-INDICATING-NULL-OR-EMPTY
CONTINUE
ELSE
do whatever...
END-IF
Note, in Mainframe Cobol, NULL is very restricted in meaning, and is not the meaning that you are attributing to it, Tom. "Empty" only means something in a particular coder-generated context (it means nothing to Cobol as far as a field is concerned).
We don't have "strings". Therefore, we don't have "null strings" (a string of length one including string-terminator). We don't have strings, so a field always has a value, so it can never be "empty" other than as termed by the programmer.
Oguz, I think your post illustrates how complex something that is really simple can be made, and how that can lead to errors. Can you test your conditions, please?

Easiest way to remove Latex tag (but not its content)?

I am using TeXnicCenter to edit a LaTeX document.
I now want to remove a certain tag (say, emph{blabla}} which occurs multiple times in my document , but not tag's content (so in this example, I want to remove all emphasization).
What is the easiest way to do so?
May also be using another program easily available on Windows 7.
Edit: In response to regex suggestions, it is important that it can deal with nested tags.
Edit 2: I really want to remove the tag from the text file, not just disable it.
Using a regular expression do something like s/\\emph\{([^\}]*)\}/\1/g. If you are not familiar with regular expressions this says:
s -- replace
/ -- begin match section
\\emph\{ -- match \emph{
( -- begin capture
[^\}]* -- match any characters except (meaning up until) a close brace because:
[] a group of characters
^ means not or "everything except"
\} -- the close brace
and * means 0 or more times
) -- end capture, because this is the first (in this case only) capture, it is number 1
\} -- match end brace
/ -- begin replace section
\1 -- replace with captured section number 1
/ -- end regular expression, begin extra flags
g -- global flag, meaning do this every time the match is found not just the first time
This is with Perl syntax, as that is what I am familiar with. The following perl "one-liners" will accomplish two tasks
perl -pe 's/\\emph\{([^\}]*)\}/\1/g' filename will "test" printing the file to the command line
perl -pi -e 's/\\emph\{([^\}]*)\}/\1/g' filename will change the file in place.
Similar commands may be available in your editor, but if not this will (should) work.
Crowley should have added this as an answer, but I will do that for him, if you replace all \emph{ with { you should be able to do this without disturbing the other content. It will still be in braces, but unless you have done some odd stuff it shouldn't matter.
The regex would be a simple s/\\emph\{/\{/g but the search and replace in your editor will do that one too.
Edit: Sorry, used the wrong brace in the regex, fixed now.
\renewcommand{\emph}[1]{#1}
any reasonably advanced editor should let you do a search/replace using regular expressions, replacing emph{bla} by bla etc.

Resources