Cobol Mainframe - perform varying Index - display

Cobol Mainframe - perform varying Index - display - cobol

so I am just starting to learn COBOL on Z/OS. I have done quite a bit using visual cobol; however, this is still quite different.
I need to display a table starting at the Index of 1 and displaying until the index is 50
PERFORM VARYING W03-SUBJ-INDX FROM 1 BY 1
UNTIL W03-SUBJ-INDX = 50
DISPLAY W03-SUBJ-TABLE
END-PERFORM
That is what I currently have I also tried
PERFORM VARYING W03-SUBJ-INDX FROM 1 BY 1
UNTIL W03-SUBJ-INDX = 50
DISPLAY W03-SUBJ-TABLE(w03-subj-indx)
END-PERFORM
The top example displays only the first indexed item (To be expected) - The second example gives me an error stating ")" was unexpected.
Any help would be appreciated.. I was told I have to use the index

So regarding your existing code....there was some flakiness in some of the versions of the Enterprise Cobol parsers...
DISPLAY W03-SUBJ-TABLE(w03-subj-indx)
might work as this:
DISPLAY W03-SUBJ-TABLE ( w03-subj-indx )
Some of the versions of the Enterprise Cobol compiler did not parse well without spaces. This was especially important when doing reference modification, but applied to tables as well.
Give it a try, YMMV.

You don't mention which compiler version you are on, but there was once one -- and I can't remember the version -- that was flakey with subscripts verses reference modification.
Try plugging in some spaces:
DISPLAY W03-SUBJ-TABLE ( w03-subj-indx )
Also, make sure that W03-SUBJ-TABLE is the array, not a group item containing the array.

Related

GNUCobol compiled program counts one more record than expected

I'm learning COBOL programming and using GNUCobol (on Linux) to compile and test some simple programs. In one of those programs I have found an unexpected behavior that I don't understand: when reading a sequential file of records, I'm always getting one extra record and, when writing these records to a report, the last record is duplicated.
I have made a very simple program to reproduce this behavior. In this case, I have a text file with a single line of text: "0123456789". The program should count the characters in the file (or 1 chararacter long records) and I expect it to display "10" as a result, but instead I get "11".
Also, when displaying the records, as they are read, I get the following output:
0
1
2
3
4
5
6
7
8
9
11
(There are two blank spaces between 9 and 11).
This is the relevant part of this program:
FD SIMPLE.
01 SIMPLE-RECORD.
05 SMP-NUMBER PIC 9(1).
[...]
PROCEDURE DIVISION.
000-COUNT-RECORDS.
OPEN INPUT SIMPLE.
PERFORM UNTIL SIMPLE-EOF
READ SIMPLE
AT END
SET SIMPLE-EOF TO TRUE
NOT AT END
DISPLAY SMP-NUMBER
ADD 1 TO RECORD-COUNT
END-READ
END-PERFORM
DISPLAY RECORD-COUNT.
CLOSE SIMPLE.
STOP RUN.
I'm using the default options for the compiler, and I have tried using 'WITH TEST {BEFORE|AFTER}' but the result is the same. What can be the cause of this behavior or how can I get the expected result?
Edit: I tried using an "empty" file as data source, expecting a 0 record count, using two different methods to empty the file:
$ echo "" > SIMPLE
This way the record count is 1 (ls -l gives a size of 1 byte for the file).
$ rm SIMPLE
$ touch SIMPLE
This way the record count is 0 (ls -l gives a size of 0 bytes for the file). So I guess that somehow the compiled program is detecting an extra character, but I don't know how to avoid this.

I found out that the cause of this behavior is the automatic newline character that vim seems to append when saving the data file.
After disabling this in vim this way
:set binary
:set noeol
the program works as expected.
Edit: A more elegant way to prevent this problem, when working with data files created from a text editor, is using ORGANIZATION IS LINE SEQUENTIAL in the SELECT clause.
Since the problem was caused by the data format, should I delete this question?

Count number of alphabetic charcters in data

i have string as ' #$rahul ' and i have to calculate number of alpha bates without using inspect verb. Also not using by ord clause for ASCII value. My instructor told me to use empty array but how it is used?? I tried but it counts for symbols also.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-TABLE.
05 WS-A OCCURS 3 TIMES INDEXED BY I.
10 WS-B PIC A(2).
10 WS-C OCCURS 2 TIMES INDEXED BY J.
15 WS-D PIC X(3).
PROCEDURE DIVISION.
MOVE '####DEF34GHIJKL56MNOPQR' TO WS-TABLE.
PERFORM A-PARA VARYING I FROM 1 BY 1 UNTIL I >3
STOP RUN.
A-PARA.
PERFORM C-PARA VARYING J FROM 1 BY 1 UNTIL J>2.
C-PARA.
if ws-table(1) equals to spaces
continue
else
add +1 to ws-count
end-if
DISPLAY WS-C(I,J).

Apart from your table-definition and actual use of the table, you have basically got the idea already, except you are not sure what, specifically, to test for.
What you need to do is find the section in your COBOL documentation on class condition and class tests.
I suspect this bit of code:
if ws-table(1) equals to spaces
continue
else
add +1 to ws-count
end-if
Has been added in haste. With your data, ws-table(1) will never be space, and ws-count is not defined.
Back to your definition. You are defining a structure with three parts (WS-A OCCURS 3) each of which consists of a two-byte alphabetic field followed by two three-byte alphanumeric fields. That definition is of no direct use to your task.
01 the-data.
05 FILLER OCCURS 24 TIMES
INDEXED BY data-byte-index.
10 the-data-byte PIC X.
That will allow you to look at each byte individually. Note that you can always use good names, which will make your programs easier to understand, reduce the chance of careless errors, and make people's lives, including your own when you return to a program some time later, generally easier.
Note, you can also use reference-modification and lose out on the readability for the benefit of less typing.
Format of your program
Unless it is dictated to you (and although I've never seen it before in over 30 years, I have seen it a couple of time recently) there is absolutely no point in "indenting" things like the WORKKING-STORAGE section, or even paragraph/SECTION labels. They already have all the indentation they need, and further indentation adds nothing, which requiring more typing, and also causing experienced COBOL programmers to wonder why you are doing that.
Since the 1985 Standard for COBOL, the use of full-stops/periods in the PROCEDURE DIVISION is greatly relaxed. Since a full-stop/period in the wrong place can cause errors, this was a good thing. It will also be good if you take full advantage of it. Commas look far too much like full-stops/periods to be of any use in code. They never have to be there, so having them benefits nothing. Also noise-words like THEN can/should be avoided. Unlike commas, spacing can be a boon to the format of a program.
Here's your code above, reformatted:
MOVE '####DEF34GHIJKL56MNOPQR'
TO WS-TABLE
PERFORM A-PARA
VARYING I
FROM 1
BY 1
UNTIL I > 3
STOP RUN
.
A-PARA.
PERFORM C-PARA
VARYING J
FROM 1
BY 1
UNTIL J > 2
.
C-PARA.
if ws-table ( 1 ) equal to space
continue
else
add +1 to ws-count
end-if
DISPLAY
WS-C ( I J )
.
Use some proper names, and it's start to look like a real program.
Note, not all people agree on how a program should be formatted. Seriously.

When to use dots in COBOL?

I'm completely new to COBOL, and I'm wondering:
There seems to be no difference between
DISPLAY "foo"
and
DISPLAY "foo".
What does the dot at the end of a line actually do?
When should I use/avoid it?

The period ends the "sentence." It can have an effect on your logic. Consider...
IF A = B
PERFORM 100-DO
SET I-AM-DONE TO TRUE.
...and...
IF A = B
PERFORM 100-DO.
SET I-AM-DONE TO TRUE
The period ends the IF in both examples. In the first, the I-AM-DONE 88-level is set conditionally, in the second it is set unconditionally.
Many people prefer to use explicit scope terminators and use only a single period, often on a physical line by itself to make it stand out, to end a paragraph.

I'm typing this from memory, so if anyone has corrections, I'd appreciate it.
Cobol 1968 required the use of a period to end a division name, procedure division paragraph name, or procedure division paragraph. Each data division element ended with a period.
There were no explicit scope terminators in Cobol 68, like END-IF. A period was also used to end scope. Cobol 1974 brought about some changes that didn't have anything to do with periods.
Rather than try to remember the rules for periods, Cobol programmers tended to end every sentence in a Cobol program with a period.
With the introduction of scope terminators in Cobol 1985, Cobol coders could eliminate most of the periods within a procedure division paragraph. The only periods required in the procedure division of a Cobol 85 program are the to terminate the PROCEDURE DIVISION statement, to terminate code (if any) prior to first paragraph / section header, to terminate paragraph / section header, to terminate a paragraph / section and to terminate a program (if no paragraphs / sections).
Unfortunately, this freaked out the Cobol programmers that coded to the Cobol 68 and 74 standard. To this day, many Cobol shops enforce a coding rule about ending every procedure division sentence with a period.

Where to use!
There are 2 forms to use point.
You can use POINT after every VERB in a SECTION.
EXAMPLE:
0000-EXAMPLE SECTION.
MOVE 0 TO WK-I.
PERFORM UNTIL WK-I GREATER THAN 100
DISPLAY WK-I
ADD 1 TO WK-I
END-PERFORM.
DISPLAY WK-I.
IF WK-I EQUAL ZEROS
DISPLAY WK-I
END-IF.
0000-EXAMEPLE-END. EXIT.
Note that we are using point after every VERB, EXCEPT inside a PERFORM, IF, ETC...
Another form to use is: USING ONLY ONE POINT AT THE END OF SECTION, like here:
0000-EXAMPLE SECTION.
MOVE 0 TO WK-I
PERFORM UNTIL WK-I GREATER THAN 100
DISPLAY WK-I
ADD 1 TO WK-I
END-PERFORM
DISPLAY WK-I
IF WK-I EQUAL ZEROS
DISPLAY WK-I
END-IF
. <======== point here!!!!!!! only HERE!
0000-EXAMEPLE-END. EXIT.
BUT, we ALWAYS have after EXIT and SECTION.....

When it is my choice, I use full-stop/period only where necessary. However, local standards often dictate otherwise: so be it.
The problems caused by full-stops/periods are in the accidental making of something unconditional when code "with" is copied into code "without" whilst coder's brain is left safely in the carpark.
One extra thing to watch for is (hopefully) "old" programs which use NEXT SENTENCE in IBM Mainframe Cobol. "NEXT SENTENCE" means "after the next full-stop/period" which, in "sparse full-stop/period" code is the end of the paragraph/section. Accident waiting to happen. Get a spec-change to allow "NEXT SENTENCE" to be changed to "CONTINUE".

Just tested that in my cobol 85 program by removing all of the periods in procedures and it worked fine.
example:
PROCEDURE DIVISION.
MAIN-PROCESS.
READ DISK-IN
AT END
DISPLAY "NO RECORDS ON INPUT FILE"
STOP RUN
ADD 1 TO READ-COUNT.
PERFORM PROCESS-1 UNTIL END-OF-FILE.
WRITE-HEADER.
MOVE HEADER-INJ-1 TO HEADER-OUT-1
WRITE HEADER-OUT-1.
CLOSE-FILES.
CLOSE DISK-IN
CLOSE DISK-OUT
DISPLAY "READ: " READ-COUNT
DISPLAY "WRITTEN: " WRITE-COUNT
SORT SORT-FILE ON ASCENDING SER-S
USING DISK-OUT
GIVING DISK-OUT
STOP RUN.

How can we eliminate junk value in field?

I have some csv record which are variable in length , for example:
0005464560,45667759,ZAMTR,!To ACC 12345678,DR,79.85
0006786565,34567899,ZAMTR,!To ACC 26575443,DR,1000
I need to seperate each of these fields and I need the last field which should be a money.
However, as I read the file, and unstring the record into fields, I found that the last field contain junk value at the end of itself. The amount(money) field should be 8 characters, 5 digit at the front, 1 dot, 2 digit at the end. The values from the input could be any value such as 13.5, 1000 and 354.23 .
"FILE SECTION"
FD INPUT_FILE.
01 INPUT_REC PIC X(66).
"WORKING STORAGE SECTion"
01 WS_INPUT_REC PIC X(66).
01 WS_AMOUNT_NUM PIC 9(5).9(2).
01 WS_AMOUNT_TXT PIC X(8).
"MAIN SECTION"
UNSTRING INPUT_REC DELIMITED BY ","
INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT
MOVE WS_AMOUNT_TXT(1:8) TO WS_AMOUNT_NUM(1:8)
DISPLAY WS_AMOUNT_NUM
From the display, the value is rather normal: 345.23, 1000, just as what are, however, after I wrote the field into a file, here is what they become:
79.85^M^#^#
137.35^M^#
I have inspect the field WS_AMOUNT_NUM, which came from the field WS_AMOUNT_TXT, and found that ^# is a kind of LOW-VALUE. However, I cannot find what is ^M, it is not a space, not a high-value.

I am guessing, but it looks like you may be reading variable length records from a file into a fixed length
COBOL record. The junk
at the end of the COBOL record is giving you some grief. Hard to say how consistent that junk is going
to be from one read to the next (data beyond the bounds of actual input record length are technically
undefined). That junk ends up
being included in WS_AMOUNT_TXT after the UNSTRING
There are a number of ways to solve this problem. The suggestion I am giving you here may not
be optimal, but it is simple and should get the job done.
The last INTO field, WS_AMOUNT_TXT, in your UNSTRING statement is the one that receives all of the trailing
junk. That junk needs to be stripped off. Knowing that the only valid characters in the last field are
digits and the decimal character, you could clean it up as follows:
PERFORM VARYING WS_I FROM LENGTH OF WS_AMOUNT_TXT BY -1
UNTIL WS_I = ZERO
IF WS_AMOUNT_TXT(WS_I:1) IS NUMERIC OR
WS_AMOUNT_TXT(WS_I:1) = '.'
MOVE ZERO TO WS_I
ELSE
MOVE SPACE TO WS_AMOUNT_TXT(WS_I:1)
END-IF
END-PERFORM
The basic idea in the above code is to scan from the end of the last UNSTRING output field
to the beginning replacing anything that is not a valid digit or decimal point with a space.
Once a valid digit/decimal is found, exit the loop on the assumption that the rest will
be valid.
After cleanup use the intrinsic function NUMVAL as outlined in my answer to your
previous question
to convert WS_AMOUNT_TXT into a numeric data type.
One final piece of advice, MOVE SPACES TO INPUT_REC before each READ to blow away data left over
from a previous read that might be left in the buffer. This will protect you when reading a very "short"
record after a "long" one - otherwise you may trip over data left over from the previous read.
Hope this helps.
EDIT Just noticed this answer to your question about reading variable length files. Using a variable length input record is a better approach. Given the
actual input record length you can do something like:
UNSTRING INPUT_REC(1:REC_LEN) INTO...
Where REC_LEN is the variable specified after OCCURS DEPENDING ON for the INPUT_REC file FD. All the junk you are encountering occurs after the end of the record as defined by REC_LEN. Using reference modification as illustrated above trims it off before UNSTRING does its work to separate out the individual data fields.
EDIT 2:
Cannot use reference modification with UNSTRING. Darn... It is possible with some other COBOL dialects but not with OpenVMS COBOL. Try the following:
MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...
Where WS_BUFFER is a working storage PIC X variable long enough to hold the longest input record. When you MOVE a short alpha-numeric field to a longer one, the destination field is left justified with spaces used to pad remaining space (ie. WS_BUFFER). Since leading and trailing spaces are acceptable to the NUMVAL fucnction you have exactly what you need.
I have a reason for pushing you in this direction. Any junk that ends up at the trailing end of a record buffer when reading a short record is undefined. There is a possibility that some of that junk just might end up being a digit or a decimal point. Should this occur, the cleanup routine I originally suggested would fail.
EDIT 3:
There are no ^# in the resulting WS_AMOUNT_TXT, but still there are a ^M
Looks like the file system is treating <CR> (that ^M thing) at the end of each record as data.
If the file you are reading came from a Windows platform and you are now
reading it on a UNIX platform that would explain the problem. Under Windows records
are terminated with <CR><LF> while on UNIX they are terminated with <LF> only. The
UNIX file system treats <CR> as if it were part of the record.
If this is the case, you can be pretty sure that there will be a single <CR> at the
end of every record read. There are a number of ways to deal with this:
Method 1: As you already noted, pre-edit the file using Notepad++ or some other
tool to remove the <CR> characters before processing through your COBOL program.
Personally I don't think this is the best way of going about it. I prefer to use a COBOL
only solution since it involves fewer processing steps.
Method 2: Trim the last character from each input record before processing it. The last
character should always be <CR>. Try the following if you
are reading records as variable length and have the actual input record length available.
SUBTRACT 1 FROM REC_LEN
MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...
Method 3: Treat <CR> as a delimiter when UNSTRINGing as follows:
UNSTRING INPUT_REC DELIMITED BY "," OR x"0D"
INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT
Method 4: Condition the last receiving field from UNSTRING by replacing trailing
non digit/non decimal point characters with spaces. I outlined this solution a litte earlier in this
question. You could also explore the INSPECT statement using the REPLACING option (Format 2). This should be able to do pretty much the same thing - just replace all x"00" by SPACE and x"0D" by SPACE.
Where there is a will, there is a way. Any of the above solutions should work for you. Choose the one you are most comfortable with.

^M is a carriage return.
Would Google Refine be useful for rectifying this data?

In Cobol, to test "null or empty" we use "NOT = SPACE [ AND/OR ] LOW-VALUE" ? Which is it?

I am now working in mainframe,
in some modules, to test
Not null or Empty
we see :
NOT = SPACE OR LOW-VALUE
The chief says that we should do :
NOT = SPACE AND LOW-VALUE
Which one is it ?
Thanks!

Chief is correct.
COBOL is supposed to read something like natural language (this turns out to be just
another bad joke).
Lets play with the following variables and values:
A = 1
B = 2
C = 3
An expression such as:
IF A NOT EQUAL B THEN...
Is fairly straight forward to understand. One is not equal to two so we will do
whatever follows the THEN. However,
IF A NOT EQUAL B AND A NOT EQUAL C THEN...
Is a whole lot harder to follow. Again one is not equal to two AND one is not
equal to three so we will do whatever follows the 'THEN'.
COBOL has a short hand construct that IMHO should never be used. It confuses just about
everyone (including me from time to time). Short hand expressions let you reduce the above to:
IF A NOT EQUAL B AND C THEN...
or if you would
like to apply De Morgans rule:
IF NOT (A EQUAL B OR C) THEN...
My advice to you is avoid NOT in exprssions and NEVER use COBOL short hand expressions.
What you really want is:
IF X = SPACE OR X = LOW-VALUE THEN...
CONTINUE
ELSE
do whatever...
END-IF
The above does nothing when the 'X' contains either spaces or low-values (nulls). It
is exactly the same as:
IF NOT (X = SPACE OR X = LOW-VALUE) THEN
do whatever...
END-IF
Which can be transformed into:
IF X NOT = SPACE AND X NOT = LOW-VALUE THEN...
And finally...
IF X NOT = SPACE AND LOW-VALUE THEN...
My advice is to stick to simple to understand longer and straight forward expressions
in COBOL, forget the short hand crap.

In COBOL, there is no such thing as a Java null AND it is never "empty".
For example, take a field
05 FIELD-1 PIC X(5).
The field will always contain something.
MOVE LOW-VALUES TO FIELD-1.
now it contains hexadimal zeros. x'0000000000'
MOVE HIGH-VALUES TO FIELD-1.
Now it contains all binary ones: x'FFFFFFFFFF'
MOVE SPACES TO FIELD-1.
Now each byte is a space. x'4040404040'
Once you declare a field, it points to a certain area in memory. That memory area must be set to something, even if you never modify it, it still will have what ever garbage it had before the program was loaded. Unless you initialize it.
05 FIELD-1 PIC X(6) VALUE 'BARUCH'.

It is worth noting that the value null is not always the same as low-value and this depends on the device architecture and its character set in use as determined by the manufacturer. Mainframes can have an entirely different collating sequence (low to high character code and symbol order) and symbol set compared to a device using linux or windows as you have no doubt seen by now. The shorthand used in Cobol for comparisons is sometimes used for boolean operations, like IF A GOTO PAR-5 and IF A OR C THEN .... and can be combined with comparisons of two variables or a variable and a literal value. The parser and compiler on different devices should deal with these situations in a standard (ANSI) method but this is not always the situation.

I agree with NealB. Keep it simple, avoid "short cuts", make it easy to understand without having to refer to the manual to check things out.
IF ( X EQUAL TO SPACE )
OR ( X EQUAL TO LOW-VALUES )
CONTINUE
ELSE
do whatever...
END-IF
However, why not put an 88 on X, and keep it really simple?:
88 X-HAS-A-VALUE-INDICATING-NULL-OR-EMPTY VALUE SPACE, LOW-VALUES.
IF X-HAS-A-VALUE-INDICATING-NULL-OR-EMPTY
CONTINUE
ELSE
do whatever...
END-IF
Note, in Mainframe Cobol, NULL is very restricted in meaning, and is not the meaning that you are attributing to it, Tom. "Empty" only means something in a particular coder-generated context (it means nothing to Cobol as far as a field is concerned).
We don't have "strings". Therefore, we don't have "null strings" (a string of length one including string-terminator). We don't have strings, so a field always has a value, so it can never be "empty" other than as termed by the programmer.
Oguz, I think your post illustrates how complex something that is really simple can be made, and how that can lead to errors. Can you test your conditions, please?

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart