Cobol data files

Cobol data files - cobol

First let me apologize if data is not that complete . This is not me being lazy but me being not aware of cobol details .
I have been assigned in my firm to extract our old financial data from files read by cobol programs and turn them to a database in our oracle DB . I am not able to read these files as normal texts . i don't know how to turn then to normal text .
As per the cobol source each row is 7 records and each record is 72 chars .
the files are very large . each one is 3 GB in average . how can i open them as a normal text ?
here is the file section
000220 ENVIRONMENT DIVISION.
000230 CONFIGURATION SECTION.
000240 SOURCE-COMPUTER. NCR-3000.
000250 OBJECT-COMPUTER. NCR-3000.
000260 INPUT-OUTPUT SECTION.
000270 FILE-CONTROL.
000280 SELECT DQ-HIMVT-A ASSIGN TO DISC
000290 ORGANIZATION INDEXED
000300 ACCESS MODE DYNAMIC
000310 RECORD KEY CLE-A.
000320*
000330 DATA DIVISION.
000340 FILE SECTION.
000350 FD DQ-HIMVT-A BLOCK CONTAINS 7 RECORDS
000360 RECORD CONTAINS 73 CHARACTERS
000370 LABEL RECORD STANDARD
000380 DATA RECORD IS HIMVT-A.
000390 01 HIMVT-A.
000400 02 CLE-A.
000410 03 ENT-A PIC 99.
000420 03 NUCPT-A PIC 9(13) COMP-6.
000430 03 DEV-A PIC XXX.
000440 03 DATOP-A PIC 9(7) COMP-6.
000450 03 SIG-A PIC 9.
000460 03 FORC-A PIC 9.
000470 03 DATVAL-A PIC 9(7) COMP-6.
000480 03 NUMOP-A PIC 9(9) COMP-6.
000490 03 MT-A PIC 9(12)V999 COMP-6.
000500 02 FILLER PIC X(8).
000510 02 TYPCPT-A PIC 9(3) COMP-6.
000520 02 LIBOP-A PIC X(15).
000530 02 SOLD-A PIC S9(12)V999 COMP-3.
000540 02 DATTRAIT-A PIC 9(7) COMP-6.
000550 02 FILLER PIC X.
Here is a sample of the file when opened from notepad++
RMKF I I 0 ** ƒ ’ *B9 *B9 ’ ’ ÿ # "c *B9 Þ #01 EGP %10 % ƒ 21 $ '10 ' (#P )€ 010 0 0 EGP $21 $ %11 $ (EGP $21 $ %11 $ 7EGP $21 $ %11 $ FEGP $21 $ %11 $ UEGP $21 $ %11 $ ` ÿÿÿÿÿÿÿÿÿÿÿÿÿÿ >01 ÔEGP %10 % ÔƒÖ 21Â
NO. 0 ÄÕ
environment section
000220 ENVIRONMENT DIVISION.
000230 CONFIGURATION SECTION.
000240 SOURCE-COMPUTER. NCR-3000.
000250 OBJECT-COMPUTER. NCR-3000.
000260 INPUT-OUTPUT SECTION.
000270 FILE-CONTROL.
000280 SELECT DQ-HIMVT-A ASSIGN TO DISC
000290 ORGANIZATION INDEXED
000300 ACCESS MODE DYNAMIC
000310 RECORD KEY CLE-A.
I found this file which they call a copy book . don't know how it ois related
000100*
000200**** CINVDAT - ZONE DE TRAVAIL ****
000300*******************************************
000400****
000500*
000600 01 INVDATRAV.
000700 03 INVZON1 PIC 99.
000800 03 INVZON2 PIC 99.
000900 03 INVZON3 PIC 99.
001000 01 INVZONI PIC 99.
001100 01 INVDATE PIC 9(6).
001200 01 INVCAL PIC 9.
001300*
Regards

You may be able to locate a service which can do the extract for you. If you go this route, ensure that they have all the information you can provide (which must include the data-definitions under the FD) and agree to only pay on verified receipt of the data.
An alternative is to talk to Micro Focus about a short-term license for a COBOL which (again must be guaranteed) can understand the indexed-file format. You then write one simple program per file whose data you need to extract. Advantage here is that what COMP-3 and COMP-6 represent, you don't need to know, as the conversion to a "text" number is done without anyone having to think about it (on the output definition, you remove all references to COMP-anything (also COMP, if there happen to be any)).
A further alternative is to sit down with a hex editor, knowledge of the data, and work out how to abstract the index information away from the data (all the data records are a known, fixed, length, 73 bytes in your example).
Then, with your preferred language which can handle non-delimited-record (fixed length) binary data, and working out what COMP-3, COMP-6, and any other COMP- (or COMP) fields mean. They will likely be packed-decimal, Binary Coded Decimal (BCD) or "some type of binary" given that Standard COBOL has binary fields limited by decimal values (to the size of the PICture clause).
In the first and second alternatives, there is a greater expectation of the reliability of extract. The third may be the "cheapest", but expectations of the time expended to complete are more difficult to stick to.
Of the first two, cost is the likely determinant (assuming you are not going to use COBOL going forward). If you yourself have to write some COBOL programs, don't worry about that, they are very, very simple, and once you have done one, you simply "clone" it.

I'm not sure which system you are using. As my experience in AS400. COBOL data file using EBCDIC format, it cannot be open directly from a text editor. It will only show random texts. You have to convert it in to ASCII before you export. In AS400, I use CHGTOPCD file/member name to a directory and export it out. Then it will show correct texts. Not sure is this information helps you.

Related

COBOL: Can a GDG file descriptor (FD) reference multiple generations?

I have a program which reads a GDG file and moves data to working storage. I am interested to know if it can be made to repeat this process for multiple generations of the GDG using a reference to the file definition. Perhaps there is a way to use subscripts on the file definition? My thought is there must be a method to move different file definitions into a reference variable from which to access the files.

Code Sample based on suggested, setenv solution
FILE-CONTROL.
SELECT DATAIN ASSIGN TO UT-S-DATAIN.
DATA DIVISION.
FILE-SECTION.
FD DATAIN
BLOCK CONTAINS 0 RECORDS
RECORD CONTAINS 133 CHARACTERS
LABEL RECORDS ARE STANDARD
DATA RECORD IS DATA-REC.
01 DATA-REC PIC X(133).
WORKING-STORAGE SECTION.
01 ENV-VARS.
02 ENV-NAME PIC X(9).
02 ENV-VALUE PIC X(100).
02 ENV-OVERWRITE PIC S9(8) COMPUTATIONAL VALUE 1.
PROCEDURE DIVISION.
MOVE Z"DATAIN" TO ENV-NAME
MOVE Z"DSN(PROGRAMMER.TEST.GDGFILE(-1)),SHR" TO ENV-VALUE
MOVE 1 TO ENV-OVERWRITE
CALL "setenv" USING ENV-NAME ENV-VALUE ENV-OVERWRITE.
Notes
Pay special attention when moving DSN value to ENV-VALUE. On my first swing I left out the closing parentheses, most likely because of JCL muscle memory.
Be sure to empty out your DD statement in JCL/Step.

In mainframe COBOL, the FD refers to a SELECT which refers to a DD statement attached to the EXEC PGM statement for your program in the invoking JCL. The DD statement may refer to one or many GDGs. This is determined at compile time.
What I think you are asking for is dynamic allocation of a file at runtime. There are a couple of ways to accomplish that, one is BPXWDYN.
Identification Division.
Program-ID. SOMETEST.
Environment Division.
Input-Output Section.
File-Control.
Select MY-FILE Assign SYSUT1A.
Data Division.
File Section.
FD MY-FILE
Record 80
Block 0
Recording F.
01 MY-FILE-REC PIC X(080).
Working-Storage Section.
01 CONSTANTS.
05 BPXWDYN-PGM PIC X(008) VALUE 'BPXWDYN '.
05 ALCT-LIT-PROC PIC X(035)
VALUE 'ALLOC FI(SYSUT1A) SHR MSG(WTP) DSN('.
05 FREE-LIT-PROC PIC X(016)
VALUE 'FREE FI(SYSUT1A)'.
05 A-QUOTE PIC X(001) VALUE "'".
01 WORK-AREAS.
05 WS-DSN PIC X(044) VALUE 'MY.GDG.BASE'.
05 WS-GDG-NB PIC 999 VALUE ZEROS.
05 BPXWDYN-PARM.
10 PIC S9(004) COMP-5 VALUE +100.
10 BPXWDYN-PARM-TXT PIC X(100).
Procedure Division.
* Construct the allocation string for BPXWDYN.
MOVE SPACES TO BPXWDYN-PARM-TXT
STRING
ALCT-LIT-PROC
DELIMITED SIZE
WS-DSN
DELIMITED SPACE
'(-'
DELIMITED SIZE
WS-GDG-NB
DELIMITED SIZE
')'
DELIMITED SIZE
INTO
BPXWDYN-PARM-TXT
END-STRING
CALL BPXWDYN-PGM USING
BPXWDYN-PARM
END-CALL
IF RETURN-CODE = 0
CONTINUE
ELSE
[error handling]
END-IF
[file I/O with MY-FILE]
MOVE SPACES TO BPXWDYN-PARM-TXT
MOVE FREE-LIT-PROC TO BPXWDYN-PARM-TXT
CALL BPXWDYN-PGM USING
BPXWDYN-PARM
END-CALL
IF RETURN-CODE = 0
CONTINUE
ELSE
[error handling]
END-IF
GOBACK.
This is just freehand, so there may be a syntax error, but I hope I've made the idea clear.
There is another technique, using the C RTL function setenv, documented by IBM here. It looks like it might be simpler but I've never done it that way.

Report with Report Writer duplicating last line

I find myself sorting an input file, and using a control break to compute some data. We need headers in the control break, the report writer is duplicating the header each time and I can not figure it out for the life of me. The write statement in the break paragraph is written twice, but if I use a DISPLAY it is only displayed once. Where am I going wrong with the Report Writer? The break itself is calculating the data correctly (but probably terribly)
environment division.
configuration section.
input-output section.
file-control.
SELECT corpranks
ASSIGN TO
"corpranks.txt"
ORGANIZATION IS LINE SEQUENTIAL.
SELECT out-file
ASSIGN TO
"report"
ORGANIZATION IS LINE SEQUENTIAL.
SELECT sortfile
ASSIGN TO
"SortFile".
data division.
file section.
FD corpranks
RECORD CONTAINS 80 CHARACTERS.
01 gf-rec.
05 first-initial PIC x.
05 middle-initial PIC x.
05 last-name PIC x(14).
05 rank-code PIC 9.
05 Filler PIC x(15).
05 rank PIC x(3).
05 salary PIC 9(6).
05 corporation PIC x(29) VALUE SPACE.
FD out-file
REPORT IS corp-report.
01 of-rec PIC x(80).
SD sortfile.
01 Sortrec.
05 PIC x(16).
05 SR-rank PIC xxx.
05 PIC x(22).
05 SR-corporation PIC x(29).
working-storage section.
77 EOF PIC x VALUE "N".
77 current-corp PIC x(29).
77 total-salary PIC 9(6) VALUE 0.
77 current-salary PIC 9(6).
77 converted-month PIC x(3).
77 concatenated-date PIC x(28).
77 formatted-date PIC x(80) JUSTIFIED RIGHT.
77 formatted-name PIC x(20).
77 tally-counter PIC 9.
77 inp-len PIC 9.
01 current-date.
05 YYYY PIC x(4).
05 MM PIC x(2).
05 DD PIC x(2).
01 corporation-header.
05 FILLER pic x(18) VALUE SPACES.
05 FILLER pic x(13) VALUE "Corporation: ".
05 ch-corp pic x(40).
01 corporation-subheader.
05 FILLER pic x(5) VALUE SPACES.
05 FILLER pic x(4) VALUE "RANK".
05 FILLER pic x(5) VALUE SPACES.
05 FILLER pic x(4) VALUE "NAME".
05 FILLER pic x(15) VALUE SPACES.
05 FILLER pic x(6) VALUE "SALARY".
77 csh-underline pic x(40) Value
"========================================".
01 main-header.
05 FILLER PIC x(5).
05 header-content PIC x(69) VALUE "Jacksonville Computer App
"lications Support Personnel Salaries".
report section.
RD corp-report.
01 REPORT-LINE
TYPE DETAIL
LINE PLUS 2.
05 COLUMN 6 PIC x(3) SOURCE rank.
05 COLUMN 12 PIC x(20) SOURCE formatted-name.
05 COLUMN 37 PIC 9(6) SOURCE salary.
procedure division.
0000-MAIN.
Sort Sortfile on ascending key SR-corporation
on ascending key SR-rank
Using corpranks
giving corpranks.
OPEN
INPUT corpranks
OUTPUT out-file
INITIATE corp-report.
WRITE of-rec FROM main-header.
ACCEPT current-date from DATE YYYYMMDD.
PERFORM 3000-CONVERT-MONTH.
STRING "As of: " DELIMITED BY SIZE
DD DELIMITED BY SIZE
SPACE
converted-month DELIMITED BY SIZE
SPACE
YYYY DELIMITED BY SIZE
INTO concatenated-date.
MOVE concatenated-date TO formatted-date.
WRITE of-rec FROM formatted-date.
PERFORM 2000-GENERATE-REPORT UNTIL EOF = 1.
TERMINATE corp-report.
stop run.
2000-GENERATE-REPORT.
PERFORM 3100-TRIM-FIELDS
GENERATE REPORT-LINE
READ corpranks
AT END
CLOSE corpranks
out-file
MOVE 1 TO eof
NOT AT END
IF current-corp = SPACE
MOVE corporation to current-corp
MOVE current-corp to ch-corp
WRITE of-rec FROM corporation-header
WRITE of-rec FROM corporation-subheader
WRITE of-rec FROM csh-underline
END-IF
IF current-corp NOT = corporation
PERFORM 2500-CONTROL-BREAK
END-IF
COMPUTE total-salary = total-salary + salary
MOVE corporation to current-corp
END-READ.
2500-CONTROL-BREAK.
WRITE of-rec FROM corporation
MOVE 0 to total-salary
.
3000-CONVERT-MONTH.
EVALUATE mm
WHEN "01" MOVE "JAN" TO converted-month
WHEN "02" MOVE "FEB" TO converted-month
WHEN "03" MOVE "MAR" TO converted-month
WHEN "04" MOVE "APR" TO converted-month
WHEN "05" MOVE "MAY" TO converted-month
WHEN "06" MOVE "JUN" TO converted-month
WHEN "07" MOVE "JUL" TO converted-month
WHEN "08" MOVE "AUG" TO converted-month
WHEN "09" MOVE "SEP" TO converted-month
WHEN "10" MOVE "OCT" TO converted-month
WHEN "11" MOVE "NOV" TO converted-month
WHEN "12" MOVE "DEC" TO converted-month
WHEN OTHER MOVE mm to converted-month
END-EVALUATE.
3100-TRIM-FIELDS.
INSPECT last-name TALLYING tally-counter FOR trailing
spaces.
COMPUTE inp-len = LENGTH OF last-name - tally-counter
MOVE last-name(1: inp-len) to formatted-name
STRING last-name(1: inp-len) DELIMITED BY SIZE
SPACE
first-initial DELIMITED BY SIZE
INTO formatted-name
MOVE 0 TO tally-counter
end program Program2.
Some report output: (at the beginning header, csh-underline is the last thing written, the === underline displays twice. At the corporation control breaks, the next corp name is the last thing written, and is written twice)
Jacksonville Computer Applications Support Personnel Salaries
As of: 18 FEB 2015
Corporation: Alltel Information Services
RANK NAME SALARY
========================================
========================================
EVP COLUMBUS C 100000
SVP ADAMS S 042500
VP REAGAN R 081000
VP FRANKLIN B 080000
A&P FORD G 060000
A&P HAYES R 050000
A&P JACKSON A 057600
A&P TYLER J 069000
A&P HARRISON B 052000
A&P TAFT W 070500
A&P HOOVER H 035000
A&P PIERCE F 044000
American Express
American Express
EVP JOHNSON L 098000
SVP CLINTON W 086000
VP ROOSEVELT F 072000
A&P HARDING W 040000
....

Here's a link to some Report Writer documentation from Micro Focus. It is not the only documentation they provide, but it is all that I have scanned through: http://documentation.microfocus.com/help/index.jsp?topic=%2Fcom.microfocus.eclipse.infocenter.studee60win%2FGUID-48E4E734-F1A4-41C4-BA30-38993C8FE100.html
If you loot at Report File under Enterprise > Micro Focus Studio Enterprise Edition 6.0 > General Reference > COBOL Language Reference > Part 3. Additional Topics > Report Writer you will see this:
Report File
A report file is an output file having sequential organization. A
report file has a file description entry containing a REPORT clause.
The content of a report file consists of records that are written
under control of the RWCS.
A report file is named by a file control entry and is described by a
file description entry containing a REPORT clause. A report file is
referred to and accessed by the OPEN, GENERATE, INITIATE, SUPPRESS,
TERMINATE, USE BEFORE REPORTING, and CLOSE statements.
Although this does not definitively say "Don't use your own WRITE statements and hope that they will work" I think it is clear that you should not. What happens when you do that is not defined, or is "undefined behaviour".
You are getting repeated lines before a break, and after a break, exactly where the Report Writer will be checking if there is anything it needs to do. Although I know nothing at all about the implementation of the Report Writer in Micro Focus COBOL, I am pretty certain that you have correctly identified that the repetition happens and is beyond your control. I think the above quote confirms that, and within other parts of Micro Focus's documentation this may be made more explicit.
You either need to use the Report Writer fully (if the task is to use the Report Writer) or not use it at all. You can't mix automatic and manual on the same report file, it seems, and that makes sense to me.
Remember, it does not matter that some of your WRITE statements seem to work, because this is a computer and you need them all to work.
Some general comments on your program:
In main-header you have a FILLER without a VALUE clause, which can cause problems when written to a file for printing. Whether that is way those five bytes don't show on your output or whether it is due to formatting in the posting here, I don't know.
Also in main-header you have a long literal, continued onto a second line. I can't see the continuation marker, and that may be a feature of how it is done in that Micro Focus COBOL, but it always makes things easier if literals are not continued. Define two smaller fields one after the other, with smaller literals which taken together make up the whole.
You have this:
COMPUTE total-salary = total-salary + salary
This, however, is considered clearer:
ADD salary TO total-salary
You are using STRING. You should be aware that the data-transfer from the sending fields ceases when the receiving field is filled, or when all the sending fields have been processed. In the latter case, automatic space-padding is not carried out, unlike the behaviour of a MOVE statement. You need to set your receiving field to an initial value before the STRING is executed, else you will retain data from the previous execution of STRING when the current execution of STRING has less actual data.
After the STRING you do this:
MOVE 0 TO tally-counter
This means your INSPECT, several statements earlier, but where tally-counter is used, is relying on a previous value for tally-counter for the code after that to work. This is not good practice. Make tally-counter an initial value before it is used in the INSPECT.
If you go with the Report Writer your PROCEDURE DIVISION code will be significantly reduced, because the definition of the report elements defines the automatic processing.
The Report Write feature of COBOL is very powerful. It allows you to define a complex report in the REPORT SECTION of a COBOL program, with headings, column headings, detail lines, control-break totals etc. In the PROCEDURE DIVISION you only need as little as make the source-data available (say with a READ) and then GENERATE the report, and COBOL does the rest for you.
However, you have defined a very simple report, and are attempting to do headings, totals etc yourself. I have never done this, and don't know if it works in general, or if it works for your compiler.
From your testing, it seems like there may be a problem with doing this, and it may be, erroneously, repeating the line you yourself have written. You need to check that that particular line is not output elsewhere in your program.
We need to see the outstanding answers to questions from comments, and, unless it is an excessive size, your entire program.
If your exercise is specifically to use the Report Writer, then I think you need to define a more "complex" report, which will produce, automatically from the definition, everything that you want.
If you do not have to use the Report Writer for this exercise, don't use it, just do the detail-line formatting yourself and WRITE it as you are already doing for headings and totals.
On the assumption (later proved false) that you were using the Report Writer to do everything you need, the problem would have been manually writing to the same output file that the Report Writer was using.
If using the full features of the Report Writer, simply make this change and remove any other WRITEs to that output file, and use the Report Writer features for everything:
2500-CONTROL-BREAK.
MOVE 0 to total-salary
.

Read a file with three records, only one of expected records output

IDENTIFICATION DIVISION.
PROGRAM-ID. PROGRAM1.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT EMP-GRADE ASSIGN TO 'input.txt'
ORGANIZATION IS SEQUENTIAL
ACCESS MODE IS SEQUENTIAL
FILE STATUS IS WS-STATUS.
DATA DIVISION.
FILE SECTION.
FD EMP-GRADE.
01 NEWFILE.
05 FS-EMPID PIC 9(5).
05 FS-NAME PIC A(5).
05 FS-STREAM PIC X(5).
05 FS-GRADE PIC A(1).
05 FILLER PIC X(64).
WORKING-STORAGE SECTION.
01 WS-EOF PIC A(1) VALUE "N".
01 WS-STATUS PIC X(2).
PROCEDURE DIVISION.
MAIN-PARA.
OPEN INPUT EMP-GRADE.
PERFORM PARA1 THRU PARA1-EXIT UNTIL WS-EOF="Y".
CLOSE EMP-GRADE.
STOP RUN.
MAIN-PARA-EXIT.
EXIT.
PARA1.
READ EMP-GRADE
AT END MOVE "Y" TO WS-EOF
NOT AT END
IF FS-GRADE='A'
DISPLAY FS-EMPID , FS-NAME , FS-STREAM , FS-GRADE
END-IF
END-READ.
PARA1-EXIT.
EXIT.
input provided:
1234 sita comp A
2345 tina main B
5689 riya math A
but the output is coming :
1234 sita comp A
It is reading only the first record.

As Brian Tiffin is hinting at in the comments, it is your data which is the problem.
This:
05 FILLER PIC X(64).
Means that your records should be 64 bytes longer than they are.
If you have a fixed-length record, or only fixed-length records, under an FD, then the data all have to be the same length, and equal to what you have defined in your program.
It means, and behaviour depends on compiler, you only have one record as far as the COBOL program is concerned.
A good way to spot such things is to always count your input records, and count your output records, and records which should not be selected for output. You can then easily tell if anything has fallen between a crack.
Leaving that aside, here's your program with some adjustments:
IDENTIFICATION DIVISION.
PROGRAM-ID. PROGRAM1.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT EMP-GRADE ASSIGN TO 'input.txt'
ORGANIZATION IS SEQUENTIAL
ACCESS MODE IS SEQUENTIAL
FILE STATUS IS WS-STUDENT-GRADE-STATUS.
DATA DIVISION.
FILE SECTION.
FD EMP-GRADE.
01 NEWFILE.
05 FS-EMPID PIC 9(5).
05 FS-NAME PIC X(5).
05 FS-STREAM PIC X(5).
05 FS-GRADE PIC X(1).
05 FILLER PIC X(64).
WORKING-STORAGE SECTION.
01 WS-STUDENT-GRADE-STATUS PIC X(2).
88 END-OF-STUDENT-GRADE-FILE VALUE "10".
88 ERROR-ON-STUDENT-GRADE-FILE VALUE ZERO.
PROCEDURE DIVISION.
OPEN INPUT EMP-GRADE
* perform a paragraph to check FILE STATUS field is zero, using an 88.
PERFORM PRIMING-READ
PERFORM PROCESS-STUDENT-GRADE-FILE
UNTIL END-OF-STUDENT-GRADE-FILE
CLOSE EMP-GRADE
* perform a paragraph to check FILE STATUS field is zero, using an 88.
GOBACK
.
PRIMING-READ.
PERFORM READ-STUDENT-GRADE
.
READ-STUDENT-GRADE.
READ EMP-GRADE
* perform a paragraph to check FILE STATUS field is zero, using an 88.
.
PROCESS-STUDENT-GRADE-FILE.
IF FS-GRADE='A'
* To see the problem with your data, DISPLAY the 01-level
DISPLAY NEWFILE
DISPLAY FS-EMPID
FS-NAME
FS-STREAM FS-GRADE
END-IF
PERFORM READ-STUDENT-GRADE
.
If you use the FILE STATUS field, you should check it. Since you use it, you can use it to check for end-of-file without the AT END. If you use a "priming read" you don't need the AT END/NOT AT END tangle. If you code the minimum of full-stops/periods in the PROCEDURE DIVISION, you won't have a problem with them. Commas are never needed, so don't use them. Format your program for human readability. Use good descriptive names for everything. The THRU on a PERFORM invites the use of GO TO. As a learning, avoid the invitation.
If your class itself enforces particular ways to code COBOL, you'll have to use those ways. If so, I'd suggest you do both. The first couple of times at least, submit both to your tutor. Even if they tell you not to do that, continue doing dual examples when given tasks (just don't submit them any more). There is no reason for you to start off with bad habits.
Keep everything simple. If your code looks bad, make it look good through simplification and formatting.
Remember also that COBOL is all about fixed-length stuff. Get's us back to your original problem.

output a report with sub-headers in COBOL

the format of input file is like this:
Region ******* Company Name
A
B
C
A
C
with many lines.
I need to get a output file to rearrange the file with headers like this:
Company in Region A:
name
name
name...
Company in Region B:
name
name
name..
Company in Region C:
name
name
name..
My question is because the region in input file is not ordered. How can I add the second Region A company back in the header "Company in Region A"? I can only read the file one time (I cannot first all do lines with region A then reopen the file to read again). And I can only have 1 output file.

You could use the Sort verb with input/output procedure to sort the file into Region Sequence.
You can find many examples in Google. This ShorExample has a short Sort example,
there is more info here
You will probably need both input and output procedures
Sort Example:
PROCEDURE DIVISION.
000-SORT SECTION.
010-DO-THE-SORT.
SORT SORT-FILE ON ASCENDING KEY SORT-KEY-1
ON DESCENDING KEY SORT-KEY-2
USING INPUT-FILE
OUTPUT PROCEDURE IS 200-WRITE-OUTPUT
THRU 230-DONE-OUTPUT.
DISPLAY "END OF SORT".
STOP RUN.
200-WRITE-OUTPUT SECTION.
210-OPEN-OUTPUT.
OPEN OUTPUT OUTPUT-FILE.
220-GET-SORTED-RECORDS.
RETURN SORT-FILE AT END
CLOSE OUTPUT-FILE
GO TO 230-DONE-OUTPUT.
MOVE SORT-RECORD TO OUTPUT-RECORD.
WRITE OUTPUT-RECORD.
GO TO 220-GET-SORTED-RECORDS.
230-DONE-OUTPUT SECTION.
240-EXIT-OUTPUT.
EXIT.

If you are doing homework, remember that the goal of homework is not to solve a problem, it is to demonstrate that you have learned the class material. So, if this is homework, create your own solution based on what you have been taught. If you are trying to solve a real-life problem, the example below might help point you in the right direction. If you have not been taught sorting using an output procedure, you do not want to use the example below to do your homework. That said, the following program works using the sample data shown with GNUCobol. Notice in particular how paragraph OUTPUT-CO-BY-REGION-REPORT is used by the SORT.
--- contents of sample data file COMPANY.DAT ---
A WAL-MART
B EXXON
C CHEVRON
B BERKSHIRE
A APPLE
C GENERAL MOTORS
IDENTIFICATION DIVISION.
PROGRAM-ID. COMPANY-BY-REGION.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT COMPANY-FILE
ASSIGN TO 'COMPANY.DAT'
ORGANIZATION IS LINE SEQUENTIAL.
SELECT COMPANY-SORT-FILE
ASSIGN TO DISK.
SELECT REGION-REPORT-FILE
ASSIGN TO 'COMPANY-BY-REGION.RPT'
ORGANIZATION IS LINE SEQUENTIAL.
DATA DIVISION.
FILE SECTION.
FD COMPANY-FILE.
01 COMPANY-RECORD.
02 COM-REGION PIC X.
02 FILLER PIC X.
02 COM-NAME PIC X(20).
SD COMPANY-SORT-FILE.
01 COMPANY-SORT-RECORD.
02 SORT-REGION PIC X.
02 FILLER PIC X.
02 SORT-NAME PIC X(20).
FD REGION-REPORT-FILE.
01 REGION-REPORT-RECORD PIC X(20).
WORKING-STORAGE SECTION.
01 SORTED-DATA-REMAINS PIC X VALUE 'Y'.
88 NO-SORTED-DATA-REMAINS VALUE 'N'.
01 WS-SORTED-RECORD.
02 WS-REGION PIC X.
02 FILLER PIC X.
02 WS-NAME PIC X(20).
01 REGION-REPORT-HEADER.
02 FILLER PIC X(18) VALUE 'COMPANY IN REGION '.
02 HEAD-REGION PIC X VALUE SPACE.
02 FILLER PIC X VALUE ':'.
01 REGION-DETAIL.
02 DET-NAME PIC X(20).
01 PRIOR-REGION PIC X.
PROCEDURE DIVISION.
WRITE-COMPANY-BY-REGION-REPORT.
SORT COMPANY-SORT-FILE
ASCENDING KEY SORT-REGION
SORT-NAME
USING COMPANY-FILE
OUTPUT PROCEDURE OUTPUT-CO-BY-REGION-REPORT
STOP RUN
.
OUTPUT-CO-BY-REGION-REPORT.
OPEN OUTPUT REGION-REPORT-FILE
PERFORM UNTIL NO-SORTED-DATA-REMAINS
RETURN COMPANY-SORT-FILE INTO WS-SORTED-RECORD
AT END SET NO-SORTED-DATA-REMAINS TO TRUE
NOT AT END
PERFORM WRITE-COMPANY-RECORD
END-PERFORM
CLOSE REGION-REPORT-FILE
.
WRITE-COMPANY-RECORD.
IF HEAD-REGION = SPACE
OR WS-REGION NOT = PRIOR-REGION
PERFORM PRINT-HEADER
END-IF
MOVE WS-NAME TO DET-NAME
WRITE REGION-REPORT-RECORD FROM REGION-DETAIL
.
PRINT-HEADER.
IF HEAD-REGION NOT = SPACE
MOVE SPACES TO REGION-REPORT-RECORD
WRITE REGION-REPORT-RECORD
END-IF
MOVE WS-REGION TO HEAD-REGION
PRIOR-REGION
WRITE REGION-REPORT-RECORD FROM REGION-REPORT-HEADER
.

You should do this in a single pass of the file. Instead of just checking one item at a time, check all of them in the single pass...

Assigning a value to a PIC clause less than specified length

I'm a beginner to COBOL, and i'm wondering what would happen if i did something like the following:
(I know that the below code isnt runnable cobol, its just there for example)
foo pic x(5)
accept foo
and the user types in a string that is only 3 characters long (e.g. yes)
would the value of foo be just "yes"? or would it fill the all 5 characters as specified at creation (for example: "(space)(space)yes" or "yes(space)(space)", or is it something else?
Thanks!
here is my code
000100 IDENTIFICATION DIVISION.
000200 *--------------------
000300 PROGRAM-ID. ZIPCODES.
000400 *--------------------
000500 ENVIRONMENT DIVISION.
000600 *--------------------
000700 CONFIGURATION SECTION.
000800 INPUT-OUTPUT SECTION.
000900 FILE-CONTROL.
001000 SELECT PRT ASSIGN TO UT-S-PRTAREA.
001100
001200 DATA DIVISION.
001300 *-------------
001400 FILE SECTION.
001500 FD PRT
001600 RECORD CONTAINS 80 CHARACTERS
001700 DATA RECORD IS LINE-PRT.
001800 01 LINE-PRT PIC X(80).
001900
002000 WORKING-STORAGE SECTION.
002100 *-----------------------
002200 EXEC SQL INCLUDE SQLCA END-EXEC.
002300
002310 01 done.
002320 02 donevar PIC x(5) VALUE 'done '.
002400 01 ZIP-RECORD.
002500 02 ZIP PIC X(5).
002600 02 ZCITY PIC X(20).
002700 02 ZSTATE PIC X(2).
002800 02 ZLOCATION PIC X(35).
002900
003000 01 H1.
003100 02 COLUMN-1 PIC X(8) VALUE 'Zip-Code'.
003200 02 FILLER PIC X(2).
003300 02 COLUMN-2 PIC X(5) VALUE 'State'.
003400 02 FILLER PIC X(2).
003500 02 COLUMN-3 PIC X(4) VALUE 'City'.
003600 02 FILLER PIC X(16).
003700 02 COLUMN-4 PIC X(14) VALUE 'Location Text'.
003800 02 FILLER PIC X(29).
003900
004000 01 L1.
004100 02 ZIP-L1 PIC X(5).
004200 02 FILLER PIC X(5).
004300 02 STATE-L1 PIC X(2).
004400 02 FILLER PIC X(5).
004500 02 CITY-L1 PIC X(20).
004600 02 LOCTXT-L1 PIC X(35).
004700 02 FILLER PIC X(28).
004800
004900 PROCEDURE DIVISION.
005000 *------------------
005100 BEGIN.
005200 OPEN OUTPUT PRT.
005220 PERFORM ZIP-LOOKUP UNTIL ZIP = done.
005600 PROG-END.
005700 CLOSE PRT.
005800 GOBACK.
005900 *****************************************************
006000 * zip code lookup *
006100 *****************************************************
006200 ZIP-LOOKUP.
006300 DISPLAY 'enter 5 digit zip code'
006400 ACCEPT ZIP
006500 EXEC SQL
006600 SELECT * INTO :ZIP-RECORD FROM ZBANK.ZIPCODE
006700 WHERE ZIP = :ZIP
006800 END-EXEC.
006801 PERFORM PRINT-H1.
006802 PERFORM PRINT-L1.
006900 PRINT-H1.
007000 MOVE H1 TO LINE-PRT
007100 WRITE LINE-PRT.
007200 PRINT-L1.
007300 MOVE ZIP TO ZIP-L1
007400 MOVE ZSTATE TO STATE-L1
007500 MOVE ZCITY TO CITY-L1
007510 STRING ZSTATE DELIMITED BY " ",", ",
007520 ZCITY DELIMITED BY SIZE INTO LOCTXT-L1
007700 MOVE L1 TO LINE-PRT
007800 WRITE LINE-PRT.
I'm trying to write the zstate before the zcity, and having it keep asking for ZIP codes as long as the input isnt 'done'

The first 5 characters entered will be moved to FOO. If fewer than 5 characters are entered then they will be placed in the left hand positions of FOO and the remaining characters (to the right) will be filled with spaces. If the user enters more than 5 charcters then only the first 5 are moved.
So to use your example if the user typed "yes" then FOO would contain "yesbb"
Best thing to do is try it!
Edit in response to updated question...
I think your problem is that the condition needed to terminate the loop is set in the beginning of the loop body and
not at the end. Here are a couple of commonly used techniques to solve this problem:
Pre loop read
DISPLAY 'Enter a 5 digit zip code'
ACCEPT ZIP
PERFORM ZIP-LOOKUP UNTIL ZIP = done.
...
...
ZIP-LOOKUP.
EXEC SQL
SELECT * INTO :ZIP-RECORD FROM ZBANK.ZIPCODE
WHERE ZIP = :ZIP
END-EXEC.
PERFORM PRINT-H1.
PERFORM PRINT-L1.
* Now get next zip code or 'done'
DISPLAY 'Enter a 5 digit zip code'
ACCEPT ZIP
.
Guard against setting terminating condition within the loop
PERFORM ZIP-LOOKUP UNTIL ZIP = done.
...
...
ZIP-LOOKUP.
DISPLAY 'Enter a 5 digit zip code'
ACCEPT ZIP
IF ZIP NOT = DONE
EXEC SQL
SELECT * INTO :ZIP-RECORD FROM ZBANK.ZIPCODE
WHERE ZIP = :ZIP
END-EXEC
PERFORM PRINT-H1
PERFORM PRINT-L1
END-IF
.
Either one of the above should solve your problem. However, I would suggest trying to update your coding style to include
COBOL-85 constructs. The first example above might be coded as follows:
DISPLAY 'Enter a 5 digit zip code'
ACCEPT ZIP
PERFORM UNTIL ZIP = done
EXEC SQL
SELECT * INTO :ZIP-RECORD FROM ZBANK.ZIPCODE
WHERE ZIP = :ZIP
END-EXEC
PERFORM PRINT-H1
PERFORM PRINT-L1
DISPLAY 'Enter a 5 digit zip code'
ACCEPT ZIP
END-PERFORM
.
The ZIP-LOOKUP paragraph has been in-lined into the PERFORM statement. For short sections of code I find this style much more
readable.
Also notice single sentence paragraphs (only one period at the end of a paragraph). When COBOL-85 scope terminators are used (eg. END-xxx)
the need for mulitple sentences per paragraph goes away - and in fact - they should be avoided.
Another COBOL construct that you could make use of here is the 88 LEVEL. You could use it as follows:
01 ZIP-RECORD.
02 ZIP PIC X(5).
88 DONE VALUE 'done '.
...
...
You no longer need donevar at all. Replace your original test:
IF ZIP = DONE
with:
IF DONE
The above will be true whenever the variable ZIP contains the value "donebb". One advantage of
doing this (other than saving one variable declaration) is that a single 88 LEVEL name can be assigned
several values, as in:
01 ZIP-RECORD.
02 ZIP PIC X(5).
88 DONE VALUE 'done ',
'quit ',
'stop '.
When the user enters any one of done, quit or stop the 88 level name DONE evaluates to true.
Finally, I presume this is just a skeleton of the program and that the finished version will be checking for I/O errors, bad SQL codes
and do basic ZIP code validation. If not, you can expect a lot of trouble down the road.
COBOL Reference material
Unfortunately there are very few good up to date resources for learning COBOL. However, one of the
books I would recommend is Advanced Cobol 3rd Edition by DeWard Brown.
This book provides many examples and explanations regarding COBOL program development. It also identifies whether a
construct is rarely used, obsolete or essential. This is good to know since you should be developing new code using modern COBOL
programming techniques (I continue to see a lot of new COBOL developed using pre-COBOL 85 coding practice - and it is horrible).
An open source
guide is the OpenCOBOL Programmers Guide. This targets OpenCOBOL but
much of it is applicable to any flavour of COBOL.
Finally, there are several vendors guides and manuals, many of which are available on the internet. For
example Enterprise COBOL for z/OS Language Reference and
Enterprise COBOL for z/OS Programming guide are
freely available. Microfocus COBOL
guides are also available. Search any you will find...

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart