I'm completely new to COBOL, and I'm wondering:
There seems to be no difference between
DISPLAY "foo"
and
DISPLAY "foo".
What does the dot at the end of a line actually do?
When should I use/avoid it?
The period ends the "sentence." It can have an effect on your logic. Consider...
IF A = B
    PERFORM 100-DO
    SET I-AM-DONE TO TRUE.
...and...
IF A = B
    PERFORM 100-DO.
SET I-AM-DONE TO TRUE
The period ends the IF in both examples. In the first, the I-AM-DONE 88-level is set conditionally; in the second, it is set unconditionally, because the period after PERFORM 100-DO has already closed the IF.
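To see the difference for yourself, here is a minimal sketch that runs both cases in one program. The names WS-A, WS-B and DONE-FLAG are made up for illustration, and the PERFORM 100-DO is swapped for a DISPLAY so the program stands alone. It assumes fixed-format source (code starting in column 12).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PERIODS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical test data: the two values are NOT equal.
       01  WS-A              PIC 9 VALUE 1.
       01  WS-B              PIC 9 VALUE 2.
       01  DONE-FLAG         PIC X VALUE 'N'.
           88  I-AM-DONE     VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Case 1: the period comes after the SET, so the SET is
      * part of the IF and is skipped when the test is false.
           IF WS-A = WS-B
               DISPLAY "case 1: inside the IF"
               SET I-AM-DONE TO TRUE.
           DISPLAY "case 1 flag: " DONE-FLAG
           MOVE 'N' TO DONE-FLAG
      * Case 2: the period after the DISPLAY ends the IF, so
      * the SET that follows always runs.
           IF WS-A = WS-B
               DISPLAY "case 2: inside the IF".
           SET I-AM-DONE TO TRUE
           DISPLAY "case 2 flag: " DONE-FLAG
           STOP RUN
           .
```

With WS-A and WS-B different, the flag should still be N after the first IF and Y after the second.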
Many people prefer to use explicit scope terminators and use only a single period, often on a physical line by itself to make it stand out, to end a paragraph.
I'm typing this from memory, so if anyone has corrections, I'd appreciate it.
Cobol 1968 required the use of a period to end a division name, procedure division paragraph name, or procedure division paragraph. Each data division element ended with a period.
There were no explicit scope terminators in Cobol 68, like END-IF. A period was also used to end scope. Cobol 1974 brought about some changes that didn't have anything to do with periods.
Rather than try to remember the rules for periods, Cobol programmers tended to end every sentence in a Cobol program with a period.
With the introduction of scope terminators in Cobol 1985, Cobol coders could eliminate most of the periods within a procedure division paragraph. The only periods required in the procedure division of a Cobol 85 program are those that terminate the PROCEDURE DIVISION header, any code before the first paragraph / section header, each paragraph / section header, each paragraph / section, and the program itself (if it has no paragraphs / sections).
Unfortunately, this freaked out the Cobol programmers that coded to the Cobol 68 and 74 standard. To this day, many Cobol shops enforce a coding rule about ending every procedure division sentence with a period.
Where to use it:
There are two ways to use the period.
You can put a period after every statement in a SECTION.
EXAMPLE:
0000-EXAMPLE SECTION.
MOVE 0 TO WK-I.
PERFORM UNTIL WK-I GREATER THAN 100
DISPLAY WK-I
ADD 1 TO WK-I
END-PERFORM.
DISPLAY WK-I.
IF WK-I EQUAL ZEROS
DISPLAY WK-I
END-IF.
0000-EXAMPLE-END. EXIT.
Note that we put a period after every statement, except the ones nested inside a PERFORM, IF, and so on.
The other way is to use only one period, at the end of the SECTION, like this:
0000-EXAMPLE SECTION.
MOVE 0 TO WK-I
PERFORM UNTIL WK-I GREATER THAN 100
DISPLAY WK-I
ADD 1 TO WK-I
END-PERFORM
DISPLAY WK-I
IF WK-I EQUAL ZEROS
DISPLAY WK-I
END-IF
. <======== point here!!!!!!! only HERE!
0000-EXAMPLE-END. EXIT.
But we always need a period after EXIT and after the SECTION header.
When it is my choice, I use full-stop/period only where necessary. However, local standards often dictate otherwise: so be it.
The problems caused by full-stops/periods are in the accidental making of something unconditional when code "with" is copied into code "without" whilst coder's brain is left safely in the carpark.
One extra thing to watch for is (hopefully) "old" programs which use NEXT SENTENCE in IBM Mainframe Cobol. "NEXT SENTENCE" means "after the next full-stop/period" which, in "sparse full-stop/period" code is the end of the paragraph/section. Accident waiting to happen. Get a spec-change to allow "NEXT SENTENCE" to be changed to "CONTINUE".
I just tested this in my COBOL 85 program by removing every period inside the paragraphs except the final one in each, and it worked fine.
example:
PROCEDURE DIVISION.
MAIN-PROCESS.
READ DISK-IN
AT END
DISPLAY "NO RECORDS ON INPUT FILE"
STOP RUN
END-READ
ADD 1 TO READ-COUNT.
PERFORM PROCESS-1 UNTIL END-OF-FILE.
WRITE-HEADER.
MOVE HEADER-INJ-1 TO HEADER-OUT-1
WRITE HEADER-OUT-1.
CLOSE-FILES.
CLOSE DISK-IN
CLOSE DISK-OUT
DISPLAY "READ: " READ-COUNT
DISPLAY "WRITTEN: " WRITE-COUNT
SORT SORT-FILE ON ASCENDING SER-S
USING DISK-OUT
GIVING DISK-OUT
STOP RUN.
1) Read a line of 2000 characters and replace each run of SPACES with a single "+" plus character, i.e. convert "A B" to "A+B", and "A     B" (several spaces) to "A+B" as well.
2) Read a line of 2000 characters, then search for specific patterns like "PWD" or "INI" etc. and store the next 6 characters in a variable.
3) Read a line of 2000 characters and store the last word in the string in a variable.
Edit:
I use Micro Focus COBOL.
My code is below. It removes a few spaces but not all. For test data, try writing any sentence with random numbers of spaces between the words in an input file.
IDENTIFICATION DIVISION.
PROGRAM-ID. SALAUT.
ENVIRONMENT DIVISION.
FILE-CONTROL.
SELECT IN-FILE ASSIGN TO "INFILE"
ORGANIZATION IS LINE SEQUENTIAL
FILE STATUS IS WS-IN-FILE-STATUS.
SELECT OUT-FILE ASSIGN TO "OUTFILE"
ORGANIZATION IS LINE SEQUENTIAL
FILE STATUS IS WS-OUT-FILE-STATUS.
DATA DIVISION.
FILE SECTION.
FD IN-FILE.
01 FS-IN-FILE PIC X(200).
FD OUT-FILE.
01 FS-OUT-FILE PIC X(200).
WORKING-STORAGE SECTION.
01 WS-ATMA-C.
03 WS-OUT-FILE-STATUS PIC X(02).
03 WS-IN-FILE-STATUS PIC X(02).
03 WS-LOOP-COUNTER PIC 9(03) VALUE 1.
03 WS-IN-EOF PIC X value 'N'.
03 WS-IN-FILE-LEN PIC 9(03).
03 WS-IN-SPACE-CNT PIC 9(03) VALUE 1.
03 FS-IN-FILE-2 PIC X(200).
03 WS-TRIL-SPACE-CNT PIC 9(03).
03 WS-TOT-SPACE-CNT PIC 9(03).
PROCEDURE DIVISION.
MAIN-PARA.
OPEN INPUT IN-FILE.
IF WS-IN-FILE-STATUS <> '00'
EXHIBIT 'IN-FILE-OPEN-ERROR : STOP-RUN'
EXHIBIT NAMED WS-IN-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
OPEN OUTPUT OUT-FILE.
IF WS-OUT-FILE-STATUS <> '00'
EXHIBIT 'OUT-FILE-OPEN-ERROR : STOP-RUN'
EXHIBIT NAMED WS-OUT-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
PERFORM SPACE-REMOVER-PARA THRU SPACE-REMOVER-PARA-EXIT.
CLOSE IN-FILE.
IF WS-IN-FILE-STATUS <> '00'
EXHIBIT 'IN-FILE-CLOSE-ERROR : STOP-RUN'
EXHIBIT NAMED WS-IN-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
CLOSE OUT-FILE.
IF WS-OUT-FILE-STATUS <> '00'
EXHIBIT 'OUT-FILE-CLOSE-ERROR : STOP-RUN'
EXHIBIT NAMED WS-OUT-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
MAIN-PARA-EXIT.
STOP RUN.
SPACE-REMOVER-PARA.
PERFORM UNTIL WS-IN-EOF = 'Y'
INITIALIZE FS-IN-FILE FS-OUT-FILE WS-IN-FILE-LEN FS-IN-FILE-2
READ IN-FILE
AT END
MOVE 'Y' TO WS-IN-EOF
NOT AT END
INSPECT FS-IN-FILE TALLYING WS-IN-FILE-LEN FOR CHARACTERS
EXHIBIT NAMED WS-IN-FILE-LEN
MOVE 1 TO WS-LOOP-COUNTER
IF WS-IN-FILE-LEN <> 0
PERFORM UNTIL WS-IN-SPACE-CNT <= ZEROS
INSPECT FS-IN-FILE TALLYING WS-TOT-SPACE-CNT FOR ALL " "
INSPECT FUNCTION REVERSE (FS-IN-FILE) TALLYING
WS-TRIL-SPACE-CNT FOR LEADING " "
INITIALIZE WS-IN-SPACE-CNT
COMPUTE WS-IN-SPACE-CNT =
WS-TOT-SPACE-CNT - WS-TRIL-SPACE-CNT
PERFORM VARYING WS-LOOP-COUNTER FROM 1 BY 1
UNTIL WS-LOOP-COUNTER >=
WS-IN-FILE-LEN - (2 * WS-TRIL-SPACE-CNT)
IF FS-IN-FILE(WS-LOOP-COUNTER:2) = " "
STRING FS-IN-FILE(1:WS-LOOP-COUNTER - 1) DELIMITED BY SIZE
FS-IN-FILE(WS-LOOP-COUNTER + 2
: WS-IN-FILE-LEN - WS-LOOP-COUNTER - 2)
DELIMITED BY SIZE
INTO FS-IN-FILE-2
END-STRING
INITIALIZE FS-IN-FILE
MOVE FS-IN-FILE-2 TO FS-IN-FILE
INITIALIZE FS-IN-FILE-2
END-IF
END-PERFORM
INITIALIZE WS-LOOP-COUNTER WS-TRIL-SPACE-CNT WS-TOT-SPACE-CNT
END-PERFORM
WRITE FS-OUT-FILE FROM FS-IN-FILE
IF WS-OUT-FILE-STATUS <> '00'
EXHIBIT 'OUT-FILE-WRITE-ERROR : STOP-RUN'
EXHIBIT NAMED WS-OUT-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF
END-IF
END-READ
END-PERFORM.
SPACE-REMOVER-PARA-EXIT.
EXIT.
As INSPECT REPLACING only lets you replace a string with one of the same number of bytes, you cannot use it here. As Brian pointed out, your COBOL runtime may come with options like GnuCOBOL's FUNCTION SUBSTITUTE. In any case, the question "Which COBOL?" is still worth answering.
To do Thraydor's approach, use UNSTRING into a table using a string pointer. Something along these lines:
MOVE 1 TO strpoint
PERFORM VARYING table-idx FROM 1 BY 1
UNTIL table-idx = table-max
UNSTRING your2000line DELIMITED BY ALL SPACES
INTO tmp-table (table-idx)
WITH POINTER strpoint
NOT ON OVERFLOW
EXIT PERFORM
END-UNSTRING
END-PERFORM
Another approach, which always works, is a simple PERFORM over the 2000 bytes with a bunch of IF your2000line (pos:1) statements (if possible, combine them into a single EVALUATE). Check byte by byte, comparing against the previous byte so you can drop the duplicates, transfer the source with replacements to a temporary field, and MOVE it back once you're finished.
Please edit your question to show what exactly you've tried and you can get much better answers.
Firstly, bear in mind that COBOL is a language of dialects. There are also active commercial compilers which target the 1974, 1985, 2002 (now obsolete, incorporated in 2014) and 2014 Standards. All have their own Language Extensions, which may or may not be honoured in a different COBOL compiler.
If you are targeting your learning to a particular environment (IBM Mainframe COBOL you have said) then use that dialect as a subset of what is available to you in the actual COBOL you are using. Which means using the IBM Manuals.
Don't pick and choose stuff from places and use it just because it somehow seemed like a good idea at the time.
I have to admit that EXHIBIT was great fun to use, but it was only ever a Language Extension, and IBM dropped it by at least the later releases of OS/VS COBOL. It, like ON, was a "debugging" statement, although that didn't prevent their being used "normally". There's additional overhead to using EXHIBIT over a simple DISPLAY. IBM Enterprise COBOL only has a simple DISPLAY.
Whilst you may think it fun to use pictograms such as <> (the "oh my goodness, what symbol should I use for this" of a figure attempting to pull his own hair out), be aware that that particular symbol was a latecomer to the 2014 Standard, and if it appears in Enterprise COBOL within the next 20 to 50 years I'd be surprised (very low on the list of things to do, another cute way to write "not equal to" when many already exist, and COBOL even has an ELSE).
Some pointers. Don't have a procedure called "remove-all-the-spaces" if what it does itself is "everything-including-install-a-new-kitchen-sink". Is it any wonder you can't find why it doesn't work?
Many, many, many COBOL programs have the task of reading a file, until the end, and processing the records in the file. Get yourself one of those working well first. Is that relevant to the "business process" the program is addressing? No, it's just technical stuff, which you can't do without, so hide it somewhere. Where? In PERFORMed procedures (paragraphs or SECTIONs). Don't expect someone who quickly wants to know what your program is doing to want to read the stuff which every program does. Hide it.
You can find quite a bit of general advice here about writing COBOL programs. Pay attention to those which advise of the use of full-stops/periods, priming reads, and the general structure of COBOL programs.
It is very important to describe things accurately. Work on good, descriptive, accurate names for data-names and procedures. A file is a collection of records.
You have cut down the size of your data to make testing easier, without realising that you have a problem with your data-definitions when you go back to full-length data. Your "counters" can only hold three digits, when they need to be able to cope with the numbers up to 2000.
There is no point in doing something to a piece of data, and then immediately squishing that something with something else which is not related in any way to the original something.
MOVE SPACE TO B
MOVE A TO B
The first MOVE is redundant, superfluous, and does nothing but suck up CPU time and confuse the next reader of your program. "Is there some code missing, because otherwise that's just plain dumb".
This is a variant of that example with the MOVE, and you are doing this all over the place:
INITIALIZE WS-IN-SPACE-CNT
COMPUTE WS-IN-SPACE-CNT =
WS-TOT-SPACE-CNT - WS-TRIL-SPACE-CNT
The INITIALIZE is a waste of space, resources, and an introducer of confusion, and extra lines of code to make your program more difficult to understand.
Also, don't "reset" things after they are used, so that they are "ready for next time". That creates dependencies which a future amender of your program will not expect. Even when expected/noticed, they make the code harder to follow.
Exactly what is wrong with your code is impossible to say without knowing what you think is wrong with it. For instance, there is not even a sign of a "+" replacing any spaces, so if you feel that is what is wrong, you simply haven't coded for it.
You've also only attempted one of the three tasks. If one of those not working is what you think is wrong...
Knowing what you think is wrong is one thing, but there are a lot of other problems. If you sit down and sort those out, methodically, then you'll come up with a "structured" COBOL program in which you'll find it easier to understand what your own code does, and where the problems lie.
A B C D E
A+B+C+D+E
To get from the first to the second using STRING, look into Simon's suggestion to use WITH POINTER.
Another approach you could take would be using reference-modification.
Either way, you'd build your result field a piece at a time:
This field intentionally blank
A
A+B
A+B+C
A+B+C+D
A+B+C+D+E
Rather than tossing all the data around each time. There are also other ways to code it, but that can be for later.
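Here is a minimal sketch of that piece-at-a-time idea, pairing UNSTRING ... WITH POINTER (to peel off one word per pass) with STRING ... WITH POINTER (to append it to the result). All data names are made up for illustration, and the line is cut down to 40 bytes to keep the example short; for the real 2000-byte line the fields and the loop limit would need to grow to match.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PLUSJOIN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names and sizes, for illustration only.
       01  WS-SOURCE         PIC X(40) VALUE "A  B C    D E".
       01  WS-RESULT         PIC X(40) VALUE SPACES.
       01  WS-WORD           PIC X(40).
       01  WS-SOURCE-PTR     PIC 9(4).
       01  WS-RESULT-PTR     PIC 9(4).
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE 1 TO WS-SOURCE-PTR
           MOVE 1 TO WS-RESULT-PTR
      * 40 is the length of WS-SOURCE.
           PERFORM UNTIL WS-SOURCE-PTR > 40
               MOVE SPACES TO WS-WORD
      * Take the next word; ALL SPACE treats a run of spaces
      * as one delimiter and moves the pointer past it.
               UNSTRING WS-SOURCE DELIMITED BY ALL SPACE
                   INTO WS-WORD
                   WITH POINTER WS-SOURCE-PTR
               END-UNSTRING
               IF WS-WORD NOT = SPACES
      * Every word except the first is preceded by a "+".
                   IF WS-RESULT-PTR > 1
                       STRING "+" DELIMITED BY SIZE
                           INTO WS-RESULT
                           WITH POINTER WS-RESULT-PTR
                       END-STRING
                   END-IF
                   STRING WS-WORD DELIMITED BY SPACE
                       INTO WS-RESULT
                       WITH POINTER WS-RESULT-PTR
                   END-STRING
               END-IF
           END-PERFORM
           DISPLAY WS-RESULT
           STOP RUN
           .
```

With the sample value above, WS-RESULT should end up holding A+B+C+D+E followed by trailing spaces. The two pointers do the bookkeeping, so nothing is shuffled back and forth between fields.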
I have a string such as ' #$rahul ' and I have to count the number of alphabetic characters in it without using the INSPECT verb, and without using the ORD function to get the ASCII value. My instructor told me to use an empty array, but how is it used? I tried, but it counts the symbols as well.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-TABLE.
05 WS-A OCCURS 3 TIMES INDEXED BY I.
10 WS-B PIC A(2).
10 WS-C OCCURS 2 TIMES INDEXED BY J.
15 WS-D PIC X(3).
PROCEDURE DIVISION.
MOVE '####DEF34GHIJKL56MNOPQR' TO WS-TABLE.
PERFORM A-PARA VARYING I FROM 1 BY 1 UNTIL I >3
STOP RUN.
A-PARA.
PERFORM C-PARA VARYING J FROM 1 BY 1 UNTIL J>2.
C-PARA.
if ws-table(1) equals to spaces
continue
else
add +1 to ws-count
end-if
DISPLAY WS-C(I,J).
Apart from your table-definition and actual use of the table, you have basically got the idea already, except you are not sure what, specifically, to test for.
What you need to do is find the section in your COBOL documentation on class condition and class tests.
I suspect this bit of code:
if ws-table(1) equals to spaces
continue
else
add +1 to ws-count
end-if
has been added in haste. With your data, ws-table(1) will never be space, and ws-count is not defined.
Back to your definition. You are defining a structure with three parts (WS-A OCCURS 3) each of which consists of a two-byte alphabetic field followed by two three-byte alphanumeric fields. That definition is of no direct use to your task.
01 the-data.
05 FILLER OCCURS 24 TIMES
INDEXED BY data-byte-index.
10 the-data-byte PIC X.
That will allow you to look at each byte individually. Note that you can always use good names, which will make your programs easier to understand, reduce the chance of careless errors, and make people's lives, including your own when you return to a program some time later, generally easier.
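As a rough sketch of where this leads, the program below walks that one-byte table and applies a class test to each byte. The counter name ALPHA-COUNT is made up for illustration. One thing to watch: the ALPHABETIC class also matches a space, so spaces have to be excluded explicitly.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ALPHACNT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  THE-DATA.
           05  FILLER OCCURS 24 TIMES
                      INDEXED BY DATA-BYTE-INDEX.
               10  THE-DATA-BYTE     PIC X.
      * Hypothetical counter, not part of the original code.
       01  ALPHA-COUNT               PIC 9(4) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * The MOVE pads the rest of the 24 bytes with spaces.
           MOVE ' #$rahul ' TO THE-DATA
           PERFORM VARYING DATA-BYTE-INDEX FROM 1 BY 1
                   UNTIL DATA-BYTE-INDEX > 24
      * Class test: letters only, with space ruled out.
               IF THE-DATA-BYTE (DATA-BYTE-INDEX) IS ALPHABETIC
                  AND THE-DATA-BYTE (DATA-BYTE-INDEX) NOT = SPACE
                   ADD 1 TO ALPHA-COUNT
               END-IF
           END-PERFORM
           DISPLAY "alphabetic characters: " ALPHA-COUNT
           STOP RUN
           .
```

For the string in the question this should count the five letters of "rahul" and ignore the "#", the "$" and the spaces.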
Note, you can also use reference-modification and lose out on the readability for the benefit of less typing.
Format of your program
Unless it is dictated to you (and although I've never seen it before in over 30 years, I have seen it a couple of times recently) there is absolutely no point in "indenting" things like the WORKING-STORAGE section, or even paragraph/SECTION labels. They already have all the indentation they need, and further indentation adds nothing, while requiring more typing, and also causing experienced COBOL programmers to wonder why you are doing that.
Since the 1985 Standard for COBOL, the use of full-stops/periods in the PROCEDURE DIVISION is greatly relaxed. Since a full-stop/period in the wrong place can cause errors, this was a good thing. It will also be good if you take full advantage of it. Commas look far too much like full-stops/periods to be of any use in code. They never have to be there, so having them benefits nothing. Also noise-words like THEN can/should be avoided. Unlike commas, spacing can be a boon to the format of a program.
Here's your code above, reformatted:
MOVE '####DEF34GHIJKL56MNOPQR'
TO WS-TABLE
PERFORM A-PARA
VARYING I
FROM 1
BY 1
UNTIL I > 3
STOP RUN
.
A-PARA.
PERFORM C-PARA
VARYING J
FROM 1
BY 1
UNTIL J > 2
.
C-PARA.
if ws-table ( 1 ) equal to space
continue
else
add +1 to ws-count
end-if
DISPLAY
WS-C ( I J )
.
Use some proper names, and it starts to look like a real program.
Note, not all people agree on how a program should be formatted. Seriously.
With this code, I get
16: Perform stmnt not terminated by end-perform
33: syntax error, unexpected end-perform
Why is it saying that I need an end-perform and also not need it?
identification division.
program-id. xxx.
* will accept and display a num until 0 is called then
* asks to go again
data division.
file section.
working-storage section.
01 num pic 9(4).
01 hold pic 9(4).
01 another pic x.
procedure division.
perform until another = 'N' (line 16)
Display "Another Session (Y/N)? "
with no advancing
if another = 'Y'
Display "Enter a 4-digit unsigned number (0 to stop): "
with no advancing
accept num
move num to hold
perform until num = 0
Display "Enter a 4-digit unsigned number (0 to stop): "
with no advancing
accept num
if num <> 0
move num to hold
end-perform.
display space
Display "The last number entered: "hold
End-perform. (Line 33)
stop run.
end-perform.
display space
Display "The last number entered: "hold
End-perform. (Line 33)
It's that full-stop/period (Line 30) which is the killer.
Although since the 1985 Standard COBOL is much more relaxed about full-stops/periods, a single one will bring all current scopes screaming to a halt. You could have nesting 50 levels deep, and one single full-stop/period would end them all, in one fell swoop.
My advice is to use the absolute minimum of full-stop/periods in the PROCEDURE DIVISION.
That is: one to terminate the PROCEDURE DIVISION header; one to terminate each paragraph/SECTION label; one to terminate a paragraph/SECTION; one to terminate a program (for a program with no paragraphs/SECTIONs). Also, if you have PROCEDURE DIVISION COPY or REPLACE statements, you'll need full-stops/periods to terminate those.
Except for the termination of the labels I put each full-stop/period on a line of its own, never attached to any code. I can then move code around and insert code without worrying about whether I need to add/remove a full-stop/period.
As to why you need END-PERFORM, it is an "inline PERFORM". Syntactically, an inline PERFORM requires an END-PERFORM, but your use of the full-stop/period caused termination of the PERFORM scope before the END-PERFORM was located, so the error on line 16. Subsequently an END-PERFORM unconnected to a PERFORM was located, so the error on line 33.
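For comparison, here is a sketch of how the program might look with every scope closed by its own terminator and a single full-stop/period at the end of the paragraph. It is a guess at the intent rather than a definitive fix: the WS- prefixed names, the paragraph name, the priming MOVE and the ACCEPT of the Y/N answer are my additions, since the original never reads a reply to its "Another Session" prompt.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SESSIONS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-NUM            PIC 9(4).
       01  WS-HOLD           PIC 9(4) VALUE ZERO.
       01  WS-ANOTHER        PIC X    VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM UNTIL WS-ANOTHER = 'N'
      * Any non-zero value gets the inner loop started.
               MOVE 1 TO WS-NUM
               PERFORM UNTIL WS-NUM = 0
                   DISPLAY "Enter a 4-digit unsigned number "
                           "(0 to stop): " WITH NO ADVANCING
                   ACCEPT WS-NUM
                   IF WS-NUM NOT = 0
                       MOVE WS-NUM TO WS-HOLD
                   END-IF
               END-PERFORM
               DISPLAY "The last number entered: " WS-HOLD
               DISPLAY "Another Session (Y/N)? "
                       WITH NO ADVANCING
               ACCEPT WS-ANOTHER
           END-PERFORM
           STOP RUN
           .
```

Each IF and PERFORM is closed by its own END-IF or END-PERFORM, so the one full-stop/period at the end has no open scopes left to cut short. Note that the outer loop only stops on an upper-case N.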
It is important when putting error messages in your questions that you include the error message exactly as you see it. Copy/paste, don't retype, please. Include any message numbers, as well.
You absolutely can not mix the full stop "." scope terminator from Cobol-74 with the End-* scope terminators from Cobol-85.
The difference is that the full stop "." terminates ALL scopes.
The End-* terminates only the most recent scope, just like you might expect.
Putting a "." in the middle of code with End-* is kinda like dropping a nuclear bomb in the middle of it. As a rule, for compilers made in the last quarter century or so, a period should only occur in the procedure division at the end of a paragraph name, or at the end of a paragraph (and sections too, but those are useless in an age where segmentation and overlays are managed by the operating system). I like to use "EXIT." or "CONTINUE." just to highlight that I'm using one of the bad-nasty-best-avoided-periods in the procedure division.
I have some csv record which are variable in length , for example:
0005464560,45667759,ZAMTR,!To ACC 12345678,DR,79.85
0006786565,34567899,ZAMTR,!To ACC 26575443,DR,1000
I need to separate each of these fields, and I need the last field, which should be a monetary amount.
However, when I read the file and unstring the record into fields, I found that the last field contains junk at the end. The amount (money) field should be 8 characters: 5 digits at the front, 1 dot, and 2 digits at the end. The values from the input could be anything, such as 13.5, 1000 and 354.23.
"FILE SECTION"
FD INPUT_FILE.
01 INPUT_REC PIC X(66).
"WORKING STORAGE SECTion"
01 WS_INPUT_REC PIC X(66).
01 WS_AMOUNT_NUM PIC 9(5).9(2).
01 WS_AMOUNT_TXT PIC X(8).
"MAIN SECTION"
UNSTRING INPUT_REC DELIMITED BY ","
INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT
MOVE WS_AMOUNT_TXT(1:8) TO WS_AMOUNT_NUM(1:8)
DISPLAY WS_AMOUNT_NUM
In the display the values look normal (345.23, 1000), just as they are in the input. However, after I write the field to a file, this is what they become:
79.85^M^#^#
137.35^M^#
I have inspected the field WS_AMOUNT_NUM, which came from the field WS_AMOUNT_TXT, and found that ^# is a kind of LOW-VALUE. However, I cannot find out what ^M is; it is not a space and not a HIGH-VALUE.
I am guessing, but it looks like you may be reading variable length records from a file into a fixed length COBOL record. The junk at the end of the COBOL record is giving you some grief. Hard to say how consistent that junk is going to be from one read to the next (data beyond the bounds of actual input record length are technically undefined). That junk ends up being included in WS_AMOUNT_TXT after the UNSTRING.
There are a number of ways to solve this problem. The suggestion I am giving you here may not
be optimal, but it is simple and should get the job done.
The last INTO field, WS_AMOUNT_TXT, in your UNSTRING statement is the one that receives all of the trailing
junk. That junk needs to be stripped off. Knowing that the only valid characters in the last field are
digits and the decimal character, you could clean it up as follows:
PERFORM VARYING WS_I FROM LENGTH OF WS_AMOUNT_TXT BY -1
UNTIL WS_I = ZERO
IF WS_AMOUNT_TXT(WS_I:1) IS NUMERIC OR
WS_AMOUNT_TXT(WS_I:1) = '.'
MOVE ZERO TO WS_I
ELSE
MOVE SPACE TO WS_AMOUNT_TXT(WS_I:1)
END-IF
END-PERFORM
The basic idea in the above code is to scan from the end of the last UNSTRING output field
to the beginning replacing anything that is not a valid digit or decimal point with a space.
Once a valid digit/decimal is found, exit the loop on the assumption that the rest will
be valid.
After cleanup use the intrinsic function NUMVAL as outlined in my answer to your
previous question
to convert WS_AMOUNT_TXT into a numeric data type.
One final piece of advice, MOVE SPACES TO INPUT_REC before each READ to blow away data left over
from a previous read that might be left in the buffer. This will protect you when reading a very "short"
record after a "long" one - otherwise you may trip over data left over from the previous read.
Hope this helps.
EDIT Just noticed this answer to your question about reading variable length files. Using a variable length input record is a better approach. Given the
actual input record length you can do something like:
UNSTRING INPUT_REC(1:REC_LEN) INTO...
Where REC_LEN is the variable specified after OCCURS DEPENDING ON for the INPUT_REC file FD. All the junk you are encountering occurs after the end of the record as defined by REC_LEN. Using reference modification as illustrated above trims it off before UNSTRING does its work to separate out the individual data fields.
EDIT 2:
Cannot use reference modification with UNSTRING. Darn... It is possible with some other COBOL dialects but not with OpenVMS COBOL. Try the following:
MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...
Where WS_BUFFER is a working storage PIC X variable long enough to hold the longest input record. When you MOVE a short alpha-numeric field to a longer one, the destination field (i.e. WS_BUFFER) is left justified, with spaces used to pad the remaining positions. Since leading and trailing spaces are acceptable to the NUMVAL function, you have exactly what you need.
I have a reason for pushing you in this direction. Any junk that ends up at the trailing end of a record buffer when reading a short record is undefined. There is a possibility that some of that junk just might end up being a digit or a decimal point. Should this occur, the cleanup routine I originally suggested would fail.
EDIT 3:
There are no ^# in the resulting WS_AMOUNT_TXT, but still there are a ^M
Looks like the file system is treating <CR> (that ^M thing) at the end of each record as data.
If the file you are reading came from a Windows platform and you are now
reading it on a UNIX platform that would explain the problem. Under Windows records
are terminated with <CR><LF> while on UNIX they are terminated with <LF> only. The
UNIX file system treats <CR> as if it were part of the record.
If this is the case, you can be pretty sure that there will be a single <CR> at the
end of every record read. There are a number of ways to deal with this:
Method 1: As you already noted, pre-edit the file using Notepad++ or some other
tool to remove the <CR> characters before processing through your COBOL program.
Personally I don't think this is the best way of going about it. I prefer to use a COBOL
only solution since it involves fewer processing steps.
Method 2: Trim the last character from each input record before processing it. The last
character should always be <CR>. Try the following if you
are reading records as variable length and have the actual input record length available.
SUBTRACT 1 FROM REC_LEN
MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...
Method 3: Treat <CR> as a delimiter when UNSTRINGing as follows:
UNSTRING INPUT_REC DELIMITED BY "," OR x"0D"
INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT
Method 4: Condition the last receiving field from UNSTRING by replacing trailing non digit/non decimal point characters with spaces. I outlined this solution a little earlier in this answer. You could also explore the INSPECT statement using the REPLACING option (Format 2). This should be able to do pretty much the same thing - just replace all x"00" by SPACE and x"0D" by SPACE.
Where there is a will, there is a way. Any of the above solutions should work for you. Choose the one you are most comfortable with.
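To make Method 3 concrete, here is a small self-contained sketch. It builds one sample record in working storage with a trailing carriage return, instead of reading a file, then lets UNSTRING treat x"0D" as a delimiter and converts the amount with NUMVAL. The field sizes and the two WS-AMOUNT items are assumptions for illustration, and the names use hyphens rather than underscores so that the sketch is not tied to one dialect.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVAMT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Field sizes below are assumed, not taken from the post.
       01  WS-INPUT-REC      PIC X(66).
       01  WS-ID-1           PIC X(10).
       01  WS-ID-2           PIC X(8).
       01  WS-CODE           PIC X(5).
       01  WS-DESCRIPTION    PIC X(20).
       01  WS-FLAG           PIC X(2).
       01  WS-AMOUNT-TXT     PIC X(8).
       01  WS-AMOUNT         PIC 9(5)V99.
       01  WS-AMOUNT-EDIT    PIC 9(5).99.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Simulate a record from a Windows file read on UNIX:
      * the data, then a carriage return, then padding.
           MOVE SPACES TO WS-INPUT-REC
           STRING "0005464560,45667759,ZAMTR," DELIMITED BY SIZE
                  "!To ACC 12345678,DR,79.85"  DELIMITED BY SIZE
                  X"0D"                        DELIMITED BY SIZE
               INTO WS-INPUT-REC
           END-STRING
      * The carriage return ends the last field, so it never
      * reaches WS-AMOUNT-TXT.
           UNSTRING WS-INPUT-REC DELIMITED BY "," OR X"0D"
               INTO WS-ID-1 WS-ID-2 WS-CODE
                    WS-DESCRIPTION WS-FLAG WS-AMOUNT-TXT
           END-UNSTRING
      * NUMVAL accepts the trailing spaces left in the field.
           COMPUTE WS-AMOUNT = FUNCTION NUMVAL (WS-AMOUNT-TXT)
           MOVE WS-AMOUNT TO WS-AMOUNT-EDIT
           DISPLAY "amount: " WS-AMOUNT-EDIT
           STOP RUN
           .
```

The amount should come out as 00079.85 in the edited field, with no ^M attached. This only covers the carriage return; leftover LOW-VALUES from a previous, longer record still call for the buffer clearing described earlier.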
^M is a carriage return.
Would Google Refine be useful for rectifying this data?
I have come across this bit of code and am wondering which line will be executed if x is smaller than 3.
IF (X < 3)
NEXT SENTENCE
ELSE
GO TO A010-DO-A.
GO TO B010-DO-B.
GO TO C010-DO-C.
I am not sure if the NEXT SENTENCE will notice the sentence nested in the ELSE block. When NEXT SENTENCE is executed will it skip over GO TO A010-DO-A. or GO TO B010-DO-B.?
Don't confuse the scope of statements and sentences in COBOL.
Sentences end with a period (or full stop if you are British). NEXT SENTENCE transfers control to the first statement following the end of the current sentence. In your example that would be GO TO B010-DO-B.
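If you want to check this on your own compiler, the snippet from the question can be wrapped into a complete program along these lines. The variable is renamed WS-X and the three target paragraphs are stubs that just announce themselves; both are my additions for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NEXTSENT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical test value; change it to 3 or more to
      * take the ELSE branch instead.
       01  WS-X              PIC 9 VALUE 1.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * NEXT SENTENCE jumps past the first period after it,
      * which is the one ending the IF sentence.
           IF (WS-X < 3)
               NEXT SENTENCE
           ELSE
               GO TO A010-DO-A.
           GO TO B010-DO-B.
           GO TO C010-DO-C.
       A010-DO-A.
           DISPLAY "reached A010-DO-A"
           STOP RUN
           .
       B010-DO-B.
           DISPLAY "reached B010-DO-B"
           STOP RUN
           .
       C010-DO-C.
           DISPLAY "reached C010-DO-C"
           STOP RUN
           .
```

With WS-X set to 1 the program should display "reached B010-DO-B". Some compilers may warn that GO TO C010-DO-C can never be reached.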
In general, use of NEXT SENTENCE in COBOL is deprecated, at least since the introduction of scope terminators such as END-whatever (e.g. END-IF), which happened sometime around 1985! Please do not use NEXT SENTENCE any more. You need to know what it is and what it does in order to read legacy code, but please avoid using it in any new code.
A better way to write the code in your example would be:
IF (X < 3)
CONTINUE
ELSE
GO TO A010-DO-A
END-IF
GO TO B010-DO-B
GO TO C010-DO-C
or...
IF (X >= 3)
GO TO A010-DO-A
END-IF
GO TO B010-DO-B
GO TO C010-DO-C
Notice all the periods (.) have been removed because
the scope terminator END-IF makes them redundant. Periods
are only needed at the end of procedures (ie. paragraphs/sections) and a few other places.
The CONTINUE statement is basically a no-op, so it has no effect other than being a placeholder to keep the syntax valid.
BTW... Best I can tell, the statement GO TO C010-DO-C is logically unreachable.
If X is less than 3
IF (X < 3)
NEXT SENTENCE
Otherwise, or in other words, if X is equal to or greater than 3
ELSE
GO TO A010-DO-A.
NEXT SENTENCE "branches" (a GO TO in whatever language is generated by the compiler) to the line of code following the next full-stop/period that is located physically after the NEXT SENTENCE statement. It is effectively a GO TO without needing a "label".
As has been said, it should not be used in new code.