Find a substring in a string in COBOL - cobol

My problem is, given a variable which I read from a file, see if it contains or matches another string.
In other words, find in a file all the records whose variable
BBRADD PIC X(30)
matches or contains a string introduced by keyboard.
I'm very confident this problem is resolved through the INSPECT instruction, and I've tried something like this in my code:
READ BRANCHFILE NEXT RECORD
AT END SET EndOfFile TO TRUE
END-READ.
PERFORM UNTIL EndOfFile
INSPECT BBRADD
TALLYING CONT for CHARACTERS
BEFORE INITIAL CITY
IF CONT>1
DISPLAY " BRANCH CODE :" BBRID
DISPLAY " BRANCH NAME :" BBRNAME
DISPLAY " BRANCH ADDRESS :" BBRADD
DISPLAY " PHONE :" BBRPH
DISPLAY " E-MAIL :" BEMAIL
DISPLAY " MANAGER NAME :" BMGRNAME
DISPLAY " ------------------"
DISPLAY " ------------------"
END-IF
READ BRANCHFILE NEXT RECORD
AT END SET EndOfFile TO TRUE
END-READ
MOVE 0 TO CONT
END-PERFORM.
Where CITY is the variable I introduce through keyboard.
Does anyone know how to find a "substring" in a "string"?
For example, if I enter "Zaragoza", my program has to print all the records in the file whose variable BBRADD contains "Zaragoza".
01 BRANCHREC.
88 EndOfFile VALUE HIGH-VALUE.
02 BBRID PIC X(6).
02 BBRNAME PIC X(15).
02 BBRADD PIC X(30).
02 BBRPH PIC X(10).
02 BEMAIL PIC X(20).
02 BMGRNAME PIC X(25).

You would need to set CONT to zero before the INSPECT, every time.
CONT just gets updated from its initial value when the INSPECT starts. After you find your first one, every record will look like it has CITY in it.
It may initially seem odd that it works that way, but if it didn't, you'd be limited on the occasions when that is how you want it to work.
Ah, looking a little closer, you are setting CONT to an initial value, you are just doing it in an unexpected place. If it needs to be zero, set it to zero immediately before it should be zero. Much easier to find, less easy for someone changing the program in the future to make a mess of.
However, you have another problem. Let's say CITY is PIC X(20). The user enters SEVILLA and your INSPECT will now search for SEVILLA followed by 13 spaces. Ideally you'd want SEVILLA followed by one space.
You need to be able to test for a value that the user has entered, with a trailing blank, but not more.
The current popular way to do this is with reference-modification.
You need to take your user-input, find out how many trailing spaces it contains, calculate how long the data is, add one for the trailing blank, and hold that value in a field (preferably a BINARY field).
Then your INSPECT can look like this:
INSPECT BBRADD
TALLYING CONT for CHARACTERS
BEFORE INITIAL CITY ( 1 : length-of-data-plus-one )
However, then you have a problem if SEVILLA is actually in the start of the field.
So you make a small change, not to count characters which appear before it, but to count occurrences of it.
INSPECT BBRADD
TALLYING CONT for ALL
CITY ( 1 : length-of-data-plus-one )
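As a rough sketch of the steps above, the following small program works out the length of the user's input plus one trailing blank and then counts occurrences with INSPECT. The program name, the sample values and the names TRAILING-SPACES and SEARCH-LEN are invented for illustration; it assumes fixed-format source and a compiler that accepts FUNCTION REVERSE as the subject of INSPECT ... TALLYING.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FINDCITY.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Sample data only; in the real program BBRADD comes from the
      * file and CITY from the keyboard.
       01  BBRADD            PIC X(30)
                             VALUE "CALLE MAYOR 1 SEVILLA".
       01  CITY              PIC X(20) VALUE "SEVILLA".
       01  TRAILING-SPACES   PIC 9(4) BINARY.
       01  SEARCH-LEN        PIC 9(4) BINARY.
       01  CONT              PIC 9(4) BINARY.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Count the trailing spaces of CITY by counting the leading
      * spaces of its reversed value.
           MOVE ZERO TO TRAILING-SPACES
           INSPECT FUNCTION REVERSE (CITY)
               TALLYING TRAILING-SPACES FOR LEADING SPACE
      * Length of the data plus one trailing blank.
           COMPUTE SEARCH-LEN =
               FUNCTION LENGTH (CITY) - TRAILING-SPACES + 1
      * If the input fills the whole field there is no room for
      * the extra blank, so stay inside the field.
           IF SEARCH-LEN > FUNCTION LENGTH (CITY)
               MOVE FUNCTION LENGTH (CITY) TO SEARCH-LEN
           END-IF
      * Zero the counter immediately before the INSPECT.
           MOVE ZERO TO CONT
           INSPECT BBRADD
               TALLYING CONT FOR ALL CITY (1:SEARCH-LEN)
           IF CONT > 0
               DISPLAY "FOUND: " BBRADD
           ELSE
               DISPLAY "NOT FOUND"
           END-IF
           STOP RUN.
```

One limitation of searching for the value with a trailing blank: a city name that ends exactly at the last byte of BBRADD has no blank after it and would not be counted.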
Many people will instead code a PERFORM loop with reference-modification and do the test that way. With the PERFORM you have to code the termination logic yourself; the final version of the INSPECT above does that for you. For learning purposes it would be good to do it both ways.
When doing file-io, always use and check the FILE STATUS. Put your READ into a paragraph and perform it, you don't need two different pieces of code. If you use the FILE STATUS you don't need the AT END (or the END-READ) as the field you use to receive the FILE STATUS value will be "10" for end-of-file. Just use your 88 on that field, with the value of "10".
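A minimal sketch of that structure might look like the following. The file name "BRANCH.DAT", the sequential organization, the status field BRANCH-STATUS and its 88-levels are assumptions made for illustration (the original SELECT was not shown); the record is declared as one 106-byte field, which is the total of the fields in BRANCHREC.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. READLOOP.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT BRANCHFILE ASSIGN TO "BRANCH.DAT"
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS BRANCH-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  BRANCHFILE.
       01  BRANCHREC         PIC X(106).
       WORKING-STORAGE SECTION.
      * The 88-levels sit on the FILE STATUS field, so no AT END
      * phrase is needed on the READ.
       01  BRANCH-STATUS     PIC XX.
           88  BRANCH-OK     VALUE "00".
           88  BRANCH-EOF    VALUE "10".
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT BRANCHFILE
           IF NOT BRANCH-OK
               DISPLAY "OPEN FAILED, STATUS " BRANCH-STATUS
               STOP RUN
           END-IF
      * Priming read, then one read at the end of each iteration,
      * both through the same paragraph.
           PERFORM READ-BRANCH
           PERFORM UNTIL BRANCH-EOF
               DISPLAY BRANCHREC
               PERFORM READ-BRANCH
           END-PERFORM
           CLOSE BRANCHFILE
           STOP RUN.
       READ-BRANCH.
           READ BRANCHFILE
      * Anything other than "00" or "10" is treated as an error.
           IF NOT BRANCH-OK AND NOT BRANCH-EOF
               DISPLAY "READ FAILED, STATUS " BRANCH-STATUS
               CLOSE BRANCHFILE
               STOP RUN
           END-IF.
```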
The Edit on your question now indicates where your existing 88-level is.
On the one hand, this is a good idea, because the end-of-file is associated with the record, and there can be no valid accidental content.
On the other hand, this is not a "portable" solution: if you use other COBOLs you may find that once end-of-file is reached it is no longer valid to access data under the FD. In the standard what happens in this situation is not defined, so you get differences amongst compilers.
You can retain the 88 on the group-item and have it portable by using READ ... INTO ... and having your record-layout in WORKING-STORAGE. This takes slightly longer to execute, as the data has to be transferred from one location to another.
I prefer the 88 on the FILE STATUS field and simplify the READ by being able to remove the AT END and END-READ. I already can't access the record-area under the FD so I can't accidentally get wrong values which look good.

Related

Why does my COBOL working storage variable have trailing zeroes?

I'm building a COBOL program to calculate the average of up to 15 integers. The execution displays a number that is far bigger than intended with a lot of trailing zeroes. Here is the relevant code:
Data Division.
Working-Storage Section.
01 WS-COUNTER PIC 9(10).
01 WS-INPUT-TOTAL PIC 9(10).
01 WS-NEXT-INPUT PIC X(8).
01 WS-CONVERTED-INPUT PIC 9(8).
01 WS-AVG PIC 9(8)V99.
Procedure Division.
PROG.
PERFORM INIT-PARA
PERFORM ADD-PARA UNTIL WS-COUNTER = 15 OR WS-NEXT-INPUT = 'q'
PERFORM AVG-PARA
PERFORM END-PARA.
INIT-PARA.
DISPLAY 'This program calculates the average of inputs.'.
MOVE ZERO TO WS-COUNTER
MOVE ZERO TO WS-INPUT-TOTAL
MOVE ZERO TO WS-AVG.
ADD-PARA.
DISPLAY 'Enter an integer or type q to quit: '
ACCEPT WS-NEXT-INPUT
IF WS-NEXT-INPUT NOT = 'q'
MOVE WS-NEXT-INPUT TO WS-CONVERTED-INPUT
ADD WS-CONVERTED-INPUT TO WS-INPUT-TOTAL
ADD 1 TO WS-COUNTER
END-IF.
AVG-PARA.
IF WS-COUNTER > 1
DIVIDE WS-INPUT-TOTAL BY WS-COUNTER GIVING WS-AVG
DISPLAY 'Your average is ' WS-AVG '.' WS-NEXT-INPUT
END-IF.
The reason I put WS-NEXT-INPUT as alphanumeric and move it to a numeric WS-CONVERTED-INPUT if the IF condition is satisfied is because I want it to be able to take "q" to break the UNTIL loop, but after the condition is satisfied, I want a numeric variable for the arithmetical statements. Here's what it looks like with the numbers 10 and 15 as inputs:
10is program calculates the average of inputs.
Enter an integer or type q to quit:
15
Enter an integer or type q to quit:
q
Your average is 1250000000.
The console is a bit buggy so it forces me to input the 10 in that top left corner most of the time. Don't worry about that.
You see my problem in that execution. The result is supposed to be 00000012.50 instead of 1250000000. I tried inserting a few of my other variables into that display statement and they're all basically as they should be except for WS-INPUT-TOTAL which with that combination of numbers ends up being 0025000000 instead of 0000000025 as I would have expected. Why are these digits being stored in such a weird and unexpected way?
You have that strange output because of undefined behavior - computing with spaces.
The MOVE you present has the exact same USAGE and the same size on both sides, so the data is commonly taken over "as is"; the trailing spaces are not normally converted by some magic, so WS-CONVERTED-INPUT ends up containing 10 followed by six spaces. As the standard says for the MOVE:
De-editing takes place only when the sending operand is a numeric-edited data item and the receiving item is a numeric or a numeric-edited data item.
and if it were an edited field, the MOVE should still raise an exception:
When a numeric-edited data item is the sending operand of a de-editing MOVE statement and the content of that data item is not a possible result for any editing operation in that data item, the result of the MOVE operation is undefined and an EC-DATA-INCOMPATIBLE exception condition is set to exist.
When computing with spaces you would commonly get a fatal error, but it seems your compiler does not have that check activated (and because you didn't share your compile command or even your compiler, we can't help with that).
When the checks that would lead to an abort are not activated, different COBOL dialects often treat invalid data as zero, at least partially and at least for spaces (but they can use anything). This then leads to WS-CONVERTED-INPUT being "seen as" 10000000, so your computation includes those big numbers.
So your program should work if you enter the necessary amount of leading zeroes on input.
General:
"never trust input data - validate" (and error or convert as necessary)
at least if something looks suspicious - activate all runtime checks available, re-try.
Solution - Do an explicit conversion:
MOVE FUNCTION NUMVAL(WS-NEXT-INPUT) TO WS-CONVERTED-INPUT, this will strip surrounding spaces and then convert from left to right until invalid data is found. A good coder would also check up-front using FUNCTION TEST-NUMVAL, otherwise you compute with zero if someone enters "TWENTY".
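As a small sketch of that validate-then-convert idea, assuming a compiler that provides FUNCTION TEST-NUMVAL (it is not available everywhere) and using a hard-coded sample value in place of the ACCEPT:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CONVIN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Sample value; in the real program this is filled by ACCEPT.
       01  WS-NEXT-INPUT       PIC X(8) VALUE "15".
       01  WS-CONVERTED-INPUT  PIC 9(8).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * TEST-NUMVAL returns zero when the argument is acceptable
      * to NUMVAL.
           IF FUNCTION TEST-NUMVAL (WS-NEXT-INPUT) = 0
               COMPUTE WS-CONVERTED-INPUT =
                   FUNCTION NUMVAL (WS-NEXT-INPUT)
               DISPLAY "CONVERTED: " WS-CONVERTED-INPUT
           ELSE
               DISPLAY "NOT A NUMBER: " WS-NEXT-INPUT
           END-IF
           STOP RUN.
```

Because WS-CONVERTED-INPUT is unsigned and has no decimal places, a negative or fractional input would pass the test but lose its sign or fraction; a real program would need a receiving field that matches the values it is prepared to accept.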

How to write a cobol code to do the below logic?

1) Read a line of 2000 characters and replace all SPACES with a single "+" plus character, i.e. convert "A B" to "A+B" or "A    B" to "A+B".
2) Read a line of 2000 characters, then search for specific patterns like "PWD" or "INI" etc. and finally store the next 6 characters into a variable.
3) Read a line of 2000 characters and store the last word in the string to a variable.
Edit:
I use Micro Focus COBOL.
My code is below. It removes a few spaces but not all. Try writing any sentence with random numbers of spaces in between words in an input file for test-data.
IDENTIFICATION DIVISION.
PROGRAM-ID. SALAUT.
ENVIRONMENT DIVISION.
FILE-CONTROL.
SELECT IN-FILE ASSIGN TO "INFILE"
ORGANIZATION IS LINE SEQUENTIAL
FILE STATUS IS WS-IN-FILE-STATUS.
SELECT OUT-FILE ASSIGN TO "OUTFILE"
ORGANIZATION IS LINE SEQUENTIAL
FILE STATUS IS WS-OUT-FILE-STATUS.
DATA DIVISION.
FILE SECTION.
FD IN-FILE.
01 FS-IN-FILE PIC X(200).
FD OUT-FILE.
01 FS-OUT-FILE PIC X(200).
WORKING-STORAGE SECTION.
01 WS-ATMA-C.
03 WS-OUT-FILE-STATUS PIC X(02).
03 WS-IN-FILE-STATUS PIC X(02).
03 WS-LOOP-COUNTER PIC 9(03) VALUE 1.
03 WS-IN-EOF PIC X value 'N'.
03 WS-IN-FILE-LEN PIC 9(03).
03 WS-IN-SPACE-CNT PIC 9(03) VALUE 1.
03 FS-IN-FILE-2 PIC X(200).
03 WS-TRIL-SPACE-CNT PIC 9(03).
03 WS-TOT-SPACE-CNT PIC 9(03).
PROCEDURE DIVISION.
MAIN-PARA.
OPEN INPUT IN-FILE.
IF WS-IN-FILE-STATUS <> '00'
EXHIBIT 'IN-FILE-OPEN-ERROR : STOP-RUN'
EXHIBIT NAMED WS-IN-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
OPEN OUTPUT OUT-FILE.
IF WS-OUT-FILE-STATUS <> '00'
EXHIBIT 'OUT-FILE-OPEN-ERROR : STOP-RUN'
EXHIBIT NAMED WS-OUT-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
PERFORM SPACE-REMOVER-PARA THRU SPACE-REMOVER-PARA-EXIT.
CLOSE IN-FILE.
IF WS-IN-FILE-STATUS <> '00'
EXHIBIT 'IN-FILE-CLOSE-ERROR : STOP-RUN'
EXHIBIT NAMED WS-IN-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
CLOSE OUT-FILE.
IF WS-OUT-FILE-STATUS <> '00'
EXHIBIT 'IN-FILE-CLOSE-ERROR : STOP-RUN'
EXHIBIT NAMED WS-OUT-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
MAIN-PARA-EXIT.
STOP RUN.
SPACE-REMOVER-PARA.
PERFORM UNTIL WS-IN-EOF = 'Y'
INITIALIZE FS-IN-FILE FS-OUT-FILE WS-IN-FILE-LEN FS-IN-FILE-2
READ IN-FILE
AT END
MOVE 'Y' TO WS-IN-EOF
NOT AT END
INSPECT FS-IN-FILE TALLYING WS-IN-FILE-LEN FOR CHARACTERS
EXHIBIT NAMED WS-IN-FILE-LEN
MOVE 1 TO WS-LOOP-COUNTER
IF WS-IN-FILE-LEN <> 0
PERFORM UNTIL WS-IN-SPACE-CNT <= ZEROS
INSPECT FS-IN-FILE TALLYING WS-TOT-SPACE-CNT FOR ALL " "
INSPECT FUNCTION REVERSE (FS-IN-FILE) TALLYING
WS-TRIL-SPACE-CNT FOR LEADING " "
INITIALIZE WS-IN-SPACE-CNT
COMPUTE WS-IN-SPACE-CNT =
WS-TOT-SPACE-CNT - WS-TRIL-SPACE-CNT
PERFORM VARYING WS-LOOP-COUNTER FROM 1 BY 1
UNTIL WS-LOOP-COUNTER >=
WS-IN-FILE-LEN - (2 * WS-TRIL-SPACE-CNT)
IF FS-IN-FILE(WS-LOOP-COUNTER:2) = " "
STRING FS-IN-FILE(1:WS-LOOP-COUNTER - 1) DELIMITED BY SIZE
FS-IN-FILE(WS-LOOP-COUNTER + 2
: WS-IN-FILE-LEN - WS-LOOP-COUNTER - 2)
DELIMITED BY SIZE
INTO FS-IN-FILE-2
END-STRING
INITIALIZE FS-IN-FILE
MOVE FS-IN-FILE-2 TO FS-IN-FILE
INITIALIZE FS-IN-FILE-2
END-IF
END-PERFORM
INITIALIZE WS-LOOP-COUNTER WS-TRIL-SPACE-CNT WS-TOT-SPACE-CNT
END-PERFORM
WRITE FS-OUT-FILE FROM FS-IN-FILE
IF WS-OUT-FILE-STATUS <> '00'
EXHIBIT 'OUT-FILE-WRITE-ERROR : STOP-RUN'
EXHIBIT NAMED WS-OUT-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF
END-IF
END-READ
END-PERFORM.
SPACE-REMOVER-PARA-EXIT.
EXIT.
As INSPECT REPLACING only allows you to replace the same number of bytes, you cannot use it. As Brian pointed out, your COBOL runtime may come with options like GnuCOBOL's FUNCTION SUBSTITUTE. In any case, the question "Which COBOL?" is still a useful one to answer.
To do Thraydor's approach, use UNSTRING to a table using a string pointer. Something along these lines:
MOVE 1 TO strpoint
PERFORM VARYING table-idx FROM 1 BY 1
UNTIL table-idx = table-max
UNSTRING your2000line DELIMITED BY ALL SPACES
INTO tmp-table (table-idx)
WITH POINTER strpoint
NOT ON OVERFLOW
EXIT PERFORM
END-UNSTRING
END-PERFORM
Another approach which always works is a simple PERFORM over the 2000 bytes with a bunch of IF your2000line (pos:1) statements (if possible, combine them into a single EVALUATE), checking byte by byte (comparing with the last byte to remove the duplicate bytes), transferring the source with replacements to a temporary field, and MOVEing it back once you're finished.
Please edit your question to show what exactly you've tried and you can get much better answers.
Firstly, bear in mind that COBOL is a language of dialects. There are also active commercial compilers which target the 1974, 1985, 2002 (now obsolete, incorporated in 2014) and 2014 Standards. All with their own Language Extensions, which may or may not be honoured in a different COBOL compiler.
If you are targeting your learning to a particular environment (IBM Mainframe COBOL you have said) then use that dialect as a subset of what is available to you in the actual COBOL you are using. Which means using the IBM Manuals.
Don't pick and choose stuff from places and use it just because it somehow seemed like a good idea at the time.
I have to admit that EXHIBIT was great fun to use, but it was only ever a Language Extension, and IBM dropped it by at least the later releases of OS/VS COBOL. It, like ON, was a "debugging" statement, although that didn't prevent their being used "normally". There's additional overhead to using EXHIBIT over a simple DISPLAY. IBM Enterprise COBOL only has a simple DISPLAY.
Whilst you may think it fun to use pictograms (the "oh my goodness, what symbol should I use for this" of a figure attempting to pull his own hair out) be aware that that particular symbol was a latecomer to the 2014 Standard, and if it appears in Enterprise COBOL within the next 20 to 50 years I'd be surprised (very low on the list of things to do, another cute way to write "not equal to" when many already exist, and COBOL even has an ELSE).
Some pointers. Don't have a procedure called "remove-all-the-spaces" if what it actually does is "everything-including-install-a-new-kitchen-sink". Is it any wonder you can't find why it doesn't work?
Many, many, many COBOL programs have the task of reading a file, until the end, and processing the records in the file. Get yourself one of those working well first. Is that relevant to the "business process" the program is addressing? No, it's just technical stuff, which you can't do without so hide it somewhere. Where? in PERFORMed procedures (paragraphs or SECTIONS). Don't expect someone who quickly wants to know what your program is doing to want to read the stuff which every program does. Hide it.
You can find quite a bit of general advice here about writing COBOL programs. Pay attention to those which advise of the use of full-stops/periods, priming reads, and the general structure of COBOL programs.
It is very important to describe things accurately. Work on good, descriptive, accurate names for data-names and procedures. A file is a collection of records.
You have cut down the size of your data to make testing easier, without realising that you have a problem with your data-definitions when you go back to full-length data. Your "counters" can only hold three digits, when they need to be able to cope with the numbers up to 2000.
There is no point in doing something to a piece of data, and then immediately squishing that something with something else which is not related in any way to the original something.
MOVE SPACE TO B
MOVE A TO B
The first MOVE is redundant, superfluous, and does nothing but suck up CPU time and confuse the next reader of your program. "Is there some code missing, because otherwise that's just plain dumb".
This is a variant of that example with the MOVE, and you are doing this all over the place:
INITIALIZE WS-IN-SPACE-CNT
COMPUTE WS-IN-SPACE-CNT =
WS-TOT-SPACE-CNT - WS-TRIL-SPACE-CNT
The INITIALIZE is a waste of space, resources, and an introducer of confusion, and extra lines of code to make your program more difficult to understand.
Also, don't "reset" things after they are used, so that they are "ready for next time". That creates dependencies which a future amender of your program will not expect. Even when expected/noticed, they make the code harder to follow.
Exactly what is wrong with your code is impossible to say without knowing what you think is wrong with it. For instance, there is not even a sign of a "+" replacing any spaces, so if you feel that is what is wrong, you simply haven't coded for it.
You've also only attempted one of the three tasks. If one of those not working is what you think is wrong...
Knowing what you think is wrong is one thing, but there are a lot of other problems. If you sit down and sort those out, methodically, then you'll come up with a "structurally sound" COBOL program in which you'll find it easier to understand what your own code does, and where problems lie.
A B C D E
A+B+C+D+E
To get from the first to the second using STRING, look into Simon's suggestion to use WITH POINTER.
Another approach you could take would be using reference-modification.
Either way, you'd build your result field a piece at a time:
This field intentionally blank
A
A+B
A+B+C
A+B+C+D
A+B+C+D+E
Rather than tossing all the data around each time. There are also other ways to code it, but that can be for later.
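One way the reference-modification approach could be sketched, building the output a byte at a time and emitting a single "+" for each run of spaces. All the names, the sample line and the use of FUNCTION REVERSE to find the end of the data are choices made for this illustration, not taken from the question's program; fixed-format source is assumed.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PLUSJOIN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Sample line; in the real program this comes from the file.
       01  IN-LINE          PIC X(2000) VALUE "A   B  C D    E".
       01  OUT-LINE         PIC X(2000) VALUE SPACE.
      * Counters are big enough for positions up to 2000.
       01  IN-POS           PIC 9(4) BINARY.
       01  OUT-POS          PIC 9(4) BINARY VALUE 0.
       01  LAST-DATA        PIC 9(4) BINARY.
       01  TRAILING-CNT     PIC 9(4) BINARY VALUE 0.
       01  PREV-SPACE-FLAG  PIC X VALUE "N".
           88  PREV-WAS-SPACE   VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Find the last non-space byte so that the trailing padding
      * is not turned into a "+".
           INSPECT FUNCTION REVERSE (IN-LINE)
               TALLYING TRAILING-CNT FOR LEADING SPACE
           COMPUTE LAST-DATA =
               FUNCTION LENGTH (IN-LINE) - TRAILING-CNT
           PERFORM VARYING IN-POS FROM 1 BY 1
                   UNTIL IN-POS > LAST-DATA
               IF IN-LINE (IN-POS:1) = SPACE
      * Only the first space of a run produces a "+".
                   IF NOT PREV-WAS-SPACE
                       ADD 1 TO OUT-POS
                       MOVE "+" TO OUT-LINE (OUT-POS:1)
                       SET PREV-WAS-SPACE TO TRUE
                   END-IF
               ELSE
                   ADD 1 TO OUT-POS
                   MOVE IN-LINE (IN-POS:1)
                       TO OUT-LINE (OUT-POS:1)
                   MOVE "N" TO PREV-SPACE-FLAG
               END-IF
           END-PERFORM
           IF OUT-POS > 0
               DISPLAY OUT-LINE (1:OUT-POS)
           END-IF
           STOP RUN.
```

With the sample line this should display A+B+C+D+E. Note that in this sketch a run of leading spaces would also become a single "+"; whether that is wanted depends on the requirement.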

Way to "catch" input data through ACCEPT?

I am fairly new to Cobol and am writing a basic application to get started with the language. Right now it is a 100% console application and I ran into a problem that isn't really a problem unless the user inputs the wrong data... And after not being able to find an answer to my "problem" through google I can't seem to get it out of my head, hence the thread here.
I have a variable USER-RECORD X(4) and during run-time I ask the user to enter his/her user record. Let's say they enter '1234', then 1234 is being saved and stored in a file along with the other data being requested. However, if the user were to enter "11234" by mistake, then the program will store 1123 instead of 1234, which is wrong because of the wrongly entered data.
My question is if it is possible to surround the ACCEPT verb with some kind of statement, or "catch" the incoming data in some way that if the data being sent is larger than 4 characters something will happen?
It is probably smarter to handle these things afterwards, making the user confirm the data and all, but I can't seem to get this out of my head. Is this possible at all?
01 USER-RECORD PIC X(4).
01 USER-RECORD-REDEFINED REDEFINES USER-RECORD PIC 9(4).
GET-USER-INPUT.
MOVE 'N' TO WS-NUMERIC.
PERFORM UNTIL WS-NUMERIC = 'Y'
DISPLAY 'ENTER YOUR 4 DIGIT RECORD NUMBER: ' NO ADVANCING
ACCEPT USER-RECORD *>---THIS IS WHERE MY QUESTION LIES---
IF USER-RECORD-REDEFINED IS NUMERIC
MOVE 'Y' TO WS-NUMERIC
ELSE
DISPLAY 'VALUE WAS NOT NUMERIC OR 2 SHORT, TRY AGAIN'
END-IF
END-PERFORM.
Your Micro Focus compiler supports the SCREEN SECTION. If you use a SCREEN, then your data can only be entered up to the size of the field.
See the FORMAT 4 of ACCEPT in the Micro Focus documentation here, https://supportline.microfocus.com/documentation/books/nx51ws01/nx51indx.htm, and from there locate the documentation of the SCREEN SECTION and other documentation about using screens in Micro Focus COBOL.
ACCEPT and DISPLAY are the COBOL verbs which suffer the most difference from compiler to compiler. When COBOL was originally developed, the idea of a user sitting at a screen to interact with a program was, let's say, futuristic.
Assuming that you have already seen the effect you describe and want to use a very simple, and standard-ish, format of ACCEPT, I'd suggest this:
01 USER-RECORD PIC X(80).
01 FILLER REDEFINES USER-RECORD.
05 USER-ID-give-it-a-good-name PIC 9(4).
05 FILLER PIC X(76).
88 USER-RECORD-EXTRAS-BLANK VALUE SPACE.
SET USER-RECORD-EXTRAS-BLANK TO TRUE
ACCEPT USER-RECORD
[your existing code using the nice new name]
IF NOT ( USER-RECORD-EXTRAS-BLANK )
[do something for also an error]
END-IF
The user may be able to circumvent this by idly using the spacebar to get 76 blanks, but probably not worth dealing with that.

Implicit Close of file

I have written the following COBOL program:
*************************************************************
* VERKOOP
*************************************************************
IDENTIFICATION DIVISION.
PROGRAM-ID. VERKOOP.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT PRODUCTEN ASSIGN TO "BESTANDEN/PRODUCTEN"
ACCESS MODE IS RANDOM
ORGANIZATION IS INDEXED
RECORD KEY IS PRODUCTID
FILE STATUS IS WS-FILE-STATUS.
DATA DIVISION.
FILE SECTION.
FD PRODUCTEN BLOCK CONTAINS 10 RECORDS.
01 PRODUCT.
02 PRODUCTID PIC X(6).
02 LEVERANCIERID PIC X(6).
02 AANTAL PIC 9(6).
WORKING-STORAGE SECTION.
77 FOUT PIC X.
88 PRODUCT-NIET-GEVONDEN VALUE 1.
77 WS-PRODUCTID PIC X(6).
77 WS-AANTAL PIC 9(6).
77 WS-FILE-STATUS PIC XX.
LINKAGE SECTION.
01 LS-PRODUCTID PIC X(6).
01 LS-AANTAL PIC 9(6).
PROCEDURE DIVISION.
* USING LS-PRODUCTID, LS-AANTAL.
MAIN.
PERFORM INITIALISEER
PERFORM LEES-PRODUCT-IN
PERFORM LEES-BESTAND
PERFORM SLUIT-BESTAND
STOP RUN.
INITIALISEER.
MOVE ZEROS TO PRODUCT
OPEN I-O PRODUCTEN.
* DISPLAY WS-FILE-STATUS..
LEES-PRODUCT-IN.
* MOVE LS-PRODUCTID TO WS-PRODUCTID
* MOVE LS-AANTAL TO WS-AANTAL.
DISPLAY "GEEF PRODUCTID OP: "
ACCEPT WS-PRODUCTID
DISPLAY "GEEF AANTAL OP: "
ACCEPT WS-AANTAL.
LEES-BESTAND.
* DISPLAY "LEES-BESTAND"
MOVE WS-PRODUCTID TO PRODUCTID
* DISPLAY PRODUCTID
READ PRODUCTEN INVALID KEY SET PRODUCT-NIET-GEVONDEN TO TRUE
END-READ
DISPLAY "END-READ" WS-FILE-STATUS
IF PRODUCT-NIET-GEVONDEN PERFORM FOUTJE
ELSE
MOVE WS-PRODUCTID TO PRODUCTID
SUBTRACT WS-AANTAL FROM AANTAL
PERFORM UPDATE-PRODUCT
END-IF.
UPDATE-PRODUCT.
REWRITE PRODUCT INVALID KEY PERFORM FOUTJE.
SLUIT-BESTAND.
* DISPLAY "SLUIT-BESTAND"
CLOSE PRODUCTEN.
FOUTJE.
DISPLAY "ER IS EEN FOUT OPGETREDEN"
* DISPLAY WS-FILE-STATUS
STOP RUN.
The idea is that I find a product by its productid in the file PRODUCTEN.dat and subtract the amount (aantal) by a given number. However everytime I run it I get the following error: WARNING - Implicit CLOSE of PRODUCTEN <"BESTANDEN/PRODUCTEN">. I don't really see the problem, the WS-FILE-STATUS line even gives me back a 00 status. I am 100% sure the product is in the file so I'm not trying to subtract from a non-existing product or anything.
UPDATE: I fixed it by assigning PRODUCTEN to a newly declared file, as the last one (somehow) got corrupted and was behaving in an unintended way.
To get that Implicit Close message, you must have a STOP RUN before you close the file.
You have a STOP RUN in paragraph FOUTJE, before the file is closed, so paragraph FOUTJE is being used.
You use paragraph FOUTJE in a PERFORM when PRODUCT-NIET-GEVONDEN is true.
PRODUCT-NIET-GEVONDEN is set to true on the INVALID KEY of the READ.
So INVALID KEY is true.
You get a FILE STATUS of ZERO. Unexpected, but fits what you have presented.
I don't have COBOL-IT and I don't know what OS you are using.
I also don't know in your set-up what a READ of a keyed file which does not explicitly reference a key does.
I don't know in any set-up, because I don't do it. If I'm doing a keyed read, I always specify the key.
I don't put data in the key on the file. I use a WORKING-STORAGE field for the key.
Why, well, implementation-dependent for the compiler, but unless your file is OPEN and unless there is a current record on the file, then the content, even the address, of a file record is/can be (implementation dependent) undefined.
As far as I am concerned, the KEY on the SELECT is to define the presence of the key on the file. The key you are using to READ the file obviously comes from elsewhere.
So, I would remove these:
MOVE ZEROS TO PRODUCT
MOVE WS-PRODUCTID TO PRODUCTID
I'd change this to include the KEY of WS-PRODUCTID
READ PRODUCTEN INVALID KEY SET PRODUCT-NIET-GEVONDEN TO TRUE
I'd not use INVALID KEY, I'd just use the value of WS-FILE-STATUS, which I'd expect to be "23" for "not found". I'd do the test with an 88. You then don't need your "flag" (FOUT and PRODUCT-NIET-GEVONDEN) anyway. Check the FILE STATUS field after each IO. This time you spelled your filename correctly, another time you won't and you may waste more time chasing your tail.
Work on consistent indentation, it will make your program easier to read, for you, and anyone else.
If you want to use DISPLAY to verify the logical path, you need to DISPLAY the value which is used to determine the logical path (FOUT in this case).
There are two "formats" of the READ statement. One is for sequential reads, one is for reads using a key. When each is reduced to its mandatory-only content, they are identical. Therefore it is not clear, per compiler, which type of READ is the default (when not explicit) or when it is the default (per file). So I always make it explicit:
READ PRODUCTEN KEY IS WS-PRODUCTID
I would then use the FILE STATUS field to determine whether the key was read (00 in the status) or not found (23) or something else (something else).
NOTE: This Answer as a resolution to your problem only works if everything is as you have described. Further information may invalidate this Answer as a Resolution.
The Answer does work as a generally clearer (therefore better) way to code your COBOL program.
Turns out to have been a suspected corrupted file. This may have caused a disparity between INVALID KEY and FILE STATUS, but in the normal course of events that is not going to happen. It is the only thing which fits all the evidence, but this is an exceptional case, perhaps not able to reproduce without the exact-same file corruption and clutching at this straw in a general case for why a given program is not working is probably the first refuge of a scoundrel.

How can we eliminate junk value in field?

I have some CSV records which are variable in length, for example:
0005464560,45667759,ZAMTR,!To ACC 12345678,DR,79.85
0006786565,34567899,ZAMTR,!To ACC 26575443,DR,1000
I need to separate each of these fields, and I need the last field, which should be a money amount.
However, as I read the file and unstring the record into fields, I found that the last field contains junk values at its end. The amount (money) field should be 8 characters: 5 digits at the front, 1 dot, 2 digits at the end. The values from the input could be any value such as 13.5, 1000 and 354.23.
"FILE SECTION"
FD INPUT_FILE.
01 INPUT_REC PIC X(66).
"WORKING STORAGE SECTION"
01 WS_INPUT_REC PIC X(66).
01 WS_AMOUNT_NUM PIC 9(5).9(2).
01 WS_AMOUNT_TXT PIC X(8).
"MAIN SECTION"
UNSTRING INPUT_REC DELIMITED BY ","
INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT
MOVE WS_AMOUNT_TXT(1:8) TO WS_AMOUNT_NUM(1:8)
DISPLAY WS_AMOUNT_NUM
From the display, the values look normal (345.23, 1000), just as they are in the input. However, after I wrote the field into a file, here is what they become:
79.85^M^#^#
137.35^M^#
I have inspected the field WS_AMOUNT_NUM, which came from the field WS_AMOUNT_TXT, and found that ^# is a kind of LOW-VALUE. However, I cannot work out what ^M is; it is not a space and not a HIGH-VALUE.
I am guessing, but it looks like you may be reading variable length records from a file into a fixed length COBOL record. The junk at the end of the COBOL record is giving you some grief. Hard to say how consistent that junk is going to be from one read to the next (data beyond the bounds of actual input record length are technically undefined). That junk ends up being included in WS_AMOUNT_TXT after the UNSTRING.
There are a number of ways to solve this problem. The suggestion I am giving you here may not
be optimal, but it is simple and should get the job done.
The last INTO field, WS_AMOUNT_TXT, in your UNSTRING statement is the one that receives all of the trailing
junk. That junk needs to be stripped off. Knowing that the only valid characters in the last field are
digits and the decimal character, you could clean it up as follows:
PERFORM VARYING WS_I FROM LENGTH OF WS_AMOUNT_TXT BY -1
UNTIL WS_I = ZERO
IF WS_AMOUNT_TXT(WS_I:1) IS NUMERIC OR
WS_AMOUNT_TXT(WS_I:1) = '.'
MOVE ZERO TO WS_I
ELSE
MOVE SPACE TO WS_AMOUNT_TXT(WS_I:1)
END-IF
END-PERFORM
The basic idea in the above code is to scan from the end of the last UNSTRING output field
to the beginning replacing anything that is not a valid digit or decimal point with a space.
Once a valid digit/decimal is found, exit the loop on the assumption that the rest will
be valid.
After cleanup, use the intrinsic function NUMVAL, as outlined in my answer to your previous question, to convert WS_AMOUNT_TXT into a numeric data type.
One final piece of advice, MOVE SPACES TO INPUT_REC before each READ to blow away data left over
from a previous read that might be left in the buffer. This will protect you when reading a very "short"
record after a "long" one - otherwise you may trip over data left over from the previous read.
Hope this helps.
EDIT Just noticed this answer to your question about reading variable length files. Using a variable length input record is a better approach. Given the
actual input record length you can do something like:
UNSTRING INPUT_REC(1:REC_LEN) INTO...
Where REC_LEN is the variable specified after OCCURS DEPENDING ON for the INPUT_REC file FD. All the junk you are encountering occurs after the end of the record as defined by REC_LEN. Using reference modification as illustrated above trims it off before UNSTRING does its work to separate out the individual data fields.
EDIT 2:
Cannot use reference modification with UNSTRING. Darn... It is possible with some other COBOL dialects but not with OpenVMS COBOL. Try the following:
MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...
Where WS_BUFFER is a working storage PIC X variable long enough to hold the longest input record. When you MOVE a short alpha-numeric field to a longer one, the destination field (i.e. WS_BUFFER) is left justified with spaces used to pad the remaining space. Since leading and trailing spaces are acceptable to the NUMVAL function you have exactly what you need.
I have a reason for pushing you in this direction. Any junk that ends up at the trailing end of a record buffer when reading a short record is undefined. There is a possibility that some of that junk just might end up being a digit or a decimal point. Should this occur, the cleanup routine I originally suggested would fail.
EDIT 3:
There are no ^# in the resulting WS_AMOUNT_TXT, but still there are a ^M
Looks like the file system is treating <CR> (that ^M thing) at the end of each record as data.
If the file you are reading came from a Windows platform and you are now
reading it on a UNIX platform that would explain the problem. Under Windows records
are terminated with <CR><LF> while on UNIX they are terminated with <LF> only. The
UNIX file system treats <CR> as if it were part of the record.
If this is the case, you can be pretty sure that there will be a single <CR> at the
end of every record read. There are a number of ways to deal with this:
Method 1: As you already noted, pre-edit the file using Notepad++ or some other
tool to remove the <CR> characters before processing through your COBOL program.
Personally I don't think this is the best way of going about it. I prefer to use a COBOL
only solution since it involves fewer processing steps.
Method 2: Trim the last character from each input record before processing it. The last
character should always be <CR>. Try the following if you
are reading records as variable length and have the actual input record length available.
SUBTRACT 1 FROM REC_LEN
MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...
Method 3: Treat <CR> as a delimiter when UNSTRINGing as follows:
UNSTRING INPUT_REC DELIMITED BY "," OR x"0D"
INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT
Method 4: Condition the last receiving field from UNSTRING by replacing trailing
non digit/non decimal point characters with spaces. I outlined this solution a little earlier in this
question. You could also explore the INSPECT statement using the REPLACING option (Format 2). This should be able to do pretty much the same thing - just replace all x"00" by SPACE and x"0D" by SPACE.
Where there is a will, there is a way. Any of the above solutions should work for you. Choose the one you are most comfortable with.
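As an illustration of the INSPECT ... REPLACING idea mentioned under Method 4, here is a small stand-alone sketch. It fakes the junk by moving x"0D" and x"00" into the field by hand, uses hyphenated names (WS-AMOUNT-TXT and so on) rather than the underscore names of the question, and assumes a compiler that accepts hexadecimal literals and FUNCTION NUMVAL.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CLEANAMT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-AMOUNT-TXT   PIC X(8).
       01  WS-AMOUNT       PIC 9(5)V99.
       01  WS-AMOUNT-OUT   PIC Z(4)9.99.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Simulate what UNSTRING delivered: the amount followed by a
      * carriage return and two low-values.
           MOVE "79.85" TO WS-AMOUNT-TXT
           MOVE X"0D" TO WS-AMOUNT-TXT (6:1)
           MOVE X"00" TO WS-AMOUNT-TXT (7:1)
           MOVE X"00" TO WS-AMOUNT-TXT (8:1)
      * Turn the junk bytes into spaces, which NUMVAL tolerates.
           INSPECT WS-AMOUNT-TXT
               REPLACING ALL X"00" BY SPACE
                         ALL X"0D" BY SPACE
           COMPUTE WS-AMOUNT = FUNCTION NUMVAL (WS-AMOUNT-TXT)
           MOVE WS-AMOUNT TO WS-AMOUNT-OUT
           DISPLAY WS-AMOUNT-OUT
           STOP RUN.
```

This only removes the two junk values seen so far; if other undefined bytes can appear after the end of a short record, trimming by record length as described earlier is the safer route.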
^M is a carriage return.
Would Google Refine be useful for rectifying this data?

Resources