COBOL Data Validation for capital letter? - cobol

I'm in my second quarter of college and taking "Advanced COBOL". We just received an assignment that requires us to code some validation procedures for different data. I have everything done except one small validation procedure.
There is a field called "PART-NUMBER" that is 8 bytes long. The first 5 columns must be a number. The 6th column must be a capital letter and the last 2 columns must be in the range of 01-68 or 78-99. The only problem I have is figuring out how to validate that the 6th column is capital.
Here is the code I am using:
From working storage:
01 DETAIL-LINE.
   05 PART-NUMBER.
      10 PART-FIRST-FIVE-DL PIC X(5).
      10 PART-LETTER-DL PIC X.
         88 CAPITAL-LETTER VALUE 'A' THRU 'Z'.
      10 PART-LAST-TWO-DL PIC XX.
From 300-VALIDATE-PART-NUMBER
EVALUATE PART-LETTER-DL ALPHABETIC
    WHEN TRUE EVALUATE CAPITAL-LETTER
        WHEN FALSE MOVE 'YES' TO RECORD-ERROR-SWITCH
                   MOVE 'PART NUMBER' TO FIELD-NAME
                   MOVE PART-NO-IN TO FIELD-VALUE
                   MOVE 'YES' TO PART-NO-ERROR
        END-EVALUATE
    WHEN FALSE MOVE 'YES' TO RECORD-ERROR-SWITCH
               MOVE 'PART NUMBER' TO FIELD-NAME
               MOVE PART-NO-IN TO FIELD-VALUE
               MOVE 'YES' TO PART-NO-ERROR
END-EVALUATE
I know I'm probably not doing this in a very efficient way but for now I just need to get it to work. I've read the whole chapter on data validation from the book and this is sort of a last minute error (program is due tomorrow) so the teacher is unavailable. I would greatly appreciate any help I can get with this. I'm really lost on how I'm supposed to validate capital letters. The method I'm using now reports an error if anything other than A or Z is in the 6th column of the part number.

I don't see anything fundamentally wrong with your code. I put it into a
driver program, compiled and ran it. I got the expected results: Error reported only
when the 6th character of PART-NUMBER was not an upper case letter.
Your COBOL coding style is very different from what I am used to seeing (not wrong, just
different).
Most veteran COBOL programmers would code something like:
IF PART-LETTER-DL IS ALPHABETIC AND
   CAPITAL-LETTER
    CONTINUE
ELSE
    MOVE 'PART NUMBER' TO FIELD-NAME
    MOVE PART-NO-IN TO FIELD-VALUE
    MOVE 'YES' TO PART-NO-ERROR
END-IF
The IF applies both of your edit criteria and does nothing if both pass (CONTINUE), otherwise
an error is reported (ELSE part). The above does essentially the same thing your code
example does except using IF as opposed to EVALUATE.
I give you full marks for testing both ALPHABETIC and capital letter
using an 88 level range (THRU). A lot of programmers would only use the 88 level, making the
implicit assumption that 'A' THRU 'Z' covers only alphabetic characters - this is dead wrong
in some environments (EBCDIC character sets in particular).
P.S. I see you guys must have the same teacher that Kimmy had!

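One thing you should be concerned about is the "VALUE 'A' THRU 'Z'". It only behaves as intended where the capital letters are contiguous in the collating sequence, as in ASCII; in EBCDIC the range also takes in some characters that are not letters.
If you actually code VALUE 'A', 'B', 'C', ... 'Z', it will work on all platforms.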

For capital letters you can test the ALPHABETIC-UPPER condition:
IF PART-LETTER-DL NOT EQUAL SPACE AND PART-LETTER-DL IS ALPHABETIC-UPPER
...
END-IF.
ALPHABETIC-LOWER can be used too, but remember that SPACE is considered ALPHABETIC, so testing SPACE is necessary, if you just want capital letters.
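Putting the three rules from the question together, a minimal sketch might look like the one below. The program name PARTCHK, the shortened data names and the sample part numbers are made up for illustration. It assumes a compiler that supports the ALPHABETIC-UPPER class test (COBOL 85 and later) and fixed-format source.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PARTCHK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  PART-NUMBER.
           05  PART-FIRST-FIVE     PIC X(5).
           05  PART-LETTER         PIC X.
           05  PART-LAST-TWO       PIC XX.
               88  LAST-TWO-IN-RANGE   VALUE '01' THRU '68'
                                             '78' THRU '99'.
       01  PART-NO-ERROR           PIC X(3).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Hypothetical test values: one good, two bad.
           MOVE '12345B42' TO PART-NUMBER
           PERFORM VALIDATE-PART-NUMBER
           MOVE '12345b42' TO PART-NUMBER
           PERFORM VALIDATE-PART-NUMBER
           MOVE '12345B70' TO PART-NUMBER
           PERFORM VALIDATE-PART-NUMBER
           STOP RUN.
       VALIDATE-PART-NUMBER.
           MOVE 'NO' TO PART-NO-ERROR
      *    Columns 1-5 must be digits.
           IF PART-FIRST-FIVE IS NOT NUMERIC
              MOVE 'YES' TO PART-NO-ERROR
           END-IF
      *    Column 6: SPACE also passes ALPHABETIC-UPPER, so it
      *    has to be rejected separately.
           IF PART-LETTER IS NOT ALPHABETIC-UPPER
              OR PART-LETTER = SPACE
              MOVE 'YES' TO PART-NO-ERROR
           END-IF
      *    Columns 7-8: digits only, and inside one of the ranges.
           IF PART-LAST-TWO IS NOT NUMERIC
              OR NOT LAST-TWO-IN-RANGE
              MOVE 'YES' TO PART-NO-ERROR
           END-IF
           DISPLAY PART-NUMBER ' ERROR: ' PART-NO-ERROR.
```

The NUMERIC test on the last two columns matters: the 88 level compares character strings, so without it a value such as '0A' could slip into the '01' THRU '68' range on some character sets.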

For EBCDIC, drop the ALPHABETIC test and just use the 88:
88 CAPITAL-LETTER VALUE 'A' THRU 'I'
'J' THRU 'R'
'S' THRU 'Z'.
Specifying individual letters works, but generates 26 comparisons! The above generates three. ALPHABETIC plus 'A' THRU 'Z' generates only two, but carries some built-in confusion (space is alphabetic, and the THRU includes values that are not letters in the range X'C1' to X'E9').

Related

Why does my COBOL working storage variable have trailing zeroes?

I'm building a COBOL program to calculate the average of up to 15 integers. The execution displays a number that is far bigger than intended with a lot of trailing zeroes. Here is the relevant code:
Data Division.
Working-Storage Section.
01 WS-COUNTER PIC 9(10).
01 WS-INPUT-TOTAL PIC 9(10).
01 WS-NEXT-INPUT PIC X(8).
01 WS-CONVERTED-INPUT PIC 9(8).
01 WS-AVG PIC 9(8)V99.
Procedure Division.
PROG.
PERFORM INIT-PARA
PERFORM ADD-PARA UNTIL WS-COUNTER = 15 OR WS-NEXT-INPUT = 'q'
PERFORM AVG-PARA
PERFORM END-PARA.
INIT-PARA.
DISPLAY 'This program calculates the average of inputs.'.
MOVE ZERO TO WS-COUNTER
MOVE ZERO TO WS-INPUT-TOTAL
MOVE ZERO TO WS-AVG.
ADD-PARA.
DISPLAY 'Enter an integer or type q to quit: '
ACCEPT WS-NEXT-INPUT
IF WS-NEXT-INPUT NOT = 'q'
MOVE WS-NEXT-INPUT TO WS-CONVERTED-INPUT
ADD WS-CONVERTED-INPUT TO WS-INPUT-TOTAL
ADD 1 TO WS-COUNTER
END-IF.
AVG-PARA.
IF WS-COUNTER > 1
DIVIDE WS-INPUT-TOTAL BY WS-COUNTER GIVING WS-AVG
DISPLAY 'Your average is ' WS-AVG '.' WS-NEXT-INPUT
END-IF.
The reason I put WS-NEXT-INPUT as alphanumeric and move it to a numeric WS-CONVERTED-INPUT if the IF condition is satisfied is because I want it to be able to take "q" to break the UNTIL loop, but after the condition is satisfied, I want a numeric variable for the arithmetical statements. Here's what it looks like with the numbers 10 and 15 as inputs:
10is program calculates the average of inputs.
Enter an integer or type q to quit:
15
Enter an integer or type q to quit:
q
Your average is 1250000000.
The console is a bit buggy so it forces me to input the 10 in that top left corner most of the time. Don't worry about that.
You see my problem in that execution. The result is supposed to be 00000012.50 instead of 1250000000. I tried inserting a few of my other variables into that display statement and they're all basically as they should be except for WS-INPUT-TOTAL which with that combination of numbers ends up being 0025000000 instead of 0000000025 as I would have expected. Why are these digits being stored in such a weird and unexpected way?

You have that strange output because of undefined behavior: computing with spaces.
The MOVE you present has the exact same USAGE and the same size on both sides, so the data is commonly taken over "as is". Nothing converts the trailing spaces by magic, so WS-CONVERTED-INPUT ends up containing 10 followed by six spaces. As the standard says for the MOVE:
De-editing takes place only when the sending operand is a numeric-edited data item and the receiving item is a numeric or a numeric-edited data item.
and even if it were an edited field, the MOVE should still raise an exception:
When a numeric-edited data item is the sending operand of a de-editing MOVE statement and the content of that data item is not a possible result for any editing operation in that data item, the result of the MOVE operation is undefined and an EC-DATA-INCOMPATIBLE exception condition is set to exist.
When computing with spaces you would commonly raise a fatal error, but it seems your compiler does not have that check activated (and because you didn't share your compile command or even your compiler, we can't help with that).
When such checks are not activated (they would otherwise lead to an abort), different COBOL dialects often substitute zero for invalid data, at least for spaces (but they can use anything). This then leads to WS-CONVERTED-INPUT being "seen as" 10000000, so your computation includes those big numbers.
So your program should work if you enter the necessary number of leading zeroes on input.
General:
"never trust input data - validate" (and error or convert as necessary)
at least if something looks suspicious - activate all runtime checks available, re-try.
Solution - Do an explicit conversion:
MOVE FUNCTION NUMVAL(WS-NEXT-INPUT) TO WS-CONVERTED-INPUT
This will strip surrounding spaces and then convert from left to right until invalid data is found. A good coder would also check up front using FUNCTION TEST-NUMVAL, otherwise you compute with zero if someone enters "TWENTY".
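As a rough sketch of that check-then-convert idea, assuming a compiler that provides FUNCTION TEST-NUMVAL (it was added in the 2002 standard and GnuCOBOL has it; older compilers may not): the program name NUMCHK, the paragraph name CONVERT-INPUT and the sample inputs are invented for the example. COMPUTE is used instead of MOVE because some compilers only accept numeric functions inside arithmetic expressions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NUMCHK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-NEXT-INPUT           PIC X(8).
       01  WS-CONVERTED-INPUT      PIC 9(8).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Hypothetical inputs standing in for the ACCEPT.
           MOVE '10' TO WS-NEXT-INPUT
           PERFORM CONVERT-INPUT
           MOVE 'TWENTY' TO WS-NEXT-INPUT
           PERFORM CONVERT-INPUT
           STOP RUN.
       CONVERT-INPUT.
      *    TEST-NUMVAL returns zero when the text is acceptable
      *    to NUMVAL, otherwise a non-zero position.
           IF FUNCTION TEST-NUMVAL(WS-NEXT-INPUT) = 0
              COMPUTE WS-CONVERTED-INPUT =
                      FUNCTION NUMVAL(WS-NEXT-INPUT)
              DISPLAY 'Converted: ' WS-CONVERTED-INPUT
           ELSE
              DISPLAY 'Not a number: ' WS-NEXT-INPUT
           END-IF.
```

Note that TEST-NUMVAL also accepts signs and decimal points, so if only unsigned integers are wanted an extra check is still needed before storing into an unsigned PIC 9(8) field.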

How to write a cobol code to do the below logic?

1) Read a line of 2000 characters and replace each run of SPACES with a single "+" character, i.e. convert "A B" to "A+B", and also convert "A", several spaces, "B" to "A+B".
2) Read a line of 2000 characters, then search for specific patterns such as "PWD" or "INI", and store the next 6 characters in a variable.
3) Read a line of 2000 characters and store the last word in the string to a variable.
Edit:
I use Micro Focus COBOL.
This is a screenshot of my piece of code so far.
My code is below. It removes a few spaces but not all. Try writing any sentence with random numbers of spaces between words in an input file as test data.
IDENTIFICATION DIVISION.
PROGRAM-ID. SALAUT.
ENVIRONMENT DIVISION.
FILE-CONTROL.
SELECT IN-FILE ASSIGN TO "INFILE"
ORGANIZATION IS LINE SEQUENTIAL
FILE STATUS IS WS-IN-FILE-STATUS.
SELECT OUT-FILE ASSIGN TO "OUTFILE"
ORGANIZATION IS LINE SEQUENTIAL
FILE STATUS IS WS-OUT-FILE-STATUS.
DATA DIVISION.
FILE SECTION.
FD IN-FILE.
01 FS-IN-FILE PIC X(200).
FD OUT-FILE.
01 FS-OUT-FILE PIC X(200).
WORKING-STORAGE SECTION.
01 WS-ATMA-C.
03 WS-OUT-FILE-STATUS PIC X(02).
03 WS-IN-FILE-STATUS PIC X(02).
03 WS-LOOP-COUNTER PIC 9(03) VALUE 1.
03 WS-IN-EOF PIC X value 'N'.
03 WS-IN-FILE-LEN PIC 9(03).
03 WS-IN-SPACE-CNT PIC 9(03) VALUE 1.
03 FS-IN-FILE-2 PIC X(200).
03 WS-TRIL-SPACE-CNT PIC 9(03).
03 WS-TOT-SPACE-CNT PIC 9(03).
PROCEDURE DIVISION.
MAIN-PARA.
OPEN INPUT IN-FILE.
IF WS-IN-FILE-STATUS <> '00'
EXHIBIT 'IN-FILE-OPEN-ERROR : STOP-RUN'
EXHIBIT NAMED WS-IN-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
OPEN OUTPUT OUT-FILE.
IF WS-OUT-FILE-STATUS <> '00'
EXHIBIT 'OUT-FILE-OPEN-ERROR : STOP-RUN'
EXHIBIT NAMED WS-OUT-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
PERFORM SPACE-REMOVER-PARA THRU SPACE-REMOVER-PARA-EXIT.
CLOSE IN-FILE.
IF WS-IN-FILE-STATUS <> '00'
EXHIBIT 'IN-FILE-CLOSE-ERROR : STOP-RUN'
EXHIBIT NAMED WS-IN-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
CLOSE OUT-FILE.
IF WS-OUT-FILE-STATUS <> '00'
EXHIBIT 'IN-FILE-CLOSE-ERROR : STOP-RUN'
EXHIBIT NAMED WS-OUT-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF.
MAIN-PARA-EXIT.
STOP RUN.
SPACE-REMOVER-PARA.
PERFORM UNTIL WS-IN-EOF = 'Y'
INITIALIZE FS-IN-FILE FS-OUT-FILE WS-IN-FILE-LEN FS-IN-FILE-2
READ IN-FILE
AT END
MOVE 'Y' TO WS-IN-EOF
NOT AT END
INSPECT FS-IN-FILE TALLYING WS-IN-FILE-LEN FOR CHARACTERS
EXHIBIT NAMED WS-IN-FILE-LEN
MOVE 1 TO WS-LOOP-COUNTER
IF WS-IN-FILE-LEN <> 0
PERFORM UNTIL WS-IN-SPACE-CNT <= ZEROS
INSPECT FS-IN-FILE TALLYING WS-TOT-SPACE-CNT FOR ALL " "
INSPECT FUNCTION REVERSE (FS-IN-FILE) TALLYING
WS-TRIL-SPACE-CNT FOR LEADING " "
INITIALIZE WS-IN-SPACE-CNT
COMPUTE WS-IN-SPACE-CNT =
WS-TOT-SPACE-CNT - WS-TRIL-SPACE-CNT
PERFORM VARYING WS-LOOP-COUNTER FROM 1 BY 1
UNTIL WS-LOOP-COUNTER >=
WS-IN-FILE-LEN - (2 * WS-TRIL-SPACE-CNT)
IF FS-IN-FILE(WS-LOOP-COUNTER:2) = " "
STRING FS-IN-FILE(1:WS-LOOP-COUNTER - 1) DELIMITED BY SIZE
FS-IN-FILE(WS-LOOP-COUNTER + 2
: WS-IN-FILE-LEN - WS-LOOP-COUNTER - 2)
DELIMITED BY SIZE
INTO FS-IN-FILE-2
END-STRING
INITIALIZE FS-IN-FILE
MOVE FS-IN-FILE-2 TO FS-IN-FILE
INITIALIZE FS-IN-FILE-2
END-IF
END-PERFORM
INITIALIZE WS-LOOP-COUNTER WS-TRIL-SPACE-CNT WS-TOT-SPACE-CNT
END-PERFORM
WRITE FS-OUT-FILE FROM FS-IN-FILE
IF WS-OUT-FILE-STATUS <> '00'
EXHIBIT 'OUT-FILE-WRITE-ERROR : STOP-RUN'
EXHIBIT NAMED WS-OUT-FILE-STATUS
PERFORM MAIN-PARA-EXIT
END-IF
END-IF
END-READ
END-PERFORM.
SPACE-REMOVER-PARA-EXIT.
EXIT.

Because INSPECT REPLACING can only replace a string with one of the same number of bytes, you cannot use it here. As Brian pointed out, your COBOL runtime may come with options like GnuCOBOL's FUNCTION SUBSTITUTE. In any case, the question "Which COBOL?" is still worth answering.
To follow Thraydor's approach, UNSTRING into a table using a string pointer. Something along the lines of:
MOVE 1 TO strpoint
PERFORM VARYING table-idx FROM 1 BY 1
        UNTIL table-idx = table-max
    UNSTRING your2000line DELIMITED BY ALL SPACES
        INTO tmp-table (table-idx)
        WITH POINTER strpoint
        NOT ON OVERFLOW
            EXIT PERFORM
    END-UNSTRING
END-PERFORM
Another approach which always works is a simple PERFORM over the 2000 bytes with a bunch of IF your2000line (pos:1) statements (if possible, combine them into a single EVALUATE), checking byte by byte (comparing with the previous byte to drop the duplicates), transferring the source with replacements to a temporary field, and MOVEing it back once you're finished.
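A minimal sketch of that byte-by-byte approach could look like this. All names (PLUSJOIN, WS-SOURCE-LINE, WS-GAP-PENDING and so on) and the sample text are assumptions for illustration. Instead of comparing with the previous byte directly, it remembers in a flag that a run of spaces was seen, which also keeps leading and trailing spaces from producing a "+".

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PLUSJOIN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-SOURCE-LINE          PIC X(2000).
       01  WS-RESULT-LINE          PIC X(2000).
       01  WS-SOURCE-POS           PIC 9(4).
       01  WS-RESULT-POS           PIC 9(4).
       01  WS-GAP-PENDING          PIC X.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Hypothetical sample record.
           MOVE 'A   B  C D' TO WS-SOURCE-LINE
           PERFORM COLLAPSE-SPACES
           DISPLAY WS-RESULT-LINE(1:40)
           STOP RUN.
       COLLAPSE-SPACES.
           MOVE SPACES TO WS-RESULT-LINE
           MOVE 0 TO WS-RESULT-POS
           MOVE 'N' TO WS-GAP-PENDING
           PERFORM VARYING WS-SOURCE-POS FROM 1 BY 1
                   UNTIL WS-SOURCE-POS > 2000
              IF WS-SOURCE-LINE(WS-SOURCE-POS:1) = SPACE
      *          Only remember the gap; nothing is written yet.
                 MOVE 'Y' TO WS-GAP-PENDING
              ELSE
      *          One "+" per run of spaces, and only between
      *          two pieces of text.
                 IF WS-GAP-PENDING = 'Y' AND WS-RESULT-POS > 0
                    ADD 1 TO WS-RESULT-POS
                    MOVE '+' TO WS-RESULT-LINE(WS-RESULT-POS:1)
                 END-IF
                 ADD 1 TO WS-RESULT-POS
                 MOVE WS-SOURCE-LINE(WS-SOURCE-POS:1)
                   TO WS-RESULT-LINE(WS-RESULT-POS:1)
                 MOVE 'N' TO WS-GAP-PENDING
              END-IF
           END-PERFORM.
```

With the sample text the result field should start with A+B+C+D. The result can never be longer than the source, so a 2000-byte work field is enough.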
Please edit your question to show exactly what you've tried, and you can get much better answers.

Firstly, bear in mind that COBOL is a language of dialects. There are also active commercial compilers which target the 1974, 1985, 2002 (now obsolete, incorporated in 2014) and 2014 Standards, all with their own Language Extensions, which may or may not be honoured in a different COBOL compiler.
If you are targeting your learning to a particular environment (IBM Mainframe COBOL you have said) then use that dialect as a subset of what is available to you in the actual COBOL you are using. Which means using the IBM Manuals.
Don't pick and choose stuff from various places and use it just because it somehow seemed like a good idea at the time.
I have to admit that EXHIBIT was great fun to use, but it was only ever a Language Extension, and IBM dropped it by at least the later releases of OS/VS COBOL. It, like ON, was a "debugging" statement, although that didn't prevent their being used "normally". There's additional overhead to using EXHIBIT over a simple DISPLAY. IBM Enterprise COBOL only has a simple DISPLAY.
Whilst you may think it fun to use pictograms (the "oh my goodness, what symbol should I use for this" of a figure attempting to pull his own hair out) be aware that that particular symbol was a latecomer to the 2014 Standard, and if it appears in Enterprise COBOL within the next 20 to 50 years I'd be surprised (very low on the list of things to do, another cute way to write "not equal to" when many already exist, and COBOL even has an ELSE).
Some pointers. Don't have a procedure called "remove-all-the-spaces" if what it does itself is "everything-including-install-a-new-kitchen-sink". Is it any wonder you can't find why it doesn't work?
Many, many, many COBOL programs have the task of reading a file, until the end, and processing the records in the file. Get yourself one of those working well first. Is that relevant to the "business process" the program is addressing? No, it's just technical stuff, which you can't do without so hide it somewhere. Where? in PERFORMed procedures (paragraphs or SECTIONS). Don't expect someone who quickly wants to know what your program is doing to want to read the stuff which every program does. Hide it.
You can find quite a bit of general advice here about writing COBOL programs. Pay attention to those which advise of the use of full-stops/periods, priming reads, and the general structure of COBOL programs.
It is very important to describe things accurately. Work on good, descriptive, accurate names for data-names and procedures. A file is a collection of records.
You have cut down the size of your data to make testing easier, without realising that you have a problem with your data-definitions when you go back to full-length data. Your "counters" can only hold three digits, when they need to be able to cope with the numbers up to 2000.
There is no point in doing something to a piece of data, and then immediately squishing that something with something else which is not related in any way to the original something.
MOVE SPACE TO B
MOVE A TO B
The first MOVE is redundant, superfluous, and does nothing but suck up CPU time and confuse the next reader of your program. "Is there some code missing, because otherwise that's just plain dumb".
This is a variant of that example with the MOVE, and you are doing this all over the place:
INITIALIZE WS-IN-SPACE-CNT
COMPUTE WS-IN-SPACE-CNT =
WS-TOT-SPACE-CNT - WS-TRIL-SPACE-CNT
The INITIALIZE is a waste of space, resources, and an introducer of confusion, and extra lines of code to make your program more difficult to understand.
Also, don't "reset" things after they are used, so that they are "ready for next time". That creates dependencies which a future amender of your program will not expect. Even when expected/noticed, they make the code harder to follow.
Exactly what is wrong with your code is impossible to say without knowing what you think is wrong with it. For instance, there is not even a sign of a "+" replacing any spaces, so if you feel that is what is wrong, you simply haven't coded for it.
You've also only attempted one of the three tasks. If one of those not working is what you think is wrong...
Knowing what you think is wrong is one thing, but there are a lot of other problems. If you sit down and sort those out, methodically, then you'll come up with a "structured" COBOL program in which you'll find it easier to understand what your own code does, and where problems lie.
A B C D E
A+B+C+D+E
To get from the first to the second using STRING, look into Simon's suggestion to use WITH POINTER.
Another approach you could take would be using reference-modification.
Either way, you'd build your result field a piece at a time:
This field intentionally blank
A
A+B
A+B+C
A+B+C+D
A+B+C+D+E
Rather than tossing all the data around each time. There are also other ways to code it, but that can be for later.
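To make the piece-at-a-time idea concrete, here is one possible sketch using UNSTRING and STRING, each WITH POINTER. The names (PTRJOIN, WS-WORD, WS-SOURCE-PTR, WS-RESULT-PTR) and the sample text are assumptions, not anything from the question. Each pass pulls one word out of the source and appends it to the result, with a "+" in front of every word except the first.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PTRJOIN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-SOURCE-LINE          PIC X(2000).
       01  WS-RESULT-LINE          PIC X(2000).
       01  WS-WORD                 PIC X(2000).
       01  WS-SOURCE-PTR           PIC 9(4).
       01  WS-RESULT-PTR           PIC 9(4).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Hypothetical sample record.
           MOVE 'A   B  C D' TO WS-SOURCE-LINE
      *    STRING leaves the rest of the target untouched, so
      *    the result has to start out blank.
           MOVE SPACES TO WS-RESULT-LINE
           MOVE 1 TO WS-SOURCE-PTR
           MOVE 1 TO WS-RESULT-PTR
           PERFORM UNTIL WS-SOURCE-PTR > 2000
      *       Take the text up to the next run of spaces; the
      *       pointer is left just past that run.
              UNSTRING WS-SOURCE-LINE DELIMITED BY ALL SPACE
                  INTO WS-WORD
                  WITH POINTER WS-SOURCE-PTR
              END-UNSTRING
              IF WS-WORD NOT = SPACES
                 IF WS-RESULT-PTR > 1
                    STRING '+' DELIMITED BY SIZE
                        INTO WS-RESULT-LINE
                        WITH POINTER WS-RESULT-PTR
                    END-STRING
                 END-IF
                 STRING WS-WORD DELIMITED BY SPACE
                     INTO WS-RESULT-LINE
                     WITH POINTER WS-RESULT-PTR
                 END-STRING
              END-IF
           END-PERFORM
           DISPLAY WS-RESULT-LINE(1:40)
           STOP RUN.
```

Because the last word found is still sitting in WS-WORD when the loop ends, the same loop is also a starting point for the third task (keeping the last word of the line).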

Change display format from character mode to numeric mode

The value in variable VAR is -1, and when I am trying to write to a file, it gets displayed as J(character mode), which is equivalent to -1.
The VAR is defined in Cobol program copybook as below:
10 VAR PIC S9(1).
Is there any way, to change the display format from character "J" to -1, in the output file.
The information which I found by googling is below:
Value +0 Character {
Value -0 Character }
Value +1 Character A
To convert the zoned ASCII field which results from an EBCDIC to ASCII character translation to a leading sign numeric field, inspect the last digit in the field. If it's a "{" replace the last digit with a 0 and make the number positive. If it's an "A" replace the last digit with a 1 and make the number positive, if it's a "B" replace the last digit with a 2 and make the number positive, etc., etc. If the last digit is a "}" replace the last digit with a 0 and make the number negative. If it's a "J" replace the last digit with a 1 and make the number negative, if it's a "K" replace the last digit with a 2 and make the number negative, etc., etc. Follow these rules for all possible values. You could do this with a look-up table or with IF or CASE statements. Use whatever method suits you best for the language you are using. In most cases you should put the sign immediately before the first digit in the field. This is called a floating sign, and is what most PC programs expect. For example, if your field is 6 bytes, the value -123 should read " -123" not "- 123".

It might be simpler to move it to an EBCDIC output (display) field so that it's just EBCDIC characters, and then convert that to ASCII and write it.
For example
10 VAR PIC S9(1).
10 WS-SEPSIGN PIC S9(1) SIGN IS LEADING SEPARATE.
10 WS-DISP REDEFINES WS-SEPSIGN
PIC XX.
MOVE VAR TO WS-SEPSIGN.
Then convert WS-DISP to ASCII using a standard lookup table and write it to the file.

If you are sending data from an EBCDIC machine to an ASCII machine, or vice versa, by far the best way is to only deal with character data. You can then let the transfer/communication mechanism do the ASCII/EBCDIC translation at record/file level.
Field-level translation is possible, but is much more prone to error (fields must be defined, accurately, for everything) and is slower (many translations versus one).
The SIGN clause is a very good way to do this. There is no need to REDEFINES the field (again you get to issues with field-definitions, two places to change if the size is changed).
There is a similar issue with decimal places where they exist. Where source and data definitions are not the same, an explicit decimal-point has to be provided, or a separate scaling-factor.
Both issues, and the original issue, can also be dealt with by using numeric-edited definitions.
01 transfer-record.
...
05 numeric-edited-VAR1 PIC +9.
...
With positive one, that will contain +1, with negative one, that will contain -1.
Take an amount field:
01 VAR2 PACKED-DECIMAL PIC S9(7)V99.
...
01 transfer-record.
...
05 numeric-edited-VAR2 PIC +9(7).99.
...
For 4567.89, positive, the new field will contain +0004567.89. For the same value, but negative, -0004567.89.
The code on the Source-machine is:
MOVE VAR1 TO numeric-edited-VAR1
MOVE VAR2 TO numeric-edited-VAR2
And on the target (in COBOL)
MOVE numeric-edited-VAR1 TO VAR1
MOVE numeric-edited-VAR2 TO VAR2
The code is the same if you use the SIGN clause for fields without decimal places (or with decimal places if you want the danger of being implicit about it).
Another thing with field-level translation is that Auditors don't/shouldn't like it. "The first thing you do when the data arrives is you change it? Really?" says the Auditor.
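For anyone who wants to try the numeric-edited approach on the sending side, a small self-contained sketch follows. The program name SIGNDEMO, the upper-case field names, the one-byte FILLER separator and the test values are assumptions made for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SIGNDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  VAR1                    PIC S9(1).
       01  VAR2                    PIC S9(7)V99 PACKED-DECIMAL.
       01  TRANSFER-RECORD.
           05  NUMERIC-EDITED-VAR1 PIC +9.
           05  FILLER              PIC X VALUE SPACE.
           05  NUMERIC-EDITED-VAR2 PIC +9(7).99.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Hypothetical values: a negative one and a negative
      *    amount with two decimal places.
           MOVE -1 TO VAR1
           MOVE -4567.89 TO VAR2
      *    The MOVE does the editing: explicit sign character
      *    and an explicit decimal point, all plain characters.
           MOVE VAR1 TO NUMERIC-EDITED-VAR1
           MOVE VAR2 TO NUMERIC-EDITED-VAR2
           DISPLAY TRANSFER-RECORD
           STOP RUN.
```

The record should display as -1 -0004567.89. Since every byte is an ordinary character, a record-level EBCDIC/ASCII translation leaves it readable on the other side.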

What's wrong with this alphanumeric to numeric move?

When I move a number in a PIC X to a PIC 9 the numeric field's value is 0.
FOO, a PIC X(400), has '1' in the first byte and spaces in the remaining 399. Moving into the PIC 9(02) BAR like so
DISPLAY FOO
MOVE FOO to BAR
DISPLAY BAR
yields
1
0
Why is BAR 0 instead of 1? [Edit: originally, 'What is happening?']
Postscript: NealB says "Do not write programs that rely on obscure truncation rules and/or
data type coercion. Be precise and explicit in what you are doing."
That made me realize I really want COMPUTE BAR = FUNCTION NUMVAL(FOO) wrapped in a NUMERIC test, not a MOVE.

Data MOVEment in COBOL is a complex subject, but here is
a simplified answer to your question. Some data movement rules
are straightforward and conform to what one might expect. Others are somewhat bizarre and may vary with
compiler option, vendor and possibly among editions of the COBOL standard (74, 85, 2002).
With the above in mind, here is an explanation of what happened in your example. Note that the names
used here follow my example program further down, where BAR is the long PIC X field and FOO is the PIC 9 field (the reverse of your question).
When something 'large' is
MOVEd into something 'small' truncation must occur. This is what happened when BAR was MOVEd to FOO. How that
truncation occurs is determined by the receiving item's
data type. When the receiving item is character data (PIC X), the rightmost characters will be truncated from the sending field.
For numeric data the leftmost digits are truncated from the sending field. This behaviour is pretty much universal for all COBOL
compilers.
As a consequence of these rules:
When a long 'X' field (BAR) starting with a '1' followed by a bunch of space characters is MOVEd
into a shorter 'X' field the leftmost characters are transferred. This is why the '1' would be preserved when moving to another PIC X
item.
When a long 'X' field (BAR) is moved to a '9' (numeric) datatype the rightmost characters are moved first. This is why '1' was lost, it was never
moved, the last two spaces in BAR were.
So far simple enough... The next bit is more complicated. Exactly what happens is vendor, version, compiler option and character set
specific. For the remainder of this example I will assume EBCDIC character sets and the IBM Enterprise COBOL compiler are being used. I
also assume your program displayed b0 and not 0b.
It is universally legal in COBOL to move PIC X data to PIC 9 fields provided the PIC X field contains only digits. Most
COBOL compilers only look at the lower 4 bits of a PIC 9 field when determining its numeric value. An exception is the least
significant digit where the sign, or lack of one, is stored. For unsigned numerics the upper 4 bits of the least significant digit
are set to 1's (hex F) as a result of the MOVE (coercion follows different rules for signed fields). The lower 4 bits are MOVEd without
coercion. So, what happens when a space character is moved into a PIC 9 field? The hex
representation of a SPACE is '40' (ebcdic). The upper 4 bits, '4', are flipped to 'F' and the lower 4 bits are moved as they are. This results in the
least significant digit (lsd) containing 'F0' hex. This just happens to be the unsigned numeric representation for the digit '0' in a PIC 9 data item.
The remaining leading digits are moved as they are (ie. '40' hex). The net result is that FOO displays as
b0. However, if you were to do anything other than 'MOVE' or 'DISPLAY' FOO, the upper 4 bits of the remaining 'digits' may be coerced to zeroes as a
result. This would flip their display characteristics from spaces to zeros.
The following example COBOL program and its output illustrates these points.
IDENTIFICATION DIVISION.
PROGRAM-ID. EXAMPLE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01.
    05 BAR PIC X(10).
    05 FOO PIC 9(2).
    05 FOOX PIC X(2).
PROCEDURE DIVISION.
    MOVE '1 ' TO BAR
    MOVE BAR TO FOO
    MOVE BAR TO FOOX
    DISPLAY 'FOO : >' FOO '< Leftmost truncation + lsd coercion'
    DISPLAY 'FOOX: >' FOOX '< Rightmost truncation'
    ADD ZERO TO FOO
    DISPLAY 'FOO : >' FOO '< full numeric coercion'
    GOBACK
    .
Output:
FOO : > 0< Leftmost truncation + lsd coercion
FOOX: >1 < Rightmost truncation
FOO : >00< full numeric coercion
Final words... Best not to have to know anything about this sort of thing. Do not write programs that rely on obscure truncation
rules and/or data type coercion. Be precise and explicit in what you are doing.

Firstly, why do you think it might be useful to MOVE a 400-byte field to a two-byte field? You are going to get a "certain amount(!)" of "truncation" with that (and the amount of truncation is certain, at 398 bytes). Do you know which part of your 400 bytes is going to be truncated? I'd guess not.
For an alpha-numeric "sending" item (what you have), the (maximum) number of bytes used is the maximum number of bytes in a numeric field (18/31 depending on compiler/compiler option). Those bytes are taken from the right of the alpha-numeric field.
You have, therefore, MOVEd the rightmost 18/31 digits to the two-digit receiving field. You have already explained that you have "1" and 399 spaces, so you have MOVEd 18/31 spaces to your two-digit numeric field.
Your numeric field is "unsigned" (PIC 9(2) not PIC S9(2) or with a SIGN SEPARATE). For an unsigned field (which is a field with "no operational sign") a COBOL compiler should generate code to ensure that the field contains no sign.
This code will turn the right-most space in your PIC 9(2) into a "0" because an ASCII space is X'20' and an EBCDIC space is X'40'. The "sign" is embedded in the right-most byte of a USAGE DISPLAY numeric field, and no other data but the sign is changed during the MOVE. The 2 or 4 of X'2n' or X'4n' is, without regard to its value, obliterated to the bit-pattern for an "unsign" (the lack of an "operational sign"). An "unsign" followed by a numeric digit (which is the '0' left over from the space) will, obviously, appear as a zero.
Now, you show a single "1" for your 400-byte field and a single 0 for your two-byte numeric.
What I do is this:
DISPLAY
">"
the-first-field-name
"<"
">"
the-second-field-name
"<"
...
or
DISPLAY
">"
the-first-field-name
"<"
DISPLAY
">"
the-second-field-name
"<"
...
If you had done that, you should find 1 followed by 399 spaces for your first field (as you would expect) and space followed by zero for your second field, which you didn't expect.
If you want to specifically see this in operation:
FOO PIC X(400) JUST RIGHT.
MOVE "1" TO FOO
MOVE FOO TO BAR
DISPLAY
">"
FOO
"<"
DISPLAY
">"
BAR
"<"
And you should see what you "almost" expect. You probably want the leading zero as well (the level-number 05 is an example, whatever level-number you are using will work).
05 BAR PIC 99.
05 FILLER REDEFINES BAR.
10 BAR-FIRST-BYTE PIC X.
88 BAR-FIRST-BYTE-SPACE VALUE SPACE.
10 FILLER PIC X.
...
IF BAR-FIRST-BYTE-SPACE
MOVE ZERO TO BAR-FIRST-BYTE
END-IF
Depending on your compiler and how close it is to ANSI Standard (and which ANSI Standard) your results may differ (if so, try to get a better compiler), but:
Don't MOVE alpha-numeric which are longer than the maximum a numeric can be to a numeric;
Note that in the MOVE alpha-numeric to numeric it is the right-most bytes of the alpha-numeric which are actually moved first;
An "unsigned" numeric should/must always remain unsigned;
Always check for compiler diagnostics and correct the code so that no diagnostics are produced (where possible);
When showing examples, it is highly important to show the actual results the computer produced, not the results as interpreted by a human. " 0" is not the same as "0 " is not the same as "0".
EDIT: Looking at TS's other questions, I think Enterprise COBOL is a safe bet. This message would have been issued by the compiler:
IGYPG3112-W Alphanumeric or national sending field "FOO" exceeded 18 digits. The rightmost 18 characters were used as the sender.
Note, the "18 digits" would have been "31 digits" with compiler option ARITH(EXTEND).
Even though it is a lowly "W" which only gives a Return Code of 4, not bothering to read it is not good practice, and if you had read it you'd not have needed to ask the question - although perhaps you'd still not know how you ended up with " 0", but that is another thing.

I gather you expect the 9(2) value to show up as "1" instead of "0", and you are confused as to why it does not?
You are moving values from left to right when you move from an X value (unless the destination value changes things). So the 9 value has a space in it. To simplify it, moving "X(2) value '1 '" to a 9(2) value literally moves those characters. The space makes what is in the 9(2) invalid, so the COBOL compiler does with it what it knows to do, return 0. In other words, defining the 9(2) as it does tells the compiler to interpret the data in a different way.
If you want the 9(2) to show up as "1", you have to present the data in the right way to the 9(2). A 9(2) with a value of 1 has the characters "01". Untested:
03 FOO PIC X(2) value '1'.
03 TEXT-01 PIC X(2) JUSTIFIED RIGHT.
03 NUMB-01 REDEFINES TEXT-01 PIC 9(2).
03 BAR PIC 9(2).
DISPLAY FOO.
MOVE FOO TO TEXT-01.
INSPECT TEXT-01 REPLACING LEADING ' ' BY '0'.
MOVE NUMB-01 TO BAR.
DISPLAY BAR.
Using the NUMERIC test against BAR in your example should fail as well...

Data Validation

So I have entered my second semester of College and they have me doing a course called Advanced COBOL. As one of my assignments I have to make a program that tests certain things in a file to make sure the input has no errors. I get the general idea but there are just a few things I don't understand and my teacher is one of those people who will give you an assignment and make you figure it out yourself with little or no help. So here is what I need help with.
I have a field that the first 5 columns have to be numbers, the 6th column a capital letter and the last 2 numbers in a range of 01-68 or 78-99.
One of my fields has to be a string of numbers with a dash in it like 00000-000, but some have more than one dash. How can I count the dashes to identify that there is a problem?

Here are a few hints...
Use a hierarchical record structure to view the data in different ways. For example:
01 ITEM-REC.
   05 ITEM-CODE.
      10 ITEM-NUM-CODE PIC 9(3).
      10 ITEM-CHAR-CODE PIC A(3).
         88 ITEM-TYPE-A VALUE 'AAA' THRU 'AZZ'.
         88 ITEM-TYPE-B VALUE 'BAA' THRU 'BZZ'.
   05 QUANTITY PIC 9(4).
ITEM-CODE is a 6 character group field, the first part of which is numeric (ITEM-NUM-CODE) and the last part
is alphabetic (ITEM-CHAR-CODE). You can refer to any one of these three variables in your program. When you
refer to ITEM-CODE, or any other group item, COBOL
treats the variable as if it were declared as PIC X. This means you can
MOVE just about anything into it without raising an error. For example:
MOVE 'ABCdef' TO ITEM-CODE
or
MOVE 'ABCdef0005' TO ITEM-REC
Neither one would cause an error even though the elementary data item ITEM-NUM-CODE is definitely not a number.
To verify the validity
of your data after a group move you should validate each elementary data item separately (unless
you know for certain no data type errors could have occurred). There are a variety of ways to do this. For
example if the data item has to be numeric the following would work:
IF ITEM-NUM-CODE IS NUMERIC
    CONTINUE
ELSE
    DISPLAY 'ITEM-NUM-CODE IS NOT NUMERIC'
    PERFORM BIG-BAD-ERROR
END-IF
COBOL provides various class tests which can be applied against a data item. For
example: NUMERIC, ALPHABETIC, ALPHABETIC-UPPER and ALPHABETIC-LOWER are commonly used.
Another common way to test for ranges of values is by defining various 88 levels - but exercise
caution. In the above
example ITEM-TYPE-A is an 88 level that defines a data range from 'AAA' through 'AZZ' based on
the collating sequence currently in effect. To verify that ITEM-CHAR-CODE contains only alphabetic
characters and the first letter is an 'A' or a 'B', you could do something like:
IF ITEM-CHAR-CODE ALPHABETIC
    DISPLAY 'ITEM-CHAR-CODE is alphabetic.'
    EVALUATE TRUE
        WHEN ITEM-TYPE-A
            DISPLAY 'ITEM-CHAR-CODE is in range AAA through AZZ'
        WHEN ITEM-TYPE-B
            DISPLAY 'ITEM-CHAR-CODE is in range BAA through BZZ'
        WHEN OTHER
            DISPLAY 'ITEM-CHAR-CODE is in some other range'
    END-EVALUATE
ELSE
    DISPLAY 'ITEM-CHAR-CODE is not alphabetic'
END-IF
Note the separate test for ALPHABETIC above. Why do that when the 88 level tests
could have done the job? Actually the 88's are not sufficient because they
cover the entire range from AAA through AZZ based on the collating sequence currently
in effect. In
an EBCDIC based environment (a very large number of COBOL shops use EBCDIC) this captures
values such as A}\. The close-brace and backslash characters are non-alpha but
fall into the middle of
the range 'A' through 'Z' (what the #*#! is that all about?). Also note that a value such
as 'aaa' would not satisfy the ITEM-TYPE-A condition because lower case letters fall outside
the defined range. Maybe time to check out an EBCDIC character table.
Finally, you can count the number of occurrences of a character, or string of characters, in
a variable with the INSPECT verb as follows:
INSPECT ITEM-CODE TALLYING DASH-COUNT FOR ALL '-'
DASH-COUNT needs to be a numeric item, set to zero beforehand because INSPECT adds to it rather than resetting it, and will then contain the number of dash characters in ITEM-CODE. The INSPECT
verb is not so useful if you want to count the number of digits. For this you would need one statement for each digit.
It might be easier to just code a loop something like:
PERFORM VARYING I FROM 1 BY 1
        UNTIL I > LENGTH OF ITEM-CODE
    EVALUATE ITEM-CODE(I:1)
        WHEN '-'
            COMPUTE DASH-COUNT = DASH-COUNT + 1
        WHEN '0' THRU '9'
            COMPUTE DIGIT-COUNT = DIGIT-COUNT + 1
        WHEN OTHER
            COMPUTE OTHER-COUNT = OTHER-COUNT + 1
    END-EVALUATE
END-PERFORM
Now ask yourself why I was comfortable using a zero through 9 range check? Hint: look at the collating sequence.
Hope this helps.
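For the dash question specifically, a small sketch of the INSPECT approach might look like this. The program name DASHCHK, the field name ZIP-FIELD, the paragraph name and the sample values are invented for illustration; the field is assumed to be 9 bytes, as in the 00000-000 example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DASHCHK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ZIP-FIELD               PIC X(9).
       01  DASH-COUNT              PIC 9(2).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Hypothetical values: one dash, then two dashes.
           MOVE '00000-000' TO ZIP-FIELD
           PERFORM CHECK-DASHES
           MOVE '000-0-000' TO ZIP-FIELD
           PERFORM CHECK-DASHES
           STOP RUN.
       CHECK-DASHES.
      *    INSPECT adds to the counter, so clear it first.
           MOVE ZERO TO DASH-COUNT
           INSPECT ZIP-FIELD TALLYING DASH-COUNT FOR ALL '-'
           IF DASH-COUNT = 1
              DISPLAY ZIP-FIELD ' has exactly one dash'
           ELSE
              DISPLAY ZIP-FIELD ' is invalid, dashes: ' DASH-COUNT
           END-IF.
```

This only counts the dashes. Checking that the dash sits in the right column and that everything around it is a digit would still need the group-item and NUMERIC techniques described above.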
