The value in variable VAR is -1, and when I am trying to write to a file, it gets displayed as J(character mode), which is equivalent to -1.
The VAR is defined in Cobol program copybook as below:
10 VAR PIC S9(1).
Is there any way, to change the display format from character "J" to -1, in the output file.
The information which I found by googling is below:
Value +0 Character {
Value -0 Character }
Value +1 Character A
To convert the zoned ASCII field which results from an EBCDIC to ASCII character translation to a leading sign numeric field, inspect the last digit in the field. If it's a "{" replace the last digit with a 0 and make the number positive. If it's an "A" replace the last digit with a 1 and make the number positive, if it's a "B" replace the last digit with a 2 and make the number positive, etc., etc. If the last digit is a "}" replace the last digit with a 0 and make the number negative. If it's a "J" replace the last digit with a 1 and make the number negative, if it's a "K" replace the last digit with a 2 and make the number negative, etc., etc. Follow these rules for all possible values. You could do this with a look-up table or with IF or CASE statements. Use whatever method suits you best for the language you are using. In most cases you should put the sign immediately before the first digit in the field. This is called a floating sign, and is what most PC programs expect. For example, if your field is 6 bytes, the value -123 should read " -123" not "- 123".
It might be simpler to move it to an EBCDIC output (display) field so that its just EBCDIC characters, and then convert that to ASCII and write it.
For example
10 VAR PIC S9(1).
10 WS-SEPSIGN PIC S9(1) SIGN IS LEADING SEPARATE.
10 WS-DISP REDEFINES WS-SEPSIGN
PIC XX.
MOVE VAR TO WS-SEPSIGN.
Then convert WS-OUT to ASCII using a standard lookup table and write it to the file.
If you are sending data from an EBCDIC machine to an ASCII machne, or vice versa, by far the best way is to only deal with character data. You can then let the transfer/communication mechanism do the ASCII/EBCDIC translation at record/file level.
Field-level translation is possible, but is much more prone to error (fields must be defined, accurately, for everything) and is slower (many translations versus one).
The SIGN clause is a very good way to do this. There is no need to REDEFINES the field (again you get to issues with field-definitions, two places to change if the size is changed).
There is a similar issue with decimal places where they exist. Where source and data definitions are not the same, an explicit decimal-point has to be provided, or a separate scaling-factor.
Both issues, and the original issue, can also be dealt with by using numeric-edited definitions.
01 transfer-record.
...
05 numeric-edited-VAR1 PIC +9.
...
With positive one, that will contain +1, with negative one, that will contain -1.
Take an amount field:
01 VAR2 PACKED-DECIMAL PIC S9(7)V99.
...
01 transfer-record.
...
05 numeric-edited-VAR2 PIC +9(7).99.
...
For 4567.89, positive, the new field will contain +0004567.79. For the same value, but negative, -0004567.79.
The code on the Source-machine is:
MOVE VAR1 TO numeric-edited-VAR1
MOVE VAR2 TO numeric-edited-VAR2
And on the target (in COBOL)
MOVE numeric-edited-VAR1 TO VAR1
MOVE numeric-edited-VAR2 TO VAR2
The code is the same if you use the SIGN clause for fields without decimal places (or with decimal places if you want the danger of being implicit about it).
Another thing with field-level translation is that Auditors don't/shouldn't like it. "The first thing you do when the data arrives is you change it? Really?" says the Auditor.
Related
I realize I have asked a similar question before, but the whole thing is more complicated than I thought.
To cut to the chase, I need to convert a string that contains numbers and letters into a string that only contains numbers, while keeping the numbers that were already there, in the right position.
The letters need to be converted to their corresponding position in the Alphabet + 9. So, A = 10, B= 11.... Z = 35.
So, basically, a string that looks like this:
'GB00LOYD1023456789A1B2'
will have to become:
'161100212429131023456789101112'.
I bolded the letters in both examples so you can see the difference more clearly. Depending on the input, the content will be longer or shorter than this example. Letters will be alternated by numbers and vice versa.
What's the best way to do this?
What's the best way to do this?
That is a matter of opinion.
The REPLACING option of the INSPECT verb requires the replacing and replaced character strings to be the same size, so that's right out because you need to replace one character with two. This is true at least for IBM COBOL.
A way to do this would be to loop through your input string and do a class check on each character. Something like...
01 Stuff.
05 in-posn pic s999 packed-decimal value +0.
05 out-posn pic s999 packed-decimal value +1.
05 in-string pic x(022) value 'GB00LOYD1023456789A1B2'.
05 out-string pic x(100) value spaces.
05 replacer pic x(002) value spaces.
perform varying in-posn from 1 by 1
until in-posn > length of in-string
if in-string(in-posn:1) alphabetic
evaluate in-string(in-posn:1)
when 'A' move '10' to replacer
when 'B' move '11' to replacer
.
.
.
when 'Z' move '35' to replacer
end-evaluate
string replacer delimited size
into out-string
pointer out-posn
end-string
else
string in-string(in-posn:1) delimited size
into out-string
pointer out-posn
end-string
end-if
end-perform
There are variations available. You could replace the evaluate with a couple of table lookups. You could store the length of in-string before beginning the loop. You could store in-string(in-posn:1) rather than hoping the compiler will do that for you.
This is just freehand but I think it conveys the idea.
When I move a number in a PIC X to a PIC 9 the numeric field's value is 0.
FOO, a PIC X(400), has '1' in the first byte and spaces in the remaining 399. Moving into the PIC 9(02) BAR like so
DISPLAY FOO
MOVE FOO to BAR
DISPLAY BAR
yields
1
0
Why is BAR 0 instead of 1? [Edit: originally, 'What is happening?']
Postscript: NealB says "Do not write programs that rely on obscure truncation rules and/or
data type coercion. Be precise and explicit in what you are doing."
That made me realize I really want COMPUTE BAR AS FUNCTION NUMVAL(FOO) wrapped in a NUMERIC test, not a MOVE.
Data MOVEment in COBOL is a complex subject - but here is
a simplified answer to your question. Some data movement rules
are straight forward and conform to what one might expect. Others are somewhat bizzar and may vary with
compiler option, vendor and possibly among editions of the COBOL standard (74, 85, 2002).
With the above in mind, here is an explanation of what happend in your example.
When something 'large' is
MOVEd into something 'small' truncation must occur. This is what happened when BAR was MOVEd to FOO. How that
truncation occurs is determined by the receving item
data type. When the receiving item is character data (PIC X), the rightmost characters will be truncated from the sending field.
For numeric data the leftmost digits are truncated from the sending field. This behaviour is pretty much universal for all COBOL
compilers.
As a consequense of these rules:
When a long 'X' field (BAR) starting with a '1' followed by a bunch of space characters is MOVEd
into a shorter 'X' field the leftmost characters are transferred. This is why the '1' would be preserved when moving to another PIC X
item.
When a long 'X' field (BAR) is moved to a '9' (numeric) datatype the rightmost characters are moved first. This is why '1' was lost, it was never
moved, the last two spaces in BAR were.
So far simple enough... The next bit is more complicated. Exactly what happens is vendor, version, compiler option and character set
specific. For the remainder of this example I will assume EBCDIC character sets and the IBM Enterprise COBOL compiler are being used. I
also assume your program displayed b0 and not 0b.
It is universally legal in COBOL to move PIC X data to PIC 9 fields provided the PIC X field contains only digits. Most
COBOL compilers only look at the lower 4 bits of a PIC 9 field when determining its numeric value. An exception is the least
significant digit where the sign, or lack of one, is stored. For unsigned numerics the upper 4 bits of the least significant digit
are set to 1's (hex F) as a result of the MOVE (coercion follows different rules for signed fields). The lower 4 bits are MOVEd without
coercion. So, what happens when a space character is moved into a PIC 9 field? The hex
representation of a SPACE is '40' (ebcdic). The upper 4 bits, '4', are flipped to 'F' and the lower 4 bits are moved as they are. This results in the
least significant digit (lsd) containing 'F0' hex. This just happens to be the unsigned numeric representation for the digit '0' in a PIC 9 data item.
The remaining leading digits are moved as they are (ie. '40' hex). The net result is that FOO displays as
b0. However, if you were to do anything other that 'MOVE' or 'DISPLAY' FOO, the upper 4 bits of the remaining 'digits' may be coerced to zeroes as a
result. This would flip their display characteristics from spaces to zeros.
The following example COBOL program and its output illustrates these points.
IDENTIFICATION DIVISION.
PROGRAM-ID. EXAMPLE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01.
05 BAR PIC X(10).
05 FOO PIC 9(2).
05 FOOX PIC X(2).
PROCEDURE DIVISION.
MOVE '1 ' TO BAR
MOVE BAR TO FOO
MOVE BAR TO FOOX
DISPLAY 'FOO : >' FOO '< Leftmost trunctaion + lsd coercion'
DISPLAY 'FOOX: >' FOOX '< Righmost truncation'
ADD ZERO TO FOO
DISPLAY 'FOO : >' FOO '< full numeric coercion'
GOBACK
.
Output:
FOO : > 0< Leftmost trunctaion, lsd coercion
FOOX: >1 < Righmost truncation
FOO : >00< full numeric coercion
Final words... Best not to have to know anything about this sort to thing. Do not write programs that rely on obscure truncation
rules and/or data type coercion. Be precise and explicit in what you are doing.
Firstly, why do you think it might be useful to MOVE a 400-byte field to a two-byte field? You are going to get a "certain amount(!)" of "truncation" with that (and the amount of truncation is certain, at 398 bytes). Do you know which part of your 400 bytes is going to be truncated? I'd guess not.
For an alpha-numeric "sending" item (what you have), the (maximum) number of bytes used is the maximum number of bytes in a numeric field (18/31 depending on compiler/compiler option). Those bytes are taken from the right of the alpha-numeric field.
You have, therefore, MOVEd the rightmost 18/31 digits to the two-digit receiving field. You have already explained that you have "1" and 399 spaces, so you have MOVEd 18/31 spaces to your two-digit numeric field.
Your numeric field is "unsigned" (PIC 9(2) not PIC S9(2) or with a SIGN SEPARATE). For an unsigned field (which is a field with "no operational sign") a COBOL compiler should generate code to ensure that the field contains no sign.
This code will turn the right-most space in your PIC 9(2) into a "0" because and ASCII space is X'20' and an EBCDIC space is X'40'. The "sign" is embedded in the right-most byte of a USAGE DISPLAY numeric field, and and no other data but the sign is changed during the MOVE. The 2 or 4 of X'2n' or X'4n' is, without regard to its value, obliterated to the bit-pattern for an "unsign" (the lack of an "operational sign"). An "unsign" followed by a numeric digit (which is the '0' left over from the space) will, obviously, appear as a zero.
Now, you show a single "1" for your 400-byte field and a single 0 for your two-byte numeric.
What I do is this:
DISPLAY
">"
the-first-field-name
"<"
">"
the-second-field-name
"<"
...
or
DISPLAY
">"
the-first-field-name
"<"
DISPLAY
">"
the-second-field-name
"<"
...
If you had done that, you should find 1 followed by 399 spaces for your first field (as you would expect) and space followed by zero for your second field, which you didn't expect.
If you want to specifically see this in operation:
FOO PIC X(400) JUST RIGHT.
MOVE "1" TO FOO
MOVE FOO TO BAR
DISPLAY
">"
FOO
"<"
DISPLAY
">"
BAR
"<"
And you should see what you "almost" expect. You probably want the leading zero as well (the level-number 05 is an example, whatever level-number you are using will work).
05 BAR PIC 99.
05 FILLER REDEFINES BAR.
10 BAR-FIRST-BYTE PIC X.
88 BAR-FIRST-BYTE-SPACE VALUE SPACE.
10 FILLER PIC X.
...
IF BAR-FIRST-BYTE-SPACE
MOVE ZERO TO BAR-FIRST-BYTE
END-IF
Depending on your compiler and how close it is to ANSI Standard (and which ANSI Standard) your results may differ (if so, try to get a better compiler), but:
Don't MOVE alpha-numeric which are longer than the maximum a numeric can be to a numeric;
Note that in the MOVE alpha-numeric to numeric it is the right-most bytes of the alpha-numeric which are actually moved first;
An "unsigned" numeric should/must always remain unsigned;
Always check for compiler diagnostics and correct the code so that no diagnostics are produced (where possible);
When showing examples, it is highly important to show the actual results the computer produced, not the results as interpreted by a human. " 0" is not the same as "0 " is not the same as "0".
EDIT: Looking at TS's other questions, I think Enterprise COBOL is a safe bet. This message would have been issued by the compiler:
IGYPG3112-W Alphanumeric or national sending field "FOO" exceeded 18 digits. The rightmost 18 characters were used as the sender.
Note, the "18 digits" would have been "31 digits" with compiler option ARITH(EXTEND).
Even though it is a lowly "W" which only gives a Return Code of 4, not bothering to read it is not good practice, and if you had read it you'd not have needed to ask the question - although perhaps you'd still not know how you ended up with " 0", but that is another thing.
I gather you expect the 9(2) value to show up as "1" instead of "0" and you are confused as to why it does not?
You are moving values from left to right when you move from an X value (unless the destination value changes things). So the 9 value has a space in it. To simplify it, moving "X(2) value '1 '" to a 9(2) value literally moves those characters. The space makes what is in the 9(2) invalid, so the COBOL compiler does with it what it knows to do, return 0. In other words, defining the 9(2) as it does tells the compiler to interpret the data in a different way.
If you want the 9(2) to show up as "1", you have to present the data in the right way to the 9(2). A 9(2) with a value of 1 has the characters "01". Untested:
03 FOO PIC X(2) value '1'.
03 TEXT-01 PIC X(2) JUSTIFIED RIGHT.
03 NUMB-01 REDEFINES TEXT-01 PIC 9(2).
03 BAR PIC 9(2).
DISPLAY FOO.
MOVE FOO TO TEXT-01.
INSPECT TEXT-01 REPLACING LEADING ' ' BY '0'.
MOVE NUMB-01 TO BAR.
DISPLAY BAR.
Using the NUMERIC test against BAR in your example should fail as well...
I have some csv record which are variable in length , for example:
0005464560,45667759,ZAMTR,!To ACC 12345678,DR,79.85
0006786565,34567899,ZAMTR,!To ACC 26575443,DR,1000
I need to seperate each of these fields and I need the last field which should be a money.
However, as I read the file, and unstring the record into fields, I found that the last field contain junk value at the end of itself. The amount(money) field should be 8 characters, 5 digit at the front, 1 dot, 2 digit at the end. The values from the input could be any value such as 13.5, 1000 and 354.23 .
"FILE SECTION"
FD INPUT_FILE.
01 INPUT_REC PIC X(66).
"WORKING STORAGE SECTion"
01 WS_INPUT_REC PIC X(66).
01 WS_AMOUNT_NUM PIC 9(5).9(2).
01 WS_AMOUNT_TXT PIC X(8).
"MAIN SECTION"
UNSTRING INPUT_REC DELIMITED BY ","
INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT
MOVE WS_AMOUNT_TXT(1:8) TO WS_AMOUNT_NUM(1:8)
DISPLAY WS_AMOUNT_NUM
From the display, the value is rather normal: 345.23, 1000, just as what are, however, after I wrote the field into a file, here is what they become:
79.85^M^#^#
137.35^M^#
I have inspect the field WS_AMOUNT_NUM, which came from the field WS_AMOUNT_TXT, and found that ^# is a kind of LOW-VALUE. However, I cannot find what is ^M, it is not a space, not a high-value.
I am guessing, but it looks like you may be reading variable length records from a file into a fixed length
COBOL record. The junk
at the end of the COBOL record is giving you some grief. Hard to say how consistent that junk is going
to be from one read to the next (data beyond the bounds of actual input record length are technically
undefined). That junk ends up
being included in WS_AMOUNT_TXT after the UNSTRING
There are a number of ways to solve this problem. The suggestion I am giving you here may not
be optimal, but it is simple and should get the job done.
The last INTO field, WS_AMOUNT_TXT, in your UNSTRING statement is the one that receives all of the trailing
junk. That junk needs to be stripped off. Knowing that the only valid characters in the last field are
digits and the decimal character, you could clean it up as follows:
PERFORM VARYING WS_I FROM LENGTH OF WS_AMOUNT_TXT BY -1
UNTIL WS_I = ZERO
IF WS_AMOUNT_TXT(WS_I:1) IS NUMERIC OR
WS_AMOUNT_TXT(WS_I:1) = '.'
MOVE ZERO TO WS_I
ELSE
MOVE SPACE TO WS_AMOUNT_TXT(WS_I:1)
END-IF
END-PERFORM
The basic idea in the above code is to scan from the end of the last UNSTRING output field
to the beginning replacing anything that is not a valid digit or decimal point with a space.
Once a valid digit/decimal is found, exit the loop on the assumption that the rest will
be valid.
After cleanup use the intrinsic function NUMVAL as outlined in my answer to your
previous question
to convert WS_AMOUNT_TXT into a numeric data type.
One final piece of advice, MOVE SPACES TO INPUT_REC before each READ to blow away data left over
from a previous read that might be left in the buffer. This will protect you when reading a very "short"
record after a "long" one - otherwise you may trip over data left over from the previous read.
Hope this helps.
EDIT Just noticed this answer to your question about reading variable length files. Using a variable length input record is a better approach. Given the
actual input record length you can do something like:
UNSTRING INPUT_REC(1:REC_LEN) INTO...
Where REC_LEN is the variable specified after OCCURS DEPENDING ON for the INPUT_REC file FD. All the junk you are encountering occurs after the end of the record as defined by REC_LEN. Using reference modification as illustrated above trims it off before UNSTRING does its work to separate out the individual data fields.
EDIT 2:
Cannot use reference modification with UNSTRING. Darn... It is possible with some other COBOL dialects but not with OpenVMS COBOL. Try the following:
MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...
Where WS_BUFFER is a working storage PIC X variable long enough to hold the longest input record. When you MOVE a short alpha-numeric field to a longer one, the destination field is left justified with spaces used to pad remaining space (ie. WS_BUFFER). Since leading and trailing spaces are acceptable to the NUMVAL fucnction you have exactly what you need.
I have a reason for pushing you in this direction. Any junk that ends up at the trailing end of a record buffer when reading a short record is undefined. There is a possibility that some of that junk just might end up being a digit or a decimal point. Should this occur, the cleanup routine I originally suggested would fail.
EDIT 3:
There are no ^# in the resulting WS_AMOUNT_TXT, but still there are a ^M
Looks like the file system is treating <CR> (that ^M thing) at the end of each record as data.
If the file you are reading came from a Windows platform and you are now
reading it on a UNIX platform that would explain the problem. Under Windows records
are terminated with <CR><LF> while on UNIX they are terminated with <LF> only. The
UNIX file system treats <CR> as if it were part of the record.
If this is the case, you can be pretty sure that there will be a single <CR> at the
end of every record read. There are a number of ways to deal with this:
Method 1: As you already noted, pre-edit the file using Notepad++ or some other
tool to remove the <CR> characters before processing through your COBOL program.
Personally I don't think this is the best way of going about it. I prefer to use a COBOL
only solution since it involves fewer processing steps.
Method 2: Trim the last character from each input record before processing it. The last
character should always be <CR>. Try the following if you
are reading records as variable length and have the actual input record length available.
SUBTRACT 1 FROM REC_LEN
MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...
Method 3: Treat <CR> as a delimiter when UNSTRINGing as follows:
UNSTRING INPUT_REC DELIMITED BY "," OR x"0D"
INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT
Method 4: Condition the last receiving field from UNSTRING by replacing trailing
non digit/non decimal point characters with spaces. I outlined this solution a litte earlier in this
question. You could also explore the INSPECT statement using the REPLACING option (Format 2). This should be able to do pretty much the same thing - just replace all x"00" by SPACE and x"0D" by SPACE.
Where there is a will, there is a way. Any of the above solutions should work for you. Choose the one you are most comfortable with.
^M is a carriage return.
Would Google Refine be useful for rectifying this data?
Alphanumeric movement to Numeric variable caused unexpected results. Here is the code fyr:
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-VAR-STR PIC X(3) VALUE SPACES.
01 WS-VAR-NUM PIC 9(3) VALUE ZEROES.
PROCEDURE DIVISION.
MOVE '1' TO WS-VAR-STR
MOVE WS-VAR-STR TO WS-VAR-NUM
DISPLAY 'STRING > ' WS-VAR-STR '< MOVED > ' WS-VAR-NUM '<'
IF WS-VAR-NUM >= 40 AND <= 59
DISPLAY 'INSIDE IF >' WS-VAR-NUM
ELSE
DISPLAY 'INSIDE ELSE >' WS-VAR-NUM
END-IF
GOBACK
.
OUTPUT:
STRING > 1 < MOVED > 1 0<
INSIDE ELSE >1 O
The result is bizzare and want to figure why '1' is moved as '1 0' into numeric variable and interestingly there was NO issue in conditioning it as well. Do share your views. Thanks for your interest.
Basically you have done an illegal MOVE. Moving alphanumeric to numeric fields is valid
provided that the content of the alphanumeric field contains only numeric characters.
This reference
summarizes valid/invalid moves.
What were you expecting as a result?
Moves of alphanumeric fields into numeric ones are done without
'conversion'. Basically you just dropped a one digit followed by two spaces into a numeric field. the '1' was ok, the two spaces
were not. The last two bytes of WS-VAR-NUM contain spaces.
But wait... why is the last character a zero? The answer to this is a bit more complicated.
Items declared as PIC 9 something are represented in Zoned Decimal.
Each digit of a zoned decimal number is represented by a single byte.
The 4 high-order bits of each byte are zone bits; the 4 high-order bits of the low-order byte represent
the sign of the item. The 4 low-order bits of each byte contain the value of the digit. The key here
is where the sign is stored. It is in the high order bits of the last byte. Your declaration did not
include a sign so the MOVE statement blows away the sign bits and replaces them with default
numeric high order bits (remember the only valid characters to MOVE are digits - so this
patch process should always yield a valid result). The high order bits of an unsigned zoned decimal
digit are always HEX F. What are the low order bits of the last byte? A space has an ebcdic HEX value of 40. A zero is HEX F0. Since the MOVE statement "fixes" the sign automatically, you end up with HEX F0 in the low order digit, which happens to be, you guessed it, zero. None of the other 'digits' contain sign bits so they are left as
they were.
Finally, a DISPLAY statement converts zoned decimal fields into their equivalent character representation
for presentation: Net result is: '1 0'.
BTW The above discussion is how it works out on an IBM z/OS platform - other character sets (eg. ASCII) and/or other platforms may yield different results, not because IBM is doing the wrong thing, but because the program is doing an illegal MOVE and the results are essentially undefined.
So I have entered my second semester of College and they have me doing a course called Advanced COBOL. As one of my assignments I have to my make a program that tests certain things in a file to make sure the input has no errors. I get the general idea but there are just a few things I don't understand and my teacher is one of those people who will give you an assignment and make you figure it out yourself with little or no help. So here is what I need help with.
I have a field that the first 5 columns have to be numbers, the 6th column a capital letter and the last 2 numbers in a range of 01-68 or 78-99.
one of my fields has to be a string of numbers with a dash in it like 00000-000, but some have more than one dash. How can I count the dashes to identify that there is a problem.
Here are a few hints...
Use a hieratical record structure to view the data in different ways. For example:
01 ITEM-REC.
05 ITEM-CODE.
10 ITEM-NUM-CODE PIC 9(3).
10 ITEM-CHAR-CODE PIC A(3).
88 ITEM-TYPE-A VALUE 'AAA' THRU 'AZZ'.
88 ITEM-TYPE-B VALUE 'BAA' THRU 'BZZ'.
05 QUANTITY PIC 9(4).
ITEM-CODE is a 6 character group field, the first part of which is numeric (ITEM-NUM-CODE) and the last part
is alphabetic (ITEM-CHAR-CODE). You can refer to any one of these three variables in your program. When you
refer to ITEM-CODE, or any other group item, COBOL
treats the variable as if it were declared as PIC X. This means you can
MOVE just about anything into it without raising an error. For example:
MOVE 'ABCdef' TO ITEM-CODE
or
MOVE 'ABCdef0005' TO ITEM-REC
Neither one would cause an error even though the elementary data item ITEM-NUM-CODE is definitely not a number.
To verify the validity
of your data after a group move you should validate each elementary data item separately (unless
you know for certain no data type errors could have occurred). There are a variety of ways to do this. For
example if the data item has to be numeric the following would work:
IF ITEM-NUM-CODE IS NUMERIC
CONTINUE
ELSE
DISPLAY 'ITEM-NUM-CODE IS NOT NUMERIC'
PERFORM BIG-BAD-ERROR
END-IF
COBOL provides various class tests which can be applied against a data item. For
example: NUMERIC, ALPHABETIC and ALPHANUMERIC are commonly used.
Another common way to test for ranges of values is by defining various 88 levels - but exercise
caution. In the above
example ITEM-TYPE-A is an 88 level that defines a data range from 'AAA' through 'AZZ' based on
the collating sequence currently in effect. To verify that ITEM-CHAR-CODE contains only alphabetic
characters and the first letter is an 'A' or a 'B', you could do something like:
IF ITEM-CHAR-CODE ALPHABETIC
DISPLAY 'ITEM-CHAR-CODE is alphabetic.'
EVALUATE TRUE
WHEN ITEM-TYPE-A
DISPLAY 'ITEM-CHAR-CODE is in range AAA through AZZ'
WHEN ITEM-TYPE-B
DISPLAY 'ITEM-CHAR-CODE is in range BAA through BZZ'
WHEN OTHER
DISPLAY 'ITEM-CHAR-CODE is in some other range'
END-EVALUATE
ELSE
DISPLAY 'ITEM-CHAR-CODE is not alphabetic'
END-IF
Note the separate test for ALPHABETIC above. Why do that when the 88 level tests
could have done the job? Actually the 88's are not sufficient because they
cover the entire range from AAA through AZZ based on the collating sequence currently
in effect. In
an EBCDIC based environment (a very large number of COBOL shops use EBCDIC) this captures
values such as A}\. the close-brace and backslash characters are non-alpha but
fall into the middle of
the range 'A' through 'Z' (what the #*#! is that all about?). Also note that a value such
as 'aaa' would not satisfy the ITEM-TYPE-A condition because lower case letters fall outside
the defined range. Maybe time to check out an EBCDIC character table.
Finally, you can count the number of occurrences of a character, or string of characters, in
a variable with the INSPECT verb as follows:
INSPECT ITEM-CODE TALLING DASH-COUNT FOR ALL '-'
DASH-COUNT needs to be a numeric item and will contain the number of dash characters in ITEM-CODE. The INSPECT
verb is not so useful if you want to count the number of digits. For this you would need one statement for each digit.
It might be easier to just code a loop something like:
PERFORM VARYING I FROM 1 BY 1
UNTIL I > LENGTH OF ITEM-CODE
EVALUATE ITEM-CODE(I:1)
WHEN '-'
COMPUTE DASH-COUNT = DASH-COUNT + 1
WHEN '0' THRU '9'
COMPUTE DIGIT-COUNT = DIGIT-COUNT + 1
WHEN OTHER
COMPUTE OTHER-COUNT = OTHER-COUNT + 1
END-EVALUATE
END-PERFORM
Now ask yourself why I was comfortable using a zero through 9 range check? Hint: look at the collating sequence.
Hope this helps.