I am trying to build a string in COBOL one letter at a time. Everything works until I try to
insert a space. Do you have any idea how I could create, e.g., the string
"  ee  ee"?
IDENTIFICATION DIVISION.
PROGRAM-ID. EXAMPLE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 S1 PIC X(10).
PROCEDURE DIVISION.
MAIN-PARAGRAPH.
Perform InsertSpace 2 Times
Perform InsertE 2 Times
Perform InsertSpace 2 Times
Perform InsertE 2 Times
Display S1
* expectation "  ee  ee"
End-Main
InsertE Section
STRING S1 DELIMITED BY SPACE
'e' DELIMITED BY SIZE
INTO S1
END-STRING
InsertSpace Section
STRING S1 DELIMITED BY SPACE
' ' DELIMITED BY SIZE
INTO S1
END-STRING
If you are trying to implement a process where one character at a time is added onto a
character variable, then something like the following might work a bit better for you:
IDENTIFICATION DIVISION.
PROGRAM-ID. EXAMPLE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 S1 PIC X(10) VALUE SPACE.
01 S1-SUB PIC S9(4) BINARY VALUE ZERO.
PROCEDURE DIVISION.
PERFORM INSERT-SPACE 2 TIMES
PERFORM INSERT-E 2 TIMES
PERFORM INSERT-SPACE 2 TIMES
PERFORM INSERT-E 2 TIMES
DISPLAY '>' S1 '<'
GOBACK
.
INSERT-SPACE SECTION.
COMPUTE S1-SUB = S1-SUB + 1
MOVE SPACE TO S1 (S1-SUB : 1)
.
INSERT-E SECTION.
COMPUTE S1-SUB = S1-SUB + 1
MOVE 'E' TO S1 (S1-SUB : 1)
.
S1-SUB keeps track of the current character position and is incremented
each time you PERFORM a section to add another character.
The above program displays: >  EE  EE  <
Notice the trailing spaces? If you do not want these, the appropriate DISPLAY would be:
DISPLAY '>' S1 (1 : S1-SUB) '<'
which will limit the length of the display to only those characters you have explicitly put into the variable. COBOL does not support variable length strings, so you have to declare some PIC X type variable that can hold the maximum number of characters you want to display, keep track of how many you have actually "used", and display only that many.
If this is the sort of thing you are looking for, I would also recommend checking
for bounds errors (i.e. adding too many characters). That can be done as follows:
INSERT-E SECTION.
COMPUTE S1-SUB = S1-SUB + 1
IF S1-SUB > LENGTH OF S1
PERFORM ERROR-ROUTINE
ELSE
MOVE 'E' TO S1 (S1-SUB : 1)
END-IF
.
MOVE "  ee  ee" TO S1
That will do what you want.
It is difficult to be certain, as you don't show what result you do get, and it is unclear what "Until I try to insert a Space, everything works" means, but...
01 S1 PIC X(10) VALUE SPACE.
If S1 has no VALUE (and presuming you are not using a compiler which sets a default value for a PICture), the DELIMITED BY SPACE may take the whole 10 bytes, so the values which are added by the STRING can never appear in S1 unless it starts with a value of SPACE. With the value of SPACE, your four STRINGs should work. Err... no, they won't, because of the SPACE and the DELIMITED BY SPACE: once S1 begins with a space, DELIMITED BY SPACE transfers nothing from it, so each STRING starts writing at the first byte again.
You can also use reference-modification, of course:
MOVE "  " TO S1 ( 1 : 2 )
MOVE "ee" TO S1 ( 3 : 2 )
MOVE "  " TO S1 ( 5 : 2 )
MOVE "ee" TO S1 ( 7 : )
Or, if you don't want to pad the final part of the field to SPACE by default, change the last to ( 7 : 2 ), which will leave bytes nine and 10 of S1 unchanged.
If you can clarify what you want to achieve, and why you think STRING is the verb to use to do it, you may get better answers.
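If STRING is still the verb you want, one option is to stop using S1 as its own source and let a POINTER track the next free position instead. The following is only a minimal sketch under that assumption: S1-PTR is a name introduced here for illustration, the paragraph names mirror the question's, and fixed-format source is assumed.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRPTR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  S1      PIC X(10) VALUE SPACES.
      * S1-PTR (hypothetical name) is the next position to fill.
       01  S1-PTR  PIC 9(4)  VALUE 1.
       PROCEDURE DIVISION.
       MAIN-PARAGRAPH.
           PERFORM INSERT-SPACE 2 TIMES
           PERFORM INSERT-E 2 TIMES
           PERFORM INSERT-SPACE 2 TIMES
           PERFORM INSERT-E 2 TIMES
           DISPLAY '>' S1 '<'
           GOBACK
           .
       INSERT-E.
      * STRING advances S1-PTR by the number of characters moved.
           STRING 'e' DELIMITED BY SIZE
               INTO S1 WITH POINTER S1-PTR
               ON OVERFLOW DISPLAY 'S1 is full'
           END-STRING
           .
       INSERT-SPACE.
           STRING ' ' DELIMITED BY SIZE
               INTO S1 WITH POINTER S1-PTR
               ON OVERFLOW DISPLAY 'S1 is full'
           END-STRING
           .
```

Because the pointer, not the content of S1, decides where the next character goes, a space is appended like any other character, and the program should display >  ee  ee  < with the two unused trailing bytes still blank.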
I'm building a COBOL program to calculate the average of up to 15 integers. The execution displays a number that is far bigger than intended with a lot of trailing zeroes. Here is the relevant code:
Data Division.
Working-Storage Section.
01 WS-COUNTER PIC 9(10).
01 WS-INPUT-TOTAL PIC 9(10).
01 WS-NEXT-INPUT PIC X(8).
01 WS-CONVERTED-INPUT PIC 9(8).
01 WS-AVG PIC 9(8)V99.
Procedure Division.
PROG.
PERFORM INIT-PARA
PERFORM ADD-PARA UNTIL WS-COUNTER = 15 OR WS-NEXT-INPUT = 'q'
PERFORM AVG-PARA
PERFORM END-PARA.
INIT-PARA.
DISPLAY 'This program calculates the average of inputs.'.
MOVE ZERO TO WS-COUNTER
MOVE ZERO TO WS-INPUT-TOTAL
MOVE ZERO TO WS-AVG.
ADD-PARA.
DISPLAY 'Enter an integer or type q to quit: '
ACCEPT WS-NEXT-INPUT
IF WS-NEXT-INPUT NOT = 'q'
MOVE WS-NEXT-INPUT TO WS-CONVERTED-INPUT
ADD WS-CONVERTED-INPUT TO WS-INPUT-TOTAL
ADD 1 TO WS-COUNTER
END-IF.
AVG-PARA.
IF WS-COUNTER > 1
DIVIDE WS-INPUT-TOTAL BY WS-COUNTER GIVING WS-AVG
DISPLAY 'Your average is ' WS-AVG '.' WS-NEXT-INPUT
END-IF.
The reason I put WS-NEXT-INPUT as alphanumeric and move it to a numeric WS-CONVERTED-INPUT if the IF condition is satisfied is because I want it to be able to take "q" to break the UNTIL loop, but after the condition is satisfied, I want a numeric variable for the arithmetical statements. Here's what it looks like with the numbers 10 and 15 as inputs:
10is program calculates the average of inputs.
Enter an integer or type q to quit:
15
Enter an integer or type q to quit:
q
Your average is 1250000000.
The console is a bit buggy so it forces me to input the 10 in that top left corner most of the time. Don't worry about that.
You see my problem in that execution. The result is supposed to be 00000012.50 instead of 1250000000. I tried inserting a few of my other variables into that display statement and they're all basically as they should be except for WS-INPUT-TOTAL which with that combination of numbers ends up being 0025000000 instead of 0000000025 as I would have expected. Why are these digits being stored in such a weird and unexpected way?
You have that strange output because of undefined behavior - computing with spaces.
The MOVE you present has the exact same USAGE and the same size on both sides, so the data will commonly be taken over "as is". The trailing spaces are not converted by some magic, so WS-CONVERTED-INPUT ends up containing "10" followed by six spaces. As the standard says for the MOVE:
De-editing takes place only when the sending operand is a numeric-edited data item and the receiving item is a numeric or a numeric-edited data item.
and even if it were an edited field, the MOVE should still raise an exception:
When a numeric-edited data item is the sending operand of a de-editing MOVE statement and the content of that data item is not a possible result for any editing operation in that data item, the result of the MOVE operation is undefined and an EC-DATA-INCOMPATIBLE exception condition is set to exist.
When computing with spaces you would commonly raise a fatal error, but it seems your compiler does not have that check activated (and because you didn't share your compile command or even your compiler, we can't help with that).
Many COBOL dialects treat invalid data, or at least spaces, as zero (some only when the runtime checks that would otherwise lead to an abort are not activated), but they are free to use anything. WS-CONVERTED-INPUT is then "seen as" 10000000, so your computation includes those big numbers.
So your program should work if you enter the necessary amount of leading zeroes on input.
General:
"never trust input data - validate" (and error or convert as necessary)
at least if something looks suspicious - activate all runtime checks available, re-try.
Solution - Do an explicit conversion:
MOVE FUNCTION NUMVAL(WS-NEXT-INPUT) TO WS-CONVERTED-INPUT. This will strip surrounding spaces and then convert from left to right until invalid data is found. A good coder would also check up-front using FUNCTION TEST-NUMVAL; otherwise you compute with zero if someone enters "TWENTY".
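The validate-then-convert idea could be sketched as follows. This assumes a compiler that provides FUNCTION TEST-NUMVAL (it is part of the 2014 standard, but availability varies), reuses the field names from the question, and hard-codes the input that ACCEPT would normally supply.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NUMVALDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Stands in for the value read by ACCEPT: "10" plus spaces.
       01  WS-NEXT-INPUT       PIC X(8) VALUE '10'.
       01  WS-CONVERTED-INPUT  PIC 9(8) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARAGRAPH.
      * TEST-NUMVAL returns zero when the argument is a string
      * that NUMVAL can convert.
           IF FUNCTION TEST-NUMVAL(WS-NEXT-INPUT) = 0
               COMPUTE WS-CONVERTED-INPUT =
                   FUNCTION NUMVAL(WS-NEXT-INPUT)
               DISPLAY 'Converted: ' WS-CONVERTED-INPUT
           ELSE
               DISPLAY 'Not a number: >' WS-NEXT-INPUT '<'
           END-IF
           GOBACK
           .
```

In the original program the 'q' check would still come first; the point here is only that the arithmetic never sees a field containing spaces.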
I realize I have asked a similar question before, but the whole thing is more complicated than I thought.
To cut to the chase, I need to convert a string that contains numbers and letters into a string that only contains numbers, while keeping the numbers that were already there, in the right position.
The letters need to be converted to their corresponding position in the alphabet + 9. So A = 10, B = 11, ..., Z = 35.
So, basically, a string that looks like this:
'GB00LOYD1023456789A1B2'
will have to become:
'1611002124341310234567891011112'.
I bolded the letters in both examples so you can see the difference more clearly. Depending on the input, the content will be longer or shorter than this example. Letters will be alternated by numbers and vice versa.
What's the best way to do this?
What's the best way to do this?
That is a matter of opinion.
The REPLACING option of the INSPECT verb requires the replacing and replaced character strings to be the same size, so that's right out because you need to replace one character with two. This is true at least for IBM COBOL.
A way to do this would be to loop through your input string and do a class check on each character. Something like...
01 Stuff.
05 in-posn pic s999 packed-decimal value +0.
05 out-posn pic s999 packed-decimal value +1.
05 in-string pic x(022) value 'GB00LOYD1023456789A1B2'.
05 out-string pic x(100) value spaces.
05 replacer pic x(002) value spaces.
perform varying in-posn from 1 by 1
until in-posn > length of in-string
if in-string(in-posn:1) alphabetic
evaluate in-string(in-posn:1)
when 'A' move '10' to replacer
when 'B' move '11' to replacer
.
.
.
when 'Z' move '35' to replacer
end-evaluate
string replacer delimited size
into out-string
pointer out-posn
end-string
else
string in-string(in-posn:1) delimited size
into out-string
pointer out-posn
end-string
end-if
end-perform
There are variations available. You could replace the evaluate with a couple of table lookups. You could store the length of in-string before beginning the loop. You could store in-string(in-posn:1) rather than hoping the compiler will do that for you.
This is just freehand but I think it conveys the idea.
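One of the table-lookup variations might look like the sketch below. It is not the code above made complete: it swaps the EVALUATE for an INSPECT against a 26-letter string (WS-LETTERS and LETTER-IDX are names introduced here), and it assumes uppercase input and fixed-format source.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LETTERS2DIGITS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-LETTERS  PIC X(26) VALUE 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.
       01  IN-STRING   PIC X(22) VALUE 'GB00LOYD1023456789A1B2'.
       01  OUT-STRING  PIC X(100) VALUE SPACES.
       01  IN-POSN     PIC 9(3).
       01  OUT-POSN    PIC 9(3) VALUE 1.
       01  LETTER-IDX  PIC 9(2).
       01  REPLACER    PIC 9(2).
       PROCEDURE DIVISION.
       MAIN-PARAGRAPH.
           PERFORM VARYING IN-POSN FROM 1 BY 1
                   UNTIL IN-POSN > LENGTH OF IN-STRING
      * Count the letters that precede the current character in
      * WS-LETTERS; a count of 26 means it is not a letter.
               MOVE ZERO TO LETTER-IDX
               INSPECT WS-LETTERS TALLYING LETTER-IDX
                   FOR CHARACTERS BEFORE IN-STRING(IN-POSN:1)
               IF LETTER-IDX < 26
      * Position in the alphabet is LETTER-IDX + 1, so add 10.
                   COMPUTE REPLACER = LETTER-IDX + 10
                   STRING REPLACER DELIMITED BY SIZE
                       INTO OUT-STRING POINTER OUT-POSN
                   END-STRING
               ELSE
                   STRING IN-STRING(IN-POSN:1) DELIMITED BY SIZE
                       INTO OUT-STRING POINTER OUT-POSN
                   END-STRING
               END-IF
           END-PERFORM
           DISPLAY OUT-STRING(1:OUT-POSN - 1)
           GOBACK
           .
```

With A = 10 through Z = 35 (so Y becomes 34), the sample input should come out as 1611002124341310234567891011112. Anything that is not an uppercase letter, including lowercase letters and spaces, is copied through unchanged.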
When I move a number in a PIC X to a PIC 9 the numeric field's value is 0.
FOO, a PIC X(400), has '1' in the first byte and spaces in the remaining 399. Moving into the PIC 9(02) BAR like so
DISPLAY FOO
MOVE FOO to BAR
DISPLAY BAR
yields
1
0
Why is BAR 0 instead of 1? [Edit: originally, 'What is happening?']
Postscript: NealB says "Do not write programs that rely on obscure truncation rules and/or
data type coercion. Be precise and explicit in what you are doing."
That made me realize I really want COMPUTE BAR = FUNCTION NUMVAL(FOO) wrapped in a NUMERIC test, not a MOVE.
Data MOVEment in COBOL is a complex subject, but here is a simplified answer to your question. Some data movement rules are straightforward and conform to what one might expect. Others are somewhat bizarre and may vary with compiler option, vendor and possibly among editions of the COBOL standard (74, 85, 2002).
With the above in mind, here is an explanation of what happened in your example.
When something 'large' is MOVEd into something 'small', truncation must occur. This is what happened when BAR was MOVEd to FOO. How that truncation occurs is determined by the data type of the receiving item. When the receiving item is character data (PIC X), the rightmost characters will be truncated from the sending field. For numeric data the leftmost digits are truncated from the sending field. This behaviour is pretty much universal for all COBOL compilers.
As a consequence of these rules:
When a long 'X' field (BAR) starting with a '1' followed by a bunch of space characters is MOVEd into a shorter 'X' field, the leftmost characters are transferred. This is why the '1' would be preserved when moving to another PIC X item.
When a long 'X' field (BAR) is moved to a '9' (numeric) datatype, the rightmost characters are moved first. This is why the '1' was lost: it was never moved; the last two spaces in BAR were.
So far simple enough... The next bit is more complicated. Exactly what happens is vendor, version, compiler option and character set specific. For the remainder of this example I will assume EBCDIC character sets and the IBM Enterprise COBOL compiler are being used. I also assume your program displayed b0 and not 0b (where b stands for a blank).
It is universally legal in COBOL to move PIC X data to PIC 9 fields provided the PIC X field contains only digits. Most COBOL compilers only look at the lower 4 bits of each byte of a PIC 9 field when determining its numeric value. An exception is the least significant digit, where the sign, or lack of one, is stored. For unsigned numerics the upper 4 bits of the least significant digit are set to 1's (hex F) as a result of the MOVE (coercion follows different rules for signed fields). The lower 4 bits are MOVEd without coercion. So, what happens when a space character is moved into a PIC 9 field? The hex representation of a SPACE is '40' (EBCDIC). The upper 4 bits, '4', are flipped to 'F' and the lower 4 bits are moved as they are. This results in the least significant digit (lsd) containing 'F0' hex. This just happens to be the unsigned numeric representation for the digit '0' in a PIC 9 data item. The remaining leading digits are moved as they are (i.e. '40' hex). The net result is that FOO displays as b0. However, if you were to do anything other than 'MOVE' or 'DISPLAY' FOO, the upper 4 bits of the remaining 'digits' may be coerced as well, turning those 'digits' into zeroes. This would flip their display characteristics from spaces to zeros.
The following example COBOL program and its output illustrates these points.
IDENTIFICATION DIVISION.
PROGRAM-ID. EXAMPLE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01.
05 BAR PIC X(10).
05 FOO PIC 9(2).
05 FOOX PIC X(2).
PROCEDURE DIVISION.
MOVE '1 ' TO BAR
MOVE BAR TO FOO
MOVE BAR TO FOOX
DISPLAY 'FOO : >' FOO '< Leftmost truncation + lsd coercion'
DISPLAY 'FOOX: >' FOOX '< Rightmost truncation'
ADD ZERO TO FOO
DISPLAY 'FOO : >' FOO '< full numeric coercion'
GOBACK
.
Output:
FOO : > 0< Leftmost truncation + lsd coercion
FOOX: >1 < Rightmost truncation
FOO : >00< full numeric coercion
Final words... Best not to have to know anything about this sort of thing. Do not write programs that rely on obscure truncation
rules and/or data type coercion. Be precise and explicit in what you are doing.
Firstly, why do you think it might be useful to MOVE a 400-byte field to a two-byte field? You are going to get a "certain amount(!)" of "truncation" with that (and the amount of truncation is certain, at 398 bytes). Do you know which part of your 400 bytes is going to be truncated? I'd guess not.
For an alpha-numeric "sending" item (what you have), the (maximum) number of bytes used is the maximum number of bytes in a numeric field (18/31 depending on compiler/compiler option). Those bytes are taken from the right of the alpha-numeric field.
You have, therefore, MOVEd the rightmost 18/31 digits to the two-digit receiving field. You have already explained that you have "1" and 399 spaces, so you have MOVEd 18/31 spaces to your two-digit numeric field.
Your numeric field is "unsigned" (PIC 9(2) not PIC S9(2) or with a SIGN SEPARATE). For an unsigned field (which is a field with "no operational sign") a COBOL compiler should generate code to ensure that the field contains no sign.
This code will turn the right-most space in your PIC 9(2) into a "0", because an ASCII space is X'20' and an EBCDIC space is X'40'. The "sign" is embedded in the right-most byte of a USAGE DISPLAY numeric field, and no other data but the sign is changed during the MOVE. The 2 or 4 of X'2n' or X'4n' is, without regard to its value, obliterated to the bit-pattern for an "unsign" (the lack of an "operational sign"). An "unsign" followed by a numeric digit (which is the '0' left over from the space) will, obviously, appear as a zero.
Now, you show a single "1" for your 400-byte field and a single 0 for your two-byte numeric.
What I do is this:
DISPLAY
">"
the-first-field-name
"<"
">"
the-second-field-name
"<"
...
or
DISPLAY
">"
the-first-field-name
"<"
DISPLAY
">"
the-second-field-name
"<"
...
If you had done that, you should find 1 followed by 399 spaces for your first field (as you would expect) and space followed by zero for your second field, which you didn't expect.
If you want to specifically see this in operation:
FOO PIC X(400) JUST RIGHT.
MOVE "1" TO FOO
MOVE FOO TO BAR
DISPLAY
">"
FOO
"<"
DISPLAY
">"
BAR
"<"
And you should see what you "almost" expect. You probably want the leading zero as well (the level-number 05 is an example, whatever level-number you are using will work).
05 BAR PIC 99.
05 FILLER REDEFINES BAR.
10 BAR-FIRST-BYTE PIC X.
88 BAR-FIRST-BYTE-SPACE VALUE SPACE.
10 FILLER PIC X.
...
IF BAR-FIRST-BYTE-SPACE
MOVE ZERO TO BAR-FIRST-BYTE
END-IF
Depending on your compiler and how close it is to ANSI Standard (and which ANSI Standard) your results may differ (if so, try to get a better compiler), but:
Don't MOVE alpha-numeric fields which are longer than the maximum length of a numeric field to a numeric field;
Note that in the MOVE alpha-numeric to numeric it is the right-most bytes of the alpha-numeric which are actually moved first;
An "unsigned" numeric should/must always remain unsigned;
Always check for compiler diagnostics and correct the code so that no diagnostics are produced (where possible);
When showing examples, it is highly important to show the actual results the computer produced, not the results as interpreted by a human. " 0" is not the same as "0 " is not the same as "0".
EDIT: Looking at TS's other questions, I think Enterprise COBOL is a safe bet. This message would have been issued by the compiler:
IGYPG3112-W Alphanumeric or national sending field "FOO" exceeded 18 digits. The rightmost 18 characters were used as the sender.
Note, the "18 digits" would have been "31 digits" with compiler option ARITH(EXTEND).
Even though it is a lowly "W" which only gives a Return Code of 4, not bothering to read it is not good practice, and if you had read it you'd not have needed to ask the question - although perhaps you'd still not know how you ended up with " 0", but that is another thing.
I gather you expect the 9(2) value to show up as "1" instead of "0" and you are confused as to why it does not?
You are moving values from left to right when you move from an X value (unless the destination value changes things). So the 9 value has a space in it. To simplify it, moving "X(2) value '1 '" to a 9(2) value literally moves those characters. The space makes what is in the 9(2) invalid, so the COBOL compiler does with it what it knows to do, return 0. In other words, defining the 9(2) as it does tells the compiler to interpret the data in a different way.
If you want the 9(2) to show up as "1", you have to present the data in the right way to the 9(2). A 9(2) with a value of 1 has the characters "01". Untested:
03 FOO PIC X(2) value '1'.
03 TEXT-01 PIC X(2) JUSTIFIED RIGHT.
03 NUMB-01 REDEFINES TEXT-01 PIC 9(2).
03 BAR PIC 9(2).
DISPLAY FOO.
MOVE FOO TO TEXT-01.
INSPECT TEXT-01 REPLACING LEADING ' ' BY '0'.
MOVE NUMB-01 TO BAR.
DISPLAY BAR.
Using the NUMERIC test against BAR in your example should fail as well...
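A minimal sketch of that NUMERIC class test, assuming a compiler that follows the standard class-condition rules. BAR-X is a name introduced here, and the REDEFINES is only there to place a space in the numeric field without relying on compiler-specific MOVE behaviour.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NUMTEST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * BAR-X (hypothetical) holds a space followed by a zero,
      * the content the PIC 9(2) field ended up with.
       01  BAR-X  PIC X(2) VALUE ' 0'.
       01  BAR    REDEFINES BAR-X PIC 9(2).
       PROCEDURE DIVISION.
       MAIN-PARAGRAPH.
      * The class test fails because a space is not a digit.
           IF BAR IS NUMERIC
               DISPLAY 'BAR is numeric: ' BAR
           ELSE
               DISPLAY 'BAR is not numeric: >' BAR-X '<'
           END-IF
           GOBACK
           .
```

Guarding arithmetic with a test like this is one way to catch such content before it reaches a computation.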
I have a string for which I wish to tally the count of characters till a certain pattern of characters is found.
For example:
Given a string: askabanskarkartikrockstar
I would like to know how many characters are there before the kartik in the string.
In a normal scenario where I need to find the number of characters before, say k, in the given string, I would write the code somewhat as:
INSPECT WS-INPUT-STRING TALLYING CT-COUNTER FOR CHARACTERS BEFORE LT-K
Where
WS-INPUT-STRING is alphanumeric with a value of
askabanskarkartikrockstar,
CT-COUNTER is the counter used to count the number of characters
LT-K is a literal with the value k.
But here, if I wish to do the same for a sub-string, like kartik in the above example, would replacing the value of LT-K with kartik instead of just k work? If yes, is the same applicable for alphanumeric literals that have values in the form of hexadecimal numbers (for example, in a literal X(02) one stores a new-line character as x'0D25')?
I'm trying to implement the above code in zOS IBM mainframe v10. Thanks.
You have pretty much answered your own question... The answer is yes you can do this. Here is a working example program:
IDENTIFICATION DIVISION.
PROGRAM-ID. EXAMPLE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-INPUT-STRING PIC X(80).
01 WS-COUNTER PIC 9(4).
01 WS-TAG PIC X(10).
PROCEDURE DIVISION.
MAIN-PARAGRAPH.
MOVE 'askabanskarkartikrockstar' TO WS-INPUT-STRING
MOVE ZERO TO WS-COUNTER
MOVE 'kartik' TO WS-TAG
INSPECT WS-INPUT-STRING
TALLYING WS-COUNTER
FOR CHARACTERS BEFORE WS-TAG(1:6)
DISPLAY WS-COUNTER
GOBACK
.
WS-COUNTER displays as 0011: there are 11 characters before the WS-TAG string.
Notice that I defined WS-TAG as PIC X(10). This variable is longer than the actual tag value you are looking for. To prevent the INSPECT verb from trying to match on trailing spaces introduced by:
MOVE 'kartik' TO WS-TAG
I had to specify a reference modified value for INSPECT to search for. Had I simply used:
FOR CHARACTERS BEFORE WS-TAG
without reference modification, WS-COUNTER would have been 80, the length of WS-INPUT-STRING. This is because the string 'kartik' followed by four trailing spaces is not found, so the counter tallies the length of the entire input string.
Another approach would be to specify the tag as a literal:
FOR CHARACTERS BEFORE 'kartik'
You can move hexadecimal constants into PIC X fields as follows:
MOVE X'0D25' TO WS-TAG
This occupies 2 characters so you would use WS-TAG(1:2) when INSPECTing it.
If you want to do "a lot" of this at once, then you'll find a PERFORM VARYING will be faster. It is more typing, and you have to think more, and there is more chance for error. But once you have one working, you just have to copy the code to reuse it.
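The PERFORM VARYING alternative could be sketched like this. It is an illustration only, not a measured comparison with INSPECT; WS-TAG-LEN, WS-LAST-START, WS-POS and the TAG-FOUND switch are names introduced here.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FINDTAG.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-INPUT-STRING  PIC X(80).
       01  WS-TAG           PIC X(6) VALUE 'kartik'.
       01  WS-TAG-LEN       PIC 9(4) VALUE 6.
       01  WS-LAST-START    PIC 9(4).
       01  WS-POS           PIC 9(4).
       01  WS-COUNTER       PIC 9(4) VALUE ZERO.
       01  WS-FOUND-SW      PIC X VALUE 'N'.
           88  TAG-FOUND    VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARAGRAPH.
           MOVE 'askabanskarkartikrockstar' TO WS-INPUT-STRING
      * Last position at which the tag could still fit.
           COMPUTE WS-LAST-START =
               FUNCTION LENGTH(WS-INPUT-STRING) - WS-TAG-LEN + 1
           PERFORM VARYING WS-POS FROM 1 BY 1
                   UNTIL WS-POS > WS-LAST-START OR TAG-FOUND
               IF WS-INPUT-STRING(WS-POS:WS-TAG-LEN) = WS-TAG
                   SET TAG-FOUND TO TRUE
      * Characters before the match = start position - 1.
                   COMPUTE WS-COUNTER = WS-POS - 1
               END-IF
           END-PERFORM
           IF TAG-FOUND
               DISPLAY 'Characters before tag: ' WS-COUNTER
           ELSE
               DISPLAY 'Tag not found'
           END-IF
           GOBACK
           .
```

The count is stored inside the loop because PERFORM VARYING increments WS-POS once more before the UNTIL condition ends the loop. Unlike the INSPECT version, this also tells you explicitly when the tag is absent.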
Moving an alphanumeric variable to a numeric variable caused unexpected results. Here is the code for your reference:
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-VAR-STR PIC X(3) VALUE SPACES.
01 WS-VAR-NUM PIC 9(3) VALUE ZEROES.
PROCEDURE DIVISION.
MOVE '1' TO WS-VAR-STR
MOVE WS-VAR-STR TO WS-VAR-NUM
DISPLAY 'STRING > ' WS-VAR-STR '< MOVED > ' WS-VAR-NUM '<'
IF WS-VAR-NUM >= 40 AND <= 59
DISPLAY 'INSIDE IF >' WS-VAR-NUM
ELSE
DISPLAY 'INSIDE ELSE >' WS-VAR-NUM
END-IF
GOBACK
.
OUTPUT:
STRING > 1  < MOVED > 1 0<
INSIDE ELSE >1 0
The result is bizarre, and I want to figure out why '1' is moved as '1 0' into the numeric variable. Interestingly, there was NO issue in using it in the condition either. Do share your views. Thanks for your interest.
Basically you have done an illegal MOVE. Moving alphanumeric to numeric fields is valid
provided that the alphanumeric field contains only numeric characters.
This reference summarizes valid/invalid moves.
What were you expecting as a result?
Moves of alphanumeric fields into numeric ones are done without
'conversion'. Basically you just dropped a one digit followed by two spaces into a numeric field. The '1' was ok, the two spaces
were not. The last two bytes of WS-VAR-NUM contain spaces.
But wait... why is the last character a zero? The answer to this is a bit more complicated.
Items declared as PIC 9 something are represented in Zoned Decimal.
Each digit of a zoned decimal number is represented by a single byte.
The 4 high-order bits of each byte are zone bits; the 4 high-order bits of the low-order byte represent
the sign of the item. The 4 low-order bits of each byte contain the value of the digit. The key here
is where the sign is stored. It is in the high order bits of the last byte. Your declaration did not
include a sign so the MOVE statement blows away the sign bits and replaces them with default
numeric high order bits (remember the only valid characters to MOVE are digits - so this
patch process should always yield a valid result). The high order bits of an unsigned zoned decimal
digit are always HEX F. What are the low order bits of the last byte? The last byte is a space, which has an EBCDIC HEX value of 40, so its low order bits are 0. A zero is HEX F0. Since the MOVE statement "fixes" the sign automatically, you end up with HEX F0 in the low order digit, which happens to be, you guessed it, zero. None of the other 'digits' contain sign bits so they are left as
they were.
Finally, a DISPLAY statement converts zoned decimal fields into their equivalent character representation
for presentation: Net result is: '1 0'.
BTW The above discussion is how it works out on an IBM z/OS platform - other character sets (eg. ASCII) and/or other platforms may yield different results, not because IBM is doing the wrong thing, but because the program is doing an illegal MOVE and the results are essentially undefined.