Why use VALUE HIGH-VALUES in 88 Conditionals? - cobol

I understand that HIGH-VALUES correspond to the highest in the collating sequence, however I do not understand why it may be a preferred method when using conditionals.
Example:
01 StudentRecord.
88 EndOfStudentFile VALUE HIGH-VALUES.
02 StudentID PIC X(7).
02 FILLER PIC X(23).
...
AT END SET EndOfStudentFile TO TRUE
Why not simply use VALUE 0 and SET EndOfStudentFile to 1 ?
Whats the advantage of using HIGH-VALUES in these cases?
Appreciate any input on this matter...

The conditional 88 in your example is for the StudentRecord, so it sets/queries that. I think that it may be more appropriate to use VALUE ALL HIGH-VALUES - as it stands it will set the first byte to HIGH-VALUE and then pad the record (with spaces).
VALUE 0/1 would not be possible for that as the record - because it is a group - is alphanumeric, and should not be assigned a numeric value.
... the question "is xyz preferred" is often more a question of style and only rarely "best practice". The commonly good thing is to ensure a consistent use/style so that others reading the code can understand it better.
In this specific case it could be used to "store" the information "all students were processed" which then can be queried later via IF EndOfStudentFile and if for some reason there is another START >= StudentID (I assume that is an ORGANIZATION INDEXED file here) on the file it likely will not found "another" record (still possible here, a student with an id containing ALL HIGH-VALUE would be found).

Just to clarify '88' levels do not represent real storeage.
They are conditionals which refer to the immediately preceding variable definition.
So:
If EndOfStudentFile..
is just as shortcut for
If StudentRecord is equal to High-Values...

Related

Can level 49 be used for STRING purpose

As I'm new to cobol, please help me with the below piece of code.
WORKING-STORAGE SECTION.
01 BAS-REC.
02 INPT-REC.
49 INPT-LEN PIC S9(4) COMP.
49 INPT-TEXT PIC X(150).
02 INPT1-REC.
49 INPT1-LEN PIC S9(4) COMP.
49 INPT1-TEXT PIC X(150).
02 INPT2-REC.
49 INPT2-LEN PIC S9(4) COMP.
49 INPT2-TEXT PIC X(150).
77 VAR1 PIC X(5) VALUE 'APT'.
77 NUM1 PIC 9(1).
I'm using the level 49 for character varying here (to truncate trailing spaces)
Then I have cursor fetch.
After few modification under PROCEDURE DIVISION I'm doing the below.
PERFORM UNTIL SQLCODE=100
PERFORM VARYING NUM1 FROM 1 BY 1 UNTIL NUM1=6
STRING INPT-REC DELIMITED BY ' ',' ',
VAR1 DELIMITED BY ' ',' '
NUM1 DELIMITED BY ' ' INTO INPT2-REC
EXEC SQL
insert query here (which will run 5 times)
END-EXEC
END- PERFORM
END- PERFORM
but in the table the data got inserted only once but it shold have got inserted 5 times and also the INPT2-REC hasn't been concatenated. The INPT2 -REC just contains the value of INPT-REC alone
My question is this a special characteristic of level 49 or am I wrong somewhere?
Note that if you use INPT-REC2 as a host-variable for a VARCHAR-field you will only see the part from INPT-REC since you never update the length-field: it still contains the length it was assigned from INPT-REC.
So you'll have to somehow get the actual length of INPT2-TEXT (e.g. INSPECT the REVERSE of INPT2-TEXT for LEADING SPACES) and move it to INPT2-LENGTH before your EXEC SQL.
As I already said in my comment: there is nothing special about level 49 - you could as well use 48, 33,30 or 05 with the same results. The samples in the DB2 manual probably use 49 since it is the last valid level-number without any special meaning, so it is least likely to cause problems with any level-numbers already used in the program.
As for the query being executed only once: in your loop you are varying NUM1 but are checking whether I=6 - since we don't see I anywhere in your example I can only guess that it is already equal to 6 upon entering the loop.
Level 49 can be treated specially when Embedded SQL is involved, depending on system; this text is copied from the IBM Knowledge Center
Host structure declarations in COBOL must satisfy the following requirements:
COBOL host structures can have a maximum of two levels, even though the host structure might occur within a structure with multiple levels. However, you can declare a varying-length character string, which must be level 49.
A host structure name can be a group name whose subordinate levels name elementary data items.
If you are using the DB2® precompiler, do not declare host variables or host structures on any subordinate levels after one of the following items:
A COBOL item that begins in area A
Any SQL statement (except SQL INCLUDE)
Any SQL statement within an included member
When the DB2 precompiler encounters one of the preceding items in a host structure, it considers the structure to be complete.
So, this seems like a little implementation detail (level 49 for var char) that may spill over into other implementations of COBOL ESQL. Like many details buried in systems, Knowing that would require knowing that.
This particular detail is news to me as of a few minutes ago.
Looking more just now, this came up in the esqlOC contribution for GnuCOBOL recently. A level 49 specific tweak to ensure there was no need to worry about little end big end storage between host and service. So it seems to be a thing.
And an answer to the original question is; depends on compiler environment and ESQL preprocessor, but yeah maybe level 49 fields can be used for VARCHAR.

SPSS: how to find text in text?

In my SPSS-dataset I have a string-variable QuestionA containing the answer to a certain question. However, instead of just one answer, it is possible to check more than one answer.
For example, if one checks answers 02, 05 and 07 it is saved in the variable QuestionA as the string "02;05;07".
I would like to create a variable for specific answer 02. Let's call that variable Answer02. It should contain a 0 if QuestionA does not contain 02 anywhere in its text, and a 1 if QuestionA actually contains 02 anywhere.
For me the catch lies in the fact that one could also check answer 01 as well which makes the answer contained in QuestionA 01;02.
The answer should be generic, if possible, so that I can also create a variable Answer05 in similar fashion.
This should give you a flavour:
DATA LIST FREE / Q (A9).
BEGIN DATA
"01" "02" "03" "01,02" "02,03" "04,05"
END DATA.
DO REPEAT A=A1 to A3 /B="01" "02" "03".
IF CHAR.INDEX(Q,B)>0 A=1.
END REPEAT.
RECODE A1 to A3 (SYSMIS=0).
EXE.
If you are just interested in that one case, this code is simpler.
COMPUTE Answer02=char.index(QuestionA, "02") > 0.

Find a substring in a string in COBOL

My problem is, given a variable which I read from a file, see if it contains or matches another string.
In other words, find in a file all the records whose variable
BRADD PIC X(30)
matches or contains a string introduced by keyboard.
I'm very confident this problem is resolved through the INSPECT instruction, and I've tried something like this in my code:
READ BRANCHFILE NEXT RECORD
AT END SET EndOfFile TO TRUE
END-READ.
PERFORM UNTIL EndOfFile
INSPECT BBRADD
TALLYING CONT for CHARACTERS
BEFORE INITIAL CITY
IF CONT>1
DISPLAY " BRANCH CODE :" BBRID
DISPLAY " BRANCH NAME :" BBRNAME
DISPLAY " BRANCH ADDRESS :" BBRADD
DISPLAY " PHONE :" BBRPH
DISPLAY " E-MAIL :" BEMAIL
DISPLAY " MANAGER NAME :" BMGRNAME
DISPLAY " ------------------"
DISPLAY " ------------------"
END-IF
READ BRANCHFILE NEXT RECORD
AT END SET EndOfFile TO TRUE
END-READ
MOVE 0 TO CONT
END-PERFORM.
Where CITY is the variable I introduce through keyboard.
¿Anyone knows how to find a "substring" in a "string"?
For example, if I introduced "Zaragoza" my program have to print all the records in the file which variable BBRADD contains "Zaragoza".
01 BRANCHREC.
88 EndOfFile VALUE HIGH-VALUE.
02 BBRID PIC X(6).
02 BBRNAME PIC X(15).
02 BBRADD PIC X(30).
02 BBRPH PIC X(10).
02 BEMAIL PIC X(20).
02 BMGRNAME PIC X(25).
You would need to set CONT to zero before the INSPECT, every time.
CONT just gets updated from its initial value when the INSPECT starts. After you find your first one, every record will look like it has CITY in it.
If may initially seem odd that it works that way, but if it didn't you'd be limited on the occasions when that is how you want it to work.
Ah, looking a little closer, you are setting CONT to an initial value, you are just doing it in an unexpected place. If it needs to be zero, set it to zero immediately before it should be zero. Much easier to find, less easy for someone changing the program in the future to make a mess of.
However, you have another problem. Let's say CITY is PIC X(20). The user enters SEVILLA and your INSPECT will now search for SEVILLA followed by 13 spaces. Ideally you'd want SEVILLA followed by one space.
You need to be able to test for a value that the user has entered, with a trailing blank, but not more.
The current popular way to do this is with reference-modification.
You need to take your user-input, find out how many trailing spaces it contains, calculate how long the data is, add one for the trailing blank, and hold that value in a field (preferably a BINARY field).
Then your INSPECT can look like this:
INSPECT BBRADD
TALLYING CONT for CHARACTERS
BEFORE INITIAL CITY ( 1 : length-of-data-plus-one )
However, then you have a problem if SEVILLA is actually in the start of the field.
So you make a small change, not to count characters which appear before it, but to count occurrences of it.
INSPECT BBRADD
TALLYING CONT for ALL
CITY ( 1 : length-of-data-plus-one )
Many people will instead code a PERFORM loop with reference-modification and do the test that way. With the final version of the INSPECT above have to code the termination logic yourself. For learning purposes it would be good to do it both ways.
When doing file-io, always use and check the FILE STATUS. Put your READ into a paragraph and perform it, you don't need two different pieces of code. If you use the FILE STATUS you don't need the AT END (or the END-READ) as the field you use to receive the FILE STATUS value will be "10" for end-of-file. Just use your 88 on that field, with the value of "10".
The Edit on your question now indicates where your existing 88-level is.
On the one hand, this is a good idea, because the end-of-file is associated with the record, and there can be no valid accidental content.
On the other hand, this is not a "portable" solution: if you use other COBOLs you may find that once end-of-file is reached it is no longer valid to access data under the FD. In the standard what happens in this situation is not defined, so you get differences amongst compilers.
You can retain the 88 on the group-item had have it portable by using READ ... INTO ... and having your record-layout in WORKING-STORAGE. This takes slightly longer to execute, as the data has to be transferred from one location to another.
I prefer the 88 on the FILE STATUS field and simplify the READ by being able to remove the AT END and END-READ. I already can't access the record-area under the FD so I can't accidentally get wrong values which look good.

What is a Cobol 88-type equivalent in another languages?

I'm learning COBOL now and really liking the 88-type of variables, and I want to know if there are anything like them in another languages (most known languages also, such as C, Objective-C), even using a library.
The only thing I can think being similar is using
#define booleanResult (variableName==95)
But it isn't possible to set boolenResult to true and make variableName assume 95 as value.
05 nicely-named-data PIC X.
88 a-meangingful-condition VALUE "A".
88 another-meaingingful-condition
VALUE "A" "B"
"X" THRU "Z"
SPACE ZERO.
IF a-meaningful-condition
IF another-meaningful-condition
SET a-meaningful-condition TO TRUE
SET another-meaningful-condition
TO TRUE
The IFs test the value referenced by the data-name (conditional variable) that the 88 (condition name) is associated with, for a single value or one of multiple value, which can included ranges (THRU) and figurative-constants (ZERO, SPACE, LOW-VALUES, etc).
The SET, which in this form is a more recent addition to COBOL from the 1985 Standard, will change the value of the data-name to the first value specified on the 88, such that if you immediately referenced the 88 in a test, the test would be true.
COBOL does not have booleans in the sense of something resolving to 0 or 1, or anything else, being false/true.
Any language which supports Objects could be used to mimic the behaviour. Perhaps you've even done it already without really realising it.
As NealB points out in the comments, functions could be used (or a procedure, or a transfer of control to another module) but the data and references to it would not be together and protected from accidental mischief.
COBOL has great flexibility in defining data-structures. The 88-level is a powerful aid to maintaining and understanding programs, as well as writing them in the first place.
I don't know of another language which has a direct and natural element which is equivalent to this, but then there are lots of languages I don't know.
Again NealB makes an important point in the comments about the use of THRU/THROUGH to specify a range of values.
Care does need to be taken. Although the author may think that the data that they want to select can be represented by the range "010" THRU "090", they may not realise that what the compiler does is to include every single possible value in that range, by generating code for greater than or equal to "010" and less than or equal to "090".
If using THRU, ensure that your data cannot contain anything in the range which is not expected. If you mean "010" "020" "030" ... "090" that is fine, as long as the data is validated at its entry-point, so that it can never include any intervening values.
The classic example is "A" THRU "Z" on the Mainframe. We all know what the author means, but the compiler takes it literally. You cannot use "A" THRU "Z" on its own for validation, because in EBCDIC there are "gaps" between three groups of letters, and using "A" THRU "Z" would treat those gaps as true for a use of the 88.
Where the 88 level in some COBOL compilers does fall down, is in the missing "FALSE".
To re-use from the above example:
88 a-meaingingful-condition VALUE "A".
88 a-meaingingful-condition-NOT
VALUE "N".
To test the switch/flag, you use the first 88. To turn the flag.switch off, you have to use the second. Not ideal. See one of the links below for an example of FALSE on the 88 definition.
In olden times, flags/switches were set and reset with MOVE statements. As soon as the MOVE is involved, you have the same problem as you have in trying to use functions. There is no bound relationship between the MOVE and the 88-level VALUE.
These days, SET can be used to change the value of a field, to turn a flag/switch on or off.
05 FILLER PIC X.
88 a-meaingingful-condition
VALUE "A".
88 a-meaingingful-condition-NOT
VALUE "N".
The field being tested does not even need a name (it can be FILLER or omitted (an implied FILLER)).
Of course, as NealB points out in a comment on one of the links below, someone can still get at the field with a MOVE using reference-modification on a group item. So...
01 FILLER.
05 FILLER PIC X.
88 a-meaingingful-condition
VALUE "A".
88 a-meaingingful-condition-NOT
VALUE "N".
Now they can't use reference-modification even, as there is no field to name. The value of the field can only come from a VALUE clause on the definition, or from a SET statement setting one of the 88s to TRUE.
At the stage, the value that a flag/switch has, its actual value, becomes irrelevant.
01 FILLER.
05 FILLER PIC X(7).
88 a-meaingingful-condition
VALUE "APPLE".
88 a-meaingingful-condition-NOT
VALUE "BICYCLE".
Because nothing can be used to test against a literal/data-name, and the field cannot be the target of any verb except SET, you no longer have to check that all fields which say they contain N, or Y, or 0, or 1, do so, and they're not the wrong case, and no other values get placed in those fields.
I'm not suggesting the use of APPLE and BICYCLE, just using them to illustrate the point.
An 88 can also have a value expressed in hexadecimal notation, like any alpha-numeric field:
88 a-meaingingful-condition VALUE X"25".
An 88 can also be specified on a group item, typically with a figurative-constant as the value:
01 a-group-item.
88 no-more-data-for-matching VALUE HIGH-VALUES.
05 major-key PIC X(10).
05 minor-key PIC X(5).
In a file-matching process, the keys can be set to high-values at end-of-file, and the use of the keys will still cause the other file(s) to be processed correctly (keys lower than on this file).
Here are links to a number of questions from SO relating directly, or tangentially with important aspects, to 88-levels.
COBOL level 88 data type
Group variable in cobol
In Cobol, to test "null or empty" we use "NOT = SPACE [ AND/OR ] LOW-VALUE" ? Which is it?
Does a prefix of "NO" have any special meaning in a COBOL variable?
COBOL Data Validation for capital letter?
My first programming language was Cobol, now I am using c# and here is my solution to Cobol's 88 level:
In Cobol:
01 ws-valid-countries pic xx.
88 valid-country 'US', 'UK' 'HK'.
move ws-country to ws-valid-countries
if valid-country
perform...
in C#
string[] ValidCountries = {"US","UK","HK"} ;
if ( ValidCountries.Contains(newCountry.Trim().ToUpper()) )
{
// do something
Think of it as a boolean getter (essentially as in your macro) and a setter (forcing the variable to be the corresponding value). Who says COBOL wasn't modern in 1965?
As others said, just some object programming. Which is more powerful, but far less elegant. Like :
01 MY-DATASET.
05 MY-DEPARTEMENT PIC 9(2).
88 ILE-DE-FRANCE VALUES 75, 77, 78, 91 THRU 95.
Can be roughly translated in old VBA in a class named MyDataset :
Public MyDepartement As Integer
Property Get IleDeFrance() As Boolean
Dim MyArray() As Variant
MyArray = Array(75, 77, 78, 91, 92, 93, 94, 95)
IleDeFrance = UBound(Filter(MyArray, MyDepartement, True)) > -1
End Property
(just tested, it works on VBA-excel2013)
And I made the VBA as simple as possible, no clean getter or setter for the departement number, just a public data. As a class is a depot of data plus coded actions against them, you can do more things inside than a simple 88-level(that's probably why this feature didn't make up to more modern languages). But at a the price of complexity & readability.
Less elegant because the array has to be specifically defined, and testing presence in it has to be specified also. While it's inherent to the wonderful 88 level.

Data Validation

So I have entered my second semester of College and they have me doing a course called Advanced COBOL. As one of my assignments I have to my make a program that tests certain things in a file to make sure the input has no errors. I get the general idea but there are just a few things I don't understand and my teacher is one of those people who will give you an assignment and make you figure it out yourself with little or no help. So here is what I need help with.
I have a field that the first 5 columns have to be numbers, the 6th column a capital letter and the last 2 numbers in a range of 01-68 or 78-99.
one of my fields has to be a string of numbers with a dash in it like 00000-000, but some have more than one dash. How can I count the dashes to identify that there is a problem.
Here are a few hints...
Use a hieratical record structure to view the data in different ways. For example:
01 ITEM-REC.
05 ITEM-CODE.
10 ITEM-NUM-CODE PIC 9(3).
10 ITEM-CHAR-CODE PIC A(3).
88 ITEM-TYPE-A VALUE 'AAA' THRU 'AZZ'.
88 ITEM-TYPE-B VALUE 'BAA' THRU 'BZZ'.
05 QUANTITY PIC 9(4).
ITEM-CODE is a 6 character group field, the first part of which is numeric (ITEM-NUM-CODE) and the last part
is alphabetic (ITEM-CHAR-CODE). You can refer to any one of these three variables in your program. When you
refer to ITEM-CODE, or any other group item, COBOL
treats the variable as if it were declared as PIC X. This means you can
MOVE just about anything into it without raising an error. For example:
MOVE 'ABCdef' TO ITEM-CODE
or
MOVE 'ABCdef0005' TO ITEM-REC
Neither one would cause an error even though the elementary data item ITEM-NUM-CODE is definitely not a number.
To verify the validity
of your data after a group move you should validate each elementary data item separately (unless
you know for certain no data type errors could have occurred). There are a variety of ways to do this. For
example if the data item has to be numeric the following would work:
IF ITEM-NUM-CODE IS NUMERIC
CONTINUE
ELSE
DISPLAY 'ITEM-NUM-CODE IS NOT NUMERIC'
PERFORM BIG-BAD-ERROR
END-IF
COBOL provides various class tests which can be applied against a data item. For
example: NUMERIC, ALPHABETIC and ALPHANUMERIC are commonly used.
Another common way to test for ranges of values is by defining various 88 levels - but exercise
caution. In the above
example ITEM-TYPE-A is an 88 level that defines a data range from 'AAA' through 'AZZ' based on
the collating sequence currently in effect. To verify that ITEM-CHAR-CODE contains only alphabetic
characters and the first letter is an 'A' or a 'B', you could do something like:
IF ITEM-CHAR-CODE ALPHABETIC
DISPLAY 'ITEM-CHAR-CODE is alphabetic.'
EVALUATE TRUE
WHEN ITEM-TYPE-A
DISPLAY 'ITEM-CHAR-CODE is in range AAA through AZZ'
WHEN ITEM-TYPE-B
DISPLAY 'ITEM-CHAR-CODE is in range BAA through BZZ'
WHEN OTHER
DISPLAY 'ITEM-CHAR-CODE is in some other range'
END-EVALUATE
ELSE
DISPLAY 'ITEM-CHAR-CODE is not alphabetic'
END-IF
Note the separate test for ALPHABETIC above. Why do that when the 88 level tests
could have done the job? Actually the 88's are not sufficient because they
cover the entire range from AAA through AZZ based on the collating sequence currently
in effect. In
an EBCDIC based environment (a very large number of COBOL shops use EBCDIC) this captures
values such as A}\. the close-brace and backslash characters are non-alpha but
fall into the middle of
the range 'A' through 'Z' (what the #*#! is that all about?). Also note that a value such
as 'aaa' would not satisfy the ITEM-TYPE-A condition because lower case letters fall outside
the defined range. Maybe time to check out an EBCDIC character table.
Finally, you can count the number of occurrences of a character, or string of characters, in
a variable with the INSPECT verb as follows:
INSPECT ITEM-CODE TALLING DASH-COUNT FOR ALL '-'
DASH-COUNT needs to be a numeric item and will contain the number of dash characters in ITEM-CODE. The INSPECT
verb is not so useful if you want to count the number of digits. For this you would need one statement for each digit.
It might be easier to just code a loop something like:
PERFORM VARYING I FROM 1 BY 1
UNTIL I > LENGTH OF ITEM-CODE
EVALUATE ITEM-CODE(I:1)
WHEN '-'
COMPUTE DASH-COUNT = DASH-COUNT + 1
WHEN '0' THRU '9'
COMPUTE DIGIT-COUNT = DIGIT-COUNT + 1
WHEN OTHER
COMPUTE OTHER-COUNT = OTHER-COUNT + 1
END-EVALUATE
END-PERFORM
Now ask yourself why I was comfortable using a zero through 9 range check? Hint: look at the collating sequence.
Hope this helps.

Resources