I'm using the grammar on this site in my javacc. It works fine apart from some picture statements. For example ----,---,---.99 or --9.
http://mapage.noos.fr/~bpinon/cobol.jj
It doesn't seem to like more than one dash.
What do I need to change in this to support my picture examples?
I've messed about with
void NumericConstant() :
{}
{
(<PLUSCHAR>|<MINUSCHAR>)? IntegerConstant() [ <DOTCHAR> IntegerConstant() ]
}
but nothing seems to be working. Any help is much appreciated.
EDIT:
<COBOL_WORD: ((["0"-"9"])+ (<MINUSCHAR>)*)*
(["0"-"9"])* ["a"-"z"] ( ["a"-"z","0"-"9"] )*
( (<MINUSCHAR>)+ (["a"-"z","0"-"9"])+)*
>
Is this the regular expression for this whole line:
07 STRINGFIELD2 PIC AAAA. ??
If I want to accept 05 TEST3 REDEFINES TEST2 PIC X(10). would I change the regex to be:
<COBOL_WORD: ((["0"-"9"])+ (<MINUSCHAR>)*)*
(<REDEFINES> (["0"-"9"])* ["a"-"z"] ( ["a"-"z","0"-"9"] )*)?
(["0"-"9"])* ["a"-"z"] ( ["a"-"z","0"-"9"] )*
( (<MINUSCHAR>)+ (["a"-"z","0"-"9"])+)*
Thanks a lot for the help so far
Why are you messing around with NumericConstant() when you are trying to parse a
COBOL PICTURE string?
According to the JavaCC source you have, a COBOL PICTURE should parse with:
void DataPictureClause() :
{}
{
( <PICTURE> | <PIC> ) [ <IS> ] PictureString()
}
The --9 bit is a Picture String and should parse with the PictureString() function:
void PictureString() :
{}
{
[ PictureCurrency() ]
( ( PictureChars() )+ [ <LPARENCHAR> IntegerConstant() <RPARENCHAR> ] )+
[ PicturePunctuation() ( ( PictureChars() )+ [ <LPARENCHAR> IntegerConstant() <RPARENCHAR> ] )+ ]
}
PictureCurrency() comes up empty so move on to PictureChars():
void PictureChars() :
{}
{
<INTEGER> | <COBOL_WORD>
}
But COBOL_WORD does not appear to support many "interesting" valid PICTURE clause definitions:
<COBOL_WORD: ((["0"-"9"])+ (<MINUSCHAR>)*)*
(["0"-"9"])* ["a"-"z"] ( ["a"-"z","0"-"9"] )*
( (<MINUSCHAR>)+ (["a"-"z","0"-"9"])+)*
>
Parsing COBOL is not easy; in fact, it is probably one of the most difficult languages in existence to build a quality parser
for. I can tell you right now that the JavaCC source you are working from is not going to cut it, except for some very simple and probably
totally artificial COBOL program examples.
Answer to comment
COBOL Picture strings tend to mess up the best of parsers. The minus sign you are
having trouble with is only the tip of the iceberg! Picture strings are difficult to
parse because the period and comma may be part of a Picture string but serve as separators
outside of it. This means that parsers cannot unambiguously classify a period or comma in a
context-free manner. They need to be "aware" of the context in which it is encountered. This
may sound trivial but it isn't.
Technically, the separator period and comma must be followed by a space (or end of line). This
little fact could make determining the period/comma role very simple because a Picture string
cannot contain a space. However, many commercial COBOL compilers are "smart" enough to correctly
recognize separator periods/commas that are not followed by a space. Consequently, there are a
lot of COBOL programmers who code illegal separator periods/commas, which means you
will probably have to deal with them.
The bottom line is that no matter what you do, those little Picture strings are going to
haunt you. They will take quite a bit of effort to deal with.
Just a hint of things to come, how would you parse the following:
01 DISP-NBR-1 PIC -99,999.
01 DISP-NBR-2 PIC -99,999..
01 DISP-NBR-3 PIC -99,999, .
01 DISP-NBR-4 PIC -99,999,.
The period following DISP-NBR-1 terminates the Picture string. It is a separator period. The
period following DISP-NBR-2 is part of the string, the second period is the separator. The comma
following DISP-NBR-3 is a separator - it is not part of the Picture string. However the comma
following DISP-NBR-4 is part of the Picture string because it is not followed by a space.
Welcome to COBOL!
I found that I had to switch the lexer into another mode when I got PICTURE. A COBOL PICTURE string has completely different 'lexics' from the rest of the language, and you must discourage the lexer from doing anything with periods, commas, etc., other than accumulate them into the picture string. See NealB's answer for some examples of knowing when to stop picture-scanning.
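To make the mode switch concrete, here is a minimal sketch in Python (not JavaCC, and not taken from the grammar in the question). It assumes the strict rule described above: a period or comma ends the picture string only when it is followed by a space or the end of the line. The token names and the `tokenize` function are made up for illustration.

```python
import re

# Characters that form an ordinary word token outside a PICTURE string.
WORD = re.compile(r"[A-Za-z0-9-]+")


def tokenize(line: str) -> list[tuple[str, str]]:
    """Tokenize one data description line, switching mode after PIC/PICTURE."""
    tokens = []
    i, n = 0, len(line)
    in_picture = False
    while i < n:
        if line[i] == " ":
            i += 1
            continue
        if in_picture:
            start = i
            # Accumulate everything up to a space, or up to a period/comma
            # that is itself followed by a space or the end of the line.
            while i < n and line[i] != " ":
                if line[i] in ".," and (i + 1 == n or line[i + 1] == " "):
                    break
                i += 1
            text = line[start:i]
            if text.upper() == "IS":  # optional IS keyword; stay in picture mode
                tokens.append(("WORD", text))
            else:
                tokens.append(("PICTURE_STRING", text))
                in_picture = False
            continue
        match = WORD.match(line, i)
        if match:
            word = match.group()
            tokens.append(("WORD", word))
            in_picture = word.upper() in ("PIC", "PICTURE")
            i = match.end()
        else:
            tokens.append(("SEPARATOR", line[i]))
            i += 1
    return tokens
```

Run against the four DISP-NBR lines, this classifies the trailing periods and commas the way NealB describes. It does not attempt the "smart" compiler behaviour of accepting separators that are not followed by a space; that needs knowledge of which picture strings are valid.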
I have no idea why you want to incorporate the REDEFINES phrase into the word. Just parse it normally in the parser.
Hello guys, I want to convert my non-delimited file into a delimited file.
An example of the file is as follows.
Name. CIF Address line 1 State Phn Address line 2 Country Billing Address line 3
Alex. 44A. Biston NJ 25478163 4th,floor XY USA 55/2018 kenning
And so on; all the data are in this format.
The first three lines are metadata, followed by the data.
How can I convert it into a properly delimited format programmatically?
There are two parts in the problem:
how to find the column widths
how to split each line into fields and output a new line with delimiters
I could not propose an automated solution for the first one, because (not knowing anything about the metadata format) there is no clear way to find where one column ends and the next one begins. Some of the column headings contain multiple space-separated words, and space is also used as a separator between the headings. Apparently one cannot use the rule "more than one space means the end of a heading name" either, because there's only one space between "Address line 2" and "Country", and they're clearly separate columns. Finding the correct column widths requires understanding English, and this is not something that you can write a simple program for.
For the second problem, things are much easier - once you have the column positions. If you figure the column positions manually (or programmatically, if you know something about the metadata that I don't - and you have a simple method for finding what's a column heading), then a program written in AWK can do this, for example:
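cols="8,15,32,40,53,66,83,105"
awk_prog='BEGIN {
    nt = split(cols, tabs, ",")   # tabs[1..nt]: start position of each following column
    delim = ","
    ORS = ""
}
{
    o = 1
    # Counted loop: "for (i in tabs)" does not guarantee index order.
    for (i = 1; i <= nt; i++) {
        t = tabs[i]
        f = substr($0, o, t - o)
        sub(" *$", "", f)         # strip trailing padding
        print f delim
        o = t
    }
    print substr($0, o) "\n"
}'
awk -v cols="$cols" "$awk_prog" input_file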
NOTE that the above program does not deal correctly with the case when the separator character (e.g. ",") appears inside the data. If you decide to use this as-is, be sure to use a separator that is not present in the input data. It may be better to modify the code to escape any separator characters found in the input data (there are different ways to do this - depends on what you plan to feed the output file to).
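As one way of handling the escaping mentioned in the note, here is a minimal Python sketch of the same split that leaves the quoting to the standard csv module. It assumes the same 1-based column start positions as the `cols` variable above; the function names are made up for illustration, and quoting is only one of the possible escaping schemes.

```python
import csv
import io


def split_fixed_width(line: str, starts: list[int]) -> list[str]:
    """Cut a line at 1-based column start positions and trim trailing padding."""
    fields = []
    begin = 0
    for start in starts:
        fields.append(line[begin:start - 1].rstrip(" "))
        begin = start - 1
    fields.append(line[begin:].rstrip(" "))
    return fields


def to_delimited(fields: list[str], delimiter: str = ",") -> str:
    """Join fields, quoting any field that contains the delimiter or a quote."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter, lineterminator="\n").writerow(fields)
    return buffer.getvalue().rstrip("\n")
```

A field such as 4th,floor then comes out wrapped in double quotes, which most CSV readers understand. Whether that suits you still depends on what will consume the output file.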
I am reading a COBOL program file and I am struggling to understand the way the STRING command works in the following example
STRING WK-NO-EMP-SGE
','
WK-DT-DEB-PER-FEU-TEM
','
WK-DT-FIN-PER-FEU-TEM
DELIMITED BY SIZE
INTO UUUUUU-CO-CLE-ERR-DB2
I have three possible understandings of what it does:
Either the code concatenates each variable into UUUUUU-CO-CLE-ERR-DB2, separating the values with ',', and only the last variable is delimited by size;
Or the code concatenates each variable into UUUUUU-CO-CLE-ERR-DB2, separating the values with ',', and all the values are delimited by size (meaning that DELIMITED BY SIZE in this case applies to all the values passed to the STRING command);
Or each variable is delimited by a specific character, for example WK-NO-EMP-SGE would be delimited by ',', WK-DT-DEB-PER-FEU-TEM by ',' and WK-DT-FIN-PER-FEU-TEM would then be DELIMITED BY SIZE.
Which of my readings is the correct one?
Here's the syntax-diagram for STRING (from the Enterprise COBOL Language Reference):
Now you need to know how to read it.
Fortunately, the same document tells you how:
How to read the syntax diagrams
Use the following description to read the syntax diagrams in this
document:
. Read the syntax diagrams from left to right, from top to bottom,
following the path of the line.
The >>--- symbol indicates the beginning of a syntax diagram.
The ---> symbol indicates that the syntax diagram is continued on the
next line.
The >--- symbol indicates that the syntax diagram is continued from
the previous line.
The --->< symbol indicates the end of a syntax diagram. Diagrams of
syntactical units other than complete statements start with the >---
symbol and end with the ---> symbol.
. Required items appear on the horizontal line (the main path).
. Optional items appear below the main path.
. When you can choose from two or more items, they appear vertically,
in a stack.
If you must choose one of the items, one item of the stack appears on
the main path.
If choosing one of the items is optional, the entire stack appears
below the main path.
. An arrow returning to the left above the main line indicates an item
that can be repeated.
A repeat arrow above a stack indicates that you can make more than one
choice from the stacked items, or repeat a single choice.
. Variables appear in italic lowercase letters (for example, parmx).
They represent user-supplied names or values.
. If punctuation marks, parentheses, arithmetic operators, or other
such symbols are shown, they must be entered as part of the syntax.
All that means, if you follow it through, that your number 2 is correct.
You can use a delimiter (when you don't have fixed-length data) or just use the size. Any item which is not explicit in how it is delimited, is delimited by the next DELIMITED BY statement.
One thing to watch for with STRING, which doesn't matter in your case, is that the target field does not get space-padded if the data is shorter than the target. With variable-length data, you need to clear the field to space before the STRING executes.
There is a nuance one must grasp in order to understand the results. DELIMITED BY SIZE can be misleading if one has experience in other programming languages.
Each of the three variables has a size that is defined in WORKING-STORAGE. Let's presume it looks something like this.
05 WK-NO-EMP-SGE PIC X(04).
05 WK-DT-DEB-PER-FEU-TEM PIC X(10).
05 WK-DT-FIN-PER-FEU-TEM PIC X(10).
If the value of the variables were set like this:
MOVE 'BOB' TO WK-NO-EMP-SGE.
MOVE 'Q' TO WK-DT-DEB-PER-FEU-TEM.
MOVE 'D19EIEIO2B' TO WK-DT-FIN-PER-FEU-TEM.
Then one might expect the value of UUUUUU-CO-CLE-ERR-DB2 to be:
BOB,Q,D19EIEIO2B
But it would actually be:
BOB ,Q         ,D19EIEIO2B
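If it helps to see the padding explicitly, here is a small Python model of that behaviour (a sketch, not COBOL). It assumes the PIC X(04) and PIC X(10) definitions presumed above; `pic_x` and `string_delimited_by_size` are made-up helper names.

```python
def pic_x(value: str, size: int) -> str:
    """Model a MOVE to a PIC X(size) field: left-justify, pad or truncate."""
    return value.ljust(size)[:size]


def string_delimited_by_size(*sources: str) -> str:
    """Model STRING ... DELIMITED BY SIZE: every source is copied whole."""
    return "".join(sources)


wk_no_emp_sge = pic_x("BOB", 4)
wk_dt_deb = pic_x("Q", 10)
wk_dt_fin = pic_x("D19EIEIO2B", 10)

result = string_delimited_by_size(wk_no_emp_sge, ",", wk_dt_deb, ",", wk_dt_fin)
```

Every sending field contributes its full declared length, trailing spaces included, so `result` is 26 characters long rather than 16.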
I want to make three fields into one field with only a single space between each word in COBOL. Is the format below correct?
STRING SORT-WORKER-LAST SPACE
SORT-WORKER-FIRST SPACE
SORT-WORKER-MID SPACE
DELIMITED BY SIZE
INTO REC-VSAM-NAME
This didn't work:
STRING SORT-WORKER-LAST SPACE
SORT-WORKER-FIRST SPACE
SORT-WORKER-MID SPACE
DELIMITED BY space
INTO REC-VSAM-NAME
STRING SORT-WORKER-LAST
SORT-WORKER-FIRST
SORT-WORKER-MID
DELIMITED BY space
INTO REC-VSAM-NAME
Not working either.
SS5726 test test t
" " DELIMETED BY SPACE
This above code is not giving me what I am looking for either.
When used in a STRING statement, the figurative constant SPACE (or SPACES, they are equivalent, the plural means nothing except for human reading) has a length of one byte.
You may not be finished with this. If your source fields contain embedded spaces, you would do best to abandon STRING and do something else.
If you proceed with STRING or there is another time you want to consider using it, then you also have to think about the length of your output field. If you don't do anything about it, it will be quietly truncated.
I've included an example of how to do something. Note that the STRING now has a conditional element (ON), so you must delimit the scope of the STRING by END-STRING (also possible, but tacky, with full-stop/period).
If, logically, the output cannot be breached, the ON OVERFLOW is not needed. Also, if what you are told to do is "just truncate" then it can be omitted, although I'd tend to at least count them, and display the count at the end of the program. Then when the Analyst has said, "there won't be any, just truncate if there are" you can go back and say that there were 3,931 when you did your volume test.
As ScottNelson has pointed out in a comment, there are a couple of things to watch out for with STRING. What concerns you here is that only the data selected by the STRING will appear in your output field, your output field will not be space-padded, as it would be after a MOVE statement.
Because you have been using fixed-length fields up to now, you won't have noticed this. Once you have the correction, you may find, if you are not setting the output field to SPACE first, that you have a mixture of values, with some left over from the previous content.
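Both effects, the quiet truncation and the leftover content, can be modelled in a few lines of Python. This is only a sketch of the behaviour described here, assuming no POINTER phrase; `string_into` is a made-up name.

```python
def string_into(target: str, sources: list[str]) -> tuple[str, bool]:
    """Model STRING ... INTO target (no POINTER phrase).

    Returns the new field content and whether ON OVERFLOW would fire.
    """
    data = "".join(sources)
    overflow = len(data) > len(target)
    data = data[:len(target)]
    # Only the transferred characters are replaced; the rest of the
    # receiving field keeps whatever it held before.
    return data + target[len(data):], overflow
```

Calling it with a target that still holds old data shows the mixture of values; calling it with a target of all spaces models doing MOVE SPACE to the field first.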
Another one with STRING is the POINTER.
The effects of the way STRING works are useful if that is what you want. You just have to know what to do to avoid those things when you don't want that action.
Every time you find something new in COBOL, hit that manual. Language Reference first. Try to understand. Programming Guide. Try further. If unsure, experiment. Read manual. Experiment. Continue until understood.
Each time I read the manual, I try to look at something else as well. One technique with knowing a language is to know the type of thing that can be done, and to know where to find the detail, and how to understand the explanations.
You will find similar things with all the "complex" COBOL verbs, STRING, UNSTRING, INSPECT. They have actions which seem initially to be working against you, but which are useful, and otherwise not available, when you need them.
IDENTIFICATION DIVISION.
PROGRAM-ID. DOUGH.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 PART-1 PIC X(30) VALUE "TEST".
01 PART-2 PIC X(30) VALUE "TEST".
01 PART-3 PIC X(30) VALUE "T".
01 ALL-PARTS PIC X(30).
PROCEDURE DIVISION.
MOVE SPACE TO ALL-PARTS
* MOVE 1 TO data-name-used-with-POINTER
* (if used; the POINTER value must be at least 1)
STRING PART-1 DELIMITED BY SPACE
SPACE DELIMITED BY SIZE
PART-2 DELIMITED BY SPACE
SPACE DELIMITED BY SIZE
PART-3 DELIMITED BY SPACE
INTO ALL-PARTS
ON OVERFLOW
DISPLAY "SORRY, YOUR DATA WAS TRUNCATED"
END-STRING
DISPLAY
">"
ALL-PARTS
"<"
GOBACK
.
Try....
STRING SORT-WORKER-LAST DELIMITED BY SPACE
" " DELIMITED BY SIZE
SORT-WORKER-FIRST DELIMITED BY SPACE
" " DELIMITED BY SIZE
SORT-WORKER-MID DELIMITED BY SPACE
INTO REC-VSAM-NAME
Try
STRING
field-1 DELIMITED BY SIZE
" " DELIMITED BY SIZE
field-2 DELIMITED BY SIZE
INTO big-field
Just for completeness' sake, you can do the following if you want to be able to cope with data fields that have embedded spaces (in other words, text fields containing multiple words):
INSPECT SORT-WORKER-FIRST
REPLACING TRAILING SPACES BY LOW-VALUES.
INSPECT SORT-WORKER-MID
REPLACING TRAILING SPACES BY LOW-VALUES.
INSPECT SORT-WORKER-LAST
REPLACING TRAILING SPACES BY LOW-VALUES.
STRING SORT-WORKER-LAST " " SORT-WORKER-FIRST " " SORT-WORKER-MID
DELIMITED BY LOW-VALUE INTO REC-VSAM-NAME.
For instance, this would cope when SORT-WORKER-LAST contained something like "VAN DYKE".
If you didn't want to modify the existing SORT-WORKER-* fields, you'd have to move each to a separate field and INSPECT and then STRING those fields.
What you are doing here is converting each of the strings to the 'C' equivalent - terminated by a NUL.
Of course, this depends if your Cobol is new enough.
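A rough Python model (not COBOL) of why the delimiter choice matters: with DELIMITED BY SPACE each sending item is cut at its first space, so a SPACE item contributes nothing and "VAN DYKE" loses its second word, whereas the LOW-VALUE approach keeps embedded spaces. The helper names and field lengths are invented for the illustration.

```python
LOW_VALUE = "\x00"


def trailing_spaces_to_low_values(field: str) -> str:
    """Model INSPECT ... REPLACING TRAILING SPACES BY LOW-VALUES."""
    stripped = field.rstrip(" ")
    return stripped + LOW_VALUE * (len(field) - len(stripped))


def string_delimited_by(sources: list[str], delimiter: str) -> str:
    """Model STRING ... DELIMITED BY delimiter: copy each source up to,
    but not including, the first occurrence of the delimiter."""
    return "".join(source.split(delimiter, 1)[0] for source in sources)


last = "VAN DYKE".ljust(20)
first = "DICK".ljust(15)
mid = "W".ljust(5)

by_space = string_delimited_by([last, " ", first, " ", mid], " ")
by_low_value = string_delimited_by(
    [trailing_spaces_to_low_values(last), " ",
     trailing_spaces_to_low_values(first), " ",
     trailing_spaces_to_low_values(mid)],
    LOW_VALUE,
)
```

Here `by_space` comes out with the words run together and the surname cut short, while `by_low_value` has the three names separated by single spaces.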
I have some CSV records which are variable in length, for example:
0005464560,45667759,ZAMTR,!To ACC 12345678,DR,79.85
0006786565,34567899,ZAMTR,!To ACC 26575443,DR,1000
I need to separate each of these fields, and I need the last field, which should be a monetary amount.
However, as I read the file and UNSTRING the record into fields, I found that the last field contains junk values at its end. The amount (money) field should be 8 characters: 5 digits at the front, 1 dot, and 2 digits at the end. The values from the input could be any value, such as 13.5, 1000 and 354.23.
"FILE SECTION"
FD INPUT_FILE.
01 INPUT_REC PIC X(66).
"WORKING STORAGE SECTION"
01 WS_INPUT_REC PIC X(66).
01 WS_AMOUNT_NUM PIC 9(5).9(2).
01 WS_AMOUNT_TXT PIC X(8).
"MAIN SECTION"
UNSTRING INPUT_REC DELIMITED BY ","
INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT
MOVE WS_AMOUNT_TXT(1:8) TO WS_AMOUNT_NUM(1:8)
DISPLAY WS_AMOUNT_NUM
In the display, the values look normal (345.23, 1000), just as they are in the input. However, after I write the field to a file, here is what they become:
79.85^M^#^#
137.35^M^#
I have inspected the field WS_AMOUNT_NUM, which came from the field WS_AMOUNT_TXT, and found that ^# is a kind of LOW-VALUE. However, I cannot work out what ^M is; it is not a space and not a HIGH-VALUE.
I am guessing, but it looks like you may be reading variable length records from a file into a fixed length
COBOL record. The junk at the end of the COBOL record is giving you some grief. Hard to say how consistent that junk is going
to be from one read to the next (data beyond the bounds of the actual input record length is technically
undefined). That junk ends up being included in WS_AMOUNT_TXT after the UNSTRING.
There are a number of ways to solve this problem. The suggestion I am giving you here may not
be optimal, but it is simple and should get the job done.
The last INTO field, WS_AMOUNT_TXT, in your UNSTRING statement is the one that receives all of the trailing
junk. That junk needs to be stripped off. Knowing that the only valid characters in the last field are
digits and the decimal character, you could clean it up as follows:
PERFORM VARYING WS_I FROM LENGTH OF WS_AMOUNT_TXT BY -1
UNTIL WS_I = ZERO
IF WS_AMOUNT_TXT(WS_I:1) IS NUMERIC OR
WS_AMOUNT_TXT(WS_I:1) = '.'
MOVE ZERO TO WS_I
ELSE
MOVE SPACE TO WS_AMOUNT_TXT(WS_I:1)
END-IF
END-PERFORM
The basic idea in the above code is to scan from the end of the last UNSTRING output field
to the beginning replacing anything that is not a valid digit or decimal point with a space.
Once a valid digit/decimal is found, exit the loop on the assumption that the rest will
be valid.
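For anyone who wants to try the idea outside COBOL, the same backwards scan can be sketched in Python. This is only a model of the loop above, using a plain string in place of WS_AMOUNT_TXT; `strip_trailing_junk` is a made-up name.

```python
VALID = set("0123456789.")


def strip_trailing_junk(field: str) -> str:
    """Blank out everything after the last digit or decimal point."""
    chars = list(field)
    for i in range(len(chars) - 1, -1, -1):
        if chars[i] in VALID:
            break  # first valid character from the right: stop scanning
        chars[i] = " "
    return "".join(chars)
```

The field keeps its length; the carriage return and the low-values at the end simply become spaces.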
After cleanup, use the intrinsic function NUMVAL, as outlined in my answer to your previous question,
to convert WS_AMOUNT_TXT into a numeric data type.
One final piece of advice, MOVE SPACES TO INPUT_REC before each READ to blow away data left over
from a previous read that might be left in the buffer. This will protect you when reading a very "short"
record after a "long" one - otherwise you may trip over data left over from the previous read.
Hope this helps.
EDIT Just noticed this answer to your question about reading variable length files. Using a variable length input record is a better approach. Given the
actual input record length you can do something like:
UNSTRING INPUT_REC(1:REC_LEN) INTO...
Where REC_LEN is the variable specified after OCCURS DEPENDING ON for the INPUT_REC file FD. All the junk you are encountering occurs after the end of the record as defined by REC_LEN. Using reference modification as illustrated above trims it off before UNSTRING does its work to separate out the individual data fields.
EDIT 2:
Cannot use reference modification with UNSTRING. Darn... It is possible with some other COBOL dialects but not with OpenVMS COBOL. Try the following:
MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...
Where WS_BUFFER is a working storage PIC X variable long enough to hold the longest input record. When you MOVE a short alphanumeric field to a longer one, the destination field (i.e. WS_BUFFER) is left justified, with spaces used to pad the remaining space. Since leading and trailing spaces are acceptable to the NUMVAL function, you have exactly what you need.
I have a reason for pushing you in this direction. Any junk that ends up at the trailing end of a record buffer when reading a short record is undefined. There is a possibility that some of that junk just might end up being a digit or a decimal point. Should this occur, the cleanup routine I originally suggested would fail.
EDIT 3:
There are no ^# in the resulting WS_AMOUNT_TXT, but still there are a ^M
Looks like the file system is treating <CR> (that ^M thing) at the end of each record as data.
If the file you are reading came from a Windows platform and you are now
reading it on a UNIX platform that would explain the problem. Under Windows records
are terminated with <CR><LF> while on UNIX they are terminated with <LF> only. The
UNIX file system treats <CR> as if it were part of the record.
If this is the case, you can be pretty sure that there will be a single <CR> at the
end of every record read. There are a number of ways to deal with this:
Method 1: As you already noted, pre-edit the file using Notepad++ or some other
tool to remove the <CR> characters before processing through your COBOL program.
Personally I don't think this is the best way of going about it. I prefer to use a COBOL
only solution since it involves fewer processing steps.
Method 2: Trim the last character from each input record before processing it. The last
character should always be <CR>. Try the following if you are reading records as variable
length and have the actual input record length available.
SUBTRACT 1 FROM REC_LEN
MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...
Method 3: Treat <CR> as a delimiter when UNSTRINGing as follows:
UNSTRING INPUT_REC DELIMITED BY "," OR x"0D"
INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT
Method 4: Condition the last receiving field from UNSTRING by replacing trailing
non-digit/non-decimal-point characters with spaces. I outlined this solution a little earlier in this
question. You could also explore the INSPECT statement using the REPLACING option (Format 2). This should be able to do pretty much the same thing - just replace all x"00" by SPACE and x"0D" by SPACE.
Where there is a will, there is a way. Any of the above solutions should work for you. Choose the one you are most comfortable with.
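As a quick way to sanity-check the Method 3 idea outside COBOL, here is a small Python sketch that treats both the comma and the carriage return as delimiters and converts the amount. It assumes six fields per record, as in the UNSTRING above; `parse_record` is a made-up name and Decimal merely stands in for the NUMVAL step.

```python
import re
from decimal import Decimal


def parse_record(record: str) -> tuple[list[str], Decimal]:
    """Split on "," or carriage return, keep six fields, convert the amount."""
    fields = re.split(r"[,\r]", record)[:6]
    # Strip padding and NUL bytes before conversion, much as NUMVAL
    # tolerates leading and trailing spaces.
    amount = Decimal(fields[5].strip(" \x00"))
    return fields, amount
```

Whatever follows the carriage return lands beyond the sixth field and is discarded, which mirrors UNSTRING running out of receiving fields.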
^M is a carriage return.
Would Google Refine be useful for rectifying this data?
I am now working in mainframe,
in some modules, to test
Not null or Empty
we see :
NOT = SPACE OR LOW-VALUE
The chief says that we should do :
NOT = SPACE AND LOW-VALUE
Which one is it ?
Thanks!
Chief is correct.
COBOL is supposed to read something like natural language (this turns out to be just
another bad joke).
Let's play with the following variables and values:
A = 1
B = 2
C = 3
An expression such as:
IF A NOT EQUAL B THEN...
Is fairly straightforward to understand. One is not equal to two, so we will do
whatever follows the THEN. However,
IF A NOT EQUAL B AND A NOT EQUAL C THEN...
Is a whole lot harder to follow. Again one is not equal to two AND one is not
equal to three so we will do whatever follows the 'THEN'.
COBOL has a shorthand construct that IMHO should never be used. It confuses just about
everyone (including me from time to time). Shorthand expressions let you reduce the above to:
IF A NOT EQUAL B AND C THEN...
or, if you would like to apply De Morgan's law:
IF NOT (A EQUAL B OR C) THEN...
My advice to you is to avoid NOT in expressions and NEVER use COBOL shorthand expressions.
What you really want is:
IF X = SPACE OR X = LOW-VALUE THEN...
CONTINUE
ELSE
do whatever...
END-IF
The above does nothing when the 'X' contains either spaces or low-values (nulls). It
is exactly the same as:
IF NOT (X = SPACE OR X = LOW-VALUE) THEN
do whatever...
END-IF
Which can be transformed into:
IF X NOT = SPACE AND X NOT = LOW-VALUE THEN...
And finally...
IF X NOT = SPACE AND LOW-VALUE THEN...
My advice is to stick to longer, simple-to-understand, straightforward expressions
in COBOL, forget the short hand crap.
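To see why the OR version in the question is wrong, it helps to write out the expansions. Below is a small Python truth check (a sketch, not COBOL), assuming a one-byte field and that the abbreviated condition carries the subject and the NOT = operator over to the second operand; the function names are made up.

```python
SPACE = " "
LOW_VALUE = "\x00"


def not_equal_space_and_low_value(x: str) -> bool:
    """X NOT = SPACE AND LOW-VALUE, expanded to its two full comparisons."""
    return x != SPACE and x != LOW_VALUE


def not_equal_space_or_low_value(x: str) -> bool:
    """X NOT = SPACE OR LOW-VALUE, expanded the same way."""
    return x != SPACE or x != LOW_VALUE
```

The AND form is false for a space and for a low-value and true for anything else. The OR form is true for every value, because nothing can be equal to both at once, so it tests nothing.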
In COBOL, there is no such thing as a Java null AND it is never "empty".
For example, take a field
05 FIELD-1 PIC X(5).
The field will always contain something.
MOVE LOW-VALUES TO FIELD-1.
Now it contains hexadecimal zeros: x'0000000000'
MOVE HIGH-VALUES TO FIELD-1.
Now it contains all binary ones: x'FFFFFFFFFF'
MOVE SPACES TO FIELD-1.
Now each byte is a space: x'4040404040' (in EBCDIC).
Once you declare a field, it points to a certain area in memory. That memory area must be set to something. Even if you never modify it, it will still have whatever garbage it had before the program was loaded, unless you initialize it.
05 FIELD-1 PIC X(6) VALUE 'BARUCH'.
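The hex values quoted above for spaces are EBCDIC ones, as found on the mainframe. A quick Python check (a sketch, assuming code page 037 as the EBCDIC variant and 00/FF as the usual LOW-VALUES/HIGH-VALUES bytes) shows the same five-byte field under both encodings; `hex_dump` is a made-up helper.

```python
def hex_dump(data: bytes) -> str:
    """Format bytes in the x'..' notation used above."""
    return "x'" + data.hex().upper() + "'"


low_values = bytes([0x00]) * 5
high_values = bytes([0xFF]) * 5
spaces_ebcdic = (" " * 5).encode("cp037")   # EBCDIC (US/Canada) code page
spaces_ascii = (" " * 5).encode("ascii")
```

On an ASCII platform the space-filled field would show as x'2020202020' instead, which is worth remembering when comparing dumps across machines.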
It is worth noting that the value null is not always the same as low-value and this depends on the device architecture and its character set in use as determined by the manufacturer. Mainframes can have an entirely different collating sequence (low to high character code and symbol order) and symbol set compared to a device using linux or windows as you have no doubt seen by now. The shorthand used in Cobol for comparisons is sometimes used for boolean operations, like IF A GOTO PAR-5 and IF A OR C THEN .... and can be combined with comparisons of two variables or a variable and a literal value. The parser and compiler on different devices should deal with these situations in a standard (ANSI) method but this is not always the situation.
I agree with NealB. Keep it simple, avoid "short cuts", make it easy to understand without having to refer to the manual to check things out.
IF ( X EQUAL TO SPACE )
OR ( X EQUAL TO LOW-VALUES )
CONTINUE
ELSE
do whatever...
END-IF
However, why not put an 88 on X, and keep it really simple?:
88 X-HAS-A-VALUE-INDICATING-NULL-OR-EMPTY VALUE SPACE, LOW-VALUES.
IF X-HAS-A-VALUE-INDICATING-NULL-OR-EMPTY
CONTINUE
ELSE
do whatever...
END-IF
Note, in Mainframe Cobol, NULL is very restricted in meaning, and is not the meaning that you are attributing to it, Tom. "Empty" only means something in a particular coder-generated context (it means nothing to Cobol as far as a field is concerned).
We don't have "strings". Therefore, we don't have "null strings" (a string of length one including string-terminator). We don't have strings, so a field always has a value, so it can never be "empty" other than as termed by the programmer.
Oguz, I think your post illustrates how complex something that is really simple can be made, and how that can lead to errors. Can you test your conditions, please?