We have a COBOL program in which we populate values into a COBOL internal table and then search this table to find a certain value. Prior to this search, we initialize the table's index variable:
SET PAF-IDX TO 1.
Could anyone clarify whether it is allowed in COBOL to initialize an index variable like this?
INITIALIZE PAF-IDX.
No. And why would you want to "INITIALIZE" it?
This is from the IBM Enterprise Cobol manual:
identifier-1
Receiving areas.
identifier-1 must reference one of the following:
- An alphanumeric group item
- A national group item
- An elementary data item of one of the following categories:
  - Alphabetic
  - Alphanumeric
  - Alphanumeric-edited
  - DBCS
  - External floating-point
  - Internal floating-point
  - National
  - National-edited
  - Numeric
  - Numeric-edited
- A special register that is valid as a receiving operand in a MOVE
  statement with identifier-2 or literal-1 as the sending operand.
When identifier-1 re
EDIT:
The OpenCobol Programmer's Guide documents it specifically:
The list of data items eligible to be set to new values by this statement is:
Every elementary item specified as identifier-1 ..., PLUS...
Every elementary item defined subordinate to every group item specified as identifier-1
..., with the following exceptions:
USAGE INDEX items are excluded.
Items with a REDEFINES as part of their definition are excluded; this means that
items subordinate to them are excluded as well.
The Draft Cobol Standard is less explicit, but these items when used in INITIALIZE are processed by generating a SET rather than a MOVE: DATA-POINTER, FUNCTION-POINTER, OBJECT-REFERENCE, or PROGRAM-POINTER.
EDIT:
Seeing that the OpenCobol reference is not as "specific" as I thought: In IBM Cobol, currently, nothing which cannot be manipulated by being the target of a MOVE can be INITIALIZEd. This is the same for the current OpenCobol. The Draft Cobol has some exceptions, listed, but including neither INDEXED BY (which are not part of the table itself, but separate items for which the compiler itself defines storage) nor USAGE INDEX.
In my COBOL program I have the following statement:
SET MYSELF (STATUS) TO -1.
What does this statement do? I don't understand the MYSELF and STATUS words. It seems that it gives the status parameter the value -1, am I right? What does MYSELF mean?
MYSELF is a reserved word that enables a compiler-supplied task item to refer to the attributes of its own process. So you are setting STATUS in your own process to -1.
COBOL ANSI-74 Programming Reference Manual (PDF Link)
The reserved word MYSELF is a compiler-supplied task item that
enables a program to access its own task attributes. Thus, any
attribute of a given task can be referenced within that task as
ATTRIBUTE attribute-name OF MYSELF.
For example, CHANGE ATTRIBUTE DECLAREDPRIORITY OF MYSELF TO 90.
CHANGE ATTRIBUTE DECLAREDPRIORITY OF ATTRIBUTE PARTNER OF MYSELF TO 65.
The second example illustrates another task running with a task that you are running.
The PARTNER attribute refers to the other task and the example changes the
DECLAREDPRIORITY of the other task.
In a "plain" COBOL program this statement would not be valid. MYSELF would be an entry below an OCCURS (a "table cell") and STATUS would be the index to use (= a numeric variable).
But in standard COBOL the SET statement can only adjust variables of type POINTER or INDEX, and neither can be set to a negative value, so this statement would normally be invalid.
There are some implementations where you can use SET to adjust any numeric variable (where the -1 would be valid if the target is a signed variable), but as #JerryTheGreek pointed out, this does not look like standard COBOL but like the "Task-Attribute Identifiers (Extension to ANSI X3.23-1974 COBOL)".
Note:
This question is not asking for advice on which library to use; I'm rolling my own.
I'm reading through the HL7 v2.5.1 spec in order to make a parsing engine for iOS and Windows.
My question is related to the Name Validity Range component in the Patient Name field (PID-5). But I think it applies generally to all DR (Date Range) components.
In Chapter 3: Patient Administration, on page 75, the following information is listed:
Components: {...omitted...} ^ <Name Validity Range (DR)> ^
{...omitted...}
Subcomponents for Name Validity Range (DR):
<Range Start Date/Time (TS)> & <Range End Date/Time (TS)>
Subcomponents for Range Start Date/Time (TS):
<Time (DTM)> & <Degree of Precision (ID)>
Subcomponents for Range End Date/Time (TS):
<Time (DTM)> & <Degree of Precision (ID)>
I understand how the fields, components and subcomponents are structured and how their separators are used... or at least I think I do. However, the above information confuses me as to how the data would be expressed. I have searched, but cannot find a suitable message sample for this kind of data. Based on my understanding of the HL7 data structures, here's how the data would be encoded:
PID|||01234||JONES^SUSIE^Q^^^^^^^199505011201&M&199505011201&M^199505011201&M&199505011201&M
The problem here, of course, is that having subcomponents embedded in subcomponents leaves you unsure exactly how to parse the data and what data goes where.
I did look into Chapter 2: Control, Appendix A and found this text on page 160:
Note: DR cannot be legally expressed when embedded within another data type. Its use is constrained to a segment field.
So, it appears that the standard listed for PID-5 is invalid. I haven't seen any messages from my system that even generate this information, so it may be a moot point for my particular case, but I don't like developing solutions with known holes. Has anybody encountered this "in the wild"?
An item with the DR data type can be subdivided and have a precision subcomponent if the item is a field, e.g. ARQ/11 Requested start date/time range.
If the item with the DR data type is already part of another data type, as in your example PID/5, it can be subdivided into start and end of date range subcomponents, but not into a precision subcomponent.
Patient name is an XPN data type, which is a composite data type. That basically means it can contain a combination of primitive types (like ST) and other composites.
Now, you are looking at XPN.10, the 10th component, which has the DR data type. DR is in turn a combination of two primitive DTM values, start and end, i.e. two subcomponents. Subcomponents are separated by &.
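To make the component/subcomponent split concrete, here is a minimal Python sketch. It assumes the default delimiters (^ for components, & for subcomponents) instead of reading them from MSH-1/MSH-2, ignores escape sequences and field repetition (~), and uses a made-up PID-5 value; the helper name split_field is hypothetical.

```python
def split_field(field, component_sep="^", subcomponent_sep="&"):
    """Split one HL7 v2 field into components, each a list of subcomponents.

    Assumption: default delimiters, no escape sequences, no repetitions.
    """
    return [comp.split(subcomponent_sep) for comp in field.split(component_sep)]


# Hypothetical PID-5 value: XPN.10 (Name Validity Range, DR) carries only a
# start and an end date/time, with no degree-of-precision subcomponents.
pid5 = "JONES^SUSIE^Q^^^^^^^199505011201&199605011201"

components = split_field(pid5)
family_name = components[0][0]     # XPN.1
range_start, range_end = components[9]   # XPN.10 is the 10th component

print(family_name, range_start, range_end)
```

With only one level of & inside the component, the parse is unambiguous, which is the point of the restriction on DR when it is embedded in another data type.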
I'm attempting to output the following row using DISPLAY and am getting the correct result in Micro Focus COBOL in Visual Studio and the Tutorialspoint COBOL compiler, but something strange when running it on a z/OS Mainframe using IBM's Enterprise COBOL:
01 W05-OUTPUT-ROW.
    05 W05-OFFICE-NAME PIC X(13).
    05 W05-BENEFIT-ROW OCCURS 5 TIMES.
        10 PIC X(2) VALUE SPACES.
        10 W05-B-TOTAL PIC ZZ,ZZ9.99 VALUE ZEROS.
    05 PIC X(2) VALUE SPACES.
    05 W05-OFFICE-TOTAL PIC ZZ,ZZ9.99 VALUE ZEROS.
In Enterprise COBOL it appears that the spaces are being ignored and that an extra zero-filled column is being added, even though the PERFORM VARYING and DISPLAY code is exactly the same in both versions:
PERFORM VARYING W02-O-IDX FROM 1 BY 1
        UNTIL W02-O-IDX > W12-OFFICE-COUNT
    MOVE W02-OFFICE-NAME(W02-O-IDX) TO W05-OFFICE-NAME
    PERFORM 310-CALC-TOTALS VARYING W02-B-IDX FROM 1 BY 1
        UNTIL W02-B-IDX > W13-BENEFIT-COUNT
    MOVE W02-O-TOTAL(W02-O-IDX) TO W05-OFFICE-TOTAL
    DISPLAY W05-OUTPUT-ROW
END-PERFORM
W13-BENEFIT-COUNT is 5 and never changes in the program, so the 6th column is a mystery to me.
Correct output:
Strange output:
Edit: as requested, here is W02-OFFICE-TABLE:
01 W02-OFFICE-TABLE.
    05 W02-OFFICE-ROW OCCURS 11 TIMES
            ASCENDING KEY IS W02-OFFICE-NAME
            INDEXED BY W02-O-IDX.
        10 W02-OFFICE-CODE PIC X(6).
        10 W02-OFFICE-NAME PIC X(13).
        10 W02-BENEFIT-ROW OCCURS 5 TIMES
                INDEXED BY W02-B-IDX.
            15 W02-B-CODE PIC 9(1).
            15 W02-B-TOTAL PIC 9(5)V99 VALUE ZERO.
        10 W02-O-TOTAL PIC 9(5)V99 VALUE ZERO.
and W12-OFFICE-COUNT is always 11, never changes:
01 W12-OFFICE-COUNT PIC 99 VALUE 11.
The question is not so much "why does Enterprise COBOL do that?", because it is documented, as "why do those other two compilers generate programs that do what I want?", which is probably also documented.
Here's a quote from the draft of what became the 2014 COBOL Standard (the actual Standard costs money):
C.3.4.1 Subscripting using index-names
In order to facilitate such operations as table searching and
manipulating specific items, a technique called indexing is available.
To use this technique, the programmer assigns one or more index-names
to an item whose data description entry contains an OCCURS clause. An
index associated with an index-name acts as a subscript, and its value
corresponds to an occurrence number for the item to which the
index-name is associated.
The INDEXED BY phrase, by which the index-name is identified and
associated with its table, is an optional part of the OCCURS clause.
There is no separate entry to describe the index associated with
index-name since its definition is completely hardware oriented. At
runtime the contents of the index correspond to an occurrence number
for that specific dimension of the table with which the index is
associated; however, the manner of correspondence is determined by the
implementor. The initial value of an index at runtime is undefined,
and the index shall be initialized before use. The initial value of an
index is assigned with the PERFORM statement with the VARYING phrase,
the SEARCH statement with the ALL phrase, or the SET statement.
[...]
An index-name may be used to reference only the table to which it is
associated via the INDEXED BY phrase.
From the second paragraph, it is clear that how an index is implemented is down to the implementor of the compiler. Which means that what an index actually contains, and how it is manipulated internally, can vary from compiler to compiler, as long as the results are the same.
The last paragraph quoted indicates that, by the Standard, a specific index can only be used for the table which defines that specific index.
You have some code equivalent to this in 310-CALC-TOTALS: take a source data-item using the index from its table, and use that index from the "wrong" table to store a value derived from that in a different table.
This breaks the "An index-name may be used to reference only the table to which it is associated via the INDEXED BY phrase."
So you changed your code in 310-CALC-TOTALS to: take a source data-item using the index from its table, and use a data-name or index defined on the destination table to store a value derived from that in a different table.
So your code now works, and will give you the same result with each compiler.
Why did the Enterprise COBOL code compile, if the Standard (and this was the same for prior Standards) forbids that use?
IBM has a Language Extension. In fact two Extensions, which are applicable to your case (quoted from the Enterprise COBOL Language Reference in Appendix A):
Indexing and subscripting ... Referencing a table with an index-name
defined for a different table
and
OCCURS ... Reference to a table through indexing when no INDEXED BY
phrase is specified
Thus you get no compile error, as using an index from a different table and using an index when no index is defined on the table are both OK.
So, what does it do, when you use another index? Again from the Language Reference, this time on Subscripting using index-names (indexing)
An index-name can be used to reference any table. However, the element
length of the table being referenced and of the table that the
index-name is associated with should match. Otherwise, the reference
will not be to the same table element in each table, and you might get
runtime errors.
Which is exactly what happened to you. The difference in lengths of the items in the OCCURS is down to the "insertion editing" symbols in your PICture for the table you DISPLAY from. If the items in the two tables were the same length, you'd not have noticed a problem.
You gave a VALUE clause for your table items (unnecessary, as you would always put something in them before they are output), and this is what left your "sixth" column: the five previous columns were written as shorter items, so the tail of the table kept its initial value. Note the confusion caused when the editing is done to one length and the storing is done with a different implicit length: you even overwrite the second decimal place.
IBM's implementation of INDEXED BY means that the length of the item(s) being indexed is intrinsic. Hence the unexpected results when the fields referenced are actually different lengths.
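The arithmetic behind this can be sketched in Python. This is only a model, under the assumption that the index holds a byte displacement of (occurrence - 1) * element length taken from the table that defines it; the element lengths are derived from the PICTUREs shown in the question, and all constant and function names are made up for illustration.

```python
# Element lengths taken from the data definitions in the question.
SRC_ELEM_LEN = 1 + 7     # W02-BENEFIT-ROW: PIC 9(1) + PIC 9(5)V99
DST_ELEM_LEN = 2 + 9     # W05-BENEFIT-ROW: PIC X(2) + PIC ZZ,ZZ9.99
FIELD_LEN = 9            # length of one W05-B-TOTAL
DST_FIRST_TOTAL = 13 + 2 # zero-based offset of W05-B-TOTAL(1) in the row


def store_offsets(elem_len, count=5):
    """Offsets at which W05-B-TOTAL(idx) is stored for occurrences 1..count,
    if the index displacement is computed with the given element length."""
    return [DST_FIRST_TOTAL + (occ - 1) * elem_len for occ in range(1, count + 1)]


intended = store_offsets(DST_ELEM_LEN)  # index defined on the W05 table
actual = store_offsets(SRC_ELEM_LEN)    # W02-B-IDX reused on the W05 table

print("intended:", intended)
print("actual:  ", actual)

# Each store after the first begins before the previous field has ended,
# so it clobbers the previous field's last character.
overlaps = [actual[i + 1] < actual[i] + FIELD_LEN for i in range(4)]
```

In this model the last store ends well before the fifth real column begins, so that column keeps its VALUE, which would show up as the extra column in the output.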
What about the other two compilers? You'd need to check their documentation to be certain of what was happening (something as simple as the index being represented by an entry number (so plain 1, 2, 3, etc.) and an index being allowed to reference another table would be enough). There should be two extensions: one to allow an index to be used on a table which did not define that index, and one to allow an index to be used on a table where no index is defined. The two logically come as a pair, and both only need to be spelled out (the first alone would otherwise do) because they are specifically against the Standard.
Micro Focus do have a Language Extension whereby an index from one table may be used to reference data from another table. It is not explicit that this includes referencing a table with no indexes defined, but this is obviously so.
Tutorialspoint uses OpenCOBOL 1.1. OpenCOBOL is now GnuCOBOL. GnuCOBOL 1.1 is the current release, which is different from and more up-to-date than OpenCOBOL 1.1. GnuCOBOL 2.0 is coming soon. I contribute to the discussion area for GnuCOBOL at SourceForge.Net and have raised the issue there. Simon Sobisch of the GnuCOBOL project has previously approached Ideone and Tutorialspoint about their use of the outdated OpenCOBOL 1.1. Ideone have provided positive feedback; Tutorialspoint, whom Simon has contacted again today, have not responded yet.
As a side-issue, it looks like you are using SEARCH ALL to do a binary-search of your table. For "small" tables, it is likely that the overhead of the mechanics of the generalised binary-search provided by SEARCH ALL outweighs any expected savings in machine resources. If you were to be processing large amounts of data, it is likely that a plain SEARCH would be more efficient than the SEARCH ALL.
How small is "small" depends on your data. Five is likely to be small close to 100% of the time.
Better performance than SEARCH and SEARCH ALL functionality can be achieved by coding, but remember that SEARCH and SEARCH ALL don't make mistakes.
However, especially with SEARCH ALL, mistakes by the programmer are easy. If the data is out of sequence, SEARCH ALL will not operate correctly. Defining more data than is populated gets a table quickly out of sequence as well. If using SEARCH ALL with a variable number of items, consider using OCCURS DEPENDING ON for the table, or "padding" unused trailing entries with a value beyond the maximum key-value that can exist.
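The effect of unused trailing entries can be illustrated with a small Python sketch. Python's bisect is not SEARCH ALL; it merely stands in for any binary search that assumes ascending keys. The keys, the table size and the helper name binary_lookup are made up for illustration.

```python
from bisect import bisect_left


def binary_lookup(keys, wanted):
    """Binary search that, like any binary search, assumes ascending keys.
    Returns the position of `wanted`, or None if it is not found."""
    pos = bisect_left(keys, wanted)
    if pos < len(keys) and keys[pos] == wanted:
        return pos
    return None


populated = ["ALPHA", "BRAVO", "CHARLIE"]

# Unused trailing entries left as spaces sort BEFORE the real keys,
# so the table as a whole is out of sequence.
blank_padded = populated + [" " * 7] * 5

# Padding with a value beyond any real key keeps the table in sequence.
high_padded = populated + ["\uffff" * 7] * 5

print(binary_lookup(blank_padded, "CHARLIE"))  # search goes astray
print(binary_lookup(high_padded, "CHARLIE"))   # found
```

A linear search over the same data would find the key in both tables, which is why the out-of-sequence case is so easy to miss until a binary search is used.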
I'd be very hesitant about mixing VALUE with OCCURS and re-code the WS as
01 W05-OUTPUT-ROW.
    05 W05-OFFICE-NAME PIC X(13).
    05 W05-BENEFITS PIC X(55) VALUE SPACES.
    05 FILLER REDEFINES W05-BENEFITS.
        07 W05-BENEFIT-ROW OCCURS 5 TIMES.
            10 FILLER PIC X(02).
            10 W05-B-TOTAL PIC ZZ,ZZ9.99.
    05 FILLER PIC X(02) VALUE SPACES.
    05 W05-OFFICE-TOTAL PIC ZZ,ZZ9.99 VALUE ZEROS.
Perhaps it has something to do with the missing fieldname?
Ah! evil INDEXED. I'd make both ***-IDX variables simple 99s.
Trying to find documentation on details, I did not find a lot beyond:
There is a (erlang runtime instance-) atom table.
Atom string literal is only stored once.
Atoms take 1 word.
To me, this leaves a lot unclear.
Is the atom word value always the same, independent of the sequence modules are loaded into a runtime instance? If modules A and B both define/reference some atoms, will the value of the atom change from session to session, depending on whether A or B was loaded first?
When matching for an atom inside a module, is there some "atom literal to atom value" resolution taking place? Do modules have some own module-local atom-value-lookup table, which gets filled in at load-time of a module?
In a distributed scenario where 2 erlang runtime instances communicate with each other. Is there some "sync-atom-tables" action going on? Or do atoms get serialized as string literals, instead of as words?
An atom is simply an ID maintained by the VM. The ID is represented as a machine integer of the underlying architecture, e.g. 4 bytes on 32-bit systems and 8 bytes on 64-bit systems. See the usage in the LYSE book.
The same atom in the same running VM is always mapped to the same ID (integer). For example the following tuple:
{apple, pear, cherry, apple}
could be stored as the following tuple in the actual Erlang memory:
{1, 2, 3, 1}
All atoms are stored in one big table which is never garbage-collected, i.e. once an atom is created in a running VM it stays in the table until the VM is shut down.
Answering your questions:
1. No. The ID of the atom will change between VM runs. If you shut down the VM and reload the tuple above, the system might end up with the following IDs:
{50, 51, 52, 50}
depending on what other atoms have been created before it was loaded. Atoms only live as long as the VM.
2. No. There is only one table of atoms per VM. All literal atoms in the module are mapped to their IDs when the module is loaded. If a particular atom doesn't yet exist in that table, it is inserted and stays there until the VM restarts.
3. No. Atom tables are per VM and they are separate. Consider a situation where two VMs are started at the same time but don't know of each other. Atoms created in each VM may have different IDs in the table. If at some point one node gets to know about the other, the same atom may have different IDs on each node. The tables can't easily be synchronized or merged. But atoms aren't simply sent as text representations to the other node either. They are "compressed" into a form of cache and sent all together in the header. See the distribution header in the description of the communication protocol. Basically, the header contains the atoms used in later terms with their IDs and textual representations. Each term then references the atom by the ID specified in the header rather than passing the same text each time.
To get really basic without going into implementation, an atom is a literal "thing" with a name. Its value is always itself and it knows its own name. You generally use it when you want the tag, like the atoms ok and error. Atoms are unique in the sense that there is only one atom foo in the system, and each time I refer to foo, I am referring to this same unique foo irrespective of whether they are in the same module, or whether they come from the same process. There is always only one foo.
A bit of implementation. Atoms are stored in a global atom table, and when you create a new atom, it is inserted into the table if it is not already there. This makes comparing atoms for equality very fast as you just check if the two atoms refer to the same slot in the atom table.
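As a rough illustration of that idea, here is a toy atom table in Python. It is not how the BEAM implements its table; the class and method names are hypothetical, and the sketch only models "insert if absent, then compare by ID" plus the fact that IDs depend on insertion order within one VM.

```python
class AtomTable:
    """Toy model of a per-VM atom table: each name gets one ID, once."""

    def __init__(self):
        self._ids = {}     # name -> ID
        self._names = []   # ID -> name

    def intern(self, name):
        """Return the ID for `name`, inserting it if it is not present."""
        atom_id = self._ids.get(name)
        if atom_id is None:
            atom_id = len(self._names)
            self._ids[name] = atom_id
            self._names.append(name)
        return atom_id

    def name_of(self, atom_id):
        return self._names[atom_id]


fruit = ("apple", "pear", "cherry", "apple")

vm_a = AtomTable()
ids_a = [vm_a.intern(name) for name in fruit]

# A second "VM" that happened to create other atoms first.
vm_b = AtomTable()
vm_b.intern("ok")
vm_b.intern("error")
ids_b = [vm_b.intern(name) for name in fruit]

print(ids_a)  # the two occurrences of apple share one ID
print(ids_b)  # same atoms, different IDs in the other table
```

Equality inside one table is a plain integer comparison, while the IDs from two tables cannot be compared with each other, which is why the names have to travel between nodes at least once.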
While separate instances of the VM, nodes, have separate atom tables, the communication between the nodes in distributed erlang is optimised for this, so very often you don't need to send the actual atom name between nodes.
Even though loops are kind of a logical concept in X12 (not directly physically represented in the text), every transaction set defines a set of loops that it can contain, including identifiers for the loops and an ordering for them. My question is, what is the rule for sorting loops, generically? Is there a concise set of rules that can be expressed in some code that should be able to take a collection of loops (with known identifiers such as 1000A, 2300BB, etc) and properly sort them?
The context of my question is that I'm working on a general-purpose library that applications will use to construct a model of an X12 document/transaction set (and write out the text such a model represents). It has objects to represent Elements, Segments, and Loops. Ordering of Segments in a particular Loop is easy; it is dictated by the Implementation Guides. But I'm trying to get Loop ordering (within a Transaction Set) to work generically; that's what I'm asking about.
It seems that the general rule is that Loops are ordered based on their identifiers, using the numeric portion as the primary sort key and the alpha portion as the secondary sort key. Of course, hierarchical loops contained in others will be placed before any loops that follow the parent in that sort order (e.g. 1000A, 2000A, 2010A, 2010B, 2100, 2300, where 2010A and 2010B are children of 2000A).
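The heuristic just described can be written down as a sort key. The following Python sketch only encodes that guess (numeric portion first, then alpha portion); it does not model parent/child nesting, it assumes identifiers consist of digits followed by optional uppercase letters, and the function name loop_sort_key is hypothetical. Whether the ordering it produces is actually required is exactly what is being asked.

```python
import re

# Assumption: a loop identifier is digits followed by optional uppercase letters.
_LOOP_ID = re.compile(r"(\d+)([A-Z]*)")


def loop_sort_key(loop_id):
    """Sort key: numeric portion first, alpha portion second."""
    match = _LOOP_ID.fullmatch(loop_id)
    if match is None:
        raise ValueError(f"unexpected loop identifier: {loop_id!r}")
    return (int(match.group(1)), match.group(2))


loops = ["2300", "1000A", "2010B", "2000A", "2100", "2010A"]
ordered = sorted(loops, key=loop_sort_key)
print(ordered)
```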
I understand that the spec and Implementation Guides contain all of this info; I'm looking for the all-encompassing rule about loop ordering (not Segment ordering). Is there any concise way to express the rule algorithmically? Is there even a hard-and-fast rule at all?
As I mentioned in my comments, the standard has a loop value. Take a look at my screenshot of the Liaison Dictionary Viewer. The CLM segment has a LOOP value of 100. The segments underneath are children of the CLM segment (extended tag). Any "order" can be defined arbitrarily by the partner, or can be in any (undefined) order provided the data is qualified. But that loop can occur 100 times max and can have repeating segments inside the loop value.
The implementation guide will give you the correct order your partner wants them in. It seems like you're writing your own syntax validation engine though.