My question pertains to a file status 23, which according to Micro Focus means that upon my attempt to READ from a .DAT file:
"Indicates no record found."
or
"Indicates a duplicate key condition. Attempt has been made to store a
record that would create a duplicate key in the indexed or relative
file or a duplicate alternate record key that does not allow
duplicates."
I have ruled out the latter as my issue, because I'm allowing duplicates in this case.
The reason I'm stumped is that I'm using a START to navigate to the record inside my .DAT file, and when I execute a READ just after the START has positioned my file pointer, I get the file status 23.
Here is my code:
900-GET-INST-ID.
OPEN INPUT INST-MST.
MOVE FALL-IN-INST TO INST-NAME-REC.
START INST-MST
KEY EQUAL TO INST-NAME-REC
INVALID KEY
DISPLAY "RECORD NOT FOUND"
NOT INVALID KEY
READ INST-MST
MOVE INST-ID-REC TO WS-INST-ID
END-START.
CLOSE INST-MST.
When I run this code, my START succeeds and control passes into the NOT INVALID KEY block, and then the very next line executes and my READ comes back empty. How can this be if my alternate key (INST-NAME-REC) is actually found inside the .DAT?
I have ensured that my FD picture clauses match exactly in the ISAM Build program and in this program (the reading program).
The second reason you show is excluded not because you allow duplicate keys, but because that error message with that file-status is for a WRITE, and your failure is on a READ.
Here's your problem:
READ INST-MST
Here's how you fix it:
READ INST-MST NEXT
In COBOL 85, the READ statement has two formats. Format 1 is for a sequential read and Format 2 is for a keyed (random) read.
Unfortunately, the minimum READ syntax for both sequential and keyed reads is:
READ file-name
Which means if you use READ file-name the compiler will implicitly treat it as Format 1 or Format 2 depending on your SELECT statement.
READ file-name NEXT RECORD is identical to READ file-name NEXT.
Consult your actual documentation for a full explanation and discovery of possible Language Extensions from the vendor. If you read closely, the behaviour of READ file-name with no further option depends on the type of file. With a keyed file, the default is a keyed READ. Your key field (luckily) does not contain a value that exists as a key, so you get the 23.
Even if it didn't work like that, what would be the point of not using the word NEXT? The compiler always knows what you tell it (which sometimes is not what you think you tell it), but in a situation like this, the human reader can be very unsure. The last thing you want to do when bug-hunting is break off to look at the manual to discover exactly how a statement behaves, and then try to work out whether that behaviour was the one the original coder sought. The bug? A bug? Intended, but sloppy, code? No-one wants to spend that time, and look, even now, it is you.
A couple of comments on your code.
Look up the FILE STATUS clause of the SELECT. Use it. One field per file. Check after each IO. It'll save you grief.
Once using the FILE STATUS, ditch the imperative parts of the IO statements (the something/NOT something) and replace by tests of the file-status field (using 88s).
It looks like you are OPENing and CLOSEing your look-up file all the time. Please don't. OPEN and CLOSE can be very heavy and time-consuming, so do them once per program per file. If you've done that because of a problem, find a correct resolution to that problem, don't use a hack.
Drop the full-stops/periods except where they are needed. This is COBOL 85, which means that for 30 years the number of full-stops/periods required in the PROCEDURE DIVISION has been greatly reduced. Get modern and take advantage of that; it'll save you Gotcha!s as you copy/paste code, leaving in the one full-stop which shouldn't be there and changing the way the program behaves.
Related
I am working on a project to automate COBOL to generate a class diagram. I am developing using a .NET console application. I need help tracking down the procedure name where the perform statement is used in the example below.
Z-POST-COPYRIGHT.
    move 0 to RETURN-CODE
    perform Z-WRITE-FILE
How do I track the procedure name 'Z-POST-COPYRIGHT' under which the procedure 'Z-WRITE-FILE' is called? The only idea I could think of in terms of COBOL is through indentation, as the procedure names are always indented. Ideally, in the database, the code should track the procedure name after the word 'perform' and the procedure under which it is called (in this case Z-POST-COPYRIGHT).
I assume you want to do this "on your own" without external tools (a faster approach can be found at the end).
You first have to "know" your source:
which compiler was it compiled with (get a manual for this compiler)
which options were used
Then you have to preparse the source:
include copybooks (applying the given REPLACING rules, if any)
if the source is in fixed-form reference format: concatenate the contents of the previous line and the current line whenever you find a - in column 7
check for REPLACE and change the result accordingly
remove all comments: in fixed-form reference format these are lines with * or / in column 7 (extensions like "variable" format and "terminal" format exist as well); in free-form reference format there are only inline comments *>; there may also be compiler-specific extensions like |. Depending on the further re-engineering you want to do, it could be a good idea to extract the comments and store them with at least a line number reference. (A minimal sketch of this cleanup step follows the list.)
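A minimal sketch of that cleanup in Python, assuming plain fixed-form reference format with no copybooks and no REPLACE (so only the simplest parts of the list above; all names here are mine):

def preparse_fixed_form(lines):
    """Strip sequence/indicator areas, drop comment lines and join
    continuation lines (fixed-form reference format only)."""
    result = []
    for line in lines:
        indicator = line[6] if len(line) > 6 else " "
        content = line[7:72]                  # Area A + Area B (columns 8-72)
        if indicator in ("*", "/"):           # comment line
            continue
        if indicator == "-" and result:       # continuation line
            # naive join; the real rules for continued literals differ
            result[-1] = result[-1].rstrip() + content.lstrip()
        else:
            result.append(content.rstrip())
    return [l for l in result if l.strip()]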
Then you can finally track the procedure names with the following rule:
go backwards to the last separator period (there are more rules, but "a period followed by at least one line break, another period, a space, a comma or a semicolon" [I've never seen the last two in real code, but they are possible] should be enough)
check if there is only one word between this separator period and the next
if this word is not a reserved COBOL word (this depends on your compiler), it is very likely a procedure name
Start from here and check the output, then fine-tune the rule against the actual false positives or missing entries.
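A rough sketch of that rule in Python, continuing the cleanup sketch above (the reserved-word list is a toy one - extend it from your compiler's manual - and inline forms such as PERFORM ... UNTIL are exactly the false positives you will be fine-tuning away):

import re

# toy reserved-word list - extend from your compiler's manual
RESERVED = {"PERFORM", "MOVE", "IF", "ELSE", "END-IF", "GO", "GOBACK",
            "STOP", "EXIT", "THRU", "THROUGH", "UNTIL", "VARYING"}

def track_performs(source_text):
    """Return (calling-procedure, performed-procedure) pairs.
    Expects preparsed source: comments and continuations removed."""
    current_proc = None
    calls = []
    # a sentence is everything up to a separator period
    for sentence in re.split(r"\.(?:\s+|$)", source_text):
        words = sentence.split()
        if len(words) == 1 and words[0].upper() not in RESERVED:
            current_proc = words[0]           # very likely a procedure name
            continue
        for m in re.finditer(r"\bPERFORM\s+([A-Za-z0-9][A-Za-z0-9-]*)",
                             sentence, re.IGNORECASE):
            if m.group(1).upper() not in RESERVED:
                calls.append((current_proc, m.group(1)))
    return calls

For the example in the question, track_performs("Z-POST-COPYRIGHT.\n move 0 to RETURN-CODE\n perform Z-WRITE-FILE") returns [('Z-POST-COPYRIGHT', 'Z-WRITE-FILE')].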
If you want to do more than only extract the procedure names for PERFORM and GO TO (you should at least check the sources for PERFORM ... THRU), then this can turn into a lot of work...
Faster approach with external tools:
run a COBOL compiler on the complete sources and tell it to do the preparsing only - this way you have the big second point solved already
if you have the option: tell the compiler or an external tool to create a symbol table / cross reference - this will tell you in which line a procedure is and its name (you can simply find the correct procedure by comparing the line)
Just a note: You may want to check GnuCOBOL (formerly OpenCOBOL) for the preparsing and/or generation of symbol tables/cross-reference and/or printcbl for a completely external tool doing preparsing and/or cobxref for a complete cross reference generation.
I have a really long insert query with more than 40 fields (from an 'inherited' Foxpro database) processed using OleDb, that produces the exception 'Data type mismatch.' Is there any way to know which field of the query is producing this exception?
For now I'm using the brute-force method of reducing the number of fields in the insert until I locate the buggy one, but I guess there must be a more straightforward way to find it...
There isn't really any shortcut beyond taking a guess at which 20 might be the problem, chopping out the other 20 and testing, and repeating that reductive process until you hit it.
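If you end up doing that reduction often, the halving can be scripted as a binary search. A sketch in Python, where try_insert is a hypothetical callback (not an OleDb API) that executes the INSERT with only the given subset of fields and reports success; it assumes a single offending field:

def find_bad_field(fields, try_insert):
    """Binary-search the field list for the one causing the
    'Data type mismatch' (assumes exactly one bad field)."""
    candidates = list(fields)
    while len(candidates) > 1:
        half = candidates[:len(candidates) // 2]
        if try_insert(half):
            candidates = candidates[len(candidates) // 2:]  # bug is in the other half
        else:
            candidates = half
    return candidates[0]

# with 40 fields this takes about 6 attempts instead of up to 40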
Or alternatively looking at the table structure(s) in the DBF and making sure the field types match the OleDb types you're using. The details of how .NET types are mapped to Visual FoxPro table field types are here.
If you have access to the Visual FoxPro IDE you could probably do that a lot quicker by knocking up a little program or even just doing it in the Command Window.
You are not telling us the language you use, otherwise we could give a sample to handle it.
Basically what you would do is:
Get the structure,
Parse the insert statement and get values,
Compare data types.
It should be a short code to make this check.
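Short indeed; a sketch in Python, where table_types and values are hypothetical stand-ins for the structure you read from the DBF and the values parsed out of the INSERT:

def find_mismatches(table_types, values):
    """table_types maps column name -> expected Python type,
    values maps column name -> the value being inserted."""
    return [col for col, expected in table_types.items()
            if values.get(col) is not None
            and not isinstance(values[col], expected)]

# a numeric column receiving a string shows up immediately:
print(find_mismatches({"qty": int, "name": str},
                      {"qty": "3", "name": "x"}))   # -> ['qty']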
I have a FindFile routine in my program which will list files, but if the "Containing Text" field is filled in, then it should only list files containing that text.
If the "Containing Text" field is entered, then I search each file found for the text. My current method of doing that is:
var
  FileContents: TStringList;
begin
  FileContents := TStringList.Create;
  try
    FileContents.LoadFromFile(Filepath);
    Found := Pos(TextToFind, FileContents.Text) > 0;
  finally
    FileContents.Free;
  end;
The above code is simple, and it generally works okay. But it has two problems:
It fails for very large files (e.g. 300 MB)
I feel it could be faster. It isn't bad, but why wait 10 minutes searching through 1000 files, if there might be a simple way to speed it up a bit?
I need this to work for Delphi 2009 and to search text files that may or may not be Unicode. It only needs to work for text files.
So how can I speed this search up and also make it work for very large files?
Bonus: I would also want to allow an "ignore case" option. That's a tougher one to make efficient. Any ideas?
Solution:
Well, mghie pointed out my earlier question How Can I Efficiently Read The First Few Lines of Many Files in Delphi, and as I answered, it was different and didn't provide the solution.
But he got me thinking that I had done this before, and I had. I built a block-reading routine for large files that breaks them into 32 MB blocks. I use it to read the input file of my program, which can be huge. The routine works fine and fast. So step one is to do the same for these files I am looking through.
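The routine itself is Delphi, but its shape is easy to sketch in Python (the names are mine, and the in test stands in for whatever buffer search you plug in). The two details that matter are the overlap of len(needle) - 1 bytes, so a match straddling a block boundary isn't missed, and the early return on the first match:

def file_contains(path, needle, block_size=32 * 1024 * 1024):
    """Search a possibly huge file block by block; needle is bytes."""
    overlap = len(needle) - 1
    tail = b""
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                return False
            if needle in tail + block:
                return True                   # stop at the first match
            tail = block[-overlap:] if overlap else b""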
So now the question was how to efficiently search within those blocks. Well I did have a previous question on that topic: Is There An Efficient Whole Word Search Function in Delphi? and RRUZ pointed out the SearchBuf routine to me.
That solves the "bonus" as well, because SearchBuf has options which include Whole Word Search (the answer to that question) and MatchCase/noMatchCase (the answer to the bonus).
So I'm off and running. Thanks once again SO community.
The best approach here is probably to use memory mapped files.
First you need a file handle, use the CreateFile windows API function for that.
Then pass that to CreateFileMapping to get a file mapping handle. Finally use MapViewOfFile to map the file into memory.
To handle large files, MapViewOfFile is able to map only a certain range into memory, so you can e.g. map the first 32MB, then use UnmapViewOfFile to unmap it followed by a MapViewOfFile for the next 32MB and so on. (EDIT: as was pointed out below, make sure that the blocks you map this way overlap by a multiple of 4kb, and at least as much as the length of the text you are searching for, so that you are not overlooking any text which might be split at the block boundary)
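For illustration only, the same windowing with Python's mmap module (a sketch under my own names, not the Delphi code): the map offset must stay a multiple of the allocation granularity, which is exactly the alignment constraint just described.

import mmap, os

def mapped_search(path, needle, window=32 * 1024 * 1024):
    """Map one aligned window at a time; windows overlap by at least
    len(needle) - 1 bytes (assumes needle far smaller than window)."""
    gran = mmap.ALLOCATIONGRANULARITY
    size = os.path.getsize(path)
    # round the overlap up to the granularity so offsets stay aligned
    step = window - ((len(needle) - 1 + gran - 1) // gran) * gran
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            length = min(window, size - offset)
            with mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ,
                           offset=offset) as view:
                if view.find(needle) != -1:
                    return True
            offset += step
    return False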
To do the actual searching once the (part of) the file is mapped into memory, you can make a copy of the source for StrPosLen from SysUtils.pas (it's unfortunately defined in the implementation section only and not exposed in the interface). Leave one copy as is and make another copy, replacing Wide with Ansi every time. Also, if you want to be able to search in binary files which might contain embedded #0's, you can remove the (Str1[I] <> #0) and part.
Either find a way to identify if a file is ANSI or Unicode, or simply call both the Ansi and Unicode version on each mapped part of the file.
Once you are done with each file, call UnmapViewOfFile first, then CloseHandle on the file mapping handle, and finally CloseHandle on the file handle itself.
EDIT:
A big advantage of using memory mapped files instead of using e.g. a TFileStream to read the file into memory in blocks is that the bytes will only end up in memory once.
Normally, on file access, first Windows reads the bytes into the OS file cache. Then copies them from there into the application memory.
If you use memory mapped files, the OS can directly map the physical pages from the OS file cache into the address space of the application without making another copy (reducing the time needed for the copy and halving memory usage).
Bonus Answer: By calling StrLIComp instead of StrLComp you can do a case insensitive search.
If you are looking for text string searches, look at the Boyer-Moore search algorithm. The implementations I have used combine memory mapped files with a really fast search engine, and there are some Delphi units around that contain implementations of this algorithm.
To give you an idea of the speed - I currently search through 10-20 MB files and it takes on the order of milliseconds.
Oh, I just read that it might be Unicode - not sure if the implementations support that - but definitely look down this path.
This is a problem connected with your previous question How Can I Efficiently Read The First Few Lines of Many Files in Delphi, and the same answers apply. If you don't read the files completely but in blocks then large files won't pose a problem. There's also a big speed-up to be had for files containing the text, in that you should cancel the search upon the first match. Currently you read the whole files even when the text to be found is in the first few lines.
May I suggest a component? If so, I would recommend ATStreamSearch.
It handles ANSI and UNICODE (and even EBCDIC and Korean and more).
Or the class TUTBMSearch from JclUnicode (Jedi-JCL). It was mainly written by Mike Lischke (VirtualTreeview). It uses a tuned Boyer-Moore algorithm that ensures speed. The bad point in your case is that it works fully in Unicode (WideStrings), so the conversion from String to WideString risks being a penalty.
It depends on what kind of data you are going to search. For really efficient results you will need to let your program parse the interesting directories, including all the files in them, and keep the data in a database which you can then query for a specific word against a specific list of files generated from the search path. A database query can return results in milliseconds.
The issue is that you will have to let it run and parse all the files after installation, which may take more than an hour, depending on the amount of data you wish to index.
This database should be updated each time your program starts; that can be done by comparing the MD5 value of each file to see whether it was changed, so you don't have to parse all your files each time.
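That bookkeeping is small; one possible shape of it in Python (the cache file name and JSON format are my own choices, not a requirement):

import hashlib, json, os

def file_md5(path, chunk=1 << 20):
    """MD5 of a file, read in 1 MB chunks so large files are fine."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def changed_files(paths, cache_file="md5cache.json"):
    """Return only files whose MD5 differs from the stored one,
    so the indexer re-parses just those."""
    cache = {}
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            cache = json.load(f)
    changed = []
    for p in paths:
        digest = file_md5(p)
        if cache.get(p) != digest:            # new or modified file
            changed.append(p)
            cache[p] = digest
    with open(cache_file, "w") as f:
        json.dump(cache, f)
    return changed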
This way of working is interesting if you have all your data in a constant place and you analyse the same files repeatedly rather than completely new files each time; some code analysers work like this, and they are really efficient. You invest some time in parsing and saving the interesting data, and afterwards you can jump to the exact place where a search word appears and produce a list of all the places it appears in, in a very short time.
If the files are to be searched multiple times, it could be a good idea to use a word index.
This is called "Full Text Search".
It will be slower the first time (text must be parsed and indexes must be created), but any future search will be immediate: in short, it will use only the indexes, and not read all text again.
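The core of such a word index is tiny; a toy version in Python for flavour (real FTS engines add stemming, ranking and compressed index storage on top of this):

import re
from collections import defaultdict

def build_index(files):
    """Map every word to the set of files containing it."""
    index = defaultdict(set)
    for path in files:
        with open(path, errors="ignore") as f:
            for word in re.findall(r"\w+", f.read().lower()):
                index[word].add(path)
    return index

# build once (the slow part), then every lookup is a dictionary access:
# index = build_index(all_files)
# index.get("invoice", set())   -> the files containing that word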
You have the exact parser you need in The Delphi Magazine Issue 78, February 2002:
"Algorithms Alfresco: Ask A Thousand Times
Julian Bucknall discusses word indexing and document searches: if you want to know how Google works its magic this is the page to turn to."
There are several FTS implementations for Delphi:
Rubicon
Mutis
ColiGet
Google is your friend..
I'd like to add that most DB have an embedded FTS engine. SQLite3 even has a very small but efficient implementation, with page ranking and such.
We provide direct access from Delphi, with ORM classes, to this Full Text Search engine, named FTS3/FTS4.
I maintain a 3rd party Informix driver that's written with ESQL-style (Informix API) calls. I'm working on a bug where, for TEXT fields, INSERTs work fine and UPDATEs fail. Stepping through the code, what I've found is that we check our sqlda structure to tell us whether and how to bind. After the call to sqli_describe_statement, the sqlda.sqld variable contains 2, the correct number of bound parameters for this insert call, and the parameters appear to be set up correctly; in the update case, however, the number returned is 0, with no parameter information (it should be 1, for the one param in: "UPDATE TESTTAB SET COLNAME = ? WHERE OTHERCOLNAME = 1 ").
Using the sqlda information, we correctly set up the required locator structure for the INSERT, but we can't for the update because the information isn't there. If I fake it out in the debugger and run the set-up-the-locator code for the update, it updates fine.
The statement certainly appears correct, and the same variable is being used for the INSERT as for the UPDATE bind. Moreover, sqli_prep has no problem with the update. For the describe, sqsla.code returns different non-negative numbers, 4 and 6, representing the different types of statements being described, as documented (i.e., not an error code), so there's no obvious problem there.
Is there something else I should be checking in the code ahead of this that might cause this weird behavior (other than special-case handling for the different queries -- nothing there)?
Am I missing something fundamental here about how one does UPDATEs on TEXT fields, such as having to create a locator object, find the row, click your heels together three times and say "There's no place like IBM"?
So far Google Fu has turned up little in the documentation, but if you know of docs or samples that point the way, that's cool too.
This is one of the murky areas of Informix behaviour. The behaviour of DESCRIBE is supposed to describe output parameters (it is a shorthand for DESCRIBE OUTPUT stmt INTO ...); to describe the input parameters, you would use DESCRIBE INPUT stmt INTO ... instead.
However, for various reasons extending back to the dawn of time (well, 1985, anyway), the INSERT statement got a special case exemption and plain DESCRIBE described its input parameters - unlike UPDATE or DELETE (or, these days, MERGE).
So, your code was probably written before DESCRIBE INPUT and DESCRIBE OUTPUT became feasible (that was circa 2000±3 years). In principle, using the directed DESCRIBE statements should fix the issue. There may be an ONCONFIG parameter to be set to get this behaviour.
I remember being grateful that the feature arrived, but also I remember thinking "Damn, I'm not going to be able to use that for a while - until the old versions without it are all retired". I think that has basically happened now - IDS 7.31 in particular is now obsolete, and so indeed are the IDS 9.x versions, so all available versions of IDS support the feature. OnLine 5.20 - a minority interest - still doesn't and won't ever support it. So, I need to review how to update my programs such as SQLCMD to exploit this. The code there includes what I call 'vignettes'; they're complete little programs that illustrate how to work with BYTE and TEXT blobs. You might find UPDBLOB or APPBLOB, for example, of some use.
I have an existing MF COBOL 4.0 program with years of data in a ISAM file but I need to add a new field to the existing file. The record currently has 1208 chars and I need to add another 10 to it.
If I simply put the extra PIC X(10) field in my copybook, it gives me an error.
You need to modify the underlying data file to match your file definition in COBOL. One way to do that is to define an output record exactly like your current records, but with an extra PIC X(10) on the end. You would then read your data record by record, and write each one to a new location with 10 extra spaces on the end. That way your data is 10 characters longer, and you can go back and add that extra PIC X(10) to your main program. It should work after that.
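In COBOL that is a small convert program (read the old record, move it to the new layout, write). Purely to illustrate the extend-and-rewrite step, here is a sketch in Python for a flat sequential copy of the data, assuming fixed 1208-byte records with no line terminators; a real ISAM file has internal index structure, so you would unload it to a flat file first rather than touching it directly:

OLD_LEN, PAD = 1208, 10

def extend_records(old_path, new_path):
    """Copy fixed-length records, appending 10 blanks to each."""
    with open(old_path, "rb") as src, open(new_path, "wb") as dst:
        while True:
            record = src.read(OLD_LEN)
            if not record:
                break
            dst.write(record + b" " * PAD)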
By changing the copybook, you're only changing the representation of the data used in your program. Shouldn't you be restructuring the data source (i.e. the ISAM file) as well?
Late answer, but I thought you might be interested.
I've been working on our Cobol system for over 20 years and we've come across this issue many times.
Changes to the structure of our index files are what we consider a "Major Release". These require specific Conversion programs which:
Rename the physical file, moving it aside to an 'old' file
Open the 'old' version of the file (using a version of the copybook before the change)
Open (create) the 'new' version of the file
Move the contents of each of the 'old' records to a 'new' record and WRITE it
Of course these conversions require the system to be 'down', hence the reason why they are considered major releases.
If you have files which are likely to have fields added to them in the future, you can add extra FILLER to the end of the record in the indexed file to let you cope with new fields being added. We tend to add a FILLER of 50 or 100 bytes. Of course this doesn't help you if you change one of the existing fields, or even the structure of any of the keys.
For file errors, you will want to keep a list handy. I recommend starting with a list you find online, and any time you get an error you cannot figure out in 5 seconds, add a detailed explanation of the resolution so you will have it in your notes the next time it happens. Here are a couple of decent lists to start with:
http://www.simotime.com/vsmfsk01.htm
http://www.briarcliff.edu/departments/cis/Cobol/Error%20Codes.html
In my list, file status 39 is:
OPEN-CONFLICT-FILE-ATR - The OPEN statement was unsuccessful because a conflict was detected between the fixed file attributes and the attributes specified for that file in the program. These attributes include the organization of the file (sequential, relative, or indexed), the prime record key, the alternate record keys, the code set, the maximum record size, and the record type (fixed or variable).
And this is from the personalized note: Check the file that you have assigned to your ddname in your JCL. Especially the length allocation. In your case, you know that the length does not match, since you just changed the program.
There are utilities to reformat datasets, particularly SYNCSORT. Or of course you can write your own.