I'm starting to learn about COBOL. I have some experience writing programs that deal with SQL databases and I guess I'm confused how COBOL stores and retrieves data that is stored in a mainframe for example. I know that it's not like relational databases but every example program I've seen takes data straight from the command line and I know that's not how real world COBOL programs process the data. Can someone explain or show me a good resource that can explain it?
COBOL is just another third generation computer language. It is just a little older than most which doesn't mean it is somehow incomplete (actually it comes with quite a bit of baggage - but that is another story).
As with any other third generation language, COBOL manipulates data files in pretty much the same way that you would in a C program. Nothing odd, mysterious or magical about it. Files are opened, read, written and closed using the file I/O features of the language.
Various mechanisms are used to form a link between an actual file and the program. The details here are often specific to the operating system you are working under. Generally, COBOL implementations try to isolate themselves from the operating environment through a logical file name as opposed to an actual name. This added indirection is important when you are writing programs that will be ported to different platforms (e.g. write and test within an IDE on a Windows platform, and then run on a mainframe).
The following examples relate to an IBM Mainframe environment.
Within the IBM mainframe world, you will find that programs are run as either batch or on-line (e.g. CICS). I will not describe how to set up for file I/O under CICS (that's a long story). Programs that are used to manipulate files are usually batch. Here is a rough illustration of how a batch program works:
Batch programs are run via JCL. JCL is used to identify the program to run ('EXEC' statement) and to identify which files your program will reference using 'DD' statements. The function of a DD Statement is to form a logical connection between an actual file and a name your COBOL program will reference when it wants to refer to the file (this is the isolation mechanism mentioned earlier). For example,
//JCLDDNAM DD DSN='HLQ.MY.FILE'...
would associate the 'DD' name 'JCLDDNAM' with the file named 'HLQ.MY.FILE'. This part is platform dependent, so the details are specific to the operating environment.
In the 'FILE-CONTROL' section of your COBOL program, you connect the 'DD NAME' defined in your JCL with the name you will use on each I/O statement to reference that file. This connection is defined using the 'SELECT' statement.
For example,
SELECT MYFILE
ASSIGN JCLDDNAM
remainder of select
makes a connection between whatever file you associated with 'JCLDDNAM' in your 'JCL' and 'MYFILE', the name you will later reference in COBOL I/O statements. The SELECT statement itself is part of the ISO COBOL standard. However, many COBOL implementations define some non-standard extensions to accommodate various quirks of their file subsystems.
Open, read, write, and close files within the 'PROCEDURE DIVISION' of your program using the name 'MYFILE', as in:
OPEN INPUT MYFILE
READ MYFILE
CLOSE MYFILE
The above is highly simplified, and there are a multitude of ways to do this within COBOL. Understanding the complete picture will take some real effort, time and practice. The I/O statements illustrated above are part of the COBOL standard, but every vendor will have their own extensions.
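For readers more at home outside the mainframe, the DD name / SELECT indirection described above can be sketched in Python. This is only an analogy under assumptions of mine: the `DD_<name>` lookup convention, the `open_logical` helper and the sample records are made up for illustration and are not part of COBOL or JCL.

```python
import os
import tempfile


def open_logical(logical_name, mode="r", env=None):
    """Open the file bound to a logical name, roughly as a DD name is bound in JCL."""
    env = os.environ if env is None else env
    # Hypothetical convention: the entry DD_<NAME> holds the real path.
    path = env.get("DD_" + logical_name)
    if path is None:
        raise FileNotFoundError(f"no file bound to logical name {logical_name!r}")
    return open(path, mode)


def read_all_records(logical_name, env=None):
    # Open, read until end of file, close: the same life cycle as the COBOL verbs.
    with open_logical(logical_name, "r", env) as f:
        return [line.rstrip("\n") for line in f]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hlq.my.file")
        with open(path, "w") as f:
            f.write("RECORD ONE\nRECORD TWO\n")
        # The binding lives outside the program logic, like a DD statement.
        bindings = {"DD_MYFILE": path}
        return read_all_records("MYFILE", bindings)
```

The program only ever mentions `MYFILE`; pointing it at a different file means changing the binding, not the code.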
IBM COBOL supports a wide range of file organizations and access methods. You can review the IBM Enterprise COBOL Language Reference manual here to get the syntax and rules for file manipulation. However, the User Guide provides a lot of good examples for reading/writing files (you will have to dig a bit, but it is all laid out for you).
The setup to reference an SQL database via a COBOL program is somewhat different but involves setting up a connection between your program and the database subsystem. Within the IBM world this is done through JCL; other environments will use different mechanisms.
IBM COBOL uses a pre-processor or co-processor to integrate database access and data exchange. For example, the following code would retrieve some data from a DB2 database:
MOVE 1234 TO PERSON-ID
EXEC SQL
SELECT FIRST_NAME, LAST_NAME
INTO :FIRST-NAME, :LAST-NAME
FROM PERSON
WHERE PERSON_ID = :PERSON-ID
END-EXEC
DISPLAY PERSON-ID FIRST-NAME LAST-NAME
The stuff between EXEC SQL and END-EXEC is a pretty simple SQL select statement. The names preceded by colons are COBOL host variables used to pass data to DB2 or receive it back. If you have ever coded database access routines before, this should look very familiar to you. This link provides a simple introduction to incorporating SQL statements into an IBM Enterprise COBOL program.
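If you come from another language, the host-variable idea maps closely onto named query parameters. Here is a rough Python analogue using the standard sqlite3 module rather than DB2; the table layout, the `find_person` helper and the sample row are assumptions made up for this sketch.

```python
import sqlite3


def find_person(conn, person_id):
    # :person_id plays the role of the COBOL host variable :PERSON-ID.
    row = conn.execute(
        "SELECT first_name, last_name FROM person WHERE person_id = :person_id",
        {"person_id": person_id},
    ).fetchone()
    return row  # None when no row matches


def demo():
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE person ("
            "person_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
        )
        # Made-up sample row, only there so the query has something to find.
        conn.execute("INSERT INTO person VALUES (1234, 'Grace', 'Hopper')")
        return find_person(conn, 1234), find_person(conn, 9999)
    finally:
        conn.close()
```

The result tuple stands in for the INTO clause: in COBOL the values land in FIRST-NAME and LAST-NAME, here they come back as a row.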
By the way, IBM Enterprise COBOL is capable of working with XML documents too. Sorry for the heavy IBM slant, but that is the environment I am most familiar with.
Hope this gets you started in the right direction.
Who said you cannot use SQL to retrieve data from a COBOL application, maybe without spending money?
A company I used to work for did just that, with SQLite. This little gem of a public domain library compiles SQL statements to bytecode, then executes them.
By replacing the "backend" level of SQLite with a custom interface to the C library that deals with Cobol files, it was possible to query the Cobol data from other languages, Python in that case. It worked -- within the limits of SQLite of course, but it was stable, it seemed relational enough and it didn't even require a DB server :-)
Traditional COBOL batch environments use a 'data section' of the COBOL program to directly declare database connections, which are in turn set up in JCL. Since COBOL predates SQL, those would have tended to be various other types of databases, but it's likely that IBM made SQL work with DB2. I imagine you'll get another answer from someone closer to this stuff. If you look at the SQL preprocessors available for use with other languages you'll get the idea: a cursor becomes a native datatype and delivers query results to native variables.
Mainframe Cobol uses embedded SQL (kinda like SQLj), e.g.:
Procedure Division.
Exec SQL
Select col1, col2
into :ws-col1, :ws-col2
from myTable
where col0 = :ws-col0
End-Exec
In this case, the host variables ws-col0, ws-col1 and ws-col2 are defined in the working-storage section. The database interface manages getting that data in the right place.
Very easy compared to the distributed stuff actually.
All the IBM mainframe shops I've worked in have used COBOL that talked to a relational database. Generally that has been IBM's DB2. Please note that DB2 is a relational database that runs on mainframes. It can also be run on Windows or Linux.
Twenty years ago a predominant way to enter data into the DB2 mainframe database was to use CICS. CICS is "presentation level" software that communicates with character-based data entry screens. Consider CICS the functional equivalent of PHP or ASP.NET.
Today there are many more options to get data into DB2. CICS is still an option but your "presentation layer" could be PHP, ASP.NET, Win Forms, Java JSF, Powerbuilder. The key thing is that your development platform would need to be able to work with a DB2 database driver. The platform could be Windows, Linux, and possibly others.
My point is that data can get into the mainframe DB2 database in many ways from many platforms. The COBOL language might be involved in data entry, reporting, altering data, etc. But it might be only one part of a multi-tier application that could be part Windows, part web, and part mainframe. I could give specific examples if you have more information about the application you'll be working with at your internship.
Hi all,
this is my first question on stackoverflow, so let me know if something is wrong.
I need to know whether it is possible to read a dataset, defined with a DD name declared in a COBOL program, from a Java Stored Procedure under DB2.
The program flow is:
- a JCL job invokes a STORED PROCEDURE
- the STORED PROCEDURE invokes the jar
- the jar tries to open the data set through the DD name
I tried to use the ZFile class from the jZOS library, but the Java code can see neither the DD name nor the corresponding file on z/OS.
My question is: is there no way to accomplish this because the JVM under DB2 runs in an isolated environment, or is there a specific class or procedure for reaching the data set?
Thanks in advance!
There is a significant difference between "is it possible," "is it allowed," and "is it a good idea."
Because you know the file name I believe it may be possible to achieve your goal via dynamic allocation of the file associated with the DD. The javadoc for ZFile indicates it "includes an interface to BPXWDYN (text based interface to MVS dynamic allocation)."
Whether this is allowed in your IT shop is a question for your architecture staff and DB2 Systems Programmers (the people responsible for installing, configuring, and performance of DB2). Just because something is possible does not mean it is allowed - there may be performance or security or audit considerations.
Even if it turns out this architecture is possible and allowed, it may be that there are better solutions. Talk to your architecture staff and z/OS and DB2 Systems Programmers about your requirements and why you wish to pursue this particular solution. Ask them for suggestions on improvements which still implement your requirements.
For example, if you intend to execute this stored procedure one million times in a batch job, and to dynamically allocate the file, open it, read its contents, close it, and then deallocate it for each execution - that is unlikely to perform well and is likely to have an adverse impact on the other applications which make use of DB2 stored procedures. Perhaps storing the contents of the file in a DB2 table is a better solution - I cannot say because I do not know your business requirements or the context of the rest of your application, I merely bring it up as an example.
It's been a while since I've had to write what amounts to a custom-format EDI processor. The last time I wrote one, I was an AS/400 programmer (not iSeries, to give you a timeframe). It was pretty easy: I built a structure, inspected the record type column, and processed each record based on the fixed positions of the data and the record type.
Fast forward to 2012 and I have almost exactly the same requirements except I no longer have an AS/400 to make it easy.
For brevity, the first 2 columns contain a record type and the structure is based on that type. Any suggestions on how best to handle this in C# on a web server?
Some options I have considered are filehelpers and SSIS. I have full control over the environment so I can do pretty much anything that makes sense.
You can try using the Multi Record engine option of FileHelpers
http://www.filehelpers.com/example_multirecords.html
You must define one record class for each kind of line you have, and then provide a delegate that lets FileHelpers choose the right class for each line.
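The underlying idea (look at the record type, then pick the matching layout) is independent of FileHelpers, so here is a minimal language-neutral sketch in Python. The two record types "01" and "02", their column positions and every name in the code are hypothetical; substitute your real layouts.

```python
from typing import Callable, Dict, List, NamedTuple


class Header(NamedTuple):
    batch_id: str
    date: str


class Detail(NamedTuple):
    item: str
    quantity: int


# Hypothetical layouts: columns 1-2 hold the record type, the rest is fixed-width.
def parse_header(line: str) -> Header:
    return Header(batch_id=line[2:8].strip(), date=line[8:16])


def parse_detail(line: str) -> Detail:
    return Detail(item=line[2:12].strip(), quantity=int(line[12:17]))


# The selector: maps the two-character record type to the parser for that layout.
PARSERS: Dict[str, Callable[[str], NamedTuple]] = {
    "01": parse_header,
    "02": parse_detail,
}


def parse_records(lines: List[str]) -> List[NamedTuple]:
    records = []
    for number, line in enumerate(lines, start=1):
        record_type = line[:2]
        parser = PARSERS.get(record_type)
        if parser is None:
            raise ValueError(f"line {number}: unknown record type {record_type!r}")
        records.append(parser(line))
    return records
```

Failing loudly on an unknown record type is a design choice; a real feed might prefer to log and skip such lines.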
There is also a Master Detail engine:
http://www.filehelpers.com/example_masterdetail.html
Last version of the library: http://teamcity.codebetter.com/viewLog.html?buildId=51642&tab=artifacts&buildTypeId=bt66
What is the difference between Stored Procedures and Prepared Statements... And which one is better and why...!! I was trying to google it but haven't got any better article...
A stored procedure is a sequence of instructions written in the procedural language your DBMS provides, for example PL/SQL on Oracle. It lets you store frequently used sequences of queries inside the database and share the processing load with the application layer.
Prepared statements are queries written with placeholders instead of actual values. You write the query and it is compiled just once by the DBMS, and then you just pass values to place into the placeholders. The advantage of using prepared statements is that you enhance the performance considerably, and protect your applications from SQL Injection.
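To make the placeholder idea concrete, here is a small sketch using Python's standard sqlite3 module. The `users` table, the sample rows and the hostile input string are made up for illustration; other drivers use different placeholder markers.

```python
import sqlite3


def demo():
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (name TEXT, role TEXT)")

        # One statement text with placeholders, executed once per parameter tuple.
        insert_sql = "INSERT INTO users (name, role) VALUES (?, ?)"
        conn.executemany(insert_sql, [("alice", "admin"), ("bob", "clerk")])

        hostile = "nobody' OR '1'='1"

        # Placeholder version: the input is bound as a value, never parsed as SQL.
        safe = conn.execute(
            "SELECT name FROM users WHERE name = ?", (hostile,)
        ).fetchall()

        # String concatenation, shown only for contrast: the input rewrites the query.
        unsafe = conn.execute(
            "SELECT name FROM users WHERE name = '" + hostile + "'"
        ).fetchall()

        return safe, sorted(name for (name,) in unsafe)
    finally:
        conn.close()
```

With the placeholder no row matches, because nobody has that literal name; the concatenated query matches every row.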
The difference is that you can't store prepared statements. You must "prepare" them again in every session that needs them. Stored procedures, on the other hand, can be stored and associated with a schema, but you need to know the DBMS's procedural language (PL/SQL on Oracle) to write them.
You must check if your DBMS supports them.
Both are very useful tools that you might want to combine.
I hope this short explanation is useful to you!
The other answers have hinted at this, but I'd like to list the Pros and Cons explicitly:
Stored Procedures
PROS:
Each query is processed more rapidly than a straight query, because the server pre-compiles them.
Each query need only be written once. It can be executed as many times as needed, even across different sessions and different connections.
Allows queries to include programming constructs (such as loops, conditional statements, and error handling) that are either impossible or difficult to write in SQL alone.
CONS
Require knowledge of whatever programming language the database server uses.
Can sometimes require special permissions to write them or call them.
Prepared Statements
PROS
Like stored routines, are quick because queries are pre-compiled.
CONS
Need to be re-compiled with each connection or session.
To be worth the overhead, each prepared statement must be executed more than once (such as in a loop). If a query is executed only once, more overhead goes into preparation of the prepared statement than you get back since the server needs to compile the SQL anyway, but also make the prepared statement.
For my money, I'd go with Stored Procedures every time since they only need to be written and compiled once. After that, every call to the procedure leads to saved time, whether you're on a new connection or not, and whether you're calling the procedure in a loop or not. The only downside is needing to spend some time learning the programming language. If I didn't have permissions to write stored procedures, I would use a prepared statement, but only if I had to repeatedly make the same query multiple times in the same session.
This is the conclusion I've come to after several months of off-and-on research into the differences between these two constructs. If anyone is able to correct bad generalizations I'm making, it will be worth any loss to reputation.
A stored procedure is stored in the DB. Depending on which DB (Oracle, MS SQL Server etc.), it is compiled and potentially optimized when you create it on the server...
A prepared statement is a statement which is parsed by the server and an execution plan is created by the server ready for execution whenever you run the statement... usually it makes sense when a statement is run more than once... depending on the DB server (Oracle etc.) and even sometimes configuration options these "preparation" are either session-specific or "global"...
There is no "better" when you compare these two since they have their specific use cases...
In my application I want to use files for storing data. I don't want to use a database or a clear text file; the goal is to save double and integer values along with a string just to identify the name of the record. I simply need to save data on disk for generating reports. The file can grow up to a gigabyte. What format do you suggest? Binary? If so, what VCL component/library do you know that is good to use? My goal is to create an application which creates and updates the files while another tool will "eat" those files, producing nice PDF reports for the user on demand. What do you think? Any idea or suggestion?
Thanks in advance.
If you don't want to reinvent the wheel, you may find all needed Open Source tools for your task from our side:
Synopse Big Table to store huge amounts of data - see in particular the TSynBigTableRecord class to store an unlimited number of records with fields, including indexes if needed - it will definitely be faster and use less disk space than any other regular SQL DB
Synopse SQLite3 Framework if you would rather use a standard SQLite engine for the storage - it comes with a full Client/Server ORM
Reporting from code, including pdf file generation
With full Source code, working from Delphi 6 up to XE.
I've just updated the documentation of the framework. More than 600 pages, with details of every class method, and new enhanced general introduction. See the SAD document.
Update: If you plan to use SQLite, you should first work out how the data will be stored, which indexes are to be created, and how a SQL query may speed up your requests. It's a bad idea to read the whole file for every request: you are better off structuring your data so that a single SQL query can return the expected results. Sometimes, storing additional values (like intermediate sums or means) alongside the data is a good idea. Also consider using the RTree virtual table of SQLite3, which is dedicated to speeding up access to double min/max multi-dimensional data: it may speed up your requests a lot.
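As a rough illustration of "one indexed query instead of reading everything", here is a sketch in Python with the standard sqlite3 module (the same SQL applies from a Delphi wrapper). The `measurement` table, its columns and the index name are assumptions invented for the example.

```python
import sqlite3


def build(conn, rows):
    conn.execute(
        "CREATE TABLE measurement (name TEXT, taken_at INTEGER, value REAL)"
    )
    # Index chosen to match the WHERE clause of the report query below.
    conn.execute(
        "CREATE INDEX idx_measurement_name ON measurement (name, taken_at)"
    )
    conn.executemany("INSERT INTO measurement VALUES (?, ?, ?)", rows)


def summary(conn, name, start, end):
    # A single query returns the aggregates; no client-side scan of all rows.
    return conn.execute(
        "SELECT COUNT(*), SUM(value), AVG(value) FROM measurement "
        "WHERE name = ? AND taken_at BETWEEN ? AND ?",
        (name, start, end),
    ).fetchone()


def demo():
    conn = sqlite3.connect(":memory:")
    try:
        build(conn, [
            ("temp", 1, 10.0),
            ("temp", 2, 20.0),
            ("temp", 5, 99.0),
            ("load", 1, 3.0),
        ])
        return summary(conn, "temp", 1, 2)
    finally:
        conn.close()
```

Whether the index actually pays off depends on your data volume and query mix, so measure before committing to a schema.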
You don't want to use a full SQL database, and you think that a plain text file is too simple.
Points in between those include:
Something that isn't a full SQL database, but more of a key-value store, would technically not be a flat file, but it does provide a single "key+value" list, that is quickly searchable on a single primary key. Such as BSDDB. It has the letter D and B in the name. Does that make it a database, in your view? Because it's not a relational database, and doesn't do SQL. It's just a binary key-value (hashtable) blob storage mechanism, using a well-understood binary file format. Personally, I wouldn't start a new project and use anything in this category.
Recommended: Something that uses SQL but isn't as large as standalone SQL database servers. For example, you could use SQLite and a Delphi wrapper. It is well tested, used in lots of C/C++ and Delphi applications, and can be trusted more than anything you could roll yourself. It is a very light embedded database.
Roll your own ISAM, or VLIR, which will eventually morph over time into your own in-house DBMS. There are multiple files involved, and there are indexes, so you can look up data fast without loading everything into memory. Not recommended.
The most flat of flat binary fixed-record-length files. You originally mentioned PowerBasic in your question, which has something called Random Access files, and then you deleted that from your question. This is probably what you are looking for, especially with append-only writes as the primary operation. Roll your own Turbo Pascal era "file of record". If you use the "FILE OF RECORD" type, you hit the 2 GB limit, and there are problems with Unicode. So use TStream instead, like this. Binary file formats have a lot of strikes against them, especially since it is difficult to grow and expand your binary file format over time without breaking your ability to read old files. This is a key reason why I would recommend you start out with what might at first seem like overkill (SQLite) instead of rolling your own binary solution.
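To show what the fixed-record-length option amounts to, here is a minimal sketch in Python rather than Delphi; the same seek arithmetic applies to a TStream. The record layout (16-byte name, one double, one 32-bit integer) and the helper names are assumptions chosen for the example.

```python
import io
import struct

# Hypothetical layout: 16-byte name, 8-byte double, 4-byte signed int,
# little-endian with no padding, so every record is exactly 28 bytes.
RECORD = struct.Struct("<16sdi")


def append_record(stream, name, value, count):
    stream.seek(0, io.SEEK_END)
    # Names longer than 16 bytes are truncated; shorter ones are NUL-padded.
    # Truncation can split a multi-byte character, so keep names ASCII or short.
    stream.write(RECORD.pack(name.encode("utf-8")[:16], value, count))


def read_record(stream, index):
    # Random access: the offset is simply index * record size.
    stream.seek(index * RECORD.size)
    data = stream.read(RECORD.size)
    if len(data) != RECORD.size:
        raise IndexError(f"no record at index {index}")
    raw_name, value, count = RECORD.unpack(data)
    return raw_name.rstrip(b"\0").decode("utf-8"), value, count


def demo():
    stream = io.BytesIO()  # stands in for a file opened in binary mode
    append_record(stream, "alpha", 1.5, 10)
    append_record(stream, "beta", 2.5, 20)
    append_record(stream, "gamma", 3.5, 30)
    return read_record(stream, 1)
```

The weakness mentioned above is visible here: adding a field changes the record size, and every old file then needs a conversion step or a version marker.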
(Update 2: After updating the question to mention PDFs and what sounds like a reporting-system requirement, I think you really should be using a real database but perhaps a small and easy to use one, like firebird, or interbase.)
I would suggest using TClientDataSet, with its SaveToFile() / SaveToStream() methods in the generating program and its LoadFromFile() / LoadFromStream() methods in the program that will "consume" the data. That way, you can still have indexed records without connecting to any external database, all while keeping the interchange data in a single file.
Define an API for working with your flat file, so that the API can be implemented by a separate data layer in many ways.
Implement the API using a standard embedded SQL database (e.g. SQLite or Firebird).
Only if there is something wrong with the standard solution should you think about writing your own.
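The three steps above can be sketched as an abstract interface plus one SQLite-backed implementation. This is a Python illustration of the design, not Delphi code; the `RecordStore` interface, its method names and the table schema are all assumptions.

```python
import sqlite3
from abc import ABC, abstractmethod


class RecordStore(ABC):
    """The API the rest of the application codes against."""

    @abstractmethod
    def append(self, name, value, quantity):
        ...

    @abstractmethod
    def by_name(self, name):
        ...


class SqliteRecordStore(RecordStore):
    """One possible implementation; a flat-file store could replace it later."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS record "
            "(name TEXT, value REAL, quantity INTEGER)"
        )
        self._conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_record_name ON record (name)"
        )

    def append(self, name, value, quantity):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT INTO record VALUES (?, ?, ?)", (name, value, quantity)
            )

    def by_name(self, name):
        return self._conn.execute(
            "SELECT value, quantity FROM record WHERE name = ? ORDER BY rowid",
            (name,),
        ).fetchall()

    def close(self):
        self._conn.close()
```

The writer and the report tool both depend only on `RecordStore`, so swapping the storage format later does not touch either of them.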
I use KBMMemtable - see http://www.components4developers.com/ - fast, reliable, been around a long time - supports binary and CSV streaming in and out of files, as well indexing, filters, and lots of other goodies - TClientDataSet will not do well with large datasets.
With newer versions of DB2 you can write stored procedures in SQL or you can create procedures in Java (or other languages) that can then be added to the database and called just like SQL procedures. I'm wondering what the pros and cons of each method are. I'm specifically interested in comparing those two types of procedures, not debating procedures versus external code, which I think has already been covered. Here is what I've come up with so far:
SQL:
Better performance for basic SQL functionality
Less verbose for simple logic, i.e. you can run SQL directly
No extra compile step - just create procedure ...
Java:
More structured and full-featured code (classes, objects, reuse, libraries)
Better resources for help both in terms of programmers and documentation
Any other thoughts?
Not just Java but any procedural language: procedural, that's the key.
SQL is built on relational algebra; it is not a procedural programming language. If you need to do something complex, it's better to use a procedural language than to try to bend SQL into something it's not meant to be.
We actually use REXX for our DB2/z work but I've seen guys here use bash on the mainframe for quick prototyping. Some who only use SQL create all these horrible temporary tables for storing things that are best kept in structures and away from the DB.
My thoughts are that complexity is the only differentiating factor between using SQL directly in a stored procedure and a procedural approach. We have two rules of thumb when making these decisions.
1/ If a unit of work requires more than two trips to the database, we make it a stored procedure.
2/ If a stored procedure creates a temporary table, then we use a procedural language to do it.
You may already have found this but just in case and for any others that swing by here there's an IBM Redbook on SProcs and the changes for DB2 v9 available at
DB2 9 for z/OS Stored Procedures: Through the CALL and Beyond
which discusses the available options and their relative merits.
The advantage of SQL stored procedures is that they are portable to other DB2 UDB systems with minimal or no changes. But DB2 external procedures would be a better choice, as you can do more with a procedural language than with SQL alone.
I would say COBOL would be a better fit for DB2 external stored procedures than Java, for the reasons below.
1). You would be able to reuse some code from existing programs. Converting a COBOL subprogram to a stored procedure, or a stored procedure to a COBOL subprogram, is very easy to accomplish.
2). You would be able to use the existing COBOL development team, who have functional knowledge of the system.