Let's say I EXEC a COBOL program from JCL, using a SYSIN DD to provide data.
If that program then CALLs a COBOL subprogram, and some data is still available from the input, will the subprogram be able to read it using an ACCEPT statement?
Or is the SYSIN only accessible from the main program of the run unit invoked from the JCL?
Edit: I reckon this would be bad coding practice. I do not intend to use it, nor am I in need of better alternatives, of which I am aware (such as reading the input data in the main program and passing it to the subprogram(s) through their LINKAGE SECTION).
Actually I am not writing COBOL code, but studying / processing it, so I'm interested in "corner case" behavior to refine my understanding of COBOL semantics.
You can only use ACCEPT in the main program. If you want the subprogram to see that data, you should pass it a copy.
I want to write a GUI frontend to gdb, using MI. Currently I can communicate with gdb via a pipe, but a GUI debugger should be able to display source code and allow users to check/modify data using their mouse.
The question is, in order to know what variable the user is pointing at, I think I need to write a parser. However, I don't want to implement a whole lexer and parser myself. How can I get the locations of those identifiers in the source code?
[EDIT]
In short, I want the user to be able to check the value of a variable by hovering over it with the mouse, so I have to parse the code to know where each variable appears. I want to achieve functionality like this:
How can I get the locations of those identifiers in the source code?
... without writing a parser.
You can't. You would need to either write your own (for all programming languages your GUI will support), or hook one of the existing ones.
Clang makes it relatively easy to incorporate the C/C++ parser into a GUI, but ...
not everything can be parsed with Clang
this one aspect of writing a GUI is likely to be 100x more complicated than all the others, so it is perhaps not worth the effort.
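If you do go the Clang route, libclang (its stable C API) will hand you exactly the identifier locations asked about above. A minimal sketch, assuming a C/C++ source file and its compiler flags on the command line (link with -lclang):

#include <stdio.h>
#include <clang-c/Index.h>

/* Print "name file:line:col" for every identifier reference. */
static enum CXChildVisitResult visit(CXCursor c, CXCursor parent,
                                     CXClientData data)
{
    (void)parent; (void)data;
    if (clang_getCursorKind(c) == CXCursor_DeclRefExpr) {
        CXSourceLocation loc = clang_getCursorLocation(c);
        CXFile file;
        unsigned line, col;
        clang_getSpellingLocation(loc, &file, &line, &col, NULL);
        CXString name  = clang_getCursorSpelling(c);
        CXString fname = clang_getFileName(file);
        printf("%s at %s:%u:%u\n", clang_getCString(name),
               clang_getCString(fname), line, col);
        clang_disposeString(name);
        clang_disposeString(fname);
    }
    return CXChildVisit_Recurse;
}

int main(int argc, const char **argv)
{
    CXIndex idx = clang_createIndex(0, 0);
    CXTranslationUnit tu = clang_parseTranslationUnit(
        idx, argv[1], argv + 2, argc - 2, NULL, 0, CXTranslationUnit_None);
    if (tu) {
        clang_visitChildren(clang_getTranslationUnitCursor(tu), visit, NULL);
        clang_disposeTranslationUnit(tu);
    }
    clang_disposeIndex(idx);
    return 0;
}

Each DeclRefExpr cursor is a use of a variable or function, so mapping a mouse position to a name becomes a lookup in the table this produces. Again, this only covers the languages Clang parses.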
I am new to Lua and want to ask whether it is possible to restrict Lua syntax in a config file. I know that config loading has to be performed in a jail, but how can we cope with a while 1 do end in the config file we want to load? Is there a way to allow only strings, assignments and tables in the config, and if not, what is the best way to check that a Lua file doesn't contain undesirable constructs? Is manual pre-parsing the only solution?
You seem to already know about "sandboxing" in Lua. So what's left is, as you say, malicious constructs like infinite loops. To catch those in general you would need to solve the Halting Problem, which is not practical.
Instead of "manually" parsing and hoping you find all the malicious content (you won't), how about just running your Lua interpreter with a timer set so that the script will be interrupted if it takes longer than N seconds?
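With the Lua C API that takes only a hook. A minimal sketch, assuming Lua 5.1; the instruction budget of 100000 and the name run_config are arbitrary choices for illustration:

#include <stdio.h>
#include <lua.h>
#include <lauxlib.h>

/* Called after every 100000 VM instructions; raises a Lua error,
 * which aborts the script and unwinds back through lua_pcall. */
static void budget_hook(lua_State *L, lua_Debug *ar)
{
    (void)ar;
    luaL_error(L, "config script exceeded its instruction budget");
}

int run_config(lua_State *L, const char *path)
{
    lua_sethook(L, budget_hook, LUA_MASKCOUNT, 100000);
    if (luaL_loadfile(L, path) != 0 || lua_pcall(L, 0, 0, 0) != 0) {
        fprintf(stderr, "config error: %s\n", lua_tostring(L, -1));
        lua_pop(L, 1);
        return -1;
    }
    return 0;
}

An instruction-count hook is arguably more robust than a wall-clock timer, since it also fires inside a tight loop that never yields.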
If you want to explicitly forbid certain constructs in Lua, you have to actually scan the file yourself. Note that there are valid uses for those constructs, even in config files, so you are restricting what the user can do.
It wouldn't be too hard to write a simple Lua lexer that ignores the contents of strings and comments, but errors on any of the Lua keywords other than return. Given proper sandboxing (ie: no functions are available to be called), that should be sufficient to weed out anything malicious.
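A sketch of such a scanner, written here in C and hedged accordingly: check_config is a hypothetical helper, it skips quoted strings, line comments and plain [[ ]] long brackets (not the [=[ ]=] variants), and rejects every Lua 5.1 keyword except return:

#include <ctype.h>
#include <string.h>

static const char *banned[] = {
    "and", "break", "do", "else", "elseif", "end", "false", "for",
    "function", "if", "in", "local", "nil", "not", "or", "repeat",
    "then", "true", "until", "while", NULL   /* "return" is allowed */
};

static int is_banned(const char *word, size_t len)
{
    for (int i = 0; banned[i]; i++)
        if (strlen(banned[i]) == len && memcmp(banned[i], word, len) == 0)
            return 1;
    return 0;
}

/* Returns 0 if the chunk looks like plain data, -1 on a banned keyword. */
int check_config(const char *s)
{
    while (*s) {
        if (s[0] == '-' && s[1] == '-') {              /* comment */
            if (s[2] == '[' && s[3] == '[') {          /* --[[ long ]] */
                const char *e = strstr(s + 4, "]]");
                s = e ? e + 2 : s + strlen(s);
            } else {
                while (*s && *s != '\n') s++;          /* to end of line */
            }
        } else if (*s == '"' || *s == '\'') {          /* quoted string */
            char q = *s++;
            while (*s && *s != q) {
                if (*s == '\\' && s[1]) s++;           /* skip escape */
                s++;
            }
            if (*s) s++;                               /* closing quote */
        } else if (s[0] == '[' && s[1] == '[') {       /* long string */
            const char *e = strstr(s + 2, "]]");
            s = e ? e + 2 : s + strlen(s);
        } else if (isalpha((unsigned char)*s) || *s == '_') {
            const char *w = s;
            while (isalnum((unsigned char)*s) || *s == '_') s++;
            if (is_banned(w, (size_t)(s - w)))
                return -1;
        } else {
            s++;
        }
    }
    return 0;
}

Combined with the sandbox (an empty environment with no callable functions), a file that passes this check can do little more than build values.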
Also, note that Lua 5.1 doesn't make it easy to keep the parser from parsing non-text data (ie: compiled Lua bytecode). 5.2 offers specific API support for forcing the loader to only recognize text and therefore reject bytecode.
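Concretely, in the 5.2 C API the load functions take a mode string; a minimal sketch (the wrapper name load_text_only is just for illustration):

#include <lua.h>
#include <lauxlib.h>

/* Mode "t" accepts text chunks only, so precompiled bytecode is
 * rejected before the parser ever sees it ("b" = binary, "bt" = both). */
int load_text_only(lua_State *L, const char *buf, size_t len)
{
    return luaL_loadbufferx(L, buf, len, "config", "t");
}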
Many games these days make some Lua scripting available, but it is universally undocumented.
So let's say I can get a game to run my Lua script (it's Lua 5.1), and the script can write what it finds to text files on disk. How much can I discover about the environment the script is executing in?
For example, it seems I can list keys in tables, and find out what's a function and what's some other type of object, but there's no obvious way to guess how many arguments a function takes (and a mistake usually results in a crash to desktop).
Most languages provide some reflection functionality that could be used here - how much is possible in embedded lua environment?
"debug" standard library has some functions, which you may find useful:
debug.getfenv - Returns the environment of object.
debug.getinfo - Returns a table with information about a function.
... and more
The Lua Reference Manual also states:
several of these functions violate some assumptions about Lua code (e.g., that variables local to a function cannot be accessed from outside or that userdata metatables cannot be changed by Lua code) and therefore can compromise otherwise secure code.
So with the debug library, you may access more.
Unfortunately, there is not much you can learn about functions in Lua; by design they accept any number of parameters. Without the ability to look at the sources, your only resort is the documentation and/or other samples.
The most you can do in this case is traverse the entire _G table recursively and dump every table/function, printing the results to a file.
"A mistake usually results in crash to desktop" is a sign of a really bad design - good API should tell you, that it expects A, and you passed B. For example in Lqt, a Qt binding to Lua, we check every parameter against the original Qt API, so that the programmer is notified of mistakes:
> QApplication.setFont(1, 2)
QApplication::setFont(number, number): incorrect or extra arguments, expecting: QFont*,string,.
I'm starting to learn about COBOL. I have some experience writing programs that deal with SQL databases, and I guess I'm confused about how COBOL stores and retrieves data stored on a mainframe, for example. I know it's not like relational databases, but every example program I've seen takes data straight from the command line, and I know that's not how real-world COBOL programs process data. Can someone explain this, or point me to a good resource that explains it?
COBOL is just another third-generation computer language. It is just a little older than most, which doesn't mean it is somehow incomplete (actually it comes with quite a bit of baggage, but that is another story).
As with any other third generation language, COBOL manipulates data files in pretty much the same way that you would in a C program. Nothing odd, mysterious or magical about it. Files are opened, read, written and closed using the file I/O features of the language.
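To make the comparison concrete, here is that life cycle in C; the COBOL statements shown later follow the same OPEN/READ/CLOSE shape (a minimal sketch, the file name is made up):

#include <stdio.h>

int main(void)
{
    char record[81];                          /* one 80-byte record + NUL */
    FILE *f = fopen("my.file", "r");          /* OPEN  */
    if (f == NULL)
        return 1;
    while (fgets(record, sizeof record, f))   /* READ, record by record  */
        fputs(record, stdout);
    fclose(f);                                /* CLOSE */
    return 0;
}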
Various mechanisms are used to form a link between an actual file and the program. The details here are often specific to the operating system you are working under. Generally, COBOL implementations try to isolate themselves from the operating environment through a logical file name as opposed to an actual name. This added indirection is important when you are writing programs that will be ported to different platforms (e.g. write and test within an IDE on a Windows platform, and then run on a mainframe).
The following examples relate to an IBM Mainframe environment.
Within the IBM mainframe world, you will find that programs are run as either batch or on-line (e.g. CICS). I will not describe how to set up for file I/O under CICS (that's a long story). Programs that are used to manipulate files are usually batch. Here is a rough illustration of how a batch program works:
Batch programs are run via JCL. JCL is used to identify the program to run ('EXEC' statement) and to identify which files your program will reference using 'DD' statements. The function of a DD Statement is to form a logical connection between an actual file and a name your COBOL program will reference when it wants to refer to the file (this is the isolation mechanism mentioned earlier). For example,
//JCLDDNAM DD DSN=HLQ.MY.FILE,...
would associate the 'DD' name 'JCLDDNAM' to the file named 'HLQ.MY.FILE'. This part is platform dependent, so the details are specific to the operating environment.
In the 'FILE-CONTROL' section of your COBOL program, you connect the 'DD NAME' defined in your JCL with the name you will use on each I/O statement to reference that file. This connection is defined using the 'SELECT' statement.
For example,
SELECT MYFILE
    ASSIGN TO JCLDDNAM
    ... remainder of SELECT ...
makes a connection between whatever file you associated with 'JCLDDNAM' in your 'JCL' and 'MYFILE', which you will later reference in COBOL I/O statements. The SELECT statement itself is part of the ISO COBOL standard. However, many COBOL implementations define some non-standard extensions to facilitate various quirks of their file subsystems.
Open, read, write, close files within the 'PROCEDURE DIVISION' of your program using the name 'MYFILE', as in:
OPEN INPUT MYFILE
READ MYFILE
CLOSE MYFILE
The above is highly simplified, and there are a multitude of ways to do this within COBOL. Understanding the complete picture will take some real effort, time and practice. The I/O statements illustrated above are part of the COBOL standard, but every vendor will have their own extensions.
IBM COBOL supports a wide range of file organizations and access methods. You can review the IBM Enterprise COBOL Language Reference manual here to get the syntax and rules for file manipulation. However, the User Guide provides a lot of good examples for reading/writing files (you will have to dig a bit, but it is all laid out for you).
The setup to reference an SQL database via a COBOL program is somewhat different but involves setting up a connection between your program and the database subsystem. Within the IBM world this is done through JCL; other environments will use different mechanisms.
IBM COBOL uses a pre-processor or co-processor to integrate database access and data exchange. For example, the following code would retrieve some data from a DB2 database:
MOVE 1234 TO PERSON-ID
EXEC SQL
SELECT FIRST_NAME, LAST_NAME
INTO :FIRST-NAME, :LAST-NAME
FROM PERSON
WHERE PERSON_ID = :PERSON-ID
END-EXEC
DISPLAY PERSON-ID FIRST-NAME LAST-NAME
The stuff between EXEC SQL and END-EXEC is a pretty simple SQL select statement. The names preceded by colons are COBOL host variables used to pass data to DB2 or receive it back. If you have ever coded database access routines before, this should look very familiar to you. This link provides a simple introduction to incorporating SQL statements into an IBM Enterprise COBOL program.
By the way, IBM Enterprise COBOL is capable of working with XML documents too. Sorry for the heavy IBM slant, but that is the environment I am most familiar with.
Hope this gets you started in the right direction.
Who said you cannot use SQL to retrieve data from a COBOL application, maybe without spending money?
A company I used to work for, did just that - with SQLite. This little gem of a public domain library compiles SQL statements to bytecode, then it executes them.
By replacing the "backend" level of SQLite with a custom interface to the C library that deals with COBOL files, it was possible to query the COBOL data from other languages, Python in that case. It worked, within the limits of SQLite of course, but it was stable, it seemed relational enough, and it didn't even require a DB server :-)
Traditional COBOL batch environments use a 'data section' of the COBOL program to directly declare database connections, which are in turn set up in JCL. Since COBOL predates SQL, those would have tended to be various other types of databases, but it's likely that IBM made SQL work with DB2. I imagine you'll get another answer from someone closer to this stuff. If you look at the SQL preprocessors available for use with other languages you'll get the idea: a cursor becomes a native datatype and delivers query results to native variables.
Mainframe Cobol uses embedded SQL (kinda like SQLj), e.g.:
Procedure Division.
Exec SQL
Select col1, col2
into :ws-col1, :ws-col2
from myTable
where col0 = :ws-col0
End-Exec
In this case, the host variables ws-col0, ws-col1 and ws-col2 are defined in the working-storage section. The database interface manages getting that data in the right place.
Very easy compared to the distributed stuff actually.
All the IBM mainframe shops I've worked in have used COBOL which talked with a relational database. Generally that has been IBM's DB2. Please note that DB2 is a relational database that runs on mainframes. It can also be run on Windows or Linux.
Twenty years ago, a predominant way to enter data into the DB2 mainframe database was to use CICS. CICS is "presentation level" software that communicates with character-based data entry screens. Consider CICS the functional equivalent of PHP or ASP.NET.
Today there are many more options to get data into DB2. CICS is still an option but your "presentation layer" could be PHP, ASP.NET, Win Forms, Java JSF, Powerbuilder. The key thing is that your development platform would need to be able to work with a DB2 database driver. The platform could be Windows, Linux, and possibly others.
My point is that data can get into the mainframe DB2 database in many ways and from many platforms. The COBOL language might be involved in data entry, reporting, altering data, etc. But it might only be part of a multi-tier application that could be part Windows, web, and mainframe. I could give specific examples if you have more information about the application you'll be working with at your internship.
I'm creating a compiler with Lex and YACC (actually Flex and Bison). The language allows unlimited forward references to any symbol (like C#). The problem is that it's impossible to parse the language without knowing what an identifier is.
The only solution I know of is to lex the entire source, and then do a "breadth-first" parse, so higher level things like class declarations and function declarations get parsed before the functions that use them. However, this would take a large amount of memory for big files, and it would be hard to handle with YACC (I would have to create separate grammars for each type of declaration/body). I would also have to hand-write the lexer (which is not that much of a problem).
I don't care a whole lot about efficiency (although it still is important), because I'm going to rewrite the compiler in itself once I finish it, but I want that version to be fast (so if there are any fast general techniques that can't be done in Lex/YACC but can be done by hand, please suggest them also). So right now, ease of development is the most important factor.
Are there any good solutions to this problem? How is this usually done in compilers for languages like C# or Java?
It's entirely possible to parse it. Although there is an ambiguity between identifiers and keywords, lex will happily cope with that by giving the keywords priority.
I don't see what other problems there are. You don't need to determine if identifiers are valid during the parsing stage. You are constructing either a parse tree or an abstract syntax tree (the difference is subtle, but irrelevant for the purposes of this discussion) as you parse. After that you build your nested symbol table structures by performing a pass over the AST you generated during the parse. Then you do another pass over the AST to check that the identifiers used are valid. Follow this with one or more additional passes over the AST to generate the output code, or some other intermediate data structure, and you're done!
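A minimal sketch of that shape in C, with the AST flattened to a list and a toy symbol table (all names hypothetical):

#include <stdio.h>
#include <string.h>

typedef struct Node {
    enum { N_DECL, N_USE } kind;
    const char *name;
    struct Node *next;     /* real ASTs are trees; a list suffices here */
} Node;

typedef struct Sym { const char *name; struct Sym *next; } Sym;

static Sym pool[256];
static int used;

static Sym *declare(Sym *tab, const char *name)
{
    pool[used].name = name;
    pool[used].next = tab;
    return &pool[used++];
}

static int lookup(const Sym *tab, const char *name)
{
    for (; tab; tab = tab->next)
        if (strcmp(tab->name, name) == 0)
            return 1;
    return 0;
}

int main(void)
{
    /* A use that appears before its declaration: legal, because
     * resolution is a separate pass over the already-built AST. */
    Node decl = { N_DECL, "Foo", NULL };
    Node use  = { N_USE,  "Foo", &decl };
    Node *ast = &use;

    Sym *tab = NULL;
    for (Node *n = ast; n; n = n->next)       /* pass 1: collect decls */
        if (n->kind == N_DECL)
            tab = declare(tab, n->name);
    for (Node *n = ast; n; n = n->next)       /* pass 2: check uses */
        if (n->kind == N_USE && !lookup(tab, n->name))
            fprintf(stderr, "undefined identifier: %s\n", n->name);
    return 0;
}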
EDIT: If you want to see how it's done, check the source code for the Mono C# compiler. This is actually written in C# rather than C or C++, but it does use a .NET port of Jay, which is very similar to yacc.
One option is to deal with forward references by just scanning and caching tokens till you hit something you know how to deal with (sort of like "panic-mode" error recovery). Once you have run through the full file, go back and try to re-parse the bits that didn't parse before.
As to having to hand-write the lexer: don't. Use lex to generate a normal lexer, and just read from it via a hand-written shim that lets you go back and feed the parser from a cache as well as from what lex produces.
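A sketch of that shim, assuming flex with %option prefix="real_" so the generated scanner becomes real_lex (a real version would also have to snapshot yylval and yytext alongside each token, since the parser needs those on replay):

#include <stddef.h>

extern int real_lex(void);     /* the lex-generated scanner, renamed via
                                  %option prefix="real_"                */

enum { CACHE_MAX = 4096 };
static int    cache[CACHE_MAX];
static size_t cache_len;       /* tokens seen so far            */
static size_t replay_pos;      /* next cached token to hand out */
static int    replaying;

/* Rewind: the next calls to yylex() replay the cached tokens. */
void start_replay(void)
{
    replay_pos = 0;
    replaying = 1;
}

/* yacc calls this; it feeds the parser from the cache or from lex. */
int yylex(void)
{
    if (replaying && replay_pos < cache_len)
        return cache[replay_pos++];
    replaying = 0;
    int tok = real_lex();
    if (cache_len < CACHE_MAX)
        cache[cache_len++] = tok;
    return tok;
}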
As to making several grammars: with a little fun with a preprocessor on the yacc file, you should be able to make them all from the same original source.