Related
I have a task to convert COBOL code to .NET. Are there any converters available? I am trying to understand COBOL code in high level. I have a trouble understanding the COBOL code. Is there any flowchart generators? I appreciate any help.
Thank you..
Migrating software systems from one language or operating environment to another is always a challenge. Here are
a few things to consider:
Legacy code tends to be poorly structured as a result of a
long history of quick fixes and problem work-arounds. This really ups the signal-to-noise ratio
when trying to warp your head around what is really going on.
Converting code leads to further "de-structuring"
to compensate for mis-matches between the source and
target implementation platforms. When you start from a poorly structured base (legacy system),
the end result may be totally un-intelligible.
Documentation of the legacy architecture and/or business processes is generally so far out of
date that it is worse than useless, it may actually be misleading.
Complexity of COBOL code is almost always under estimated.
A number of "features" will be promulgated into the converted system that were originally
built to compensate for things that "couldn't be done" at one time (due to smaller memories,
slower computers etc.). Many of these may now be non-issues and you really don't want them.
There are no obvious or straight forward ways to refactor legacy process driven
systems into an equivalent object oriented system (at least not in a meaningful way).
There have been successful projects that migrated COBOL directly into Java. See naca.
However, the end result is only something its mother (or another COBOL programmer) could love, see this discussion
In general I would be suspicious of any product or tool making claims to convert your COBOL legacy
system into anything but another version of COBOL (e.g. COBOL.net). To this end you still
end up with what is essentially a COBOL system. If this approach is acceptable then you
might want to review this white paper from Micro Focus.
IMHO, your best bet for replacing COBOL is to re-engineer your system. If you ever find
a silver bullet to get from where you are to where you want to be - write a book, become
a consultant and make many millions of dollars.
Sorry to have provided such a negative answer, but if you are working with anything
but a trivial legacy system, the problem is going to be anything but trivial to solve.
Note: Don't bother with flowcharting the existing system. Try to get a handle on process input/output and program to program data transformation and flow. You need to understand the business function here, not a specific implementation of it.
Micro Focus and Fujitsu both have COBOL products that work with .NET. Micro Focus allow you to download a product trial, while the Fujitsu NetCOBOL site has a number of articles and case studies.
Micro Focus
http://www.microfocus.com/products/micro-focus-developer/micro-focus-cobol/windows-and-net/micro-focus-visual-cobol.aspx
Fujitsu
http://www.netcobol.com/products/Fujitsu-NetCOBOL-for-.NET/overview
[Note: I work for Micro Focus]
Hi
Actually, making COBOL applications available on the .NET framework is pretty straightforward (contrary to claim made in one of the earlier responses). Fujitsu and Micro Focus both have COBOL compilers that can create ILASM code for execution in the CLR.
Micro Focus Visual COBOL (http://www.microfocus.com/visualcobol) makes it particularly easy to deploy traditional, procedural COBOL as managed code with full support for COBOL data types, file systems etc. It also includes an updated OO COBOL syntax that takes away a lot of the verbosity & complexity of the syntax to be very easy to write COBOL code based on C# examples. It's unique approach also makes it easy to use all the Visual Studio tools such as IntelliSense.
The original question mentioned "convert" and I would strongly recommend against any approach that requires the source code to be converted to some other language before being used in a .NET environment. The amount of effort and risk involved is highly unlikely to be worth any benefits accrued. On the contrary, keeping the code in COBOL maintains the existing, working code and allows for the option to deploy onto other platforms in the future. For example, how about having a single set of source code and having the option to deploy into .NET as a native language and into a Java environment without changing a line of source code?
I recommend you get a trial copy of Visual COBOL from the link above and see how you can use your existing code in .NET without making any changes.
This is not an easy task. COBOL has fundamental ideas about data types that do not map well with the object-oriented .NET framework (e.g. in COBOL, all data types are represented in terms of fixed-size buffers) and in particular the way groups and arrays work do not map well to .NET classes.
I believe there are COBOL compilers that can actually compile .NET bytecode, but they would have their own runtime libraries to manage all of that. It might be worth looking at one of these compilers and simply leaving the legacy code in COBOL.
Other than that, line-by-line translation is probably not possible. Look at the code at a higher level and translate blocks of code at a time (e.g. at the procedure level or even higher).
There are a lot mechanisms how to convert COBOL to modern scalable environments, such as .NET or Java.
The first is a migration to a new environment with saving the existing COBOL code with some minor modifications (NET Microfocus COBOL);
The second is a migration to a new platform with simulation of COBOL statements and constructions. When there are some additional NET/Java libraries to simulate some specific COBOL logic:
ACCEPT goes to NETLibrary.Accept and so on.
The third approach is the most valuable one, when you migrate to "pure" NET/Java code with all the benefits of the new environment. It can be easily maintained and developed in the future.
However, the unique expertise and toolkits are required for this approach, and there are only a few players on the global market that can help you in this case.
If we are talking about automatic migration, the number of players decreases greatly and, unfortunately for you, you have to pay for the specific technologies and tools (like ours).
However, it is a better idea to invest your money in your future growth in the modern environment, than to spend your money on the "simulation" of old technologies.
Translations is not an easy task. Besides Micro Focus and Fujitsu there is also Raincode that offers a free version of Cobol that nicely integrates with Visual Studio.
I've skimmed Programming in Lua, I've looked at the Lua Reference.
However, they both tells me this function does this, but not how.
When reading SICP, I got this feeling of: "ah, here's the computational model underlying scheme"; I'm trying to get the same sense concerning Lua -- i.e. a concise description of it's vm, a "how" rather than a "what".
Does anyone know of a good document (besides the C source) describing this?
You might want to read the No-Frills Intro to Lua 5(.1) VM Instructions (pick a link, click on the Docs tab, choose English -> Go).
I don't remember exactly where I've seen it, but I remember reading that Lua's authors specifically discourage end-users from getting into too much detail on the VM; I think they want it to be as much of an implementation detail as possible.
Besides already mentioned A No-Frills Introduction to Lua 5.1 VM Instructions, you may be interested in this excellent post by Mike Pall on how to read Lua source.
Also see related Lua-Users Wiki page.
See http://www.lua.org/source/5.1/lopcodes.h.html . The list starts at OP_MOVE.
The computational model underlying Lua is pretty much the same as the computational model underlying Scheme, except that the central data structure is not the cons cell; it's the mutable hash table. (At least until you get into metaprogramming with metatables.) Otherwise all the familiar stuff is there: nested first-class functions with mutable local variables (let-bound variables in Scheme), and so on.
It's not clear to me that you'd get much from a study of the VM. I did some hacking on the VM a while back and it's a lot like any other register-oriented VM, although maybe a bit cleaner. Only a handful of instructions are Lua-specific.
If you're curious about the metatables, the semantics is described clearly, if somewhat verbosely, in Section 2.8 of the reference manual for Lua 5.1. If you look at the VM code in src/lvm.c you'll see almost exactly that logic implemented in C (e.g., the internal Arith function). The VM instructions are specialized for the common cases, but it's all terribly straightforward; nothing clever is involved.
For years I've been wanting a more formal specification of Lua's computational model, but my tastes run more toward formal semantics...
I've found The Implementation of Lua 5.1 very useful for understanding what Lua is actually doing.
It explains the hashing techniques, garbage collection and some other bits and pieces.
Another great paper is The Implmentation of Lua 5.0, which describes design and motivations of various key systems in the VM. I found that reading it was a great way to parse and understand what I was seeing in the C code.
I am surprised you refer to the C source for the VM as this is protected by lua.org and the tecgraf/puc rio in Brazil specially as the language is used for real business and commercial applications in a number of countries. The paper about The Implementation of lua contains details about the VM in the most detail it is permitted to include but the structure of the VM is proprietary. It is worth noting that versions 5.0 and 5' were commissioned by IBM in Europe for use on customer mainframes and their register-based version have a VM which accepts the IBM defined format of intermediate instructions.
What are some real-world projects done in concatenative languages like Forth, Factor, Joy, etc.?
factorcode.org, concatenative.org and tinyvid.tv are powered by Furnace, a Factor web server and framework.
PostScript is concatenative, and there's obviously a huge number of applications of PostScript. It's just not a general purpose programming language.
As Greg wrote, postscript is the mammoth example.
Concatenative languages pop up everywhere, quite naturally, because of the trivial nature of the language runtime. It's a favourite for many firmwares: I first encountered Forth "in the flesh" in the bootloader for a Sun Sparcstation. It powers the firmware for the OLPC.
Ocaml's parent, Caml was based on realising the semantics of functional programming as the Categorical Abstract Machine (the CAM in Caml).
Bibtex uses a concatenative language to compile style files.
There is the somewhat-obsolete but very cool Quartus Forth for Palm which allowed full compiled application development on the Palm device (Forth as a minimalist language works rather well in those circumstances). Their home page lists several Palm apps.
This FIG page has a list of mostly-embedded projects including a reference to the very cool use of Forth by NASA.
I met a guy at an Apple conference in Queensland back in about 1991 who had retailed a road planning application written in MacForth.
Christopher Diggins was talking about his Cat language being used inside Microsoft to help optimise compilers but I don't know if that went anywhere.
I suspect PowerMOPS (the successor to Neon) may elude the definition of concatenative because its big deal is adding object-orientation, which implies instances.
Take a look at FORTH Inc, They list several projects that they and their customers did, using their FORTH.
Eserv and nncron are written in SP-Forth.
Bitcoin protocol, and most of the other cryptocoins, uses pubkey scripts and signature scripts for validation of transactions:
Pubkey scripts and signature scripts combine secp256k1 pubkeys and signatures with conditional logic, creating a programable authorization mechanism.
These scripts are written in a concatenative language:
The script language is a Forth-like stack-based language deliberately designed to be stateless and not Turing complete. Statelessness ensures that once a transaction is added to the block chain, there is no condition which renders it permanently unspendable. Turing-incompleteness (specifically, a lack of loops or gotos) makes the script language less flexible and more predictable, greatly simplifying the security model.
Part of the firmware on Macs (at least in the older PowerPC models) was written in Forth.
See: Link
I've always been interested in writing and designing programming languages. Of course, it's pretty difficult to find an employer that will let you write a programming language as part of your job. So I'm looking for the "next best thing".
What fields of programming will let me get some experience solving some related problems? Or what kinds of employers are most likely to view all of my dinky little interpreters as relevant experience?
If your interest in language design is irrepressible, get a Ph.D. and make it your area of research. You can count on academia to support all manner of unprofitable activity.
None. The bulk of the professionals in that field do not design languages for a living, but retarget existing compilers to new (usually embedded) targets, or work on source2source conversion systems for legacy code, making a few language extensions in the process.
You should really ask yourself if you want this, because, besides from an extremely lucky shot, that is the realistic outlook of what you will do if you go into this industry.
Remember that the big public toolchain industry is not very profitable at the moment, and that maybe a good 100 languages are in largescale pulbic use and continually maintained, after 30 years of programming languages creation.
I know this is is very gloom, but I hope it sets you on the path to chuck the romantic, hobbyist view, and start researching how the real world in this field looks like.
Moreover, having done small hobby projects on your own is not really a pre. You need to show that you can work on large projects in a team, more than that you can create a small interpreter on your own. If you really want to pursue this, I'd recommend:
stay in school, and get a bachelor (preferably a master or PHD) in CS.
join some opensource team that works on a significant project in the field. gcc, but also the Java world, Tracemonkey (Mozilla), Mono etc. Verifiable experience in real world scenarios is very important.
I think the best way to get into this type of work would be to undertake an advanced degree with a specific focus on language design, compilers etc. It's going to be very tough for you to walk in off the street into a private company and start writing new language features otherwise.
You could also shoot a little higher and on your own, or with a small team, produce something that is much more than just a dinky little interpreter. Show your potential employer that you can produce something useful.
I have worked as an embedded programmer for the past ten years. Before that I wrote compilers (and assemblers, linkers, debuggers, etc.) for 20 years.
My co-workers joke that I turn every problem in to a parsing problem. And they're right. I've used techniques that are appropriate for language design many times during the course of my career.
Today, I play around with compiler stuff on the side: http://ellcc.org. It helps me scratch my language itch.
Actually, there is a fair bit of work going on with visual programming. It isn't exactly traditional programming language work as we know it but there is a need for it. For example, a lot of advanced data analysis tools rely on visual programming tools (Pentaho). You don't have to look too hard to find good practical uses of visual programming.
To get into visual programming languages, you will need to do an advanced degree with an advisor in the area. You will need to do some human computer interaction / interface work in addition to the programming language stuff.
An employer that has a rich "domain" (i.e. a complex industry) can benefit from a "domain specific language".
Will they realise this? Unlikely. They'll be too likely trapped in their deep domain (and entrenched legacy systems) to see that a targeted language could help unclog the mire.
But if you bury yourself in a complex industry for long enough to gain rich domain knowledge you may then be able to turn them with your own skunkwork DSL. Slim chance.
Stay in academia. If you want to develop a new language your chances of being paid to do so are vanishingly small. Newer languages tend to be expressions of a novel problem domain, and you only really encounter the chance to develop them where (a) novel problems are part of the scenery, and (b) no-one is troubled by the necessity to actually earn a living.
Please take your time over it, as well. Speaking as a jobbing developer, the last thing I need is another blasted language to learn :-)
In static analysis there is a lot to do, and the problems that come up are related to those that interest you.
Most currently popular languages came out of a geniune NEED to scratch a particular ITCH. Python came about because some non-C programmers NEEDed to customize inputs their C programs and libraries. Lua came out of the NEED to embed a scripting language in to C programs. Erlang was created to address the NEED of 99.9999999% uptime, hot code loading, and highly concurrent execution. Perl came out of the NEED to easily write programs that parsed text files.
So the very simple question any employer will be asking themselves, and you should ask yourself is. What NEED can I supply a solution to that doesn't exist. Hobby work very seldom shows that you are providing solutions to a NEED, most of the time it is showing that you like to re-invent the wheel for the sake of re-inventing the wheel.
I have recently spent several years translating legacy FORTRAN into Java. Prior to that, I found myself translating FORTRAN into C (for which I wrote a simple translation tool). After all this work, I find myself wondering how many others are doing similar language-to-language translations and whether an automated way of doing so would be beneficial.
I know about F2C, For_C, F2J and others, as well as some of the translation sites, but none seem to be all that successful. Having seen output from For_C, I can see why it just hasn't taken off. While it is technically correct, it is very difficult to maintain.
So, I guess what I am wondering is if there were are tool that produced more maintainable, more grok-able code than the code I have seen, would developers use it? Or are developers as jaded as many posts seem to indicate and unwilling to use generated code as it could never be as good as their manually translated code?
In short, no. Obviously time restraints necessitate it sometimes, but...
Rarely is code written in one language going to translate well to another - every language has certain ways of doing things that are more suited to the constructs available / common libraries / etc.
Consider for example a program written in C as compared to something written in Python - certainly you can write for loops and iterate through things in Python just as easily as you can in C, but it is much simpler to use list comprehensions and take advantage of the features the language provides.
I'd be surprised to see an example of a reasonably sized program written in any language that could be translated into 'correct', well-maintainable code in any other.
This was already covered to some extent in Conversion of Fortran 77 code to C++, but I'll take a stab at it here.
I think there's a lot of time wasted translating legacy code to new languages. It takes a phenomenal amount of time and energy to do, and you introduce new bugs when you do it.
Joel mentioned why rewriting from scratch is a horrible idea in Things you Should Never do Part I, and though I realize that translating something to a new language isn't quite the same as rewriting from scratch, I claim it's close enough:
Automated translation tools aren't wonderful because you don't get anything maintainable out of them. You pretty much have to know the old code to understand the new code, and then what have you gained?
To port something manually, you have to know how the code works to do it well. Rewriting code is seldom done by the original developers, so you seldom get people who understand everything that's going on to do the rewrite. I worked at a company where an outsource team was hired to translate an entire website backend from ColdFusion to JSP. That project kept getting delayed and delayed because the port team didn't know the code at all. Our guys never quite liked their design, and they never quite got it right, so there was constant iteration as everyone worked out all the issues that were solved in the original code. Then, the porting itself took forever.
You also need to be familiar with really technical inconsistencies between languages. People who are very familiar with two languages are rare.
For Fortran specifically, I now work at a place where there are millions of lines of legacy Fortran code, and no one here is about to rewrite it. There's just too much risk. Old bugs would have to be re-fixed, and there are hundreds of man-years that went into working out the math. Nobody wants to introduce those kinds of bugs, and it's probably downright unsafe to do it.
Instead of porting, we have hybrid codes. After all, you can link Fortran and C/C++, and if you make a C interface around your Fortran code, you can call it from Java. Modern codes here have C/C++ components that make calls into old Fortran routines, and if you do it this way you get the added benefit that Fortran compilers are screaming fast, so the old code continues to run as fast as it ever did.
I think the best way to handle this is to do any porting you need to do incrementally. Make a lightweight interface around your old fortran code and call the pieces you need, but only port things as you need them in the new part. There are also component frameworks for integrating multi-language applications that can make this easier, but you can check out Conversion of Fortran 77 code to C++ for more on that.
Since programming is hard, no such tool can really exist.
If it was trivial to change one language into another, the idea of "compiler" would be moot. You'd just map the language you liked into the language of the hardware, press the button and be done.
However, it's never that simple. Each VM, each language, each API library adds nuances that are just impossible to automate.
" I can see why it just hasn't taken off. While it is technically correct, it is very difficult to maintain."
Correct for F2C as well as Fortran to machine language. The object code generated from most compilers can't easily be read by people. Either it's cruddy or it's highly optimized. Either way, it doesn't look a thing like an expert human would write in the assembler language for that hardware.
If only compiling could be reduced to some XSLT-like transformations that preserved the clarity of the old language in the new language. If there was only some universal Lingua Franca of computing that would be the Rosetta Stone of programming.
Until someone invents that Lingua Franca of computing, every language translation job will be hard and will lead to code that's "difficult to maintain" in the new language.
I've used f2c, and I agree with whoever wanted to name it cc2fc instead. It isn't a way of transforming Fortran into anything vaguely usable as C. It's a way of taking a C compiler and making a Fortran compiler out of it.
It did work just fine at taking that Fortran code and turning it (through C) to a Macintosh library I could call from Macintosh Common Lisp. Those were the days.