Related
I have been asked to update a program written in 1987 in Delphi (I guess). I have no documentation about this program only a few side notes the programmer took that don't make too much sense to make.
The cd show this files:
Size | Filename
19956 VP.DTA
142300 VP.LEX
404 VP.NDX
126502 VP.RCS
131016 VP.SCR
150067 VP.XEL
101791 vp.exe
Is anyone of this files a database? If so can I access it's data?
I tried several code decompilers but they show a message saying it was not a Win32 compatible application.
The program run in MS-DOS.
Is it possible to obtain the source code? Can I use this code in any way to build a new application?
Update01: I can run the program in MS-DOS. The program conjugate verbs and shows an example sentence where the verb can be used. The GUI is a little bit confusing and there is no help menu so I can't see all the capabilities of the program.
Update02: In conversation with the owner of the program we found another solution. He ask me if it was possible to have the program in a server and the clients could login in with a user and a password and execute the program in a terminal. I have an account in my university server, which I can access throughout ssh and compile and execute c programs in it. The server is in linux so I couldn't try the program in it. If I set up a windows server, can I have multiple people accessing and executing the program in a terminal? The program is an exe. Doesn't this raise some security issues?
Delphi is from mid nineties, so that probably means Delphi's ancestor Turbo Pascal, not Delphi.
Some extensions sound familiar, as shortened versions of words:
ndx = index
dta = data
scr = screen (?)
lex = lexicon (list of words or deduped strings in general) (?)
Screen was sometimes used for e.g. helpscreens, a medieval form of helpfiles, they are typicall ansi screens that can be loaded directly into screen memory
There is a fair chance that this is something handcrafted, specially if that date of 1987 and the general assumption "pascal" is true, and not generated by some known database package at all.
Reverseengineering the fileformat might be a more worthwhile way than trying to reverseengineering the app.
A good start would to be to take a the unix "file" command to see if it can recognize the file types. (the file command searches for signatures inside files, and there are windows ports. I use Cygwin's)
A devel experienced in such matters can also see a lot from a hexdump (specially the first parts of a file)
Is it possible to obtain the source code?
Probably not, you may want to look at something like IDA Pro which can disassemble applications to C using something like Hex-Rays.
Do you know what the application is supposed to be?
If it's ms-dos, you're probably better off just drawing up new requirements and doing new development.
Look for DeDe to reverse engineering a delphi compiled program. But as far as i know, delphi is a real compiler. So there is no way to de-compiled it. If you are able to read assembler code then you can try de-compile it. Clipper and Foxpro (dos version) are another stories cause they not real compiler.
This is definitely not Delphi. It might be one of the database centric languages like Clipper 1. .SCR probably means "screen" and defines I/O masks. .NDX is a table index and .DTA means "data".
If it is clipper, you might actually be lucky, because as far as I remember these programs were P code, so it could be possible to decompile it.
It looks like CLipper (NDX and SCR). If you have a DBF file then it's Clipper for sure. But some people renamed the DBF to something like DAT. If it is Clipper, I believe there was a decompile named Valkyrie.
So here is the problem: Recently someone bought a new PC for server to replace an older dating from before 1985 (i wonder how it is possible to work daily from then) .
He wants to put there the old COBOL software and he isnt willing in any means to rewrite it to something better..
So is there any compiler for 1985 cobol? For nowadays red hat linux? Googling it found opencobol and other few but all converted the code to c... Seems too compilacted too me..
UPDATE AS REQUESTED
AIX was the old system
What's the problem with converting the COBOL to C and then compiling? As long as it works. Early C++ environments were implemented in the same way: they converted the C++ to C, and then invoked the C compiler.
Converting the COBOL to C allows them to use high-level abstractions that implement the COBOL equivalents in C. They can leverage the standard C libraries, and also convert the COBOL data access code into calls to widely available databases like MySQL. Finally, converting to C and then compiling leverages the vast amount of development effort that went into code generation. Were they to try compiling directly to object code, they'd have to generate the intermediate code expected by the GNU compiler subsystem, or they'd have to go directly to object code. Either one of those would be much more complicated than converting to C, meaning that the likelihood of bugs in the COBOL compiler would be much higher.
From where I sit, I'd say OpenCOBOL is worth looking into. Note that they say they implement "a substantial part of the COBOL 85 and COBOL 2002 standards." You probably want to make sure that they implement the parts that you need.
I would also suggest that you look into TinyCOBOL.
You don't mention when the application, or AIX was last updated. If these were updated in the last few years, you may be able to port the application, without re-compiling. You should check to see what COBOL compiler was used originally, e.g IBM, RM/COBOL, AcuCOBOL, etc. It might be possible to buy a run-time only version (will execute, but not compile), which would be cheaper than buying a compiler.
A company called Micro Focus make a cobol compiler for Windows but I can assure you it is not cheap at all!
Standard method for doing this is called migrating and involves a number of steps including converting source file to a textfile format or a filetype compatible with the target computer, using an approved method of converting to a file and writing to magtape with compatible recording method of Phase encoding or to disk or other data medium possibly in the ASN.xx mode, transferring to the new computer to then read in the file (through ASN.yy) and store it in a native or import file format, then either use a utility to convert it to the sourcefile format or by running the program development environment to access the native text file or import file and saving the content as a native sourcefile. Perform manual checks and amendments to the source or script code and then compile the program and repeat alterations until a working version is achieved. Create test data files on the new computer and create a new jobfile or macro to run the job in the development environment. When fully tested the program can be run live using data files and live macros or jobfiles migrated over from the old system or newly created in more or less the same way as bringing over the source code. An important point is that the live data must be read into a specialized data takeon or loading program to achieve a populated database before any new transactions occur in the case of a structured datafile being necessary. When moving from AIX or other versions of Unix to an entirely different operating system the characters for end of line and linefeed and end of record may need specific conversion if they are not handled by a file format convertor or exporter utility.
We're trying to untangle a hairball of 100's of units, removing some.
It would be helpful if there was tool that would show us what units were explicitly using unit X.
Penganza doesn't seem to have a report that does that. (Although it has lots of other useful reports.)
Can anyone suggest a tool or strategy for doing this, other than just hiding unit x and then hitting F9 ... repeatedly?
MMX (Model Maker Code Explorer) has a nice unit dependency analyzer (it is especially good at detecting cycles).
For more details, see this answer.
--jeroen
From a similar question here
You might want to take a look at at
CnPack.
CnPack includes a Uses cleaner
wizard wich hasn't failed me yet.
GExperts can show Project Dependencies.
Peganza Pascal Analyzer can do the work. I haven't worked with it much, but a former dev here wrote a system that uses PAL to do the analysis, then dumped the results into a database, and then there's a browser app that lets you enter a unit name and it returns the list of units affected, whether they would need to be rebuilt if the unit changed, or if the interface changed. We use lots of BPLs so you can sometimes change a unit and you don't have to re-build other binaries that use your unit, unless the interface changed. This saves us lots of work (hundreds of BPLs and EXEs).
Chris
Headway Software's Structure 101g (and Restructure 101g) can do that really well, with the Delphi plugin.
Disclaimer: I wrote the flavors to analyze Delphi. I use them professionally, helping clients.
We've just released a freeware utility that does exactly what you need plus quite a bit more. It's called the Delphi Unit Dependency Scanner (DUDs) and you can download it here: http://www.easy-ip.net/delphi-unit-dependency-scanner.html
Sorry it's a bit late!
I was going to mention Icarus, but when I googled them I got this stack overflow answer, which you might want to check out.
Then again, sometimes I just like to delete my whole Unit Output Directory, then count my new DCU's, and that works too.
The reason you may like Icarus and not GExperts is that it doesn't rely on you to have properly maintained the uses statements in your project file.
A newcomer in this field is the Delphi Plugin for Sonar. It does not list unit dependencies but can find unused files and "dead" code (and more).
Implemented features:
Counting lines of code, statements, number of files
Counting number of classes, number of packages, methods, accessors
Counting number of public API (methods, classes and fields)
Counting comments ratio, comment lines (including blank lines)
CPD (code duplication, how many lines, block and in how many files)
Code Complexity (per method, class, file; complexity distribution
over methods, classes and files)
LCOM4 and RFC
Code colorization
Unit tests reports
Assembler syntax in grammar
Include statement
Parsing preprocessor statements
Rules
Code coverage reports
Source code highlight for unit tests
“Dead” code recognition
Unused files recognition
I'm working on a large delphi 6 project with quite a lot of dependancies. It takes several minutes to compile the whole project. The recompilation after a few changes is sometimes much more longer so that it is quicker to terminate Delphi, erase all dcu files and recompile everything.
Does anyone know a way to identify, what makes the compiler slower and slower? Any tips how to organize the code to improve compiler performance?
I have already tried following things:
Explicitly include most of the units in the dpr instead of relying on the search path: It didn't improve anything.
Use the command line compiler dcc32: it isn't faster.
Try to see what the compiler does (using ProcessExplorer from SysInternals): apparently it runs most of the time a function called 'KibitzGetOverloads'. But I can't do anything with this information...
EDIT, Summary of the answers until now:
The answer that worked best in my case:
The function "Clean unused units references" from cnpack. It almost automatically cleaned more than 1000 references, making a "cold" compilation about twice faster. ("cold" compilation = erase all dcu files before compiling). It gets the reference list from the compiler. So if you have some {$IFDEF } check that all your configurations still compile.
The next thing I would like to try:
Refactoring the unit references manually (eventually using an abstract class)
but it is much more work, since I first need to identify where the problems are. Some tools that might help:
GExperts adds a project dependencies browser to the delphi IDE (but unfortunately it can not show the size of each branch)
Delphi Unit Dependency Viewer V1.0 do about the same thing but without Delphi. It can calculate some simple statistics (Which units is the most referenced, ...)
Icarus which is referenced on a link in one of the answer.
Things that didn't change anything in my case:
Putting every files from my program and all components in one folder without subfolders.
Defragmenting the disk (I tried with a ramdisk)
Using a ramdisk for the code source and output folders.
Turning off the live scanning antivirus
Listing all the units in the dpr file instead of relying on the search path.
Using the command line compiler dcc32 or ecc32.
Things that didn't apply to my case:
Avoiding having dependencies on network shares.
Using DelphiSpeedUp, because I already had it.
Using a single folder for all dcu (I always do it)
Things that I didn't try:
Upgrading to another Delphi version.
Using dcc32speed.exe
Using a solid-state drive (I didn't tried it, but I tried with a ramdisk where I put all the source code. But maybe I should have installed delphi on the ramdisk too)
Some things that could slow down the compiler
Redundant units in your uses clause. See this question for a link to CnPack.
Not explicitly adding units to your project file. You've already seem to have covered that.
Changed compiler settings, most notably include TDD32 info.
Try to get rid of unused units in your uses clause and see if it makes a difference.
using Delphi 7 and 2009, last week I pass from almost 2 minutes for compiling and another 45 seconds from hitting f9 and get the main form of my app to 20 seconds compiling and running. This things has drive me crazy for about 6 months and nothing I tried seems to work. Using filemon from SysInternals, I realize than every unit (mostly components) that compiler requires was searched in every folder that was in Search Path, yes, this produce a LOT of FileOpen, FileExists and FileNotFound, etc. What I do was, put every DCU, DFM, RES, etc from components all in a single folder, and having just this folder in the search path, and a couple of others folders required by the project; the results were amazing. Other problem prior to the fix, was debugging. It takes almos 40 seconds in each F7, F8 key press while debuging, this has been fixed too. Hope this info can help you. Greetings form Isla de Margarita, Venezuela. Excuse my english, if any error ;)
Check are there any paths in search paths that aren't on your local machine.
i.e. Don't link to binaries on network shares, and check that the search path isn't checking any network shares.
I haven't seen the compiler get slower over time, but it's been a long time since we used Delphi 6.
It seems to be generally agreed upon in the Delphi community that, if you don't want to upgrade to the latest and greatest (Delphi 2007 or 2009), then Delphi 7 is the best/fastest/most stable. You might consider upgrading.
KibitzGetOverloads sounds like something from the kibitz compiler -- the "background" compiler that gives you code-completion, background error highlighting, code tooltips, etc. Sounds like you'd be better off checking the call stack of the command-line compiler, not the IDE; you'd get something more helpful.
I have never found compiles to be faster after deleting the DCUs. DCUs are there to make the build incremental, therefore faster. If you're seeing faster compiles after deleting all DCUs, check your hardware. Have you defragged your hard disk lately? How much free space do you have on the drive?
Have you set a single folder to get the DCUs. If not, they will be scattered all over.
Put all the units and their implicitly called units (except installed components from Library path) in the dpr. To be sure you did not miss some, empty your search path, it still should compile.
After reducing the search path, you can try to reduce your library path by installing your components into fewer folders.
Although only partly relevant to your exact question, I hear that the use of a solid-state drive is vastly increasing compile time with Delphi - Nick Hodges said this himself on the Delphi Podcast a couple of week ago.
Brian
U can automatically get rid of
unnecesseary unit references, which is very efficient optimization for compiling speed.
In your situation, dividing your
project into packages can improve
compiling speed. With this way, it
just generates modified package(s),
not single massive binary for each
recompilation. Working with packages
can also help about easy deployment
of your project updates.
Turn off your live scanning antivirus
We had the same (or similar) problem.
I of our package has compilation Time about 12 min.
After changes, now we have moved to 32 sg.
After many tests we found that the "problematic situation" was the following:
In a single package:
The A unit uses a large number of units: U1, U2, U3, U4, ... U100 (Uses of Interface) in the same package. This is an important unit that centralizes all the initialization work.
All units of the package, U1, U2, U3, .., U100 uses unit A (use of implementation)
This "circular reference" does not give compilation errors because the USES are different, but caused a large compile-time.
SOLUTION:
Eliminate the reference to each unit, U1, U2, U3 ,...., U100 in the A Unit.
Now, A unit use a large number of units: U1, U2 ,...., U100, but the units U1, U2 ,..., U100, does not use the unit A.
After this change the compile-time is down drastically.
If you have a similar situation, you can try this.
Excuse for my bad english.
Greetings.
Neftalí -Germán Estévez-
I had the same problem and I can come up with (2) reasons it effected me.
Circular references. The gentleman who stated that one was correct. I would have certain LARGE projects that would compile fast, and SMALL projects that compiled slow. Could not figure it out until I restructured the code and then I got the faster compile speeds. Lots of small units. It's easy to build monolithic units. But, there are many penalties from it.
I've heard it a 1000 times, develop on a slow machine like your users might be using. Hey, that's for the testing department. I can't waste time with compiling, Delphi load speeds, packages, etc. I went out and bought a "GAMERS" computer (WOW) with the Solid State Drives (as mentioned earlier), 12GB RAM, OVERCLOCKED "i7" Intel chip, triple video cards (linked), all on Vista64 (Vista is not bad once it is finally running with all installed parts). It was a real pain to get it all set up. But, I am not waiting anymore on my computer. Pure compile speed, load speed, plus a new fresh machine without all of the crap that was installed on the last one over the last 2 years. I even unloaded DelphiSpeedUp. Did not need it. And I don't need to turn off AntiVirus, since I did that one as well, and got penalized with the internet crap. So AntiVirus stays on. Pure and simple, get a BALLS OUT machine. Your time is worth more than what you will spend on a new computer.
Try to install a ram disk and set your dcu output path to point there. This more than halved my compilation time with Delphi 2007 on top of DelphiSpeedUp.
The compiler will only compile units that have changed. If you have changed the code in the interface section all units that depend on the changed unit are compiled. If only code in the implementation section is changed, the compile will only that unit but presumably link all the modules. Implies a good design of interfaces up front but if you restructure the code to restrict changes to the implementation compile times might reduce. I have no idea by how much. This fact is mentioned in the Delphi help files under Multiple and indirect unit references in Delphi 7 "Using Delphi".
Do not compile on network drives. Seek time is dramatically worse.
Consider pointing your dcu ("unit output" directory to a ramdrive.
Limit the number of include/unit directories.
Try to avoid minor circular references that the compiler still accepts, specially for large units (e.g. generated ORM units for your OPF). It might cause large units to be compiled twice. (does Delphi allow minor mutual circulars, or is that a FPC only feature?)
I never tried, but hardcoding all files with full/relative path in the central .dpr might also help (script to regenerate/update?). (you mention that above, but was it with path xx in '\path\yyy' notation?).
Other long shots:
Use Kylix (file/dir I/O under Linux is dramatically better in my experience (though that is from FPC experience)). Maybe we need a reversed cross-kylix :-)
Use a separate (windows) build machine, and tweak NTFS over the registry to be less "safe". (which you don't care for, since everything is a revision system to begin with). Afaik these options can only be done global for all filesystems, hence the separate system. Throw in a raid array or Raptor too.
Forget solid state. Nice buzz atm, but the high write ratio will kill it eventually (both life and performance when it gets fuller and can't optimally allocate anymore), and you need the expensive intel ones to beat two $75 HD's in RAID.
P.s. Sorry for the FPC references. I do both, and I sometimes don't know anymore what belongs to what.
What I do is always make sure to have very few directories in the library path, and most of the components and static code. I also make sure that NO sourcecode is available in the library path, only .dcu/.res etc. Only browsepath has the sourcecode, and special circumstances are handled through searchpath for the project.
Just limit what you compile in any situation.
A few years later I am struggling again with increasing compiling times. I am currently using Delphi XE4 and I am at a point where I absolutely need to refactor the units references. I thought about a new way to identify where are the problems:
I’m using Process Monitor from Microsoft/SysInternals to monitor the compiler:
I start Process Monitor with a filter to show only dcc32.exe
(or bds.exe when working from the IDE).
I build my project from the command line.
At the end I look at the CreateFile operations in the log of Process Monitor.
For each unit there will an entry for the .PAS file (when the compiler starts working on this unit) and one for the .DCU file (when the compiler is complexly done with this unit). By working on the log with a text editor and/or with Excel I can extract this kind of information:
A kind of “tree”, where you recursively see in which order the units have been compiled.
For each unit the delay between “.PAS file opened“ and “.DCU file written”.
Then I try to interpret the results to find places where doing some refactoring would speed the compile time. It is not so easy, but I’m getting some encouraging results.
Our team had been using Delphi 6 for many years, then switched to Delphi 2006 years ago. With both versions we have the following problem: frequently the compiler complains about a unit which is supposedly used recursively. This unit is a 40k LOC unit which is at the core of a project with almost 1 million LOC (third party included).
The error message is incorrect: a full build on the project always works. Unfortunately, the error message does not tell us where the supposed circular reference is, just the name of that unit. Sometimes it even happens that valid error messages are listed 2-4 times until that circular reference problem is "found". Clearly the compiler is running in a circle here. Because of the size of that project it is hard to find the problem manually. Therefore I made a tool which proves that there really is no circular reference (the tool creates a directed dependency graph of the units and determines the coherence components in that graph - there are none except if I deliberately put some in).
This is not only affecting F9 compilation but also code completion / insight which is not working most of the time. Sometimes it works when I press ctrl-space a second time...
Any ideas how we can isolate or even fix the problem? Note that it will be very hard to split the 40k LOC unit into smaller ones because it contains about 15 large classes which depend on each other in the interface section (I know it's bad but should work anyway).
Update
We are constantly refactoring but this is one tough unit to refactor because everything depends on everything, almost. Have been trying to get around it via interfaces but we are talking about some classes with 100s of methods and properties. And it would be slower.
Upgrading to D2009 may be an option down the road but right now we're stuck with D2006 (the unicode stuff and the price tag are two of the stoppers here). Question is anyway if it would help since the problem is in there since D6 at least.
About trimming the uses clauses, we have been doing this frequently with Icarus. But that did not help so far. We are down to 90 custom units in the interface section now. However, with a true circular reference the problem could be in any unit. Also tried to add all units to the dpr.
The project shares a lot of code with other projects, and there are some IFDEFs. However, the defines are not set up in project options but via a common include file. Therefore all modules should see the same defines. Also, the problem reoccurs shortly after a full rebuild without switching to another project.
I will probably be downvoted for this. In D2005 I had a 10k loc unit (datamodule) that flat out stopped compiling. Had to separate out some datasets/code to another datamodule. That 10k unit was and is a mess. You really should consider refactoring out some code to other units. My module has since D2005 / separation grown even worse, but it still compiles in D2007. So my answer is a) refactor and b) upgrade to D2009.
It seems clear that this is due to a slight difference between the background compiler and the real thing. You could look around (QualityCentral) what's known on that topic.
Also, since you didn't explicitly state this, you should remove unnecessary units and move uses statements down to implementation if possible. Maybe your tool could help with this.
And just to be sure you should check the unit aliases and Path settings.
You write that a full build does always succeed, but shortly after the incremental build fails with this error. Assuming that you experience this in the IDE, have you tried to use the command line compiler dcc32 to do incremental builds?
If you don't feed it the "-Q" switch (which probably most Makefiles or scripts for command line builds do) it will output a lot of information what files it compiles in what order. You could either try to do an incremental build after the error appeared in the IDE, or you could keep a command line open next to the IDE and Alt+Tab to it for compilation, skipping compilation in the IDE completely.
I simply assume you have a way to build using dcc32, one way or another - with the size of your project I can't imagine otherwise.
We regularly fall in similar problems, and we never managed (or bothered long enough) to find the precise cause. There seems to be a problem in the order which Delphi chooses to compile the units when hitting Ctrl-F9, which is incompatible with the actual dependency order of the units.
Have you tried deleting "MyBigFatUnit.dcu" before hitting Ctrl-F9?
Have you tried to re-order the declaration of your units in your dpr/dpk files, so that units appear in a correct compilation order? (i.e.: if unit B depends on unit A, unit A should appear first in the dpr/dpk)
Do you have any other projects that use part of the same codebase? If you compile one of them under different compiler settings or IFDEFs, it might change certain things in some of the DCUs which would lead to a circular dependency. A full build rebuilds all DCUs and then the problem goes away.
Try Icarus (free) from Peganza. If that does not tell you what the problem is, try their Pascal Analyzer.
We have this problem as well, also with a fairly large codebase.
We are currently using D2009, but have had this problem with all previous versions of Delphi.
It most frequently happens immediately after doing an update from source control, so I suspect there is some timestamp issue within the Delphi build process.
In our case, if Ctrl-F9 fails and reports the circular reference, a second Ctrl-F9 will generally work
A way I have been told to deal with this is to open another arbitrary file in the project, change that file, save it, and then try running the incremental compile again. Surprisingly enough, this usually works.
We have a 4 MLOC project where this comes up from time to time and this "solution" works for me.
I've fought this before, in my experience the error is quasi-legitimate. It's been a quite a while (the error has nothing to do with the version) but my memory of the situation is that it involves a loop in which part of the loop is in the implementation.
Unit A uses B in the implementation. Unit B uses A in the interface. If you compile B first it calls for A but since the call for B is in the implementation it succeeds. If you compile A first it calls for B, B turns around and calls for A in the interface, boom. Such loops are only safe if both cross references are in the implementation.
The solution is to design things so there is a minimum of stuff used in the interface and to make certain there's nothing resembling a loop in those units. So long as you keep your type definitions separate from units with code this is pretty easy to do.
The error coming and going depending on what you are doing is a hallmark of this issue as it comes down to how you enter the loop. When you do a full build the order is consistent and you either get it 100% or 0%, it's not random.