How to refactor a Delphi unit with 10000 lines with no documentation?

I have been assigned the task of refactoring a Delphi unit. Wow. 10000 lines of code, no documentation, tons of copy-and-paste code.
There are many methods made with copy and paste that could be refactored, but I am lost in all those lines. I have the interface section where I can "find my way", but in general, what do you suggest for tackling this kind of task?
Thanks.

Get yourself a copy of Working Effectively with Legacy Code by Michael Feathers. It has all kinds of techniques for safely refactoring code to get it running under a test framework. Examples are mostly in Java and C++ but should be easy enough to figure out.
Install a third-party refactoring tool (or several) such as CodeRush for Delphi (sadly no longer developed), Castalia or ModelMaker Code Explorer. Delphi has some refactoring support built in, but in my experience it is too limited and tends to choke on very large code bases.
Buy a copy of Simian. It doesn't have direct support for Object Pascal but its plain text parser works well enough. If enough people request support for Object Pascal I'm sure they'd add it. I haven't found any other code duplication detection tool as capable as Simian.
I would also recommend bookmarking http://www.refactoring.com/catalog/ and http://www.industriallogic.com/xp/refactoring/catalog.html.
It also wouldn't hurt to get a copy of Clean Code: A Handbook of Agile Software Craftsmanship by Robert "Uncle Bob" Martin et al. It's easy to recognize bad code. It's much harder to know when you're writing good code.
A word of caution: Focus on refactoring the code you need to work on. It's easy to start down the rabbit hole and wind up spending months refactoring code that wasn't immediately relevant to the task at hand.
And save yourself some trouble. Don't try to "fix" code and refactor it at the same time. Refactor first, then fix bugs or add that new feature. Remember, refactoring is modifying code without changing its external behavior.
Resist the urge to attempt a complete rewrite. I learned the hard way that crappy code that meets the user's requirements is preferable to clean code that doesn't. Crappy code can always be incrementally improved until it's something to be proud of.
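The duplication detection mentioned above can be approximated with a few lines of script while you evaluate a proper tool. This is only a rough sketch, not how Simian works: it slides a fixed window over the non-blank lines of a file and reports windows that occur more than once. The function name, the window size and the normalization (strip whitespace, lower-case because Object Pascal is case-insensitive) are all my own assumptions.

```python
from collections import defaultdict


def find_duplicate_blocks(lines, window=4):
    """Return {block_text: [start_line, ...]} for every run of `window`
    consecutive non-blank lines that occurs more than once.

    Lines are compared after stripping whitespace and lower-casing.
    """
    # Keep (original 1-based line number, normalized text); skip blank lines.
    normalized = [
        (number, text.strip().lower())
        for number, text in enumerate(lines, start=1)
        if text.strip()
    ]
    seen = defaultdict(list)
    for i in range(len(normalized) - window + 1):
        chunk = normalized[i:i + window]
        key = "\n".join(text for _, text in chunk)
        seen[key].append(chunk[0][0])
    # Only blocks seen at two or more positions count as duplicates.
    return {block: starts for block, starts in seen.items() if len(starts) > 1}


def report(path, window=4):
    """Print the starting lines of each duplicated block in a source file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        duplicates = find_duplicate_blocks(handle.read().splitlines(), window)
    for block, starts in duplicates.items():
        print("duplicated at lines", starts)
        print(block)
        print("-" * 40)
```

Overlapping windows are reported separately, so a long copied routine shows up as several hits in a row. That is usually enough to find the first candidates for extraction.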

I think the best thing you can do is to write DUnit Tests for the interface. It forces you to understand the existing code, helps during debugging and it ensures that the interface acts the same after refactoring.
The Top 12 Reasons to Write Unit Tests apply perfectly in your case:
Tests Reduce Bugs in New Features.
Tests Reduce Bugs in Existing Features.
Tests Are Good Documentation.
Tests Reduce the Cost of Change.
Tests Improve Design.
Tests Allow Refactoring.
Tests Constrain Features.
Tests Defend Against Other Programmers.
Testing Is Fun.
Testing Forces You to Slow Down and Think.
Testing Makes Development Faster.
Tests Reduce Fear (Fear of change, Fear of breakage, Fear of updates).
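To make the "tests for the interface" idea concrete, here is a minimal sketch of a test that pins down current behavior before any refactoring. DUnit follows the same xUnit shape (a test case class with one method per check); I am using Python's unittest here only for illustration, and legacy_format_price is a made-up stand-in for whatever routine your unit actually exposes.

```python
import unittest


def legacy_format_price(amount):
    """Hypothetical stand-in for an undocumented legacy routine."""
    # Quirk found while reading the code: negative amounts become zero.
    if amount < 0:
        amount = 0
    return "EUR " + ("%.2f" % amount).replace(".", ",")


class LegacyFormatPriceTest(unittest.TestCase):
    """Records what the code does today, not what it ought to do."""

    def test_formats_with_decimal_comma(self):
        self.assertEqual(legacy_format_price(12.5), "EUR 12,50")

    def test_negative_amounts_are_clamped_to_zero(self):
        # Possibly a bug, but callers may rely on it, so pin it first.
        self.assertEqual(legacy_format_price(-3), "EUR 0,00")
```

Run it with python -m unittest before and after each refactoring step. If a test goes red, the external behavior changed.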

I've faced similar situations. My condolences to you!
In my opinion, the most important thing is that you actually understand all the code as it is today. Minds better than mine may be able to simply read the code and understand it. However, I can't.
After reading the code for a general overview, I usually repeatedly single step through it in the debugger until I begin to see some patterns of operation and recognize code that I've read before. Maybe this is obvious, but thought I'd mention it.
You might also think about creating a good test suite that runs on the current code.

Does the interface section contain a bunch of class definitions? If so, create a new unit for every class and move each class to its own unit. If you use Delphi 2007 or later, you can use the "refactor/Move" option to move those classes to the new (namespace) units. The next step is splitting the large classes into smaller classes. That's just a lot of manual work. Once your code is divided over multiple units, you can examine each unit, detect identical code and create base classes that serve as the parent for the classes that share similar functionality.

In addition to understanding the code, these tools may help with refactoring and reorganizing the project:
Model Maker is a powerful design, reverse-engineering and refactoring tool: http://www.modelmakertools.com/modelmaker/index.html
Model Maker Code Explorer is a powerful plugin for the Delphi IDE that helps with refactoring, code navigation etc: http://www.modelmakertools.com/code-explorer/index.html

I would use some sort of UML tool to generate some class diagrams and other diagrams to get an overview of the system, and start splitting up and commenting like @Workshop Alex said.

Use a tool like Doxygen to help you map the code.
Help on that is here

Start out small and eventually do a partial or full rewrite. Start creating base classes to accomplish pieces of the puzzle without changing the output. Rinse-repeat until you have a new, supportable codebase.
Once you hit those copy-n-paste routines, you'll have base classes to do the work and it'll really help accelerate the task.
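As a rough illustration of the base-class idea (shown in Python rather than Delphi, with invented class names), suppose two copy-pasted export routines differ only in their header and in how they format a row. The shared loop moves into a base class and each variant keeps only what is unique to it:

```python
class Exporter:
    """Base class holding the steps the copy-pasted routines had in common."""

    def export(self, rows):
        # This loop used to be duplicated in every routine.
        lines = [self.header()]
        lines.extend(self.format_row(row) for row in rows)
        return "\n".join(lines)

    def header(self):
        raise NotImplementedError

    def format_row(self, row):
        raise NotImplementedError


class CsvExporter(Exporter):
    def header(self):
        return "id,name"

    def format_row(self, row):
        return f"{row['id']},{row['name']}"


class TabExporter(Exporter):
    def header(self):
        return "id\tname"

    def format_row(self, row):
        return f"{row['id']}\t{row['name']}"
```

The output of each exporter stays exactly what the old routine produced, which is the point: the structure changes, the behavior does not.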

Related

Would it be safe to rely on DeHL for new projects?

I've been browsing the DeHL repository on GoogleCode, and it looks really good to me.
Many interesting features that make basic programming tasks easier; Some neat things that are in the DotNet FCL, but are missing from the Delphi RTL can be found in this library;
Coded in a modern way, making good use of new language features;
Each class, record type, member function and parameter is documented in such a way that it'll show in the code completion of the Delphi IDE;
Well-organized and clean code;
Plenty of unit tests;
Open source and Free;
Basically, it looks like this library should've been included with Delphi, as part of the RTL.
One major drawback: The project has been discontinued. :-(
Now my question is:
Would it be safe to rely on this library for future projects, and use it as a base framework to build upon?
Basically I'd like to hear from somebody who's actually used this library whether or not it's worth it to invest time in getting to know this library, and why.
IIRC the project was discontinued because it was an over-engineered first attempt and a lot of its features turned out really messy and bloated. You should look at Alex Ciobanu's second attempt, which is simply called Collections. It contains most of the interesting features from DeHL, but leaner.
Be careful, though. It still makes heavy use of generics, which will make your binary size really big if you use it a lot, because the compiler team hasn't implemented a way to collapse duplicate code yet.

How to start unit-test old and new code?

I admit that I have almost no experience with unit testing. I tried DUnit a while ago but gave up because there were so many dependencies between classes in my application.
It is a rather big (about 1.5 million source lines) Delphi application and we are a team that maintains it.
For now, the testing is done by one person who uses it before release and reports bugs. I have also set up some GUI tests in TestComplete 6, but they often fail because of changes in the application.
Bold for Delphi is used as the persistence framework against the database.
We all agree that unit testing is the way to go, and we plan to write a new application in DotNet with ECO as the persistence framework.
I just don't know where to start with unittesting...
Any good books, URL, best practice etc ?
Well, the challenge in unit testing is not the testing itself, but in writing testable code. If the code was written not thinking about testing, then you'll probably have a really hard time.
Anyway, if you can refactor, do refactor to make it testable. Don't mix object creation with logic whenever possible (I don't know delphi, but there might be some dependency injection framework to help in this).
This blog has lots of good insight about testing. Check this article for instance (my first suggestion was based on it).
As for a suggestion, try testing the leaf nodes of your code first, those classes that don't depend on others. They should be easier to test, as they don't require mocks.
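A small sketch of "don't mix object creation with logic", in Python since I can't write it in Delphi. All names here are invented. The calculator receives its collaborator through the constructor instead of creating it, so a test can hand in a trivial stand-in and no database is needed:

```python
class InvoiceCalculator:
    """Logic only: the rate provider is passed in, not created here."""

    def __init__(self, rate_provider):
        self._rate_provider = rate_provider

    def total(self, net_amount, country):
        rate = self._rate_provider.vat_rate(country)
        return round(net_amount * (1 + rate), 2)


class FixedRateProvider:
    """Hand-written test double replacing a database-backed provider."""

    def __init__(self, rate):
        self._rate = rate

    def vat_rate(self, country):
        return self._rate


def test_total_adds_vat():
    calculator = InvoiceCalculator(FixedRateProvider(0.25))
    assert calculator.total(100, "SE") == 125.0
```

In production code the same class would be constructed with the real provider. Whether you wire that up by hand or with a dependency injection framework is a separate choice.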
Writing unit tests for legacy code usually requires a lot of refactoring.
An excellent book that covers this is Michael Feathers' "Working Effectively with Legacy Code".
One additional suggestion: use a unit test coverage tool to indicate your progress in this work. I'm not sure about what the good coverage tools for Delphi code are though. I guess this would be a different question/topic.
Working Effectively with Legacy Code
One of the more popular approaches is to write the unit tests as you modify the code. All new code gets unit tests, and for any code you modify you first write its test, verify it, modify the code, re-verify it, and then write or fix any tests that you need because of your modifications.
One of the big advantages of having good unit test coverage is being able to verify that the changes you make don't inadvertently break something else. This approach allows you to do that, while focusing your efforts on your immediate needs.
The alternate approach I've employed is to develop my unit tests via Co-Ops :)
When you work with legacy code, mock objects are really useful for building unit tests.
Take a look at this question regarding Delphi and mocks: What is your favorite Delphi mocking library?
For .Net unittesting read this : "The Art of Unit Testing: with Examples in .NET"
About best practices:
What you said is right: sometimes it's difficult to write unit tests because of the dependencies between classes...
So write unit tests just after or just before ;-) the implementation of the classes. That way, if you have difficulty writing the tests, it may mean you have a design problem!
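To show what a mock buys you, here is a short sketch using Python's standard unittest.mock (Delphi mocking libraries differ in syntax, and OrderService with its collaborators is invented for the example). The mock replaces both dependencies and also lets the test check how they were called:

```python
from unittest.mock import Mock


class OrderService:
    """Hypothetical class with two dependencies that are hard to set up."""

    def __init__(self, repository, mailer):
        self._repository = repository
        self._mailer = mailer

    def place(self, order_id):
        order = self._repository.load(order_id)
        if order["total"] <= 0:
            raise ValueError("order has no billable items")
        self._mailer.send(order["email"], "Order confirmed")
        return order["total"]


def test_place_sends_confirmation():
    repository = Mock()
    repository.load.return_value = {"total": 40, "email": "a@example.com"}
    mailer = Mock()

    service = OrderService(repository, mailer)

    assert service.place(7) == 40
    # Verify the interactions, not just the return value.
    repository.load.assert_called_once_with(7)
    mailer.send.assert_called_once_with("a@example.com", "Order confirmed")
```

If such a test needs a long list of mocks before the class can even be constructed, that is the design problem mentioned above showing itself.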

How to approach learning a new SDK/API/library?

Let's say that you have to implement some functionality that is not trivial (it will take at least 1 work week). You have a SDK/API/library that contains (numerous) code samples demonstrating the usage of the part of the SDK for implementing that functionality.
How do you approach learning all the samples, extract the necessary information, techniques, etc. in order to use them to implement the 'real thing'. The key questions are:
Do you use some tool for diagramming of the control flow, the interactions between the functions from the SDK, and the sample itself? Which kind of diagrams do you find useful? (I was thinking that the UML sequence diagram can be quite useful together with the debugger in this case).
How do you keep the relevant and often interrelated information about SDK/API function calls, the general structure and calls order in the sample programs that have to be used as a reference - mind maps, some plain text notes, added comments in the samples code, some refactoring of the sample code to suit your personal coding style in order to make the learning easier?
Personally I use the prototyping approach. Keep development to manageable iterations. In the beginning, those iterations are really small. As part of this, don't be afraid to throw code away and start again (every time I say that, somewhere a project manager has a heart attack).
If your particular task can't easily or reasonably be divided into really small starting tasks then start with some substitute until you get going.
You want to keep it as simple as you can (the proverbial "Hello world") just to familiarize yourself with building, deploying, debugging, what error messages look like, the simple things that can and do go wrong in the beginning, etc.
I don't go as far as using a diagramming tool sorry (I barely see the point in that for my job).
As soon as you start trying things you'll get the hang of it, even if in the beginning you have no idea of what's going on and why what you're doing works (or doesn't).
I usually compile and modify the examples, making them fit something that I need to do myself. I tend to do this while using and annotating the corresponding documents. Being a bit old school, the tool I usually use for diagramming is a pencil, or for the really complex stuff two or more colored pens.
I am by no means a seasoned programmer. In fact, I am learning C++ and I've been studying the language primarily from books. When I try to stray from the books (which happens a lot because I want to start contributing to programs like LibreOffice), for example, I find myself being lost. Furthermore, when I'm using functionality of the library, my implementations are wrong because I don't really understand how the library was created and/or why things need to be done that way. When I look at sample source code, I see how something is done, but I don't understand why it's done that way which leads to poor design of my programs. And as a result, I'm constantly guessing at how to do something and dealing with errors as I encounter them. Very unproductive and frustrating.
Going back to my book comment, two books which I have read from cover to cover more than once are Ivor Horton's Beginning Visual C++ 2010 and Starting Out with C++: Early Objects (7th Edition). What I really loved about Ivor Horton's book is that it contained thorough explanations of why something needs to be done a certain way. For example, before any Windows programming began, lots of explanation about how Windows works was given first. Understanding how and why things work a certain way really helps in how I develop software.
So to contribute my two pennies towards answering your question. I think the best approach is to pick up well written books and sit down and begin learning about that library, API, SDK, whatever in a structured approach that offers real-world examples along with explanations as to how and why things are implemented as they are.
I don't know if I totally missed your question, but I don't think I did.
Cheers!
This was my first post on this site. Don't rip me too hard. (:

Best way to add tests to an existing Rails project?

I have a Rails project which I neglected to build tests for (for shame!) and the code base has gotten pretty large. A friend of mine said that RSpec was a pain to use unless you use it from the beginning. Is this true? What would make him say that?
So, considering the available test suites and the fact that the code base is already there, what would be my best course of action for getting this thing testable? Is it really that much different from doing it from the beginning?
This question came up recently on the RSpec mailing list, and the advice we generally gave was:
Don't bother trying to retro-fit specs to existing, working, code unless you're going to change it - it's exhausting and, unless the code needs to be changed, rather pointless.
Start writing specs for any changes you make from now on. Bug fixes are an especially good opportunity for this.
Try to train yourself into the discipline that before you touch the code, first of all write a failing example (=spec) to drive out the change.
You may find that the design of code which wasn't driven out by code examples or unit tests makes it awkward to write tests or specs for. This is perhaps what your friend was alluding to. You will almost certainly need to learn a few key refactoring techniques to break up dependencies so that you can exercise each class in isolation from your specs. Michael Feathers' excellent book, Working Effectively With Legacy Code has some great material to help you learn this delicate skill.
I'd also encourage you to use the built-in spec:rcov rake task to generate code coverage stats. It's extremely rewarding to watch these numbers go up as you start to get your codebase under test.
Maybe start with the models? They should be testable in isolation, which ought to make them the lowest-hanging fruit.
Then pick a model and start writing tests that say what it does. As you go along, think about other ways to test the code - are there edge cases that maybe you're not sure about? Write the tests and see how the model behaves. As you develop the tests, you may see areas in the code that aren't as clean and de-duplicated (DRY) as they might be. Now you have tests, you can refactor the code, since you know that you're not affecting behaviour. Try not to start improving design until you have tests in place - that way lies madness.
Once you have the models pinned down, move up.
That's one way. Alternatives might be starting with views or controllers, but you may find it easier to start with end-to-end transaction tests and work your way into smaller and smaller pieces as you go along.
The accepted answer is good advice - although not practical in some instances. I recently was faced with this problem on a few apps of mine because I NEEDED tests for existing code. There simply was no other way around it.
I started off doing all unit tests, then moved onto functionals.
Get in the habit of writing failing tests for any new code, or whenever you're going to change a part of the system. I've found this has helped me gain more knowledge of testing as I go.
Use rcov to measure your progress.
Good luck!
Writing tests for existing code may reveal bugs in your code. These tests will force you to look at the existing code so you can see what test you need to write in order to get it to pass and you may see some code that could possibly be written better, or is now useless.
Another tip is to write a test whenever you encounter a bug, so that it never recurs. This is called regression testing.
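A regression test usually starts from the bug report: reproduce the failing input in a test, watch it fail, then fix the code. The sketch below uses Python's unittest rather than RSpec, and both the function and the bug are invented for illustration.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function after the fix: percent is clamped to 0..100."""
    percent = max(0, min(percent, 100))
    return price * (100 - percent) / 100


class ApplyDiscountRegressionTest(unittest.TestCase):
    def test_discount_above_100_percent_does_not_go_negative(self):
        # Reproduces the (made-up) bug report: a 150% discount used to
        # produce a negative price before the clamp was added.
        self.assertEqual(apply_discount(100, 150), 0)

    def test_ordinary_discount_still_works(self):
        self.assertEqual(apply_discount(200, 25), 150)
```

The test stays in the suite permanently, so the same bug cannot slip back in unnoticed.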
Retrofitting specs is not inevitably a bad idea. You go from working code to working code with known properties which allows you to understand whether any future change breaks anything. At the moment if you need to make a change how can you know what it will affect?
What people mean when they say that it is hard to add tests/specs to existing code is that code which is hard to test is often highly coupled, which makes it hard to write low-level isolated tests.
One idea would be to start with full-stack tests using something like the RSpec story runner. You can then work from the 'outside in' isolating what you can in low-level isolated tests and gradually untangle the harder code bit by bit.
You can start writing "characterization tests". For this, you might want to try out the pretentious gem here:
It is still a work in progress though.

Best practices for refactoring classic ASP?

I've got to do some significant development in a large, old, spaghetti-ridden ASP system. I've been away from ASP for a long time, focusing my energies on Rails development.
One basic step I've taken is to refactor pages into subs and functions with meaningful names, so that at least it's easy to understand at the top of the file what's generally going on.
Is there a worthwhile MVC framework for ASP? Or a best practice at how to at least get business logic out of the views? (I remember doing a lot of includes back in the day -- is that still the way to do it?)
I'd love to get some unit testing going for business logic too, but maybe I'm asking too much?
Update:
There are over 200 ASP scripts in the project, some thousands of lines long ;) UGH!
We may opt for the "big rewrite" but until then, when I'm changing a page, I want to spend a little extra time cleaning up the spaghetti.
Assumptions
The documentation for the Classic ASP system is rather light.
Management is not looking for a rewrite.
Since you have been doing ruby on rails, your (VB/C#) ASP.NET is passable at best.
My experience
I too inherited a classic ASP system that was slapped together willy-nilly by ex excel-vba types. There was a lot of this stuff <font size=3>crap</font> (and sometimes missing closing tags; Argggh!). Over the course of 2.5 years I added a security system, a common library, CSS+XHTML and was able to coerce the thing to validate xhtml1.1 (sans proper mime type, unfortunately) and built a fairly robust and ajaxy reporting system that's being used daily by 80 users.
I used jEdit, with cTags (as mentioned by jamting above), and a bunch of other plugins.
My Advice
Try to create a master include file from which to import all the stuff that's commonly used. Stuff like login/logout, database access, web services, javascript libs, etc.
Do use classes. They are ultra-primitive (no inheritance) but as jamting said, they can be convenient.
Indent the scripts properly.
Comment
Write an external architecture document. I personally use LyX, because it's brain-dead to produce a nicely formatted pdf, but you can use whatever you like. If you use a wiki, get the graphviz add-in installed and use it. It's super easy to make quick diagrams that can be easily modified.
Since I have no idea how substantial the enhancements need to be, I suggest having a good high-level to mid-level architecture document will be quite useful in planning the enhancements.
On the business logic unit tests, the only thing I found that works is setting up an XML-RPC listener in ASP that imports the main library and exposes the functions (not subroutines, though) in any of the main library's sub-includes, and then building, separately, a unit test system in a language with better testing support that calls the ASP functions through XML-RPC. I use Python, but I think Ruby should do the trick. (Does that make sense?) The cool thing is that the person writing the unit-test part of the software does not even need to look at the ASP code, as long as they have decent descriptions of the functions to call, so it can be someone besides you.
There is a project called aspunit at sourceforge but the last release was in 2004 and it's marked as inactive. Never used it but it's pure vbscript. A cursory look at the code tells me it looks like the authors knew what they were doing.
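For the XML-RPC approach described above, the Python side can look roughly like this. It is a sketch under assumptions: add_tax is an invented business function, and because there is no ASP server to talk to in an example, a local SimpleXMLRPCServer stands in for the ASP listener. Against the real system you would drop the stand-in and point ServerProxy at the URL of your listener page.

```python
import threading
import unittest
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add_tax(net, rate):
    """Hypothetical business function; in the real setup this lives in ASP."""
    return round(net * (1 + rate), 2)


def start_stand_in_listener():
    """Start a local XML-RPC server playing the role of the ASP listener."""
    # Port 0 asks the OS for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add_tax)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


class AddTaxOverXmlRpcTest(unittest.TestCase):
    def setUp(self):
        self.server = start_stand_in_listener()
        host, port = self.server.server_address
        self.proxy = ServerProxy(f"http://{host}:{port}/")

    def tearDown(self):
        self.server.shutdown()
        self.server.server_close()

    def test_add_tax(self):
        # The call travels over HTTP as XML-RPC, as it would to ASP.
        self.assertEqual(self.proxy.add_tax(100, 0.25), 125.0)
```

The test code only knows function names, arguments and expected results, which is why the tester never has to open the ASP source.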
Finally, if you need help, I have some availability to do contract telecommuting work (maybe 8 hours/week max). Follow the link trail for contact info.
Good luck! HTH.
Since a complete rewrite of a working system can be very dangerous, I can only give you a small tip: set up Exuberant Ctags on your project. This way you can easily jump to the definition of a function or sub, which I think helps a lot.
On separating logic from "views": VBScript supports some kind of OO with classes. I tend to write classes which do the logic, and include them on the ASP page which acts as a "view". Then I hook the view together with the class like Username: <%= MyAccount.UserName %>. The MyAccount class can also have methods like MyAccount.Login() and so on.
Kind of primitive, but at least you can encapsulate some code and hide it from the HTML.
My advice would be to carry on refactoring. Classic ASP supports classes, so you should be able to move everything but the display code into included ASP files that contain only classes.
See this article for details of moving from old-fashioned ASP towards ASP.NET
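The same split, sketched in Python instead of VBScript just to show the shape (the Account class, its login rule and the template are all invented): the class owns the logic, and the "view" only reads values from it.

```python
from string import Template


class Account:
    """Logic lives here; nothing in this class knows about HTML."""

    def __init__(self, user_name, password):
        self.user_name = user_name
        self._password = password
        self.logged_in = False

    def login(self, password):
        # Placeholder rule for the example; real code would check a store.
        self.logged_in = password == self._password
        return self.logged_in


# The "view": markup with holes, and no decisions of its own.
PAGE = Template("Username: $user_name")


def render(account):
    return PAGE.substitute(user_name=account.user_name)
```

Because the class does not touch the page, it can be exercised from a test without rendering anything.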
Refactoring ASP
Regarding a future direction, I wouldn't aim for ASP.NET Web Forms; instead I'd go for Microsoft's new MVC framework (an add-on to ASP.NET). It will be much simpler to migrate to this from classic ASP.
I use ASPUnit for unit testing some of our classic ASP and find it to be helpful. It may be old, but so is ASP. It's simple, but it does work and you can customize or extend it if necessary.
I've also found Working Effectively with Legacy Code by Michael Feathers to be a helpful guide for finding ways to get some of that old code under test.
Include files can help as long as you keep it simple. At one point I tried creating an include for each class and that didn't work out too well. I like having a couple main includes with common business logic, and for complicated pages sometimes an include with logic for each of those pages. I suppose you could do MVC with a similar setup.
Is there any chance you could move from ASP to ASP.Net? Or are you looking at keeping it in classic ASP, but just cleaning it up? If at all possible, I would recommend moving as much as possible to .Net. It looks like you may be rewriting/reorganizing a lot of code anyway, so moving to .Net may not be a lot of extra effort.
Presumably someone else wrote most or all of the system that you're now maintaining. Look for the usual bad habits (repeated code, variables that are too widely scoped, nested if statements, etc.), and refactor as you would any other language. Keep an eye out for recurring things in the same file or different files and abstract them into functions.
If the code was written/maintained by various people, there might be some issues with inconsistent coding style. I find that bringing the code back into line makes it easier to see things that can be refactored.
"Thousands of lines long" makes me suspicious that there may also be situations where loosely-related things are being displayed on the same page. There again, you want to abstract them into separate subroutines.
Eventually you want to be writing objects to help encapsulate stuff like database connectivity, but it will be a while before you get there.
This is very old, but couldn't resist adding my two cents. If you must rewrite, and must continue to use classic ASP:
use JScript! It is much more powerful, you get inheritance, and there are some good side benefits, like using the same methods for server-side validation as you use for client-side
you can absolutely do MVC - I wrote an MVC framework, and it was not that many lines of code
you can also generate your model classes automatically with a bit of work. I have some code for this that worked quite well
make sure you are doing parameterized queries, and always returning disconnected recordsets
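On the parameterized queries point, here is the idea in miniature. In classic ASP you would do this through ADO command objects with parameters; the sketch below uses Python's built-in sqlite3 module and an invented users table only because it runs anywhere. The value travels separately from the SQL text, so it can never be interpreted as SQL.

```python
import sqlite3


def find_user(connection, username):
    """Look up a user with a parameterized query."""
    # The "?" placeholder keeps the value out of the SQL string itself.
    cursor = connection.execute(
        "SELECT id, username FROM users WHERE username = ?",
        (username,),
    )
    # fetchall() copies the rows into a plain list, so the caller holds
    # data rather than a live cursor.
    return cursor.fetchall()


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)"
)
connection.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
```

Passing a classic injection string such as ' OR '1'='1 as the username simply matches no row, because it is compared as a literal value.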
Software development project management practice indicates that software like this needs to be retired.
I know how hard it is to do the right thing, even more so when the responsible manager knows nothing and is scared of everything other than the worst way possible.
But still, it's necessary to start working on the development of new software. It's simply impossible to maintain this one forever, and the longer they wait to retire it, the worse it gets.
If you don't have proper specification/requirements documentation (I think no ASP software in the world does, given the inexperience of those coders), you'll need both a group of users who know the software's features and a manager to be responsible for validating the requirements. You'll need to review every feature and document its requirements.
During that process you'll go learning more about the software and its business. Once you have enough info, you can start developing a new one.

Resources