Is there a standard on how long you retain source code? - save

I have been developing software for decades and never deleted any source code on purpose. I am in a situation where I was asked to find a standard, best practice, or specification document stating that source code should always be retained or a retention time when it should be deleted. Is there anything like this?

Related

Decompiling Delphi Programs [duplicate]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
Why decompiling a delphi exe, is so easy, compared to others executables built with other programming languages/compilers?
There are a few things that help with reversing delphi programs:
You get the full form data including the name of event handler methods
All members with published visibility have metadata used with RTTI
The compiler is pretty bad at optimizing. It does no whole program optimization and the assembly is usually a straight forward translation of the original source with only minor optimizations. (At least it was in the versions I used, might have improved since then)
All classes, even those compiled with RTTI off have some level of metadata available. In particular it's possible to get the name and inheritance structure of classes. And for any instance of a class you happen to see in the debugger you can get its VMT and thus its class name.
Delphi uses textfiles describing the content of your form and hooks up event handlers by name. This approach obviously needs enough metadata to deserialize that textual representation of a from and hook up the eventhandlers by name.
An alternative some other GUI toolkits use is auto-generating code that initializes the form and hooks up the event handler with code. Since this code directly uses pointers to the eventhandlers and directly assigns to properties/calls setters it doesn't need any metadata. Which has the side-effect that reversing becomes a bit harder.
It shouldn't be too hard to create a program that transforms a dfm file into a series of hardcoded instructions that creates the form instead. So a tool like DeDe won't work that well anymore. But that doesn't gain you much in practice.
But figuring out which evenhandler corresponds to which control/event is still rather easy. Especially since stuff like FLIRT identifies most library functions. So you just need to breakpoint the one you're interested in and then step into the user code.
The statement you make is false. Delphi is not particularly more easy to decompile than code produced by other mainstream compilers.
For .net languages there is Reflector.
C++ is covered in this Stack Overflow question.
Python/Perl/Ruby etc. are interpreted.
If you were able to prove that the results of decompiling a Delphi executable were of significantly higher quality than in other widely used languages then your question would carry more weight.
Story from the trenches: Decompiling a tiny Delphi DLL
I've been through a Delphi decompiling session myself. It was one of those fake-sounding "I lost my sources" thing, I really did lose the sources for a tiny Firebird UDF library. Now I do no better, I didn't jump right into decompiling because the library was so small and I knew a rewrite would be much faster.
This DLL exports a function that looks like this:
function udf_do_some_math(Number1, Number2:Currency): Currency;
After doing the sane thing and rewriting the function and doing some regression tests I discovered some obscure corner-cases where the new function's result wasn't the same as the old function's result! The trouble was, the new function's result was the correct result, the old DLL contained a BUG and I had to reproduce the BUG - with this function consistency is more important then accuracy.
Again, did the sane thing and tried to "guess" at the BUG. I knew it was a rounding issue but simply couldn't figure out what it was. Finally I decided to give decompilers I try. After all this was a small library, the entry-point was straight-forward and I didn't really need re-compilable code, nor 100% decompilation: I only needed enough to figure out the old BUG so I can reproduce it!
Decompiling failed! I tried lots of different decompilers, including a couple of "commercial" ones. Most produced what on the surface looked like good data, but not enough to figure out the old bug. The most promising one, the one with version specific knowledge of the VCL and RTL gave the worst failure: sure, it figured out the RTL calls, gave them names, but failed to locate the exported function! The one function I was interested in wasn't shown int the list of entry points, and it should have been straight forward since it's an exported function.
This decompiling attempt should have been easy because:
The code was fairly simple and not a lot of it.
It was a DLL with an exported function, none of the complexity you'd expect from an event-driven exe.
I wasn't interested in re-compilable code, I simply wanted to find an old bug so I can reproduce it.
I didn't ask for Pascal code, assembler would've been good enough.
I knew precisely what the code was doing and how it was doing it. It wasn't cryptic 3rd party code.
My solution
After decompilers failed me I turned to my own trusty Delphi IDE for debugging. I wrote a small Delphi program that directly imports the function from the DLL, created a fake Firbird memory manager DLL so my DLL can load, called my old function with the parameters I knew would give bad results, steped into the code using the debugger and closely watched the FPU registers. After a few failed attempts I finally noticed a value was popped from the FPU stack as integer where it shouldn't have been Integer so I had my BUG: I mistakenly defined an Integer local variable where I should have used Currency. Armed with that knowledge I was able to reproduce the bug.
Only thing that is easier in Delphi is retrieving VCLs.
After using decompilers like DeDe you will get application user interface but without any logic.
So if you want to retrieve only forms and buttons - Delphi is easier than other compilers, but if you want to know what is going on after clicking on the button you'll need to use ollydbg or other (debugger/disassembler) as for other languages that creates executables.
There are pros and cons. I am not sure what angle your referring to as being easier. There is also a huge difference in a 1 form simple application, versus a very in-depth application that has many forms and tons of classes and functions. It's like Notepad versus Office 2013 (given they were coded in delphi, just an example comparing complexity not language).
In a small app, having the extra information that Delphi apps "usually" contain can make it a breeze. However, in a large application it may "help", but you have a million calls to dig through. They may help you get in the near vicinity, but calls inside of calls inside of calls, then multiple returns used as jumps... makes you dizzy. Then if the app "was" packed or protected, some things can still be a garbled mess. While it may work programming wise, reading it can be a lot harder. I was in one the other day, where all of the strings were encrypted, so "referenced text strings" were no help, and the encryption was not a simple md5 or base64, it was some custom algorithm. Maybe an MD5 with a salt, then base64 encoded? I never could get to the exact method on the strings. I knew what some of them were supposed to be, but couldn't reproduce the method, even though it looked like it was base64, it was the base64 of the string already encrypted some how... I dont rely on text strings, but in a large large app, every little bit helps.
Of course, my interpretation of this question, was looking at a Delphi exe in OllyDbg. I could be off base on where you guys were going with this topic, but I feel in regards to Olly and reversing, I am on point (if that was what you were talking about) lol.

When and how should I obfuscate my Delphi code?

What should I know about code obfuscation in Delphi?
Should I or shouldn't I do it?
How it is done and is there any good tools (commercial/free) to automate it?
Why would you need to?
As a whole Delphi does not decompile back, unlike .net, so, while decompilation is always a bit of a risk, Ive never found a decompiler that actually did it to a useful way, lots of areas got left as assembler and so on.
If people want to rework your work, they can, no matter what, obfuscation or not, heck, some coders write almost naturally obfuscated code (having worked with a few)
My vote therefore, is shouldnt bother. Unless someone can show me a decompiler for delphi that really works, and produces full sets of compilable, and all delphi where it was originally, I wouldnt worry one drop.
Pythia is a program that can obfuscate binaries (not the source) created with Delphi or C++ Builder. Source code for Pythia is here.
Before:
After:
There's no point obfuscating since the compiler already does that for you.
There is no way to re-create the source code from the binary.
And components can be distributed in a useful way without having to distribute the source code.
So there usually is no (technical) reason for distributing the source code.
You could do other things to reduce an attacker's ability to disable your software activation system, for example, but in a native-compiled system like Delphi, you can't recreate source code from the binaries. Another answer (the accepted one at the moment) says exactly this, and someone else pointed out a helpful tool to obfuscate the RTTI information that people might use to gain some insight into the internals of your software.
You could investigate the following hardening techniques to block modification of your system, if that's what you really want:
Self-modifying code, with gating logic that divides critical functions of your code such as software activation, into various levels of inter-operable checksums, and code damage and repair.
Debug detection. You can detect debuggers being used on your software and attempt to block the software from working in this case.
Encrypt the PE binary data on disk, and decrypt it either at load time, or just in time before it runs, so that critical assembler code can not be so easily reverse engineered back to assembly language.
As others have stated, hackers working on your software do not need to restore the original sources to modify it. They will attempt, if they try it at all, to modify your binaries directly, and will use a detailed and expansive knowledge of assembler language to circumvent things you may wish them not to.
You can use free JCF (Jedi Code Formatter) to obfuscate your source code. However, pascal syntax does not allow strong obfuscation and JCF even doesn't do it's best (well, it's a code formatting tool, not obfuscator!)

Delphi decompiling [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
Why decompiling a delphi exe, is so easy, compared to others executables built with other programming languages/compilers?
There are a few things that help with reversing delphi programs:
You get the full form data including the name of event handler methods
All members with published visibility have metadata used with RTTI
The compiler is pretty bad at optimizing. It does no whole program optimization and the assembly is usually a straight forward translation of the original source with only minor optimizations. (At least it was in the versions I used, might have improved since then)
All classes, even those compiled with RTTI off have some level of metadata available. In particular it's possible to get the name and inheritance structure of classes. And for any instance of a class you happen to see in the debugger you can get its VMT and thus its class name.
Delphi uses textfiles describing the content of your form and hooks up event handlers by name. This approach obviously needs enough metadata to deserialize that textual representation of a from and hook up the eventhandlers by name.
An alternative some other GUI toolkits use is auto-generating code that initializes the form and hooks up the event handler with code. Since this code directly uses pointers to the eventhandlers and directly assigns to properties/calls setters it doesn't need any metadata. Which has the side-effect that reversing becomes a bit harder.
It shouldn't be too hard to create a program that transforms a dfm file into a series of hardcoded instructions that creates the form instead. So a tool like DeDe won't work that well anymore. But that doesn't gain you much in practice.
But figuring out which evenhandler corresponds to which control/event is still rather easy. Especially since stuff like FLIRT identifies most library functions. So you just need to breakpoint the one you're interested in and then step into the user code.
The statement you make is false. Delphi is not particularly more easy to decompile than code produced by other mainstream compilers.
For .net languages there is Reflector.
C++ is covered in this Stack Overflow question.
Python/Perl/Ruby etc. are interpreted.
If you were able to prove that the results of decompiling a Delphi executable were of significantly higher quality than in other widely used languages then your question would carry more weight.
Story from the trenches: Decompiling a tiny Delphi DLL
I've been through a Delphi decompiling session myself. It was one of those fake-sounding "I lost my sources" thing, I really did lose the sources for a tiny Firebird UDF library. Now I do no better, I didn't jump right into decompiling because the library was so small and I knew a rewrite would be much faster.
This DLL exports a function that looks like this:
function udf_do_some_math(Number1, Number2:Currency): Currency;
After doing the sane thing and rewriting the function and doing some regression tests I discovered some obscure corner-cases where the new function's result wasn't the same as the old function's result! The trouble was, the new function's result was the correct result, the old DLL contained a BUG and I had to reproduce the BUG - with this function consistency is more important then accuracy.
Again, did the sane thing and tried to "guess" at the BUG. I knew it was a rounding issue but simply couldn't figure out what it was. Finally I decided to give decompilers I try. After all this was a small library, the entry-point was straight-forward and I didn't really need re-compilable code, nor 100% decompilation: I only needed enough to figure out the old BUG so I can reproduce it!
Decompiling failed! I tried lots of different decompilers, including a couple of "commercial" ones. Most produced what on the surface looked like good data, but not enough to figure out the old bug. The most promising one, the one with version specific knowledge of the VCL and RTL gave the worst failure: sure, it figured out the RTL calls, gave them names, but failed to locate the exported function! The one function I was interested in wasn't shown int the list of entry points, and it should have been straight forward since it's an exported function.
This decompiling attempt should have been easy because:
The code was fairly simple and not a lot of it.
It was a DLL with an exported function, none of the complexity you'd expect from an event-driven exe.
I wasn't interested in re-compilable code, I simply wanted to find an old bug so I can reproduce it.
I didn't ask for Pascal code, assembler would've been good enough.
I knew precisely what the code was doing and how it was doing it. It wasn't cryptic 3rd party code.
My solution
After decompilers failed me I turned to my own trusty Delphi IDE for debugging. I wrote a small Delphi program that directly imports the function from the DLL, created a fake Firbird memory manager DLL so my DLL can load, called my old function with the parameters I knew would give bad results, steped into the code using the debugger and closely watched the FPU registers. After a few failed attempts I finally noticed a value was popped from the FPU stack as integer where it shouldn't have been Integer so I had my BUG: I mistakenly defined an Integer local variable where I should have used Currency. Armed with that knowledge I was able to reproduce the bug.
Only thing that is easier in Delphi is retrieving VCLs.
After using decompilers like DeDe you will get application user interface but without any logic.
So if you want to retrieve only forms and buttons - Delphi is easier than other compilers, but if you want to know what is going on after clicking on the button you'll need to use ollydbg or other (debugger/disassembler) as for other languages that creates executables.
There are pros and cons. I am not sure what angle your referring to as being easier. There is also a huge difference in a 1 form simple application, versus a very in-depth application that has many forms and tons of classes and functions. It's like Notepad versus Office 2013 (given they were coded in delphi, just an example comparing complexity not language).
In a small app, having the extra information that Delphi apps "usually" contain can make it a breeze. However, in a large application it may "help", but you have a million calls to dig through. They may help you get in the near vicinity, but calls inside of calls inside of calls, then multiple returns used as jumps... makes you dizzy. Then if the app "was" packed or protected, some things can still be a garbled mess. While it may work programming wise, reading it can be a lot harder. I was in one the other day, where all of the strings were encrypted, so "referenced text strings" were no help, and the encryption was not a simple md5 or base64, it was some custom algorithm. Maybe an MD5 with a salt, then base64 encoded? I never could get to the exact method on the strings. I knew what some of them were supposed to be, but couldn't reproduce the method, even though it looked like it was base64, it was the base64 of the string already encrypted some how... I dont rely on text strings, but in a large large app, every little bit helps.
Of course, my interpretation of this question, was looking at a Delphi exe in OllyDbg. I could be off base on where you guys were going with this topic, but I feel in regards to Olly and reversing, I am on point (if that was what you were talking about) lol.

Performance Improvement

I am thinking about improving the performance of IO, I don't completely understand the IO structure and I would like some help from the developers here.
I think if all fields are read when the first command to get ID and class is executed and stored in Object store and then RetrieveObject gets objects from ObjectStore it might give some performance improvement. Does this make sense?
Regards
Sandeep
It would be a question to ask to http://www.instantobjects.org/#newsgroups or to IO authors directly.
You have some structure diagrams at http://www.instantobjects.org/diagrams.html
The included IOHelp.chm file has a lot of useful information.
IO has no official release since 2006. But the SVN version at sourceforge has support for Delphi 2010. I suggest you get this updated version first.
About performance Improvement, did you use the StartTransaction/CommitTransaction methods of your TInstantConnector instance? It could have a big performance improvement in writing.
About reading, I didn't find any caching mechanism of data in the source code (after a quick review - but I could have missed something). But there is a statement cache included, which is not enabled by default. See the Statement_Cache.txt file in the Docs
You could take a look at other ORM frameworks for Delphi, you've a list at ORM for DELPHI win32
I should of course recommend ours: http://synopse.info/forum/viewforum.php?id=2 which has caching of both statements and data implement. :)

How to recognize malicious source code? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
BE AWARE! Creating spyware, computer viruses and similar nasties can be illegal where you live and is considered extremely unethical by almost everyone. Still, I need to ask this to raise awareness about how easy it is to create one. I am asking this after the W32/Induc-A was introduced to this world by someone who came up with a nasty way to spread one. So I want to know how a virus can be created so I will be able to recognise them in the future!
Recently a new virus was discovered which spreads itself by replacing the developers' copies of library code. Actually, through the source code of Delphi 4 through 7. What happened is that there's a virus in the wild which searches the computer for a file called SYSCONST.PAS, to which it will add itself as source code. This file happens to be a source file for the runtime libraries of Delphi. (This runtime source code is available for Delphi developers.) As a result, after being infected a programmer would create lots of new versions of this virus without even knowing it. Since virus scanners sometimes generate false positives many developers might thus decide to ignore the warnings of the scanner and maybe they'll even disable their scanner while building their project. To make it worse, their project might even trigger the scanners of their customers so it's likely that those programmers won't check their source code but will just try to fool the scanner somehow. That is, if a virus scanner is even able to recognise the virus, which isn't very likely. Thus, we software developers might be creating viruses without realizing what we're doing!
So, how to create a virus? Simple: get your source code infected by a virus and you're done!
Okay, so the source code of Delphi 4 through 7 might be infected. All Delphi developers, please check your source files! The case is just a proof-of-concept and apparently it can be very successful. Besides, most virus scanners won't check source code but just focus on executables. This virus could stay undetected for quite a while.
This virus also was successful because it misused source code. Delphi is a commercial project and the source code is available. But who is sure that these hackers won't be attacking open-source projects in similar ways? There are lots of open-source projects out there and who is going to check them all making sure they're all behaving in a decent way? And if someone is checking the code, will he be able to recognise if something is malicious code?
So, to make sure we can recognize malicious source code, I have to ask: How do I create a virus? How do I recognise the code that will create a virus? What is it that most malware will want to do?
There is a bit of discussion about the Delphi runtime source code, about this code being open-source or not. Borland uses a dual-license for their source code from the moment when they started to support Linux with Kylix. As a result, the source code has a "GPL" symbol declared which indicates if the libraries are compiled as GPL code or not. As GPL, the source code would be open-source. This also happens to be the source version that was attacked by the virus. Anyway, to avoid discussions here, I've asked this question here so we can focus more on the virus problem and less on Delphi. Basically, we're talking about a virus that attacks source code. Technically, all source code could be at risk but open source code is a likely candidate since hackers know it's structure and can target those files that are rarely modified, thus rarely checked. (And if they can hack their way into a CVS system, they could even erase the traces of their modifications, thus no one might notice the modiifications!)
While this does not really answer your question, I think a really interesting paper to read is Reflections on Trusting Trust by Ken Thompson. It raises a fascinating point that even if your source code is free of defects (viruses, trojans, etc.), you might still be producing defective executables if your compiler is defective. And even if you rebuild the compiler from clean source code, you can still have the same problem.
Unless you're building your computer from the ground up with your own microchips, hand-assembling your own BIOS, writing your own operating system, compiler, and software, you have to draw the line somewhere and trust that the hardware and software upon which you're building your systems are correct.
You could check for the Evil Bit on incoming packets... http://en.wikipedia.org/wiki/Evil_bit
If you want to recognize malware, you must know how it works. This means researching malware and aquirering the skill to produce malware.
search for 29A - they wrote papers on virus
read about rootkits (there are even books on it)
read about reverse engineering
read source code of malware - there's plenty of it in the web.
learn assembler
learn about your OS
reverse the os-kernel
get clam-av, check the source
I won't provide links here. They are easily found though.
If you really want to learn, and are willing to put in the time, your time is probably better spent on google to find then participate in a greyhat community. this topic is highly complex.
if your question is as simple as "what's an easy way to recognize a virus from its source code", well, it probably won't be easy, because there's infinite ways to go about it.
You ask "What is it that most malware will want to do?".
An excellent source for this sort of information is The Hacker Quarterly, which is so mainstream, you may find it at your local bookstore, or you can subscribe online to get it mailed to you.
It was started to help hackers and phreakers share information. It is still very popular with hackers today and is considered by many to be controversial in nature.
Contents of the Current Issue include:
Not The Enemy
Regaining Privacy in a Digital World
The Security-Conscious Uncle
Why the "No-Fly List" is a Fraud
TELECOM INFORMER
Finding Information in the Library of Congress
Hacking the DI-524 Interface
Simple How-to on Wireless and Windows Cracking
If You Can't Stand the Heat, Hack the Computers!
Security: Truth Versus Fiction
Hacking the Beamz
HACKER PERSPECTIVE: Jason Scott
iTunes Stored Credit Card Vulnerability
Zipcar's Information Infrastructure
The How and Why of Hacking the U.N.
Listen to Radio Hackers!
HACKER SPACES - EUROPE
Abusing Metadata
Verizon FIOS Wireless Insecurities
TRANSMISSIONS
Using Network Recon to Solve a Problem
Suing Telemarketers for Fun and Profit
HACKER HAPPENINGS
Plus LETTERS and MARKETPLACE
There is also an excellent series of articles on Hacking at Wikipedia and on Computer Viruses.
... And yes, it is important for programmers to understand how hacking and code breaking works, so they can do the best they can to circumvent it in their programs.
There is no difference between malicious code and an unintentional security bug.
You might as well be asking "How can I write a useful program that has no bugs and is impossible to exploit".
As we all learn in CS its impossible to even write debuggers to catch infinite loops let alone intelligent malevolence.
My advice for security conscious applications is an ex(p|t)ensive code review and use of commercially available static analysis software.

Resources