The reason why I am asking is that I'm learning F# and would like to take part in TopCoder competitions. However, F# is not among the languages supported there, while C# is (to be honest, this is the case for almost all online coding competitions, except Google Code Jam and Facebook Hacker Cup).
The possible workarounds I can think of at this moment are
1) find a translator that can translate F# source code directly into C#
2) compile F# code into a .NET executable first, then disassemble it back to C# code
The minimum requirement is that the generated C# must compile into a runnable .NET executable, preferably with as few external dependencies as possible.
The first approach seems unlikely; a quick Google search turned up nothing relevant.
Approach two looks more promising, since .NET disassemblers exist.
I tried the most popular one, Reflector from Red Gate. While it disassembles C# executables perfectly, it appears to have problems with executables compiled from F#: it happily disassembled them, but the resulting C# code contains special characters, such as a leading $ sign in a class name, and other oddities, so it cannot be compiled. I was using Visual Studio 2010 Professional and the latest Reflector beta version (which is free).
Am I missing anything here? Is it possible?
Update:
It looks like this is still impossible. For now, I'll use C# instead.
As others already pointed out in the comments - if there is some way to do that, there will be quite a few nasty cases where it probably won't quite work and it will be very fragile...
One way to deal with the problem (for you) is to just write the solution in F# and then rewrite it to C#. This may sound stupid, but there are some advantages:
In F#, you can easily prototype the solution, so you'll be able to find the right solution faster.
When translating code to C#, you'll probably find yourself using features like lambda expressions more often, so it may even improve your C# skills...
If you rely on .NET libraries, then this part of code will be easy to translate.
Of course, the best thing would be to convince the organizers that they should support F# (which probably wouldn't be too difficult if they allow C# already), but I understand that this may be a challenge.
Related
Two obfuscation-related questions:
1) Is there any tool that can disassemble F# back to its source form, or something close to it, from the MSIL target form? This is not an attempt at security through obscurity but I want to protect some source code from "theft".
2) I looked briefly at some F# compiler output and in general it appears pretty gibberish compared to what you get if you disassemble C# compiled code, presumably because C# is closer to the MSIL intermediate representation. The only partly mangled code I've seen from the C# compiler is iterators (and presumably async as of C# 5.0).
So far my impression is that the F# compiled code is reasonably "obfuscated" but is that true? (I realize this is a somewhat subjective question.)
I haven't heard of anything like this; however, I think it's quite likely such a tool will appear in the relatively-near future.
Assemblies produced by the F# compiler (i.e., MSIL and related metadata) aren't obfuscated in any way. However, some of the code it produces is very different from the code produced by the C# or VB.NET compilers, so it's not going to be as easy to reverse-engineer (simply because the tools to do so aren't available). Of course, as @Craig Stuntz said, this doesn't afford much protection against an experienced, motivated attacker.
If you're really paranoid, you might consider using an obfuscation tool on your compiled assemblies before shipping them. I've been using SmartAssembly with F# since late 2010, so I know that one works with F#; if you go with another tool, make sure you test it against some reasonably complicated F# assemblies before buying it -- at the time I was looking for an obfuscator, many of them didn't work correctly (or at all) with F# assemblies.
I wrote up some notes a while back about obfuscating F# assemblies, if you want to read more: Any experience using .NET obfuscators on F# assemblies?
F# is a .NET language, so it can be decompiled. You can have a look at RedGate's Reflector if you want to spend money, or at 0xd4d's dnSpy (and yes, it's the same developer as the well-known deobfuscator De4Dot). Decompiled code is really close to the original source: the logic is still the same and you can copy/paste the code.
If you want to protect an F# application you may consider using an obfuscator. Currently almost all of them are handled by De4Dot, so it's hard to choose wisely. That said, .NETGuard is really strong: it can handle F# applications, it can produce native output, it has some strong constant protection, and De4Dot cannot handle it.
I've been browsing the DeHL repository on GoogleCode, and it looks really good to me.
Many interesting features that make basic programming tasks easier; Some neat things that are in the DotNet FCL, but are missing from the Delphi RTL can be found in this library;
Coded in a modern way, making good use of new language features;
Each class, record type, member function and parameter is documented in such a way that it'll show in the code completion of the Delphi IDE;
Well-organized and clean code;
Plenty of unit tests;
Open source and Free;
Basically, it looks like this library should've been included with Delphi, as part of the RTL.
One major drawback: The project has been discontinued. :-(
Now my question is:
Would it be safe to rely on this library for future projects, and use it as a base framework to build upon?
Basically I'd like to hear from somebody who's actually used this library whether or not it's worth it to invest time in getting to know this library, and why.
IIRC the project was discontinued because it was an over-engineered first attempt and a lot of its features turned out really messy and bloated. You should look at Alex Ciobanu's second attempt, which is simply called Collections. It contains most of the interesting features from DeHL, but leaner.
Be careful, though. It still makes heavy use of generics, which will make your binary size really big if you use it a lot, because the compiler team hasn't implemented a way to collapse duplicate code yet.
I have inherited a legacy app written in C++ (VS2003) MFC that was not updated in years.
I have limited experience in C++, being mainly a Delphi developer. All other apps of the company are written in Delphi.
Going forward, I see a few choices:
1) Keep the app as is and become a C++ MFC developer. But I don't like the idea of using an outdated technology (MFC) for years to come, trying to keep up with new Windows versions and UI standards. It somehow feels like making several steps backwards and I don't think this is the best way to go (?)
2) Convert the app to any modern UI technology offered with C++ and become a C++ developer, but at least using modern technology. Might be a lot of work, not sure.
3) Rebuild the app from scratch in Delphi, where I will be a lot more productive thinking about the future. It's a lot more work right now, but it might pay off later.
Obviously, I personally prefer 3) but I would like to know from your experience which way is the best for the product.
It's a long term decision to make and I will have to stick with it, therefore I don't want to rush into one direction.
(I have intentionally not tagged this question as C++, trying to get answers from Delphi developers in similar situations)
EDIT:
Thanks to all for your answers.
After learning that it is possible to switch to C++ Builder with a MFC application, this seems to be the best solution.
It combines the least amount of modifications to the current app with the possibility to go forward using the VCL for future GUI improvements.
EDIT2:
It's not possible to combine MFC and VCL in one app, therefore C++ Builder won't be an option. (thanks David for pointing this out)
In general, everything depends on how complex the application's logic is and what the projected lifetime of the application is. If it requires maintenance for another 20 years, then I'd rewrite the UI in Delphi and move the business logic into a C++ DLL (to begin with, and possibly rewrite it in Delphi as well later). It may then turn out that the application can be maintained this way for another 10 years and ported to other platforms relatively easily if needed (less work would be required).
This is a hard question to answer generically. Can you provide any more information about your specific app? What sort of technologies does it use? How separated is the UI from underlying layers and logic?
Some general-ish points though:
Rewriting an app is generally a bad idea, for the following reasons:
It's surprisingly hard to get an accurate idea of the requirements. You're sure you know what it does (after all, it's right there in front of you!), but then you release your rewritten app and you get complaints that functionality you didn't know was there is missing, that functionality is harder to access if you've changed something, etc.
It introduces bugs. The code, especially if it's old, is full of bugfixes, tweaks, etc. You will lose all that if you rewrite, especially if it's a different language and you can't reuse any code at all.
When using a different UI layer (MFC to something else) separating the UI can be very hard if the app wasn't written well in the first place. You will probably end up doing a lot of refactoring, even if you don't do a complete rewrite and simply move from MFC to 'something else'.
MFC is kept up to date (ish) - there is an MFC Ribbon control, for example, as well as modern controls and Windows 7 support. The least amount of work, probably, would be to upgrade to a modern version of Visual C++ and become a C++ developer. However, you're quite right that MFC is an old technology and is unpleasant to use, not only because of its design, but also because modern form designers etc. are so much nicer to work with.
You're a Delphi developer. Without rewriting the entire thing, you could consider migrating to C++Builder. Consider this:
You can use old versions of MFC with C++Builder. I've never done this, since the VCL is miles ahead, but it's possible and there are a number of people who do it. Check out this forum, for example. (Credit for that link: this thread.)
Once you have your app compiling and working with C++Builder, you can start migrating to the VCL. As a Delphi developer you'll find using this, even with C++, very familiar. It's the same form designer of course, and using it from C++ is pretty simple - it's a different language but code is often line-for-line translatable. Everything you're used to (DFM files, units, event handlers, etc) all translate.
Not only that, but Delphi code can be used in C++ projects. Just add the units to the project, and in your C++ code include the auto-generated unitname.hpp file. You can't (easily) use C++ code from Delphi, but you could create new modules in Delphi and use them from C++. As you do this, more and more of your app will slowly become Delphi code - ie, you don't need to rewrite in a different language all in one go.
As a Delphi developer, I'd suggest going the C++Builder route. Get it working with MFC, and then migrate your windows to the VCL. At that point, you could start rewriting modules in Delphi, or you may find yourself comfortable enough in C++ to continue developing as is.
Edit: I noticed in a reply above that you like the idea of making it a C++ DLL. The link I gave a paragraph or two above about using C++ objects from Delphi might be more applicable than I thought. It would fit the RAD Studio (mix of C++ and Delphi) method as well.
Keeping the app as is, tying you to MFC, is likely not very productive - You'll need to learn a GUI toolkit you'll most likely never use for something else (Delphi is great for GUI, MFC doesn't even come close IMO), in addition to a new language.
That leaves you with the choice of rewriting it in a somewhat unfamiliar language using an unfamiliar GUI toolkit, which'll take a lot more time than rewriting it in a familiar language using a familiar GUI toolkit. So you should just get started porting this to Delphi.
Rewriting C++ code in Delphi isn't as easy as you think. A better way to rewrite it is by just redesigning it from scratch, without looking at the old code. Feel free to look how the old application worked, so you can rebuild it. Just don't look at the code. That way, you should get a more modern result.
Of course, if you use RAD Studio then you have both the C++ and the Delphi compilers, so you should be able to continue developing the C++ application, although this means you have to learn C++. Then again, any good programmer should be able to move to another programming language and learn to use it within two weeks to a month. C++ can be complex but still, learning C++ and then maintaining the legacy app should take a lot less time than a complete rewrite.
Do keep in mind that any generic C++ application should be able to be compiled for any platform, although the MFC will probably restrict this to just Windows. Still, it's a language that has an even better backwards compatibility than Delphi!
But keep in mind: will this app run on a different platform in the future? Should it become a .NET application? Or run on Linux? Should it support tablet computers? Android? Your choices today might be outdated again in two years. And since Delphi has a somewhat uncertain future right now, mostly because C#/.NET became so popular, C++ might be a safer bet. Try to replace the MFC libraries with a more modern UI technology, preferably one that's available for multiple platforms, and think very, very carefully about the future uses of this application.
In general I'd say:
If it's a tiny tool application, and it takes just a couple of days to do a full rewrite: go for it. Don't waste your time creating dll wrappers or to interface with the existing code in other ways. Just do a full rewrite and be done with it.
Otherwise: you'll probably be making changes in only one specific area of the application at a time. Unless the code is complete spaghetti, you could even get away with making some local changes without fully understanding the implementation details of the rest of the code.
In any case, you need to invest some time into understanding the application and its language + frameworks.
You have a great opportunity to learn C++ and MFC. Take advantage of it. When Delphi goes astray you will have the required knowledge to keep on coding with a language that won't go away so easily, and you can even broaden your development horizons to areas Delphi (and C++ Builder) will never reach. MFC is no more outdated than the VCL is (although I agree the original design is worse).
Good UI programming has nothing to do with the ability to drop controls on a form visually. Many great applications are not built that way. Actually, rewriting it in Delphi could cause you problems in the future, as long as Embarcadero delivers slowly and without a credible roadmap.
I recommend
1) Keep the app as is and become a C++ MFC developer. But I don't like the idea of using an outdated technology (MFC) for years to come, trying to keep up with new Windows versions and UI standards. It somehow feels like making several steps backwards and I don't think this is the best way to go (?)
This is because MFC is well supported and keeps up with the times. MFC is also what you'd call an intrusive framework, meaning that the framework dependencies are usually not easily refactored. (The author of CPPDepend published some nice stats on that IIRC, but I can certainly vouch for this from my own experience with large MFC applications).
If you're gonna rewrite to any modern UI framework, don't code the UI in C++ (judging from the fact that Delphi is an option, it is not about realtime visualizations or something like that).
(I'll unask the unasked question here: if you're gonna rewrite, XXXXXXXXXXXXX?) Please, gentle(wo)men, let's not start a flame war.
Does the app come with a decent amount of automated tests? If not, you're pretty much stuck with option 1 and hoping for the best. If there are many tests, you can do a lot more with the code without breaking all kinds of things you didn't know were there.
Just curious - are protocol buffers usable with F#? Any caveats etc.?
I'm just trying to answer this question myself.
Marc Gravell's protobuf-net project worked out-of-the-box with F# because it uses standard .NET idioms. You can use attributes to get serialization without having to write .proto files or do any two-stage compilation, or you can generate the necessary code from standard .proto files. Performance is good for .NET but a lot slower than alternatives like OCaml's built-in Marshal module. However, this library forces you to make every field in every message type mutable. This is really counter-productive because messages should be immutable. Also, the documentation leaves a lot to be desired but, then, this is free software.
I haven't managed to get Jon Skeet's protobuf-csharp-port library to work at all yet.
Ideally, you'd be able to serialize all of the built-in F# types (tuples, records, unions, lists, sets, maps, ...) to this wire format out-of-the-box but none of the existing open source solutions are capable of this. I'm also concerned by the complexity of these solutions: Jon Skeet's is 88,000 lines of C# code and comments (!).
As an aside, I am disappointed to see that Google protocol buffers do not specify standard formats for DateTime or decimal numbers.
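In the absence of a standard format, both ends have to agree on their own convention. The following is a minimal sketch of one such convention, written in Python rather than F# purely for illustration: timestamps travel as whole milliseconds since the Unix epoch (which would fit an integer field) and decimals travel as text (which would fit a string field). The function names and the choice of field types are assumptions made here, not part of any of the libraries mentioned.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# Reference point for the (assumed) timestamp convention.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def datetime_to_millis(dt):
    """Encode a timezone-aware datetime as whole milliseconds since the epoch.

    Anything finer than a millisecond is dropped by the floor division.
    """
    return (dt - EPOCH) // timedelta(milliseconds=1)


def millis_to_datetime(millis):
    """Decode milliseconds since the epoch back into a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


def decimal_to_text(value):
    """Encode a Decimal as text so no binary floating point is involved."""
    return str(value)


def text_to_decimal(text):
    """Decode the text form back into a Decimal."""
    return Decimal(text)


stamp = datetime(2011, 1, 2, 3, 4, 5, 678000, tzinfo=timezone.utc)
millis = datetime_to_millis(stamp)
print(millis_to_datetime(millis) == stamp)

price = Decimal("19.99")
print(text_to_decimal(decimal_to_text(price)) == price)
```

The drawback is exactly the one complained about above: the convention lives outside the wire format, so every consumer of the messages has to know about it.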
I haven't looked at Proto# yet and cannot even find a download for Froto. There is also ProtoParser but it just parses .proto files and cannot actually serialize anything.
There isn't an F# specific one listed here, but there is an OCaml one, or there is a .NET "general" one (protobuf-net).
In all honesty, I simply haven't gotten around to trying protobuf-net with F# objects, in part because I simply don't know enough F#, but if you can create POCOs they should work. They would need to have some kind of mutability (perhaps even just private mutability) to work with protobuf-net, though.
If you are happy to generate a C# DTO and just consume that from F#, then protobuf-net or Jon's port should work just fine.
I'd expect both my own port and Marc Gravell's to work just fine with F#, to the same extent that any other .NET library does. In other words, neither port is written in a way which is likely to produce idiomatic F# code, but they should work.
My port will generate C# code, so you'll need to build that as a separate project for your serialization model - but that should interoperate with F# without any problems. The generated types are immutable (with mutable builders) so that should help in an F# context.
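For readers who haven't met the pattern, an immutable message paired with a mutable builder looks roughly like the Python sketch below. It only illustrates the idea; the `Person` and `PersonBuilder` names and their fields are invented here and are not the generated C# API.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Person:
    """Immutable message: fields cannot be reassigned after construction."""
    name: str
    age: int


class PersonBuilder:
    """Mutable builder: collects field values, then produces a frozen message."""

    def __init__(self):
        self._name = ""
        self._age = 0

    def set_name(self, name):
        self._name = name
        return self  # returning self allows chained calls

    def set_age(self, age):
        self._age = age
        return self

    def build(self):
        return Person(name=self._name, age=self._age)


person = PersonBuilder().set_name("Ada").set_age(36).build()
print(person)

try:
    person.age = 37  # rejected because the dataclass is frozen
except FrozenInstanceError:
    print("messages cannot be modified after build()")
```

All mutation is confined to the builder, so code that only ever sees built messages can treat them as values, which is close to how F# records behave.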
Of course, you could always take the core parts of either project and come up with an idiomatic F# solution too - whether you port the whole project to F# or use the existing libraries with an F# code generator and helper functions, or something like that.
I have recently spent several years translating legacy FORTRAN into Java. Prior to that, I found myself translating FORTRAN into C (for which I wrote a simple translation tool). After all this work, I find myself wondering how many others are doing similar language-to-language translations and whether an automated way of doing so would be beneficial.
I know about F2C, For_C, F2J and others, as well as some of the translation sites, but none seem to be all that successful. Having seen output from For_C, I can see why it just hasn't taken off. While it is technically correct, it is very difficult to maintain.
So, I guess what I am wondering is: if there were a tool that produced more maintainable, more grok-able code than the code I have seen, would developers use it? Or are developers as jaded as many posts seem to indicate, and unwilling to use generated code because it could never be as good as their manually translated code?
In short, no. Obviously time constraints necessitate it sometimes, but...
Rarely is code written in one language going to translate well to another - every language has certain ways of doing things that are more suited to the constructs available / common libraries / etc.
Consider for example a program written in C as compared to something written in Python - certainly you can write for loops and iterate through things in Python just as easily as you can in C, but it is much simpler to use list comprehensions and take advantage of the features the language provides.
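As a small illustration of that point, here is a Python sketch of the same task written twice: once the way a line-by-line translation from C tends to come out, and once using a list comprehension. The function names and sample data are made up for this example.

```python
def squares_of_evens_c_style(values):
    """Literal, C-flavoured translation: index-driven loop with manual appends."""
    result = []
    i = 0
    while i < len(values):
        if values[i] % 2 == 0:
            result.append(values[i] * values[i])
        i += 1
    return result


def squares_of_evens_idiomatic(values):
    """Same behaviour, expressed as a list comprehension."""
    return [v * v for v in values if v % 2 == 0]


sample = [1, 2, 3, 4, 5, 6]
print(squares_of_evens_c_style(sample))
print(squares_of_evens_idiomatic(sample))
```

Both functions return the same list, but only the second is what a Python programmer would expect to maintain; a mechanical translator has little reason to produce it.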
I'd be surprised to see an example of a reasonably sized program written in any language that could be translated into 'correct', well-maintainable code in any other.
This was already covered to some extent in Conversion of Fortran 77 code to C++, but I'll take a stab at it here.
I think there's a lot of time wasted translating legacy code to new languages. It takes a phenomenal amount of time and energy to do, and you introduce new bugs when you do it.
Joel mentioned why rewriting from scratch is a horrible idea in Things you Should Never do Part I, and though I realize that translating something to a new language isn't quite the same as rewriting from scratch, I claim it's close enough:
Automated translation tools aren't wonderful because you don't get anything maintainable out of them. You pretty much have to know the old code to understand the new code, and then what have you gained?
To port something manually, you have to know how the code works to do it well. Rewriting code is seldom done by the original developers, so you seldom get people who understand everything that's going on to do the rewrite. I worked at a company where an outsource team was hired to translate an entire website backend from ColdFusion to JSP. That project kept getting delayed and delayed because the port team didn't know the code at all. Our guys never quite liked their design, and they never quite got it right, so there was constant iteration as everyone worked out all the issues that were solved in the original code. Then, the porting itself took forever.
You also need to be familiar with really technical inconsistencies between languages. People who are very familiar with two languages are rare.
For Fortran specifically, I now work at a place where there are millions of lines of legacy Fortran code, and no one here is about to rewrite it. There's just too much risk. Old bugs would have to be re-fixed, and there are hundreds of man-years that went into working out the math. Nobody wants to introduce those kinds of bugs, and it's probably downright unsafe to do it.
Instead of porting, we have hybrid codes. After all, you can link Fortran and C/C++, and if you make a C interface around your Fortran code, you can call it from Java. Modern codes here have C/C++ components that make calls into old Fortran routines, and if you do it this way you get the added benefit that Fortran compilers are screaming fast, so the old code continues to run as fast as it ever did.
I think the best way to handle this is to do any porting you need to do incrementally. Make a lightweight interface around your old Fortran code and call the pieces you need, but only port things as you need them in the new part. There are also component frameworks for integrating multi-language applications that can make this easier, but you can check out Conversion of Fortran 77 code to C++ for more on that.
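The incremental approach can be sketched as follows, with Python standing in for whatever the new language is. Callers go through one thin interface, which uses a ported routine where one exists and otherwise falls back to the legacy one. Everything here is hypothetical: the class and function names are invented, and the "legacy" functions are plain Python stand-ins for what would really be wrapped Fortran routines.

```python
def legacy_dot(xs, ys):
    """Stand-in for a wrapped legacy routine (hypothetical)."""
    total = 0.0
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total


def legacy_norm1(xs):
    """Stand-in for a second legacy routine (hypothetical)."""
    total = 0.0
    for x in xs:
        total += abs(x)
    return total


class NumericsFacade:
    """Thin interface: prefers ported routines, falls back to legacy ones."""

    def __init__(self, legacy):
        self._legacy = dict(legacy)
        self._ported = {}

    def port(self, name, implementation):
        """Register a new implementation for an existing legacy routine."""
        if name not in self._legacy:
            raise KeyError("no legacy routine named " + name)
        self._ported[name] = implementation

    def call(self, name, *args):
        """Run the ported version if there is one, else the legacy version."""
        routine = self._ported.get(name, self._legacy[name])
        return routine(*args)

    def ported_names(self):
        return sorted(self._ported)


numerics = NumericsFacade({"dot": legacy_dot, "norm1": legacy_norm1})

# Port only what the new code actually needs; norm1 stays on the legacy side.
numerics.port("dot", lambda xs, ys: sum(x * y for x, y in zip(xs, ys)))

print(numerics.call("dot", [1.0, 2.0], [3.0, 4.0]))
print(numerics.call("norm1", [-1.0, 2.0]))
print(numerics.ported_names())
```

Because callers never name the implementation directly, a ported routine can be checked against the legacy one on the same inputs before the old code is retired.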
Since programming is hard, no such tool can really exist.
If it was trivial to change one language into another, the idea of "compiler" would be moot. You'd just map the language you liked into the language of the hardware, press the button and be done.
However, it's never that simple. Each VM, each language, each API library adds nuances that are just impossible to automate.
" I can see why it just hasn't taken off. While it is technically correct, it is very difficult to maintain."
That is true of F2C as well as of Fortran compiled to machine language. The object code generated by most compilers can't easily be read by people. Either it's cruddy or it's highly optimized. Either way, it doesn't look a thing like what an expert human would write in the assembler language for that hardware.
If only compiling could be reduced to some XSLT-like transformations that preserved the clarity of the old language in the new language. If there was only some universal Lingua Franca of computing that would be the Rosetta Stone of programming.
Until someone invents that Lingua Franca of computing, every language translation job will be hard and will lead to code that's "difficult to maintain" in the new language.
I've used f2c, and I agree with whoever wanted to name it cc2fc instead. It isn't a way of transforming Fortran into anything vaguely usable as C. It's a way of taking a C compiler and making a Fortran compiler out of it.
It did work just fine at taking that Fortran code and turning it (through C) to a Macintosh library I could call from Macintosh Common Lisp. Those were the days.