I’m writing a small application based on FParsec.
Today, I’m looking for an opportunity to make a version for Compact Framework.
Apparently, it is not that simple to build FParsec sources for .NET CF. The FParsecCS library has unsafe code and some references to the types that are not available in CF. Namely, System.Collections.Generic.HashSet, System.Text.DecoderFallbackException, and more.
I’m wondering if there’s any way to make it built. Obviously, I’m trying not to alter the code as it would be hard to update when further versions of FParsec released.
I don’t really care about performance. If there is a generic CharStream that can be used instead of high-performance one you have, that would be quite sufficient.
Thank you for your help.
I don’t have any experience with .NET CF and never tried to make FParsec run on it. However, there’s a Silverlight version of FParsec, which might be a good starting point for a port to .NET CF. The Silverlight version builds on the LOW_TRUST version of FParsec, which doesn’t use any "unsafe" code. Hopefully, the stream size limitation of the LOW_TRUST version won’t be an issue for your application.
The easiest way to deal with the HashSet dependency probably is to implement you own simple HashSet type (based on a Dictionary) that implements the few methods that FParsec actually uses for its error handling. If the DecoderFallbackException is not supported, you can just comment out the respective exception handlers.
If you track your changes with HG, it shouldn’t be difficult to merge in updates to FParsec. Depending on how extensive the changes for .NET CF are, I could also include them in the main source tree for another conditional compiler symbol.
Related
The fslex and fsyacc tools currently require 2-stage compilation, generating files that are then compiled by fsc. It seems to me that these tools would be much easier to use if the source files were embedded resources, fed to fslex and fsyacc programmatically and the generated code compiled on-the-fly using the CodeDom.
Is this feasible and, if so, what would be required to implement this?
Jon, this is a great question; in fact, one of the design goals I have for fsharp-tools (new lexer- and parser-generator implementations for F#) is for them to be embeddable, specifically to enable scenarios like this.
As of now, I haven't implemented (yet) the functionality which would let you do this easily in fsharplex, but don't let that deter you; I've written fsharplex (and the other tools in fsharp-tools) in a more-or-less purely-functional style, so there shouldn't be any issues with global state or anything like that. It should be relatively straightforward to hack up the compiler code so you can build a regex AST using some combinators, run the compiler to get a compiled DFA, then emit IL for your state machine into a dynamic assembly (which you could then "bake" and execute).
fsharpyacc currently uses an approach where I've put the bulk of the compilation logic into a purely-functional library, Graham; the idea there is that the grammar analysis/manipulation and parser DFA compilation algorithms should be generic, reusable, and easy to test, so anyone else wanting to build language tools with F# will have a common framework on which to build them. Likewise, contributions/improvements to Graham can easily flow back to fsharpyacc. Eventually, I will modify fsharplex to use this same approach, which will allow you to embed the regex compiler in your own code simply by referencing the NuGet package (you'd just need to write the code to generate IL from the DFA).
fsharplex and fsharpyacc use MEF to allow various backends to be plugged in; for now, they're only targetting fslex and fsyacc for compatibility reasons, but I'd like to implement code-based backends (as opposed to the current table-based backends) to get better performance in the future.
Update -- I just re-read your question and noticed you want to embed the *.fsl and *.fsy files themselves and invoke the respective compilers at run-time. You could accomplish this by compiling the tools and referencing the assemblies from your own projects. IIRC, I exposed an entry point in both compilers so they could be called from outside code; the main entry points (e.g., what gets executed when you invoke the tools from a console) simply parse the command-line arguments then pass them into this "external" entry point.
There is one problem with directly embedding the *.fsl and *.fsy files though; if you embed them, then run them through fsharplex and fsharpyacc at run-time, your user-defined actions (e.g., the code executed when a lexer or parser rule is matched) will still be specified as F# source code -- you'd need to decide how you want to compile them into executable code.
It should be feasible to provide a parser combinator-like interface with a backend that uses expression trees (the LISP "eval" of F#) or something similar, for full integration with the language. Or else a TypeProvider. There are many options. If table generation is an expensive computation, it could be cached by providing a Cache, for example a disk cache.
I think nothing except lack of time, dedication and expertise, prevents us from having tools with (non-monadic) parser combinator-like interface, yet efficient compiled implementation.
Sometimes I get back to this pet project of mine, playing with an algebraic approach to optimizing regular expressions (and lexers) specified in source using combinators and then compiled to a state machine. It still lacks a few key pieces for efficiency, but there it is:
https://github.com/toyvo/ocaml-regex-algebraic
The current F# Compiler is written in F#, is open source and runs on .Net and Mono, allowing it to execute on many platforms including Windows, Mac and Linux. F#'s Code Quotations mechanism has been used to compile F# to JavaScript in projects like WebSharper, Pit and FunScript. There also appears to be some interest in running F# code on the JVM.
I believe a version of the OCaml compiler was used to originally Bootstrap the F# compiler.
If someone wanted to build an F# compiler that runs on the JVM would it be easier to:
Modify the existing F# compiler to emit Java bytecode and then compile the F# compiler with it?
Use a JVM based ML compiler like Yeti to Bootstrap a minimal F# compiler on the JVM?
Re-write the F# compiler from scratch in Java as the fjord project appears to be attempting?
Something else?
Another option that should probably be considered is to convert the .NET CLR byte code into JVM byte-code like http://www.ikvm.net does with JVM > CLR byte codes. Although this approach has been considered and dismissed by the fjord owner.
Getting buy-in from the top with option 1) and have the F# compiler team have pluggable backends that could emit Java bytecode sounds in theory like it would produce the most polished solution.
But if you look at other languages that have been ported to different platforms this is rarely the case. Most of the time it's been a rewrite from scratch. But this is also likely due to the original language team having no interest in supporting alternative platforms themselves and that the original host implementation might've not been able to support multiple backends and it's already too slow for this to be a feasible option to start with.
My hunch is a combination of re-writing from scratch and being able to do as much code sharing and automation as possible from the original implementation. E.g. if the test suites could be re-used for both implementations it would take a lot of the burden off the JVM port and go a long way in ensuring language parity.
If I really had to do this, I would probably start with the #1 approach - add JVM backend to the existing compiler. But I would also try to argue for a different target VM.
Quotations are not very relevant - as an author of WebSharper I can assure you that while quotations can give you a nice F#-like language to program with, they are restrictive, and not optimized. I imagine that for potential JVM F# users the bar would be a lot higher - full language compatibility and comparable performance. This is very hard.
Take tail calls, for example. In WebSharper we apply heuristics to optimize some local tail calls to loops in JavaScript, but that is not enough - you cannot in general rely on TCO, as you do in general F# libraries. This is ok for WebSharper as our users do not expect to have full F#, but will not be ok for a JVM F# port. I believe most JVM implementations do not do TCO, so it will have to be implemented with some indirection, introducing a performance hit.
An bytecode re-compilation approach mentioned by #mythz sounds very attractive as it allows more than just porting F# - ideally it allows porting more .NET software to the JVM. I worked quite a bit with .NET bytecode analysis on an internal WebSharper 3.0 project - we are looking at the option of compiling .NET bytecode instead of F# quotations to JavaScript. But there are huge challenges there:
A lot of code in BCL is opaque (native) - and you cannot decompile it
The generics model is fairly complicated. I have implemented a JavaScript runtime that models class and method generics, instantiation, type generation, and basic reflection with some precision and reasonable performance. This was difficult enough in dynamic JavaScript with closures and is seems quite difficult to do in a performant way on the JVM - but maybe I just do not see a simple solution.
Value types create significant complications in the bytecode. I am yet to figure this one out for WebSharper 3.0. They cannot be ignored either, as they are used extensively by many libraries you would want ported.
Similarly, basic reflection is used in many real-world .NET libraries - and it is a nightmare to cross-compile in terms of both lots of native code and proper support for generics and value types.
Also, the bytecode approach does not remove the question on how to implement tail calls. AFAIK, Scala does not implement tailcalls. They have certainly the talent and the funding to do that - the fact that they do not, tells me a lot about how practical it is to do TCO on the JVM. For our .NET->JavaScript port I will probably go a similar route - no TCO guarantees unless you specifically ask for trampolining which will work but cost you an order of magnitude (or two) in performance.
There is a project that compiles OCaml to the JVM, OCaml-Java: it's pretty complete and in particular can compile the OCaml's compiler (written in OCaml) sources. I'm not sure which aspects of the F# language you're interested in, but if you're mainly looking at getting a mature strict typed functional language to the JVM, that may be a good option.
I suspect any approach would be a lot of work, but I think your first suggestion is the only one that would avoid introducing lots of additional incompatibilities and bugs. The compiler's pretty complex and there are a lot of corner cases around overload resolution, etc. (and the spec probably has gaps too), so it seems very unlikely that a new implementation would have consistently compatible semantics.
Modify the existing F# compiler to emit Java bytecode and then compile the F# compiler with it?
Use a JVM based ML compiler like Yeti to Bootstrap a minimal F# compiler on the JVM?
Porting the compiler shouldn't be that hard if it is written in F#.
I'd probably go the first way, because this is the only way one could hope to keep the new compiler in sync with the .net F# compiler.
Re-write the F# compiler from scratch in Java as the fjord project appears to be attempting?
This is certainly the least elegant approach, IMHO.
Something else?
When the compiler is done, you'll have 90% of the work left to do.
For example, not knowing much F#, but I assume it is easy to use any .NET libraries out there.
That means, the basic problem is to port the .NET ecosystem, somehow.
I was looking for something in similar lines, though it was more like a F# to Akka translator/compiler. As far as F# -> JVM is concerned, I came across two not quite production ready options:
1. F# -> [Fjord][1] -> JVM.
2. F# -> [Funscript][2] -> [Vert.X][3] -> JVM
I'm trying to find a straightforward way to consume arbitrary iOS libraries from MonoTouch. At the moment, I need this calendar functionality, but the question applies to any such component.
I've read the Xamarin article on creating iOS bindings, but the process of building these bindings looks so complex (and tedious and likely error prone) that I think it would actually be easier for me to re-implement the given functionality in C# from scratch than it would to go through this process. Creating these bindings would require a deep dive into ObjectiveC, and I'm using Xamarin precisely so I don't have to do that.
As it stands, I am torn because I really want the ability to access some iOS libs, but don't have the time to master this process enough to create these bindings. Is there any other way to access these libraries?
(I wonder if there is or could be any sort of automated binding generator? It seems to me that 95% of the work is boilerplate translation of ObjectiveC headers to C# idioms, and an automated tool could do this, and then the final tweaking could be done by hand.)
You can:
Consume the ones that are already bound: you can find many on github, in particular in monotouch-bindings, and in the (just announced) Xamarin's Components Store;
Bind them yourself. That does require some Objective-C knowledge. Some tools/scripts exists but, in the end, the manual by hand editing is where the Objective-C knowledge is needed. There are general unit test (e.g. for Touch.Unit) that you can re-use that will dramatically reduce the number of bugs in them (blog post will be coming up soon to describe them in details).
Convert (or write from scratch) some into C# components;
Just curious - are protocol buffers usable with F#? Any caveats etc.?
I'm just trying to answer this question myself.
Marc Gravell's protobuf-net project worked out-of-the-box with F# because it uses standard .NET idioms. You can use attributes to get serializing without having to write .proto files or do any two-stage compilation, or you can generate the necessary code from standard .proto files. Performance is good for .NET but a lot slower than alternatives like OCaml's built-in Marshal module. However, this library forces you to make every field in every message type mutable. This is really counter-productive because messages should be immutable. Also, the documentation leaves a lot to be desired but, then, this is free software.
I haven't managed to get Jon Skeet's protobuf-csharp-port library to work at all yet.
Ideally, you'd be able to serialize all of the built-in F# types (tuples, records, unions, lists, sets, maps, ...) to this wire format out-of-the-box but none of the existing open source solutions are capable of this. I'm also concerned by the complexity of these solutions: Jon Skeet's is 88,000 lines of C# code and comments (!).
As an aside, I am disappointed to see that Google protocol buffers do not specify standard formats for DateTime or decimal numbers.
I haven't looked at Proto# yet and cannot even find a download for Froto. There is also ProtoParser but it just parses .proto files and cannot actually serialize anything.
There isn't an F# specific one listed here, but there is an OCaml one, or there is a .NET "general" one (protobuf-net).
In all honesty, I simply haven't gotten around to trying protobuf-net with F# objects, in part because I simply don't know enough F#, but if you can create POCOs they should work. They would need to have some kind of mutability (perhaps even just private mutability) to work with protobuf-net, though.
If you are happy to generate a C# DTO and just consume that from F#, then protobuf-net or Jon's port should work just fine.
I'd expect both my own port and Marc Gravell's to work just fine with F#, to the same extent that any other .NET library does. In other words, neither port is written in a way which is likely to produce idiomatic F# code, but they should work.
My port will generate C# code, so you'll need to build that as a separate project for your serialization model - but that should interoperate with F# without any problems. The generated types are immutable (with mutable builders) so that should help in an F# context.
Of course, you could always take the core parts of either project and come up with an idiomatic F# solution too - whether you port the whole project to F# or use the existing libraries with an F# code generator and helper functions, or something like that.
The reason why I am asking is that I'm learning F# and would like to attend TopCoder competitions. However, F# is not among the list of languages supported there. But C# is on the list (to be honest, this is the case for almost all online coding competitions, except Google Code Jam and Facebook Hacker cup).
The possible workarounds I can think of at this moment are
1) find a translator that can translate F# source code directly into C#
2) compile F# code into .net executable first, then disassemble it back to C# code
The minimum requirement is that the generated C# must be able to compile into a runnable .net executable, preferable as less external dependency as possible.
The first approach seems unlikely, a quick google search turns out nothing relevant.
Approach two looks more promising, there are .net disassemblers exist.
I tried the most popular one --- Reflector from Red Gate. While it can perfectly dissemble C# executables, it appears to have problems with executables compiled from F#: it happily disassembled, but the resulting C# code has some special characters such as adding a leading $ sign to a class name and other weird stuffs, so it cannot be compiled. I was using Visual Studio 2010 Professional, the latest Reflector beta version (which is free).
Am I missing anything here? Is it possible?
Update:
It looks like this is still impossible. For now, I'll use C# instead.
As others already pointed out in the comments - if there is some way to do that, there will be quite a few nasty cases where it probably won't quite work and it will be very fragile...
One way to deal with the problem (for you) is to just write the solution in F# and then rewrite it to C#. This may sound stupid, but there are some advantages:
In F#, you can easily prototype the solution, so you'll be able to find the right solution faster.
When translating code to C#, you'll probably find yourself using features like lambda expressions more often, so it may even improve your C# skills...
If you rely on .NET libraries, then this part of code will be easy to translate.
Of course, the best thing would be to convince the organizers that they should support F# (which probably wouldn't be too difficult if they allow C# already), but I understand that this may be a challange.