Is it possible that F# will be optimized more than other .NET languages in the future?

Is it possible that Microsoft will be able to make F# programs, either at VM execution time, or more likely at compile time, detect that a program was built with a functional language and automatically parallelize it better?
Right now I believe there is no effort to automatically execute a program that was written as a single-threaded program as a multithreaded one.
That is to say, the developer would code a single threaded program. And the compiler would spit out a compiled program that is multi-threaded complete with mutexes and synchronization where needed.
Would these optimizations be visible in task manager in the process thread count, or would it be lower level than that?

I think this is unlikely in the near future. And if it does happen, I think it would be more likely at the IL level (assembly rewriting) rather than language level (e.g. something specific to F#/compiler). It's an interesting question, and I expect that some fine minds have been looking at this and will continue to look at this for a while, but in the near-term, I think the focus will be on making it easier for humans to direct the threading/parallelization of programs, rather than just having it all happen as if by magic.
(Language features like F# async workflows, and libraries like the task-parallel library and others, are good examples of near-term progress here; they can do most of the heavy lifting for you, especially when your program is more declarative than imperative, but they still require the programmer to opt-in, do analysis for correctness/meaningfulness, and probably make slight alterations to the structure of the code to make it all work.)
Anyway, that's all speculation; who can say what the future will bring? I look forward to finding out (and hopefully making some of it happen). :)
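To make the opt-in point concrete, here is a minimal sketch in Python rather than F# (the names `simulate` and `run_all` are made up for illustration, and this is not the Task Parallel Library or async workflows). The programmer, not the compiler, decides that the calls are independent and explicitly hands them to a library:

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(seed: int) -> int:
    """A pure function: the result depends only on its argument."""
    total = 0
    for i in range(1000):
        total = (total * 31 + seed + i) % 1_000_003
    return total

def run_all(seeds, workers: int = 4):
    # The programmer opts in here, having checked that the calls are independent.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, seeds))

sequential = [simulate(s) for s in range(8)]
parallel = run_all(range(8))
```

In CPython the threads will not speed up CPU-bound work like this; the sketch only shows the shape of the opt-in, not a performance result.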

Given that F# is derived from OCaml, and OCaml compilers can optimize your programs far better than other compilers, it probably could be done.

I don't believe it is possible to autovectorize code in a generally-useful way and the functional programming facet of F# is essentially irrelevant in this context.
The hardest problem is not detecting when you can perform subcomputations in parallel, it is determining when that will not degrade performance, i.e. when the subtasks will take sufficiently long to compute that it is worth taking the performance hit of a parallel spawn.
We have researched this in detail in the context of scientific computing and we have adopted a hybrid approach in our F# for Numerics library. Our parallel algorithms, built upon Microsoft's Task Parallel Library, require an additional parameter that is a function giving the estimated computational complexity of a subtask. This allows our implementation to avoid excessive subdivision and ensure optimal performance. Moreover, this solution is ideal for the F# programming language because the function parameter describing the complexity is typically an anonymous first-class function.
Cheers,
Jon Harrop.
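As a rough illustration of the idea of passing in a cost estimate (this is a sketch in Python, not the F# for Numerics API; `split`, `parallel_sum`, `cost` and `threshold` are names assumed for the example), the work is subdivided only while the estimated cost of a piece is above a threshold, so cheap pieces are never spawned as separate tasks:

```python
from concurrent.futures import ThreadPoolExecutor

def split(lo, hi, cost, threshold):
    """Halve [lo, hi) until each piece's estimated cost is at most threshold."""
    if hi - lo <= 1 or cost(lo, hi) <= threshold:
        return [(lo, hi)]
    mid = (lo + hi) // 2
    return split(lo, mid, cost, threshold) + split(mid, hi, cost, threshold)

def parallel_sum(f, lo, hi, cost, threshold, workers=4):
    """Sum f(i) for lo <= i < hi, one task per piece chosen by the cost estimate."""
    chunks = split(lo, hi, cost, threshold)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda c: sum(f(i) for i in range(*c)), chunks)
        return sum(partials)

# Assumed cost model: the work for item i is proportional to i,
# so the cost of [lo, hi) is lo + (lo + 1) + ... + (hi - 1).
cost = lambda lo, hi: (hi - lo) * (lo + hi - 1) // 2

chunks = split(0, 100, cost, 500)
total = parallel_sum(lambda i: i * i, 0, 100, cost, 500)
```

With this cost model the pieces near the start of the range come out larger than the pieces near the end, because the later items are estimated to be more expensive.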

I think the question misses the point of the .NET architecture-- F#, C# and VB (etc.) all get compiled to IL, which then gets compiled to machine code via the JIT compiler. The fact that a program was written in a functional language isn't relevant-- if there are optimizations (like tail recursion, etc.) available to the JIT compiler from the IL, the compiler should take advantage of it.
Naturally, this doesn't mean that writing functional code is irrelevant-- obviously, there are ways to write IL which will parallelize better-- but many of these techniques could be used in any .NET language.
So, there's no need to flag the IL as coming from F# in order to examine it for potential parallelism, nor would such a thing be desirable.

There's active research into autoparallelization and autovectorization for a variety of languages. And one could hope (since I really like F#) that they would conceive a way to determine whether a "pure", side-effect-free subset was used and then parallelize that.
Also, since Simon Peyton-Jones, the father of Haskell, is working at Microsoft, I have a hard time not believing there's some fantastic stuff coming.

It's possible but unlikely. Microsoft spends most of its time supporting and implementing features requested by its biggest clients. That usually means C#, VB.Net, and C++ (not necessarily in that order). F# doesn't seem like it's high on the list of priorities.

Microsoft is currently developing two avenues for parallelisation of code: PLINQ (Parallel LINQ, which owes much to functional languages) and the Task Parallel Library (TPL), which was originally part of Robotics Studio. A beta of PLINQ is available here.
I would put my money on PLINQ becoming the norm for auto-parallelisation of .NET code.

Related

Are there any tools that assist in porting F# to OCaml?

Unfortunately, due to .NET's lack of an incremental GC (either in the MS or Mono implementation), building soft real-time software such as games with F# is problematic. I've written a language in F# that, if -
a) it doesn't perform adequately in the face of the generational GC (arbitrary pauses during the interactive simulation), and
b) OCaml gets a good complete port to the LLVM backend -
I will port it from F# to OCaml. I have avoided as many .NET-specific libraries as I could, and since F#'s syntax is based on OCaml's, I'm assuming there should be some automated tools to assist in converting the code.
Anyone know of such things, either finished or in progress?
Thanks deeply!
To answer your question in an answer - as far as I know, there are no such tools and I do not think it is likely somebody will create them.
Although F# is inspired by OCaml, it has evolved a lot and is different in a number of ways (see this SO discussion), so automatic conversion is not trivial. Even if somebody did that, it would be more like compilation to hard-to-read OCaml than conversion to idiomatic code that you can later continue working on.
To add a few general comments, when you speak about "real-time" I imagine controlling some robot in a factory dealing with dangerous stuff or an airplane control. In these areas, concerns about GC are certainly valid. However, I do not think games are necessarily "real-time". You need good performance, that's for sure, but people have been writing games with .NET and F# quite happily. For some F# examples, see:
... a nice blog with a couple of game samples (that you can actually try & buy)
a 3D airplane shooter game that also looks fairly realistic
and there is also a book that uses games to explain F#
These are probably simpler than what you're aiming for, but it may be good enough to show that writing games using GC is doable.
Unfortunately, due to .NET's lack of an incremental GC (either in the MS or Mono implementation), building soft real-time software such as games with F# is problematic.
A few points here:
Incremental GCs are not the only way to get low pause times. Concurrent GCs like VCGC do the work in bulk but do it concurrently with mutators running, e.g. the VCGC implementation I described in the non-free article here was running with sub-millisecond pause times.
Incremental GC does not necessarily mean low pause times. For example, OCaml's GC typically incurs 10ms pauses and can incur arbitrarily-long pauses when it encounters a deep thread stack or long array in the heap.
I have measured typical pause times of 10ms with OCaml and 30ms with F# on .NET 3. With a simple implementation I was able to build a fault tolerant server in F# from scratch that handled 20k msgs/s with 50% of latencies under 114us and 95% under 500us.
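For anyone wanting to collect figures like the 50% and 95% latencies quoted above on their own system, a minimal sketch of the measurement follows. It is in Python rather than F#, uses the nearest-rank percentile definition, and `handler`, `measure_latencies` and `percentile` are hypothetical names for this example, not part of the server described:

```python
import time

def percentile(sorted_values, p):
    """Nearest-rank percentile of a sorted, non-empty list, for integer 0 < p <= 100."""
    n = len(sorted_values)
    rank = max(1, -(-n * p // 100))  # ceil(n * p / 100) using integer arithmetic
    return sorted_values[rank - 1]

def measure_latencies(handler, messages):
    """Return the per-message handling time in microseconds, sorted ascending."""
    latencies_us = []
    for msg in messages:
        start = time.perf_counter()
        handler(msg)
        latencies_us.append((time.perf_counter() - start) * 1e6)
    return sorted(latencies_us)

# Hypothetical handler standing in for real message processing.
latencies = measure_latencies(lambda msg: msg * 2, range(1000))
p50 = percentile(latencies, 50)
p95 = percentile(latencies, 95)
```

Looking at the tail (95% and above, and the maximum) rather than the mean is what exposes GC pauses, since a rare long pause barely moves the average.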
I've written a language in F# that, if -
a) it doesn't perform adequately in the face of the generational GC (arbitrary pauses during the interactive simulation), and
I wouldn't give up on the platform if your first working version has unacceptable latency. There are lots of things you can do to bring the max latency down.
b) OCaml gets a good complete port to the LLVM backend -
I seriously doubt OCaml will ever get what I'd consider to be a "good complete port to the LLVM backend". They'll just retarget LLVM with the current typeless IR and it won't do much better than the current ocamlopt compiler because LLVM isn't designed to optimize that kind of workload.
I will port it from F# to OCaml. I have avoided as many .NET-specific libraries as I could, and since F#'s syntax is based on OCaml's, I'm assuming there should be some automated tools to assist in converting the code.
No automated tools but I've ported hundreds of thousands of lines of code between OCaml and F# now and it is generally very easy because most code is written in the core ML subset of both languages.

How to program FPGA using F#

I usually use F# for writing numerical algorithms. Functional programming constructs in F# help to express algorithms in a very natural way. I often end up with a succinct and understandable implementation, and may be able to parallelize it quite quickly if there is an opportunity for parallelism.
I wonder whether there is a way to compile F# programs down to an FPGA. That way, I could still use F# to avoid boilerplate code in FPGA programming, and make use of the high-performance computing an FPGA offers. Is it possible to do so? If yes, could you provide some hints to get me started?
I've read about (but never used) Avalda's F# to FPGA conversion, but their site is currently returning a completely blank page. I don't know if that's just temporary or if it means they've gone belly-up.
F# should be ideal for this task because it is derived from the ML family of languages that were bred for metaprogramming. However, I am not aware of any work in this area (although I have had the idea of working on it myself).
I would focus on writing a compiler in F# that compiled a DSL to an FPGA, rather than trying to compile general F# code.
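To give a feel for the DSL route, here is a toy sketch. It is written in Python rather than F#, and everything in it (`Var`, `BinOp`, `emit`, `to_verilog`, the fixed output name `y`) is invented for the example. It turns a tiny expression tree into the text of a combinational Verilog module; the output has not been run through any synthesis tool:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class BinOp:
    op: str        # assumed to be one of "+", "&", "|", "^"
    left: object
    right: object

def emit(expr) -> str:
    """Render an expression tree as a fully parenthesised Verilog expression."""
    if isinstance(expr, Var):
        return expr.name
    if isinstance(expr, BinOp):
        return f"({emit(expr.left)} {expr.op} {emit(expr.right)})"
    raise TypeError(f"unsupported node: {expr!r}")

def inputs(expr, seen=None):
    """Collect variable names in first-use order, without duplicates."""
    seen = [] if seen is None else seen
    if isinstance(expr, Var):
        if expr.name not in seen:
            seen.append(expr.name)
    else:
        inputs(expr.left, seen)
        inputs(expr.right, seen)
    return seen

def to_verilog(name, expr, width=8):
    """Wrap the expression in a module with one input per variable and output y."""
    bus = f"[{width - 1}:0]"
    ports = [f"input {bus} {v}" for v in inputs(expr)]
    ports.append(f"output {bus} y")
    return (
        f"module {name}({', '.join(ports)});\n"
        f"  assign y = {emit(expr)};\n"
        "endmodule\n"
    )

expr = BinOp("^", BinOp("+", Var("a"), Var("b")), Var("c"))
verilog = to_verilog("mix", expr)
```

A real DSL would need types, clocked logic and resource constraints, but the overall structure (a small algebraic data type plus a recursive code generator) is the part that carries over naturally to F#.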
Here's a list for HLS tools using C. My experience with one of them in 2006 was not favourable but I expect them to be much better today.
Regarding F#, I doubt this will exist any time soon.

Should I use F# to analyse Graphs

We have a program which performs graph analysis (namely the maximum-flow problem) on several graphs.
There is also the opportunity to process these in parallel.
There is already a large C# Code base, but we are intending to rewrite a large portion of this. Would it be better to do this type of operation in F#, as opposed to say C#?
Thanks, Pete
I think that this depends largely on the composition of your team - how well do you know F#?
I feel like I am able to develop almost anything more quickly in F# than C#. In particular, highly algorithmic programs are often more concise and readable when expressed in F# thanks to type inference. However, if you don't have much experience with F# there is a significant learning curve, which means that you may be better off sticking to C# if you already know that language well.
C#'s support for doing operations in parallel is roughly equivalent to F#'s, particularly if your task doesn't require accessing any shared mutable state (which seems to be the case for your task). If I understand your problem correctly, you'd just like to run the same operation on multiple graphs at the same time, which ought to be quite easy in either language. If you were trying to parallelize the max-flow algorithm itself, then F# might be a bit easier due to its stronger support for immutable data types. Where F# really beats C# is in doing asynchronous operations, but that seems less relevant here.
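To show how little is involved in running the same operation on several graphs at once, here is a sketch in Python (neither C# nor F#; the graph representation and the `max_flow` helper are assumptions for the example, using the textbook Edmonds-Karp algorithm). Each call builds its own residual graph, so the calls share no mutable state:

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def max_flow(capacity, source, sink):
    """Edmonds-Karp. `capacity` maps u -> {v: capacity} and is not modified.
    Assumes source != sink."""
    # Private residual graph per call: nothing mutable is shared between calls.
    residual = {u: dict(vs) for u, vs in capacity.items()}
    residual.setdefault(source, {})
    for u, vs in capacity.items():
        for v in vs:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

graphs = [
    {"s": {"a": 3, "b": 2}, "a": {"t": 2}, "b": {"t": 3}, "t": {}},
    {"s": {"a": 10, "b": 10}, "a": {"b": 1, "t": 4}, "b": {"t": 9}, "t": {}},
    {"s": {"t": 7}, "t": {}},
]

with ThreadPoolExecutor(max_workers=3) as pool:
    flows = list(pool.map(lambda g: max_flow(g, "s", "t"), graphs))
```

The parallel part is the last two lines; the same shape exists in .NET as a parallel loop or a parallel map over the collection of graphs.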

Functional programming and multicore architecture

I've read somewhere that functional programming is suitable to take advantage of multi-core trend in computing. I didn't really get the idea. Is it related to the lambda calculus and the von Neumann architecture?
Functional programming minimizes or eliminates side effects and thus is better suited to distributed programming, i.e. multicore processing.
In other words, lots of pieces of the puzzle can be solved independently on separate cores simultaneously without having to worry about one operation affecting another nearly as much as you would in other programming styles.
One of the hardest things about dealing with parallel processing is locking data structures to prevent corruption. If two threads were to mutate a data structure at once without having it locked perfectly, anything from invalid data to a deadlock could result.
In contrast, functional programming languages tend to emphasize immutable data. Any state is kept separate from the logic, and once a data structure is created it cannot be modified. The need for locking is greatly reduced.
Another benefit is that some processes that parallelize very easily, like iteration, are abstracted to functions. In C++, you might have a for loop that runs some data processing over each item in a list. But the compiler has no way of knowing if those operations may be safely run in parallel -- maybe the result of one depends on the one before it. When a function like map() is used with a side-effect-free function, the compiler can know that there is no dependency between calls, and a reduce() with an associative operation can likewise be regrouped. Multiple items can thus be processed at the same time.
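A small sketch of the difference, in Python for brevity (the names are made up): the first loop carries a dependency from one iteration to the next, while the mapped function sees only its own element.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    """Pure: depends only on x, touches no shared state."""
    return x * x

data = [1, 2, 3, 4, 5]

# Loop with a carried dependency: each step needs the previous total,
# so the iterations cannot simply be run at the same time.
running = []
total = 0
for x in data:
    total += x
    running.append(total)

# map over a pure function: every call is independent of the others,
# so a library (or, in principle, a compiler) may run them in any order.
with ThreadPoolExecutor(max_workers=4) as pool:
    squares = list(pool.map(square, data))
```

Note that the safety comes from `square` being free of side effects; calling map with a function that mutates shared state would reintroduce the problem.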
I've read somewhere that functional programming is suitable to take advantage of multi-core trend in computing... I didn't really get the idea. Is it related to the lambda calculus and the von Neumann architecture?
The argument behind the belief you quoted is that purely functional programming controls side effects which makes it much easier and safer to introduce parallelism and, therefore, that purely functional programming languages should be advantageous in the context of multicore computers.
Unfortunately, this belief was long since disproven for several reasons:
The absolute performance of purely functional data structures is poor. So purely functional programming is a big initial step in the wrong direction in the context of performance (which is the sole purpose of parallel programming).
Purely functional data structures scale badly because they stress shared resources including the allocator/GC and main memory bandwidth. So parallelized purely functional programs often obtain poor speedups as the number of cores increases.
Purely functional programming renders performance unpredictable. So real purely functional programs often see performance degradation when parallelized because granularity is effectively random.
For example, the bastardized two-line quicksort often cited by the Haskell community typically runs thousands of times slower than a real in-place quicksort written in a more conventional language like F#. Moreover, although you can easily parallelize the elegant Haskell program, you are unlikely to see any performance improvement whatsoever because all of the unnecessary copying makes a single core saturate the entire main memory bandwidth of a multicore machine, rendering parallelism worthless. In fact, nobody has ever managed to write any kind of generic parallel sort in Haskell that is competitively performant. The state-of-the-art sorts provided by Haskell's standard library are typically hundreds of times slower than conventional alternatives.
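For readers who have not seen the two styles side by side, here is a sketch in Python (not Haskell or F#, and with no performance claims attached). The first version allocates new lists at every level of recursion, in the spirit of the short functional quicksort; the second partitions inside the original list using the Lomuto scheme:

```python
def quicksort_copying(xs):
    """Functional style: builds fresh lists at every level, input untouched."""
    if len(xs) <= 1:
        return list(xs)
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort_copying(smaller) + [pivot] + quicksort_copying(larger)

def quicksort_in_place(xs, lo=0, hi=None):
    """Imperative style: Lomuto partitioning, mutates xs and allocates no lists."""
    if hi is None:
        hi = len(xs) - 1
    if lo >= hi:
        return
    pivot = xs[hi]
    i = lo
    for j in range(lo, hi):
        if xs[j] < pivot:
            xs[i], xs[j] = xs[j], xs[i]
            i += 1
    xs[i], xs[hi] = xs[hi], xs[i]
    quicksort_in_place(xs, lo, i - 1)
    quicksort_in_place(xs, i + 1, hi)

data = [5, 3, 8, 1, 9, 2, 7, 3]
copied = quicksort_copying(data)   # data itself is unchanged
mutable = list(data)
quicksort_in_place(mutable)        # mutable is now sorted
```

The copying version is the one that is trivial to split across cores, and also the one that generates the extra memory traffic the answer is complaining about.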
However, the more common definition of functional programming as a style that emphasizes the use of first-class functions does actually turn out to be very useful in the context of multicore programming because this paradigm is ideal for factoring parallel programs. For example, see the new higher-order Parallel.For function from the System.Threading.Tasks namespace in .NET 4.
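The shape of such a higher-order parallel loop can be sketched in a few lines. This is Python, loosely modelled on the idea of passing the loop body as a function; `parallel_for` is a made-up helper here and not the .NET API:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_for(start, stop, body, workers=4):
    """Call body(i) for every i in [start, stop), possibly concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for completion and re-raises any exception from body.
        list(pool.map(body, range(start, stop)))

results = [0] * 10

def body(i):
    results[i] = i * i   # each iteration writes only its own slot

parallel_for(0, 10, body)
```

Because the loop body is just a first-class function, the scheduling policy lives in one reusable place and the caller only supplies the work.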
When there are no side effects the order of evaluation does not matter. It is then possible to evaluate expressions in parallel.
The basic argument is that it is difficult to automatically parallelize languages like C/C++/etc because functions can set global variables. Consider two function calls:
a = foo(b, c);
d = bar(e, f);
Though foo and bar have no arguments in common and one does not depend on the return code of the other, they nonetheless might have dependencies because foo might set a global variable (or other side effect) which bar depends upon.
Purely functional languages guarantee that foo and bar are independent: there are no globals, and no side effects. Therefore foo and bar could be safely run on different cores, automatically, without programmer intervention.
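The hidden dependency is easy to demonstrate. In this Python sketch (the function bodies are invented for illustration), the impure pair gives different answers depending on call order, while the pure pair cannot:

```python
counter = 0  # shared mutable state

def foo_impure(b, c):
    global counter
    counter += 1          # side effect that the signature does not reveal
    return b + c

def bar_impure(e, f):
    return e * f + counter  # silently depends on what foo_impure has done

def foo(b, c):
    return b + c          # pure

def bar(e, f):
    return e * f          # pure

# Impure versions: foo first, then bar.
counter = 0
a1 = foo_impure(1, 2)
d1 = bar_impure(3, 4)

# Impure versions: bar first, then foo.
counter = 0
d2 = bar_impure(3, 4)
a2 = foo_impure(1, 2)

# Pure versions give the same results in either order.
pure_a, pure_d = foo(1, 2), bar(3, 4)
```

Here `d1` and `d2` differ even though the arguments are identical, which is exactly why a compiler cannot reorder or parallelize the impure calls without more information.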
All the answers above go to the key idea that "no shared mutable storage" is a key enabler to execute pieces of a program in parallel. It does not really solve the equally hard problem of finding things to execute in parallel. But the typical clearer expressions of functionality in functional languages do make it theoretically easier to extract parallelism from a sequential expression.
In practice, I think the "no shared mutable storage" property of languages based on garbage collection and copy-on-change semantics makes them easier to add threads to. The best example is probably Erlang, which combines near-functional semantics with explicit threads.
This is a little bit of a vague question. One perk of multi-core CPUs is that you can run a functional program and let it plug away serially without worrying about affecting any computing going on that has to do with other functions the machine is carrying out.
The difference between a multi-U server and a multi-core CPU in a server or PC is the speed savings you get by having the cores on the same bus, allowing better and faster communication between the cores.
edit: I should probably qualify this post by saying that in most of the scripting I do, with or without multiple cores, I rarely see a problem in getting my data through hackish parallelizing, such as running multiple small scripts at once in my script so I'm not slowed down by things like waiting for URLs to load and what not.
double edit: Furthermore, a lot of functional programming languages have had forked parallel variants for decades. These better utilize parallel computation with some speed improvement, but they never really caught on.
Omitting any technical/scientific terms, the reason is that a functional program doesn't share data. Data is copied and transferred among functions, thus there is no shared data in the application.
And shared data is what causes half the headaches with multithreading.
The book Programming Erlang: Software for a Concurrent World by Joe Armstrong (the creator of Erlang) talks quite a bit about using Erlang for multicore(/multiprocessor) systems. As the Wikipedia article states:
Creating and managing processes is trivial in Erlang, whereas threads are considered a complicated and error-prone topic in most languages. Though all concurrency is explicit in Erlang, processes communicate using message passing instead of shared variables, which removes the need for locks.
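The message-passing style can be imitated in other languages. The sketch below uses Python threads and `queue.Queue` as stand-ins for Erlang processes and mailboxes (so it is an analogy, not Erlang semantics; the queue does its own locking internally, but the user code never takes a lock or shares a variable):

```python
import queue
import threading

def worker(inbox, outbox):
    """Receive numbers until a None sentinel arrives, then report the sum."""
    total = 0  # private to this worker; no other thread can see it
    while True:
        msg = inbox.get()
        if msg is None:
            break
        total += msg
    outbox.put(total)

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()

for n in range(1, 6):
    inbox.put(n)       # send a message
inbox.put(None)        # tell the worker to stop

t.join()
result = outbox.get()  # receive the reply
```

All state lives inside the worker, and the only way to influence it is to send it a message.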

If you already know LISP, why would you also want to learn F#?

What is the added value for learning F# when you are already familiar with LISP?
Static typing (with type inference)
Algebraic data types
Pattern matching
Extensible pattern matching with active patterns.
Currying (with a nice syntax)
Monadic programming, called 'workflows', provides a nice way to do asynchronous programming.
A lot of these are relatively recent developments in the programming language world. This is something you'll see in F# that you won't in Lisp, especially Common Lisp, because the F# standard is still under development. As a result, you'll find there is quite a bit to learn. Of course things like ADTs, pattern matching, monads and currying can be built as a library in Lisp, but it's nicer to learn how to use them in a language where they are conveniently built-in.
The biggest advantage of learning F# for real-world use is its integration with .NET.
Comparing Lisp directly to F# isn't really fair, because at the end of the day with enough time you could write the same app in either language.
However, you should learn F# for the same reasons that a C# or Java developer should learn it - because it allows functional programming on the .NET platform. I'm not 100% familiar with Lisp, but I assume it has some of the same problems as OCaml in that there isn't stellar library support. How do you do database access in Lisp? What about high-performance graphics?
If you want to learn more about 'Why .NET', check out this SO question.
If you knew F# and Lisp, you'd find this a rather strange question to ask.
As others have pointed out, Lisp is dynamically typed. More importantly, the unique feature of Lisp is that it's homoiconic: Lisp code is a fundamental Lisp data type (a list). The macro system takes advantage of that by letting you write code which executes at compile-time and modifies other code.
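The "code is data" idea can be imitated with nested lists in any language, which may help if you have not used Lisp. In this Python sketch (an illustration only; `evaluate` and `expand_square` are invented names, and real Lisp macros are far more capable), a program is a plain list, and a macro-like function rewrites that list before it is evaluated:

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def evaluate(expr):
    """Evaluate a nested-list expression such as ["+", 1, ["*", 2, 3]]."""
    if isinstance(expr, list):
        op, *args = expr
        values = [evaluate(a) for a in args]
        result = values[0]
        for v in values[1:]:
            result = OPS[op](result, v)
        return result
    return expr  # a number evaluates to itself

def expand_square(expr):
    """Macro-like rewrite: turn ["square", x] into ["*", x, x] before evaluation."""
    if isinstance(expr, list):
        expr = [expand_square(e) for e in expr]
        if expr[0] == "square":
            return ["*", expr[1], expr[1]]
    return expr

program = ["+", 1, ["square", ["+", 2, 3]]]
expanded = expand_square(program)
value = evaluate(expanded)
```

The evaluator never learns about `square`; the rewrite happens on the program's own data structure first, which is the essence of what the macro system exploits.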
F# has nothing like this - it's a statically typed language which borrows a lot of ideas from ML and Haskell, and runs on .NET.
What you are asking is akin to "Why do I need to learn to use a spoon if I know how to use a fork?"
Given that LISP is dynamically typed and F# is statically typed, I find such comparisons strange.
If I were switching from Lisp to F#, it would be solely because I had a task on my hands that hugely benefitted from some .NET-only library.
But I don't, so I'm not.
Money. F# code is already more valuable than Lisp code and this gap will widen very rapidly as F# sees widespread adoption.
In other words, you have a much better chance of earning a stable income using F# than using Lisp.
Cheers,
Jon Harrop.
F# is a very different language compared to most Lisp dialects. So F# gives you a very different angle of programming - an angle that you won't learn from Lisp. Most Lisp dialects are best used for incremental, interactive development of symbolic software. At the same time most Lisp dialects are not Functional Programming Languages, but more like multi-paradigm languages - with different dialects placing different weight on supporting FPL features (free of side effects, immutable data structures, algebraic data types, ...). Thus most Lisp dialects either lack static typing or don't put much emphasis on it.
So, if you know some Lisp dialect, then learning F# can make a lot of sense. Just don't think that much of your Lisp knowledge applies to F#, since F# is a very different language. Just as an imperative programmer used to C or Java needs to unlearn some ideas when learning Lisp, one also needs to unlearn Lisp habits (no types, side effects, macros, ...) when using F#. F# is also driven by Microsoft and takes advantage of the .NET Framework.
F# has the benefit that .NET development (in general) is very widely adopted, easily available, and more mass market.
If you want to code F#, you can get Visual Studio, which many developers will already have...as opposed to getting the LISP environment up and running.
Additionally, existing .NET developers are much more likely to look at F# than LISP, if that means anything to you.
(This is coming from a .NET developer who coded, and loved, LISP, while in college).
I'm not sure if you would? If you find F# interesting that would be a reason. If your work requires it, it would be a reason. If you think it would make you more productive or bring you added value over your current knowledge, that would be a reason.
But if you don't find F# interesting, your work doesn't require it and you don't think it would make you more productive or bring you added value, then why would you?
If the question on the other hand is what F# gives that Lisp doesn't, then type inference, pattern matching and integration with the rest of the .NET framework should be considered.
I know this thread is old but since I stumbled on this one I just wanted to comment on my reasons. I am learning F# simply for professional opportunities since .NET carries a lot of weight in a category of companies that dominate my field. The functional paradigm has been growing in use among more quantitatively and data oriented companies and I'd like to be one of the early comers to this trend. Currently there doesn't exist a strong functional language that fully and safely integrates with the .NET library. I actually attempted to port some .NET from Lisp code and it's really a pain b/c the FFI only supports C primitives and .NET interoperability requires an 'interface' construct and even though I know how to do this in C it's really a huge pain. It would be really, really good if Lisp went the extra mile in its next standard and required a C++ class (including virtual functions w/ vtables), and a C# style interface type in its FFI. Maybe even throw in a Java interface style type too. This would allow complete interoperability with the .NET library and make Lisp a strong contender as a large-scale language. However with that said, coming from a Lisp background made learning F# rather easy. And I like how F# has gone the extra mile to provide types that you would commonly see in quantitative type work. I believe F# was created with mathematical work in mind and that in itself has value over Lisp.
One way to look at this (the original question) is to match up the language (and associated tools and platforms) to the immediate task. If the task requires an overwhelming percentage of .NET code, and it would require less shoe-horning in one language than another to meet the task head-on, then take the path of least resistance (F#). If you don't need .NET capabilities, and you're comfortable working with LISP and there's no arm-bending to move away from it, keep using it.
Not really much different from comparing a hammer with a wrench. Pick the tool that fits the job most effectively. Trying to pick a tool that's objectively "best" is nonsense. And in any case, in 20 years, all of the currently "hot" languages might be outdated anyway.

Resources