Should I use F# to analyse Graphs - f#

We have a program which perform graph analysis (namely Maximum-flow problem) on several graphs.
There is also the opportunity to process these in parallel.
There is already a large C# Code base, but we are intending to rewrite a large portion of this. Would it be better to do this type of operation in F#, as opposed to say C#?
Thanks, Pete

I think that this depends largely on the composition of your team - how well do you know F#?
I feel like I am able to develop almost anything more quickly in F# than C#. In particular, highly algorithmic programs are often more concise and readable when expressed in F# thanks to type inference. However, if you don't have much experience with F# there is a significant learning curve, which means that you may be better off sticking to C# if you already know that language well.
C#'s support for doing operations in parallel is roughly equivalent to F#'s, particularly if your task doesn't require accessing any shared mutable state (which seems to be the case for your task). If I understand your problem correctly, you'd just like to run the same operation on multiple graphs at the same time, which ought to be quite easy in either language. If you were trying to parallelize the max-flow algorithm itself, then F# might be a bit easier due to its stronger support for immutable data types. Where F# really beats C# is in doing asynchronous operations, but that seems less relevant here.

Related

How to program FPGA using F#

I usually use F# for writing numerical algorithms. Functional programming constructs in F# helps to express algorithms in a very natural way. I often end up with a succinct and understandable implementation, and may be able to parallelize it quite fast if there is a chance of parallelism.
I wonder there is a way to compile F# programs down to FPGA. In this way, I can still use F# to avoid boilerplate codes in FPGA programming, and make use of high performance computing in FPGA. Is this possible to do so? If yes, could you provide some hints for me to start with?
I've read about (but never used) Avalda's F# to FPGA conversion, but their site is currently returning a completely blank page. I don't know if that's just temporary of if it means they've gone belly-up.
F# should be ideal for this task because it is derived from the ML family of languages that were bred for metaprogramming. However, I am not aware of any work in this area (although I have had the idea of working on it myself).
I would focus on writing a compiler in F# that compiled a DSL to an FPGA, rather than trying to compile general F# code.
Here's a list for HLS tools using C. My experience with one of them in 2006 was not favourable but I expect them to be much better today.
Regarding F#, I doubt this will exist any time soon.

Why is F#'s type inference so fickle?

The F# compiler appears to perform type inference in a (fairly) strict top-to-bottom, left-to-right fashion. This means you must do things like put all definitions before their use, order of file compilation is significant, and you tend to need to rearrange stuff (via |> or what have you) to avoid having explicit type annotations.
How hard is it to make this more flexible, and is that planned for a future version of F#? Obviously it can be done, since Haskell (for example) has no such limitations with equally powerful inference. Is there anything inherently different about the design or ideology of F# that is causing this?
Regarding "Haskell's equally powerful inference", I don't think Haskell has to deal with
OO-style dynamic subtyping (type classes can do some similar stuff, but type classes are easier to type/infer)
method overloading (type classes can do some similar stuff, but type classes are easier to type/infer)
That is, I think F# has to deal with some hard stuff that Haskell does not. (Almost certainly, Haskell has to deal with some hard stuff that F# does not.)
As is mentioned by other answers, most of the major .NET languages have the Visual Studio tooling as a major language design influence (see e.g. how LINQ has "from ... select" rather than the SQL-y "select... from", motivated by getting intellisense from a program prefix). Intellisense, error squiggles, and error-message comprehensibility are all tooling factors that inform the F# design.
It may well be possible to do better and infer more (without sacrificing on other experiences), but I don't think it's among our high priorities for future versions of the language. (The Haskellers may see F# type inference as somewhat weak, but they are probably outnumbered by the C#ers who see F# type inference as very strong. :) )
It might also be hard to extend the type inference in a non-breaking fashion; it is ok to change illegal programs into legal ones in a future version, but you have to be very careful to ensure previously-legal programs do not change semantics under new inference rules, and name resolution (an awful nightmare in every language) is likely to interact with type-inference-changes in surprising ways.
I think that the algorithm used by F# has the benefit that it is easy to (at least roughly) explain how it works, so once you understand it, you can have some expectations about the result.
The algorithm will always have some limitations. Currently, it is quite easy to understand them. For more complicated algorithms, this could be difficult. For example, I think you could run into situations where you think that the algorithm should be able to deduce something - but if it was general enough to cover the case, it would be non-decidable (e.g. could keep looping forever).
Another thought on this is that checking the code from the top to the bottom corresponds to how we read code (at least sometimes). So, maybe the fact that we tend to write the code in a way that enables type-inference also makes the code more readable for people...
F# uses one pass compilation such that
you can only reference types or
functions which have been defined
either earlier in the file you're
currently in or appear in a file which
is specified earlier in the
compilation order.
I recently asked Don Syme about making
multiple source passes to improve the
type inference process. His reply was
"Yes, it’s possible to do multi-pass
type inference. There are also
single-pass variations that generate a
finite set of constraints.
However these approaches tend to give
bad error messages and poor
intellisense results in a visual
editor."
http://www.markhneedham.com/blog/2009/05/02/f-stuff-i-get-confused-about/#comment-16153
The short answer is that F# is based on the tradition of SML and OCaml, whereas Haskell comes from a slightly different world of Miranda, Gofer, and the like. The differences in historical tradition are subtle, but pervasive. This distinction is paralleled in other modern languages too, such as the ML-like Coq which has the same ordering restrictions vs the Haskell-like Agda which doesn't.
This difference is related to lazy vs strict evaluation. The Haskell side of the universe believes in laziness, and once you already believe in laziness the idea of adding laziness to things like type inference is a no-brainer. Whereas in the ML side of the universe whenever laziness or mutual recursion is necessary it must be explicitly noted by the use of keywords like with, and, rec, etc. I prefer the Haskell approach because it results in less boilerplate code, but there are a lot of folks who think it's better to make these things explicit.

Is it possible that F# will be optimized more than other .Net languages in the future?

Is it possible that Microsoft will be able to make F# programs, either at VM execution time, or more likely at compile time, detect that a program was built with a functional language and automatically parallelize it better?
Right now I believe there is no such effort to try and execute a program that was built as single threaded program as a multi threaded program automatically.
That is to say, the developer would code a single threaded program. And the compiler would spit out a compiled program that is multi-threaded complete with mutexes and synchronization where needed.
Would these optimizations be visible in task manager in the process thread count, or would it be lower level than that?
I think this is unlikely in the near future. And if it does happen, I think it would be more likely at the IL level (assembly rewriting) rather than language level (e.g. something specific to F#/compiler). It's an interesting question, and I expect that some fine minds have been looking at this and will continue to look at this for a while, but in the near-term, I think the focus will be on making it easier for humans to direct the threading/parallelization of programs, rather than just having it all happen as if by magic.
(Language features like F# async workflows, and libraries like the task-parallel library and others, are good examples of near-term progress here; they can do most of the heavy lifting for you, especially when your program is more declarative than imperative, but they still require the programmer to opt-in, do analysis for correctness/meaningfulness, and probably make slight alterations to the structure of the code to make it all work.)
Anyway, that's all speculation; who can say what the future will bring? I look forward to finding out (and hopefully making some of it happen). :)
Being that F# is derived from Ocaml and Ocaml compilers can optimize your programs far better than other compilers, it probably could be done.
I don't believe it is possible to autovectorize code in a generally-useful way and the functional programming facet of F# is essentially irrelevant in this context.
The hardest problem is not detecting when you can perform subcomputations in parallel, it is determining when that will not degrade performance, i.e. when the subtasks will take sufficiently long to compute that it is worth taking the performance hit of a parallel spawn.
We have researched this in detail in the context of scientific computing and we have adopted a hybrid approach in our F# for Numerics library. Our parallel algorithms, built upon Microsoft's Task Parallel Library, require an additional parameter that is a function giving the estimated computational complexity of a subtask. This allows our implementation to avoid excessive subdivision and ensure optimal performance. Moreover, this solution is ideal for the F# programming language because the function parameter describing the complexity is typically an anonymous first-class function.
Cheers,
Jon Harrop.
I think the question misses the point of the .NET architecture-- F#, C# and VB (etc.) all get compiled to IL, which then gets compiled to machine code via the JIT compiler. The fact that a program was written in a functional language isn't relevant-- if there are optimizations (like tail recursion, etc.) available to the JIT compiler from the IL, the compiler should take advantage of it.
Naturally, this doesn't mean that writing functional code is irrelevant-- obviously, there are ways to write IL which will parallelize better-- but many of these techniques could be used in any .NET language.
So, there's no need to flag the IL as coming from F# in order to examine it for potential parallelism, nor would such a thing be desirable.
There's active research for autoparallelization and auto vectorization for a variety of languages. And one could hope (since I really like F#) that they would concive a way to determine if a "pure" side-effect free subset was used and then parallelize that.
Also since Simon Peyton-Jones the father of Haskell is working at Microsoft I have a hard time not beliving there's some fantastic stuff comming.
It's possible but unlikely. Microsoft spends most of it's time supporting and implementing features requested by their biggest clients. That usually means C#, VB.Net, and C++ (not necessarily in that order). F# doesn't seem like it's high on the list of priorities.
Microsoft is currently developing 2 avenues for parallelisation of code: PLINQ (Pararllel Linq, which owes much to functional languages) and the Task Parallel Library (TPL) which was originally part of Robotics Studio. A beta of PLINQ is available here.
I would put my money on PLINQ becoming the norm for auto-parallelisation of .NET code.

Functional programming and multicore architecture

I've read somewhere that functional programming is suitable to take advantage of multi-core trend in computing. I didn't really get the idea. Is it related to the lambda calculus and von neumann architecture?
Functional programming minimizes or eliminates side effects and thus is better suited to distributed programming. i.e. multicore processing.
In other words, lots of pieces of the puzzle can be solved independently on separate cores simultaneously without having to worry about one operation affecting another nearly as much as you would in other programming styles.
One of the hardest things about dealing with parallel processing is locking data structures to prevent corruption. If two threads were to mutate a data structure at once without having it locked perfectly, anything from invalid data to a deadlock could result.
In contrast, functional programming languages tend to emphasize immutable data. Any state is kept separate from the logic, and once a data structure is created it cannot be modified. The need for locking is greatly reduced.
Another benefit is that some processes that parallelize very easily, like iteration, are abstracted to functions. In C++, You might have a for loop that runs some data processing over each item in a list. But the compiler has no way of knowing if those operations may be safely run in parallel -- maybe the result of one depends on the one before it. When a function like map() or reduce() is used, the compiler can know that there is no dependency between calls. Multiple items can thus be processed at the same time.
I've read somewhere that functional programming is suitable to take advantage of multi-core trend in computing... I didn't really get the idea. Is it related to the lambda calculus and von neumann architecture?
The argument behind the belief you quoted is that purely functional programming controls side effects which makes it much easier and safer to introduce parallelism and, therefore, that purely functional programming languages should be advantageous in the context of multicore computers.
Unfortunately, this belief was long since disproven for several reasons:
The absolute performance of purely functional data structures is poor. So purely functional programming is a big initial step in the wrong direction in the context of performance (which is the sole purpose of parallel programming).
Purely functional data structures scale badly because they stress shared resources including the allocator/GC and main memory bandwidth. So parallelized purely functional programs often obtain poor speedups as the number of cores increases.
Purely functional programming renders performance unpredictable. So real purely functional programs often see performance degradation when parallelized because granularity is effectively random.
For example, the bastardized two-line quicksort often cited by the Haskell community typically runs thousands of times slower than a real in-place quicksort written in a more conventional language like F#. Moreover, although you can easily parallelize the elegant Haskell program, you are unlikely to see any performance improvement whatsoever because all of the unnecessary copying makes a single core saturate the entire main memory bandwidth of a multicore machine, rendering parallelism worthless. In fact, nobody has ever managed to write any kind of generic parallel sort in Haskell that is competitively performant. The state-of-the-art sorts provided by Haskell's standard library are typically hundreds of times slower than conventional alternatives.
However, the more common definition of functional programming as a style that emphasizes the use of first-class functions does actually turn out to be very useful in the context of multicore programming because this paradigm is ideal for factoring parallel programs. For example, see the new higher-order Parallel.For function from the System.Threading.Tasks namespace in .NET 4.
When there are no side effects the order of evaluation does not matter. It is then possible to evaluate expressions in parallel.
The basic argument is that it is difficult to automatically parallelize languages like C/C++/etc because functions can set global variables. Consider two function calls:
a = foo(b, c);
d = bar(e, f);
Though foo and bar have no arguments in common and one does not depend on the return code of the other, they nonetheless might have dependencies because foo might set a global variable (or other side effect) which bar depends upon.
Functional languages guarantee that foo and bar are independant: there are no globals, and no side effects. Therefore foo and bar could be safely run on different cores, automatically, without programmer intervention.
All the answers above go to the key idea that "no shared mutable storage" is a key enabler to execute pieces of a program in parallel. It does not really solve the equally hard problem of finding things to execute in parallel. But the typical clearer expressions of functionality in functional languages do make it theoretically easier to extract parallelism from a sequential expression.
In practice, I think the "no shared mutable storage" property of languages based on garbage collection and copy-on-change semantics make them easier to add threads to. The best example is probably Erlang, that combines near-functional semantics with explicit threads.
This is a little bit of a vague question. One perk of multi-core CPUs is that you can run a functional program and let it plug away serially without worrying about affecting any computing going on that has to do with other functions the machine is carrying out.
The difference between a multi-U server and a multi-core CPU in a server or PC is the speed savings you get by having it on the same BUS, allowing better and faster communication to the cores.
edit: I should probably qualify this post by saying that in most of the scripting I do, with or without multiple cores, I rarely see a problem in getting my data through hackish parallelizing, such as running multiple small scripts at once in my script so I'm not slowed down by things like waiting for URLs to load and what not.
double edit: Furthermore, a lot of functional programming languages have had forked parallel variants for decades. These better utilize parallel computation with some speed improvement, but they never really caught on.
Omitting any technical/scientific terms the reason is because functional program doesn't share data. Data is copied and transfered among functions, thus there is no shared data in the application.
And shared data is what causes half the headaches with multithreading.
The book Programming Erlang: Software for a Concurrent World by Joe Armstrong (the creator of Erlang) talks quite a bit about using Erlang for multicore(/multiprocessor) systems. As the wikipedia article states:
Creating and managing processes is trivial in Erlang, whereas threads are considered a complicated and error-prone topic in most languages. Though all concurrency is explicit in Erlang, processes communicate using message passing instead of shared variables, which removes the need for locks.

If you already know LISP, why would you also want to learn F#?

What is the added value for learning F# when you are already familiar with LISP?
Static typing (with type inference)
Algebraic data types
Pattern matching
Extensible pattern matching with active patterns.
Currying (with a nice syntax)
Monadic programming, called 'workflows', provides a nice way to do asynchronous programming.
A lot of these are relatively recent developments in the programming language world. This is something you'll see in F# that you won't in Lisp, especially Common Lisp, because the F# standard is still under development. As a result, you'll find there is a quite a bit to learn. Of course things like ADTs, pattern matching, monads and currying can be built as a library in Lisp, but it's nicer to learn how to use them in a language where they are conveniently built-in.
The biggest advantage of learning F# for real-world use is its integration with .NET.
Comparing Lisp directly to F# isn't really fair, because at the end of the day with enough time you could write the same app in either language.
However, you should learn F# for the same reasons that a C# or Java developer should learn it - because it allows functional programming on the .NET platform. I'm not 100% familiar with Lisp, but I assume it has some of the same problems as OCaml in that there isn't stellar library support. How do you do Database access in Lisp? What about high-performance graphics?
If you want to learn more about 'Why .NET', check out this SO question.
If you knew F# and Lisp, you'd find this a rather strange question to ask.
As others have pointed out, Lisp is dynamically typed. More importantly, the unique feature of Lisp is that it's homoiconic: Lisp code is a fundamental Lisp data type (a list). The macro system takes advantage of that by letting you write code which executes at compile-time and modifies other code.
F# has nothing like this - it's a statically typed language which borrows a lot of ideas from ML and Haskell, and runs it on .NET
What you are asking is akin to "Why do I need to learn to use a spoon if I know how to use a fork?"
Given that LISP is dynamically typed and F# is statically typed, I find such comparisons strange.
If I were switching from Lisp to F#, it would be solely because I had a task on my hands that hugely benefitted from some .NET-only library.
But I don't, so I'm not.
Money. F# code is already more valuable than Lisp code and this gap will widen very rapidly as F# sees widespread adoption.
In other words, you have a much better chance of earning a stable income using F# than using Lisp.
Cheers,
Jon Harrop.
F# is a very different language compared to most Lisp dialects. So F# gives you a very different angle of programming - an angle that you won't learn from Lisp. Most Lisp dialects are best used for incremental, interactive development of symbolic software. At the same time most Lisp dialects are not Functional Programming Languages, but more like multi-paradigm languages - with different dialects placing different weight on supporting FPL features (free of side effects, immutable data structures, algebraic data types, ...). Thus most Lisp dialects either lack static typing or don't put much emphasis on it.
So, if you know some Lisp dialect, then learning F# can make a lot of sense. Just don't think that much of your Lisp knowledge applies to F#, since F# is a very different language. As much as an imperative programming used to C or Java needs to unlearn some ideas when learning Lisp, one also needs to unlearn Lisp habits (no types, side effects, macros, ...) when using F#. F# is also driven by Microsoft and taking advantage of the .net framework.
F# has the benefit that .NET development (in general) is very widely adopted, easily available, and more mass market.
If you want to code F#, you can get Visual Studio, which many developers will already have...as opposed to getting the LISP environment up and running.
Additionally, existing .NET developers are much more likely to look at F# than LISP, if that means anything to you.
(This is coming from a .NET developer who coded, and loved, LISP, while in college).
I'm not sure if you would? If you find F# interesting that would be a reason. If you work requires it, it would be a reason. If you think it would make you more productive or bring you added value over your current knowledge, that would be a reason.
But if you don't find F# interesting, your work doesn't require it and you don't think it would make you more productive or bring you added value, then why would you?
If the question on the other hand is what F# gives that lisp don't, then type inference, pattern matching and integration with the rest of the .NET framework should be considered.
I know this thread is old but since I stumbled on this one I just wanted to comment on my reasons. I am learning F# simply for professional opportunities since .NET carries a lot of weight in a category of companies that dominate my field. The functional paradigm has been growing in use among more quantitatively and data oriented companies and I'd like to be one of the early comers to this trend. Currently there doesn't an exist a strong functional language that fully and safely integrates with the .NET library. I actually attempted to port some .NET from Lisp code and it's really a pain b/c the FFI only supports C primitives and .NET interoperability requires an 'interface' construct and even though I know how to do this in C it's really a huge pain. It would be really, really, good if Lisp went the extra mile in it's next standard and required a c++ class (including virtual functions w/ vtables), and a C# style interface type in it's FFI. Maybe even throw in a Java interface style type too. This would allow complete interoperability with the .NET library and make Lisp a strong contender as a large-scale language. However with that said, coming from a Lisp background made learning F# rather easy. And I like how F# has gone the extra mile to provide types that you would commonly see it quantitative type work. I believe F# was created with mathematical work in mind and that in itself has value over Lisp.
One way to look at this (the original question) is to match up the language (and associated tools and platforms) to the immediate task. If the task requires an overwhelming percentage of .NET code, and it would require less shoe-horning in one language than another to meet the task head-on, then take the path of least resistance (F#). If you don't need .NET capabilities, and you're comfortable working with LISP and there's no arm-bending to move away from it, keep using it.
Not really much different from comparing a hammer with a wrench. Pick the tool that fits the job most effectively. Trying to pick a tool that's objectively "best" is nonsense. And in any case, in 20 years, all of the currently "hot" languages might be outdated anyway.

Resources