Are there any Distributed Hash Table implementations in Erlang? I searched the web and found a number of research papers but I did not find a mature implementation.
Riak is quite mature and mostly implemented in Erlang. It's a bit more than just a DHT.
Redis is an incredibly fast implementation I have used with success before, and there is a great Erlang binding for it: http://streamhacker.com/2009/12/21/erldis-erlang-redis-client/
I'm starting to dive into Erlang for the first time, and OTP is held aloft by lovers and critics alike as being the gold standard for highly available, distributed processing.
Given that OTP has been around for decades and is openly documented, why is it that other languages supporting lightweight threads/processes haven't adopted versions of their own? Are there technical/political challenges? Or does everyone just shrug and learn Erlang?
Thanks!
The largest issue is that most language runtimes don't have built-in lightweight concurrency and error isolation with exit signal propagation. Without those things you would have a really hard time properly porting OTP.
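To make "error isolation with exit signal propagation" a bit more concrete, here is a toy model of Erlang-style links, written in Python only for illustration. The names Proc, link and exit_proc are made up for this sketch, and it ignores real concurrency entirely; it only shows the rule that an abnormal exit kills linked processes unless they trap exits, in which case they get a message instead.

```python
# Toy, single-threaded model of Erlang-style links (hypothetical names,
# not a real library). It models only the propagation rule, not scheduling.

class Proc:
    def __init__(self, name, trap_exit=False):
        self.name = name
        self.trap_exit = trap_exit  # True: exit signals become mailbox messages
        self.alive = True
        self.links = set()
        self.mailbox = []


def link(a, b):
    """Links are bidirectional, as in Erlang."""
    a.links.add(b)
    b.links.add(a)


def exit_proc(proc, reason):
    """Terminate proc and send an exit signal to every linked process."""
    if not proc.alive:
        return
    proc.alive = False
    for other in list(proc.links):
        other.links.discard(proc)
        if not other.alive:
            continue
        if other.trap_exit:
            # A trapping process survives and is told what happened.
            other.mailbox.append(("EXIT", proc.name, reason))
        elif reason != "normal":
            # A non-trapping process dies with the same reason.
            exit_proc(other, reason)
```

A supervisor in this model is just a process created with trap_exit=True that reads the ("EXIT", ...) messages and decides whether to start a replacement. A runtime that cannot deliver this kind of signal reliably makes that pattern hard to port.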
For the languages that do have the right kind of runtime, I am seeing some effort or at least plans to build OTP inspired frameworks. Cloud Haskell is the first that comes to mind. I also expect that Go and Rust will eventually have something like OTP if they don't already.
There are technical challenges, as Erlang itself is designed for the same features OTP is known for. Case in point, Basho Riak is a distributed fault-tolerant key/value store written in Erlang. One might be able to port it to Haskell or some similar functional language, but it would probably be a lot of work. Just for fun, you might look into OTP stuff written in the Elixir language.
Actually, it has been (tried).
Akka is the library which takes some OTP features and implements them in Scala for JVM.
Given the principles underlying JVM and BEAM (the Erlang VM) are very different (mainly GC, scheduling and message passing are radically different), I can't say how successful that implementation is and how many benefits of the original OTP it preserves. There's a lot of (heated) debate on that in the internets.
I want to write actor-style code for embedded processors and I am trying to decide between writing everything in Erlang vs writing everything in zeromq + any language. Using zeromq looks to be very powerful in the sense that I can use any programming language and make my development a lot easier (many available libs), but I am not sure if there is any gotcha in this power. I understand that Erlang represents the actor model much better, especially with OTP concepts, but it seems easy to represent a similar actor model with zeromq. Am I looking at this correctly?
1. What do I really lose by not using Erlang for embedded applications (where distributed processing, a strong point of Erlang, is NOT required) and just building things on top of a generic messaging framework like zeromq?
2. Is Erlang offering more than a coordinated messaging framework for a non-distributed embedded application?
3. What specific capabilities of Erlang would take too long to implement with zeromq?
You're comparing apples and oranges. Part of the advantage of using Erlang is the language; if you're going to put it up against zmq + some other language, the other language in that comparison really matters. zmq + ARM assembly? Erlang brings all the wonderful advantages of not hand-coding ASM.
As for what else Erlang brings to the table, "Embedded Erlang? Absolutely" argues that Erlang has advantages in fault tolerance, hot code loading, rapid development by leveraging Erlang and OTP, easy interaction with C libraries, and simple debugging via a live REPL and copy-paste of terms.
Some of those things, such as hot reload, on-device REPL, and established libraries, will definitely take some real hacking to reproduce from the ground up.
My point would be that you will have to work very hard to get the same kind of error handling in Zmq. Erlang has some really nice built-in error handling for when things begin to go bad. Considerable time has been spent in Erlang optimizing that part and making it robust.
Zmq on the other hand, is probably faster in some combinations with some languages when you make simple benchmarks. There is less overhead, so it may process messages faster than what Erlang can provide.
But chances are that you will end up re-implementing large parts of Erlang in the language of your choice. And you will probably not do as good a job as 6-10 developers working on Erlang/OTP for 15 years.
On the other hand, Erlang is not a simple language to learn. There is way more to it than just learning how to program in a functional style. Especially the concurrency patterns and failure handling can take some time getting used to.
ZeroMQ =/= Erlang covers many differences. The claim there is that ZeroMQ only provides the messaging aspect, not the light-weight processes, process monitoring and other aspects.
I've skimmed Programming in Lua, I've looked at the Lua Reference.
However, they both tell me what a function does, but not how.
When reading SICP, I got this feeling of: "ah, here's the computational model underlying Scheme"; I'm trying to get the same sense concerning Lua -- i.e. a concise description of its VM, a "how" rather than a "what".
Does anyone know of a good document (besides the C source) describing this?
You might want to read the No-Frills Intro to Lua 5(.1) VM Instructions (pick a link, click on the Docs tab, choose English -> Go).
I don't remember exactly where I've seen it, but I remember reading that Lua's authors specifically discourage end-users from getting into too much detail on the VM; I think they want it to be as much of an implementation detail as possible.
Besides the already mentioned A No-Frills Introduction to Lua 5.1 VM Instructions, you may be interested in this excellent post by Mike Pall on how to read the Lua source.
Also see the related Lua-Users Wiki page.
See http://www.lua.org/source/5.1/lopcodes.h.html . The list starts at OP_MOVE.
The computational model underlying Lua is pretty much the same as the computational model underlying Scheme, except that the central data structure is not the cons cell; it's the mutable hash table. (At least until you get into metaprogramming with metatables.) Otherwise all the familiar stuff is there: nested first-class functions with mutable local variables (let-bound variables in Scheme), and so on.
It's not clear to me that you'd get much from a study of the VM. I did some hacking on the VM a while back and it's a lot like any other register-oriented VM, although maybe a bit cleaner. Only a handful of instructions are Lua-specific.
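For anyone who has not seen a register-oriented VM before, here is a minimal sketch of the idea, in Python for brevity. The opcode names MOVE, LOADK, ADD and RETURN are borrowed from lopcodes.h, but the tuple encoding, the run function and the single-value RETURN are simplifications I made up for illustration; this is not how lvm.c encodes or dispatches instructions.

```python
# Toy register machine (a sketch, not Lua's actual instruction format).
# Each instruction is a tuple (op, a, b, c) whose operands index registers
# or the constant table directly, instead of pushing and popping a stack.

def run(code, constants, nregs=8):
    regs = [None] * nregs
    pc = 0
    while True:
        op, a, b, c = code[pc]
        pc += 1
        if op == "LOADK":      # R[a] := K[b]
            regs[a] = constants[b]
        elif op == "MOVE":     # R[a] := R[b]
            regs[a] = regs[b]
        elif op == "ADD":      # R[a] := R[b] + R[c]
            regs[a] = regs[b] + regs[c]
        elif op == "RETURN":   # simplified: return the single value R[a]
            return regs[a]
        else:
            raise ValueError("unknown opcode: %r" % (op,))


# Roughly what "local x = 2; local y = 40; return x + y" might compile to.
program = [
    ("LOADK", 0, 0, 0),
    ("LOADK", 1, 1, 0),
    ("ADD", 2, 0, 1),
    ("RETURN", 2, 0, 0),
]
```

The point is only that locals live in numbered registers and one instruction names its operands and its destination, which is why the instruction list stays short.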
If you're curious about the metatables, the semantics is described clearly, if somewhat verbosely, in Section 2.8 of the reference manual for Lua 5.1. If you look at the VM code in src/lvm.c you'll see almost exactly that logic implemented in C (e.g., the internal Arith function). The VM instructions are specialized for the common cases, but it's all terribly straightforward; nothing clever is involved.
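As a rough paraphrase of that logic for the "add" event, here is a sketch in Python. LuaTable, metamethod, get_bin_handler and lua_add are names I chose for illustration, and the sketch leaves out the string-to-number coercion that the manual also describes.

```python
# Sketch of the "add" event from the Lua 5.1 manual, Section 2.8.
# Hypothetical names; string-to-number coercion is deliberately omitted.

class LuaTable:
    def __init__(self):
        self.fields = {}
        self.metatable = None  # another LuaTable, or None


def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def metamethod(value, event):
    """Look up event (e.g. "__add") in the value's metatable, if it has one."""
    mt = getattr(value, "metatable", None)
    if mt is None:
        return None
    return mt.fields.get(event)


def get_bin_handler(op1, op2, event):
    """Try the first operand's handler, then the second operand's."""
    return metamethod(op1, event) or metamethod(op2, event)


def lua_add(op1, op2):
    if is_number(op1) and is_number(op2):
        return op1 + op2                  # the common case: primitive add
    handler = get_bin_handler(op1, op2, "__add")
    if handler is not None:
        return handler(op1, op2)          # call the metamethod with both operands
    raise TypeError("attempt to perform arithmetic on a non-number value")
```

The fast path for two numbers followed by a fallback to a handler lookup is the shape you will recognize when reading the C code.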
For years I've been wanting a more formal specification of Lua's computational model, but my tastes run more toward formal semantics...
I've found The Implementation of Lua 5.1 very useful for understanding what Lua is actually doing.
It explains the hashing techniques, garbage collection and some other bits and pieces.
Another great paper is The Implementation of Lua 5.0, which describes the design and motivations of various key systems in the VM. I found that reading it was a great way to parse and understand what I was seeing in the C code.
I am surprised you refer to the C source for the VM, as this is protected by lua.org and Tecgraf/PUC-Rio in Brazil, especially as the language is used for real business and commercial applications in a number of countries. The paper about The Implementation of Lua contains as much detail about the VM as it is permitted to include, but the structure of the VM is proprietary. It is worth noting that versions 5.0 and 5' were commissioned by IBM in Europe for use on customer mainframes, and their register-based versions have a VM which accepts the IBM-defined format of intermediate instructions.
Hi I'd like to pick up one FP language (it's always a pain when you work in a position that does not require you learn much), and after doing some research, I felt Erlang and OCaml are the two that I'd really like to get my feet wet for the following reasons:
1) I work mainly on a high-availability web server back-end system in C++. I heard Erlang is a great fit for scalability and fault tolerance. Though I don't think my current company will have any project in Erlang, I feel Erlang may be a good language for my long-term career development.
2) I have a co-worker who is really good at OCaml, I mean he is really good at it (but he does not work on that for his daily work now; he maintains several libraries). So I figured that he may be a good resource if I learn OCaml.
My interests are mainly in distributed systems (my current work is some middleware development) and high-performance computing (guess what, I had a couple of years of graduate school research on it, in particular PDEs in financial applications -- so I always felt I may go back to do some finance modeling work sometime later).
Any suggestions? Please don't suggest "learn both", as I am not that smart :-)
Thanks
Ocaml is a great language -- one of my favorites -- but if your interest is distributed systems then I'd recommend going with Erlang, which is head and shoulders ahead of the other FP languages with regard to distributed systems (although there's an offshoot of Ocaml called Jocaml which has some interesting aspects).
Ocaml is weaker even when just looking at parallelism, given its underlying architecture. Both Haskell and Clojure have better stories here, imho. (That said: once you get one FP language, you'll be able to carry the fundamental principles to other languages pretty easily, and they might be useful in the future. Both Scala and Clojure could easily sneak their way into organizations by virtue of the JVM.)
I think Ocaml is a great way to get started in FP, and Erlang is not very difficult once you have the basic FP concepts down.
But the suggestion from 'aneccodeal' is fantastic-- i.e., if you are interested in Ocaml and have a friend who is already strong in it, by all means develop a concurrency (perhaps MPI) library for it.
Keep in mind, however, that one of the barriers to making Ocaml concurrent is the lack of concurrent garbage collection (or so I have read).
If you have a co-worker who is "really good" with OCaml then it sounds like you have a great resource assuming that s/he is willing to answer your questions. It's always easier to learn when there's someone knowledgeable around that you can ask questions of if you get stuck.
Yes, it's true that OCaml doesn't have the best story when it comes to parallelism, but there are ways to get parallelism in OCaml (fork-based seems to be the most common - check out prelude.ml, which includes things like parallel map: http://github.com/kig/preludeml/tree/master ). Also, it seems that Erlang's Actor-based concurrency is really fairly easy to duplicate in other languages. Maybe you and your co-worker could work on a project to develop an Actor-based concurrency library for OCaml? That would give you a nice learning project that your co-worker would probably find interesting enough to work on with you... in addition, you could end up creating something useful for the entire OCaml community.
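To show what "fork-based" parallel map means in general, here is a minimal sketch in Python using only os.fork, pipes and pickle. It is not how prelude.ml does it (I have not checked its implementation), the parallel_map name and workers parameter are my own, and it assumes a Unix-like system where fork is available.

```python
# Minimal fork-based parallel map (a sketch; Unix-only, hypothetical API).
# Each child computes one slice of the input and sends the results back
# to the parent over a pipe, so no memory is shared after the fork.
import os
import pickle


def parallel_map(func, items, workers=4):
    items = list(items)
    workers = max(1, min(workers, len(items)))
    children = []
    for w in range(workers):
        chunk = items[w::workers]          # every workers-th item, starting at w
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:                       # child process
            os.close(read_fd)
            status = 1
            try:
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump([func(x) for x in chunk], out)
                status = 0
            finally:
                os._exit(status)           # never return into the parent's code
        os.close(write_fd)                 # parent keeps only the read end
        children.append((pid, read_fd))

    results = [None] * len(items)
    for w, (pid, read_fd) in enumerate(children):
        with os.fdopen(read_fd, "rb") as inp:
            data = inp.read()              # read until the child closes its end
        _, wait_status = os.waitpid(pid, 0)
        if wait_status != 0:
            raise RuntimeError("worker %d failed" % w)
        results[w::workers] = pickle.loads(data)
    return results
```

Because each worker is a separate OS process with its own heap, this sidesteps the garbage collector limitation, at the cost of copying results between processes.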
I would also consider looking at F# (especially when VS 2010 is out).
Learning a new language is a lot easier and more convenient with a nice IDE.
F# and OCaml are very similar as you can see in other SO threads (e.g. here)
I've just started reading Joe Armstrong's book on Erlang and listened to his excellent talk on Software Engineering Radio.
It's an interesting language/system and one whose time seems to have come around with the advent of multi-core machines.
My question is: what is there to stop it being ported to the JVM or CLR? I realise that both virtual machines aren't set up to run the lightweight processes that Erlang calls for - but couldn't these be simulated by threads? Could we see a lightweight or cut-down version of Erlang on a non-Erlang VM?
You could not use JVM/CLR libraries, given their reliance on mutable objects.
Erlang exception handling is quite different from JVM and CLR exceptions; you would need to handle this somehow.
Implementing processes as threads would mean that any sizable Erlang system runs out of memory pretty fast (process size on my machine on creation: 1268 bytes, thread stack size in CLR: 1 MB) and communication between processes is much slower than in Erlang.
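To put rough numbers on that, here is the arithmetic as a small Python sketch. The two sizes are the ones quoted above; the count of 100,000 concurrent processes is just an assumption I picked for illustration, and the thread figure is reserved stack space, not necessarily committed memory.

```python
# Back-of-the-envelope memory comparison using the figures quoted above.
ERLANG_PROCESS_BYTES = 1268            # process size on creation (from the answer)
CLR_THREAD_STACK_BYTES = 1024 * 1024   # 1 MB thread stack (from the answer)

MIB = 1024 * 1024


def footprint_mib(count, bytes_each):
    """Total footprint in MiB for count units of bytes_each bytes."""
    return count * bytes_each / MIB


def report(count):
    return {
        "processes_mib": footprint_mib(count, ERLANG_PROCESS_BYTES),
        "threads_mib": footprint_mib(count, CLR_THREAD_STACK_BYTES),
        "ratio": CLR_THREAD_STACK_BYTES / ERLANG_PROCESS_BYTES,
    }


# Assumed workload: 100,000 concurrent processes.
summary = report(100_000)
```

With these inputs the processes come to about 121 MiB, while the same number of threads would reserve 100,000 MiB (roughly 98 GiB) of stack, a factor of more than 800.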
What you probably want is an Actor Model implementation on JVM or CLR.
Scala and Clojure have already been mentioned. In addition, there are many Actor implementations for JVM:
Kilim, Functional Java, Jetlang, Actors Guild, ActorFoundry, and at least one for CLR: Retlang, which can be used from any JVM/CLR language.
For educational reasons, we are implementing a subset of the Erlang VM for the CLR. We were highly inspired by Kresten Krab Thorup and his project Erjang, a JVM-based Erlang VM. Erjang uses the Kilim framework to represent lightweight processes, and it is starting to attract attention.
Javalimit - Erjang's author blog.
Erjang repository
This is a well-trodden discussion. Some context might be useful.
From the Erlang mailing list last November:
The start of a long discussion thread
continuing here
and going a bit mental here
and ending on Joe's contribution.
My contribution to the debate about Erlang on the JVM? No, not a good idea :(
Nothing at all, actually. You might have a look at Clojure, which is an interesting functional language built on the JVM.
Axum -- an incubation project on the CLR -- was clearly inspired by Erlang.
Erjang is a virtual machine for Erlang, which runs on Java™.
I don't know of any technical problem inhibiting this.
Actually Scala (a JVM functional language) uses what is called an Actor Model that is very similar to, and as I understand it borrows heavily from, the Erlang model of shared-nothing concurrency.
Threads could not simulate Erlang processes. They're much too heavy-weight.
Just for completeness, here is an additional source on the topic.
Possible? Yes. Practical? Well, probably not; they solve different problems in very different ways, and thus have lots of major differences in the way they do things. This would make porting hard, and performance would likely suffer severely. That doesn't mean it can't be done, just that there are better ways to accomplish what such a port would bring to the table.