Erlang Hot Code Loading not widely used? - erlang

I just saw this 2012 video from LinuxConf.au about Erlang in production.
There's a part on the video where Bernard says no big Erlang projects use Hot Code Loading apart from Ericsson, because it's really hard to guarantee things will work. It's around minute 29.
Is that still true? Are there tools to help test a hot code load or or make it easier nowadays?

This is not true. Every Erlang user uses hot code loading to his advantage in one way or another -- whether it is for development, testing, troubleshooting, one-off fixes, or full scale deployments. This is one of the major Erlang advantages. Rather unique too.
For example, WhatsApp, one of the biggest Erlang users, relies on hot code loading for almost all code pushes.
I have personally worked with hot code loading in scenarios where each change was well understood and often performed by the same person who made the change. It works extremely well and good engineers don't have any problems doing this. Speaking of tools, loading modules one by one from Erlang shell using l(...). or all at once using l(). (see here) works just fine. Some prefer release-based tools like relx.
Others, like Ericsson, use enterprise-style deployments with hot code loading after rigorous testing of clear-cut releases and patches. The goal here is to upgrade without using spare capacity and special procedures for draining and shifting load. Operationally this may be simpler and more efficient than restarts, but testing can be more expensive.

It is difficult to know whether it is a feature widely or scarcely used. Nowadays there are plenty of Erlang systems out there. I can however think of reasons why and why not to use it, since I have been working with bot options for quite some time.
In favour of using it:
It is obviously quite useful during development to ensure a fast feedback cycle. I always develop with an open shell, and with functions to load code automatically as son as compile.
In the rare case you need to implement a monolithic application with high availability requirements, it is basically the only option
The main reason not to use it, as the presentation states: it is hard. Even if you manage to understand exactly how it works (which is not the hardest part).
It is not, in my opinion just a problem of tooling, but rather that you are getting a lot of intrinsic complications just by the fact that now your code is part of the mutable running state of the system. You basically end up having a long running system that changes behaviour, so you add those to the problems you already had:
You are no longer sure that restarting the system will not change behaviour on any fundamental way. So you will probably need to put extra care on making sure that whatever code you load, it is also written to disk.
Changing the way your modules work (i.e. loading new code) is very tricky unless a) you never break compatibility, b) you somehow figure out the order in which the modules should change or c) you assume the worst that can happen is a few crashes due to undefined functions, function or case clauses, etc, and hope for the best (the actual worst is when the new and old modules interact in unexpected ways while you haven't finished loading all of the new ones and the actually run some impossible logic).
You will almost certainly will end up killing some process running old code when loading new code at some point. Maybe your supervisors will help you, maybe not. In any case that could be very confusing and difficult to debug.
As the presentation also states, is very hard to test (if not impossible).
Etc.
Adding to all those, you are running a long living server with long living state, which is far from ideal.
So my advise is always that, if you could get away with a distributed application and rolling upgrades, you should do it. That option is much easier to handle, and in my experience, performs better overall.

Related

llvm based code mutation for genetic programming?

for a study on genetic programming, I would like to implement an evolutionary system on basis of llvm and apply code-mutations (possibly on IR level).
I found llvm-mutate which is quite useful executing point mutations.
As far as I have understood, the instructions get count/numbered, one can then e.g. delete a numbered instruction.
However, introduction of new instructions seems to be possible as one of the availeable statements in the code.
Real mutation however would allow to insert any of the allowed IR instructions, irrespective of it beeing used in the code to be mutated.
In addition, it should be possible to insert library function calls of linked libraries (not used in the current code, but possibly available, because the lib has been linked in clang).
Did I overlook this in the llvm-mutate or is it really not possible so far?
Are there any projects trying to /already have implement(ed) such mutations for llvm?
llvm has lots of code analysis tools which should allow the implementation of the afore mentioned approach. llvm is huge, so I'm a bit disoriented. Any hints which tools could be helpful (e.g. getting a list of available library functions etc.)?
Thanks
Alex
Very interesting question. I have been intrigued by the possibility of doing binary-level genetic programming for a while. With respect to what you ask:
It is apparent from their documentation that LLVM-mutate can't do what you are asking. However, I think it is wise for it not to. My reasoning is that any machine-language genetic program would inevitably face the "Halting Problem", e.g. it would be impossible to know if a randomly generated instruction would completely crash the whole computer (for example, by assigning a value to a OS-reserved pointer), or it might run forever and take all of your CPU cycles. Turing's theorem tells us that it is impossible to know in advance if a given program would do that. Mind you, LLVM-mutate can cause for a perfectly harmless program to still crash or run forever, but I think their approach makes it less likely by only taking existing instructions.
However, such a thing as "impossibility" only deters scientists, not engineers :-)...
What I have been thinking is this: In nature, real mutations work a lot more like LLVM-mutate that like what we do in normal Genetic Programming. In other words, they simply swap letters out of a very limited set (A,T,C,G) and every possible variation comes out of this. We could have a program or set of programs with an initial set of instructions, plus a set of "possible functions" either linked or defined in the program. Most of these functions would not be actually used, but they will be there to provide "raw DNA" for mutations, just like in our DNA. This set of functions would have the complete (or semi-complete) set of possible functions for a problem space. Then, we simply use basic operations like the ones in LLVM-mutate.
Some possible problems though:
Given the amount of possible variability, the only way to have
acceptable execution times would be to have massive amounts of
computing power. Possibly achievable in the Cloud or with GPUs.
You would still have to contend with Mr. Turing's Halting Problem.
However I think this could be resolved by running the solutions in a
"Sandbox" that doesn't take you down if the solution blows up:
Something like a single-use virtual machine or a Docker-like
container, with a time limitation (to get out of infinite loops). A
solution that crashes or times out would get the worst possible
fitness, so that the programs would tend to diverge away from those
paths.
As to why do this at all, I can see a number of interesting applications: Self-healing programs, programs that self-optimize for an specific environment, program "vaccination" against vulnerabilities, mutating viruses, quality assurance, etc.
I think there's a potential open source project here. It would be insane, dangerous and a time-sucking vortex: Just my kind of project. Count me in if someone doing it.

how you design prototypes in erlang?

in the early phase of design of erlang small app - how do you do prototyping?
Is it better to first prototype without OTP just to prove all main mechanics in plain erlang and in further elaboration add what OTP offers with refined requirements / aspects or use OTP from the beginning?
(The answer below is not trying to plug my instructional, it just happens to apply directly to the OP's question; were it possible I would just send the OP a private message or email. At the time of this answer my demonstration system is only barely worth even reading, aside from basic architecture concepts.)
I start with a slew of function stubs. I do this in most languages (even something like this in assembler). The special thing about this in Erlang is that my initial stubs represent supervisors or logical managers, not one-off solutions to elements of my fundamental problem.
Beyond that, I like to do something most people abhor these days: talking the problem out in prose to discover inconsistencies in the way I view the problem. I've just started on an example of this here (as in, I'm still working on this before and after work daily as of today, 2014.11.06): http://zxq9.com/erlmud.
Some system stubs (conceptual, not OTP -- which is integral to the idea I'm trying to demonstrate in the project, actually) are here: https://github.com/zxq9/erlmud/tree/be7c6a8ae0d91aac37850083091ae4d15f1369a4/erlmud-0.1 for example. Over the next few days they will change significantly until there is a prototype system that works instead of just stubs. If you're really curious about this, follow the commits from the one I linked over the next two weeks or so (paid work schedule permitting, of course).
One positive thing I've noticed about prototyping with stubs and not jumping straight into OTP behaviors is that very often the behavior that is assumed to be a proper fit for a component turns out not to be. There are many cases where I anticipate I will want a gen_server, but after writing some stubs and messing around a bit I find myself beginning to manually implement an FSM. Sometimes that happens in reverse, too, I think I need an FSM and wind up writing a server, or realize I could benefit from a proper gen_event. Once you've ironed out what you're doing it is pretty easy to convert pure Erlang into OTP. It is much less easy to edit your mental model of how a component works once you've written a gen_fsm or gen_server, because you start to feel invested in the idea of thinking of it in OTP terms prematurely.
Remember: typing is the easy part, the real battle is figuring out what to type. So begin boldly by writing executable stubs and toy with them.
There is no special recipe to do prototypes in Erlang. How would you do a prototype in Java, C#, Scala, (put any language here) ?
When prototyping, you need to achieve your proof of concept as fast as possible and deliver a minimal vital project.
In your case, does OTP helps you to deliver your minimal vital project or not?
If yes, then use it. And of course don't use it if it isn't.
Are you familiar with OTP concepts in the first place? If not, then you need to learn them. And thats mean that you need to invest more time in learning OTP. Is that ok for your prototyping purpose?
I'm only trying to highlight the fact that prototyping in Erlang isn't different from any other language.

Updating a VB6 application to .net

I know a lot of questions have been asked about VB6 migration (and I've read most of them), but I'm still not entirely certain on what the best way to go about this is.
We have a client that we built an order tracking application for about a decade back and they came to us this week saying they were having some issues with it. The app was written entirely in VB6, which has been something of a hassle as tracking down the necessary tools to work with a project so old took some considerable effort. In an effort to make any future maintenance less of a headache, my boss wants to pitch the idea to them of updating the app to .net and wants to know what exactly that would entail. I've never done anything like this before, but what I've read (both here and elsewhere) suggests that Microsoft's "auto-update" from VB6 to .net simply doesn't work very well and I'd pretty much have to rebuild the app from the ground up.
To get to the crux of my question: is this the case? Would I pretty much just need to rewrite it, or is there another means of going about this that could/would save me a lot of time/effort?
Any insight would be greatly appreciated.
VB6 and VB.NET are radically different. The syntax has changed, and so has the underlying structures, forms, custom controls, and almost every single aspect you can possibly think about.
A complete redesign and reassessment of needs and functionality is imperative. With .NET the plethora of new libraries and features supersede the antiquated VB6 libraries, OCXs, etc. Also if you feel bold, you can migrate your code to C# and other CIL languages aside from VB.
Out of hand, the Microsoft migration tool will not do much. Moreover, it also depends on whether you have your business logic well separated from your GUI. Otherwise, it will make it even harder. Depending on the size of your application, it might make it quite expensive. Another possible solution you might consider is to run your app in a virtual environment or on a remote app http://technet.microsoft.com/en-us/library/cc730673(v=ws.10).aspx that will ease the deployment pain.
I have also researched this topic.
Try the smart rewrite solution that converts 95% of the code automatically.
first, run your app through the assessment wizard to determine estimated costs and resources needed.
http://visualwebgui.com/Gizmox/Solutions/InstantbCloudmoveb/tabid/744/Default.aspx

Scale now or later?

I am looking to start developing a relatively simple web application that will pull data from various sources and normalizing it. A user can also enter the data directly into the site. I anticipate hitting scale, if successful. Is it worth putting in the time now to use scalable or distributed technologies or just start with a LAMP stack? Framework or not? Any thoughts, suggestions, or comments would help.
Disregard my vague description of the idea, I'd love to share once I get further along.
Later. I can't remember who said it (might have been SO's Jeff Atwood) but it rings true: your first problem is getting other people to care about your work. Worry about scale when they do.
Definitely go with a well structured framework for your own sanity though. Even if it doesn't end up with thousands of users, you'll want to add features as time goes on. Maintaining an expanding codebase without good structure quickly becomes fairly horrible (been there, done that, lost the client).
btw, if you're tempted to write your own framework, be aware that it is a lot of work. My company has an in-house one we're quite proud of, but it's taken 3-4 years to mature.
Is it worth putting in the time now to use scalable or distributed technologies or just start with a LAMP stack?
A LAMP stack is scalable. Apache provides many, many alternatives.
Framework or not?
Always use the highest-powered framework you can find. Write as little code as possible. Get something in front of people as soon as you can.
Focus on what's important: Get something to work.
If you don't have something that works, scalability doesn't matter, does it?
Then read up on optimization. http://c2.com/cgi/wiki?RulesOfOptimization is very helpful.
Rule 1. Don't.
Rule 2. Don't yet.
Rule 3. Profile before Optimizing.
Until you have a working application, you don't know what -- specific -- thing limits your scalability.
Don't assume. Measure.
That means build something that people actually use. Scale comes later.
Absolutely do it later. Scaling pains is a good problem to have, it means people like your project enough to stress the hardware it's running on.
The last company I worked at started fairly small with PHP and the very very first versions of CakePHP that came out (when it was still in beta). Some of the code was dirty, the admin tool was a mess (code-wise), and sure it could have been done better from the start. But do you know what? They got it out the door before their competitors did, and became extremely successful.
When I came on board they were starting to hit the limits of their current potential scalability, and that is when they decided to start looking at CDN's, lighttpd caching techniques, and other ways to clean up the code and make things run smoother when under heavy load. I don't work for them anymore but it was a good experience in growing an architecture beyond what it was originally scoped at.
I can tell you right now if they had tried to do the scalability and optimizations before selling content and getting a website live - they would never have grown to the size they are now. The company is www.beatport.com if you're interested in who I'm talking about (To re-iterate, I'm not trying to advertise them as I am no longer affiliated with them, but it stands as a good case study and it's easier for people to understand what I'm talking about when they see their website).
Personally, after working with Ruby and Rails (and understanding the separation!) for a couple of years, and having experience with PHP at Beatport - I can confidently say that I never want to work with PHP code again =p
Funny to ask "scale now or later?" and label it "ruby on rails".
Actually, Ruby on Rails was created by David Heinemeier Hansson, who has a whole chapter in his book labeled "Scale later" :))
http://gettingreal.37signals.com/ch04_Scale_Later.php
I agree with the earlier respondents -- make it useful, make it work and get people motivated to use it first. I also agree that you should pick off-the shelf components (of which there are many) rather than roll your own, as much as possible. At the same time, make sure that you choose components for your infrastructure that you know to be scalable so that you can go there when you need to, without having to re-write major chunks of your application.
As the Product Manager for Berkeley DB, I've seen countess cases of developers who decided "Oh, we'll just write that to a flat file" or "I can write my own simple B-tree function" or "Database XYZ is 'good enough', I don't have to worry about concurrency or scalability until later". The problem with that approach is that a) you're re-inventing the wheel (and forgoing what others have learned the hard way already) and b) you're ignoring the fact that you'll have to deal with scalability at some point and going with a 'good enough' solution.
Good luck in your implementation.

Should I re-write the webapp front end in Erlang?

I have a Rails webapp [deployed on Heroku] which makes a series of HTTP calls to other sites on a repeated basis, using Heroku's rake:cron feature. The current situation isn't ideal; the rake:cron process is executed in a single thread, which means HTTP calls are made sequentially; which means in turn that there's a long time between calls to the same site [typically 2 mins].
I'd like to execute this process in parallel, and reduce the time between calls to 10 secs. Having seen Kevin Smith's 'Erlang in Practice' I'm sold on the idea of using Erlang as a replacement backend. What I'm trying to figure out [given Damien Katz's comments], is whether I should a) re-write the entire webapp in Erlang, front end and all or b) maintain a split structure, with a Rails frontend / Erlang backend.
I like the idea of using a 100% Erlang stack for the project; I'll need to use some kind of Erlang web framework [Nitrogen ? Erlyweb ?]; I'm concerned they're not mature enough and I'll spend my time bogged down on the web part of the project with them.
Anyone any views ? Thanks.
What's the actual impact on your visitors (of the two-minute interval between HTTP backend calls)?
If there isn't much of a difference, I'd say this sounds like premature optimization and that you'd be much better off skipping Erlang for now.
The two previous posters have pretty much covered they philosophical aspects of your question. So I'll answer the framework maturity/getting bogged down part of your question.
In the event that you decide you do want to rewrite the webapp in erlang for whatever reason then I wouldn't be too concerned about the framework slowing you down. Both erlyweb and nitrogen are already feature complete enough that you can work pretty quickly with them. I've developed a fairly complex agile project management app in nitrogen and found it to be quite intuitive and not really lacking in features that I needed. A few hours in the evenings and a few weeks later and I had a working app up and running.
As to which one to use that depends on the type of app you want to build.
Nitrogen's target is extremely dyamic web applications. Most of the page is rendered using javascript and it is highly event driven.
ErlyWeb is more suited to a site where the content is the primary focus less so than a rich client type of application. It uses the MVC style of architecure.
Good luck on whatever you decide.
It depends. How much Erlang do you know? How much code have you already written?
How much project experience do you have? Is this for work or for fun?
Rewriting projects from scratch is often a recipe for disaster, especially if you are trying
to learn a new language along the way. It seems to me like you would not be asking this question if you were already fluent in both languages, in which case I would recommend that you just stick to Ruby if it's a work project.
I disagree with the above poster that changing the language is a premature optimization, if it is necessary.
Changing the language is a big deal. It can't be done at the last minute.
However, I would probably not change the language at all for the reason you outlined.
If you don't have any other reasons than performance for switching, you should probably just
look at multi-threading in Ruby or some other optimization.
I'm all about using the right tool for the job. Unless you have an absolutely dead on reason to port the front, there's absolutely nothing wrong with hooking the two together.

Resources