I have a Rails webapp [deployed on Heroku] which makes a series of HTTP calls to other sites on a repeated basis, using Heroku's rake:cron feature. The current situation isn't ideal; the rake:cron process is executed in a single thread, which means HTTP calls are made sequentially; which means in turn that there's a long time between calls to the same site [typically 2 mins].
I'd like to execute this process in parallel, and reduce the time between calls to 10 secs. Having seen Kevin Smith's 'Erlang in Practice' I'm sold on the idea of using Erlang as a replacement backend. What I'm trying to figure out [given Damien Katz's comments], is whether I should a) re-write the entire webapp in Erlang, front end and all or b) maintain a split structure, with a Rails frontend / Erlang backend.
I like the idea of using a 100% Erlang stack for the project; I'll need to use some kind of Erlang web framework [Nitrogen ? Erlyweb ?]; I'm concerned they're not mature enough and I'll spend my time bogged down on the web part of the project with them.
Anyone any views ? Thanks.
What's the actual impact on your visitors (of the two-minute interval between HTTP backend calls)?
If there isn't much of a difference, I'd say this sounds like premature optimization and that you'd be much better off skipping Erlang for now.
The two previous posters have pretty much covered they philosophical aspects of your question. So I'll answer the framework maturity/getting bogged down part of your question.
In the event that you decide you do want to rewrite the webapp in erlang for whatever reason then I wouldn't be too concerned about the framework slowing you down. Both erlyweb and nitrogen are already feature complete enough that you can work pretty quickly with them. I've developed a fairly complex agile project management app in nitrogen and found it to be quite intuitive and not really lacking in features that I needed. A few hours in the evenings and a few weeks later and I had a working app up and running.
As to which one to use that depends on the type of app you want to build.
Nitrogen's target is extremely dyamic web applications. Most of the page is rendered using javascript and it is highly event driven.
ErlyWeb is more suited to a site where the content is the primary focus less so than a rich client type of application. It uses the MVC style of architecure.
Good luck on whatever you decide.
It depends. How much Erlang do you know? How much code have you already written?
How much project experience do you have? Is this for work or for fun?
Rewriting projects from scratch is often a recipe for disaster, especially if you are trying
to learn a new language along the way. It seems to me like you would not be asking this question if you were already fluent in both languages, in which case I would recommend that you just stick to Ruby if it's a work project.
I disagree with the above poster that changing the language is a premature optimization, if it is necessary.
Changing the language is a big deal. It can't be done at the last minute.
However, I would probably not change the language at all for the reason you outlined.
If you don't have any other reasons than performance for switching, you should probably just
look at multi-threading in Ruby or some other optimization.
I'm all about using the right tool for the job. Unless you have an absolutely dead on reason to port the front, there's absolutely nothing wrong with hooking the two together.
Related
in the early phase of design of erlang small app - how do you do prototyping?
Is it better to first prototype without OTP just to prove all main mechanics in plain erlang and in further elaboration add what OTP offers with refined requirements / aspects or use OTP from the beginning?
(The answer below is not trying to plug my instructional, it just happens to apply directly to the OP's question; were it possible I would just send the OP a private message or email. At the time of this answer my demonstration system is only barely worth even reading, aside from basic architecture concepts.)
I start with a slew of function stubs. I do this in most languages (even something like this in assembler). The special thing about this in Erlang is that my initial stubs represent supervisors or logical managers, not one-off solutions to elements of my fundamental problem.
Beyond that, I like to do something most people abhor these days: talking the problem out in prose to discover inconsistencies in the way I view the problem. I've just started on an example of this here (as in, I'm still working on this before and after work daily as of today, 2014.11.06): http://zxq9.com/erlmud.
Some system stubs (conceptual, not OTP -- which is integral to the idea I'm trying to demonstrate in the project, actually) are here: https://github.com/zxq9/erlmud/tree/be7c6a8ae0d91aac37850083091ae4d15f1369a4/erlmud-0.1 for example. Over the next few days they will change significantly until there is a prototype system that works instead of just stubs. If you're really curious about this, follow the commits from the one I linked over the next two weeks or so (paid work schedule permitting, of course).
One positive thing I've noticed about prototyping with stubs and not jumping straight into OTP behaviors is that very often the behavior that is assumed to be a proper fit for a component turns out not to be. There are many cases where I anticipate I will want a gen_server, but after writing some stubs and messing around a bit I find myself beginning to manually implement an FSM. Sometimes that happens in reverse, too, I think I need an FSM and wind up writing a server, or realize I could benefit from a proper gen_event. Once you've ironed out what you're doing it is pretty easy to convert pure Erlang into OTP. It is much less easy to edit your mental model of how a component works once you've written a gen_fsm or gen_server, because you start to feel invested in the idea of thinking of it in OTP terms prematurely.
Remember: typing is the easy part, the real battle is figuring out what to type. So begin boldly by writing executable stubs and toy with them.
There is no special recipe to do prototypes in Erlang. How would you do a prototype in Java, C#, Scala, (put any language here) ?
When prototyping, you need to achieve your proof of concept as fast as possible and deliver a minimal vital project.
In your case, does OTP helps you to deliver your minimal vital project or not?
If yes, then use it. And of course don't use it if it isn't.
Are you familiar with OTP concepts in the first place? If not, then you need to learn them. And thats mean that you need to invest more time in learning OTP. Is that ok for your prototyping purpose?
I'm only trying to highlight the fact that prototyping in Erlang isn't different from any other language.
I am looking to start developing a relatively simple web application that will pull data from various sources and normalizing it. A user can also enter the data directly into the site. I anticipate hitting scale, if successful. Is it worth putting in the time now to use scalable or distributed technologies or just start with a LAMP stack? Framework or not? Any thoughts, suggestions, or comments would help.
Disregard my vague description of the idea, I'd love to share once I get further along.
Later. I can't remember who said it (might have been SO's Jeff Atwood) but it rings true: your first problem is getting other people to care about your work. Worry about scale when they do.
Definitely go with a well structured framework for your own sanity though. Even if it doesn't end up with thousands of users, you'll want to add features as time goes on. Maintaining an expanding codebase without good structure quickly becomes fairly horrible (been there, done that, lost the client).
btw, if you're tempted to write your own framework, be aware that it is a lot of work. My company has an in-house one we're quite proud of, but it's taken 3-4 years to mature.
Is it worth putting in the time now to use scalable or distributed technologies or just start with a LAMP stack?
A LAMP stack is scalable. Apache provides many, many alternatives.
Framework or not?
Always use the highest-powered framework you can find. Write as little code as possible. Get something in front of people as soon as you can.
Focus on what's important: Get something to work.
If you don't have something that works, scalability doesn't matter, does it?
Then read up on optimization. http://c2.com/cgi/wiki?RulesOfOptimization is very helpful.
Rule 1. Don't.
Rule 2. Don't yet.
Rule 3. Profile before Optimizing.
Until you have a working application, you don't know what -- specific -- thing limits your scalability.
Don't assume. Measure.
That means build something that people actually use. Scale comes later.
Absolutely do it later. Scaling pains is a good problem to have, it means people like your project enough to stress the hardware it's running on.
The last company I worked at started fairly small with PHP and the very very first versions of CakePHP that came out (when it was still in beta). Some of the code was dirty, the admin tool was a mess (code-wise), and sure it could have been done better from the start. But do you know what? They got it out the door before their competitors did, and became extremely successful.
When I came on board they were starting to hit the limits of their current potential scalability, and that is when they decided to start looking at CDN's, lighttpd caching techniques, and other ways to clean up the code and make things run smoother when under heavy load. I don't work for them anymore but it was a good experience in growing an architecture beyond what it was originally scoped at.
I can tell you right now if they had tried to do the scalability and optimizations before selling content and getting a website live - they would never have grown to the size they are now. The company is www.beatport.com if you're interested in who I'm talking about (To re-iterate, I'm not trying to advertise them as I am no longer affiliated with them, but it stands as a good case study and it's easier for people to understand what I'm talking about when they see their website).
Personally, after working with Ruby and Rails (and understanding the separation!) for a couple of years, and having experience with PHP at Beatport - I can confidently say that I never want to work with PHP code again =p
Funny to ask "scale now or later?" and label it "ruby on rails".
Actually, Ruby on Rails was created by David Heinemeier Hansson, who has a whole chapter in his book labeled "Scale later" :))
http://gettingreal.37signals.com/ch04_Scale_Later.php
I agree with the earlier respondents -- make it useful, make it work and get people motivated to use it first. I also agree that you should pick off-the shelf components (of which there are many) rather than roll your own, as much as possible. At the same time, make sure that you choose components for your infrastructure that you know to be scalable so that you can go there when you need to, without having to re-write major chunks of your application.
As the Product Manager for Berkeley DB, I've seen countess cases of developers who decided "Oh, we'll just write that to a flat file" or "I can write my own simple B-tree function" or "Database XYZ is 'good enough', I don't have to worry about concurrency or scalability until later". The problem with that approach is that a) you're re-inventing the wheel (and forgoing what others have learned the hard way already) and b) you're ignoring the fact that you'll have to deal with scalability at some point and going with a 'good enough' solution.
Good luck in your implementation.
I'm going to develop a collaborative site, and one of the features will be collaborative editing with realtime changes. i.e. when two or more users are editing the same doc, they can see each other changes as soon as they happen.
I have some experience with Ruby on Rails, so I was thinking about using EventMachine, but with all this hype around Node.js, I am know considering using it instead. So, what would be the main benefits of using Node.js instead of EventMachine?
tl;dr
What are the main differences between EventMachine and Node.js (besides the language)?
EventMachine has nothing to do with Rails apart from them both being written in the same language. You can get EventMachine as bare as Node.js; all you have to do is not add libraries to your project. In my experience the EventMachine libraries (like em-http) are much nicer than anything for Node. And you can use fibers instead of callbacks to avoid callback hell. Complete exception handling is pretty much impossible in Node because of all the callbacks. Plus Ruby is a nicer, more complete language than Javascript.
I tend towards the "use what you know" (even if it's a heavier architecture). Because of that, I don't see it being quite as simple as "EventMachine vs NodeJS." Mainly, the difference can be summarized as this:
NodeJS is a framework/language that was written to handle event based programming in JavaScript. That is its driving force. It's not an after thought, or a third party mechanism. It's baked right in to the language. You create callbacks/events because that's how the language is built. It's not a third party plug in, and doesn't alter your workflow.
EventMachine is a gem in Ruby that gives developers access to some of the goodness of the event based programming model. It's heavily used and well tested, but not baked directly in to the language. Both are locked to one CPU, but with event programming at Nodes core, it still has a leg up. Ruby wasn't written with concurrency in mind.
That said, technical problems can be overcome. The more important questions (from my view) that should guide your decision are these:
What will your production environment look like? Do you have complete control over the server? Can you host it however you want? Or will it be on a shared system to start with, and then you have to expand on that?
Do all the developers on your team have the ability to learn a new language very fast? How fast will they be able to understand an event-based language like JavaScript for the middle tier?
Do you need all of the architecture that Rails gives you (full Testing framework, scaffolding, models, controllers, etc)? Or is that overkill?
There are quite a few technical differences between the two. One is a language, one is a framework. Really, how heavy of a stack you want to run? How much learning will your developers have to do? Do you want a full stack the gives you a lot of niceties, that you may not use, or do you want a bare bones set up that runs extremely fast and concurrent, even though you may have to write extra boiler plate code and learn a new lanugage?
While Rails is not as heavy as some web application architectures, you're still going to need more processor power than you would to handle a similar amount of throughput in NodeJS. Assuming quality code for both systems. Bad code written on either stack is going to prevent the stack from shining. It really comes down to- Do you really want to learn a whole new way of doing things, or utilize your current understanding of Ruby to get things off the ground fast?
I know it's not really a definitive answer, but I hope this helps guide you to a decision!
One thing worth mentioning is the production story. EM, like most Rack stuff, has plenty of testing and monitoring tools available that are well tested, whereas Node.js falls well short in this respect.
At the time of writing, it seems almost impossible to get clear metrics from Node to answer questions like 'Do I need to scale'. There are options starting to form out there from the likes of Joyent, and always the roll-your-own argument, but nothing anywhere near tools such as NewRelic.
Node.js is very good from a performance / configurability point of view, but personally I wouldn't host it in production just yet.
Node.js
You get far better control low level control over what's going in. You can include general libraries to build on top of node.js to tweak your level of abstraction to your own liking. For example you can use connect or express depending on whether you want a view engine written for you.
You can use socket.io or now depending on how much you want your client-server connection abstracted. You can opt to include any of numerous MVC libraries or write your own.
Event-Machine
An asynchronous IO library just like node.js
It comes down to a Ruby vs JavaScript preference, how much flexibility you want with abstractions or lack of abstractions and whether you want to use node as your actual web server.
a detailed view at confusion has already been proposed... just a personal view
[] node.js will be better, if you are ready to learn and experiment more than you think because:
it's thread mechanism is awesome (inspired from that of 'erlang')
you can build a purpose specific server (easily) which will be real productive
This question contains a lot of background information, to make sure you fully understand why we are looking at these technologies.
The question is basically this:
For a large, spreadsheet-type, module that we need to develop for our webmodule for our application, are there any pitfalls we should know about if we decide to use Silverlight for it?
Issues we already know, and don't need any discussion/reminders about:
We're aware of the problems around using a plugin-type solution, which may or may not be installed on the users machine (and in some cases, probably can't be installed). These risks needs to be mitigated, but we're aware of them. Please don't get hung up on this.
We're a .NET company, so while ruby on rails and lots of other different platforms and architectures are good for this solution, they are not in the scope of the decision here. We have lots of code already written in .NET that we need to take advantage of, otherwise the project will never be finished regardless of platform.
Background
We have a web module for our application with employee-related information and some input forms. Our Windows desktop application is mostly a department leader type of application, to manage employees, but the web module contains mostly employee-centric functions. The web module contains mostly report-type webpages, to list information from the system, or input-forms.
The module we need to add now is more of a heavy spreadsheet type application. You change something one place, and something changes somewhere else, like sums, what is enabled/disabled, etc.
We know we can manage all of that with AJAX, but another issue here is that the application will potentially load a lot of database data in order to put the data in front of the user, and with a AJAXy solution, we're afraid that the request/response method here will have to reload quite a lot of information on every request, even to respond to seemingly easy questions.
A way to mitigate that would basically be to load information into a Session-object or similar, but that's a big no-no, so we'd rather not do that. This is a multi-user module, and some of the data is rather static, but some of the data is also going to have to be refreshed from time to time, so if 10 users loads a lot of data into the session, that's going to be a pretty big memory-hit.
We will be using ASP.NET (MVC) for this if we choose to go this route, that is, developing the module in pure HTML and similar technologies.
Then we looked at Silverlight, and would then load all the information down into the Silverlight application on the client. It would hold the current state, and would only need to touch the database to refresh some of the information, some of the time, instead, as we think the request/response model with ASP.NET (MVC) would work, on every little request.
But, since we have only done minor things with Silverlight, we're not that experienced with it, and we're afraid that some assumptions we might have, stated or unconcious, turns out to be wrong or flawed, which will make this project impossible or very hard to manage at some point.
For instance, just to take an example, is there a limit to how much memory the Silverlight application is allowed to load (I know, if I have to ask I can probably not afford it), for instance if there is a limit on 10MB, then that would be nice to know about before we're midway and start to load the really heavy data.
To make it simpler to give examples, let's just assume we're building a spreadsheet, that has so much data, that for the simple "changed a number here, what else changed", too much data from the database has to be loaded for a proper request/response model to be used, and if we move the entire thing to Silverlight, what will make that project hard or impossible?
Knowing about such things would at least give us the ability to consider if the price is acceptable.
In short, why should we not use Silverlight for this and instead go for ASP.NET (MVC)?
And again, "use Ruby on Rails instead", is not really an answer here. The options are ASP.NET (MVC) which we have experience with, or Silverlight which we don't but can gain.
Of course, if Ruby on rails, given that we'd have to start pretty much from scratch infrastructure-wise, and have to learn a new programming language, and framework, and download and learn a new IDE/tool, if it would still allow us to cut the development time in half, then please give us some information about how that might work, but I daresay that won't really happen here.
You should know that Silverlight (version 3.0) does not support any printing whatsoever, which to me sounds like a whopper of a showstopper for you (sorry, I couldn't resist). The good news is that full printing support has been added in version 4, but that is still in beta. Rumours say it should be out before the summer if everything works out according to plan, so if that fits with your roadmap I would use SL4 right from the start.
There are no memory limitations in Silverlight, but for the local storage (IsolatedStorage) mechanism there is a default limit of 1MB. But you can easily get around that by asking the users permission to increase the local storage space when he/she starts up the application. More on that here: Silverlight Tip of the Day #20 – How to Increase your Isolated Storage Quota.
(Edit)
Aside from the missing printing functionality that will be fixed in SL4 I cannot see any problems with your scenario. I would easily take the Silverlight route if I were you, especially since you already have extensive knowledge of .NET/C#.
For a rich interface as you've described, I would definately go with Silverlight or Flash rather than a html/javascript/ajax solution.
These technologies make for much better and consistent interfaces across platforms, you can buy in various components to speed things up and support things like copy-n-paste and code in a more structured way.
Another element is skills, if you have the skills to achieve it in a particular technology, then go with that.
To the answer you question the best way I can; you should not use silverlight if you decide to use flash.
HTH
What language, framework, and hosting considerations should one make before starting development of a scalable web application?
The most important consideration is not to over-engineer to the point that it gets in the way of building and launching something. Analysis paralysis is the single biggest inhibitor to productivity, progress and results.
Yes, do some planning. Pick a framework. Perfection in a framework will be impossible to find because it doesn't exist, partially because you don't know what you need until you build it anyways. Chances are, if you pick something, it will be better than picking nothing.
Yes, try to pick flexible, inter-operable tools for where you see yourself going.
Yes, look for a good built-in feature set where you see yourself going in the next 6-18 Months. Trying to look beyond that is not really realistic anyways as most projects change so much anyways going towards the first release.
So, pick what you're comfortable with or what is familiar. Don't follow the crowd, do what gets you the best results, quickest, and often. Understand that you might have to change in the future. So, whatever you build now, try to use unit testing so you can re-factor if ever needed.
If what you're building is going to be super successful, it will be a great problem to have, and an easy one to work on once it's making money as you'll be able to get other talent to help you.
Share what you end up picking and why for your situation -- it helps the us learn from you too!
Don't necessarily marry yourself to one language or framework. It may be that some parts of your site work better with different languages and frameworks than others. For example, all of 37signals' sites are based on Ruby on Rails, but they recently wrote a blog post about how the underlying technology of one is actually written in Erlang now because it's much easier to do concurrency that way.
Obviously there's a level of complexity where things turn into a mishmash, but using the right tool for the job — even if that means different tools for different jobs — can simplify things.
Firstly on language, it largely doesn't matter. PHP, Java and .Net being probably the biggest three are all proven in the sense that they run some of the largest sites on the Web so don't listen to anyone who tells you one is more suited than any of the others.
Some might also put Ruby and Django/Python in this list. I have nothing against them but I'm not aware of any big (say top 50) sites using either.
Hosting considerations depend on how low you want to start but basically the order is:
Shared;
Virtual Private Server;
Dedicated.
Scalability will largely be about your application's design than any language, framework or provider. Efficient database schema, efficient delivery and use of Javascript/CSS and in-memory caching are all issues common to any language or framework.
Language - I'd recommend something with good frameworks and good testing libraries like Perl or Java.
Framework - it depends on what do you plan to do. If you start with a hosting that does not allow FastCGI, it is best to avoid such frameworks like Catalyst or Rails. That's why I love CGI::Application (primarily Perl, but ported to other languages too) - it can run as CGI, FastCGI or mod_perl. For development it can be run from it's own web server.
Hosting - nothing is better than you own server. It can be your own server, leased server or virtual server. But you can start with cheapest hosting and when you need more, you should be able to afford it.
It depends.
Start by looking at your requirements (Functional or user defined) (Non Functional - aspects that describe your desired system link text)
Next I would clarify what it means to have a scalable web application. Define it as test cases that can be clearly tested (must support X page views / second with response time < Y seconds).
Once I had those pieces in place I would look at what type of skills my development team can support (for the intial project and on going maintenance). Then find some case studies of applications out in the wild that use similar language or framework. If someone else has made a specific language / framework scale then chances are good that you can too.
Finally go out and look for some hosting providers that support your chosen language, framework and requirements.