Planning Scalable Web Application Development - scalability

What language, framework, and hosting considerations should one make before starting development of a scalable web application?

The most important consideration is not to over-engineer to the point that it gets in the way of building and launching something. Analysis paralysis is the single biggest inhibitor to productivity, progress and results.
Yes, do some planning. Pick a framework. Perfection in a framework will be impossible to find because it doesn't exist, partially because you don't know what you need until you build it anyways. Chances are, if you pick something, it will be better than picking nothing.
Yes, try to pick flexible, inter-operable tools for where you see yourself going.
Yes, look for a good built-in feature set where you see yourself going in the next 6-18 Months. Trying to look beyond that is not really realistic anyways as most projects change so much anyways going towards the first release.
So, pick what you're comfortable with or what is familiar. Don't follow the crowd, do what gets you the best results, quickest, and often. Understand that you might have to change in the future. So, whatever you build now, try to use unit testing so you can re-factor if ever needed.
If what you're building is going to be super successful, it will be a great problem to have, and an easy one to work on once it's making money as you'll be able to get other talent to help you.
Share what you end up picking and why for your situation -- it helps the us learn from you too!

Don't necessarily marry yourself to one language or framework. It may be that some parts of your site work better with different languages and frameworks than others. For example, all of 37signals' sites are based on Ruby on Rails, but they recently wrote a blog post about how the underlying technology of one is actually written in Erlang now because it's much easier to do concurrency that way.
Obviously there's a level of complexity where things turn into a mishmash, but using the right tool for the job — even if that means different tools for different jobs — can simplify things.

Firstly on language, it largely doesn't matter. PHP, Java and .Net being probably the biggest three are all proven in the sense that they run some of the largest sites on the Web so don't listen to anyone who tells you one is more suited than any of the others.
Some might also put Ruby and Django/Python in this list. I have nothing against them but I'm not aware of any big (say top 50) sites using either.
Hosting considerations depend on how low you want to start but basically the order is:
Shared;
Virtual Private Server;
Dedicated.
Scalability will largely be about your application's design than any language, framework or provider. Efficient database schema, efficient delivery and use of Javascript/CSS and in-memory caching are all issues common to any language or framework.

Language - I'd recommend something with good frameworks and good testing libraries like Perl or Java.
Framework - it depends on what do you plan to do. If you start with a hosting that does not allow FastCGI, it is best to avoid such frameworks like Catalyst or Rails. That's why I love CGI::Application (primarily Perl, but ported to other languages too) - it can run as CGI, FastCGI or mod_perl. For development it can be run from it's own web server.
Hosting - nothing is better than you own server. It can be your own server, leased server or virtual server. But you can start with cheapest hosting and when you need more, you should be able to afford it.

It depends.
Start by looking at your requirements (Functional or user defined) (Non Functional - aspects that describe your desired system link text)
Next I would clarify what it means to have a scalable web application. Define it as test cases that can be clearly tested (must support X page views / second with response time < Y seconds).
Once I had those pieces in place I would look at what type of skills my development team can support (for the intial project and on going maintenance). Then find some case studies of applications out in the wild that use similar language or framework. If someone else has made a specific language / framework scale then chances are good that you can too.
Finally go out and look for some hosting providers that support your chosen language, framework and requirements.

Related

I'm interested in developing a website from the ground up. Where do I start? What should I learn? What should I use?

I'm quite new to the field of computer science but I think I've got a pretty decent idea for a website to aid classroom CS learning and collaboration. I'd really like to develop the website from the ground up and make it a sort of pet project in hopes of eventually getting it out on the web for free. Hopefully I can get some teachers to adopt it for use with their classes.
The problem is that I honestly don't know where to start. I've got the idea but I don't have enough formal education to guide the implementation of my idea. The site should have quite a bit of functionality in the long run. I'll need to be able to store user and class data/files as well as offer discussion boards and other things.
Without getting into too many details, what is the best way for me to get started? What languages and databases should I be most interested in as I build the site and ensure scalability and future functionality developments? I would really appreciate any information you could give me on how to structure the project/stack as I don't have much of a clue at this point. I have the idea. Now I just need a little bit of help getting started.
Thanks!
There are definitely already projects out there that will (more than likely) do everything you're currently considering. That said, there's immense benefit in doing a project like this for personal development - you get to learn, and you expand your public portfolio. If you run the project as open source, you can also demonstrate your ability to work with others. All very good (hireable) attributes.
Are there any programming languages you already know? Are there any that your course is going to be teaching that you know ahead of time?
There are so many different languages and frameworks available to choose from, but I'll only mention a few.
Language: Framework
.NET: ASP.NET MVC
python: django
ruby: ruby-on-rails
I'm a huge fan of django. Python is quite a nice language to learn. I'd recommend django purely from a biased point of view. Python runs on Windows, Linux, and Mac, though you probably don't want to host python on windows (culture more than ability).
Conversely, if you really like Windows, ASP.NET MVC makes building out websites very very easy. Mono does allow you to run .NET on linux and mac, but you might find support lacking, and I wouldn't suggest using Mono for your first project.
PHP is (was?) another popular language for building websites in. There are tonnes of web frameworks available for PHP. Popular opinion seems to be that PHP makes it easier for developers to write bad code, though it is possible to write good code with PHP.
Unfortunately, without knowing a rough direction in which you're headed, it's nearly impossible to offer some concrete advice. Database choice will generally come down to what language and platform (linux/.net) you're targeting. Web server also fits this profile. Once you decide on a language, narrowing down the other choices become a lot easier.
Learn HTML to start with and keep improving as per needed with css , javascript. You won't need more then this.

EventMachine vs Node.js

I'm going to develop a collaborative site, and one of the features will be collaborative editing with realtime changes. i.e. when two or more users are editing the same doc, they can see each other changes as soon as they happen.
I have some experience with Ruby on Rails, so I was thinking about using EventMachine, but with all this hype around Node.js, I am know considering using it instead. So, what would be the main benefits of using Node.js instead of EventMachine?
tl;dr
What are the main differences between EventMachine and Node.js (besides the language)?
EventMachine has nothing to do with Rails apart from them both being written in the same language. You can get EventMachine as bare as Node.js; all you have to do is not add libraries to your project. In my experience the EventMachine libraries (like em-http) are much nicer than anything for Node. And you can use fibers instead of callbacks to avoid callback hell. Complete exception handling is pretty much impossible in Node because of all the callbacks. Plus Ruby is a nicer, more complete language than Javascript.
I tend towards the "use what you know" (even if it's a heavier architecture). Because of that, I don't see it being quite as simple as "EventMachine vs NodeJS." Mainly, the difference can be summarized as this:
NodeJS is a framework/language that was written to handle event based programming in JavaScript. That is its driving force. It's not an after thought, or a third party mechanism. It's baked right in to the language. You create callbacks/events because that's how the language is built. It's not a third party plug in, and doesn't alter your workflow.
EventMachine is a gem in Ruby that gives developers access to some of the goodness of the event based programming model. It's heavily used and well tested, but not baked directly in to the language. Both are locked to one CPU, but with event programming at Nodes core, it still has a leg up. Ruby wasn't written with concurrency in mind.
That said, technical problems can be overcome. The more important questions (from my view) that should guide your decision are these:
What will your production environment look like? Do you have complete control over the server? Can you host it however you want? Or will it be on a shared system to start with, and then you have to expand on that?
Do all the developers on your team have the ability to learn a new language very fast? How fast will they be able to understand an event-based language like JavaScript for the middle tier?
Do you need all of the architecture that Rails gives you (full Testing framework, scaffolding, models, controllers, etc)? Or is that overkill?
There are quite a few technical differences between the two. One is a language, one is a framework. Really, how heavy of a stack you want to run? How much learning will your developers have to do? Do you want a full stack the gives you a lot of niceties, that you may not use, or do you want a bare bones set up that runs extremely fast and concurrent, even though you may have to write extra boiler plate code and learn a new lanugage?
While Rails is not as heavy as some web application architectures, you're still going to need more processor power than you would to handle a similar amount of throughput in NodeJS. Assuming quality code for both systems. Bad code written on either stack is going to prevent the stack from shining. It really comes down to- Do you really want to learn a whole new way of doing things, or utilize your current understanding of Ruby to get things off the ground fast?
I know it's not really a definitive answer, but I hope this helps guide you to a decision!
One thing worth mentioning is the production story. EM, like most Rack stuff, has plenty of testing and monitoring tools available that are well tested, whereas Node.js falls well short in this respect.
At the time of writing, it seems almost impossible to get clear metrics from Node to answer questions like 'Do I need to scale'. There are options starting to form out there from the likes of Joyent, and always the roll-your-own argument, but nothing anywhere near tools such as NewRelic.
Node.js is very good from a performance / configurability point of view, but personally I wouldn't host it in production just yet.
Node.js
You get far better control low level control over what's going in. You can include general libraries to build on top of node.js to tweak your level of abstraction to your own liking. For example you can use connect or express depending on whether you want a view engine written for you.
You can use socket.io or now depending on how much you want your client-server connection abstracted. You can opt to include any of numerous MVC libraries or write your own.
Event-Machine
An asynchronous IO library just like node.js
It comes down to a Ruby vs JavaScript preference, how much flexibility you want with abstractions or lack of abstractions and whether you want to use node as your actual web server.
a detailed view at confusion has already been proposed... just a personal view
[] node.js will be better, if you are ready to learn and experiment more than you think because:
it's thread mechanism is awesome (inspired from that of 'erlang')
you can build a purpose specific server (easily) which will be real productive

Appropriate use of Grails, Rails, etc?

We've got an Excel spreadsheet floating around right now (globally) at my company to capture various pieces of information about each countries technology usage. The problem is that it goes out, gets changes, but they're never obvious, and often conflicting - and then we have to smash them together. To me, the workbook is no more than a garbage in/garbage out type application waiting to be written.
In a company that has enough staff and knowledge to dedicate to Enterprise projects, for some reason, agile and language/frameworks such as Rails, Grails, etc. are frowned upon. That said, I can't help but think that this is almost a perfect fit for the need, given the scaffolding features for extremely simple implementations of capturing raw fields with only a couple lookups (i.e. a pre-defined category). I'm thinking this would be considered a very appropriate use of these frameworks.
Has anyone worked on these types of quick and dirty apps before in normally large-scale, heavy-handed enterprise environments with success? Any tips for communicating this need/appropriateness to non-technical management?
The only way to get this implemented in a rigid organization is to get this working and demo it -- without approval. It's very hard for management to say no to a finished project.
I work for a really big company & have written many utility apps based on Rails (as well as contributed to some larger Rails projects). That said, the biggest concern is not the quality of the app, but who's going to support/maintain it when you leave or get hit by the bus.
IMHO, The major fear that an enterprise organization has - especially if the application becomes more critical to it's core business - is how to support it. If it doesn't fit into it's neat little box of supported technologies, it's less likely to happen.
Corporations have been bitten by this many times in the past & are cautious when bringing in new technology.
So, if you can drum up more folks to learn Ruby/Rails in your group (or elsewhere in your company), you may be able to make a good case for it. Otherwise, sad to say, your probably better off implementing something on Sharepoint :-(.
If you already have a Java infrastructure, then creating a Grails app will require little to no additional IT ramp up to support and maintain. The support and maintenance cost and effort should be the same as for a Java application (i.e. Grails apps run on Tomcat, use the same JVM, use the same diagnostic/profiling tools, etc.).
In my experience, larger IT organizations have a harder time supporting Ruby when its not already in the toolchain because its a new language, new deployment environment, and requires a considerable amount of support and maintenance ramp up.
I would develop a minimal viable product, then make friends with someone in IT who can help you deploy it into a staging or production environment. Then get a few of the users to hop on board and test it like its a Beta product. After that, open it up to a larger audience.
So as others have said, forgiveness over permission, but be smart about the impact on the IT organization.

Which web framework for someone who wants a job?

I want to learn a framework that promotes good programming practices and is respected by the programming community.
However, I also want a framework that I can use for a day job.
Which one would you recommend?
This question comes from my experience of learning the basics of Django because it was highly acclaimed by developers on Stack Overflow and Hacker News. However.. there's hardly any jobs in my area (NYC) that are asking for Django developers.
As a long-time ASP.NET guy, I've recently gone through a similar decision process to figure out what other web frameworks I should try. Here's what I learned so far which may apply to your case too:
framework/platofrm choices (and hence job opportunities) are highly regional-- the Bay Area job market differs alot from what you'll find in NYC, Chicago, Montreal, or London. Look at local job listings (craigslist and indeed are good places to start) to get a good sense at what's in demand.
similarly, usage varies alot based on the size and type of company. if you want to get a job in a large company, Spring MVC and ASP.NET MVC may be your best bets. In small companies, DJango and (especially) Rails seem to be on the rise.
usage also sometimes varies by industry. for example, many HR apps seem be to .NET based, while financial/banking apps seem to favor Java. if you want to work in a particular industry, check out what up-and-coming companies in that industry are using.
when investing your scarce time in learning something new, favor technologies which are on the upswing of the adoption curve (e.g. Rails) rather than frameworks with wider adoption which may not be growing as fast. Also be wary of very early or niche frameworks which may not ever gain wide adoption.
the one common thread between most (or almost all) frameworks gaining in popularity is that they're MVC frameworks and rely heavily on a solid understanding of REST. Learning those concepts in depth is a good idea.
before deciding to invest a lot of time in one framework, gain a basic understanding of several of them, so you can get a reasonable sense of what you like and don't about each-- and so if you end up applying for a job using a framework you haven't learned, at least you'll be able to talk intelligently about it.
If you focus on what you enjoy, you'll be more motivated to learn it. For example, personally I found Rails (regarless of employment opportunities) more interesting than Spring or Django, so I decided to focus on Rails first. Others may have different impressions-- follow your programmer instincts. That said, there are often few jobs using technologies you find fascinating, so try to strike the right balance: technology you like that many companies are actually hiring people to use!
once you answer the basic "what framework" question, there are many more questions lurking, including picking a javascript framework, validation framework, an ORM, etc. Don't worry too much about those choices yet-- when starting, just pick the default implementation for your framework. But as you get more advanced, the same argument about frameworks also hold for those other things-- e.g. it's useful to know a few ORMs.
Personally, I decided on this approach:
continue building stuff in what I knew best (ASP.NET) but transition all work to ASP.NET MVC, where I can better understand MVC and REST concepts which apply cross-platform
learn JQuery (again, platform neutral)
blow off the ORM choice alltogether for now-- too many other things to worry about
build a few projects in Rails, which is the framework I see used most in the newer SF-Bay-Area startups I've been looking at
learn the basics (e.g. read a book or two, try a few samples) about Python/Django, Java/Spring, and Groovy/Grails.
I've encountered real projects at cool, small companies using Django, Ruby on Rails and (eiuw!) even Zope. .NET is for teletubbies - I've only ever heard of it being used by big corporations that don't know better.
I would say that knowing two or three is better than knowing one that is widely used because you will gain a better understanding of how it works as a concept. For instance if you've only used Java, there is something probably missing in your understanding of OOP, because you're pigeon-holed into thinking about it in one way. If you already know Django though you Spring would probably be a good compliment to that.
i'd probably say ASP.NET MVC. I always see lots of .NET jobs around and this seems to be a solid framework which i think in fact powers all the stackoverflow family. As a PHP developer i must also make a mention of Zend Framework which is used by a number of big sites including bbc.co.uk and is now frequently mentioned in advertisements for PHP jobs.
I want to learn a framework that promotes good programming practices and is respected by the programming community. However, I also want a framework that I can use for a day job.
Sorry to be the bearer of bad news here, but those two desires tend to conflict. IMHO most business managers tend to go for (ugly) rapid development on top of CRMs or other higher-level 3rd party codebases. Building elegant websites from the ground up mostly happens in startups, or true web companies where the website is the sole product. There are not that many of those companies; and many of those that seem to fit are actually a mess on the inside, i.e. due to time pressure, messy legacy code and many other reasons you often don't get to write according to "good programming practices" anyway.
I agree with Kaleb Brasee that Java and .NET are the two main platforms when job availability is a priority.
Every job market is unique, so look at job openings in your area, or call a handful of recruiters and ask what they see a need for / could easily place you in a junior position for. What I'm seeing is that Microsoft Sharepoint is in demand, and a few other regional CMS'es are in demand (in Denmark I see Sitecore regularly).
I think ASP.NET MVC 2.0 together with MVC Areas and ASP.NET Dynamic Data will have a good story, a good solution, for many of those bosses who want rapid development. And I think the resulting code could be quite okay, or at least not bad compared to many of the "CMS beaten into something else" sites that exist. But this is a brand new thing for the .NET platform, and it will need to be sold to the decision makers first...
Bottom line: If you want job security first and foremost, then look at large CMS's like Sharepoint, and work on other technologies in your spare time. Optionally you could take a job at a startup / a web company later; but look before you leap.
Have you tried Spring MVC? Many companies do use Java for web-apps (or .NET) and web service based applications.
Since you mentioned Ruby on Rails, you might want to learn Ruby on Rails. It has got some good programming practices in it and a very well thought architecture. The Ruby community itself have also (in my personal opinion) created very innovative frameworks and highly favor testing and quality. You can see this by the innovative testing framework like Cucumber, webrat, shoulda, coulda, rspec, test/spec. Many startups also uses Rails as their platform, so it should be easier for you to get a job. You can start looking at Working With Rails and 37signals job board. So there is a good ecosystem inside Rails and Ruby community.
But the downside of Rails compare to Django is mainly there are too much magic (less explicit) and the docs is not as good as Django. If you want to get a Django job, try looking at several news site because Django grew up from a newspaper site so it is adopted alot in news based sites.
I would recommend ASP.NET MVC, Ruby on Rails, or Python/Django, they all seem to be popular and successful, and based on the MVC paradigm which is definitely the right tool for the job when it comes to the web.
.NET and Java are by far the 2 largest platforms used by employers, and hence the most in-demand when searching for a job. Java has a few popular frameworks, with JSF, Spring MVC and Struts all seeming to be about equal in demand. I don't use .NET, but from what I've seen, ASP.NET and ASP.NET MVC are the major ones.
I would say that most of the frameworks mentioned here promotes good practices. But that doesn't neccesarily mean that the companies using those frameworks are actually following those good practices! In fact most probably aren't. So don't expect too much.
You see, places like Stack Overflow, Hacker News etc. are a great way to connect with people who really care about their craft. Sadly this is a minority. There are millions of programmers in the world. Most of them suck. The code they write sucks. They don't care. They are not interested in improving their skills. They just want to learn the bare minimum required to collect their paycheck, go home, feed the dog, spend some time with the family, watch some TV, go too bed and do it all over again the next day.
Okay that was a bit harsh :) What I'm getting at is that you are probably better off asking this question to some of the managers at the companies where you would like to work. My guess is that most of them will answer .NET or Java. If you are up for a laugh ask them why they chose that particular technology over something else, and see how many buzzwords they throw at you ;)

Should I re-write the webapp front end in Erlang?

I have a Rails webapp [deployed on Heroku] which makes a series of HTTP calls to other sites on a repeated basis, using Heroku's rake:cron feature. The current situation isn't ideal; the rake:cron process is executed in a single thread, which means HTTP calls are made sequentially; which means in turn that there's a long time between calls to the same site [typically 2 mins].
I'd like to execute this process in parallel, and reduce the time between calls to 10 secs. Having seen Kevin Smith's 'Erlang in Practice' I'm sold on the idea of using Erlang as a replacement backend. What I'm trying to figure out [given Damien Katz's comments], is whether I should a) re-write the entire webapp in Erlang, front end and all or b) maintain a split structure, with a Rails frontend / Erlang backend.
I like the idea of using a 100% Erlang stack for the project; I'll need to use some kind of Erlang web framework [Nitrogen ? Erlyweb ?]; I'm concerned they're not mature enough and I'll spend my time bogged down on the web part of the project with them.
Anyone any views ? Thanks.
What's the actual impact on your visitors (of the two-minute interval between HTTP backend calls)?
If there isn't much of a difference, I'd say this sounds like premature optimization and that you'd be much better off skipping Erlang for now.
The two previous posters have pretty much covered they philosophical aspects of your question. So I'll answer the framework maturity/getting bogged down part of your question.
In the event that you decide you do want to rewrite the webapp in erlang for whatever reason then I wouldn't be too concerned about the framework slowing you down. Both erlyweb and nitrogen are already feature complete enough that you can work pretty quickly with them. I've developed a fairly complex agile project management app in nitrogen and found it to be quite intuitive and not really lacking in features that I needed. A few hours in the evenings and a few weeks later and I had a working app up and running.
As to which one to use that depends on the type of app you want to build.
Nitrogen's target is extremely dyamic web applications. Most of the page is rendered using javascript and it is highly event driven.
ErlyWeb is more suited to a site where the content is the primary focus less so than a rich client type of application. It uses the MVC style of architecure.
Good luck on whatever you decide.
It depends. How much Erlang do you know? How much code have you already written?
How much project experience do you have? Is this for work or for fun?
Rewriting projects from scratch is often a recipe for disaster, especially if you are trying
to learn a new language along the way. It seems to me like you would not be asking this question if you were already fluent in both languages, in which case I would recommend that you just stick to Ruby if it's a work project.
I disagree with the above poster that changing the language is a premature optimization, if it is necessary.
Changing the language is a big deal. It can't be done at the last minute.
However, I would probably not change the language at all for the reason you outlined.
If you don't have any other reasons than performance for switching, you should probably just
look at multi-threading in Ruby or some other optimization.
I'm all about using the right tool for the job. Unless you have an absolutely dead on reason to port the front, there's absolutely nothing wrong with hooking the two together.

Resources