Is Cassandra production ready for Ruby on Rails? - ruby-on-rails

I'm working on a project that is considering using Cassandra as a database. We would like to eventually migrate to Cassandra, given its scalability, even if we use MySQL to start with. I know that big companies like Facebook, Digg, and recently Twitter are using Cassandra, but I don't believe any of those sites run off Rails. My question is whether or not it's feasible to use Cassandra with Ruby on Rails. Points to consider:
We heavily rely on the Authlogic gem. Would switching to Cassandra affect how it works?
Are there any mature Ruby clients for Cassandra? Looking on GitHub, it seems that fauna's client (now Twitter's client) is the most mature. Has anyone had production experience with it?
Appreciate any tips.

Twitter is running rails on most of their front end. Fauna's client is actually built and released by twitter, so you can be pretty certain that it's up to date and stable on large workloads. Looking at the history of commits shows that there are frequent improvements being pushed to it, which is great.
Most likely Authlogic would need to be customized to work properly with Cassandra. In particular, it appears to provide certain methods based on named_scope and relational data.
It does appear that someone has built a plugin for DataMapper support in Authlogic: http://twitter.com/collintmiller/statuses/2064046718. You may be able to use that as a starting point for making it compatible with Cassandra.
Good luck!

I don't think starting with MySQL and then moving to Cassandra is a good idea.
Cassandra is a NoSQL solution, while MySQL is a "classic" SQL-driven database.
This means that your models would be different.
If you start with MySQL, you will have to rely on ActiveRecord for creating your models. If you then change to Cassandra, you will have to change all your models to a NoSQL-compatible middleware (such as BigRecord). This not only means changing your models, but also the controllers that use them (since their interface would be different).
This said, Cassandra and the like are supposed to be used on very demanding applications - like twitter.
The rest of web applications out there are orders of magnitude less intense - are you sure you still would need Cassandra?
PostgreSQL, and a well-designed database, is just good enough 98% of the time.

"If you then change to Cassandra, you will have to change all your models to a NoSQL"
This isn't true at all. If you have programmed in such a way that your MySQL db does loads of joins, then yes, you may have a problem. We avoided joins as much as we could from the beginning when we started down the MySQL route. Then when we started migrating to Cassandra it was fairly easy: we did so with 1 model only at first, then, say, 4 models in one go, and so on. It works well. In fact, when you read the interview with Twitter you'll notice they ran MySQL and Cassandra in parallel for the same model for a while: http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an-interview-with-ryan-king.
As to Authlogic, you can keep that part in MySQL for as long as you like; just keep it loosely coupled with your Cassandra data.
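The model-by-model migration described above, with both stores running in parallel for a while, could be sketched roughly as follows. This is only an illustration in plain Ruby: InMemoryStore and DualWriteRepository are hypothetical names, and the hashes stand in for what would really be ActiveRecord on MySQL and a Cassandra client.

```ruby
# Hypothetical stand-in for a backend; a real app would wrap ActiveRecord
# (MySQL) or a Cassandra client instead of a plain hash.
class InMemoryStore
  def initialize
    @rows = {}
  end

  def write(id, attrs)
    @rows[id] = attrs.dup
  end

  def read(id)
    @rows[id]
  end
end

# While a model is being migrated, every write goes to both stores and
# reads come from whichever store is currently the primary.
class DualWriteRepository
  def initialize(old_store:, new_store:)
    @old_store = old_store
    @new_store = new_store
    @read_from_new = false
  end

  def save(id, attrs)
    @old_store.write(id, attrs)
    @new_store.write(id, attrs)
  end

  def find(id)
    (@read_from_new ? @new_store : @old_store).read(id)
  end

  # Flip reads over once the new store has been verified.
  def cut_over!
    @read_from_new = true
  end
end

mysql = InMemoryStore.new
cassandra = InMemoryStore.new
tweets = DualWriteRepository.new(old_store: mysql, new_store: cassandra)

tweets.save(1, { text: "hello" })
before_cutover = tweets.find(1) # served by the old store
tweets.cut_over!
after_cutover = tweets.find(1)  # served by the new store
```

Keeping the rest of the app behind one small interface like this is what makes it practical to move one model at a time; a real migration would also need a backfill of existing rows and a way to detect divergence between the two stores.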

I'm researching Cassandra, MongoDB, and CouchDB right now.
One way to tell which has the most developer support is to check the number of watchers on the highest-rated GitHub project for each, at least as a rough estimate.
Right now it's
852 - MongoDB
http://github.com/jnunemaker/mongomapper
544 - CouchDB
http://github.com/jchris/couchrest
178 - Cassandra
http://github.com/fauna/cassandra
Although, I have to say with a bunch of high profile sites (Twitter, Digg, Reddit, etc) recently announcing that they're moving to Cassandra, this is a big vote of confidence for them.
Mongo seems to have the most and best documentation so far. Their auto-sharding is still in alpha though so how well it scales still remains to be seen I think.
I'm just starting to learn about all this stuff, so if others have insight please share.

There is also http://github.com/NZKoz/cassandra_object, which IIANM builds on top of the fauna client. "Cassandra Object provides a nice API for working with Cassandra. CassandraObjects are mostly duck-type compatible with ActiveRecord objects so most of your controller code should work ok... Use this in production only if you're looking to help out with the development, there are a bunch of rough edges right now."

Related

Heroku vs DotCloud vs Duostack vs other cloud/PaaS providers (Rails and non-Rails)?

We have a very simple function (We look something up from a third party database and return an answer. It's literally five lines of code.) We would like to offload this task from our main server because we expect a high volume of traffic for this one function and would like to optimize it.
We are thinking about testing the promise of many cloud/PaaS providers, where they handle scaling and performance responsibilities.
We're most interested in Rails environments, but are curious to hear experiences from others about any company in the space.
Here are the PaaS companies we found that support Rails:
1) Heroku
2) DotCloud
3) Duostack
Questions:
1) Do you know of other Rails-specific companies? Also feel free to list non-Rails companies since we're interested in following other companies in case they eventually provide Rails support.
2) How has your experience been with these companies?
Foreword and disclaimer: I work for DotCloud; so the following might be biased. You've been warned.
DotCloud could be interesting for you if you like the following features:
run something else than Ruby (what about some Django or Pylons code talking with your SQL DB? Or even some PHP blog like WordPress or Drupal, using the same user authentication database?)
experiment with databases like Redis or MongoDB, or background ruby workers, without paying for add-ons
SSH access, crontab access (without requiring an add-on)
cheaper workers (I didn't come up with this one; some of our users coming from the Heroku world told us that workers were insanely expensive there)
Duostack is indeed very nice if you want to mix specifically Rails and Node.js. I've been told that they had awesome auto-configuration facilities.
Finally, if you only plan to do Rails and nothing else, ever, you might as well stick with Heroku since they've been in that business for a while, and are probably more mature than the first two of the batch.
Shameless plug: DotCloud is offering a beta test drive, so if you want to see what it looks like, just subscribe to the beta and you will quickly be able to see for yourself. Heroku has a free tier as well.
You could add EngineYard to the mix, but I'd be inclined to use Heroku as my first choice and DotCloud second (it's a newish product, and is very good but still in development).
If you want more control over your app/servers or want to run it on any cloud or your own infrastructure without having to download/deploy anything, you can try Cloud 66 (www.cloud66.com)
Disclaimer: I work for Cloud 66
A lot has changed on the scene since this question was asked. We recently looked into these services and settled on Heroku, but even more recently decided to continue managing our own deployments directly on EC2. Here are some points not mentioned in the other answers.
Heroku
Now supports much more than just ruby
Has really great-looking support for PostgreSQL
Uses LXC for process containers, like DotCloud
DotCloud
Is now Docker, and is putting a lot of manpower into developing docker.io
Doesn't have a free tier any more
I'm not sure if DotCloud is using Docker internally or not, since the docs say explicitly it isn't production-ready yet.
Our decision to stick with plain EC2 was motivated by the fact that it's cheaper and affords a lot more flexibility. For example, we use local-only http servers behind our public server to do some of our request processing, which doesn't really fit into the PaaS models out there. We would have had to reimplement all our back-end components as redis workers, and pay for them as additional dynos. The fact that Amazon RDS now supports PostgreSQL was also a compelling factor. Incidentally, Amazon has a full-stack PaaS offering as well, Elastic Beanstalk.
Just stumbled upon the question. There are similar ones around here. The problem is also: The PaaS scene is changing very quickly. New vendors are popping in every week or so.
Nowadays OpenShift from Red Hat might also be mentioned here as a Ruby PaaS.
OFFTOPIC + shameless plug: I have compiled a list of PHP PaaS here: http://blog.fortrabbit.com/comparing-cloud-hosting-platforms/

Choosing Ruby on Rails as a platform for a browser-based online game

I have some (I think) really great ideas for an online strategy game similar to Travian. There's some content that I haven't yet figured out and some other challenges that I don't know of yet.
This is quite a big project and perhaps too heavy for one person who isn't a skilled web developer (yet). I'd still like to give it a shot, but I'm having trouble choosing a platform. The word "scales" has been thrown around a lot lately and I've seen Ruby on Rails being bashed because it doesn't scale well, so I've come here to get some answers.
I like Ruby on Rails, both Ruby and Rails. I'm certainly no expert at it but I love working with it. I have also worked with Python + Django before and also with PHP (which I am not fond of.)
Ideally the game would have, let's say, 7000 players per server, presumably a lot of data to be processed per second. Would RoR still be a viable platform?
I'm sorry if this question is vague, I guess I'm looking for a "RoR is fine, go at it!" kind of answer. Anything you might want to add is fine.
Thanks!
So if I were you, I would be looking into non-blocking servers like node.js, just because they are MUCH more suited to keeping many connections open for long periods of time, which is what games need to do, compared to traditional web servers.
That being said
There are 3 main things to worry about when you are scaling a web app: memory, execution speed, and IO (disk and network), in that order.
For memory, things are much better than they used to be. Phusion Passenger uses copy on write to fork its workers, so the rails environment will get shared among all the workers on a given slice, which is pretty significant. There have also been huge improvements to the way ruby manages memory compared to "the dark times"; if you are using 1.8.7 then you want to be using the patches that make up Ruby Enterprise Edition (the difference is like night and day). 1.9.x was pretty much a total rewrite of the runtime, so if you are using that, the memory issues ruby had have already been addressed.
For execution speed, 1.8.7 is typically "fast enough" (at least after tuning garbage collection settings). 1.9.2 is actually around the same speed as python, which puts it on the faster side of interpreted languages. How important this point is completely depends on the nature of your application.
Last point is IO, which isn't really a concern of rails, but more your persistence strategy. Rubyists tend to love new things, so you will find first class support for things like redis and mongodb, with loads of people talking about using them and their wins/gotchas. I would look into mongo if I were you and see if the durability trade-offs are acceptable.
I was in java/.net before going to rails, and at the end of the day you are going to pay more for infrastructure, but the amount will be completely dwarfed by what you save in development time.
build it in Rails, host it on Heroku.com - job done. Almost infinite scaling that you don't have to worry about how it works (it just does) and it hosts a lot of highly trafficked Facebook apps so can more than handle it.
I think Ruby on Rails is a good choice for what you need. Actually, we recently created a platform for an online gaming tournament, where players and their gaming bots were playing a logic game.
We used Ruby on Rails and Sidekiq on the backend, ReactJS and WebSocket on the frontend. And it worked well for a quite massive number of players. Here is the tutorial based on what we learnt while building it: How to Write a Game Engine Using Ruby on Rails
As you said yourself, you already have the answer and you are only looking for encouraging words :). I am not a RoR expert myself, but I don't think scalability is still such a great issue on this platform. I would advise you to do an architecture spike (XP terminology). Write a test with 7000 clients and a method which performs operations similar to what you intend to create. For example you might load files, render views or even just wait... The point is to test only the thing you are worried about. Good luck!
This is a bit of an impossible question to answer, because in order to know whether rails is suitable for what you want to do we would need to know a lot more about what you are trying to do. The best advice I can give in the absence of information is for you to check out the railslab scaling videos in order to work it out for yourself.
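A spike along those lines might start out as small as the sketch below. It is only a rough, in-process approximation: simulated_request is a hypothetical stand-in for whatever operation you are worried about, the pool size of 50 is an arbitrary assumption, and a real spike would drive the running app over HTTP rather than call a method directly.

```ruby
require "benchmark"

CLIENTS = 7000 # figure taken from the question
WORKERS = 50   # assumed pool size; tune for your machine

# Hypothetical stand-in for the operation under test
# (rendering a view, loading a file, a database call, ...).
def simulated_request(client_id)
  (1..200).reduce(:+) + client_id
end

# Pushes one job per client onto a queue and drains it with a fixed
# pool of threads, returning [completed_count, elapsed_seconds].
def run_spike(clients, workers)
  jobs = Queue.new
  clients.times { |id| jobs << id }
  jobs.close # pop returns nil once the closed queue is empty

  completed = Queue.new
  elapsed = Benchmark.realtime do
    pool = Array.new(workers) do
      Thread.new do
        while (id = jobs.pop)
          simulated_request(id)
          completed << id
        end
      end
    end
    pool.each(&:join)
  end
  [completed.size, elapsed]
end

completed, seconds = run_spike(CLIENTS, WORKERS)
puts format("%d requests in %.3fs", completed, seconds)
```

Swapping the body of simulated_request for the real work (or an HTTP call against a staging server) keeps the spike focused on the one thing being measured.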

How to start developing rails/django application using NoSql Database like cassandra?

Hello, I want to use a NoSQL database in my Rails/Django application for learning purposes. What are the various things I should keep in mind?
Any tutorials?
Things to keep in mind?
Any tips, like dos and don'ts?
EDIT
I am fully flexible. I want to learn. I know PHP, Rails, and Django. I want to create an application using a NoSQL database for learning purposes. Cassandra is just an example. Any other NoSQL database will also work.
The django-nonrel scene is a good place to start reading up
Daniel Kehoe has a nice tutorial on using Rails 3, Devise and MongoDB.
http://github.com/fortuity/rails3-mongoid-devise
Addressing the Rails part of your question.
I would say that it is definitely viable to use Ruby on Rails with Cassandra so long as you are comfortable with giving up some of the ActiveRecord idioms you may have become accustomed to. But that is probably true to a greater or lesser extent with any marriage of Rails and a NoSQL datastore. The closest thing to ActiveRecord that is emerging is the Cassandra Object gem, but this is still a work in progress by the author's admission. The most stable interface seems to be the Cassandra gem, but this is a relatively low-level API.
If you are interested in Cassandra from a learning perspective or you have identified it as the best option for an application you are building then well and good. Assuming you select Rails and Cassandra you can rest assured that the support for these will only improve and probably quite quickly given the growing interest in NoSQL and in Cassandra in particular.
However, if you have other options which would work and Cassandra is only one of them, then I would enter some caveats. Firstly, Rails on Cassandra is an evolving entity, so you may encounter instability, or things you expect to work may not work at all or may work completely differently than you expect. Secondly, and related to this, there are very few Rails on Cassandra deployments out in the wild right now, so getting support from forums will be all the more difficult. You may end up on your own with something when you can't afford to be. You may end up having to roll up your sleeves and pitch in to help with supporting the code yourself, which may be no bad thing.
Personally, I would wait to see how this picture pans out before I'd go with Cassandra unless I felt that nothing else could do the job as well. If it's for learning then I'd say go ahead. Can be a lot of fun being at the bleeding edge of things like this.
References:
Real world Ruby and Cassandra
Up and running with Cassandra
This is an example app for Django with Cassandra, and there is an interesting post about django-nonrel, and finally a feed about Cassandra.
You must keep in mind that Django's ORM doesn't work completely, because of the differences between SQL and NoSQL backends: some things like joins and complex filters are not available. You have to change the way you think about the database.
The Django dev team is working on official NoSQL support.
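To make the point about joins concrete, here is a small sketch of the usual change in mindset: instead of joining at read time, the data a query needs is copied into place at write time. It is written in plain Ruby with hashes standing in for tables and column families; the names users, posts, posts_by_user, and publish are hypothetical and not part of Django, Rails, or any Cassandra client.

```ruby
# Relational style: two "tables" and a join performed at read time.
users = { 1 => { name: "alice" } }
posts = [
  { id: 10, user_id: 1, title: "Hello" },
  { id: 11, user_id: 1, title: "Again" }
]
joined = posts.map { |post| post.merge(author: users[post[:user_id]][:name]) }

# Denormalized style: the author's name is copied into each post when it
# is written, so reading a user's posts is one lookup by key, no join.
posts_by_user = Hash.new { |hash, key| hash[key] = [] }

def publish(posts_by_user, users, user_id, post)
  posts_by_user[user_id] << post.merge(author: users[user_id][:name])
end

publish(posts_by_user, users, 1, { id: 12, title: "New" })
timeline = posts_by_user[1]
```

The trade-off is that a later change to the user's name has to be propagated to every copy, which is why the data model tends to be designed around the queries the application will run.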

Is the Rails ecosystem a suitable replacement for drupal

I want to make a community based site, which is Drupal's strength. However I also want to try other frameworks, especially Rails.
One of the best things about drupal is its huge modules library. If I were to switch to Rails, would I be able to find similar functionality freely available as plugins, or would I have to rebuild?
Does Rails have the equivalent of (as plugins or gems):
CCK/Fields?
Node Reference?
Views / Views Relationships?
PathAuto?
Threaded Commenting?
Multisite Functionality?
Apache Solr (or equivalent) Integration?
Thanks.
I'm afraid you'll probably hear this answer a lot, but it's not a suitable comparison.
Drupal is ultimately a CMS; Rails is a framework. Apples to oranges, or perhaps even apple juice to oranges. Out of the box, you fire up Drupal and it does 'things': it has a database structure, the concept of nodes, interfaces, blah, blah. If you fire up Rails you have an empty project.
As far as I know there isn't a "Drupal-on-Rails" project that would be a suitable equivalent. However, I can attest to the fact that there is an awful lot of Ruby/Rails community and O/S work out there and you might find something suitable. I'd also say that the level of modularity in Ruby and Rails tends to mean that the range of plugins/modules/gems one can use is much greater.
My personal $0.02: if Drupal does what you need, just use Drupal; it's mature and has a great community. It's never a good idea to try to port Project X over to a new language as a learning exercise because you'll inevitably fall into the "Well that's how it's done in language X!" trap and become disenchanted with the new system.
If you're wanting to learn Rails (which you should, it's awesome) I'd suggest you'd be best working on a small project and seeing what the ecosystem offers before deciding if it's suitable for the needs of your bigger projects.
I have to second what Govan said, but add to it.
With Drupal, unless you really want to get into building your own modules and extensions you are really interacting with an application. Even when you start using CCK, all you are really doing is flipping switches, filling in forms and defining new options for content on the site.
Ruby on Rails is two things, and neither of them bears much similarity to Drupal. You asked "How hard is it really?". To answer that you need to understand what both Ruby and Rails are. Ruby is a programming language designed to make the life of the object purist programmer simpler and more pleasant. So, the first part of "how hard is it" is simply to answer "how long do you feel it would take you to learn a completely new programming language, like PHP but different".
Rails is an 'opinionated' framework. It's opinionated in that it lays out how a Ruby web project should be structured, as well as providing multiple APIs for everything from database access to web presentation. To answer the "how hard is it" question for Rails then (assuming you know Ruby by this point), you have to answer how much you need to learn about caching, database design, page design, RESTful programming, etc.
It's not a short journey. You asked if there is an equivalent to CCK for Ruby and Rails, which implies to me that at this point your knowledge of programming is somewhat limited. Ruby and Rails interact with the database. CCK lets you define things in a database. Thus, with Ruby and Rails you are effectively bypassing the wonderful dialogs and forms that CCK provides you with and doing the data definition bits yourself, by hand, in code.
From experience, when I've hired experts in another programming language and framework into my Rails teams, it has taken them between 1 and 3 months to get productive, and a further 3 to 6 months for their productivity to start to rise and approach that of the Rails experts on my team.
Thus, in your particular case, I would not recommend a switch away from Drupal to Ruby on Rails.
Drupal (core) on ohloh (130k lines of code) is estimated to be worth 34 years of work.
Drupal (contributions) on ohloh (modules for Drupal 4-6 (7M lines of code)) is estimated to be 2113 years of work.
That is the power of a community, and that is something that you can never replicate. I remember there was a guy who tried to port Drupal to Python, calling it drupy, but that project died before something useful ever came out of it. Even if you copy the code, you can never copy the community.
The thing you need to realize, is that each community is different. So even if you find a project that can solve your code needs in a RoR or a different language/framework, it will never be like Drupal and vice versa.
So don't try to find a replacement for Drupal, but go explore and try new things. You might end up learning new things, that you can use for your Drupal projects.
I've read time and time again people saying that comparing Drupal and RoR is like comparing apples to oranges, and so is wrong.
I think the saying itself is BS. Yes, we want to compare apples to oranges and find out which is better. We even want to compare apples to steak. That said, they are different. Yes, we all know. I have limited experience with either. I first thought Drupal was great and could help me build the website I wanted overnight (or over a week or month); then it didn't happen (not blaming Drupal).
My impression is that Drupal may still be great, but it has a learning curve and needs a lot of other knowledge or talent to use well and tweak. RoR, on the other hand, is a more general framework and needs programming (Drupal does too, actually).
If you are more of a web designer person with a little PHP, maybe Drupal is a better fit.
If you are more of a web developer type who doesn't want to spend time looking for modules and making them work, but would rather build them yourself (not really from the ground up), then maybe RoR is for you (with the same amount of learning). So yes, they are both good for different purposes, backgrounds, etc.
For now I will go with RoR (or Django and other ORANGEs). My 2 cents.
Rails, since version 3.0, has officially adopted the once-controversial engine way of incorporating third-party apps. This is roughly the equivalent of Drupal's modules/plug-ins, from a 10k foot perspective. To build a community-based site, you could make use of an engine called, appropriately enough, "Community Engine": http://communityengine.org/features.html
The Rails ecosystem doesn't have anywhere near the same number of modules Drupalists have available to them, but there are enough good quality ones to cover the chief basics.
Drupal has so many strong areas that it's hard for just one or two people to recreate it in a decent amount of time in any language: PHP, Ruby, Python, etc.
You have the core node system, taxonomy, aliasing, menus, users, permissions, and modules, the database api, and form api, among others.
You'd have to know how to assemble all these pieces independently and create the structure necessary for it to all work together.
It would take more than 'a few hours'. I would say, even if you are a ROR master, you're looking at a year to two years of solid consistent work to get the best parts of Drupal for a new system.

What are the limits of ruby on rails?

I have a memory of talking to people who have got so far in using Ruby on Rails and then had to abandon it when they have hit limits, or found it was ultimately too rigid. I forget the details but it may have had to do with using more than one database.
So what I'd like is to know is what features/requirements fall outside of Ruby on Rails, or at least requires such contortions that it is better to use another more flexible framework, even though you may have to lose some elegance or write extra boilerplate code.
Rails (not ruby itself) is proud to be "Opinionated Software".
What this means in practice is that the authors of rails have a certain target audience in mind (themselves basically) and aim rails specifically at that. If X feature isn't needed for that target audience, it doesn't get added.
Off the top of my head, things that rails explicitly doesn't support that people may care about:
Foreign keys in databases
Connections to multiple DB's at once
SOAP web services (since rails 2.0)
Connections to multiple database servers at once
That said, it is very easy to extend rails with plugins, and there are plugins which add all of the above functionality to rails, and a lot more, so I wouldn't really count these as limits.
The only other caveat is that rails is built around the idea of creating CRUD web applications using MVC. If you're trying to do something which is NOT a CRUD web app (like twitter, which is actually a messaging system, or if you are insane and want to use a model like ASP.NET webforms) then you will also encounter problems. In this case you're better off not using rails, as you're essentially trying to build a boat out of bicycle parts.
In all likelihood, the problems you will run into that can't just be fixed with a quick plugin or a day or 2 of coding are all inherent problems with the underlying C Ruby runtime (memory leaks, green threads, crap performance, etc).
Ruby on Rails does not support two-phase commits out of the box, which may be required if your database-backed application needs to guarantee immediate consistency AND you need to use two or more database schemas.
For many web applications, I would venture that this is not a common use-case. One can perfectly well support eventual consistency with two or more databases. Or one could support immediate consistency with one database schema. The former case is a great problem to have if your app has to support a mondo amount of transactions (note the technical term :). The latter case is more typical, and Rails does just fine.
Frankly, I wouldn't worry about limits to using Ruby on Rails (or any framework) until you hit real scalability problems. Build a killer app first, and then worry about scalability.
CLARIFICATION: I'm thinking of things that Rails would have a hard-time supporting because it might require a fundamental shift in its architecture. I'll be generous and include some things that are part of the gem/plugin ecosystem such as foreign key enforcement or SOAP services.
By two-phase commits, I mean attempting to make two commits to physically distinct servers within one transactional context.
Use case #1 for a two-phase commit: you've clustered your database, so that you have 2 or more database servers and your schema is spread across both servers. You may want to commit to both servers, because you want to allow ActiveRecord to do a "foreign key map" that traverses across the different servers.
Use case #2 for a two-phase commit: you're attempting to implement a messaging solution (sorry, I'm a J2EE developer by day). The message producer commits to the messaging broker (one server) and to the database (a different server).
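For readers who haven't met the term, the shape of a two-phase commit can be sketched in a few lines of plain Ruby. This is a toy illustration only: Participant and two_phase_commit are hypothetical names, nothing here is a Rails or ActiveRecord API, and a real implementation also needs durable logs, timeouts, and recovery after a coordinator crash.

```ruby
# Hypothetical participant: stands in for a database or message broker that
# can stage a change (prepare) and later make it permanent or discard it.
class Participant
  attr_reader :name, :committed

  def initialize(name, fail_on_prepare: false)
    @name = name
    @fail_on_prepare = fail_on_prepare
    @staged = nil
    @committed = []
  end

  # Phase 1: stage the change and vote yes (true) or no (false).
  def prepare(change)
    return false if @fail_on_prepare

    @staged = change
    true
  end

  # Phase 2, success path: make the staged change permanent.
  def commit
    @committed << @staged
    @staged = nil
  end

  # Phase 2, failure path: discard anything staged.
  def rollback
    @staged = nil
  end
end

# Commits only if every participant votes yes; otherwise rolls everyone back.
# all? stops at the first "no", so later participants are never asked to
# prepare, and rolling them back is a harmless no-op.
def two_phase_commit(participants, change)
  if participants.all? { |participant| participant.prepare(change) }
    participants.each(&:commit)
    true
  else
    participants.each(&:rollback)
    false
  end
end

database = Participant.new("database")
broker = Participant.new("broker")
ok = two_phase_commit([database, broker], "order-42")
```

In use case #2 above, the broker and the database would be the two participants, and the message producer would play the coordinator role.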
Also found some good discussion about the limits of ActiveRecord.
I think there is a greater “meta-question” here, that could be answered and that is “when is it OK to lean on external libraries to speed up development time?”
Third party libraries are often great and can drastically reduce development time; however, there is a major problem, which Joel Spolsky calls “the law of leaky abstractions.” If you look that up on Google his post will come up first. Essentially, the trade-off for the saved development time is that you have no idea what is going on under the covers. So when something breaks you are completely stuck and have very limited methods of debugging. This also means that if you hit one of the features that are simply unsupported in Rails, and that you really need, you’ll have no next step except to write the feature yourself, if you’re lucky. Many libraries can make this difficult to do.
We’ve been burned badly in my dev shop by this issue. Our solutions worked fine under normal load, but we found that the third party subscription libraries that we were using simply could not stand up to the kind of load that we experienced once our site started to get a large number of concurrent users. This puts us in a very difficult place; essentially we have to rewrite the entire subscription service ourselves, with performance in mind. Doing this means that we’ve wasted all the time that we spent using the library.
Third party libraries can be great for small to medium sized applications; they can drastically reduce development time and hide complexities that aren’t necessary to deal with in the early stages of development. However, eventually they will catch up with you and you’ll likely have to rewrite or re-engineer your solution to get past the “law of leaky abstractions.”
Ruby doesn't have functionality like IsPostBack in ASP.NET.
Orion's answer is right on. There are few hard limits to AR/Rails (deploying to Windows, AR connectors that aren't frequently used, e.g. Firebird), but even for the things he mentioned, multiple databases and DB servers, there are gems and plugins that address those for legacy, sharding, and other reasons.
The real limitation is how time-consuming it is to keep on top of all the things that rails devs are working on, and researching specific issues, given how many blogs, and how much mailing list volume there are.
