ActiveRecord and NoSQL - ruby-on-rails

I've been working with Rails for a few years and am very used to ActiveRecord, but have recently landed a task that would benefit from (some) NoSQL data storage.
A small amount of data would be best placed in a NoSQL system, but the bulk would still be in an RDBMS. Every NoSQL wrapper/gem I've looked at, though, seems to necessitate the removal of ActiveRecord from the application.
Is there a suggested method of combining the two technologies?

Not sure which NoSQL store you are looking into, but we have used MongoDB in concert with Postgres for a while now. A helpful hint: people say you need to get rid of ActiveRecord, but in reality you don't. Most only say that because, when you skip ActiveRecord, you end up not setting up your database.yml or running the rake commands that set up the AR database.
Remember also that Postgres has the HStore and JSON data types, which give you functionality similar to a NoSQL datastore. Also, if the data you are looking to store outside of your AR database is not very complex, I would highly recommend looking into Redis.
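As a minimal sketch of the JSON route (the events table and payload column are hypothetical, and jsonb with a GIN index assumes Postgres 9.4+ and roughly Rails 4.2):
class AddPayloadToEvents < ActiveRecord::Migration
  def change
    add_column :events, :payload, :jsonb, default: {}, null: false
    add_index  :events, :payload, using: :gin   # fast containment queries
  end
end

# Document-style data queried through plain ActiveRecord:
Event.where("payload ->> 'kind' = ?", "click")
Event.where("payload @> ?::jsonb", { device: "mobile" }.to_json)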

If you look at the Gemfile.lock of this project, you can see that it uses ActiveRecord with Mongoid.
Even if some of the other gems you use don't need ActiveRecord, that shouldn't bother you; as long as you have a valid reason to use ActiveRecord, keep it.
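For illustration only (the version numbers and adapter are placeholders, not taken from that Gemfile.lock), the two ORMs can simply sit side by side in one Gemfile, each managing its own models:
source "https://rubygems.org"

gem "rails", "~> 4.2"
gem "pg"        # relational models keep inheriting from ActiveRecord::Base
gem "mongoid"   # document models include Mongoid::Document instead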

Related

How can I migrate a Rails app from using MongoDB to PostgreSQL?

I have an existing Rails app that I've put about 100 hours of development into. I would like to push it up to Heroku, but I made the mistake of using MongoDB for all of my development work. Now I have no schemas or anything of the sort, and I'm trying to push out to Heroku and use PostgreSQL. Is there a way I can remove Mongoid and use Postgres? I've tried using DataMapper, but that seems to be doing more harm than good.
Use PostgreSQL's JSON data type and transform the Mongo collections into tables, where each table is just an id plus a doc (JSON) column; then it's easy to move from one to the other.
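A rough sketch of that shape, assuming Mongoid is still loaded during the transition and using a hypothetical users collection (t.json needs Rails 4+ on Postgres):
class CreateUsersDocs < ActiveRecord::Migration
  def change
    create_table :users_docs do |t|
      t.string :mongo_id, null: false, index: { unique: true }
      t.json   :doc,      null: false
      t.timestamps
    end
  end
end

# One-off copy loop; User is the existing Mongoid model.
class UsersDoc < ActiveRecord::Base; end

User.all.each do |user|
  UsersDoc.create!(mongo_id: user.id.to_s, doc: user.as_document)
end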
Whether the migration is easy or hard depends on a very large number of things, including how many different versions of data structures you have to accommodate. In general you will find it a lot easier if you approach this in stages:
1. Ensure that all the Mongo data is consistent in structure with your RDBMS model and that the data structure versions are all the same.
2. Move your data. Expect that problems will be found and you will have to go back to step 1.
The primary problems you can expect are data validation problems, because you are moving from a less structured data platform to a more structured one.
Depending on what you are doing regarding MapReduce, you may have some work there as well.
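To surface those validation problems before the real move, one option is a dry run that builds (without saving) ActiveRecord objects from each Mongo document and collects the errors; LegacyUser and User are hypothetical model names here, not from the question:
# Dry-run validation pass: report which Mongo documents won't satisfy the AR model
errors = Hash.new { |h, k| h[k] = [] }

LegacyUser.all.each do |doc|                # LegacyUser = Mongoid model
  record = User.new(doc.attributes.slice("email", "name", "created_at"))
  errors[record.errors.full_messages] << doc.id unless record.valid?
end

errors.each { |messages, ids| puts "#{messages.join(', ')}: #{ids.size} docs" }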

Switching from SQL to MongoDB in Rails 3

I am considering switching a quite big application (Rails 3.0.10) from our SQL databases (SQLite and Postgres) to MongoDB. I plan to put everything in it: mainly UTF-8 strings, binary files, and user data (maybe also a little full-text search). I have complex relationships (web structure: categories, tags, translations..., polymorphic ones as well) and I feel that the MongoDB philosophy is to avoid that and to put everything in big documents. Am I right?
Does anyone have experience with MongoDB in Rails? Particularly with switching an app from ActiveRecord to Mongoid? Do you think it's a good idea? Do you know of a guide/article on the MongoDB way of organizing complex data?
PS: What I particularly like about MongoDB is the freedom offered by its architecture and its orientation towards performance. Those are my main personal motivations for considering the switch.
I have been using MongoDB with Mongoid for 5-6 months. I have also worked with Postgres + AR and MySQL + AR, but have no experience with switching AR to Mongoid.
Are you facing any performance issues, or do you expect to face them soon? If not, I would advise against the switch, as the decision seems to be based purely on the coolness factor of MongoDB.
They both have their pros and cons. I like the speed of MongoDB, but there are many restrictions on what you can do to achieve that (like no joins, no transaction support, and slow field-vs.-field queries such as updated_at > created_at).
If there are performance issues, I would still recommend sticking with your current system, as the switch might be a big task, and it would be better to spend half that time optimizing the current system. After reading the question, I get the feeling that you have never worked with MongoDB before; there are many things which can bite you, and you would not be fully aware of how to solve them.
However, if you still insist on switching, you need to carefully evaluate your data structures and the way you query them. In a relational database you have the normal forms, which have the advantage that whatever structure you start with, you will reach roughly the same end result once you do the normalization. In MongoDB there are virtually unlimited ways in which you can model your documents, and you need to model them carefully to reap the benefits of MongoDB. The queries you need to run play a very important role in that structuring, along with the actual data you want to store.
Keep in mind that you do not have joins in MongoDB (this can be mitigated with good modeling). As of now you cannot have queries like field1 = field2, i.e. you can't compare fields; you have to provide a literal to query against.
Take a look at this question: Efficient way to store data in MongoDB: embedded documents vs individual documents. Somebody points the OP to a discussion where embedded documents are recommended for a pretty similar scenario, but the OP chooses to go with standalone documents because of the nature of the queries he will be using to fetch the data.
All I want to say is that it should be an informed decision, taken only after you have completely modeled your system with MongoDB and run some performance tests with real data to see whether MongoDB solves your problem; it should not be based on the coolness factor.
UPDATE:
You can do field1 = field2 using a $where clause, but it is slow and is generally advised against.
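For completeness, a hedged sketch of that escape hatch through Mongoid (Post is a hypothetical model; $where evaluates JavaScript per document, which is why it is slow):
# Field-vs.-field comparison via $where -- use sparingly.
Post.where("$where" => "this.updated_at > this.created_at").count

# For contrast, the literal-based query MongoDB is fast at:
Post.where(:updated_at.gt => 1.day.ago).count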
We are currently switching a production application from PostgreSQL, tsearch, and PostGIS to MongoDB. It has been a challenging process, to say the least. Our data model is a better fit for MongoDB because we don't need to do complex joins; we can model our data very easily into the nested document structure MongoDB provides.
We have started a mirror site with the mongodb changes in it so we can leave the production site alone, while we stumble through the process. I don't want to scare you, because in the end, we will be happy we made the switch - but it is a lot of work. I would agree with the answer from rubish: be informed, and make the decision you feel is best. Don't base it on the 'coolness' factor.
If you must change, here are some tips from our experience:
ElasticSearch fits well with Mongo's document structure and replaces PostgreSQL's tsearch full-text search extension.
It also has great support for point-based geo indexing (points of interest closest to, or within, x miles/kilometers).
We are using Mongo's built-in GridFS to store files, which works great. It simplifies sharing user-contributed images and files across our cluster of servers.
We are using rake tasks to dump data out of PostgreSQL into YAML format. Then we have another rake task in the mirror site which imports and converts the data into models stored in MongoDB (a sketch of the dump side follows this list of tips).
The data export/import might also work using a shared Redis database, Resque on both sides, and an observer in the production application to log changes as they happen.
We are using Mongoid as our ODM, and there are a lot of scopes within our models that needed to be rewritten to work with Mongoid instead of ActiveRecord.
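A minimal sketch of the dump side mentioned above (the Place model and file path are made up; the mirror site's import task would read the same YAML and build Mongoid documents):
# lib/tasks/export.rake
namespace :export do
  desc "Dump places to YAML for the MongoDB mirror to import"
  task places: :environment do
    rows = Place.all.map(&:attributes)   # plain hashes, nothing AR-specific in the file
    File.write(Rails.root.join("tmp", "places.yml"), rows.to_yaml)
    puts "Wrote #{rows.size} places"
  end
end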
Overall, we are very happy with MongoDB. It offers us much more flexibility in the way we model our data. I just wish we had discovered it before the project was started.
Skip ActiveRecord. For a brand-new app, the --skip-active-record flag on rails new does this for you.
Alternatively, if you’ve already created your app, have a look at config/application.rb
and change the first lines from this:
require "rails/all"
to this:
require "action_controller/railtie"
require "action_mailer/railtie"
require "active_resource/railtie"
require "rails/test_unit/railtie"
It’s also important to make sure that the reference to active_record in the generator block is commented out:
# Configure generators values. Many other options are available, be sure to check the documentation.
# config.generators do |g|
# g.orm :active_record
# g.template_engine :erb
# g.test_framework :test_unit, :fixture => true
# end
As of this writing, it's commented out by default, so you probably won't have to change anything here.
I hope this is helpful to you while you switch your app from AR to Mongo.
Thanks.

Using different datastores in the same Rails app?

So this is more or less an implementation question. This is the scenario I have: basically we have an app which uses MySQL as its datastore (user accounts, transactions, etc.), but we want to add in a robust charting feature and that data will be stored in Redis. Now basically my question is:
Is it possible, and what are the best practices for integrating another datastore into an app which already depends on a different one? Can I use Rack to generate the reports? Etc.
I want to turn this into a sort of open discussion, because I think the need for this kind of solution is going to rise as we see more and more key/value stores and other NoSQL stores that offer benefits quite different from an RDBMS. They all have their benefits, but no one solution covers them all.
Thoughts?
You can have models that do not inherit from ActiveRecord::Base. Add your preferred Redis client gem, do whatever configuration is necessary, and start writing Redis-backed models.
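As a sketch only (the class name, key scheme, and metric shape are invented for illustration), a plain Ruby model using the redis gem can live right next to your ActiveRecord models:
# app/models/chart_metric.rb -- not an ActiveRecord model
require "redis"

class ChartMetric
  REDIS = Redis.new   # pass url: ENV["REDIS_URL"] or similar in real config

  def initialize(name)
    @key = "metrics:#{name}"
  end

  # Store one sample in a sorted set, scored by timestamp.
  def record(value, at: Time.now)
    REDIS.zadd(@key, at.to_i, "#{at.to_i}:#{value}")
  end

  # Samples between two times, ready to feed a chart.
  def between(from, to)
    REDIS.zrangebyscore(@key, from.to_i, to.to_i)
  end
end

# ChartMetric.new("daily_signups").record(42)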
Let me try to reopen this topic, because it should be a very practical one.
I have the same issue. I want to replicate data from SQL to NoSQL: SQL stays the main database, because of data integrity, relations, etc., and NoSQL becomes a secondary, read-oriented store. In SQL you have many associations split across many tables; lots of one-to-one associations are kept in separate tables for better readability. With NoSQL those associations can be saved as a single document, which gives unbelievable read speed: only one load, which is great for serving data through an API.
Does anyone have positive experience with replicating SQL data into consolidated NoSQL documents?
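One common shape for that kind of one-way replication is an after_commit hook on the ActiveRecord side that rebuilds a denormalized document; a rough sketch, assuming Mongoid and hypothetical Order/OrderDocument models:
# ActiveRecord stays the source of truth; the Mongo document is a read model.
class Order < ActiveRecord::Base
  has_many :line_items
  after_commit :sync_read_model

  private

  def sync_read_model
    doc = OrderDocument.find_or_initialize_by(order_id: id)
    doc.total      = total
    doc.line_items = line_items.map { |li| { sku: li.sku, qty: li.qty } }
    doc.save!
  end
end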

Can I use a text file as my database in ROR?

I would like to use a delimited text file (xml/csv etc) as a replacement for a database within Ruby on Rails. Solutions?
(This is a class project requirement, I would much rather use a database if I had the choice.)
I'm fine serializing the data and sending it to the text file myself.
The best way is probably to take the ActiveModel API and build methods that parse your files in the appropriate ways.
Here's a good presentation about ActiveModel and ActiveRelation where he builds a custom model, which should have a lot of similar concepts (but a different backend), and also a good blog post by Yehuda about the ActiveModel API.
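A bare-bones sketch of that idea (the Product class, column names, and products.csv path are all invented; validations and naming come from ActiveModel):
require "csv"
require "active_model"

class Product
  include ActiveModel::Model          # gives new(attrs), validations, errors
  attr_accessor :id, :name, :price

  FILE = "db/products.csv".freeze     # headers: id,name,price

  def self.all
    CSV.read(FILE, headers: true).map { |row| new(row.to_h) }
  end

  def self.find(id)
    all.find { |product| product.id == id.to_s }
  end

  def save
    return false unless valid?
    CSV.open(FILE, "a") { |csv| csv << [id, name, price] }
    true
  end
end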
Have you thought of using SQLite? It is a much better solution.
It uses a single file.
It is way faster than doing the serialization yourself.
It is zero configuration. Very simple to use.
You get ACID compliance, transactions, subselects, etc.
MySQL has a way to store tables in CSV. It has some pretty serious limitations, but it sounds like your requirements demand something with some pretty serious limitations anyway.
I've never set up a Rails project that way, and I don't know what it would take, but it seems like it might be possible.
HSQLDB seems to work by storing data on disk as a SQL script that creates your database. It records changes in memory and a log file, and when you shut down it recreates a single SQL script again. I've not used this one myself.
HSQLDB doesn't appear to be one of the supported databases in Rails. I don't know what it would take to add support for a new database.

Object database for Ruby on Rails

Is there drop-in replacement for ActiveRecord that uses some sort of Object Store?
I am thinking something like Erlang's MNesia would be ideal.
Update
I've been investigating CouchDB and I think this is the option I am going to go with. It's a toss-up between using CouchRest and ActiveCouch. CouchRest is pretty mature, and is used in the CouchDB peepcode episode, but it's not a drop-in replacement for ActiveRecord, which is a bit of a disadvantage.
Suffice to say CouchDB is pretty phenomenal.
Update (November 10, 2009)
CouchDB hasn't really worked out for me. CouchDB doesn't really support arbitrary queries (queries need to be written and compiled ahead of time as views). It also breaks on very large datasets.
I have been playing with MongoDB and it's really incredible: a schema-less JSON data store with ad-hoc queries and indexing.
I've even started building a management tool for it called Ming.
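To make the "queries and indexing" point concrete, a tiny hedged example with Mongoid (the Note model is hypothetical, and the index/criteria syntax assumes a reasonably recent Mongoid):
class Note
  include Mongoid::Document
  field :title, type: String
  field :tags,  type: Array
  index({ tags: 1 })                     # secondary index, no predefined view needed
end

Note.create!(title: "Object stores", tags: %w[nosql ruby])
Note.where(:tags.in => %w[nosql]).to_a   # ad-hoc query, unlike CouchDB's precompiled views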
Try Maglev!
ActiveCouch purports to be just such a library for CouchDB, which is, in fact, written in Erlang. I wouldn't say it's as mature as ActiveRecord, though.
That is the closest thing I know of to what you're asking for.
Madeleine is a Ruby implementation of the Java Prevayler object store; see http://madeleine.rubyforge.org/
I'm currently working on a Ruby object database that uses MySQL as a backing store (hence it's called hybriddb) that you may be interested in.
It uses no SQL or migrations; you just save your objects to the database. It also tries to transparently work around the conventional problems with object databases (speed, finding objects quickly, large object graphs).
It is still an early version, so take care. The code is here:
http://github.com/pauliephonic/hybriddb/tree/master
The development branch has support for transactions, and I'm currently adding basic validations.
I have a web site with some tutorials, etc.: http://www.hybriddb.org/pages/tutorial_starter
Any comments are welcome there.
Apart from Madeleine, you can also see:
http://purple.rubyforge.org/
But it depends on scale too. Mnesia is known to support large amounts of data and is clustered, whereas these solutions won't work so well with large amounts of data.
If the amount of data is not huge, another option is:
http://copiousfreetime.rubyforge.org/amalgalite/files/README.html
