How can I migrate a Rails app from using MongoDB to PostgreSQL?

I have an existing Rails app that I've put about 100 hours of development into. I would like to push it up to Heroku, but I made the mistake of using MongoDB for all of my development work. Now I have no schemas or anything of the sort, and I'm trying to push out to Heroku and use PostgreSQL. Is there a way I can remove Mongoid and use Postgres? I've tried using DataMapper, but that seems to be doing more harm than good.

Use PostgreSQL's json data type: turn each Mongo collection into a table whose only columns are the id and the doc (json). Then it's easy to move from one to the other.
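For instance, a minimal sketch of that copy, assuming the mongo and pg gems, local default ports, PostgreSQL 9.5+ (for ON CONFLICT), and placeholder database names:

require "mongo"
require "pg"
require "json"

mongo = Mongo::Client.new(["127.0.0.1:27017"], database: "myapp_development")
pg    = PG.connect(dbname: "myapp_development")

mongo.collections.each do |collection|
  table = pg.escape_identifier(collection.name)
  pg.exec("CREATE TABLE IF NOT EXISTS #{table} (id text PRIMARY KEY, doc jsonb)")

  collection.find.each do |doc|
    body = doc.to_h
    body.delete("_id")  # the ObjectId becomes the primary key instead
    pg.exec_params(
      "INSERT INTO #{table} (id, doc) VALUES ($1, $2) ON CONFLICT (id) DO NOTHING",
      [doc["_id"].to_s, body.to_json]
    )
  end
end

Once everything is in Postgres you can migrate fields out of the json doc into real typed columns at your own pace.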

Whether the migration is easy or hard depends on a very large number of things, including how many different versions of data structures you have to accommodate. In general, you will find it a lot easier if you approach this in stages:
1. Ensure that all the Mongo data is consistent in structure with your RDBMS model and that the data structure versions are all the same.
2. Move your data. Expect that problems will be found and you will have to go back to step 1.
The primary problems you can expect are data validation problems, because you are moving from a less structured data platform to a more structured one; a quick way to surface them ahead of time is sketched below.
Depending on what you are doing regarding MapReduce, you may have some work there as well.
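One cheap way to run step 1's check before attempting step 2 is to push every Mongo document through the validations of its target ActiveRecord model and report the rejects. A sketch, where LegacyUser (a Mongoid model), User, and the field names are all hypothetical:

problems = []

LegacyUser.all.each do |legacy|
  candidate = User.new(email: legacy.email, name: legacy.name)
  problems << [legacy.id, candidate.errors.full_messages] unless candidate.valid?
end

puts "#{problems.size} documents would fail validation"
problems.each { |id, errors| puts "#{id}: #{errors.join(', ')}" }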

Related

Rails seeding cities and areas tables

I have a Rails application with standard city and area rows that I want initialized in my tables in production and development. I have the data in separate files in array form.
An example cairo_areas.txt file contains the following:
["Downtown - Abdin",
"Downtown - Abu El Rish",
"Downtown - Ahmed Helmy",
"Downtown - Ahmed Maher",
"Downtown - Gamea' El Banat"]
This is only part of the array, of course, to serve as an example.
Now I've tried to look for the best way to initialize this data in the database, but most of the answers I found seem very old.
In short, the two approaches I found are:
1. Insert the data in seeds.rb. The downside is that most of the data there is for testing purposes, and I'm trying to separate the real data from the Faker data.
2. Create a task to seed the city and area data into the database; but most of the answers I found here are quite old and target earlier versions of Rails.
I want to know: what is the best way to insert authentic production data into the database at initialization?
Following on from what Mark told you in his answer, there is a third option, widely used for adding data to production environments: creating a migration that creates those records or modifies existing ones.
Two things should be said about seeds.rb. First, this file should contain the minimal amount of data that, according to the business logic of your system, makes the system functional: for example, the creation of roles, permissions, and other information without which your system makes no sense. Second, the seeds.rb script should be idempotent, so you can re-run it at any time without breaking the logic of the data; basically, use Model.find (or find_or_create_by) before doing Model.create. In that sense you can keep adding information to that seed file and re-run it any time.
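For the city/area case above, an idempotent seeds.rb could look something like this sketch (City, Area, the association between them, and the file location are assumptions; the file format is the JSON-style array shown in the question):

require "json"

# db/seeds.rb
cairo = City.find_or_create_by!(name: "Cairo")

areas = JSON.parse(File.read(Rails.root.join("db", "seed_data", "cairo_areas.txt")))
areas.each do |area_name|
  cairo.areas.find_or_create_by!(name: area_name)  # no duplicates on re-run
end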
You might wonder how to choose between a migration and a rake task. I would suggest migrations, because when used correctly you have the ability to roll those changes back, an option you do not have when using tasks.
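A sketch of that migration route (the Area model is hypothetical, and the list is shortened); rake db:rollback reverses it via down, which a rake task would not give you:

class SeedCairoAreas < ActiveRecord::Migration
  AREAS = ["Downtown - Abdin", "Downtown - Abu El Rish"].freeze

  def up
    AREAS.each { |name| Area.find_or_create_by!(name: name) }
  end

  def down
    Area.where(name: AREAS).delete_all
  end
end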
Sorry if I was too verbose. I hope it helps!

RoR: Migrating from PostgreSQL to Elastic

We are looking to transfer part of our database from PostgreSQL to Elastic: basically we want to combine three tables, Properties, Listings and Addresses, into a single document.
I couldn't find any standard tools, and since we have a Ruby on Rails application, the simplest and most reliable way will probably be to write a migration which just iterates through the models, composes them into a single document, and saves it to Elastic.
The task doesn't seem complicated, but as it's my first experience with Elastic I want to check with the community.
Thanks.
The closest thing I'm aware of is the JDBC importer. However, I think writing your own script is probably just as fast.
There is a Postgres function, row_to_json, that will convert a result row to JSON, which you can then publish into Elasticsearch. There's nothing I'm aware of that will do this for you automatically. Assuming it's not billions of rows, I'd stick with your plan of writing a short script to run your query and HTTP-post the results into Elasticsearch; a sketch follows the notes below.
You'll need to decide on two things: the index name and the document type(s).
Some notes:
The consistency models of a relational database like Postgres and an eventually consistent document store like Elasticsearch are quite different. You should be aware of these differences and their drawbacks.
You will likely want the data in Elasticsearch to be denormalized, as there are no awesome ways of doing joins.
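Here is the kind of script I mean, as a sketch only: the pg gem, Elasticsearch on localhost:9200, the index name "properties", the type "property", and the table/column names in the join are all assumptions. Using PUT with the row id makes re-runs overwrite documents instead of duplicating them.

require "pg"
require "net/http"
require "json"

conn = PG.connect(dbname: "myapp_production")

sql = <<-SQL
  SELECT p.id,
         row_to_json(p) AS property,
         row_to_json(l) AS listing,
         row_to_json(a) AS address
  FROM properties p
  JOIN listings  l ON l.property_id = p.id
  JOIN addresses a ON a.property_id = p.id
SQL

conn.exec(sql).each do |row|
  doc = {
    "property" => JSON.parse(row["property"]),
    "listing"  => JSON.parse(row["listing"]),
    "address"  => JSON.parse(row["address"])
  }
  uri = URI("http://localhost:9200/properties/property/#{row['id']}")
  Net::HTTP.start(uri.host, uri.port) do |http|
    request = Net::HTTP::Put.new(uri, "Content-Type" => "application/json")
    request.body = doc.to_json
    http.request(request)
  end
end

If the dataset grows, batching through Elasticsearch's _bulk endpoint is the usual next step.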

ActiveRecord and NoSQL

I've been working with Rails for a few years and am very used to ActiveRecord, but have recently landed a task that would benefit from (some) NoSQL data storage.
A small amount of data would be best placed in a NoSQL system, but the bulk would still be in an RDBMS. Every NoSQL wrapper/gem I've looked at, though, seems to necessitate the removal of ActiveRecord from the application.
Is there a suggested method of combining the two technologies?
Not sure which NoSQL store you are looking into, but we have used MongoDB in concert with Postgres for a while now. Helpful hint: people say you need to get rid of ActiveRecord, but in reality you don't. Most say that only because you end up not setting up your database.yml and/or not running the rake commands to set up the ActiveRecord database.
Remember also that Postgres has the HStore and JSON datatypes, which give similar functionality to NoSQL datastores. Also, if the data you are looking to store outside of your ActiveRecord database is not very complex, I would highly recommend looking into Redis.
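A sketch of what that coexistence looks like in practice (User is a normal ActiveRecord model; ActivityEvent is a hypothetical document model):

# Gemfile -- both stores side by side:
#   gem "pg"        # relational data via ActiveRecord
#   gem "mongoid"   # document data via Mongoid

# Lives in Postgres, configured by database.yml, as usual:
class User < ActiveRecord::Base
end

# Lives in MongoDB, configured by mongoid.yml -- no migration or schema needed:
class ActivityEvent
  include Mongoid::Document
  field :user_id, type: Integer
  field :payload, type: Hash
end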
If you look at the Gemfile.lock of this project, you can see that it uses ActiveRecord with Mongoid.
Even if you add other gems that don't need ActiveRecord, that shouldn't matter; if you keep using ActiveRecord, just have a valid reason to do so.

Can I use a text file as my database in ROR?

I would like to use a delimited text file (xml/csv etc) as a replacement for a database within Ruby on Rails. Solutions?
(This is a class project requirement, I would much rather use a database if I had the choice.)
I'm fine serializing the data and sending it to the text file myself.
The best way is probably to take the ActiveModel API and build your own methods that parse your files in the appropriate ways.
There's a good presentation about ActiveModel and ActiveRelation in which a custom model is built, which covers a lot of similar concepts (but a different backend), and also a good blog post by Yehuda Katz about the ActiveModel API.
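As a starting point, here is a minimal sketch of a CSV-file-backed model built on ActiveModel; the file name and its columns (id, name, email) are assumptions:

require "csv"
require "active_model"

class Student
  include ActiveModel::Model  # gives initialize(attrs), validations, form support

  FILE = "students.csv"

  attr_accessor :id, :name, :email

  validates :name, presence: true

  def self.all
    CSV.read(FILE, headers: true).map { |row| new(row.to_h) }
  end

  def self.find(id)
    all.find { |student| student.id == id.to_s }
  end

  def save
    return false unless valid?
    CSV.open(FILE, "a") { |csv| csv << [id, name, email] }
    true
  end
end

Student.new(id: "1", name: "Ada").save appends a row, and Student.all re-parses the file on each call, which is fine at class-project scale.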
Have you thought of using SQLite? It is a much better solution; see the sketch after this list.
It uses a single file.
It is way faster than doing the serialization yourself.
It is zero configuration. Very simple to use.
You get ACID compliance, transactions, subselects, etc.
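In a Rails app this is just a few lines of config/database.yml; the plain-Ruby equivalent (assuming the sqlite3 and activerecord gems) is as small as:

require "active_record"

ActiveRecord::Base.establish_connection(
  adapter:  "sqlite3",
  database: "db/project.sqlite3"  # the single file that holds everything
)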
MySQL has a way to store tables in CSV. It has some pretty serious limitations, but it sounds like your requirements demand something with some pretty serious limitations anyway.
I've never set up a Rails project that way, and I don't know what it would take, but it seems like it might be possible.
HSQLDB seems to work by storing data on disk as a SQL script that creates your database. It records changes in memory and a log file, and when you shut down it recreates a single SQL script again. I've not used this one myself.
HSQLDB doesn't appear to be one of the supported databases in Rails. I don't know what it would take to add support for a new database.

Object database for Ruby on Rails

Is there drop-in replacement for ActiveRecord that uses some sort of Object Store?
I am thinking something like Erlang's Mnesia would be ideal.
Update
I've been investigating CouchDB and I think this is the option I am going to go with. It's a toss-up between using CouchRest and ActiveCouch. CouchRest is pretty mature, and is used in the CouchDB peepcode episode, but it's not a drop-in replacement for ActiveRecord, which is a bit of a disadvantage.
Suffice to say CouchDB is pretty phenomenal.
Update (November 10, 2009)
CouchDB hasn't really worked for me. CouchDB doesn't really support arbitrary queries (queries need to be written and compiled ahead of time). It also breaks on very large datasets.
I have been playing with MongoDB and it's really incredible. Schema-less JSON data store with queries and indexing.
I've even started building a management tool for it called Ming.
Try MagLev!
ActiveCouch purports to be just such a library for CouchDB, which is, in fact, written in Erlang. I wouldn't say it's as mature as ActiveRecord, though.
That is the closest thing I know of to what you're asking for.
Madeleine is a Ruby implementation of the Java Prevayler object store; see http://madeleine.rubyforge.org/
I'm currently working on a Ruby object database that uses MySQL as a backing store (hence it's called hybriddb) that you may be interested in.
It uses no SQL or migrations; you just save your objects to the database. It also tries to work around the conventional problems with object databases (speed, finding objects quickly, large object graphs) transparently.
It is still an early version, so take care. The code is here:
http://github.com/pauliephonic/hybriddb/tree/master (the development branch has support for transactions, and I'm currently adding basic validations).
I have a web site with some tutorials etc.: http://www.hybriddb.org/pages/tutorial_starter
Any comments are welcome there.
Apart from Madeleine, you can also see:
http://purple.rubyforge.org/
But it depends on scale too. Mnesia is known to support large amounts of data and is clustered, whereas these solutions won't work so well with large amounts of data.
If the amount of data is not huge, another option is:
http://copiousfreetime.rubyforge.org/amalgalite/files/README.html
