Using different datastores in the Rails same app? - ruby-on-rails

So this is more or less an implementation question, this is the senario I have, basically we have an app which uses MySQL as it's datastore, user accounts, transactions etc, but we want to add in a robust charting feature and the data will be stored in Redis, now basically my question is:
Is it possible, and what are the best practices for integrating another datastore into an app which already depends on another one. Can I use Rack to generate the reports? etc...
I want to turn this into a sort of open discussion because I think the need for a solution like this is going to rise as we see more and more key/value stores that offer benefits far different than a RDBMS, an NoSQL stores as well. They all have their benefits but no one solution covers them all.
Thoughts?

You can have models that do not inherit ActiveRecord::Base. Add your preferred Redis client gem, do whatever config is necessary, and start writing Redis models.

I can try to reopen this topic, because should be very practical.
Have same issue with this. I want to replicate data from SQL to NoSQL. SQL used as main database storage, because data integrity, relations etc. And NoSQL as secondary database storage set for reading. In SQL you have much associations divided to much tables. Many one-to-one association saved in different tables for better readability. This associations should be saved as one document with NoSQL. It gives unbelievable speed. Only one load. Great for data exchange for API.
Do someone positive experience with replication SQL data to more consistent NoSQL documents?

Related

Best practices for developing an application with Neo4J

I recently came across an application which uses NEO4j as the backend. In my experience with SQL and other Key-value based databases, I have developed an understanding(which could be refined) that other databases store data and your application derives the information while with NEO4J you store the information. This means that the logic of deriving the information is already captured in the model of NEO4J. I am not able to get my head around this because now I cannot have logic that can be composed and most importantly something that can be tested with unit tests. I can sure have component level tests using embedded neo4j but then that's not the same. Can someone please help me understand the application development philosophy/methodology with NEO4J.
...other databases store data and your application derives the information while with NEO4J you store the information.
Hmmm.... Define data and define information. Mostly it goes: Data is something that requires further processing to become information (that is, something informative - something you can derived some conclusion or insights from).
Anyhow, doubt this has anything to do with Graph databases vs relational/aggregate databases. A database, as the name suggests, stores data.
This means that the logic of deriving the information is already captured in the model of NEO4J.
I'm not sure what you mean by "the logic... is already captured". Some queries are much easier with Neo+Cypher that with say SQL; like "Find all the friends of my friends that live in Berlin", but I would hardly relate this to 'logic'.
I cannot have logic that can be composed and most importantly something that can be tested with unit tests.
What do you mean by 'logic that can be composed'? And unit tests has nothing to do with this I'm afraid - there's no logic being tested if you talk about graph vs other databases.
Can someone please help me understand the application development philosophy/methodology with NEO4J.
There's really not much to it. Neo4J is a database like any other database, only that it uses a different model from relational/aggregate databases.
To highlight two of its strengths:
No joins - That's a pain point with relational/aggregate databases, especially with complex queries. Essentially, nearly all system involve a data model that is a graph (you only need one many-to-many relationship in your data model for that), and not using a graph database is a form of dimensionality reduction. The reasons relational databases prevailed for so many years is nothing short of a set of historical coincidences.
Easier DB migrations - and that's for being a schema-less data base. You ripe the same benefits with any other schema-less database.
I strongly recommend you read the 'NOSQL Overview' appendix of the free Graph Databases. It focus on a lot of these points.

Multi-schema Postgres on Heroku

I'm extending an existing Rails app, and I have to add multi-tenant support to it. I've done some reading, and seeing how this app is going to be hosted on Heroku, I thought I could take advantage of Postgres' multi-schema functionality.
I've read that there seems to be some performance issues with backups when multiple schemas are in use. This information I felt was a bit outdated. Does anyone know if this is still the case?
Also, are there any other performance issues, or caveats I should take into consideration?
I've already thought about adding a field to every table so I can use a single schema, and have that field reference to the tenants table, but given the time windows multiple schemas seem the best solution.
I use postgres schemas for a multi-tenancy site based on some work by Ryan Bigg and the Apartment gem.
https://leanpub.com/multi-tenancy-rails
https://github.com/influitive/apartment
I find that having seperate schemas for each client an elegant solution which provides a higher degree of data segregation. Personally I find the performance improves because Postgres can simply return all results from a table without have to filter to an 'owner_id'.
I also think it makes for simpler migrations and allows you to adjust individual customer data without making global changes. For example you can add columns to specific customers schemas and use feature flags to enable custom features.
My main argument relating to performance would be that backup is a periodic process, whereas customer table scoping would be on every access. On that basis, I would take any performance hit on backup over slowing down the customer experience.

No SQL database or Graph database for data intensive application

I am building an rails application which will have huge amount of user and their activities and tasks and dynamic attributes. Currently I am using postgresql but I know it is not a good choice this kind of application.
I am a confused between Graph databases and 'Nosql databases` which one to choose.
A graph database is a NoSQL database, as is a document database, column-store database, and a key/value database. Regarding which one to choose: Unfortunately that cannot be answered so simply, as a lot will depend on your specific application.
But... why choose one? Each type of NoSQL data store has its specific advantages. You can build a system based on multiple data stores, each one used to its specific advantage(s). This concept is known as polyglot persistence.
The Microsoft Patterns & Practices team published guidance around this very topic, and I'd suggest reading through it. You can also check out the book NoSQL Distilled which also goes into this topic and the specifics of each data store classification.
If you want to see an example of an app built on several data stores, check out Cloud Ninja Polyglot Persistence. Members of my team (including myself) got together and built this as a learning/training exercise.

Using separate DBs for each user in ASP.NET application

I'm building an ASP.NET MVC application.
I will store a lot of data per application user, so I was considering having separate DBs for each user's data. Of course all DBs will be 100% the same in terms of structure. They should all be derived from some kind of master DB.
Can someone give me any leads on how to realize this?
A database per user sounds a terrible idea for a lot of reasons. If the data or reads/writes are huge you should consider sharding (splitting the database into separate databases based on primary keys but NOT a separate DB for each primary key).
If you are using SQL Server this can handle several billion rows with just a bit of tuning -http://www.sql-server-performance.com/2011/index-maintenance-performance/ for a good reference on this.
Unless you are planning to handle millions of users with millions of rows in each table you will not need to use multiple DB's.
MS Sql can easily handle a billion rows in a table with good indexing and still keep good query performance.
Using multiple DB's will most likely only cause headaches.
That said, you should design scrips to create the databases, and later update scripts for changing them since you would have to propagate any changes to all databases.
But my recommendation is to stick with one DB. If you actually manages to outgrow that you most likely need to redesign the database structure anyway :)

When developing web applications when would you use a Graph database versus a Document database?

I am developing a web-based application using Rails. I am debating between using a Graph Database, such as InfoGrid, or a Document Database, such as MongoDB.
My application will need to store both small sets of data, such as a URL, and very large sets of data, such as Virtual Machines. This data will be tied to a single user.
I am interested in learning about peoples experiences with either Graph or Document databases and why they would use either of the options.
Thank you
I don't feel enough experienced with both worlds to properly and fully answer your question, however I'm using a document database for some time and here are some personal hints.
The document databases are based on a concept of key,value, and static views and are pretty cool for finding a set of documents that have a particular value.
They don't conceptualize the relations between documents.
So if your software have to provide advanced "queries" where selection criteria act on several 'types of document' or if you simply need to perform a selection using several elements, the [key,value] concept is not appropriate.
There are also a number of other cases where document databases are inappropriate : presenting large datasets in "paged" tables, sortable on several columns is one of the cases where the performances are low and disk space usage is huge.
So in many cases you'll have to perform "server side" processing in order to pick up the pieces, and with rails, or any other ruby based framework, you might run into performance issues.
The graph database are based on the concept of tripplestore, meaning that they also conceptualize the relations between the entities.
The graph can be traversed using the relations (and entity roles), and might be more convenient when performing searches across relation-structured data.
As I have no experience with graph database, I'm not aware if the graph database can be easily queried/traversed with several criterias, however if an advised reader has such an information I'd really appreciate any examples of such queries/traversals.
I'm currently reading about InfoGrid and trying to figure if such databases could by handy in order to perform complex requests on a very large set of data, relations included ....
From what I can read, the InfoGrah should be considered as a "data federator" able to search/mine the data from several sources (Stores) wich can also be a NoSQL database such as Mongo.
Wich means that you could use a mongo store for updates and InfoGraph for data searching, and maybe spare a lot of cpu and disk when it comes to complex searches inside a nosql database.
Of course it might seem a little "overkill" if your app simply stores a large set of huge binary files in a database and all you need is to perform simple key queries and to retrieve the result. In that cas a nosql database such as mongo or couch would probably be handy.
Hope some of this helps ;)
When connecting related documents by edges, will you get a shallow or a deep graph? I think the answer to that question is important when deciding between graphdbs and documentdbs. See Square Pegs and Round Holes in the NOSQL World by Jim Webber for thoughts along these lines.

Resources