I am developing a multitenant Rails app using PostgreSQL schemas. All is going great, but my situation is a little different from the conventional multitenant apps out there; namely, my app requires that I pull customer data for each tenant from their database into mine.
Here is where it gets tricky. I wrote a JRuby gem that connects to each customer's database and pulls data to my server; it then processes that data and loads it into my Rails app (each customer's data ends up in the appropriate tenant schema). Therefore this gem is the only place that is aware of all my tenants and their configuration (database info, which tables to pull, and so on).
My question is: What do you think of this design choice? One problem I am already seeing is that it forces the app to straddle two Ruby runtimes: it normally runs on standard Ruby, but when I need to do a pull I have to switch to JRuby. Furthermore, it is hard to inspect tenant configuration without resorting to this gem.
Any comments or feedback on this? Is there another path I could have taken with this?
Related
Suppose we have this scenario:
www.main.com - Main interface where admins (foo, bar, etc.) can manage the products of their own e-commerce stores
www.foo.com - Sample store that sells items from the "foo" store
www.bar.com - Sample store that sells items from the "bar" store
The problem is finding a way to centralize the database structure and models.
I prefer to keep every single store in a separate app (so I exclude Rails engines).
For instance, if a user buys something in the "foo" store, I need to interact with the main db and update it.
How can I do this?
Rails works much better with a "one database, one app" model. There are lots of ways to share models across apps (gems, engines, git submodules, etc.), but none of them is great. You end up introducing lots of overhead in your development, deployment, and testing process. You also invite lots of hidden dependencies between code, as Rails doesn't give you an easy way to keep clean abstractions (for example, you write a helper for store Foo, then your coworker uses that helper in store Bar, and from then on every change to Foo breaks Bar).
I recommend a centralized API approach instead:
api.foobarmain.com - an app/service that provides a RESTful API for all the functionality of all stores.
This app has all the db models, and it exposes them as resources in the API for other apps to interact with.
This app can have an admin UI for all the stores, if you need it. Alternatively, the admin UI can be just another client of the API.
www.foo.com - a full stack app that interacts with the API at api.foobarmain.com
There is no shared database connection to the API app; everything you need to interact with has to be exposed via the API.
There will be no shared code between www.foo.com and www.bar.com. Code reuse happens only by virtue of using the same API app/service.
From the perspective of www.foo.com, the model layer (in MVC) is powered by the API, not by a database (see the sketch after this list).
www.foo.com can still have its own database if you need to store data specific to www.foo.com only.
www.bar.com - another full stack app that interacts with API
so on ...
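To make the "model layer powered by the API" idea concrete, here is a minimal sketch of what a model class in www.foo.com could look like; the /stores/:store/products.json endpoint and the JSON fields are my assumptions, not a defined API:

# app/models/product.rb in www.foo.com (hypothetical endpoint and fields)
require 'net/http'
require 'json'

class Product
  attr_reader :id, :name, :price

  def initialize(attributes)
    @id    = attributes['id']
    @name  = attributes['name']
    @price = attributes['price']
  end

  # Fetch the products of one store from the central API service
  def self.all(store)
    body = Net::HTTP.get(URI("https://api.foobarmain.com/stores/#{store}/products.json"))
    JSON.parse(body).map { |attrs| new(attrs) }
  end
end

A real app would add error handling, caching, and write operations, but the point is that the class talks HTTP instead of holding a database connection.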
Another way is using multiple schemas, if you are using PostgreSQL. I have a similar issue with a new project that I am about to start.
You can use the 'apartment' gem to deal with different schemas. The queries will be a bit more involved, but with different schemas you end up with one database and a separate namespace for each tenant, and the app can respond accordingly.
You can set it up so the app selects the correct schema based on the domain or subdomain.
Here is the link: https://github.com/influitive/apartment
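For illustration, a minimal Apartment setup that switches schemas by subdomain could look like the sketch below; the Tenant model and its subdomain column are assumptions on my part:

# config/initializers/apartment.rb (sketch)
Apartment.configure do |config|
  # models that stay in the shared/public schema
  config.excluded_models = %w[Tenant]
  # one schema per tenant, named after its subdomain
  config.tenant_names = -> { Tenant.pluck(:subdomain) }
end

# middleware that switches to the schema matching the request's subdomain
Rails.application.config.middleware.use Apartment::Elevators::Subdomain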
Our Rails suite is composed of three independent Rails apps:
JSON API (Rails app)
Admin dashboard (Rails app)
Shared data models (Rails engine)
Both the API and the Admin dashboard require the shared data models engine in their Gemfiles. All models and custom classes are stored in the engine, and both apps make heavy use of the shared components. The API lives on one Heroku server, and the Admin dashboard lives on a separate Heroku server (two separate Heroku apps). Each uses its own Postgres database. All three apps have their own Git repos.
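For reference, pulling the engine into each app is just a Gemfile entry along these lines (the gem name and Git URL are placeholders, not our real ones):

# Gemfile of both the API app and the Admin dashboard
gem 'shared_models', git: 'git@github.com:example/shared_models.git'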
The API database stores information pertinent to our public users, and the Admin database stores mostly statistical information for admin eyes only.
A caveat of the setup is that the Admin dashboard app has direct access to the API database, and vice versa. I understand that this is bad practice and may not seem to make sense, but there was a reason for it (mainly that the Admin dashboard needed access to all records of certain API tables, and building a custom API to communicate over the wire was not feasible). A similar reason exists for the API-to-Admin database communication.
This setup works for our purposes, nothing is broken, and resources are allocated efficiently. However, productivity is beginning to suffer due to the slow and uncomfortable development process. An example: a change to the API is required. Chances are the shared models engine needs a change too, so a feature branch is needed in both repos. After committing and pushing, the Admin dashboard now contains an old version of the models engine (it is behind by one patch version). The problem lies in trying to coordinate all three Rails apps when only one app needs a change. Another problem is migrations. Since the models engine contains two different database connections, I must create the migration once in the models engine and then again in the appropriate app (API or Admin).
My ideal setup would involve one large Rails container app with separate engines contained within. The separate engines would be: API, Admin, Models. Also, I’m beginning to think that using only one database might make things easier. I would also like to keep the API on its own server instance, and the Admin on a separate server. The reason for this is that the API is public facing (communicates with a public iOS app) and the Admin is used mainly as a CMS and reporting engine.
I am looking for solutions and advice from experience managing similar Rails / Heroku architectures. Specific questions:
Should I attempt to combine the three Rails apps into one container app and use the engine approach?
Should I attempt to combine the two databases into one database?
Is it possible to have one Rails container app, and allocate different servers to different engines?
If I should keep all apps separate, is there an easier and more productive way to implement new features and fixes on a daily basis?
I am planning on using Devise and Apartment in my upcoming application to create subdomains for each organization that creates an account. I would like to host my application on Heroku, but ran across the following quote:
The most common use case for using multiple schemas in a database is building a software-as-a-service application wherein each customer has their own schema. While this technique seems compelling, we strongly recommend against it as it has caused numerous cases of operational problems. For instance, even a moderate number of schemas (> 50) can severely impact the performance of Heroku's database snapshots tool, PG Backups.
What technique would work well with Heroku to host Basecamp-style subdomains in Rails 4, where many users can log in to the subdomain they are part of?
If Heroku does not work, what other PaaS options are there that would do this well?
Domain
Firstly, you need to be sure that you're using your own custom domain for the subdomains.
Heroku's standard xxx.herokuapp.com won't be able to handle another subdomain on top of that, so you'll basically need to use your custom domain from the get-go.
It will be good to reference this documentation for more information!
Multi Tenancy
Although I don't have experience with PostgreSQL's schemas, I do have some with multi-tenancy as a whole.
There are a number of great resources here:
Basecamp-style Subdomains (by DHH)
Multitenancy Railscasts (Pro)
Apartment Gem Documentation
Essentially, multi-tenancy is just a way to scope the data so that you only see and interact with the current tenant's records. At the database level, the two ways to achieve this are either to use different databases (as you would with MySQL) or to use a schema per tenant (as with PostgreSQL).
Whilst I can't give you a direct fix for your issue, I can help you with some ideas:
Models
One way to achieve multi-tenancy, especially with the likes of MySQL, is to do it through the model:
How do i work with two different databases in rails with active records?
# lib/admin.rb
class Admin < ActiveRecord::Base
  # abstract base class: no table of its own, only a shared connection
  self.abstract_class = true
  establish_connection "#{Rails.env}_admin"
end

# app/models/option.rb
class Option < Admin
  # inherits the admin database connection; model code as usual
end
This works very well for us, although we have not got it working for scoped accounts yet. We've been thinking of setting a @@class_variable for the Account or something, but we haven't worked on that yet.
This works very well for MySQL-powered databases, but it also means you'll have to create a database for every account, which will not work with PostgreSQL (as far as I'm aware).
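As a rough sketch of the per-account database idea mentioned above (this is not the answerer's code; the database.yml keys and model names are hypothetical), switching a model onto an account-specific connection might look like:

# Assumes config/database.yml has one entry per account, e.g. "production_account_42"
class Account < ActiveRecord::Base
  def use_database_for(model_class)
    # look up the account-specific configuration by name and connect to it
    model_class.establish_connection("#{Rails.env}_account_#{id}".to_sym)
  end
end

# e.g. in a before_filter, once the current account is known:
# current_account.use_database_for(Option)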
PGSQL Schemas
I feel this is kind of a cheat way to do it, as all the data is still stored in one database - it's basically just scoped into different namespaces.
The problem here is that real multi-tenancy should completely separate each tenant's data, so you could cut it out of the app entirely if they wanted. From a security and access perspective, it's the most flexible and modular way.
The problem for Heroku is that you only get one database (they give everyone access to their AWS database instances), meaning they can't allow you to create 50+ free databases (it just won't work very well).
You can, of course, use your own stack to create the databases you require, but in terms of PostgreSQL, it's just about creating the schemas for your data and then using something like Apartment to make it happen:
PostgreSQL works slightly differently than other databases when creating a new tenant. If you are using PostgreSQL, Apartment by default will set up a new schema and migrate into there. This provides better performance, and allows Apartment to work on systems like Heroku, which would not allow a full new database to be created.
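For completeness, creating and switching schema-backed tenants with Apartment (recent versions of the gem) looks roughly like this; the tenant name is just an example:

# create a new schema and run the migrations inside it
Apartment::Tenant.create('acme')
# point subsequent queries at the acme schema
Apartment::Tenant.switch!('acme')
# go back to the default (public) schema
Apartment::Tenant.reset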
The iTunes Enterprise Partner Feed is "a data feed of the complete set of metadata from iTunes and the App Store" and "is available in two different formats - either as the files necessary to build a relational database or as stand-alone flat files that are country and media dependent."
I need to consume the data from this feed (which is essentially exported into flat files) and allow linking of my own Model objects (User, Activity, etc.) to data provided by the feed (App, Developer, etc.) The data is provided as a weekly full export and a daily incremental export.
I have two ideas for ways to implement this:
Create all of the models in my Rails app and write my own importer that will insert/update records directly into my app's database daily via cron, using models I've created (App, Developer, etc.)
Keep this database entirely separate and open up REST API that my own app will consume
My naive approach with #1 to keep everything in the Rails app is based on the need to be able to observe changes in the data I'm getting from the EPF. For example, if an App's description is updated, I want to be able to create an Activity object via an observer to track that update.
On one hand #2 feels like a better approach because it creates a standalone API into the data that can be consumed from several different apps I create. On the other hand, I'm just not sure how I'd accomplish the data change notifications without using observers directly on my own models. Or, how I would even consume the data in an object oriented way that could still be used with my own models. It feels like a lot of duplicate work to have to query the API for, say, an App's data, create a proper Active Record object for it and then save it just so it can be linked to one of my own models.
Is there a standard way to do this that I'm totally missing? Any pointers?
EDIT: Rails engines sound interesting but it would mean that each app would still need to consume and insert the data separately. That doesn't sound so DRY. It sounds more and more like a REST API is the way to go. I just don't know how to bridge the gap from API to Active Record model.
Rails Engines might be a good fit for this. You can create a Rails Engine gem and add all of your models and rake tasks to consume the data. Then you can include this gem in any app that needs the data, and also create an API app which includes the gem. You should be able to create observers in your other apps that interact with the gem's models.
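As a sketch of that observer idea (using the Rails 3-era observer API; the Activity model and its columns are assumptions of mine, mirroring the question), the consuming app could track changes to an engine-provided App record like this:

# app/models/app_observer.rb in the consuming app
class AppObserver < ActiveRecord::Observer
  observe :app  # the App model comes from the engine gem

  def after_update(app_record)
    # record an Activity whenever an imported app's description changes
    if app_record.description_changed?
      Activity.create!(subject: app_record, action: 'description_updated')
    end
  end
end

# config/application.rb
# config.active_record.observers = :app_observer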
I have quite a few apps that interact with each other and this approach works well for me. I have one central app that includes all the engines that consume data and I run all of my cronjobs from this app. I use the use_db plugin which allows my app to communicate with different databases. Each engine has use_db as a dependency and I keep the database configuration inside the gem. One example:
Engine Gem = transaction_core
This gem consumes transaction data from a source and inserts it into my transaction database.
The gem is included in my central app, and I pull the transaction data using a rake task run from cron
I include this gem in several other apps that need to use the transaction data. Since the engine automatically adds the models and database config to the app, there is no additional work required to use the models in the app.
I have not used observers inside an app that includes my engines, but I see no reason why it would not work. With the engine the models work as if they are in your app/models directory. Hope this helps!
Modest Rubyist has a good 4 part tutorial on Rails 3 plugins that includes Engines:
http://www.themodestrubyist.com/2010/03/05/rails-3-plugins---part-2---writing-an-engine/
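In case it helps, the skeleton of such an engine gem is small; a minimal sketch (the module and path names are illustrative) is:

# lib/transaction_core/engine.rb
module TransactionCore
  class Engine < ::Rails::Engine
    # models under the gem's app/models and rake tasks under lib/tasks
    # are picked up automatically by the host application
  end
end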
I am about to create an application with Ruby on Rails and I would like to use multiple databases; basically it is an accounting app that will have multiple companies for each user, and I would like to create a database for each company.
I found this post http://programmerassist.com/article/302
But I would like to read more thoughts about this issue.
I have to decide between MySQL and PostgreSQL; which database might fit my problem better?
There are several options for handling a multi-tenant app.
Firstly, you can add a scope to your tables (as suggested by Chad Birch - using a company_id). For most use-cases this is fine. If you are handling data that is secure/private (such as accounting information) you need to be very careful about your testing to ensure data remains private.
You can run your system using multiple databases. You can have a single app that uses a database for each client, or you can actually have a separate app for each client. Running a database for each client cuts a little against the grain in Rails, but it is doable. Depending on the number of clients you have and the load you expect, I would actually suggest having a look at running individual apps. With some work on your deployment setup (Capistrano, Chef, Puppet, etc.) you can make this a very streamlined process. Each client runs in a completely unique environment, and if a particular client has high loads you can spin them out to their own server.
If using PostgreSQL, you can do something similar using schemas.
PostgreSQL schemas provide a very handy way of isolating your data for different clients. A database contains one or more named schemas, which in turn contain tables. You need to add some smarts to your migrations and deployments, but it works really well.
Inside your Rails application, you attach filters to the request that switch the current user's schema on or off.
Something like:
before_filter :set_app

def set_app
  current_app = App.find_by_subdomain(...)
  schema = current_app.schema
  set_schema_path(schema)
end

def set_schema_path(schema)
  # prepend the tenant's schema to the default search path
  connection = ActiveRecord::Base.connection
  connection.execute("SET search_path TO #{schema}, #{connection.schema_search_path}")
end

def reset_schema_path
  # restore the default search path
  connection = ActiveRecord::Base.connection
  connection.execute("SET search_path TO #{connection.schema_search_path}")
end
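To keep the search_path from leaking between requests, the reset helper above can be hooked up as well; assuming the same controller, something like:

after_filter :reset_schema_path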
The problem with answers about multiple databases is that they often come from people who have no need for, or experience with, multiple databases. The second problem is that some database setups just don't allow for switching between multiple databases, letting users do their own backup and recovery, or scaling by pointing some users to a different data server. Here is a link to a useful video:
http://aac2009.confreaks.com/06-feb-2009-14-30-writing-multi-tenant-applications-in-rails-guy-naor.html
That video covers Ruby on Rails with PostgreSQL.
I currently have a multi-tenant, multi-database, multi-user (many logons to the same tenant with different levels of access) online SaaS application. There are actually two applications: one is in the accounting category and the other is banking. Both apps are built on the same structure and methods. A client-user (tenant) can switch databases under that user's logon. An agent-user, such as a tax accountant, can switch between databases for his clients only. A super-user can switch to any database. There is one data dictionary, i.e. only one place where tables and columns are defined. There is global data and local data. Global data, such as a master chart of accounts, is available to everyone (read only). Local data is the user's own database. A new user gets a clone of a master database; there are multiple clones to choose from, and a super-user can maintain the clone databases.
The problem is that it is written in COBOL, uses ISAM files, and uses the CGI method. The problems with this are a) the perception that COBOL is outdated, b) finding trained people, c) price, and d) online help. Otherwise it works and I'm happy with it.
So I'm researching what to replace it with and what a minefield that is.
Some time has passed, and the decision has been to use PostgreSQL schemas to make the application multitenant. I have a schema called common where shared data is stored.
# app/models/organisation.rb
class Organisation < ActiveRecord::Base
  self.table_name = 'common.organisations'
  # set relationships as usual
end

# app/models/user.rb
class User < ActiveRecord::Base
  self.table_name = 'common.users'
  # set relationships as usual
end
For migrations I followed this excellent tutorial: http://timnew.github.com/blog/2012/07/17/use-postgres-multiple-schema-database-in-rails/ - it is way better than what I saw in other places, even the way Ryan Bates did it on RailsCasts.
When a new organisation is created, a new schema is created, named after the organisation's subdomain. I have read in the past that it's not a good idea to use different schemas, but it depends on the job you are doing; this app has almost no social component, so it's a good fit.
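As a sketch of that step (the callback and identifier quoting below are my own assumptions, not the tutorial's code), creating the tenant schema when an organisation is created could look like:

# app/models/organisation.rb (sketch of schema creation on signup)
class Organisation < ActiveRecord::Base
  self.table_name = 'common.organisations'

  after_create :create_tenant_schema

  private

  def create_tenant_schema
    conn = ActiveRecord::Base.connection
    # create a schema named after the organisation's subdomain
    conn.execute("CREATE SCHEMA #{conn.quote_column_name(subdomain)}")
    # the tenant tables are then migrated/loaded into the new schema
  end
end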
No, you shouldn't use multiple databases.
I'm not really sure what advice to give you though; it seems like you have some very basic misunderstandings about database design, so you may want to educate yourself on the basics of databases before going further.
You most likely just want to add a "company id" type column to your tables to identify which company a particular record belongs to.
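In code, that scoping approach is as simple as the sketch below (the model names and the current_company helper are just examples):

# every tenant-owned table carries a company_id column
class Company < ActiveRecord::Base
  has_many :invoices
end

class Invoice < ActiveRecord::Base
  belongs_to :company
end

# always query through the current company so one company never sees another's rows
current_company.invoices.where(status: 'open')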