Rails app with multiple subdomains on separate machines - ruby-on-rails

I've got a Rails application that has 3 main parts:
www.example.com: This is the main website
api.example.com: The API
dashboard.example.com: The dashboard interface for signed up users
I've currently got them set up in a single Rails app with namespaces, and they share models. I recently got accepted onto the RackSpace startup program, which gives me $2000 (!) worth of FREE cloud storage each month, so I thought I'd split the app into smaller pieces so that the subdomains are hosted on separate servers.
How can I do that without deploying the same code to 3 different servers? I reviewed this related question, but it seemed to imply using a single Git repo for all three "projects" and I wasn't sure how that would get deployed.

There are two very different goals intertwined here:
One is distributing the load over multiple machines. You don't need to break your application out into multiple apps for that: you can put a load balancer in front (incoming requests are spread across multiple machines running the same app), or split your app up.
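To make the first option concrete, here's a rough sketch of how one codebase can keep serving all three subdomains while identical copies of it run behind a load balancer. The controller and resource names are placeholders, and it assumes the subdomains are distinguished in routes.rb with constraints:

# config/routes.rb -- one app, three subdomains; every machine behind the
# load balancer runs this same code.
Rails.application.routes.draw do
  constraints subdomain: 'api' do
    namespace :api do
      resources :widgets            # hypothetical API resource
    end
  end

  constraints subdomain: 'dashboard' do
    get '/', to: 'dashboard#show'   # hypothetical dashboard entry point
  end

  root to: 'pages#home'             # www / bare domain: the public site
end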
The other is the logical architecture of your app. In general, the bigger an app is, the more complicated it is, and it's easier to work on simple apps, so you might want to split your app up into smaller apps. This is orthogonal to the hardware issue: you could split your app three ways but still run all of them on the same machine (although splitting does give you the flexibility to allocate resources differently to each part).
Coming back to your question, one aspect worth exploring is whether there should be any shared models at all. To what extent can (for example) your dashboard query your API (possibly private APIs) rather than querying the database directly? Splitting your application into multiple services makes some things easier and some things harder, but it does help you keep the individual applications smaller and the boundaries between them clean, without hidden dependencies. This logical split doesn't have to correspond to the physical layout: one server may host several applications.
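As a rough illustration of the dashboard consuming the API instead of the shared database, using only the Ruby standard library (the ApiClient name, endpoint path and response shape are all assumptions):

require 'net/http'
require 'json'
require 'uri'

# Hypothetical client the dashboard app could use instead of sharing models.
class ApiClient
  BASE = 'https://api.example.com'

  # Fetch a user's records over the (possibly private) API rather than
  # querying the database directly.
  def user_records(user_id, token:)
    uri = URI.join(BASE, "/v1/users/#{user_id}/records")
    req = Net::HTTP::Get.new(uri)
    req['Authorization'] = "Bearer #{token}"
    res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
    JSON.parse(res.body)
  end
end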
Lastly, the most direct answer to your question is to write a Rails engine. An engine can contain pretty much anything an app can (applications are special cases of engines): models, views, controllers, routes, etc. You'd put your common code in one, package it up as a gem (it doesn't have to be a public one) and add it to each app's Gemfile.
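A minimal sketch of such an engine (the name shared_core and the model are placeholders; something like rails plugin new shared_core --full generates the skeleton):

# shared_core/lib/shared_core/engine.rb
module SharedCore
  class Engine < ::Rails::Engine
  end
end

# shared_core/app/models/user.rb -- shared models live inside the engine
class User < ActiveRecord::Base
end

# Each application's Gemfile then pulls the engine in; a private git repo or
# a local path both work:
#   gem 'shared_core', git: 'git@github.com:yourorg/shared_core.git'

Each of the three apps stays small, and the shared models have exactly one home.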

Load Balancer
For lack of more experience, I'll say what my gut is telling me: simply, you don't need any more servers (unless of course you have massive throughput).
You may be better off putting this on the SuperUser community or something, but I personally believe you'd be better off with a load balancer, to give you the capacity to spread the load of your apps across multiple server instances.
Quite how this works is beyond me (I've never had to use one), but I would certainly look at that far sooner than trying to split your app between servers:
Load balancing is a computer networking method for distributing workloads across multiple computing resources, such as computers, a computer cluster, network links, central processing units or disk drives. Load balancing aims to optimize resource use, maximize throughput, minimize response time, and avoid overload of any one of the resources. Using multiple components with load balancing instead of a single component may increase reliability through redundancy. Load balancing is usually provided by dedicated software or hardware, such as a multilayer switch or a Domain Name System server process. Load balancing is differentiated from channel bonding in that load balancing divides traffic between network interfaces on a per network socket (OSI model layer 4) basis, while channel bonding implies a division of traffic between physical interfaces at a lower level, either per packet (OSI model layer 3) or on a data link (OSI model layer 2) basis.

Related

Where should I put shared services for multiple kubernetes-clusters?

Our company is developing an application which runs in 3 separate Kubernetes clusters in different versions (production, staging, testing).
We need to monitor our clusters and the applications over time (metrics and logs). We also need to run a mailserver.
So basically we have 3 different environments with different versions of our application. And we have some shared services that just need to run and we do not care much about them:
Monitoring: We need to install InfluxDB and Grafana. In every cluster there's a pre-installed Heapster that needs to send data to our tools.
Logging: We didn't decide yet.
Mailserver (https://github.com/tomav/docker-mailserver)
Independent services: Sentry, GitLab
I am not sure where to run these external shared services. I found these options:
1. Inside each cluster
We need to install the tools 3 times for the 3 environments.
Cons:
We don't have one central point to analyze our systems.
If the whole cluster is down, we cannot look at anything.
Installing the same tools multiple times does not feel right.
2. Create an additional cluster
We install the shared tools in an additional kubernetes-cluster.
Cons:
Cost of an additional cluster.
It's probably harder to send ongoing data to an external cluster (networking, security, firewall etc.).
3. Use an additional root server
We run Docker containers on an old-school root server.
Cons:
Feels contradictory to use a root server instead of cutting-edge Kubernetes.
Single point of failure.
We need to control the Docker containers manually (or attach the machine to Rancher).
I tried to google the problem but I cannot find anything on it. Can anyone give me a hint or some links on this topic?
Or is it just not a relevant problem that a cluster might go down?
To me, the second option sounds less evil, but I cannot yet estimate how hard it is to transfer data from one cluster to another.
The important questions are:
Is it a problem to have monitoring-data in a cluster because one cannot see the monitoring-data if the cluster is offline?
Is it common practice to have an additional cluster for shared services that should not have an impact on other parts of the application?
Is it (easily) possible to send metrics and logs from one kubernetes-cluster to another (we are running kubernetes in OpenTelekomCloud which is basically OpenStack)?
Thanks for your hints,
Marius
That is a very complex and philosophical topic, but I will give you my view on it and some facts to support it.
I think the best way is the second one (create an additional cluster), and here's why:
You need a point which should be accessible from any of your environments. With a separate cluster, you can set the same firewall rules, routes, etc. in all your environments and it doesn't affect your current workload.
Yes, you need to pay a bit more. However, you need resources to run your shared applications, and overhead for a Kubernetes infrastructure is not high in comparison with applications.
With a separate cluster, you can set up a real HA solution, which you might not need for staging and development clusters, so you will not pay for that multiple times.
Technically, it is also OK. You can use Heapster to collect data from multiple clusters; almost any logging solution can also work with multiple clusters. All other applications can be just run on the separate cluster, and that's all you need to do with them.
Now, about your questions:
Is it a problem to have monitoring-data in a cluster because one cannot see the monitoring-data if the cluster is offline?
No, it is not a problem with a separate cluster.
Is it common practice to have an additional cluster for shared services that should not have an impact on other parts of the application?
I think, yes. At least I did it several times, and I know some other projects with similar architecture.
Is it (easily) possible to send metrics and logs from one kubernetes-cluster to another (we are running kubernetes in OpenTelekomCloud which is basically OpenStack)?
Yes, nothing complex there. Usually, it does not depend on the platform.

Valid CoreOS multi tenancy scenario?

I'm currently tinkering with a scenario for using CoreOS. It's probably not a first-class use case, but I'd like to get a pointer on whether it's valid. As I'm really at the beginning of getting a grip on CoreOS, I hope that my "use case" is not totally off.
Imagine a multi-tenant application where every tenant gets its own runtime environment. Let's take a web app running on Node.js with PostgreSQL for data storage as given. Each tenant environment would be running on CoreOS in its respective containers. Data persistence is left out for now; for me it's currently more about general feasibility.
So why CoreOS?
Currently I try to stick with the idea of separated environments per tenant. To optimise the density of DB and web server instances per hardware host I thought CoreOS might be the right choice instead of "classic" virtualisation.
Another reason is that a lot of tenants might not need more than a single, smallish DB instance and a single, smallish web server. But there might be other tenants that need some constantly scaled out deployments. Others might need a temporary scale out during burst times. CoreOS sounds like a good fit here as well.
On the other side there must be a scalable messaging infrastructure (RabbitMQ) behind it that will handle a lot of messages. This infrastructure will be used by all tenants and ideally needs to be dynamically scalable. Probably there will be a "to be scaled" Elasticsearch infrastructure as well. Viewed through my current "CoreOS for everything" goggles, this seems a good fit too.
In case this whole scenario is generally valid, I currently cannot see how it would be possible to route the traffic for a generally available website to the different tenant containers.
Imagine the app is running at app.greatthing.tld. A user can log in and should be presented the app served for their tenant. Is this something socketplane and/or flannel are there to solve? Or what would a solution look like to get each tenant served by the right containers? I think it's a fairly general issue, but at least in the context of a CoreOS containerized environment I cannot see how to deal with it at all.
CoreOS takes care of scheduling your containers in the cluster with its own tools such as fleetctl/etcd/systemd, and also takes care of persistent storage when a container is rescheduled to a different machine, using Flocker (experimental). They have their own load balancers.
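To make the routing part of the question concrete, here is a very rough, hypothetical sketch of a tiny reverse proxy that maps a tenant to its container (assuming per-tenant subdomains; a lookup keyed off the logged-in session would work similarly). In a real CoreOS setup the BACKENDS map would come from a registry such as etcd rather than being hard-coded, and a real proxy would forward methods, headers and bodies rather than only GET:

require 'rack'
require 'net/http'
require 'uri'

# Placeholder tenant -> container addresses; in practice, look these up in etcd.
BACKENDS = {
  'acme'   => 'http://10.0.0.11:3000',
  'globex' => 'http://10.0.0.12:3000'
}

class TenantRouter
  def call(env)
    req    = Rack::Request.new(env)
    tenant = req.host.split('.').first          # e.g. acme.greatthing.tld -> acme
    target = BACKENDS[tenant]
    return [404, { 'Content-Type' => 'text/plain' }, ['unknown tenant']] unless target

    res = Net::HTTP.get_response(URI.join(target, req.fullpath))
    [res.code.to_i, { 'Content-Type' => res['Content-Type'] || 'text/html' }, [res.body]]
  end
end

# config.ru:  run TenantRouter.new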

FoundationDB, the layer: Is it hosted on client application or server nodes?

Recently I was reading about the concept of layers in FoundationDB. I like their idea: the decomposition of storage on one side and access to it on the other.
There are some unclear points regarding the implementation of the layers, especially how they communicate with the storage engine. There are two possible answers: they are part of the server nodes and communicate with the storage by fast native API calls (e.g. as linked modules hosted in the server process), or they are hosted inside the client application and communicate through a network protocol. For example, the SQL layer of many RDBMSs is hosted on the server. How are things with FoundationDB?
PS: These two cases are quite different from a performance point of view, especially when the client-server communication has high latency.
To expand on what Eonil said: the answer rests on the distinction between two different senses of "client" and "server".
Layers are not run within the database server processes. They use the FDB client API to make requests of the database, and do not (with one exception*) get to pierce the transactional key-value abstraction.
However, there is nothing stopping you from running the layers on the same physical (or virtual) machines as the database server processes. And, as that post from the community site mentions, there are use cases where you might very much wish to do this in order to minimize latencies.
*The exception is the Locality API, which is mostly useful in exactly those cases where you want to co-locate client-side layers with the data on which they operate.
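As a small illustration of where a layer lives, here is a minimal sketch using the fdb client gem for Ruby. It follows the binding's getting-started style, but the API version number and the key naming convention are assumptions on my part:

require 'fdb'

FDB.api_version 620            # must match the client library you have installed
db = FDB.open                  # talks to the cluster through the client library

# A trivial "profile layer": ordinary client-side code that defines its own
# key convention on top of the ordered key-value store. Nothing here runs
# inside the database server processes.
def save_profile(db, user_id, name)
  db.transact do |tr|
    tr["profile/#{user_id}/name"] = name
  end
end

save_profile(db, '42', 'Ada')
puts db['profile/42/name']     # read back through the same client API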
Layers are built on top of the client-side library.
Cited from http://community.foundationdb.com/questions/153/what-layers-do-you-want-to-see-first
That's a good question. One reason that it doesn't always make sense to run layers on the server is that in a distributed database, that data is scattered--the servers themselves are a network hop away from a random piece of data, just like the client.
Of course, for something like an analytics layer which is aware of what data each server contains, it makes sense to run a distributed version co-located with each of the machines in the FDB cluster.

How to separate Erlang applications from each other?

Is there something like a virtual environment or sandboxing for Erlang applications? Is it possible to share nodes between many application owners, knowing that nobody can break another owner's app?
Nodes are the virtual environment for Erlang applications, so you can't just load arbitrary applications into one node and have everything play nice. There are way too many kinds of shared resource to compete for within a node to allow that (module names, registered process names, ETS table names, ...).
However, nodes can very easily communicate with one another more or less transparently, so spinning up a new node for every collection of apps you don't want to manually vet for compatibility is fine. You can of course run more than one app in a node, but you do have to verify yourself that they won't step on each other's toes.
It doesn't cost you much memory or CPU to run multiple nodes, so I would almost always recommend running different Erlang systems (collections of apps that work together) in different nodes, even if you have only one physical machine.

Cloud computing: Learn to scale server up/down automatically

I'm really impressed with the power of cloud computing when it comes to the ability to scale your capacity up and down depending on your load.
How can I shift my paradigm and learn to write my applications that way? "Write it once and forget about it" (no matter what the future load is) would be the ideal.
How can I practice my skills in that area?
Set up a virtualization environment where I can add more VMs to the private cloud (via the command line?) based on some smart algorithm that forecasts the load over some period of time?
Ideally I want to practice it without buying actual Cloud computing services and just on my hardware.
The only thing I want to practice here is scaling the app/web role and/or message queue systems when the current workers have too many jobs in the queue. So let's rule out database scaling from the question's goal as too big a topic.
One option I will throw out is to use a native cloud execution framework. You might look at CloudIQ Platform; one component is CloudIQ Engine. It allows you to develop cloud-native apps in C/C++, Java and .NET. You get scale-up capability simply by adding workers to your cloud. The framework automatically distributes your applications to the new machine(s) and, once they're installed, will begin sending work to them as requests come in. So in effect the cloud handles the queueing issue for you.
Check out the Download and Community links for more information.
You should try AWS: Amazon is offering a free tier that gives you storage, messaging and micro instances (Linux only), so you can start developing small try-outs without paying. Writing an application that scales isn't that hard: try to break your flow into small, concurrent tasks. Client-server applications are even easier: use a load balancer to raise/terminate servers on demand.
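As a sketch of the queue-driven part (the only scaling the question cares about), here is roughly what that control loop looks like; queue_depth, worker_count and scale_to are hypothetical stand-ins for whatever your queue (RabbitMQ, SQS, ...) and provider API expose, and the thresholds are arbitrary assumptions:

JOBS_PER_WORKER = 100   # target backlog per worker
MIN_WORKERS     = 1
MAX_WORKERS     = 20

loop do
  backlog = queue_depth                            # e.g. messages waiting in the queue
  desired = (backlog.to_f / JOBS_PER_WORKER).ceil
  desired = desired.clamp(MIN_WORKERS, MAX_WORKERS)

  scale_to(desired) if desired != worker_count     # ask the provider for more/fewer workers
  sleep 60                                         # re-evaluate every minute
end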
