I would like to set up a cluster for a small app/db (with Galera) and am looking into session storage.
The only reason for the cluster is that we need it to be always available; resources are not a concern.
I am wondering: if I set up my 3 hosts with the same stack:
app
db (sync with galera)
redis (1 master + 1 slave per host, clustered across the 3 hosts)
Will my app's containers be able to access every stored Redis session, no matter where it lives? Or are they bound to the linked Redis container I set them up with at first?
Right now I am struggling with debugging a Node.js application which is clustered and running on Docker. I found this link and this information in it:
Remember, Node.js is still single-threaded in most cases, so even on a
single server you’ll likely want to spin up multiple container
replicas to take advantage of multiple CPU’s
So does that mean clustering a Node.js app is pointless when it is meant to be deployed on Kubernetes?
EDIT: I should also say that by clustering I mean forking workers with cluster.fork(), and the goal of the application is to build a simple REST API that handles high traffic.
Short answer is yes.
Containers are, loosely speaking, mini VMs, and Kubernetes is the orchestration tool that manages all the running containers, checking health, resource allocation, load, etc.
So, if you are running your Node application in a container with an orchestration tool like Kubernetes, then clustering is moot, as each container will be using 1 CPU or a partial CPU depending on how you have it configured. Multiple containers essentially just place a new instance in rotation, and Kubernetes will direct traffic to each.
Clustering Node really comes into play when using tools like PM2. Say you have a beefy server with 8 CPUs: Node can only use 1 per instance, so tools like PM2 set up a cluster and route traffic across the running instances.
One thing to keep in mind, though, is that your application needs to be cluster- or container-ready. Nothing should be stored on the ephemeral disk: with each container restart that data is lost, and in a cluster there is no guarantee the same folders will be available to every running instance; if you cluster across multiple servers you are asking for trouble :D (this is where an object store like S3 comes into play).
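To make that concrete, here is a minimal sketch of the Kubernetes side of this, assuming the API is packaged as an image (the image name node-api:1.0 and port 3000 are placeholders): instead of calling cluster.fork() inside the process, you raise the replica count of the Deployment and let Kubernetes schedule one single-threaded Node process per replica.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-api
spec:
  replicas: 4                        # roughly plays the role of cluster.fork() x4
  selector:
    matchLabels:
      app: node-api
  template:
    metadata:
      labels:
        app: node-api
    spec:
      containers:
        - name: node-api
          image: node-api:1.0        # hypothetical image name
          ports:
            - containerPort: 3000    # hypothetical app port
          resources:
            limits:
              cpu: "1"               # about one CPU per replica, like one worker

A Service in front of the Deployment then spreads incoming requests across the replicas, which is the "rotation" described above.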
I am containerising a legacy application (web + service + C++ app) which runs in a Linux environment and currently has more than 10 clients.
I was able to set up and run the app (the C++ app) from Docker. The app reads a property file which is different for each client, so I mounted a volume to share that data from outside Docker (some files may get changed at runtime).
But my biggest concern is: how do I run a single container for different clients whose runtime (in-memory) state will be different? (The application runs forever, until someone kills/stops it.)
Do I need to run n containers for n clients?
Does Docker Swarm/Kubernetes have a feature for such a scenario?
Will each client get its own dedicated container?
Can you also suggest some further reading for such scenarios?
And for the database: since every client will have different data, should a different DB be used per client?
You can isolate containers by supplying them with a unique name and environment variables.
Example:
docker run --name client1 --env-file ./client1.env your-image-name-here
With this approach each client gets an isolated environment and a configuration unique to its context.
You need N containers for N clients, but you can use the same image for all of them. So: one container per customer, each identified by its own unique name and environment variables.
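As a minimal sketch (the variable names inside the file are made up for illustration; your app would read whatever keys it actually needs), a per-client env file could look like this, and you start one container per client from the same image:

# client1.env (hypothetical variable names)
CLIENT_NAME=client1
PROPERTY_FILE=/config/client1.properties
DB_NAME=client1_db

docker run --name client2 --env-file ./client2.env your-image-name-here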
What you describe sounds like your basic needs are:
same application for everyone
one configuration per client
one DB per client
The first point is easy to solve: "same application = same image".
Then, what you'll need to personalize per client is the configuration and the DB location.
If you want to containerize the DB, the questions are the same, so let's say you have a DB URL instead (it could be a container: it doesn't matter that much).
There are various options to personalize your application:
inherit from the common image and derive as many images as you need... with a serious impact on maintainability
add customisation through "docker-compose", which is easier to read, write and maintain!
If one instance per client is ok, just go with a docker-compose per client.
If you happen to need more, go for swarm mode (you can use swarm mode for one instance as well).
In both cases, you'll need a docker-compose file (actually you don't really NEED it, since you can do all the same through the command line, but in my opinion that is less easy to maintain and less easy to explain!).
It may look like this:
version: "3.7"
my-service:
image: your/common/image:1.0
volume:
- /a/path/from/host/with/confs:/a/path/to/container/conf/dir # will replace content there!
environment:
- "DB_URL=my-cny.denver.com:3121/db_client" # can vary or be the same if DB_NAME vary instead
- "DB_NAME=my-cny.denver.com:3121/db_client_1" #vary the name of the DB
- "DB_PSSWD=toto"
...
There are things you shouldn't do, such as writing a clear-text password here, but this is just an example.
Config files and sensitive data are better managed through the "config" and "secret" mechanisms.
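As a rough sketch of the secret approach (swarm mode; the secret name and file path are placeholders), the password is then read by the application from /run/secrets/db_password instead of being passed as a clear-text environment variable:

version: "3.7"
services:
  my-service:
    image: your/common/image:1.0
    secrets:
      - db_password                       # mounted at /run/secrets/db_password in the container
secrets:
  db_password:
    file: ./client1_db_password.txt       # hypothetical per-client secret file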
Is it good practice to have multiple services connect to the same database server, with each having its own database?
I guess having one Postgres instance is better than each service/container having its own instance.
My question is whether each service should:
run in its own container, with the DB server instance in the same container
run in its own container, with the DB for that service running in a separate container just for the DB (multiple DB containers, one per service)
share one DB SERVER with multiple databases: one container for all databases, and all services connect to this container/DB server
I understand that each service should have its own DB, but does that also mean they should be completely decoupled even at the server level?
I guess the reason I want to have one DB SERVER is so that resources are not "wasted" on multiple instances of the DB server running.
I also understand that having one server means all services will be coupled hardware-wise.
It doesn’t really matter. Modern infrastructures tend not to care about the overhead of running multiple copies of the same service. Since database I/O can often be a critical performance point, you might find it more manageable to not share a database, so that you can put databases under heavier load on dedicated and/or larger hardware.
(Also consider running your database(s) on dedicated hardware, not under Docker: they’re the one thing you must back up, and you’ll update them much less often than the rest of your application stack, so their lifecycle is fundamentally different from a disposable Docker container. If you’re using a public cloud service that offers a managed database service and are willing to pay for it, that can also be a very reasonable option.)
Whatever you decide, you almost definitely need to make all of the parameters (host, database, username, password) configurable, usually via environment variables. (I see too many SO questions that have host names hard-coded in source code.) You should be able to deploy the same image with different options in development, test, and production environments, which will generally have different host names.
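For example (a sketch only; the variable names and values are illustrative, not a required convention):

docker run -d \
  -e DB_HOST=db.internal.example.com \
  -e DB_NAME=orders \
  -e DB_USER=orders_app \
  -e DB_PASSWORD=changeme \
  your-service-image:1.0

The same image can then be started in development, test, and production with different values for these variables.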
Yes, it is OK to have multiple services on the same database server.
The deployment configuration should be guided by your operational needs (cost, performance, monitoring, etc.). As long as the services are decoupled, you have the freedom to move the data around as those needs evolve.
Well, you can consider these:
1) If your services don't care which database server they use, i.e. there is no requirement like "Service A must use MongoDB, Service B must use MS SQL Server", etc.
Then you can go with this setup:
One database server -> multiple databases -> one database for each service.
2) But if you find it does matter, for example:
Service A -> Postgres
Service B -> Postgres
Service C -> Ms. SQL Server
Service D -> MongoDB
then you will have 3 database servers: Services A & B share the same Postgres database server (containing 2 databases, one for Service A and one for Service B), Service C gets one MS SQL Server (containing 1 database), and Service D gets one MongoDB server (containing 1 database).
Usually you will find this case when different teams handle each service and each team makes its own technology choices.
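As a rough compose-style sketch of the mixed case (image names, database names, and variable names are made up; creating the per-service databases on the shared Postgres server is left to an init script or migration):

version: "3.7"
services:
  postgres:                          # one shared Postgres server
    image: postgres:12
    environment:
      - POSTGRES_PASSWORD=example    # use secrets in a real setup
  service-a:
    image: your/service-a:1.0        # hypothetical image
    environment:
      - DB_HOST=postgres
      - DB_NAME=service_a_db         # Service A's own database on the shared server
  service-b:
    image: your/service-b:1.0        # hypothetical image
    environment:
      - DB_HOST=postgres
      - DB_NAME=service_b_db         # Service B's own database, same server
  mongo:
    image: mongo:4.2                 # Service D gets its own MongoDB server
  service-d:
    image: your/service-d:1.0        # hypothetical image
    environment:
      - MONGO_URL=mongodb://mongo:27017/service_d_db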
We use MongoDB as the central database for our application, a consumer-facing mobile app. At present it is a 7-member replica set, with replica-set-1 being the primary at the moment. The backend which connects to the Mongo replica set is built in Ruby on Rails, and we use mongoid as the ODM.
There are mainly 3 pieces connecting to the MongoDB replica set:
The consumer application
The Admin and customer care management application
The data retrieval application (for analytics and similar purposes)
All these 3 apps connect to the same replica set as of now.
What I would like to know is whether it is possible to connect different applications to specific replica set members.
For example, the mobile app connects to the primary for writes and to replicas 2-4 for reads; the customer care management application connects to the primary (for writes) and to replicas 5-7 for reads.
I don't think explicitly listing specific replicas in the mongoid.yml configuration works. Even though I have listed only replica-set-7 in the mongoid hosts configuration for the data retrieval application, I still see some of its queries in the log files of replica-set-2 and 3.
So apparently MongoDB decides how to distribute the queries among its replicas regardless of the configuration specified on the mongoid client side.
I would really love to know if such a thing is possible at all using MongoDB and mongoid, as it would help us solve a lot of our load issues. Right now heavy queries from the customer care and data retrieval apps affect the consumer-facing mobile app as well, since the reads are not segregated. So basically I would like to separate out the reads.
Also, if this is possible at all, I would still be wary of potential pitfalls, especially since all 3 applications can write to the DB. For example, what happens if replica-3 suddenly becomes the primary after an election while it is not explicitly mentioned in the configuration of the data retrieval application?
I am not at all sure whether this is possible, but I just wanted to know if there is a way to do it. Any help would be really appreciated.
When you connect to any member of a replica set, the client is told the full state of the replica set and can connect to any of them. The initial set of hosts are just the seeds for that process - as long as your application can reach one of those hosts, it doesn't matter which hosts are in that configuration.
Mongo does have the concept of tagged replica set members. When creating a connection or executing a query you can specify the tags to use to select the replica set member to read from.
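As a rough sketch of how that could look with mongoid (the host names, replica set name, and tag values are made up, and the exact option keys depend on your mongoid/driver version): members are first tagged on the server side in the replica set configuration (rs.conf() / rs.reconfig()), and each application then expresses a read preference against those tags in its mongoid.yml:

production:
  clients:
    default:
      database: my_app_db
      hosts:
        - replica-set-1.example.com:27017   # seed list only; the driver discovers the rest
        - replica-set-7.example.com:27017
      options:
        replica_set: my-replica-set
        read:
          mode: :secondary_preferred
          tag_sets:                          # only read from members tagged for this app
            - use: analytics

Writes still always go to the current primary, whichever member that happens to be after an election, so the concern about replica-3 becoming primary is handled by the driver rather than by this configuration.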
My question is related to microservices and service discovery for a service which is spread across several hosts.
The setup is as follows:
2 docker hosts (host A & host B)
a Consul server (service discovery)
Let’s say that I have 2 services:
service A
service B
Service B is deployed 10 times (with random ports): 5 times on host A and 5 times on host B.
When service A communicates with service B, for example, it sends a request to serviceB.example.com (hard coded).
In order to get an IP and a port, service A should query the Consul server for an SRV record.
It will get 10 ip:port pairs, for which the client should apply some load-balancing logic.
Is there a simpler way to handle this that doesn't require me to develop a client-side resolver (+ load balancer) library?
Is there anything like that already implemented somewhere ?
Am I doing it all wrong ?
There are a few options:
Load balance on the client, as you suggest, for which you'll need to find a ready-built service discovery library that works with SRV records and handles load balancing and circuit breaking. Another answer suggested Netflix's Ribbon, which I have not used and which is only interesting if you are on the JVM. Note that if you are building your own, you might find it simpler to use Consul's HTTP API for discovering services rather than DNS SRV records; that way you can also "watch" for changes rather than caching the list and letting it get stale.
If you don't want to reinvent that particular wheel, another popular and simple option is to use an HAProxy instance as the load balancer. You can integrate it with Consul via consul-template, which will automatically watch for new/failed instances of your services and update the LB config (as sketched below). HAProxy then provides robust load balancing and health checking with a lot of options (HTTP/TCP, different balancing algorithms, etc.). One possible setup is to have a local HAProxy instance on each Docker host and a fixed port assigned statically to each logical service (you can store it in Consul KV), so you connect to localhost:1234 for service A, for example, and localhost:2345 for service B. A local instance means you don't pay for an extra round trip to a load balancer instance and then to the actual service instance, but this might not be an issue for you.
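A rough sketch of that consul-template integration (the service name "service-b" and the fixed port 2345 are placeholders): the template below is rendered into haproxy.cfg, and HAProxy is reloaded whenever Consul's view of the service changes.

# haproxy.ctmpl
frontend service_b_front
    bind *:2345                      # the fixed port agreed for service B
    default_backend service_b

backend service_b
    balance roundrobin{{ range service "service-b" }}
    server {{ .Node }}_{{ .Port }} {{ .Address }}:{{ .Port }} check{{ end }}

It would be run with something like:

consul-template -template "haproxy.ctmpl:/etc/haproxy/haproxy.cfg:service haproxy reload"

Service A then just connects to localhost:2345 and never needs to know where the instances of service B actually live.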
I suggest you check out Kontena. It solves this kind of problem out of the box. Every service gets an internal DNS name that you can use for communication between services. Kontena also has a built-in load balancer that is very easy to use, making it easy to create and scale microservices.
There are also lots of built-in features that help with developing containerized applications, like a private image registry, VPN access to running services, secrets management, stateful services, etc.
Kontena is an open source project and the code is available on GitHub.
If you are looking for a minimal setup, you can wrap the values you receive from Consul with Ribbon, Netflix's client-side load balancer.
You will find it as a module for Spring Cloud.
I didn't find an up-to-date standalone example, only this link to chrisgray's dropwizard-consul implementation, which uses it in a Dropwizard context. But it might serve as a starting point for you.