How Do I Configure Docker Containers Behind A Load Balancer? - docker

My IT infrastructure department has provided me with the following setup: a NetScaler load balancer (lb) in front of 3 virtual machines (vm01, vm02, vm03). Each virtual machine was set up with IIS.
I have installed Docker Engine on all three virtual machines and have replicated the same 3 containers (appcontainer1, appcontainer2, appcontainer3) on all 3 virtual machines. Each container contains a .NET Core Web API application (api1, api2, api3).
Each container is configured to expose its port 80 for access to the API and is mapped to a port on the virtual machine where it is running. In other words, appcontainer1 is run with docker run -p 8091:80 ., appcontainer2 with docker run -p 8092:80 ., and appcontainer3 with docker run -p 8093:80 .
The problem I am running into is how to call my web applications from a client machine. For example, if I wanted to call api1 on vm01 directly, I would call vm01.domain.com:8091, but how do I make a call to lb.domain.com:8091 and have it resolve correctly to one of the virtual machines?
Do I configure the NetScaler load balancer to be a reverse proxy and forward the port along to the virtual machines?
Do I configure a separate DNS entry per application (api1.domain.com, api2.domain.com, api3.domain.com) and configure IIS (or nginx or Apache) on each virtual machine to resolve to the appropriate port?
Is there a way to configure Docker to do this?
Am I doing it all wrong and overthinking the whole thing?
Should I be using some sort of container orchestration instead?
Is there a sensible way to do this without bothering the infrastructure team to reconfigure everything?

You need to set up IIS on each VM as a reverse proxy with the ARR (Application Request Routing) module. There are a few quirks (hello, Microsoft) that may arise during this process. I cannot say much about the load balancer itself, but it shouldn't be hard to configure it to distribute the load evenly across the machines. All you need is to tell the LB to direct any call to lb.domain.com:XXXX to one of the VMs in a round-robin manner. You can probably have it vary the port too, which allows you to have your traffic distributed amongst 3 VMs x 3 containers = 9 containers.
However, it is recommended not to expose the Kestrel server directly to the internet; instead, put it behind IIS or a similar proxy. To configure IIS to act as a reverse proxy, you can either build 3 sites and bind them to the corresponding ports with minimal configuration, or use a single site that resolves the incoming requests using rewrite rules. To be honest, IIS is a pain to use with Docker.
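As a rough illustration of the single-site option, a URL Rewrite rule set along these lines could forward path prefixes to the published container ports (a sketch only; the paths are assumptions based on the setup above, and it requires the URL Rewrite and ARR modules with proxying enabled):

    <!-- web.config of the IIS site acting as the reverse proxy (sketch) -->
    <configuration>
      <system.webServer>
        <rewrite>
          <rules>
            <!-- forward /api1/* to the container published on port 8091 -->
            <rule name="api1" stopProcessing="true">
              <match url="^api1/(.*)" />
              <action type="Rewrite" url="http://localhost:8091/{R:1}" />
            </rule>
            <!-- repeat for api2 -> 8092 and api3 -> 8093 -->
          </rules>
        </rewrite>
      </system.webServer>
    </configuration>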
BUT what I actually recommend is to use swarm, if your OS supports it, and expose a single port per VM. The supported OS versions are one of:
WS2019,
WS2016 with the 1709 update or later (these have no GUI),
Windows 10 with the 1709 update.
Swarm is still problematic on Windows :/ It also has very frustrating, seemingly random errors involving "localhost:PORT" and the like. For instance, I cannot access my containers on my server (WS2016, pre-1709) using a localhost:PORT combination. The same goes for my development machine (Win10, latest), which has only recently become an issue; it was fine before "something" happened and it stopped working.
If you are flexible about which proxy to use, I recommend taking a look at nginx, Kubernetes and, if you are on the experimental side, traefik, which lets you get away without using a container orchestration tool (i.e. swarm).
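For reference, a minimal sketch of the swarm variant (the image name is a placeholder; this assumes the three VMs can reach each other on the swarm ports):

    # on vm01 (becomes the manager)
    docker swarm init

    # on vm02 and vm03, join with the token printed by 'swarm init'
    docker swarm join --token <token> <manager-ip>:2377

    # publish one port per API; the routing mesh makes it reachable on every
    # node, so the NetScaler can simply round-robin lb.domain.com:8091
    docker service create --name api1 --replicas 3 -p 8091:80 myregistry/api1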

Related

Run two applications on same port on same machine

I had an interview 3 years back, and in one of the design interview rounds a question came up: how can you have two Java applications (deployed on Tomcat) run on the same port? You can use any tools like Docker etc., but you can't have a separate virtual machine (like VMware or VirtualBox). I am not sure if Docker can be used (I just said maybe we could use two Docker containers, not sure if that would be the right approach). Any ideas if it's possible, and how?
You can't have 2 programs that use the same port.
To solve that, you can set up a reverse proxy (Nginx, Traefik or the like) that listens on the port and then routes the traffic to the applications based on what the requests look like. The applications would listen on their own ports. So one port each.
You can route on different things, but in your case you might set it up so requests that start with /app1/ go to one application and requests that start with /app2/ go to the other.
Nginx and Traefik both have standard images available that are pretty easy to set up in Docker.
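A minimal sketch of such a configuration, assuming the two applications are containers named app1 and app2 listening on port 8080 internally (names and ports are made up for illustration):

    # default.conf of the nginx reverse proxy container (sketch)
    server {
        listen 80;

        # requests starting with /app1/ go to the first application;
        # the trailing slash strips the /app1/ prefix before forwarding
        location /app1/ {
            proxy_pass http://app1:8080/;
        }

        # requests starting with /app2/ go to the second application
        location /app2/ {
            proxy_pass http://app2:8080/;
        }
    }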
You can't have two processes listening on the same port on the same IP address on the same machine.
To work around this, as Hans Kilian says, you'll need a reverse proxy.
Alternatively, if the machine's network interfaces are configured for multiple IP addresses, you can assign one to each running server - and then you're free to use the same port on the other IP-address(es). This is independent of the actual server that you use - be it Tomcat, Docker, or anything else.
Naturally, configuring the different processes to listen on specific IP addresses depends on the software itself. As you're asking about Tomcat: its connectors (see server.xml) carry that configuration.
I consider reverse proxies the more standard approach found in the wild, but since you were talking about an interview question: this is another option.
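For illustration, binding two Tomcat instances to the same port on different addresses could look roughly like this in each instance's server.xml (the IP addresses are placeholders):

    <!-- Tomcat instance 1, bound to the first IP address -->
    <Connector port="8080" address="192.168.1.10" protocol="HTTP/1.1" />

    <!-- Tomcat instance 2, bound to the second IP address -->
    <Connector port="8080" address="192.168.1.11" protocol="HTTP/1.1" />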
The only approach for this scenario is to use a reverse proxy, which can route requests to the specific Java application based on URL matching and redirect logic.
The apps must be running on different ports; e.g., with nginx as the proxy, create two server blocks, each having its own location block.
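A hedged sketch of the two-server-block variant, routing by hostname to apps on ports 8081 and 8082 (hostnames and ports are placeholders):

    server {
        listen 80;
        server_name app1.example.com;
        location / {
            proxy_pass http://127.0.0.1:8081;
        }
    }

    server {
        listen 80;
        server_name app2.example.com;
        location / {
            proxy_pass http://127.0.0.1:8082;
        }
    }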

Docker based Web Hosting

I am posting this question due to lack of experience, and I need professional suggestions. The questions on SO are mainly about how to deploy or host multiple websites using Docker running on a single web host. This can be done, but is it ideal for moderate-traffic websites?
I deploy Docker-based containers on my local machine for development. A software container has a copy of the primary application, as well as all dependencies: libraries, languages, frameworks, and everything else.
It becomes easy for me to simply migrate the “docker-compose.yml” or “dockerfile” to any remote web server. All the software and dependencies get installed and will run just like on my local machine.
(Say) I have a VPS and I want to host multiple websites using Docker. The only thing that I need to configure is the port, so that the domains can be mapped to port 80. For this I have to use an extra nginx for routing.
But a VPS can be used to host multiple websites without the need for containerisation. So, is there any special benefit to running Docker on web servers like AWS, Google, Hostgator, etc., or is Docker only ideal for development on a local machine and not meant to be deployed on web servers for hosting?
The main benefits of Docker for simple web hosting are, in my opinion, the following:
isolation: each website/service might have different dependency requirements (one might require PHP 5, another PHP 7 and another Node.js).
separation of concerns: if you split your setup into multiple containers you can easily upgrade or replace one part of it. (Just consider a setup with two websites, which each need a Postgres database. If each website has its own db container you won't have any issue bumping the Postgres version of one of the websites without affecting the other.)
reproducibility: you can build the Docker image once, test it on acceptance, promote the exact same image to staging and later to production. You'll also be able to have the same environment locally as on your server.
environment and settings: each of your services might depend on a different environment (for example SMTP settings or a database connection). With containers you can easily supply each container its specific environment variables.
security: one can argue about this one, as containers themselves won't do much for you in terms of security. However, due to easier dependency upgrades, separated networking etc., most people will end up with a setup which is more secure. (Just think about the db containers again here: these can share a network with your app/website container and there is no need to expose the port locally.)
Note that you should be careful with Docker's port mapping. It uses iptables and will override the settings of most firewalls (like ufw) by default. There is a repo with information on how to avoid this here: https://github.com/chaifeng/ufw-docker
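One simple mitigation, independent of the linked repo, is to publish container ports only on the loopback interface so that only the reverse proxy is reachable from outside (the image name is a placeholder):

    # reachable only from the host itself, not from the internet
    docker run -d -p 127.0.0.1:8080:80 my-website-image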
Also, there are quite a few projects which make automating the routing of requests to the applications (in this case containers) very enjoyable and easy. They usually integrate a proper way to do SSL termination as well. I would strongly recommend looking into traefik if you set up a webserver with multiple containers which should all be accessible on ports 80 and 443.
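As a hedged sketch of what that looks like with Traefik (v2-style labels; domain names, image names and versions are placeholders, not a tested setup):

    # docker-compose.yml (sketch)
    services:
      traefik:
        image: traefik:v2.10
        command:
          - --providers.docker=true
          - --entrypoints.web.address=:80
        ports:
          - "80:80"
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock:ro

      site1:
        image: my-site1-image
        labels:
          - traefik.http.routers.site1.rule=Host(`site1.example.com`)

      site2:
        image: my-site2-image
        labels:
          - traefik.http.routers.site2.rule=Host(`site2.example.com`)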

Best approach to create containers

I am developing an application with Node.js and MySQL that has the following dependencies:
Nginx (for reverse proxying the db and the Node.js server)
Ghostscript (dependent OS is Ubuntu)
pdftk (dependent OS is Ubuntu)
I would like to know what would be the best approach if I want to use docker containers to pack my application.
Should I create one Nginx container, one Node.js container and one MySQL container and make them talk to each other? I know this is a better approach since it's scalable, but in this case how and where should I install Ghostscript and pdftk? (The Node.js application makes use of Ghostscript and pdftk for PDF files.)
or
should I create one Ubuntu Docker container and install everything (viz. Nginx, pdftk, Ghostscript, MySQL) in it?
Splitting an application up into separate containers requires a well-defined API that supports calls over the network (usually HTTP or some other application protocol on the TCP stack).
As both Ghostscript and pdftk are command-line tools invoked via a CLI, you cannot call them from another container out of the box; you would need to develop some external-facing API for that.
When setting the boundaries of your containers, think in terms of domains. The container becomes the smallest unit that you will deploy and scale. That unit should be self-contained and have a well-defined, single purpose.
It is not clear from your description exactly what role Nginx plays, but assuming it is some kind of client-facing webserver or proxy, three containers make sense in your case:
NodeJs + PDFTK + Ghostscripts (The application)
Nginx (The webserver/proxy)
MySQL (The database)
The Node.js application has all its application dependencies inside, but is more loosely coupled to Nginx and MySQL, with which it communicates over the network.
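A hedged sketch of that layout (image names, versions and credentials are placeholders; the Ghostscript and pdftk package names may differ per base image): the command-line tools live inside the application image, since the Node.js code shells out to them, while Nginx and MySQL run as their own containers.

    # Dockerfile for the Node.js application (sketch)
    FROM node:18
    # Ghostscript and pdftk are installed into the app image itself
    RUN apt-get update \
        && apt-get install -y ghostscript pdftk \
        && rm -rf /var/lib/apt/lists/*
    WORKDIR /app
    COPY package*.json ./
    RUN npm ci
    COPY . .
    CMD ["node", "server.js"]

    # docker-compose.yml (sketch)
    services:
      app:
        build: .
      nginx:
        image: nginx:alpine
        ports:
          - "80:80"
        volumes:
          - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
      db:
        image: mysql:8
        environment:
          MYSQL_ROOT_PASSWORD: example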
You should create separate containers for each application, because this allows you to achieve:
Independent deploy.
Independent scaling.
Independent development.
Isolation and security.
For convenience, you can use docker-compose, which allows you to configure and launch multiple Docker containers with a single command.
I would recommend not deploying the database in a Docker container in production, because the database stores state, it is less reliable there, and it increases the complexity of support.

How to manage multiple projects on Docker?

In our company we have ~7 projects, each based on Docker. Each project contains base services, like MySQL, Nginx, PHP. Some projects communicate with other projects. Because many services use the same ports, we create a new Docker host (docker-machine) for each project. From here, a few problems arise:
VirtualBox assigns a random IP to each Docker host, depending on the sequence of execution.
It is hard to switch from project to project; you need to set different shell envs all the time, and it is easy to make a mistake.
So, I'm searching for a more enterprise-grade solution to manage many Docker machines, or some technique that can help with the current situation.
I had similar problems last summer.
First, I started to deploy my projects to a swarm cluster as services, instead of clustering several Docker VMs. This enabled me to work with the services using only their service IDs. How you separate projects into services is important; this part may be cumbersome depending on your project.
https://docs.docker.com/engine/swarm/swarm-tutorial/deploy-service/
Then, I built my configuration and monitoring software once on the swarm manager and use it from there. You can use your automation tools on the manager to control services.
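A minimal sketch of that workflow, assuming the hosts are already joined into a swarm (project names are placeholders): each project becomes a stack on the same cluster, so there is no per-project docker-machine or shell environment to switch.

    # deploy each project as a stack from its own compose file (run on the manager)
    docker stack deploy -c project1/docker-compose.yml project1
    docker stack deploy -c project2/docker-compose.yml project2

    # inspect all services across projects from one place
    docker service ls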
A virtual machine consumes resources, and it is better to avoid it if it is not necessary. Instead, you could deploy the projects in a Docker swarm on bare metal.
But because every project has an entry point that needs to be accessible from the outside world (i.e. https://site1.com and https://site2.com), you can't expose the same port (443 or 80) for all the frontend services in the same swarm. For this you can use a reverse proxy like HAProxy or Nginx that forwards the requests to the right service based on the hostname. The reverse proxy could also be a service in the swarm. In this situation you should not expose the projects' ports anymore.
A reverse proxy has many other advantages, like SSL termination (this makes the SSL certificate management a lot easier).
If you add the projects to the same custom network, then the services from different projects can communicate securely and directly, using their Docker service name and the internal port (e.g. 80).
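A hedged sketch of that layout (image and network names are placeholders): only the proxy publishes ports 80/443, and it reaches the project frontends by service name over a shared overlay network.

    # shared overlay network joined by the proxy and the project frontends
    docker network create --driver overlay proxy-net

    # the reverse proxy is the only service that publishes ports
    docker service create --name proxy --network proxy-net -p 80:80 -p 443:443 nginx

    # project services join the network but publish nothing
    docker service create --name site1 --network proxy-net my-site1-image
    docker service create --name site2 --network proxy-net my-site2-image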

HaProxy for service discovery on a marathon mesos docker linked containers

This is not asked anywhere I have checked. Here is what I have done: I am able to deploy a single instance of Mesos, Marathon and Docker. Moving a step ahead, I want to have 2 Mesos slaves (Docker containers) linked to each other. Using just Docker, the same can be achieved with the Docker link feature, but while using the orchestrator (Mesos) and scheduler (Marathon) it seems you need to use service discovery.
My setup is simple and running on a single host. I will have 2 Docker containers, one running a simple pub/sub application and one running RabbitMQ. How can I use HAProxy in this setup? I have seen some documents provided by Mesosphere
(http://mesosphere.com/docs/getting-started/service-discovery/), but it is not clear how to go about it.
The canonical approach for service discovery with Mesos + Marathon + Docker is currently what is described in the document you linked.
I'm assuming you're able to get the two applications running in Marathon already.
Typically what happens is:
1) Configure your application definition to include the ports that your application requires.
2) You set up the provided haproxy-marathon-bridge script to run periodically using a utility like cron. This script scrapes Marathon's API to figure out what host and port the application instances are running on and what the known "friendly" port is.
In the example in the service discovery article, the first application has friendly ports of 80 and 443, whilst the second has a friendly port of 8081.
The script then generates a haproxy.cfg configuration that has rules mapping localhost:friendly_port to actual_host:actual_port.
3) Configure your applications to look for each other on localhost:friendly_port. HAProxy will route connections appropriately.
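Conceptually, the generated haproxy.cfg contains blocks along these lines (the hosts and ports here are illustrative; the real values come from Marathon's API):

    # sketch of one generated block: friendly port -> actual task location
    listen rabbitmq
        bind 127.0.0.1:8081
        mode tcp
        balance roundrobin
        server task1 10.0.0.5:31852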
Hope this helps your understanding!
I created an HAProxy service discovery Docker container that you can run in Mesos. It's not production-ready, but I am using it in my development environment doing exactly what you're trying to do. The reason I prefer this over what comes with Marathon is that I haven't found a good way to do complicated HAProxy configurations with haproxy-marathon-bridge. With Spiderweb you can create a template for the HAProxy configuration, which enables you to do things such as ACL routing etc. It doesn't support health checks yet, which is something that will need to be done before it's production-ready. You can see the project here: https://github.com/SBRDevelopment/spiderweb.
We have combined Mesos and Marathon with Consul and Registrator, so in the end you have the HAProxy configuration auto-generated with consul-template.
Try https://github.com/eBayClassifiedsGroup/PanteraS. All in one container.
With Mesos-DNS you can also do the following:
Set up Mesos-DNS as in this guide: http://programmableinfrastructure.com/guides/service-discovery/mesos-dns-haproxy-marathon/ (you can skip the HAProxy steps; they are not required).
When you start your Docker containers, make sure that they have "nameserver %slave_ip_with_mesos_dns%" (replace the placeholder with the IP address) in their /etc/resolv.conf files.
If, let's say, the name of an app is "peek", it should be reachable from other applications at peek.marathon.mesos.
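For example, assuming the agent running Mesos-DNS has the IP 10.0.0.5 and the "peek" app serves HTTP on port 8080 (both values are assumptions for illustration):

    # inside the container: /etc/resolv.conf
    nameserver 10.0.0.5

    # another application can then reach the app by name
    curl http://peek.marathon.mesos:8080/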
