How to mimic a "Docker containers behind reverse proxy or load balancer" setup during development

Developing and testing applications inside Docker containers and finally running them the same way in production is a great approach to reducing the "used to work on my machine but is magically failing in production" risk.
This is also reflected in factor no. 10 of the well-known twelve-factor manifesto, which I try to follow wherever it makes sense for the given use case: keep development, staging, and production as similar as possible.
On our production server, we have a reverse proxy in place which accepts incoming requests on port 80 and forwards them to the correct container based on the Host header used when accessing the host via virtual domains: e.g. requests to app1.domain.name go to the app1 container, requests to app2.domain.name go to the app2 container, and so on. We use Traefik for that, but it could just as well be jwilder/nginx-proxy or any other reverse proxy or load balancer. No other ports of the application containers are publicly exposed.
My question now is, what would be the most efficient way to simulate this setup during development? I could think of the following ones:
Ignore the reverse proxy during development and expose a public port for each service through which it can be reached. In production, however, do not publicly expose any ports. While this is easy to do, it does not give exact parity between the development and production setups.
Run a similar reverse proxy locally during development. While this gives much better parity between the development and production environments, the requirement needs to be documented somewhere and places some burden on new developers to get the prerequisites right before actually diving into the application itself - something the Docker approach actually tries to avoid. (A minimal Compose sketch of this option follows below.)
Automate setting up a development environment which mimics the production one (including the reverse proxy) by putting everything into a virtual machine (e.g. with Vagrant). While this would be convenient from a developer's point of view, it takes additional time to set up initially and consumes more resources.
Did I miss some other approach which is superior to the ones described?
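For concreteness, option 2 would boil down to shipping the reverse proxy as just another service in the project's docker-compose.yml, so that a plain docker compose up starts it together with the applications. A minimal sketch, assuming Traefik v2, hypothetical app1/app2 images listening on port 80, and *.localhost host names (which resolve to 127.0.0.1 on most systems):

```yaml
# docker-compose.yml sketch for option 2
services:
  reverse-proxy:
    image: traefik:v2.10
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --entrypoints.web.address=:80
    ports:
      - "80:80"                                   # the only published port, as in production
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro

  app1:
    image: app1                                   # hypothetical application image
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.app1.rule=Host(`app1.localhost`)"
      - "traefik.http.routers.app1.entrypoints=web"

  app2:
    image: app2                                   # hypothetical application image
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.app2.rule=Host(`app2.localhost`)"
      - "traefik.http.routers.app2.entrypoints=web"
```

With that, http://app1.localhost and http://app2.localhost go through the same Host-header routing as production, and docker compose up remains the only command developers need to run.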

Related

Should I dockerize mysql and nginx in production?

Our company has a dedicated Linux server on which it wants to host all services.
We have several WordPress, Laravel, ASP and Node websites. We want to dockerize all of these, but we want all services to use the same MySQL instance.
Should we also run MySQL in Docker or not?
What happens when we bring the Docker Compose setup of one of the projects up or down? Do the projects affect each other?
I am a little confused.
Well, it all depends on the size of your application/services. On a single virtual machine, I would not suggest dockerizing everything and running one docker-compose to bring all services up. Take a database like MySQL: inside a Docker container there are constraints such as the maximum size of the volume/container and networking, which you need to take care of with additional docker-compose parameters and daemon changes. All of this can be configured, but finding out what exactly needs to be configured, and how, is a painful process. There can also be problems with database replication: you should not run a single database instance in production. What if the data gets lost? Shouldn't you have a second replica?
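For reference, the setup the question describes - several independent Compose projects sharing one MySQL instance - is usually wired together with a pre-created external Docker network, so bringing one project up or down does not touch the others. A rough sketch, with hypothetical project layouts and a network created beforehand via docker network create shared-db:

```yaml
# mysql/docker-compose.yml - the shared database as its own Compose project
services:
  mysql:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: example        # hypothetical; use secrets in practice
    volumes:
      - db-data:/var/lib/mysql            # named volume, so data survives up/down cycles
    networks:
      - shared-db                         # no host ports published

volumes:
  db-data:

networks:
  shared-db:
    external: true
```

Each website project then just joins that same network:

```yaml
# site1/docker-compose.yml - one of the websites
services:
  wordpress:
    image: wordpress:latest
    environment:
      WORDPRESS_DB_HOST: mysql            # resolves to the shared container over shared-db
    networks:
      - shared-db

networks:
  shared-db:
    external: true
```

Whether a single, un-replicated MySQL like this is acceptable in production is exactly the concern raised above.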
Now, for the reverse proxy, it depends - on the size of the production setup as well. What happens if the proxy container is restarted or upgraded? Will the proxy be down and all your services be unavailable? Yes! It may only be for a few minutes, but this is production we are talking about.
On the other hand, it all depends on the size of the project, the size of the traffic, and the budget. Take for example a deployment on Kubernetes (you did not specify the deployment target, only docker-compose, so I will default to Kubernetes), where everything is in the form of containers. On each node you have an ingress controller (one of the most popular is nginx). If this is production you are talking about, then you can write ingress rules to route the traffic. The ingress controller is deployed as a DaemonSet, so each node has its own ingress controller, and if one node is down you still have another one. The same goes for the database.
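To make the "ingress rules" part concrete, such a rule is just a small manifest mapping a host name to a backend Service. A minimal sketch, assuming the NGINX ingress controller is installed and a hypothetical app1 Service listens on port 80:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app1
spec:
  ingressClassName: nginx          # assumes the NGINX ingress controller
  rules:
    - host: app1.domain.name       # hypothetical virtual host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app1         # hypothetical Service fronting the app pods
                port:
                  number: 80
```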
What I am trying to say is that running a simple docker-compose setup on a single machine in production is very risky. Use an environment that can scale horizontally or vertically (Docker Swarm, Kubernetes). I hope I clarified the idea behind the production deployment well.

Docker based Web Hosting

I am posting this question due to a lack of experience, and I need professional suggestions. The questions on SO are mainly about how to deploy or host multiple websites using Docker running on a single web host. This can be done, but is it ideal for moderate-traffic websites?
I deploy Docker-based containers on my local machine for development. A software container has a copy of the primary application, as well as all its dependencies: libraries, languages, frameworks, and everything else.
It becomes easy for me to simply migrate the "docker-compose.yml" or "Dockerfile" to any remote web server. All the software and dependencies get installed and everything runs just like on my local machine.
Say I have a VPS and I want to host multiple websites using Docker. The only thing I need to configure is the ports, so that the domains can be mapped to port 80. For this I have to run an extra NGINX for routing.
But a VPS can be used to host multiple websites without the need for containerisation. So, is there any special benefit to running Docker on web servers like AWS, Google, HostGator, etc.? Or is Docker only ideal for development on a local machine and not meant to be deployed on web servers for hosting?
The main benefits of Docker for simple web hosting are, in my opinion, the following:
isolation: each website/service might have different dependency requirements (one might require PHP 5, another PHP 7, and another Node.js).
separation of concerns: if you split your setup into multiple containers, you can easily upgrade or replace one part of it. (Just consider a setup with two websites which each need a Postgres database. If each website has its own db container, you can bump the Postgres version of one website without affecting the other.)
reproducibility: you can build the Docker image once, test it on acceptance, and promote the exact same image to staging and later to production. You'll also have the same environment locally as on your server.
environment and settings: each of your services might depend on a different environment (for example SMTP settings or a database connection). With containers you can easily supply each container its specific environment variables.
security: one can argue about this one, as containers by themselves won't do much for you in terms of security. However, due to easier dependency upgrades, separated networking, etc., most people will end up with a setup which is more secure. (Just think about the db containers again: these can share a network with your app/website container, and there is no need to publish the database port at all; a Compose sketch of this follows the list.)
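A minimal Compose sketch of the last two points (image, variable and network names are hypothetical): each container gets its own environment, and the database shares an internal network with the website without publishing any port on the host.

```yaml
services:
  website:
    image: my-website:latest              # hypothetical image
    environment:
      DB_HOST: db                         # per-container settings via environment variables
      SMTP_HOST: smtp.example.com
    ports:
      - "8080:80"                         # only the website is reachable from outside
    networks:
      - backend

  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example          # hypothetical; use secrets in practice
    networks:
      - backend                           # shared internal network, no published ports

networks:
  backend:
```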
Note that you should be careful with Docker's port mapping. It uses iptables and will override the settings of most firewalls (like ufw) by default. There is a repo with information on how to avoid this here: https://github.com/chaifeng/ufw-docker
Also, there are quite a few projects which make routing requests to the applications (in this case containers) very enjoyable and easy. They usually integrate a proper way to do SSL termination as well. I would strongly recommend looking into Traefik if you set up a web server with multiple containers which should all be accessible on ports 80 and 443.

How to setup nginx in front of node in docker for Cloud Run?

I need to set up an nginx reverse proxy in front of a Node.js app that needs to be deployed on Google Cloud Run.
Use Cases
- Need to serve assets gzipped via nginx (I don't want to burden Node with gzip compression)
- To block small DDOS attacks
I didn't find any tutorial on setting up nginx and Node in Cloud Run.
Also, I need to install PM2 for Node.
How do I do this setup in Docker? Also, how can I configure nginx before deploying?
Thanks in advance
"I need to set up an nginx reverse proxy in front of a Node.js app that needs to be deployed on Google Cloud Run."
Cloud Run already provides a reverse proxy - Cloud Run Proxy. This is the service that load balances, provides custom domains, authentication, etc. for Cloud Run. However, there is nothing in the design of Cloud Run to prevent you from using Nginx as a reverse proxy inside your container, nor from using Nginx as a separate container front-end to another Cloud Run service. Note that in the latter case you will be paying twice as much, as you will need two Cloud Run services: one for the Nginx service URL and another for the node application.
"Use Cases - Need to serve assets gzipped via nginx (I don't want to burden Node with gzip compression) - To block small DDOS attacks"
You can either perform compression in your node app or in Nginx. The result is the same. The performance impact is the same. Nginx does not provide any overhead savings. Nginx may be more convenient in some cases.
Regarding your comment about blocking small DDoS attacks: Cloud Run autoscales, which means each Cloud Run instance will have some limited exposure to a DoS. As the DDoS traffic increases, Cloud Run will launch more instances of your container. Without a prior request from you, Cloud Run will stop scaling at 1,000 instances. Nginx will not provide any benefit that I can think of to mitigate a DDoS attack.
"I didn't find any tutorial on setting up nginx and Node in Cloud Run."
I am not aware of a specific document covering Nginx and Cloud Run. However, you do not need one. Any document covering Nginx and Docker will be fine. If you want to run Nginx in the same container as your node application you will need to write a custom script to launch both Nginx and Node.
"Also, I need to install PM2 for Node."
Not possible. PM2 has a user interface and GUI. Cloud Run only exposes $PORT over HTTP from a Cloud Run instance.
"How do I do this setup in Docker? Also, how can I configure nginx before deploying?"
There are numerous tutorials on the Internet for setting up Nginx and Docker; two examples are linked below.
How to run NGINX as a Docker container
Deploying NGINX and NGINX Plus on Docker
I have answered each of your questions. Now some advice:
Using Nginx with Cloud Run does not make any sense with a Node.js application. Just run your node application and let Cloud Run Proxy do its job.
Compression is CPU intensive. Cloud Run is designed for HTTP style microservices that are small, fast, and compact. You will pay for increased CPU time. If you have content that needs to be compressed, compress it first and serve the content compressed. There are cases where compression in Cloud Run is useful and/or correct, but look at your design and optimize where possible. Static content should be served by Cloud Storage, for example.
Cloud Run can handle a Node.js application easily with excellent performance and scalability provided that you follow its design criteria and purpose.
Key factors to keep in mind:
Low cost: you only pay for requests, and overlapping requests have the same cost as one request.
Stateless: containers are shut down when not needed, which means you must design for restarts. Store state elsewhere, such as in a database.
Only serves traffic on port $PORT, which today is 8080.
Public traffic can be either HTTP or HTTPS. Traffic from the Cloud Run Proxy to the container is HTTP.
Custom domain names. Cloud Run makes HTTPS for URLs very easy.
UPDATE: Only HTTPS is now supported for the public endpoint (Public Traffic).
I think you should consider using a different approach.
Running multiple processes in a single container is not a best practice. The more common implementation of a proxy as you describe is to use 2 containers (the proxy is often called the sidecar) but this is not possible with Cloud Run.
Google App Engine may be more suitable.
App Engine Flexible permits deployment of containers that are proxied (behind the scenes) by Nginx. You can serve static content with Flexible and can incorporate a CDN. App Engine Standard addresses your needs too.
https://cloud.google.com/appengine/docs/flexible/nodejs/serving-static-files
https://cloud.google.com/appengine/docs/standard/nodejs/runtime
Like Cloud Run, App Engine is serverless but provides more flexibility and is a more established service. App Engine integrates with more (all?) GCP services too whereas Cloud Run is limited to a subset.
Alternatively, you may consider Kubernetes (Engine). This provides almost limitless flexibility but requires more ops. As you're likely aware, there's a Cloud Run implementation that runs atop Kubernetes, Istio and Knative.
Cloud Run is a compelling service, but it is only appropriate if you can meet its (currently) constrained requirements.
I have good news for you. I have written a blog post about exactly what you needed with sample code.
This example puts NGINX in the front (port 8080 on Cloud Run) while proxying the traffic selectively to another service running in the same container (on port 8081).
Read the blog post: https://ahmet.im/blog/cloud-run-multiple-processes-easy-way/
Source code: https://github.com/ahmetb/multi-process-container-lazy-solution
Google Cloud Compute Systems
To understand GCP computing, it helps to first get an overview of the different compute options (a comparison diagram was shown here).
For your case, I totally recommend using App Engine Flex to deploy your application. It supports Docker containers, Node.js, and more. To understand how to deploy Node.js to GAE Flex, please visit this page: https://cloud.google.com/appengine/docs/flexible/nodejs/quickstart
You can install third-party libraries if you want. Moreover, GCP supports global/internal load balancers, which you can put in front of your GAE services.
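For completeness, a GAE Flexible deployment is driven by an app.yaml file next to your code. A minimal sketch, assuming you want the custom runtime so that your own Dockerfile (including nginx, if you insist on it) is used:

```yaml
# app.yaml - minimal App Engine Flexible config (sketch)
runtime: custom    # build from the Dockerfile in the same directory
env: flex

# or, for a plain Node.js app without a custom Dockerfile:
# runtime: nodejs
# env: flex
```

You then deploy with gcloud app deploy; as noted above, Flexible instances are already proxied by Nginx behind the scenes.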

How Do I Configure Docker Containers Behind A Load Balancer?

My IT infrastructure department has provided me with the following setup: a Netscaler load balancer (lb) in front of 3 virtual machines (vm01, vm02, vm03). Each virtual machine was set up with IIS.
I have installed Docker Engine on all three virtual machines and have replicated the same 3 containers (appcontainer1, appcontainer2, appcontainer3) on all 3 virtual machines. Each container contains a .NET Core Web API application (api1, api2, api3).
Each container is configured to expose its port 80 for access to the API and is mapped to a port on the virtual machine where it is running. In other words, appcontainer1 is run with docker run -p 8091:80, appcontainer2 with docker run -p 8092:80, and appcontainer3 with docker run -p 8093:80.
The problem I am running into is how to call my web applications from a client machine. For example, if I wanted to directly call api1 on vm01, I would call vm01.domain.com:8091, but how do I make a call to lb.domain.com:8091 and have it resolve correctly to one of the virtual machines?
(A crudely put together Paint drawing of the setup was attached here.)
Do I configure the netscaler load balancer to be a reverse proxy and forward the port along to the virtual machines?
Do I configure a separate DNS entry per application (api1.domain.com, api2.domain.com, api3.domain.com) and configure IIS (or nginx or Apache) on each virtual machine to resolve to the appropriate port?
Is there a way to configure Docker to do this?
Am I doing it all wrong and over thinking the whole thing?
Should I be using some sort of container orchestration instead?
Is there a sensible way to do this without bothering the infrastructure team to reconfigure everything?
You need to set up IIS on each VM as a reverse proxy with the ARR (Application Request Routing) module. There are a few quirks that MAY arise (hello, Microsoft) during this process. I cannot say much about the load balancer, though; still, it shouldn't be hard to configure it to evenly distribute the load across the machines. All you need is to tell the LB to direct any call to lb.domain.com:XXXX to one of the VMs in a round-robin manner. You probably can make it vary the port too, which allows you to have your traffic distributed amongst 3 VMs x 3 containers = 9 containers.
However, it is recommended not to expose the Kestrel server directly on the net; instead, put it behind IIS or similar. To configure IIS to act as a reverse proxy, you can either build three sites and bind them to the corresponding ports with minimal configuration, or use a single site and resolve the incoming requests using rewrite rules. To be honest, IIS is a pain to use with Docker.
BUT what I actually recommend is to use Swarm, if your OS supports it, and expose a single port per VM (a stack-file sketch follows the version list below). The OS needs to be one of:
WS2019,
WS2016 1709 update or later (These have no GUI)
Windows 10 1709 update.
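To illustrate the single-port-per-VM idea: with Swarm you publish each port once in a stack file, and the routing mesh answers on that port on every node, forwarding the request to a healthy container wherever it runs. A rough sketch with hypothetical image names, assuming the three VMs are joined into one swarm:

```yaml
# api-stack.yml - sketch; deploy with: docker stack deploy -c api-stack.yml apis
version: "3.8"
services:
  api1:
    image: registry.example.com/api1     # hypothetical image
    ports:
      - "8091:80"                        # published via the routing mesh on every node
    deploy:
      replicas: 3
  api2:
    image: registry.example.com/api2     # hypothetical image
    ports:
      - "8092:80"
    deploy:
      replicas: 3
  api3:
    image: registry.example.com/api3     # hypothetical image
    ports:
      - "8093:80"
    deploy:
      replicas: 3
```

The load balancer can then round-robin lb.domain.com:8091 across all three VMs without caring which VM currently runs an api1 container.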
Swarm is still problematic on Windows :/ It also has very frustrating, seemingly random errors involving "localhost:PORT" and the like. For instance, I cannot access my containers on my server (WS2016, pre-1709) using the localhost:PORT combination. The same goes for my development machine (Win10, latest), where this has only recently become an issue; it was fine before "something" happened and it stopped working.
If you are flexible about which proxy to use, I recommend taking a look at nginx and Kubernetes, and, if you are on the experimental side, Traefik, which allows you to get away without using a container orchestration tool (i.e. Swarm).

Multiple images inside one container

So, here is the problem: I need to do some development, and for that I need the following packages:
MongoDB
Node.js
Nginx
RabbitMQ
Redis
One option is to take an Ubuntu image, create a container, install the packages one by one, start my server, and expose the ports.
But this could just as easily be done in a virtual machine, and it would not use the power of Docker. So instead I would have to start building my own image with these packages. Now here is the question: if I start writing my Dockerfile and place the commands to install Node.js (and the others) inside it, this again becomes the same thing as virtualization.
What I want is to start from Ubuntu and keep adding references to MongoDB, Node.js, RabbitMQ, Nginx, and Redis inside the Dockerfile, and finally expose the respective ports.
Here are the queries I have:
Is this possible? Like adding references to other images inside the Dockerfile when you are starting FROM one base image.
If yes, then how?
Also, is this the correct practice or not?
How does one do this kind of thing in Docker?
Thanks in advance.
Keep images light. Run one service per container. Use the official images on Docker Hub for MongoDB, Node.js, RabbitMQ, Nginx, etc., and extend them if needed. If you want to run everything in one fat container, you might as well just use a VM.
You can of course do crazy stuff in a dev setup, but why spend time setting up something that has zero value in a production environment? What if you need to scale up one of the services? How do you set memory and CPU constraints on each service? ... and the list goes on.
Don't make monolithic containers.
A good start is to use docker-compose to configure a set of services that can talk to each other. You can make a prod and dev version of your docker-compose.yml file.
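A rough sketch of what such a docker-compose.yml could look like for the stack in the question (image tags, the app build and the nginx config path are hypothetical):

```yaml
services:
  app:
    build: .                              # your Node.js app, built from its own Dockerfile
    environment:
      MONGO_URL: mongodb://mongo:27017/app
      REDIS_HOST: redis
      AMQP_URL: amqp://rabbitmq
    depends_on: [mongo, redis, rabbitmq]

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"                           # the only port published on the host
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro   # hypothetical proxy config pointing at app
    depends_on: [app]

  mongo:
    image: mongo:7
    volumes:
      - mongo-data:/data/db

  redis:
    image: redis:7

  rabbitmq:
    image: rabbitmq:3-management

volumes:
  mongo-data:
```

Each piece stays its own container, so you can bump, replace or scale one service without rebuilding the rest.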
Getting into the right frame of mind
In a perfect world you would run your containers in a clustered environment in production, to be able to scale your system and have concurrency, but that might be overkill depending on what you are running. It's at least good to have this in the back of your head, because it can help you make the right decisions.
Some points to think about if you want to be a purist:
How do you have persistent volume storage across multiple hosts?
Reverse proxy / load balancer should probably be the entry point into the system that talks to the containers using the internal network.
Is my service even able to run in a clustered environment (multiple instances of the container)?
You can of course do dirty things in dev such as mapping in host volumes for persistent storage (and many people who use docker standalone in prod do that as well).
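One common way to keep those dev-only shortcuts out of the main file is a docker-compose.override.yml, which Compose merges in automatically during development. A small sketch with hypothetical paths and a hypothetical app service:

```yaml
# docker-compose.override.yml - merged automatically by `docker compose up` in dev
services:
  app:
    volumes:
      - ./src:/usr/src/app        # hypothetical bind mount for live code reloading
    ports:
      - "3000:3000"               # reach the app directly while developing
    environment:
      NODE_ENV: development
```

In prod you deploy with an explicit -f docker-compose.yml (or a dedicated prod file), so the override never applies.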
Ideally we should separate Docker in dev from Docker in prod. Docker is a fantastic tool during development, as you can have Redis, Memcached, Postgres, MongoDB, RabbitMQ, Node or whatnot up and running in minutes and share that compose setup with the rest of the team. Docker in prod can be a completely different beast.
I would also like to add that I'm generally against the fanaticism that "everything should be running in docker" in prod. Run services in docker when it makes sense. It's also not uncommon for larger companies to make their own base images. This can be a lot of work and will require maintenance to keep up with security fixes etc. It's not necessarily the first thing you jump on when starting with docker.

Resources