Docker: redirect traffic to containers - docker

We are a small design company, and I'm the only one who "codes" (making small scripts/tools for the creatives).
I have a server on a local network.
On this server, I installed docker and docker-compose.
On this server I want to have a few containers running, one per service (gitlab, taiga, wiki.js, mattermost, wekan)
When setting up the docker-compose.yml, how should I manage ports (and/or any other settings) so that:
First (case study): (let's say I just have one container running) when typing the host IP address in a web browser, it redirects to my service and displays, for example, /var/www/ if my service is a website
Second: when typing subdomain.myhostname in a web browser, it redirects to one specific service

It's a very broad question, and the answer depends strongly on one's experience. For something I consider fast and reliable, as far as small environments are concerned, you may want to take Rancher for a spin.
It's super easy to start with. What's more, there's a range of services like Gitlab or DokuWiki you can start with just one click. On top of that, you can configure a load balancer, that can perform the redirections you mentioned. I think it's one of the fastest options to get a functional and scalable stack. Definitely not the most stable one, compared to enterprise-grade OpenShift, but I think it'll do just fine.
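To make the "redirections" a bit more concrete: a load balancer or reverse proxy sitting in front of the containers typically picks the target service by looking at the HTTP Host header of each request. Below is a minimal Python sketch of that decision only. It is not how Rancher implements it, and every hostname, container name and port in it is a hypothetical example.

```python
# Sketch of host-based routing, the decision a reverse proxy / load balancer makes.
# All hostnames and backend addresses below are made-up examples.

ROUTES = {
    "gitlab.myhostname": "gitlab:80",
    "wiki.myhostname": "wikijs:3000",
    "chat.myhostname": "mattermost:8065",
}

# Used when the Host header matches no route, e.g. when browsing to the bare host IP.
DEFAULT_BACKEND = "wikijs:3000"


def pick_backend(host_header, routes=ROUTES, default=DEFAULT_BACKEND):
    """Return the backend (container:port) for a request's Host header."""
    # Drop an optional ":port" suffix and normalise case (hostnames are case-insensitive).
    # IPv6 literals such as "[::1]:80" are not handled in this sketch.
    hostname = host_header.split(":", 1)[0].strip().lower()
    return routes.get(hostname, default)


if __name__ == "__main__":
    print(pick_backend("gitlab.myhostname"))  # matches a subdomain route
    print(pick_backend("192.168.1.20"))       # bare IP falls back to the default
```

With this shape, the "first" case from the question (typing the host IP) is just the default backend, and the "second" case (subdomain.myhostname) is one entry per service in the routing table.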
I will not go through all the setup details as I believe it's not what the question is about, but you can start with setting up Rancher 1.6 docker server going step by step through the official doc guide. It's pretty straightforward - one bash command and you are up and running.
OpenShift is a platform competing with Rancher. To the best of my knowledge, it's harder to work with, especially if you have no experience. It's more stable, that's for sure, but it requires more effort in general.
I intentionally omitted a few options, as I assumed the OP wants it working ASAP while still being easily re-configurable, stable, and GUI-manageable.
-- edit a few years later --
Rancher and OpenShift are still actively developed and attract new users. Rancher has released a stable v2 since my original answer, so I no longer recommend looking at v1.6.

Related

What is the "proper" way to migrate from Docker Compose to Kubernetes?

My organization manages systems where each client is provisioned a VPS and then their tech stack is spun up on that system via Docker Compose.
Data is stored on-system, using Docker Compose volumes. None of the fancy named storage - just good old direct path volumes.
While this solution is workable, the problem is that this method does not scale. We can always give the VPS more CPU/Memory but that does not fix the underlying issues.
Staging / development environments must be brought up manually - and there is no service redundancy. Hot swapping is impossible with our current system.
Kubernetes has been pitched to me to solve our problems, but honestly I have no idea where to begin - most of the documentation is obtuse and I have failed to find somebody with our particular predicament.
The end goal would be to have just a few high-spec machines running Kubernetes - with redundancy, staging, and the ability to spin up new clients as necessary (without having to provision additional machines or external IPs).
What specific tools would my organization need to use to achieve this goal?
Are there any tools that would allow us to bring over our existing Docker Compose stacks into Kubernetes?
Where to begin: given what you're telling us, I would first look into my options for implementing some SDS (software-defined storage).
You're currently using local volumes, which you probably won't be able to do with Kubernetes - or at least shouldn't, if you don't want to bind your containers to a unique node.
The easiest way, while not necessarily the one I would recommend, would be to use some NFS servers. Even better: with some DRBD, Pacemaker/Corosync, using a VIP for failover; or the FreeBSD way: hastd, CARP, ifstated, maybe some ZFS. You would probably have to deploy distinct systems as your Kubernetes cluster scales, to distribute I/O: a single NFS server doesn't last long before its load goes over 50 and iowaits spike.
A better way would be to look into actual SDS solutions. One I could recommend is Ceph, though there are a lot of newer solutions I'm less familiar with, and there's GlusterFS, which I would definitely avoid. An easy way to deploy Ceph would be to use ceph-ansible.
Depending on what corporate hardware you have at your disposal, maybe you have some NetApp or equivalent, something that can implement NFS shares and/or some iSCSI gateways.
Now, those are all solutions you could run on the side, although note that you would also find "CNS" (container-native storage) solutions, which are meant to be deployed on top of Kubernetes. Ceph clusters can be managed using Rook. These can be interesting, though in terms of maintenance and operations they require good knowledge of both the solution you operate and Kubernetes/containers in general: troubleshooting issues and fixing outages may not be as easy as on a good old bare-metal/VM setup. For a first Kubernetes experience, I would hold off. Once you feel comfortable enough, go ahead.
In any case, another critical consideration before deploying your cluster is the network that will host your installation. Kubernetes should not be deployed directly on public instances: you would probably want a private VLAN, maybe an internal DNS, a local registry (which could be Kubernetes-hosted), and other tools such as an LDAP server, an SMTP relay, HTTP caches/proxies, load balancers to put in front of your API, ...
Once you've made up your mind regarding those issues, you can look into deploying a Kubernetes cluster using tools such as Kubespray (Ansible) or Kops (can use Terraform, and requires some cloud API, e.g. AWS). Both projects are part of the Kubernetes project and maintained by its community. Kubespray covers all scenarios (IaaS and bare-metal), integrates with popular SDS out of the box, and can ship with various ingress controllers. Overall it offers good defaults and lots of variables to customize your installation.
Start with a 3-master 2-workers cluster, make sure the resulting cluster matches what you would expect.
Before going to prod, take your time to properly translate your existing configurations. Sometimes, refactoring code or images could be worth it.
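As a rough illustration of what "translating" a configuration involves, here is a minimal Python sketch that maps one Compose-style service onto a Kubernetes Deployment manifest. The function name and the sample service are hypothetical, and it only covers image, ports and environment; volumes, which are the hard part discussed above, are deliberately left out.

```python
import json


def compose_service_to_deployment(name, service, replicas=1):
    """Build a Kubernetes Deployment manifest (as a dict) from one Compose-style service."""
    labels = {"app": name}
    container = {"name": name, "image": service["image"]}

    # Compose ports look like "HOST:CONTAINER" or "HOST:CONTAINER/proto";
    # only the container-side port matters inside the pod spec.
    ports = [
        int(str(p).split("/")[0].split(":")[-1])
        for p in service.get("ports", [])
    ]
    if ports:
        container["ports"] = [{"containerPort": p} for p in ports]

    # Compose environment mapping -> Kubernetes list of name/value pairs.
    env = service.get("environment", {})
    if env:
        container["env"] = [{"name": k, "value": str(v)} for k, v in env.items()]

    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical service, shaped like an entry under "services:" in a compose file.
    web = {"image": "nginx:1.25", "ports": ["8080:80"], "environment": {"TZ": "UTC"}}
    print(json.dumps(compose_service_to_deployment("web", web, replicas=2), indent=2))
```

The host side of a published port disappears in the translation: exposing the pods is a separate concern (Services, ingress controllers), which is one reason a line-by-line conversion is rarely enough.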
When going to prod, consider adding a group of "infra" nodes to host logging or other internal services that are somewhat critical to users and shouldn't suffer outages caused by end-user workloads (e.g. ingress routers, monitoring, logging, integrated registry, ...).
Kubespray: https://github.com/kubernetes-sigs/kubespray/
Kops: https://github.com/kubernetes/kops
Ceph: https://ceph.com/en/discover/
Ceph Ansible: https://github.com/ceph/ceph-ansible
Rook (Ceph CNS): https://github.com/rook/rook

How to route all internet requests through a proxy in docker swarm

tl;dr: does Docker Swarm have a forced, centralized proxy setting that explicitly proxies all internet traffic from all services hosted in the cluster? Or is there any other tip on how to go about using a global proxy solution in a swarm cluster?
Note: this is not a question about a reverse proxy.
I have a Docker Swarm cluster (moving to Kubernetes as a solution is off-topic).
I have 3 managers and 3 workers, and I label the workers according to the containers they are expected to host. The cluster only deploys Docker Swarm services; when I write "container" here, I'm referring to a Docker Swarm service container.
One of the workers has no labels, though it is active, and therefore does not host containers for any service. If I labeled that worker so it could host containers, I would run into problems with various firewalls that I don't always control, because its IP simply is not allowed.
This means I can't scale horizontally: when I add a new worker to the cluster, I also add a new IP that requests can originate from. Updating the many firewalls that would need to change for each horizontal scaling step is a lot of work, and simply not an option.
In my attempt to solve this on my own, I did what every desperate developer does and googled for a solution... and there is a simple and official documentation to be able to achieve this: https://docs.docker.com/network/proxy/
I followed the environment variable examples on that page. That did not really help, however: none of the traffic goes through the proxy I configured. After some digging, I noticed that this is because Node.js (all services are written in Node.js) ignores the proxy settings set in the environment. To make Node.js use these proxy settings, I would have to refactor a lot of components in a lot of services... a workload that is quite tremendous and possibly dangerous to perform, given the different protocols and ports I use to connect to various infrastructure services outside the cluster...
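For context on why the variables alone had no effect: proxy environment variables are only a convention, and each HTTP client decides whether to read them. The sketch below uses Python rather than Node.js, purely as an illustration of a runtime whose standard library does look at them; the proxy address is a hypothetical placeholder.

```python
import os
import urllib.request

# Hypothetical proxy address, set the way a container's environment would set it.
os.environ["https_proxy"] = "http://proxy.internal.example:3128"

# urllib explicitly scans the environment for <scheme>_proxy variables.
# Nothing at the OS or Docker level forces a client to do this lookup.
proxies = urllib.request.getproxies()
print(proxies.get("https"))

# A client library that never performs such a lookup simply opens direct
# connections, no matter what the environment says.
```

This is why setting the variables on a service does not by itself guarantee that its traffic leaves through the proxy.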
I expect there to be a better solution for this, I expect there to be a built in functionality that forces all internet access from the containers to go through this proxy, a setting I don't have to make in the code, in my implementations. I expect there to be a wrapping solution that I can control in a central manner.
Now reading this again, I think maybe I should have tested the docker client configuration on the same page to see if it has the desired effect I'm requiring, but I assume they both would have the same outcome, being described on the same page with no noticeable difference written in the documentation.
My question is, is there a solution, that I just don't seem to be able to find, that wraps the proxy functionality around all the services? or is it a requirement to solve these issues in the implementation itself?
My thought is to maybe depend on an image, which in turn depends on the Node.js image I use today, that is responsible for this wrapping functionality, though still at the implementation level. Doing so would however still mean inheriting a distributed solution of this kind: if I need to change the proxy configuration, I need to change it everywhere and redeploy everything... given a less complex solution without a common data access layer.

Does Docker Swarm keep data synced among nodes?

I've never done anything with Docker Swarm, or Kubernetes so I'm trying to learn what does what, and which is best for my purpose before tackling it.
My scenario:
I have a Desktop PC running Docker Desktop, and ..
I have a Raspberry PI running Docker on Raspbian
This is all on a home LAN, so I don't really want to get crazy with complicated things.
I want to run Pi Hole and DNSCrypt Proxy containers on both 'machines', (as redundancy, mostly because the Docker Desktop seems to crash a lot taking down my entire DNS system with it when I just use that machine for Pi-hole).
My main thing is, I want all the data/configurations, etc. between them to stay in sync (i.e. Pi hole's container data stays in sync on both devices, etc.), and I want the manager to make sure it's always up, in case of crashes, and so on.
My questions:
Being completely new to this area, and just doing a bit of poking around:
it seems that Kubernetes might be a bit much, and more complicated than I need for this?
That's why I was thinking Swarm instead, but I'm also not sure whether either of them will keep data synced?
And, say I create 2 Pi-hole containers on the Manager machine, does it create 1 on the manager machine, and 1 on the worker machine?
Any info is appreciated!
Docker doesn't quite have anything that directly meets your need, but if you've got a reliable file server on your home LAN, you could do it really easily.
Broadly speaking you want to look at Docker Volume Plugins. Most of them ultimately work via an external storage provider and so won't be that helpful for you. There's a couple of more exotic ones like Portworx or StorageOS that can do portable/replicated storage purely in Docker, but I think most of them are a paid license.
But, if you have a fileserver that you trust to stay up and running, you can mount an NFS/CIFS share as a volume as mentioned in the Docker Docs, and Docker can handle re-connecting it when a container moves from one node to another due to a failure.
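One other note: you want one container per service in your swarm. Multiple separate instances would generally only be helpful if the service was designed as a distributed/fault-tolerant application. As for managers, the swarm needs a majority of its manager nodes working in order to keep managing services (this is important if a manager crashes). With only two machines, making both of them managers does not help: a two-manager swarm loses its majority as soon as either one goes down. Containers that are already running keep running, but nothing gets rescheduled until the majority is restored.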
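On the manager count: swarm managers agree on cluster state through Raft consensus, which requires a majority of managers to be reachable. A small Python sketch of that arithmetic follows (the function names are mine, not part of any Docker API):

```python
def manager_quorum(managers):
    """Smallest number of managers that still forms a majority."""
    return managers // 2 + 1


def tolerated_failures(managers):
    """How many managers can be lost while a majority remains."""
    return (managers - 1) // 2


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} manager(s): quorum={manager_quorum(n)}, "
              f"can lose {tolerated_failures(n)}")
```

The point for a two-machine home setup is that going from one manager to two does not buy any tolerance for manager failure; the first count that survives the loss of a manager is three.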

Is this correct way to deploy springboot cloud netflix in production on multi host network?

I am developing a Spring Boot application with the Netflix cloud stack, deploying each module (microservice) in a separate Docker container. The structure is as follows:
Eureka
Zuul
Business logic in Microservices
MySQL
Angular4 UI
Keycloak - User management and Authentication
ELK - for log maintenance
Hystrix
Zipkin
Okay, so after facing a lot of problems and spending a whole lot of network bandwidth googling the matter, I have deployed it in the following way. What I need to know is whether this is the correct way to do it.
The limitation here is that I have been given 2 hosts to test this configuration, and there is no further action plan yet.
So here is what I have done (I have not yet used the full stack I mentioned):
Server 1
Eureka
Zuul
ELK
Server 2
Keycloak
Business Logic microservices
MySQL
Angular4 UI
Haven't configured and used Hystrix and Zipkin yet.
So I have given the IP:PORT of Server 1 in the Eureka configuration of all the microservices which need to register with Eureka. Same goes for Zuul (given the IP:PORT of Eureka).
In the Angular4 UI I have given the URL:PORT of the Zuul deployment, because all the services will be called through Zuul.
This I understand is correct, because services need to know where Eureka is located and the rest can be managed through Eureka.
Now my key question is: because MySQL and ELK can't be registered with Eureka, is it correct to give the IP:PORT of MySQL and ELK wherever required?
Same goes for the configuration of ELK. With ELK my requirement is also that all the logs are located in a common place. For this I have used Docker volume mounting, but I don't know how to accomplish this in a multi-host environment; I can only make the containers output logs to an external volume, which can then probably be accessed by ELK over a URL. I haven't tested this configuration yet.
If so, then isn't this configuration not so independent, if we expect it to be able to manage itself?
I have configured my Docker Compose file to use "network_mode": host so that host-to-host Docker communication can be done.
Again, all I need to know is whether my configuration/architecture is correct for a multi-host environment and, in future, for cloud environments.
If not, then please kindly guide me to the correct path.
Thank you!
p.s. excuse me for my English and Grammar, I have tried best to my knowledge to make it understandable, please point out and ask questions if you need more input from my side.
This kind of question is really beyond the scope of Stackoverflow, but it really sounds like you haven't come to understand the pieces of your infrastructure yet.
The Netflix stack (Eureka, Zuul, etc.) and things like Zipkin, Hystrix and the whole ELK stack only start to make sense when you have really large, multi-site deployments of many services on many hosts, where managing "by hand" becomes a real problem and where the architecture has a lot of moving parts: something can break (a host disconnects, a database node dies) and your system still needs to keep running.
With 2 hosts and a couple of services it doesn't make sense to introduce all this complexity; it will just overwhelm and confuse you (it already has). If one of your 2 hosts dies, Eureka and Zuul will not save you. The whole system will go down.
Throw out all those latest buzzword libraries (you're not Netflix yet) and just think through a simple architecture where you will run your services say on one host and database on another host (no need for Eureka or Zuul). Think of a shared location for logs and organise a nice, easy to use folder structure to store them so they're easy to find and search with simple command line tools that are much better than Kibana (which is TERRIBLE to look at logs).
Stay simple and only introduce new pieces when you feel it is getting difficult to manage.

How to improve Kubernetes security especially inter-Pods?

TL;DR Kubernetes allows all containers to access all other containers on the entire cluster, which seems to greatly increase the security risks. How to mitigate this?
Unlike Docker, where one would usually only allow network connection between containers that need to communicate (via --link), each Pod on Kubernetes can access all other Pods on that cluster.
That means that for a standard Nginx + PHP/Python + MySQL/PostgreSQL, running on Kubernetes, a compromised Nginx would be able to access the database.
People used to run all those on a single machine, but that machine would have serious periodic updates (more than containers), and SELinux/AppArmor for serious people.
One can mitigate the risks a bit by running each project (if you have various independent websites, for example) on its own cluster, but that seems wasteful.
Current Kubernetes security seems very incomplete. Is there already a way to get decent security for production?
In the not-too-distant future we will introduce controls for network policy in Kubernetes. As of today that is not integrated, but several vendors (e.g. Weave, Calico) have policy engines that can work with Kubernetes.
As @tim-hockin says, we do plan to have a way to partition the network.
But, IMO, for systems with more moving parts, (which is where Kubernetes should really shine), I think it will be better to focus on application security.
Taking your three-layer example, the PHP pod should be authorized to talk to the database, but the Nginx pod should not. So, if someone figures out a way to execute an arbitrary command in the Nginx pod, they might be able to send a request to the database Pod, but it should be rejected as not authorized.
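To sketch what "rejected as not authorized" could look like, here is a minimal Python model of a per-service allow-list. The service names and the table are hypothetical, and the sketch assumes the caller's identity has already been authenticated by some other means (credentials, client certificates, etc.), which is the hard part in practice.

```python
# Which caller identities each service accepts; names are hypothetical examples.
ALLOWED_CALLERS = {
    "database": {"php"},
    "php": {"nginx"},
}


def is_authorized(caller, target, allowed=ALLOWED_CALLERS):
    """Return True only if `target` explicitly lists `caller` as allowed."""
    # Unknown targets accept nobody: deny by default.
    return caller in allowed.get(target, set())


if __name__ == "__main__":
    print(is_authorized("php", "database"))    # the PHP pod may query the database
    print(is_authorized("nginx", "database"))  # a compromised Nginx pod is refused
```

The network path from Nginx to the database still exists in this model; what changes is that the database refuses to serve a caller it does not recognize.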
I prefer the application-security approach because:
I don't think the --link approach will scale well to tens of different microservices or more. It will be too hard to manage all the links.
I think as the number of devs in your org grows, you will need fine grained app-level security anyhow.
In terms of being like docker compose, it looks like docker compose currently only works on single machines, according to this page:
https://github.com/docker/compose/blob/master/SWARM.md
