I'm trying to locate a performance issue that might be network related. Therefore I want to inspect all packets entering and leaving a pod.
I'm on kubernetes 1.8.12 on GKE. I ssh to the host and I see the bridge called cbr0 that sees all the traffic. I also see a ton of interfaces named like vethdeadbeef#if3. I assume those are virtual interfaces that are created per container. Where do I look to find out which interface belongs to which container, so I can get all a list of all the interfaces of a pod.
If you have cat available in the container, you can compare the interface index of the containers eth0 with those of the veth* devices on your host. For example:
# grep ^ /sys/class/net/vet*/ifindex | grep ":$(docker exec aea243a766c1 cat /sys/class/net/eth0/iflink)"
/sys/class/net/veth1d431c85/ifindex:92
veth1d431c85 is what your are looking for.
Related
I've been trying to convert my SimpleLogin Docker containers to Kubernetes using Rancher. However one of the steps requires me to create a network.
sudo docker network create -d bridge \
--subnet=240.0.0.0/24 \
--gateway=240.0.0.1 \
sl-network
I couldn't really find a way to do this on Kubernetes/Rancher.
How do I set up an equivalent network like the above command in Kubernetes?
If you want more information about what this network should do you can find it here.
You don't. Kubernetes has its own network ecosystem, which mostly acts as though every Pod and Service is on the same network. You can't create separate subnets within that, there's no way to create a separate network per logical application. You also can't control the IP range of networks in Kubernetes (it shouldn't usually be necessary in Docker either).
Generally you can communicate between Kubernetes Pods by putting a Service in front of each, and then using the Service's DNS name as a host name. If all of the parts were running in the same Namespace, and the Service in front of the database were named sl-db, then the webapp Pod could use sl-db as the host name part of the DB_URI setting.
Reading through the documentation you link to, you will probably need to do some extra work to get the Postfix MTA set up. Note that it looks like it runs outside of Docker in this setup; either you will have to port the setup to run inside Kubernetes or configure its mynetworks settings to include the network that contains the Kubernetes nodes. You will also need to set up Kubernetes ConfigMaps and Secrets to hold the various configuration files and certificates this setup needs.
I have a docker image that needs the container IP address (or hostname) to be passed by the command line.
Is it possible expansion the pod hostname or IP in the container command definition? if not, what is the better way to obtain it in a kuberneted deployed container?
In AWS I usually obtain it by contacting the EC2 meta-data service, I can do somethng similar contacting the kubernetes api, as long as I can obtain the pod name/id?
Thanks.
Depending on your pod setup, you may be able to use hostname -i.
E.g.
$ kubectl exec ${POD_NAME} hostname -i
10.245.2.109
From man hostname
...
-i, --ip-address
Display the network address(es) of the host name. Note that this works only if the host name can be resolved. Avoid using this option; use hostname --all-ip-addresses instead.
-I, --all-ip-addresses
Display all network addresses of the host. This option enumerates all configured addresses on all network interfaces. The loopback interface and IPv6 link-local addresses are omitted. Contrary to option -i, this
option does not depend on name resolution. Do not make any assumptions about the order of the output.
...
In v1.1 (releasing soon) you can expose the pod's IP as an environment variable through the downward api (note that the published documentation is for v1.0 which doesn't include pod IP in the downward API).
Prior to v1.1, the best way to get this is probably by querying the API from the pod. See Accessing the API from a Pod for how to access the API. The pod name is your $HOSTNAME, and you can find the IP with something like:
wget -O - ${KUBERNETES_RO_SERVICE_HOST}:${KUBERNETES_RO_SERVICE_PORT}/api/v1/namespaces/default/pods/${HOSTNAME} | grep podIP
Although I recommend you use a json parser such as jq
EDIT:
Just wanted to add that pod IP is not preserved across restarts, which is why the usual recommendation is to set up a service pointing to your pod. If you do use a service, the services IP will be fixed and act as a proxy to the pod, even across restarts. Service IPs are provided as environment variables, such as FOO_SERVICE_HOST and FOO_SERVICE_PORT.
I'm playing with coreos and digitalocean, and I'd like to start allowing internal communication between my containers.
I've got private networking set up for all the hosts, and now I'd like to ensure that some containers only open ports to localhost and to the internal interface.
I've explored a lot of options for this, but none of them seem satisfactory:
Using the '-p', I can ensure docker binds to the local interface, but this has two downsides:
I can't easily test services by SSHing in, because that traffic originates from localhost
I need to write somewhat hacky shell scripts to start my services, in order to inject the address of the machine that the container is running on
I tried using flannel, but it doesn't make the traffic private (or I didn't set it up right)
I considered using iptables on the containers to prevent external access, but that doesn't seem as secure
I tried using iptables on the coreos hosts, but ... it's tricky, and I couldn't get it working.
When I tried to configured iptables on the host, I used the method here: https://docs.docker.com/articles/networking/#communication-between-containers-and-the-wider-world, by adding a DROP rule to the docker chain, but it didn't work, and packets still got through
So what's the best approach, and I'll invest time in making it work.
Overall, I guess I need to find something that I can:
Roll out to all the hosts reliably
Something that is reasonably flexible going forward
Something that allows for 'edge machines' which are accessible from the wider internet.
Solution
I'll go into how I ended up solving this. Thanks to larsks for their help. In the end, their approach was the correct one. It's tricky on coreos, because there aren't really stable addresses, like larsks assumes. The whole point of coreos it to be able to forget about ip addresses.
I solved this by finding a not-too-bad way to inject the ip address into the command in the service file. The tricky thing about this is that it doesn't really support a lot of the shell features I expected. What I wanted to do was to assign the ip address of the machine to a variable then inject it into the command:
ip=$(ifconfig eth1 | grep -o 'inet [0-9]*\.[0-9]*\.[0-9]*\.[0-9]*' | grep -o '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*');
/usr/bin/docker run -p $ip:7000:7000 ...
But, as mentioned, that doesn't work. So what to do? Get the shell!
ExecStart=/usr/bin/sh -c "\
export ip=$(ifconfig eth1 | grep -o 'inet [0-9]*\.[0-9]*\.[0-9]*\.[0-9]*' | grep -o '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*');\
echo $ip;\
/usr/bin/docker run -p $ip:7000:7000"
I hit a few problems along the way.
I'm pretty sure there aren't newlines in that command, so I had to add the ';' characters
when you test the above bash -c command in a shell, it'll have very different effects to when systemd does it. In the shell you need to escape the '$' characters, while in systemd config files, you don't.
I included the echo so that I could see what the command thought the ip was.
When I was doing all this, I actually inserted a small webserver to the docker image, so that I could just test using curl.
Downsides of this approach is that it's tied to the way ifconfig works, and ipv4. In fact, this approach doesn't work on my linux mint laptop, where ifconfig produces differently formatted output. The important lesson here is to output things in yaml or json, so that shell json tools can access things more easily.
Instead of grep-ping the IP address, you can use the environment files to get the IP address (both public and private) of the host the service gets scheduled on. This allows you to bind your container ports to either public or private ports in a simple way.
Like so:
[Service]
EnvironmentFile=/etc/environment
ExecStart=/usr/bin/docker run --name myservice -p \
${COREOS_PUBLIC_IPV4}:80:80 \
${COREOS_PRIVATE_IPV4}:3306:3306 \
ubuntu /bin/bash
I've got private networking set up for all the hosts, and now I'd like
to ensure that some containers only open ports to localhost and to the
internal interface.
This is exactly the behavior that you get with the -p option when you specify an ip address. Let's say I have a host with two external interfaces, eth0 (with address 10.0.0.10) and eth1 (with address 192.168.0.10), and the docker0 bridge at 172.17.42.1/16.
If I start a container like this:
docker run -p 192.168.0.10:80:80 -d larsks/mini-httpd
This will start a container that is accessible over the eth1 interface at 192.168.0.10, port 80. This service is also accessible -- from the host on which the container is located -- at the address assigned to the container on the docker0 network. This would be something like 172.17.0.39, port 80.
This seems to meet your goals:
The container port is exposed over the "private" eth1 interface.
The container port is accessible from the host.
I can't easily test services by SSHing in, because that traffic originates from localhost.
If you were running ssh inside a container, you would ssh to it at the "internal" address assigned by Docker. But if you are running ssh inside your containers, you may want to consider not doing that and rely on tools like docker exec instead.
I need to write somewhat hacky shell scripts to start my services, in order to inject the address of the machine that the container is running on
With this solution, there is no need to inject the machine ip into the container.
From the docker linking
I can have A container links to B container.
Then I can see the B's ip address and exposed port in A's ENV variables.
However, how can I figure out A's ip address wihtin B container?
To find one container from another, you can use a 'service discovery' mechanism such as SkyDock.
Skydock - Automagic Service Discovery for Docker
Skydock monitors docker events when containers start, stop, die, kill, etc and inserts records into a dynamic DNS server skydns. This allows standard DNS queries for services running inside docker containers.
For the more complex case where your containers are on multiple hosts and you need a way to network them together, see weave-dns (Please note I work on weave and weave-dns).
Docker allows servers from multiple containers to connect to each other via links and service discovery. However, from what I can see this service discovery is host-local. I would like to implement a service that uses other services hosted on a different machine.
There have been several approaches to solving this problem in Docker, such as CoreOS's jumpers, host-local services that essentially proxy to the other machine, and a whole bunch of github projects for managing Docker deployments that appear to have attempted to support this use-case.
Given the pace of development it is hard to follow what current best practices are. Therefore my question is essentially:
What (if any) is the current predominant method for linking across hosts in Docker, and
Are there any plans for supporting this functionality directly in the Docker system?
Update
Docker has recently announced a new tool called Swarm for Docker orchestration.
Swarm allows you do "join" multiple docker daemons: You first create a swarm, start a swarm manager on one machine, and have docker daemons "join" the swarm manager using the swarm's identifier. The docker client connects to the swarm manager as if it were a regular docker server.
When a container started with Swarm, it is automatically assigned to a free node that meets any constraints that have been defined. The following example is taken from the blog post:
$ docker run -d -P -e constraint:storage=ssd mysql
One of the supported constraints is "node" that allows you pin a container to a specific hostname. The swarm also resolves links across nodes.
In my testing I got the impression that Swarm doesn't yet work with volumes at a fixed location very well (or at least the process of linking them is not very intuitive), so this is something to keep in mind.
Swarm is now in beta phase.
Until recently, the Ambassador Pattern was the only Docker-native approach to remote-host service discovery. This pattern can still be used and doesn't require any magic beyond plain Docker in that the pattern consists of one or more additional containers that act as proxies.
Additionally, there are several third-party extensions to make Docker cluster-capable. Third-party solutions include:
Connecting the Docker network bridges on two hosts, lightweight and various solutions exist, but generally with some caveats
DNS-based discovery e.g. with skydock and SkyDNS
Docker management tools such as Shipyard, and Docker orchestration tools. See this question for an extensive list: How to scale Docker containers in production
UPDATE 3
Libswarm has been renamed as swarm and is now a separate application.
Here is the github page demo to use as a starting point:
# create a cluster
$ swarm create
6856663cdefdec325839a4b7e1de38e8
# on each of your nodes, start the swarm agent
# <node_ip> doesn't have to be public (eg. 192.168.0.X),
# as long as the other nodes can reach it, it is fine.
$ swarm join --token=6856663cdefdec325839a4b7e1de38e8 --addr=<node_ip:2375>
# start the manager on any machine or your laptop
$ swarm manage --token=6856663cdefdec325839a4b7e1de38e8 --addr=<swarm_ip:swarm_port>
# use the regular docker cli
$ docker -H <swarm_ip:swarm_port> info
$ docker -H <swarm_ip:swarm_port> run ...
$ docker -H <swarm_ip:swarm_port> ps
$ docker -H <swarm_ip:swarm_port> logs ...
...
# list nodes in your cluster
$ swarm list --token=6856663cdefdec325839a4b7e1de38e8
http://<node_ip:2375>
UPDATE 2
The official approach is now to use libswarm see a demo here
UPDATE
There is a nice gist for openvswitch hosts communication in docker using the same approach.
To allow service discovery there is an interesting approach based on DNS called skydock.
There is also a screencast.
This is also a nice article using the same pieces of the puzzle but adding also vlans on top:
http://fbevmware.blogspot.it/2013/12/coupling-docker-and-open-vswitch.html
The patching has nothing to do with the robustness of the solution. Docker is actually only a sort of DSL upon Linux Containers and both solutions in these articles simply bypass some Docker automatic settings and fall back directly to Linux Containers.
So you can use the solutions safely and wait to be able to do it in a simpler way once Docker will implement it.
Weave is a new Docker virtual network technology that acts as a virtual ethernet switch over TCP/UDP - all you need is a Docker container running Weave on your host.
What's interesting here is
Instead of links, use static IPs/hostnames in your virtual network
Hosts don't need full connectivity, a mesh is formed based on what peers are available, and packets will be routed multi-hop to where they need to go
This leads to interesting scenarios like
Create a virtual network across the WAN, none of the Docker containers will know or care what actual network they sit in
Move your containers to different physical docker hosts, Weave will detect the peer accordingly
For example, there's an example guide on how to create a multi-node Cassandra cluster across your laptop and a few cloud (EC2) hosts with two commands per host. I launched a CoreOS cluster with AWS CloudFormation, installed weave on each in /home/core, plus my laptop vagrant docker VM, and got a cluster up in under an hour. My laptop is firewalled but Weave seemed to be okay with that, it just connects out to its EC2 peers.
Update
Docker 1.12 contains the so called swarm mode and also adds a service abstraction. They probably aren't mature enough for every use case, but I suggest you to keep them under observation. The swarm mode at least helps in a multi-host setup, which doesn't necessarily make linking easier. The Docker-internal DNS server (since 1.11) should help you to access container names, if they are well-known - meaning that the generated names in a Swarm context won't be so easy to address.
With the Docker 1.9 release you'll get built in multi host networking. They also provide an example script to easily provision a working cluster.
You'll need a K/V store (e.g. Consul) which allows to share state across the different Docker engines on every host. Every Docker engine need to be configured with that K/V store and you can then use Swarm to connect your hosts.
Then you create a new overlay network like this:
$ docker network create --driver overlay my-network
Containers can now be run with the network name as run parameter:
$ docker run -itd --net=my-network busybox
They can also be connected to a network when already running:
$ docker network connect my-network my-container
More details are available in the documentation.
The following article describes nicely how to connect docker containers on multiple hosts: http://goldmann.pl/blog/2014/01/21/connecting-docker-containers-on-multiple-hosts/
It is possible to bridge several Docker subnets together using Open vSwitch or Tinc. I have prepared Gists to show how to do it:
Open vSwitch: https://gist.github.com/noteed/8656989
Tinc: https://gist.github.com/noteed/11031504
The advantage I see using this solution instead of the --link option and the ambassador pattern is that I find it more transparent: there is no need to have additional containers and more importantly, no need to expose ports on the host. Actually I think of the --link option to be a temporary hack before Docker get a nicer story about multi-host (or multi-daemon) setups.
Note: I know there is another answer pointing to my first Gist but I don't have enough karma to edit or comment on that answer.
As mentioned above, Weave is definitely a viable solution to link Docker containers across the hosts. Based on my own experience with it, it is fairly straightfoward to set it up. It is now also has DNS service which you can address container's by its DNS names.
On the other hand, there is CoreOS's Flannel and Juniper's Opencontrail for wiring the containers across the hosts.
Seems like docker swarm 1.14 allows you to:
assing hostname to container, using --hostname tag, but i haven't been able to make it work, containers are not able to ping each other by assigned hostnames.
assigning services to machine using --constraint 'node.hostname == <host>'