How to create multiple bridges for LXC?

Right now, after installing LXC, you only have one default bridge, "lxcbr0", which is used to connect your containers to the host machine. This way, we can create multiple containers and connect them all to the bridge "lxcbr0". My question is:
Can I create two bridges, "lxcbr0" and "lxcbr1", so that I can divide the containers into two subnetworks, one connected to "lxcbr0" and the other to "lxcbr1"?
Happy Holidays!
Thanks.
Deryk

Here is a bash script that adds lxcbr1 connected to eth2.
main.sh:
#!/bin/bash
BRCTL_BIN="/sbin/brctl"
IP_BIN="/sbin/ip"
# variable
brName=lxcbr1
brDev=eth2
# function: add bridge
#
function addBr() {
    local brName=$1
    local brDev=${2:-}
    if [ -d /sys/class/net/${brName} ]; then
        # bridge already exists
        return
    fi
    ${BRCTL_BIN} addbr ${brName}
    ${BRCTL_BIN} setfd ${brName} 0
    ${BRCTL_BIN} sethello ${brName} 5
    ${IP_BIN} link set dev ${brName} up
    if [ -n "${brDev}" ]; then
        # attach the physical interface to the new bridge
        ${BRCTL_BIN} addif ${brName} ${brDev}
        ${IP_BIN} link set dev ${brDev} up
    fi
}
# add lxcbr1
addBr ${brName} ${brDev}
# a simpler example without variables:
# add lxcbr1 and lxcbr3
addBr lxcbr1 eth1
addBr lxcbr3 eth3
Now you can connect your LXC container to lxcbr1 as eth11:
lxc.network.type = veth
lxc.network.flags = up
lxc.network.mtu = 1500
lxc.network.link = lxcbr1
lxc.network.ipv4 = 192.168.0.11/24
lxc.network.name = eth11
lxc.network.veth.pair = veth11.1
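To split your containers across two subnetworks, point a second container's config at the other bridge. A minimal sketch, assuming lxcbr3 was created as above and should carry a separate 192.168.1.0/24 range (the address and interface name are just examples):
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = lxcbr3
lxc.network.ipv4 = 192.168.1.11/24
lxc.network.name = eth0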

Related

Expose port using DockerOperator

I am using DockerOperator to run a container, but I do not see any option to publish the required port. I need to publish a webserver port when the task is triggered. Any help or guide will be helpful. Thank you!
First, don't forget that docker_operator is deprecated, replaced (now) by providers.docker.operators.docker.
Second, I don't know of a command to expose a port in a live (running) Docker container.
As described in this article from Sidhartha Mani
Specifically, I needed access to the filled mysql database.
I could think of a few ways to do this:
Stop the container and start a new one with the added port exposure: docker run -p 3306:3306 -p 8080:8080 -d java/server.
Start another container that links to this one and knows how to port forward.
Set up iptables rules to forward a host port into the container.
So:
Following the existing rules, I created my own rule to forward to the container:
iptables -t nat -A DOCKER ! -i docker0 -p tcp --dport 3306 -j DNAT \
--to-destination 172.17.0.2:3306
This just says: whenever a packet is destined for port 3306 on the host, forward it to the container with IP 172.17.0.2, port 3306.
Once I did this, I could connect to the container using host port 3306.
I wanted to make it easier for others to expose ports on live containers.
So, I created a small repository and a corresponding docker image (named wlan0/redirect).
The same effect as forwarding host port 3306 to container 172.17.0.2:3306 can be achieved with the command below, which saves the trouble of learning how to use iptables.
docker run --privileged -v /proc:/host/proc \
-e HOST_PORT=3306 -e DEST_IP=172.17.0.2 -e DEST_PORT=3306 \
wlan0/redirect:latest
In other words, this kind of solution runs on the host; it is not something you would implement from a command run inside the container through an Airflow operator.
As per my understanding, DockerOperator creates a new container, so why is there no way to publish ports while creating it?
First, the EXPOSE part is, as I mentioned here, just metadata added to the image. It is not mandatory.
The runtime (docker run) -p option is about publishing, not exposing: publishing a port and mapping it to a host port (see above) or another container port.
That might not be needed in an Airflow environment, where there is a default network, and even the possibility to set up a custom network or subnetwork.
Which means other (Airflow) containers attached to the same network should be able to access the ports of any container in said network, without needing any -p (publish) or EXPOSE directive.
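As a rough illustration of that point (the network and container names below are made up), two containers on the same user-defined network can reach each other's ports by name, with no -p at all:
docker network create demo-net
docker run -d --name web --network demo-net nginx
docker run --rm --network demo-net busybox wget -qO- http://web
If you do need actual host-port publishing from the operator itself, though, read on.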
In order to accomplish this, you will need to subclass the DockerOperator and override the initializer and _run_image_with_mounts method, which uses the API client to create a container with the specified host configuration.
# Imports assumed for a recent apache-airflow-providers-docker release; adjust to your version.
from typing import List, Optional, Union

from airflow.exceptions import AirflowException
from airflow.providers.docker.operators.docker import DockerOperator, stringify


class DockerOperatorWithExposedPorts(DockerOperator):
    def __init__(self, *args, **kwargs):
        # Accept an extra `port_bindings` kwarg and keep it away from the base initializer.
        self.port_bindings = kwargs.pop("port_bindings", {})
        if self.port_bindings and kwargs.get("network_mode") == "host":
            self.log.warning("`port_bindings` is not supported in `host` network mode.")
            self.port_bindings = {}
        super().__init__(*args, **kwargs)

    def _run_image_with_mounts(
        self, target_mounts, add_tmp_variable: bool
    ) -> Optional[Union[List[str], str]]:
        """
        NOTE: This method was copied entirely from the base class `DockerOperator`, for the capability
        of performing port publishing.
        """
        if add_tmp_variable:
            self.environment['AIRFLOW_TMP_DIR'] = self.tmp_dir
        else:
            self.environment.pop('AIRFLOW_TMP_DIR', None)
        if not self.cli:
            raise Exception("The 'cli' should be initialized before!")
        self.container = self.cli.create_container(
            command=self.format_command(self.command),
            name=self.container_name,
            environment={**self.environment, **self._private_environment},
            # expose the container ports we intend to bind
            ports=list(self.port_bindings.keys()) if self.port_bindings else None,
            host_config=self.cli.create_host_config(
                auto_remove=False,
                mounts=target_mounts,
                network_mode=self.network_mode,
                shm_size=self.shm_size,
                dns=self.dns,
                dns_search=self.dns_search,
                cpu_shares=int(round(self.cpus * 1024)),
                # publish the requested host-port -> container-port mappings
                port_bindings=self.port_bindings if self.port_bindings else None,
                mem_limit=self.mem_limit,
                cap_add=self.cap_add,
                extra_hosts=self.extra_hosts,
                privileged=self.privileged,
                device_requests=self.device_requests,
            ),
            image=self.image,
            user=self.user,
            entrypoint=self.format_command(self.entrypoint),
            working_dir=self.working_dir,
            tty=self.tty,
        )
        logstream = self.cli.attach(container=self.container['Id'], stdout=True, stderr=True, stream=True)
        try:
            self.cli.start(self.container['Id'])
            log_lines = []
            for log_chunk in logstream:
                log_chunk = stringify(log_chunk).strip()
                log_lines.append(log_chunk)
                self.log.info("%s", log_chunk)
            result = self.cli.wait(self.container['Id'])
            if result['StatusCode'] != 0:
                joined_log_lines = "\n".join(log_lines)
                raise AirflowException(f'Docker container failed: {repr(result)} lines {joined_log_lines}')
            if self.retrieve_output:
                return self._attempt_to_retrieve_result()
            elif self.do_xcom_push:
                if len(log_lines) == 0:
                    return None
                try:
                    if self.xcom_all:
                        return log_lines
                    else:
                        return log_lines[-1]
                except StopIteration:
                    # handle the case when there is not a single line to iterate on
                    return None
            return None
        finally:
            if self.auto_remove == "success":
                self.cli.remove_container(self.container['Id'])
            elif self.auto_remove == "force":
                self.cli.remove_container(self.container['Id'], force=True)
Explanation: The create_host_config method of the APIClient has an optional port_bindings keyword argument, and the create_container method has an optional ports argument. Neither is exposed by DockerOperator, so you have to override _run_image_with_mounts with a copy of the base implementation that passes those arguments through, using the port_bindings field set in the initializer. You can then supply the ports to publish as a keyword argument. Note that in this implementation, the expected argument is a dictionary:
t1 = DockerOperatorWithExposedPorts(image=..., task_id=..., port_bindings={5000: 5000, 8080:8080, ...})

Docker Swarm with GlusterFS as the external volume storage and VIP

I was wondering whether Docker Swarm can act as a load balancer with GlusterFS as the shared local filesystem, using Pacemaker to hold the VIP (because, as I understand it, Docker cannot create a VIP).
My idea - which I'm hoping can be verified or improved on :)
System:
2x CentOS 8 servers
- 192.168.0.1
---- /dev/sda (OS)
---- /dev/sdb (data)
- 192.168.0.2
---- /dev/sda (OS)
---- /dev/sdb (data)
Install Pacemaker, Corosync
dnf --enablerepo=HighAvailability -y install pacemaker pcs psmisc policycoreutils-python-utils
systemctl start pcsd
Add a VIP to both servers
pcs resource create vip IPaddr2 ip=192.168.0.100 cidr_netmask=24 op monitor interval=30s
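That resource can only be created once the cluster itself has been authorized and formed; a rough bootstrap sketch for CentOS 8 / pcs 0.10 (the cluster name ha_cluster is a placeholder, and the hacluster user needs the same password on both nodes first):
passwd hacluster
pcs host auth node01 node02
pcs cluster setup ha_cluster node01 node02 --start --enable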
Set up storage on both nodes
mkfs.xfs /dev/sdb
Make the directory and add to startup
mkdir -p /my-data/
echo "/dev/sdb /my-data xfs defaults 0 0" >> /etc/fstab
Install GlusterFS on both nodes
dnf install -y glusterfs-server
Setup Gluster for the volume
gluster volume create gfs replica 2 transport tcp node01:/my-data node02:/my-data force
gluster volume start gfs
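One step the plan skips: gluster volume create only succeeds once glusterd is running on both nodes and the peers have been probed. Roughly (run the probe from node01, assuming node01/node02 resolve via DNS or /etc/hosts):
systemctl enable --now glusterd
gluster peer probe node02
gluster peer status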
Make it accessible for the replication
echo 'node01:/my-data /mnt glusterfs defaults,_netdev 0 0' >> /etc/fstab
echo 'node02:/my-data /mnt glusterfs defaults,_netdev 0 0' >> /etc/fstab
Install Docker and Docker-Compose
Initialise Swarm
- on node01 use IP 192.168.0.1 -> manager
- on node02 use IP 192.168.0.2 -> manager
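As shown below, the manager bootstrap is only a couple of commands (IPs taken from the layout above):
# on node01
docker swarm init --advertise-addr 192.168.0.1
# print the join command that node02 runs to become a second manager
docker swarm join-token manager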
Create the directories
mkdir /mnt/html
mkdir /mnt/mysql
In the docker-compose.yml file:
volumes:
- "/mnt/html:/var/www/html/wp-content"
volumes:
- "/mnt/mysql:/var/lib/mysql"
As part of the docker-compose.yml - apache:
Use IP 192.168.0.100 as the access on 80
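Put together, the compose file would look roughly like this; the service names and images are assumptions to illustrate the layout, and the required environment variables for WordPress/MySQL are omitted:
version: "3.7"
services:
  apache:
    image: wordpress
    ports:
      - "80:80"
    volumes:
      - "/mnt/html:/var/www/html/wp-content"
  db:
    image: mysql:5.7
    volumes:
      - "/mnt/mysql:/var/lib/mysql"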
My thought is that, since 192.168.0.100 is only ever held by one of the Pacemaker nodes, the secondary manager wouldn't be hit on the front end. If the node holding .100 went down, node02 would take over that IP and the Swarm would still be active.
Is this something that would work? I can't find anything about having a VIP on the Swarm - at least no working solutions.
I have them both as managers because I assume that if the manager goes off, it's not going to work? Then if I had a 3rd, 4th, etc., I'd add them as workers.

How to know if load balancing works in Docker Swarm?

I created a service called accountservice and then replicated it 3 times. In my service I get the IP address of the producing service instance and include it in the JSON response. The problem is that every time I run curl $manager-ip:6767/accounts/10000 the returned IP is the same as before (I tried 100 times).
manager-ip environment variable:
set -x manager-ip (docker-machine ip swarm-manager-1)
Here's my Dockerfile:
FROM iron/base
EXPOSE 6767
ADD accountservice-linux-amd64 /
ADD healthchecker-linux-amd64 /
HEALTHCHECK --interval=3s --timeout=3s CMD ["./healthchecker-linux-amd64", "-port=6767"] || exit 1
ENTRYPOINT ["./accountservice-linux-amd64"]
And here's my automation script to build and run service:
#!/usr/bin/env fish
set -x GOOS linux
set -x CGO_ENABLED 0
set -x GOBIN ""
eval (docker-machine env swarm-manager-1)
go get
go build -o accountservice-linux-amd64 .
pushd ./healthchecker
go get
go build -o ../healthchecker-linux-amd64 .
popd
docker build -t azbshiri/accountservice .
docker service rm accountservice
docker service create \
--name accountservice \
--network my_network \
--replicas=1 \
-p 6767:6767 \
-p 6767:6767/udp \
azbshiri/accountservice
And here's the function I call to get the IP:
package common

import "net"

func GetIP() string {
    addrs, err := net.InterfaceAddrs()
    if err != nil {
        return "error"
    }
    for _, addr := range addrs {
        ipnet, ok := addr.(*net.IPNet)
        if ok && !ipnet.IP.IsLoopback() {
            if ipnet.IP.To4() != nil {
                return ipnet.IP.String()
            }
        }
    }
    panic("Unable to determine local IP address (non loopback). Exiting.")
}
And I scale the service using the command below:
docker service scale accountservice=3
A few things:
Your results are normal. By default, a Swarm service has a VIP (virtual IP) in front of the service tasks to act as a load balancer. Trying to reach that service from inside the virtual network will only show that IP.
If you want to use a round-robin approach and skip the VIP, you could create a service with --endpoint-mode=dnsrr that would then return a different service task for each DNS request (but your client might be caching DNS names, causing that to show the same IP, which is why VIP is usually better).
If you want to get a list of IPs for the task replicas, do a dig tasks.<servicename> from inside the service's network (see the sketch after this list).
If you want to test something easy, have your service create a random string or use the hostname on startup and return that, so you can tell the replicas apart when accessing them. An easy example is to run one service using the image elasticsearch:2, which returns JSON on port 9200 with a different random name per container.
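A quick way to run that dig lookup is a throwaway container attached to the same network. This assumes my_network from the question's script was created with --attachable (otherwise standalone containers cannot join an overlay network); any image that ships dig, such as nicolaka/netshoot, will do:
docker run --rm --network my_network nicolaka/netshoot dig +short tasks.accountservice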

How to get host's udev events from a Docker container?

In a Docker container, I am looking for a way to get the udev events on the host.
Running udevadm monitor inside a container only reports the host's kernel events, not its udev events.
The question is whether there is a way to detect the host's udev events, or to forward the host's events into containers.
This is how I made my container receive the host's udev events:
docker run --net=host -v /run/udev/control:/run/udev/control
--net=host puts the container in the host's network namespace, so the PF_NETLINK sockets that udevadm monitor uses to receive kernel events see the host's events (found here).
/run/udev/control is the file udevadm monitor checks to see whether udevd is already running; if it doesn't exist, udev monitoring is disabled.
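For a quick check that host events actually arrive, something like the following should stream them (the ubuntu image is an arbitrary choice; any image where udev can be installed works):
docker run --rm -it --net=host -v /run/udev/control:/run/udev/control ubuntu \
    bash -c 'apt-get update && apt-get install -y udev && udevadm monitor'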
As the answer above points out, we could enable --net=host, but the host network is discouraged for several well-known reasons.
This issue only exists because NETLINK is needed to communicate between kernel and user space, and without the host network the host and the container are in different network namespaces. Running udev inside the container puts the udev daemon and the monitor in the same namespace, so the host network is no longer needed.
When we ran into this issue, we did the following:
# apt-get install udev
# vim /etc/init.d/udev to comment out some checks:
1) Comment out the following:
#if [ ! -e "/run/udev/" ]; then
# warn_if_interactive
#fi
2) Comment out the following:
#if ! ps --no-headers --format args ax | egrep -q '^\['; then
# log_warning_msg "udev does not support containers, not started"
# exit 0
#fi
root@e751e437a8ba:~# service udev start
[ ok ] Starting hotplug events dispatcher: systemd-udevd.
[ ok ] Synthesizing the initial hotplug events (subsystems)...done.
[ ok ] Synthesizing the initial hotplug events (devices)...done.
[ ok ] Waiting for /dev to be fully populated...done.

Changing default subnet for docker custom networks

Our internal network has the range 172.20.0.0/16 reserved for internal purposes and docker uses the 172 range by default for its internal networking. I can reset the bridge to live in 192.168 by providing the bip setting to the daemon:
➜ ~ sudo cat /etc/docker/daemon.json
{
"bip": "192.168.2.1/24"
}
➜ ~ ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 192.168.2.1 netmask 255.255.255.0 broadcast 0.0.0.0
However, when creating new custom networks via docker network create or by defining them in the networks sections of the docker-compose.yaml these are still created in 172, thus eventually clashing with 172.20:
➜ ~ docker network create foo
610fd0b7ccde621f87d40f8bcbed1699b22788b70a75223264bb14f7e63f5a87
➜ ~ docker network inspect foo | grep Subnet
"Subnet": "172.17.0.0/16",
➜ ~ docker network create foo1
d897eab31b2c558517df7fb096fab4af9a4282c286fc9b6bb022be7382d8b4e7
➜ ~ docker network inspect foo1 | grep Subnet
"Subnet": "172.18.0.0/16",
I understand I can provide a subnet value to docker network create, but I would rather have all such subnets created under 192.168.*.
How can one configure dockerd to do this automatically?
For anyone who finds this question: it is now possible.
$ docker -v
Docker version 18.06.0-ce, build 0ffa825
Edit or create config file for docker daemon:
# nano /etc/docker/daemon.json
Add lines:
{
"default-address-pools":
[
{"base":"10.10.0.0/16","size":24}
]
}
Restart dockerd:
# service docker restart
Check the result:
$ docker network create foo
$ docker network inspect foo | grep Subnet
"Subnet": "10.10.1.0/24"
It works for docker-compose too.
Your "bip": "192.168.2.1/24" works for bridge0 only. It means that any container which run without --network will use this default network.
