Connecting VMs Using GRE Tunnels - Openvswitch

Hello everyone, I'm really new to networking, so I'm a little bit lost; I hope someone can help me...
I have two physical nodes with the same configuration in the interface:
# The primary network interface
#auto eth0
#iface eth0 inet dhcp
auto br0
iface br0 inet dhcp
bridge_ports eth0
bridge_fd 9
bridge_hello 2
bridge_maxage 12
bridge_stp off
My nodes have the following public IPs:
ubuntu001: 158.42.104.129
ubuntu002: 158.42.104.139
I run one VM in each node using the default configuration of libvirt:
Vm in ubuntu001: 10.1.1.189
Vm in ubuntu002: 10.1.1.59
I want to ping between the VMs through a GRE tunnel using OVS, so I did the following, but it didn't work:
First, I created an OVS bridge:
# ovs-vsctl add-br ovs-br0
Second, I connected my bridge to its uplink, which in this case is eth0:
# ovs-vsctl add-port ovs-br0 eth0
Third, I ran a VM in each node (ubuntu001: 10.1.1.189 and ubuntu002: 10.1.1.59, respectively).
Fourth, I added a port for the GRE tunnel, pointing each node at the other node's public IP:
On ubuntu001: # ovs-vsctl add-port ovs-br0 gre0 -- set interface gre0 type=gre options:remote_ip=158.42.104.139
On ubuntu002: # ovs-vsctl add-port ovs-br0 gre0 -- set interface gre0 type=gre options:remote_ip=158.42.104.129
This is what ovs-vsctl show reports on each node:
root@ubuntu001:~# ovs-vsctl show
41268e02-3996-4caa-b941-e4fe9c718e35
    Bridge "ovs-br0"
        Port "ovs-br0"
            Interface "ovs-br0"
                type: internal
        Port "gre0"
            Interface "gre0"
                type: gre
                options: {remote_ip="158.42.104.139"}
        Port "eth0"
            Interface "eth0"
    ovs_version: "2.0.2"
root@ubuntu002:~# ovs-vsctl show
f0128df4-1a89-4999-8add-b5076ff055ee
    Bridge "ovs-br0"
        Port "ovs-br0"
            Interface "ovs-br0"
                type: internal
        Port "gre0"
            Interface "gre0"
                type: gre
                options: {remote_ip="158.42.104.129"}
        Port "eth0"
            Interface "eth0"
    ovs_version: "2.0.2"
What am I doing wrong, or am I missing something?

Add this to /etc/network/interfaces:
auto br-ovs=br-ovs
iface br-ovs inet manual
    ovs_type OVSBridge
    ovs_ports gre1 gre2
    ovs_extra set bridge ${IFACE} stp_enable=true
    mtu 1462

allow-br-ovs gre1
iface gre1 inet manual
    ovs_type OVSPort
    ovs_bridge br-ovs
    ovs_extra set interface ${IFACE} type=gre options:remote_ip=158.42.104.139 options:key=1

auto br1
# (or static, or DHCP)
iface br1 inet manual
    mtu 1462
I do not know how to do this with commands.
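That said, a rough command-line equivalent for ubuntu001 might look like this (an untested sketch; the bridge and port names just mirror the stanza above, and on ubuntu002 remote_ip would be 158.42.104.129 instead):
ovs-vsctl add-br br-ovs
ovs-vsctl set bridge br-ovs stp_enable=true
ovs-vsctl add-port br-ovs gre1 -- set interface gre1 type=gre options:remote_ip=158.42.104.139 options:key=1
ip link set br-ovs mtu 1462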
I think eth0 should not be attached to the OVS bridge, i.e. it should not appear in the output of ovs-vsctl show.
stp_enable=true is optional; I don't think it is needed in the case of 2 nodes.
Set the mtu to suit your needs. This example is for when the real NIC's mtu is 1500.
remote_ip=158.42.104.139 should contain the other node's IP, so it is different on the 2 nodes.
options:key=1 is also optional; it can be used to label 2 GRE networks (e.g. the second mesh would have key=2, etc.).
You can add VMs to br1 and they will be able to ping each other.
Don't forget to set the VMs' mtu to 1462.
This tutorial might be useful: https://wiredcraft.com/blog/multi-host-docker-network/

Related

What netlink messages does Docker use to set the container's interface name, and how can it be changed?

I am trying to set the name of the interface inside a container via netlink, i.e. I want eth0 to be renamed to mang0.
Inside the container, the root user gets permission errors when trying to change the interface's properties:
root@d1df4b33fffc:/tmp/contbuild# ip link set eth0 down
RTNETLINK answers: Operation not permitted
root@d1df4b33fffc:/tmp/contbuild# ip link set eth0 name man0
RTNETLINK answers: Operation not permitted
root@d1df4b33fffc:/tmp/contbuild# ip link set eth0 alias man0
RTNETLINK answers: Operation not permitted
Outside the container, I can see the interface rename happening in the kernel messages:
[ +11.115152] docker0: port 1(veth3a3f2f4) entered blocking state
[ +0.000007] docker0: port 1(veth3a3f2f4) entered disabled state
[ +0.000171] device veth3a3f2f4 entered promiscuous mode
[ +0.009358] IPv6: ADDRCONF(NETDEV_UP): veth3a3f2f4: link is not ready
[ +0.386448] eth0: renamed from vetheac9d07
[ +0.000259] IPv6: ADDRCONF(NETDEV_CHANGE): veth3a3f2f4: link becomes ready
[ +0.000031] docker0: port 1(veth3a3f2f4) entered blocking state
[ +0.000002] docker0: port 1(veth3a3f2f4) entered forwarding state
I also see the corresponding veth pair on the host, veth3a3f2f4@if662, but I cannot see the container's veth in any other netns (ip netns show is blank).
So I would like to know:
how is Docker setting the name to eth0, and is there a way to easily change it?
why can I not see the netns for the container and/or the container's interface from the host?
I found a workaround by running the container with --cap-add=NET_ADMIN and doing the following internally:
ip link set dev eth0 down
ip link set dev eth0 name eth1
ip link set dev eth1 up
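For what it's worth, a sketch of reaching the container's network namespace from the host even though ip netns show is empty (Docker keeps its namespaces under /proc/<pid>/ns rather than /var/run/netns; the container name my_container is illustrative):
PID=$(docker inspect -f '{{.State.Pid}}' my_container)   # the container's init PID as seen on the host
nsenter -t "$PID" -n ip link show                         # list the interfaces inside its netns
nsenter -t "$PID" -n ip link set dev eth0 down            # the rename still requires the link to be down
nsenter -t "$PID" -n ip link set dev eth0 name man0
nsenter -t "$PID" -n ip link set dev man0 up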

Reaching between OVS bridges

I want to ping different Open vSwitch bridges via a physical interface. My host has an eth1 and a wlan0 interface. I have created 3 OVS bridges and assigned IP addresses to them. wlan0 is added to br0. Two virtual interfaces, wlan0.1 and wlan0.2, are created from wlan0 and added to br1 and br2. Another bridge, breth, is connected to the eth1 interface, and all the other bridges are connected to breth by patch ports; see the figure below.
Now, the host can ping all 3 bridges. There are similar hosts in the network, connected via the wlan0 interface; they are in a mesh network. Any host can ping any bridge on any node. But a PC connected to the eth1 interface can only ping br0 and the other hosts' br0. That means the bridges assigned to the virtual interfaces are unreachable from the PC. Is there any way to reach the other bridges?
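For context, a patch-port pair like the ones described between breth and the other bridges would typically be created with something like this (a sketch; the port names are made up):
ovs-vsctl add-port breth patch-breth-br1 -- set interface patch-breth-br1 type=patch options:peer=patch-br1-breth
ovs-vsctl add-port br1 patch-br1-breth -- set interface patch-br1-breth type=patch options:peer=patch-breth-br1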

Docker Macvlan network inside container is not reaching to its own host

I have set up a macvlan network between 2 Docker hosts as follows:
Host Setup: HOST_1 ens192: 172.18.0.21
Create macvlan bridge interface
docker network create -d macvlan \
--subnet=172.18.0.0/22 \
--gateway=172.18.0.1 \
--ip-range=172.18.1.0/28 \
-o macvlan_mode=bridge \
-o parent=ens192 macvlan
Create macvlan interface HOST_1
ip link add ens192.br link ens192 type macvlan mode bridge
ip addr add 172.18.1.0/28 dev ens192.br
ip link set dev ens192.br up
Host Setup: HOST_2 ens192: 172.18.0.23
Create macvlan bridge interface
docker network create -d macvlan \
--subnet=172.18.0.0/22 \
--gateway=172.18.0.1 \
--ip-range=172.18.1.16/28 \
-o macvlan_mode=bridge \
-o parent=ens192 macvlan
Create macvlan interface in HOST_2
ip link add ens192.br link ens192 type macvlan mode bridge
ip addr add 172.18.1.16/28 dev ens192.br
ip link set dev ens192.br up
Container Setup
Create containers in both host
HOST_1# docker run --net=macvlan -it --name macvlan_1 --rm alpine /bin/sh
HOST_2# docker run --net=macvlan -it --name macvlan_1 --rm alpine /bin/sh
CONTAINER_1 in HOST_1
24: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UNKNOWN
link/ether 02:42:ac:12:01:00 brd ff:ff:ff:ff:ff:ff
inet 172.18.1.0/22 brd 172.18.3.255 scope global eth0
valid_lft forever preferred_lft forever
CONTAINER_2 in HOST_2
21: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UNKNOWN
link/ether 02:42:ac:12:01:10 brd ff:ff:ff:ff:ff:ff
inet 172.18.1.16/22 brd 172.18.3.255 scope global eth0
valid_lft forever preferred_lft forever
Route table in CONTAINER_1 and CONTAINER_2
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.18.0.1 0.0.0.0 UG 0 0 0 eth0
172.18.0.0 0.0.0.0 255.255.252.0 U 0 0 0 eth0
Scenario
HOST_1 (172.18.0.21) <-> HOST_2 (172.18.0.23) = OK (Vice-versa)
HOST_1 (172.18.0.21) -> CONTAINER_1 (172.18.1.0) and CONTAINER_2 (172.18.1.16) = OK
HOST_2 (172.18.0.23) -> CONTAINER_1 (172.18.1.0) and CONTAINER_2 (172.18.1.16) = OK
CONTAINER_1 (172.18.1.0) -> HOST_2 (172.18.0.23) = OK
CONTAINER_2 (172.18.1.16) -> HOST_1 (172.18.0.21) = OK
CONTAINER_1 (172.18.1.0) <-> CONTAINER_2 (172.18.1.16) = OK (Vice-versa)
CONTAINER_1 (172.18.1.0) -> HOST_1 (172.18.0.21) = FAIL
CONTAINER_2 (172.18.1.16) -> HOST_2 (172.18.0.23) = FAIL
Question
I am very close to the solution I wanted to achieve, except for this one single problem: how can I make a container connect to its own host? If there is a solution to this, I would also like to know how to configure it from an ESXi virtualization perspective, and on bare metal, if there is any difference.
The question is a bit old, however others might find it useful. There is a workaround described in the "Host access" section of "Using Docker macvlan networks" by Lars Kellogg-Stedman. I can confirm that it works.
Host access: With a container attached to a macvlan network, you will
find that while it can contact other systems on your local network
without a problem, the container will not be able to connect to your
host (and your host will not be able to connect to your container).
This is a limitation of macvlan interfaces: without special support
from a network switch, your host is unable to send packets to its own
macvlan interfaces.
Fortunately, there is a workaround for this problem: you can create
another macvlan interface on your host, and use that to communicate
with containers on the macvlan network.
First, I’m going to reserve an address from our network range for use
by the host interface by using the --aux-address option to docker
network create. That makes our final command line look like:
docker network create -d macvlan -o parent=eno1 \
--subnet 192.168.1.0/24 \
--gateway 192.168.1.1 \
--ip-range 192.168.1.192/27 \
--aux-address 'host=192.168.1.223' \
mynet
This will prevent Docker from assigning that address to a container.
Next, we create a new macvlan interface on the host. You can call it
whatever you want, but I’m calling this one mynet-shim:
ip link add mynet-shim link eno1 type macvlan mode bridge
Now we need to configure the interface with the address we reserved
and bring it up:
ip addr add 192.168.1.223/32 dev mynet-shim
ip link set mynet-shim up
The last thing we need to do is to tell our host to use that interface
when communicating with the containers. This is relatively easy
because we have restricted our containers to a particular CIDR subset
of the local network; we just add a route to that range like this:
ip route add 192.168.1.192/27 dev mynet-shim
With that route in place, your host will automatically use the
mynet-shim interface when communicating with containers on the mynet
network.
Note that the interface and routing configuration presented here is
not persistent - you will lose it if you were to reboot your host. How
to make it persistent is distribution dependent.
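As one possibility (my assumption, not part of the quoted post), on a Debian/Ubuntu host using ifupdown the shim could be recreated at boot with post-up hooks, reusing the eno1/mynet-shim names from the quote:
auto eno1
iface eno1 inet dhcp
    post-up ip link add mynet-shim link eno1 type macvlan mode bridge
    post-up ip addr add 192.168.1.223/32 dev mynet-shim
    post-up ip link set mynet-shim up
    post-up ip route add 192.168.1.192/27 dev mynet-shim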
This is defined behavior for macvlan and is by design. See Docker Macvlan Documentation
When using macvlan, you cannot ping or communicate with the default namespace IP address. For example, if you create a container and try to ping the Docker host’s eth0, it will not work. That traffic is explicitly filtered by the kernel modules themselves to offer additional provider isolation and security.
A macvlan subinterface can be added to the Docker host, to allow traffic between the Docker host and containers. The IP address needs to be set on this subinterface and removed from the parent address.
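Concretely, combining that with the shim workaround quoted above and adapting it to HOST_1's addressing might look roughly like this (a sketch; 172.18.1.15 is an arbitrary unused address from HOST_1's --ip-range, reserved here via --aux-address):
docker network create -d macvlan -o parent=ens192 \
  --subnet=172.18.0.0/22 \
  --gateway=172.18.0.1 \
  --ip-range=172.18.1.0/28 \
  --aux-address 'host=172.18.1.15' \
  macvlan
ip link add macvlan-shim link ens192 type macvlan mode bridge
ip addr add 172.18.1.15/32 dev macvlan-shim
ip link set macvlan-shim up
ip route add 172.18.1.0/28 dev macvlan-shim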
In my situation, I added one more network to the container,
so CONTAINER_1 -> HOST_1 can be reached via a different IP (10.123.0.2),
while CONTAINER_2 or HOST_2 can still reach it at 172.18.1.0.
The following is a docker-compose sample; hope this can serve as a workaround.
version: "3"
services:
macvlan_1:
image: alpine
container: macvlan_1
command: ....
restart: always
networks:
macvlan:
ipv4_address: 172.18.1.0
internalbr:
ipv4_address: 10.123.0.2
networks:
macvlan:
driver: macvlan
driver_opts:
parent: ens192
macvlan_mode: bridge
ipam:
driver: default
config:
- subnet: 172.18.0.0/22
gateway: 172.18.0.1
ip_range: 172.18.1.0/28
internalbr:
driver: bridge
ipam:
config:
- subnet: 10.123.0.0/24

route all traffic over gre tunnel

I have an openvswitch sw1 with subnet 10.207.39.0/24 that has LXC containers attached, and I have the same setup on another physical server; I have successfully connected the two sw1 bridges using a GRE tunnel. However, the LXC containers have additional ports on additional openvswitches, e.g. sw4 with subnet 192.220.39.0/24, and I want to push that traffic over the single GRE tunnel on sw1, because there is only one physical interface and it is not possible to have multiple GRE tunnels on each openvswitch with the same physical-interface IP endpoints. Is it possible to push the traffic on the other openvswitches over the GRE tunnel on sw1? Or is there a better way to connect multiple subnets in LXC containers on two physical hosts? Thanks.
I solved this "myself" - with help from the two links provided below - after sleeping on it and relentless Google searches over several frustrating days.
I realize the solution is pretty simple and would be clear to a networking professional. I am an Oracle DBA and only know as much networking as I need to work with orabuntu-lxc software, LXC containers, and Oracle software, so please keep that in mind if the below is "obvious" - it wasn't obvious to me in my network ignorance.
I got the clue on how to solve the actual steps from this blog post:
http://www.cnblogs.com/popsuper1982/p/3800548.html
I confirmed that any subnet should be routable over a GRE tunnel from this blog post (which gave me hope to keep working towards a solution):
https://supportforums.adtran.com/thread/1408
In particular the author stated in the adtran comment that "GRE tunnels have no limitation on the types of traffic which can traverse it. It can route multiple subnets without multiple tunnels."
That post told me that the solution was likely a routing solution and that only one GRE tunnel would be needed for this use case.
Note that this feature of "no limitation" on the types of traffic is great for Oracle RAC because we need to be able to send multicast over the GRE tunnel for RAC.
This use case:
I am building an Oracle RAC infrastructure to run in LXC Linux containers. I have a public network 10.207.39.0/24 on openvswitch sw1 and a private RAC interconnect network 192.220.39.0/24 on openvswitch sw4. I want to be able to build the RAC in LXC linux containers that span multiple physical hosts and so I created a GRE tunnel to connect the 10.207.39.1 tunnel endpoint on colossus to 10.207.39.5 tunnel endpoint on guardian.
Here are the setup details:
Host "guardian":
LAN wireless physical network interface: wlp4s0 (IP 192.168.1.11)
sw1 10.207.39.5
sw4 192.220.39.5
Host "colossus":
LAN wireless physical network interface: wlp4s0 (IP 192.168.1.15)
sw1 10.207.39.1
sw4 192.220.39.1
Step 1:
Create GRE tunnel between sw1 openvswitches on both physical hosts with physical wireless LAN network interface end points:
Host "guardian": Create gre tunnel phys hosts (guardian --> colossus).
sudo ovs-vsctl add-port sw1 gre0 -- set interface gre0 type=gre options:remote_ip=192.168.1.15
Host "colossus": Create gre tunnel phys hosts (colossus --> guardian).
sudo ovs-vsctl add-port sw1 gre0 -- set interface gre0 type=gre options:remote_ip=192.168.1.11
Step 2:
Route the 192.220.39.0/24 network over the established GRE tunnel as shown below:
Host "guardian": route 192.220.39.0/24 openvswitch sw4 over GRE tunnel:
sudo route add -net 192.220.39.0/24 gw 10.207.39.5 dev sw1
Host "colossus": route 192.220.39.0/24 openvswitch sw4 over GRE tunnel:
sudo route add -net 192.220.39.0/24 gw 10.207.39.1 dev sw1
Note: To add additional subnets repeat step 2 for each subnet.
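For example, a hypothetical additional subnet 192.221.39.0/24 on yet another openvswitch would follow the same pattern:
Host "guardian":
sudo route add -net 192.221.39.0/24 gw 10.207.39.5 dev sw1
Host "colossus":
sudo route add -net 192.221.39.0/24 gw 10.207.39.1 dev sw1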
Note on MTU:
Also, you have to allow for the GRE encapsulation overhead in the MTU if you want to SSH over these tunnels.
Therefore, in the above example, for the main GRE tunnel connecting the hosts we set the MTU to 1420, which leaves 80 bytes of headroom for the GRE encapsulation.
The MTU on the LXC container virtual interfaces on the sw1 switches needs to be set to 1420 in the LXC container config files.
The MTU on the LXC container virtual interfaces on the sw4 switches needs to be set to 1420 in the LXC container config files.
Note that the MTU on the openvswitches sw1 and sw4 should automatically adjust to the MTU of the LXC interfaces as long as ALL LXC virtual interfaces are set to the new lower MTU value, so explicitly setting the MTU on sw1 and sw4 themselves should not be necessary.
If you still run into issues with SSH over the tunnels while ping works across hosts and containers, re-check all the MTU settings on the virtual interfaces and openvswitches.
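For reference, the per-container MTU setting might look like this in each container's config file (a sketch; the key is lxc.net.0.mtu on LXC 2.1 and later, lxc.network.mtu on older releases, and the path is illustrative):
# /var/lib/lxc/<container>/config
lxc.net.0.mtu = 1420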

How can I shut down an ethernet interface but not the attached virtual interface?

I have an embedded Linux machine that has an Ethernet interface with a working network configuration. A second virtual network also runs on this interface.
The config file reads as follows:
auto lo eth0 eth0:1
# loopback interface
iface lo inet loopback
# ethernet
iface eth0 inet static
address 192.168.1.1
netmask 255.255.255.0
network 192.168.1.0
broadcast 192.168.1.255
gateway 192.168.1.2
# ethernet
iface eth0:1 inet static
address 123.123.123.1
netmask 255.255.255.0
network 123.123.123.0
broadcast 123.123.123.255
gateway 123.123.123.2
Now I need to bring down the eth0 device but still be able to reach the eth0:1 device.
How can this be done?
I tried to simply flush the IP address of the eth0 device, which works with the command ip addr flush eth0. This works, but it seems the services (webserver etc.) are still listening on this interface...
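A minimal sketch of one alternative, assuming the goal is to remove only eth0's primary address while leaving the 123.123.123.1 alias configured (an address in a different subnet is not removed along with it):
ip addr del 192.168.1.1/24 dev eth0      # drop only the primary address
# or, possibly, flush only the addresses labelled "eth0", leaving eth0:1 alone:
# ip -4 addr flush dev eth0 label "eth0"
ip addr show dev eth0                    # 123.123.123.1 (label eth0:1) should still be present
Note that services bound to 0.0.0.0 listen on every local address rather than on a specific interface, which is likely why they still appear to be listening after the flush.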
