GNS3 FRR Appliance

In my spare time, what little I have, I’ve been wanting to play with some OSS networking projects. For those playing along at home, during the last SUSE hackweek I played with WireGuard, and to test the environment I wanted to set up some routing, for which I used FRR.

FRR is a pretty cool project, it brings a network routing stack to Linux, or rather gives us a full open source routing stack. Which makes sense, as most routers are actually running Linux anyway.

Many years ago I happened to work at Fujitsu in a gateway environment, and started playing around with networking. That was my first experience with GNS3, an open source network simulator. Back then I needed a copy of Cisco IOS images to really play with routing protocols, which made things harder: a great open source product, but it needed access to proprietary router OSes.

FRR provides a CLI _very_ similar to Cisco’s, which made me think: hey, I wonder if there is an FRR appliance we can use in GNS3?
And there was!!!

When I downloaded it and decompressed the qcow2 image it was 1.5GB!!! For a single router image. It works great, but what if I wanted a bunch of routers to play with things like OSPF or BGP etc.? Surely we can make a smaller one.

Kiwi

At SUSE we use kiwi-ng to build machine images and release media. And to make things even easier for me, we already have a kiwi config for small openSUSE Leap JeOS images (JeOS is “just enough OS”). So I hacked one to include FRR. Any extra tweaks needed to the image are also easily done via bash hook scripts.

I won’t go into too much detail on how, because I created a git repo with it all, including a detailed README: https://github.com/matthewoliver/frr_gns3

So feel free to check that out, then build and use the image.

But today, I went one step further. openSUSE’s Open Build Service (OBS), which is used to build all the RPMs for openSUSE but can also build debs and pretty much whatever you need, also supports building docker containers and system images using kiwi!

So I have now got OBS building the image for me. The image can be downloaded from: https://download.opensuse.org/repositories/home:/mattoliverau/images/

And if you want to send any OBS requests to change it the project/package is: https://build.opensuse.org/package/show/home:mattoliverau/FRR-OpenSuse-Appliance

To import it into GNS3 you need the gns3a file, which you can find in my git repo or in the OBS project page.

The best part is this image is only 300MB, which is much better than 1.5GB!
I did have it a little smaller, 200-250MB, but unfortunately the JeOS cut-down kernel doesn’t contain the MPLS modules, so I had to pull in the full default SUSE kernel. If this became a real thing and not a pet project, I could go and build an FRR cut-down kernel to get the size down, but 300MB is already a lot better than where it was at.

Hostname Hack

When using GNS3 and you place a router, you want to be able to name the router, and when you access the console it’s _really_ nice to see the router name you specified in GNS3 as the hostname. Why? Because if you have a bunch of routers, you don’t want a bunch of tabs all showing the localhost hostname on the command line… that doesn’t really help.

The FRR image runs under qemu, and there wasn’t a nice way to access the name of the VM from inside the guest, nor an easy way to insert the name from the outside. But I found one approach that seems to be working: enter my dodgy hostname hack!

I also wanted to do it without hacking the gns3server code. I couldn’t easily pass the hostname in directly, but I could pass it in via a null device with the router name as its id:

/dev/virtio-ports/frr.router.hostname.%vm-name%

So I simply wrote a script that sets the hostname based on the existence of this device, made the script a systemd oneshot service that runs at boot, and it worked!
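For reference, here’s a minimal sketch of what that script and oneshot unit could look like (file names and exact contents here are illustrative; the real versions live in the git repo):

#!/bin/bash
# /usr/local/bin/gns3-hostname.sh (illustrative path)
# GNS3/qemu exposes a virtio port named frr.router.hostname.<router-name>,
# so if one exists, use the suffix as this router's hostname.
prefix=/dev/virtio-ports/frr.router.hostname.
for dev in "${prefix}"*; do
    [ -e "$dev" ] || exit 0
    hostnamectl set-hostname "${dev#"$prefix"}"
done

# /etc/systemd/system/gns3-hostname.service (illustrative)
[Unit]
Description=Set the hostname from the GNS3 router name
Before=frr.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/gns3-hostname.sh

[Install]
WantedBy=multi-user.target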

This means when you change the name of an FRR router in the GNS3 interface, all you need to do is restart the router (stop and start the device) and it’ll apply the new name as the hostname. This saves you having to log in as root and run hostname yourself.

Or better, if you name all your FRR routers before turning them on, then it’ll just work.

In conclusion…

Hopefully now we can have a fully opensource, GNS3 + FRR appliance solution for network training, testing, and inspiring network engineers.

POC Wireguard + FRR: Now with OSPFv2!

If you read my last post, I set up a POC with WireGuard and FRR to get the power of WireGuard (WG) but have all the routing worked out by FRR. But I had a problem. When using RIPv2, the broadcast messages seemed to get stuck in the WG interfaces until I tcpdumped them. This meant that once I tcpdumped, the routes would get through, only to eventually go stale and disappear.

I talked with the awesome people in the #wireguard IRC channel on freenode and was told to simply stay clear of RIP.

So I revisited my POC env and swapped out RIP for OSPF.. and guess what.. it worked! Now all the routes get propagated and they stay there. Which means if I decide to add new WG links and make it grow, so should all the routing:

suse@wireguard-5:~> ip r
default via 172.16.0.1 dev eth0 proto dhcp
10.0.2.0/24 via 10.0.4.104 dev wg0 proto 188 metric 20
10.0.3.0/24 via 10.0.4.104 dev wg0 proto 188 metric 20
10.0.4.0/24 dev wg0 proto kernel scope link src 10.0.4.105
172.16.0.0/24 dev eth0 proto kernel scope link src 172.16.0.36
172.16.1.0/24 via 10.0.4.104 dev wg0 proto 188 metric 20
172.16.2.0/24 via 10.0.4.104 dev wg0 proto 188 metric 20
172.16.3.0/24 via 10.0.4.104 dev wg0 proto 188 metric 20
172.16.4.0/24 dev eth1 proto kernel scope link src 172.16.4.105
172.16.5.0/24 dev eth2 proto kernel scope link src 172.16.5.105

Isn’t that beautiful, all networks on one of the more distant nodes, including network 1 (172.16.1.0/24).

I realise this doesn’t make much sense unless you read the last post, but never fear, I thought I’d rework and append the build notes here, in case you’re interested again.

Build notes – This time with OSPFv2

The topology we’ll be building

Seeing that this is my SUSE hackweek project and I now use openSUSE, I’ll be using openSUSE Leap 15.1 for all the nodes (and the KVM host too).

Build the env

I used ansible-virt-infra, created by csmart, to build the env. I created my own inventory file, which you can dump in the inventory/ folder; I called it wireguard.yml:

---
wireguard:
  hosts:
    wireguard-1:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-blue"
        - name: "net-green"
    wireguard-2:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-blue"
        - name: "net-white"
    wireguard-3:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-white"
    wireguard-4:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-orange"
        - name: "net-green"
    wireguard-5:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-orange"
        - name: "net-yellow"
    wireguard-6:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-yellow"
  vars:
    virt_infra_distro: opensuse
    virt_infra_distro_image: openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-OpenStack-Cloud-Current.qcow2
    virt_infra_distro_image_url: https://download.opensuse.org/distribution/leap/15.1/jeos/openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-OpenStack-Cloud-Current.qcow2
    virt_infra_variant: opensuse15.1

Next we need to make sure the networks have been defined, we do this in the kvmhost inventory file, here’s a diff:

diff --git a/inventory/kvmhost.yml b/inventory/kvmhost.yml
index b1f029e..6d2485b 100644
--- a/inventory/kvmhost.yml
+++ b/inventory/kvmhost.yml
@@ -40,6 +40,36 @@ kvmhost:
           subnet: "255.255.255.0"
           dhcp_start: "10.255.255.2"
           dhcp_end: "10.255.255.254"
+        - name: "net-mgmt"
+          ip_address: "172.16.0.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.0.2"
+          dhcp_end: "172.16.0.99"
+        - name: "net-white"
+          ip_address: "172.16.1.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.1.2"
+          dhcp_end: "172.16.1.99"
+        - name: "net-blue"
+          ip_address: "172.16.2.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.2.2"
+          dhcp_end: "172.16.2.99"
+        - name: "net-green"
+          ip_address: "172.16.3.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.3.2"
+          dhcp_end: "172.16.3.99"
+        - name: "net-orange"
+          ip_address: "172.16.4.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.4.2"
+          dhcp_end: "172.16.4.99"
+        - name: "net-yellow"
+          ip_address: "172.16.5.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.5.2"
+          dhcp_end: "172.16.5.99"
     virt_infra_host_deps:
         - qemu-img
         - osinfo-query

Now all we need to do is run the playbook:

ansible-playbook --limit kvmhost,wireguard ./virt-infra.yml

Setting up the IPs and tunnels

The above infrastructure tool uses cloud-init to set up the network, so only the first NIC is up. You can confirm this with:

ansible wireguard -m shell -a "sudo ip a"

That’s ok because we want to use the numbers on our diagram anyway 🙂
Before we get to that, let’s make sure wireguard is set up, and update all the nodes.

ansible wireguard -m shell -a "sudo zypper update -y"

If a reboot is required, reboot the nodes:

ansible wireguard -m shell -a "sudo reboot"

Add the wireguard repo to the nodes and install it. I look forward to kernel 5.6, where wireguard will be included:

ansible wireguard -m shell -a "sudo zypper addrepo -f obs://network:vpn:wireguard wireguard"

ansible wireguard -m shell -a "sudo zypper --gpg-auto-import-keys install -y wireguard-kmp-default wireguard-tools"

Load the kernel module:

ansible wireguard -m shell -a "sudo modprobe wireguard"

Let’s create wg0 on all wireguard nodes:

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo ip link add dev wg0 type wireguard"

And add wg1 to those nodes that have 2:

ansible wireguard-1,wireguard-4 -m shell -a "sudo ip link add dev wg1 type wireguard"

Now while we’re at it, lets create all the wireguard keys (because we can use ansible):

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo mkdir -p /etc/wireguard"

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "wg genkey | sudo tee /etc/wireguard/wg0-privatekey | wg pubkey | sudo tee /etc/wireguard/wg0-publickey"

ansible wireguard-1,wireguard-4 -m shell -a "wg genkey | sudo tee /etc/wireguard/wg1-privatekey | wg pubkey | sudo tee /etc/wireguard/wg1-publickey"

Let’s make sure we enable forwarding on the nodes that will pass traffic, and install the routing software (1, 2, 4 and 5):

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo sysctl net.ipv4.conf.all.forwarding=1"

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo sysctl net.ipv6.conf.all.forwarding=1"

While we’re at it, we might as well add the network repo so we can install FRR and then install it on the nodes:

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo zypper ar https://download.opensuse.org/repositories/network/openSUSE_Leap_15.1/ network"

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo zypper --gpg-auto-import-keys install -y frr libyang-extentions"

This time we’ll be using OSPFv2, as we’re just using IPv4:

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo sed -i 's/^ospfd=no/ospfd=yes/' /etc/frr/daemons"

And with that, now we just need to do all the per-server things like adding IPs and configuring all the keys, peers, etc. We’ll do this a host at a time.
NOTE: As this is a POC we’re just using ip commands; obviously in a real env you’d want to use systemd-networkd or something to make these stick. A sketch of one such unit follows.
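For example, one way to make an address stick would be a tiny systemd-networkd unit per interface, something like this for eth1 on wireguard-1 (file name and placement purely illustrative):

# /etc/systemd/network/20-eth1.network (illustrative)
[Match]
Name=eth1

[Network]
Address=172.16.2.101/24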

wireguard-1

Firstly using:
sudo virsh dumpxml wireguard-1 |less

We can see that eth1 is net-blue and eth2 is net-green so:
ssh wireguard-1

First IPs:
sudo ip address add dev eth1 172.16.2.101/24
sudo ip address add dev eth2 172.16.3.101/24
sudo ip address add dev wg0 10.0.2.101/24
sudo ip address add dev wg1 10.0.3.101/24

Load up the tunnels:
sudo wg set wg0 listen-port 51821 private-key /etc/wireguard/wg0-privatekey

# Node2 (2.102) public key is: P1tHKnaw7d2GJUSwXZfcayrrLMaCBHqcHsaM3eITm0s= (cat /etc/wireguard/wg0-publickey)

sudo wg set wg0 peer P1tHKnaw7d2GJUSwXZfcayrrLMaCBHqcHsaM3eITm0s= allowed-ips 10.0.2.0/24,224.0.0.0/8,172.16.0.0/16 endpoint 172.16.2.102:51822

sudo ip link set wg0 up

NOTE: We add 224.0.0.0/8 and 172.16.0.0/16 to allowed-ips. The first allows the OSPF multicast packets through. The latter will allow us to route to our private network that has been joined by WG tunnels.

sudo wg set wg1 listen-port 51831 private-key /etc/wireguard/wg1-privatekey

# Node4 (3.104) public key is: GzY59HlXkCkfXl9uSkEFTHzOtBsxQFKu3KWGFH5P9Qc= (cat /etc/wireguard/wg1-publickey)

sudo wg set wg1 peer GzY59HlXkCkfXl9uSkEFTHzOtBsxQFKu3KWGFH5P9Qc= allowed-ips 10.0.3.0/24,224.0.0.0/8,172.16.0.0/16 endpoint 172.16.3.104:51834

sudo ip link set wg1 up
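At this point it’s worth sanity-checking the tunnels. wg show lists each interface’s peers, endpoints and allowed-ips, and once traffic flows you should see the latest-handshake timestamps ticking over:

sudo wg show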

Setup FRR:
sudo tee /etc/frr/frr.conf <<EOF
hostname $(hostname)

password frr
enable password frr

log file /var/log/frr/frr.log

router ospf
network 10.0.2.0/24 area 0.0.0.0
network 10.0.3.0/24 area 0.0.0.0
redistribute connected
EOF

sudo systemctl restart frr
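Once FRR is restarted on both ends of a tunnel, you can check the OSPF adjacencies and learned routes from FRR’s vtysh shell (standard FRR show commands, nothing specific to this setup):

sudo vtysh -c "show ip ospf neighbor"
sudo vtysh -c "show ip route ospf"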

wireguard-2

Firstly using:
sudo virsh dumpxml wireguard-2 |less

We can see that eth1 is net-blue and eth2 is net-white so:

ssh wireguard-2

First IPs:
sudo ip address add dev eth1 172.16.2.102/24
sudo ip address add dev eth2 172.16.1.102/24
sudo ip address add dev wg0 10.0.2.102/24


Load up the tunnels:
sudo wg set wg0 listen-port 51822 private-key /etc/wireguard/wg0-privatekey

# Node1 (2.101) public key is: ZsHAeRbNsK66MBOwDJhdDgJRl0bPFB4WVRX67vAV7zs= (cat /etc/wireguard/wg0-publickey)

sudo wg set wg0 peer ZsHAeRbNsK66MBOwDJhdDgJRl0bPFB4WVRX67vAV7zs= allowed-ips 10.0.2.0/24,224.0.0.0/8,172.16.0.0/16 endpoint 172.16.2.101:51821

sudo ip link set wg0 up

NOTE: We add 224.0.0.0/8 and 172.16.0.0/16 to allowed-ips. The first allows the OSPF multicast packets through. The latter will allow us to route to our private network that has been joined by WG tunnels.

Setup FRR:
sudo tee /etc/frr/frr.conf <<EOF
hostname $(hostname)


password frr
enable password frr

log file /var/log/frr/frr.log

router ospf
network 10.0.2.0/24 area 0.0.0.0
redistribute connected
EOF

sudo systemctl restart frr

wireguard-3

Only has a net-white, so it must be eth1 so:

ssh wireguard-3

First IPs:
sudo ip address add dev eth1 172.16.1.103/24

Has no WG tunnels or FRR so we’re done here.

wireguard-4

Firstly using:
sudo virsh dumpxml wireguard-4 |less

We can see that eth1 is net-orange and eth2 is net-green so:

ssh wireguard-4

First IPs:
sudo ip address add dev eth1 172.16.4.104/24
sudo ip address add dev eth2 172.16.3.104/24
sudo ip address add dev wg0 10.0.4.104/24
sudo ip address add dev wg1 10.0.3.104/24

Load up the tunnels:
sudo wg set wg0 listen-port 51844 private-key /etc/wireguard/wg0-privatekey

# Node5 (4.105) public key is: Af/sIEnklG6nnDb0wzUSq1D/Ujh6TH+5R9TblLyS3h8= (cat /etc/wireguard/wg0-publickey)

sudo wg set wg0 peer Af/sIEnklG6nnDb0wzUSq1D/Ujh6TH+5R9TblLyS3h8= allowed-ips 10.0.4.0/24,224.0.0.0/8,172.16.0.0/16 endpoint 172.16.4.105:51845

sudo ip link set wg0 up

NOTE: We add 224.0.0.0/8 and 172.16.0.0/16 to allowed-ips. The first allows the OSPF multicast packets through. The latter will allow us to route to our private network that has been joined by WG tunnels.

sudo wg set wg1 listen-port 51834 private-key /etc/wireguard/wg1-privatekey

# Node1 (3.101) public key is: Yh0kKjoqnJsxbCsTkQ/3uncEhdqa+EtJXCYcVzMdugs= (cat /etc/wireguard/wg1-publickey)

sudo wg set wg1 peer Yh0kKjoqnJsxbCsTkQ/3uncEhdqa+EtJXCYcVzMdugs= allowed-ips 10.0.3.0/24,224.0.0.0/8,172.16.0.0/16 endpoint 172.16.3.101:51831

sudo ip link set wg1 up

Setup FRR:
sudo tee /etc/frr/frr.conf <<EOF
hostname $(hostname)

password frr
enable password frr

log file /var/log/frr/frr.log

router ospf

network 10.0.3.0/24 area 0.0.0.0
network 10.0.4.0/24 area 0.0.0.0
redistribute connected
EOF


sudo systemctl restart frr

wireguard-5

Firstly using:
sudo virsh dumpxml wireguard-5 |less

We can see that eth1 is net-orange and eth2 is net-yellow so:

ssh wireguard-5

First IPs:
sudo ip address add dev eth1 172.16.4.105/24
sudo ip address add dev eth2 172.16.5.105/24
sudo ip address add dev wg0 10.0.4.105/24

Load up the tunnels:
sudo wg set wg0 listen-port 51845 private-key /etc/wireguard/wg0-privatekey

# Node4 (4.104) public key is: aPA197sLN3F05bgePpeS2uZFlhRRLY8yVWnzBAUcD3A= (cat /etc/wireguard/wg0-publickey)

sudo wg set wg0 peer aPA197sLN3F05bgePpeS2uZFlhRRLY8yVWnzBAUcD3A= allowed-ips 10.0.4.0/24,224.0.0.0/8,172.16.0.0/16 endpoint 172.16.4.104:51844

sudo ip link set wg0 up

NOTE: We add 224.0.0.0/8 and 172.16.0.0/16 to allowed-ips. The first allows the OSPF multicast packets through. The latter will allow us to route to our private network that has been joined by WG tunnels.

Setup FRR:
sudo tee /etc/frr/frr.conf <<EOF
hostname $(hostname)

password frr
enable password frr

log file /var/log/frr/frr.log

router ospf

network 10.0.4.0/24 area 0.0.0.0
redistribute connected
EOF


sudo systemctl restart frr

wireguard-6

Only has a net-yellow, so it must be eth1 so:

ssh wireguard-6

First IPs:
sudo ip address add dev eth1 172.16.5.106/24

Final comments

After all this, you should now be where I’m up to: an environment that is sharing routes through the WG interfaces.

The current issue I have is that if I go and ping from wireguard-1 to wireguard-5, the ICMP packet happily routes through into the 10.0.3.0/24 tunnel. But when it pops out of wg1 on wireguard-4, the kernel isn’t routing it on to wireguard-5 through wg0, or WG isn’t putting the packet into the IP stack or forwarding queue to continue its journey.

Well that is my current assumption. Hopefully I’ll get to the bottom of it soon, and in which case I’ll post it here 🙂

POC WireGuard + FRR Setup a.k.a dodgy meshy test network

It’s hackweek at Suse! Probably one of my favourite times of year, though I think they come up every 9 months or so.

Anyway, this hackweek I’ve been on a WireGuard journey. I started reading the paper and all the docs. Briefly looking into the code, sitting in the IRC channel and joining the mailing list to get a feel for the community.

There is still 1 day left of hackweek, so I hope to spend more time in the code, and maybe, just maybe, see if I can fix a bug.. although they don’t seem to have a tracker like most projects, so let’s see how that goes.

The community seems pretty cool. The tech, frankly, is pretty amazing; even I, from a cloud storage background, understood most of the paper.

I had set up a tunnel, tcpdumped traffic, and used wireshark to look closely at the packets as I read the paper; it was very informative. But I really wanted to get a feel for how this tech could work. They do have a wg-dynamic project which is planning on using wg as a building block to do cooler things, like mesh networking. This sounds cool, so I wanted to sink my teeth in and see if I could build something similar, not with wg-dynamic but out of existing OSS tech, and see where the gotchas are, outside of it obviously being less secure. It seemed like a good way to better understand the technology.

So on Wednesday, I decided to do just that. Today is Thursday and I’ve gotten to a point where I can say I partially succeeded. And before I delve in deeper and try and figure out my current stumbling block, I thought I’d write down where I am.. and how I got here.. to:

  1. Point the wireguard community at, in case they’re interested.
  2. So you all can follow along at home, because it’s pretty interesting, I think.

As the title suggests, the plan is/was to set up a bunch of tunnels and use FRR to set up some routing protocols to talk via these tunnels, auto-magically 🙂

UPDATE: The problem I describe in this post, routes becoming stale, only seems to happen when using RIPv2. When I change it to OSPFv2 all the routes work as expected!! Will write a follow up post to explain the differences.. in fact may rework the notes for it too 🙂

The problem at hand

Test network VM topology

A picture is worth 1000 words. The basic idea is to simulate a bunch of machines and networks connected over wireguard (WG) tunnels. So I created 6 vms, connected as you can see above.

I used Chris Smart’s ansible-virt-infra project, which is pretty awesome, to build up the VMs and networks as you see above. I’ll leave my build notes as an appendix to this post.

Once I had the infrastructure set up, I built all the tunnels as they are in the image. Then I went ahead and installed FRR on all the nodes with tunnels (nodes 1, 2, 4, and 5). To keep things simple, I started with the easiest routing protocol to configure, RIPv2.

Believe it or not, everything seemed to work.. well mostly. I can jump on say node 5 (wireguard-5 if you’re playing along at home) and:

suse@wireguard-5:~> ip r
default via 172.16.0.1 dev eth0 proto dhcp
10.0.2.0/24 via 10.0.4.104 dev wg0 proto 189 metric 20
10.0.3.0/24 via 10.0.4.104 dev wg0 proto 189 metric 20
10.0.4.0/24 dev wg0 proto kernel scope link src 10.0.4.105
172.16.0.0/24 dev eth0 proto kernel scope link src 172.16.0.36
172.16.2.0/24 via 10.0.4.104 dev wg0 proto 189 metric 20
172.16.3.0/24 via 10.0.4.104 dev wg0 proto 189 metric 20
172.16.4.0/24 dev eth1 proto kernel scope link src 172.16.4.105
172.16.5.0/24 dev eth2 proto kernel scope link src 172.16.5.105

Looks good, right? We see routes for networks 172.16.{0,2,3,4,5}.0/24. Network 1 isn’t there, but hey, that’s quite far away, maybe it hasn’t made it yet. Which leads to the real issue.

If I go and run ip r again, soon all these routes will become stale and disappear. Running ip -ts monitor shows just that.

So the question is, what’s happening to the RIP advertisements? And yes they’re still being sent. Then how come some made it to node 5, and never again.

The simple answer is, it was me. The long answer is, I’ve never used FRR before, and it just didn’t seem to be working. So I started debugging the env. To debug, I had a tmux session open on the KVM host with a tab for each node running FRR. I’d go to each tab and run tcpdump to check whether the RIP traffic was making it through the tunnel. And almost instantly, I saw traffic, like:

suse@wireguard-5:~> sudo tcpdump -v -U -i wg0 port 520
tcpdump: listening on wg0, link-type RAW (Raw IP), capture size 262144 bytes
03:01:00.006408 IP (tos 0xc0, ttl 64, id 62964, offset 0, flags [DF], proto UDP (17), length 52)
10.0.4.105.router > 10.0.4.255.router:
RIPv2, Request, length: 24, routes: 1 or less
AFI 0, 0.0.0.0/0 , tag 0x0000, metric: 16, next-hop: self
03:01:00.007005 IP (tos 0xc0, ttl 64, id 41698, offset 0, flags [DF], proto UDP (17), length 172)
10.0.4.104.router > 10.0.4.105.router:
RIPv2, Response, length: 144, routes: 7 or less
AFI IPv4, 0.0.0.0/0 , tag 0x0000, metric: 1, next-hop: self
AFI IPv4, 10.0.2.0/24, tag 0x0000, metric: 2, next-hop: self
AFI IPv4, 10.0.3.0/24, tag 0x0000, metric: 1, next-hop: self
AFI IPv4, 172.16.0.0/24, tag 0x0000, metric: 1, next-hop: self
AFI IPv4, 172.16.2.0/24, tag 0x0000, metric: 2, next-hop: self
AFI IPv4, 172.16.3.0/24, tag 0x0000, metric: 1, next-hop: self
AFI IPv4, 172.16.4.0/24, tag 0x0000, metric: 1, next-hop: self

At first I thought it was good timing. I jumped to another host, and when I tcpdumped, the RIP packets turned up instantaneously. This happened again and again.. and yes, it took me longer than I’d like to admit before it dawned on me.

Why are routes going stale? it seems as though the packets are getting queued/stuck in the WG interface until I poked it with tcpdump!

The RIPv2 Request packets are sent as a broadcast, not directly to the other end of the tunnel. To get them to not be dropped, I had to widen my WG peer allowed-ips from a /32 to a /24.
So now I wonder if the broadcast, or just the fact that it’s only 52 bytes, means it gets queued up and not sent through the tunnel, that is until I come along with a hammer and tcpdump the interface?

Maybe one way I could test this is to speed up the RIP broadcasts and hopefully fill a buffer, or see if I can turn WG, or rather the kernel, into debugging mode.

Build notes

As promised, here is the current form of my build notes, which make reference to the topology image I used above.

BTW I’m using OpenSuse Leap 15.1 for all the nodes.

Build the env

I used ansible-virt-infra, created by csmart, to build the env. I created my own inventory file, which you can dump in the inventory/ folder; I called it wireguard.yml:

---
wireguard:
  hosts:
    wireguard-1:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-blue"
        - name: "net-green"
    wireguard-2:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-blue"
        - name: "net-white"
    wireguard-3:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-white"
    wireguard-4:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-orange"
        - name: "net-green"
    wireguard-5:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-orange"
        - name: "net-yellow"
    wireguard-6:
      virt_infra_networks:
        - name: "net-mgmt"
        - name: "net-yellow"
  vars:
    virt_infra_distro: opensuse
    virt_infra_distro_image: openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-OpenStack-Cloud-Current.qcow2
    virt_infra_distro_image_url: https://download.opensuse.org/distribution/leap/15.1/jeos/openSUSE-Leap-15.1-JeOS.x86_64-15.1.0-OpenStack-Cloud-Current.qcow2
    virt_infra_variant: opensuse15.1

Next we need to make sure the networks have been defined, we do this in the kvmhost inventory file, here’s a diff:

diff --git a/inventory/kvmhost.yml b/inventory/kvmhost.yml
index b1f029e..6d2485b 100644
--- a/inventory/kvmhost.yml
+++ b/inventory/kvmhost.yml
@@ -40,6 +40,36 @@ kvmhost:
           subnet: "255.255.255.0"
           dhcp_start: "10.255.255.2"
           dhcp_end: "10.255.255.254"
+        - name: "net-mgmt"
+          ip_address: "172.16.0.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.0.2"
+          dhcp_end: "172.16.0.99"
+        - name: "net-white"
+          ip_address: "172.16.1.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.1.2"
+          dhcp_end: "172.16.1.99"
+        - name: "net-blue"
+          ip_address: "172.16.2.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.2.2"
+          dhcp_end: "172.16.2.99"
+        - name: "net-green"
+          ip_address: "172.16.3.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.3.2"
+          dhcp_end: "172.16.3.99"
+        - name: "net-orange"
+          ip_address: "172.16.4.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.4.2"
+          dhcp_end: "172.16.4.99"
+        - name: "net-yellow"
+          ip_address: "172.16.5.1"
+          subnet: "255.255.255.0"
+          dhcp_start: "172.16.5.2"
+          dhcp_end: "172.16.5.99"
     virt_infra_host_deps:
         - qemu-img
         - osinfo-query

Now all we need to do is run the playbook:

ansible-playbook --limit kvmhost,wireguard ./virt-infra.yml

Setting up the IPs and tunnels

The above infrastructure tool uses cloud-init to set up the network, so only the first NIC is up. You can confirm this with:

ansible wireguard -m shell -a "sudo ip a"

That’s ok because we want to use the numbers on our diagram anyway 🙂
Before we get to that, let’s make sure wireguard is set up, and update all the nodes.

ansible wireguard -m shell -a "sudo zypper update -y"

If a reboot is required, reboot the nodes:

ansible wireguard -m shell -a "sudo reboot"

Add the wireguard repo to the nodes and install it. I look forward to kernel 5.6, where wireguard will be included:

ansible wireguard -m shell -a "sudo zypper addrepo -f obs://network:vpn:wireguard wireguard"

ansible wireguard -m shell -a "sudo zypper --gpg-auto-import-keys install -y wireguard-kmp-default wireguard-tools"

Load the kernel module:

ansible wireguard -m shell -a "sudo modprobe wireguard"

Let’s create wg0 on all wireguard nodes:

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo ip link add dev wg0 type wireguard"

And add wg1 to those nodes that have 2:

ansible wireguard-1,wireguard-4 -m shell -a "sudo ip link add dev wg1 type wireguard"

Now while we’re at it, lets create all the wireguard keys (because we can use ansible):

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo mkdir -p /etc/wireguard"

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "wg genkey | sudo tee /etc/wireguard/wg0-privatekey | wg pubkey | sudo tee /etc/wireguard/wg0-publickey"

ansible wireguard-1,wireguard-4 -m shell -a "wg genkey | sudo tee /etc/wireguard/wg1-privatekey | wg pubkey | sudo tee /etc/wireguard/wg1-publickey"

Let’s make sure we enable forwarding on the nodes that will pass traffic, and install the routing software (1, 2, 4 and 5):

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo sysctl net.ipv4.conf.all.forwarding=1"

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo sysctl net.ipv6.conf.all.forwarding=1"

While we’re at it, we might as well add the network repo so we can install FRR and then install it on the nodes:

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo zypper ar https://download.opensuse.org/repositories/network/openSUSE_Leap_15.1/ network"

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo zypper --gpg-auto-import-keys install -y frr libyang-extentions"

We’ll be using RIPv2, as we’re just using IPv4:

ansible wireguard-1,wireguard-2,wireguard-4,wireguard-5 -m shell -a "sudo sed -i 's/^ripd=no/ripd=yes/' /etc/frr/daemons"

And with that, now we just need to do all the per-server things like adding IPs and configuring all the keys, peers, etc. We’ll do this a host at a time.
NOTE: As this is a POC we’re just using ip commands; obviously in a real env you’d want to use systemd-networkd or something to make these stick.

wireguard-1

Firstly using:
sudo virsh dumpxml wireguard-1 |less

We can see that eth1 is net-blue and eth2 is net-green so:
ssh wireguard-1

First IPs:
sudo ip address add dev eth1 172.16.2.101/24
sudo ip address add dev eth2 172.16.3.101/24
sudo ip address add dev wg0 10.0.2.101/24
sudo ip address add dev wg1 10.0.3.101/24

Load up the tunnels:
sudo wg set wg0 listen-port 51821 private-key /etc/wireguard/wg0-privatekey

# Node2 (2.102) public key is: P1tHKnaw7d2GJUSwXZfcayrrLMaCBHqcHsaM3eITm0s= (cat /etc/wireguard/wg0-publickey)

sudo wg set wg0 peer P1tHKnaw7d2GJUSwXZfcayrrLMaCBHqcHsaM3eITm0s= allowed-ips 10.0.2.0/24 endpoint 172.16.2.102:51822

sudo ip link set wg0 up

sudo wg set wg1 listen-port 51831 private-key /etc/wireguard/wg1-privatekey

# Node4 (3.104) public key is: GzY59HlXkCkfXl9uSkEFTHzOtBsxQFKu3KWGFH5P9Qc= (cat /etc/wireguard/wg1-publickey)

sudo wg set wg1 peer GzY59HlXkCkfXl9uSkEFTHzOtBsxQFKu3KWGFH5P9Qc= allowed-ips 10.0.3.0/24 endpoint 172.16.3.104:51834

sudo ip link set wg1 up

Setup FRR:
sudo tee /etc/frr/frr.conf <<EOF
hostname $(hostname)

password frr
enable password frr

log file /var/log/frr/frr.log

router rip
version 2
redistribute kernel
redistribute connected

network wg0
no passive-interface wg0
network wg1
no passive-interface wg1
EOF

sudo systemctl restart frr
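To check whether RIP is sending and receiving updates over the tunnels, FRR’s vtysh shell has the usual show commands (standard FRR, nothing specific to this setup):

sudo vtysh -c "show ip rip"
sudo vtysh -c "show ip rip status"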

wireguard-2

Firstly using:
sudo virsh dumpxml wireguard-2 |less

We can see that eth1 is net-blue and eth2 is net-white so:

ssh wireguard-2

First IPs:
sudo ip address add dev eth1 172.16.2.102/24
sudo ip address add dev eth2 172.16.1.102/24
sudo ip address add dev wg0 10.0.2.102/24


Load up the tunnels:
sudo wg set wg0 listen-port 51822 private-key /etc/wireguard/wg0-privatekey

# Node1 (2.101) public key is: ZsHAeRbNsK66MBOwDJhdDgJRl0bPFB4WVRX67vAV7zs= (cat /etc/wireguard/wg0-publickey)

sudo wg set wg0 peer ZsHAeRbNsK66MBOwDJhdDgJRl0bPFB4WVRX67vAV7zs= allowed-ips 10.0.2.0/24 endpoint 172.16.2.101:51821

sudo ip link set wg0 up

Setup FRR:
sudo tee /etc/frr/frr.conf <<EOF
hostname $(hostname)


password frr
enable password frr

log file /var/log/frr/frr.log

router rip
version 2
redistribute kernel
redistribute connected

network wg0
no passive-interface wg0
EOF

sudo systemctl restart frr

wireguard-3

Only has a net-white, so it must be eth1 so:

ssh wireguard-3

First IPs:
sudo ip address add dev eth1 172.16.1.103/24

Has no WG tunnels or FRR so we’re done here.

wireguard-4

Firstly using:
sudo virsh dumpxml wireguard-4 |less

We can see that eth1 is net-orange and eth2 is net-green so:

ssh wireguard-4

First IPs:
sudo ip address add dev eth1 172.16.4.104/24
sudo ip address add dev eth2 172.16.3.104/24
sudo ip address add dev wg0 10.0.4.104/24
sudo ip address add dev wg1 10.0.3.104/24

Load up the tunnels:
sudo wg set wg0 listen-port 51844 private-key /etc/wireguard/wg0-privatekey

# Node5 (4.105) public key is: Af/sIEnklG6nnDb0wzUSq1D/Ujh6TH+5R9TblLyS3h8= (cat /etc/wireguard/wg0-publickey)

sudo wg set wg0 peer Af/sIEnklG6nnDb0wzUSq1D/Ujh6TH+5R9TblLyS3h8= allowed-ips 10.0.4.0/24 endpoint 172.16.4.105:51845

sudo ip link set wg0 up

sudo wg set wg1 listen-port 51834 private-key /etc/wireguard/wg1-privatekey

# Node1 (3.101) public key is: Yh0kKjoqnJsxbCsTkQ/3uncEhdqa+EtJXCYcVzMdugs= (cat /etc/wireguard/wg1-publickey)

sudo wg set wg1 peer Yh0kKjoqnJsxbCsTkQ/3uncEhdqa+EtJXCYcVzMdugs= allowed-ips 10.0.3.0/24 endpoint 172.16.3.101:51831

sudo ip link set wg1 up

Setup FRR:
sudo tee /etc/frr/frr.conf <<EOF
hostname $(hostname)

password frr
enable password frr

log file /var/log/frr/frr.log

router rip
version 2
redistribute kernel
redistribute connected

network wg0
no passive-interface wg0

network wg1
no passive-interface wg1
EOF


sudo systemctl restart frr

wireguard-5

Firstly using:
sudo virsh dumpxml wireguard-5 |less

We can see that eth1 is net-orange and eth2 is net-yellow so:

ssh wireguard-5

First IPs:
sudo ip address add dev eth1 172.16.4.105/24
sudo ip address add dev eth2 172.16.5.105/24
sudo ip address add dev wg0 10.0.4.105/24

Load up the tunnels:
sudo wg set wg0 listen-port 51845 private-key /etc/wireguard/wg0-privatekey

# Node4 (4.104) public key is: aPA197sLN3F05bgePpeS2uZFlhRRLY8yVWnzBAUcD3A= (cat /etc/wireguard/wg0-publickey)

sudo wg set wg0 peer aPA197sLN3F05bgePpeS2uZFlhRRLY8yVWnzBAUcD3A= allowed-ips 10.0.4.0/24 endpoint 172.16.4.104:51844

sudo ip link set wg0 up

Setup FRR:
sudo tee /etc/frr/frr.conf <<EOF
hostname $(hostname)

password frr
enable password frr

log file /var/log/frr/frr.log

router rip
version 2
redistribute kernel
redistribute connected

network wg0
no passive-interface wg0
EOF


sudo systemctl restart frr

wireguard-6

Only has a net-yellow, so it must be eth1 so:

ssh wireguard-6

First IPs:
sudo ip address add dev eth1 172.16.5.106/24

Final comments

When this _is_ all working, we’d probably need to open up the allowed-ips on the WG tunnels. We could start by just adding 172.16.0.0/16 to the list. That might allow us to route packets to the other networks.

If you want to go find routes out to the internet, then we may need 0.0.0.0/0. But I’m not sure how WG will route that, as it’s using the allowed-ips and public keys as a routing table. I guess it may not care, as we only have a 1:1 mapping on each tunnel, and if we can route to the WG interface it’s pretty straightforward.
This is something I hope to test.

Another really beneficial test would be to rebuild this environment using IPv6 and see if things work better, as we wouldn’t have any broadcasts anymore, only uni- and multicast.

As well as trying some other routing protocol in general, like OSPF.

Finally, having to continually adjust allowed-ips, and seemingly having to either open it up more or add more ranges, makes me realise why the wg-dynamic project exists, and why they want to come up with a secure routing protocol to use through the tunnels to do something similar. So let’s keep an eye on that project.

Keystone Federated Swift – Multi-region cluster, multiple federation, access same account

Welcome to the final post in the series; it has been a long time coming. If required/requested I’m happy to delve deeper into any of these topics, but I’ll attempt to explain the situation, the best approach to take, and how I got a POC working, which I am calling the brittle method. It definitely isn’t the best approach, but as it was done solely on the Swift side, and as I am an OpenStack Swift dev, it was the quickest and easiest for me when preparing for the presentation.

To first understand how we can build a federated environment where we have access to our account no matter where we go, we need to learn about how keystone authentication works from a Swift perspective. Then we can look at how we can solve the problem.

Swift’s Keystoneauth middleware

As mentioned in earlier posts, there isn’t any magic in the way Swift authentication works. Swift is an end-to-end storage solution and so authentication is handled via authentication middlewares. Further, a single Swift cluster can talk to multiple auth backends, which is where the `reseller_prefix` comes into play. This was the first approach I blogged about in this series.

 

There is nothing magical about how authentication works; keystoneauth has its own idiosyncrasies, but in general it simply makes a decision on whether a request should be allowed. That makes writing your own simple, and maybe an easy way around the problem, i.e. write an auth middleware to auth directly against your existing company LDAP server or authentication system.

 

To set up keystone authentication, you use keystone’s authtoken middleware and directly afterwards in the pipeline place the Swift keystoneauth middleware, configuring each of them in the proxy configuration:

pipeline = ... authtoken keystoneauth ... proxy-server
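As a rough sketch, the corresponding filter sections in proxy-server.conf look something like the following (values are purely illustrative; see the keystonemiddleware and Swift docs for the full set of options):

[filter:authtoken]
paste.filter_factory = keystonemiddleware.auth_token:filter_factory
www_authenticate_uri = http://keystone:5000
auth_url = http://keystone:5000
auth_type = password
project_name = service
username = swift
password = secret

[filter:keystoneauth]
use = egg:swift#keystoneauth
reseller_prefix = AUTH_
operator_roles = admin, swiftoperator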

The authtoken middleware

Generally every request to Swift will include a token, unless it’s using tempurl, container-sync, or it’s to a container that has global read enabled.. but you get the point.

As the swift-proxy is a python wsgi app, the request first hits the first middleware in the pipeline (left most) and works its way to the right. When it hits the authtoken middleware, the token in the request will be sent to keystone to be authenticated.

The resulting metadata, i.e. the user, storage_url, groups, roles etc, is dumped into the request environment and then passed to the next middleware: the keystoneauth middleware.

The keystoneauth middleware

The keystoneauth middleware checks the request environment for the metadata dumped by the authtoken middleware and makes a decision based on that. Things like:

  • If the token was one for one of the reseller_admin roles, then they have access.
  • If the user isn’t a swift user of the account/project the request is for, is there an ACL that will allow it.
  • If the user has a role that identifies them as a swift user/operator of this Swift account then great.

 

When checking to see if the user has access to the given account (Swift account) it needs to know what account the request is for. This is easily determined as it’s defined by the path of the URL you’re hitting. The URL you send to the Swift proxy is what we call the storage url, and it is in the form of:

http(s)://<url of proxy or proxy vip>/v1/<account>/<container>/<object>

The container and object elements are optional as it depends on what you’re trying to do in Swift. When the keystoneauth middleware is authenticating it’ll check that the project_id (or tenant_id) metadata dumped by authtoken, when concatenated with the reseller_prefix, matches the account in the given storage_url. For example let’s say the following metadata was dumped by authtoken:

{
  "X_PROJECT_ID": 'abcdefg12345678',
  "X_ROLES": "swiftoperator",
  ...
}

And the reseller_prefix for keystone auth was AUTH_ and we make any member of the swiftoperator role (in keystone) a swift operator (a swift user on the account). Then keystoneauth would allow access if the account in the storage URL matched AUTH_abcdefg12345678.
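In other words, with that metadata a request like the following would pass the keystoneauth check (the token and proxy URL are obviously illustrative):

curl -i -H "X-Auth-Token: <token>" http://swiftproxy:8080/v1/AUTH_abcdefg12345678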

 

When you authenticate to keystone the object storage endpoint will point not only to the Swift endpoint (the swift proxy or swift proxy load balancer), but it will also include your account. Based on your project_id. More on this soon.

 

Does that make sense? So simply put, to use keystoneauth in a multi-federated environment, we just need to make sure that no matter which keystone we end up using, asking for the swift endpoint always returns the same Swift account name.

And there lies our problem: the keystone object storage endpoint and the metadata authtoken dumps both use the project_id/tenant_id. This isn’t something that is synced or can be passed via federation metadata.

NOTE: This also means that you’d need to use the same reseller_prefix on all keystones in every federated environment. Otherwise the accounts won’t match.

 

Keystone Endpoint and Federation Side

When you add an object storage endpoint in keystone, for swift, the url looks something like:

http://swiftproxy:8080/v1/AUTH_$(tenant_id)s

 

Notice the $(tenant_id)s at the end? This is a placeholder that keystone internally will replace with the tenant_id of the project you authenticated as. $(project_id)s can also be used and maps to the same thing. And this is our problem.

When setting up federation between keystones (assuming keystone 2 keystone federation) you generate a mapping. This mapping can include the project name, but not the project_id. These ids are auto-generated, not deterministic by name, so creating the same project on different federated keystone servers will result in different project_id’s. When a keystone service provider (SP) federates with a keystone identity provider (IdP) the mapping they share shows how the provider should map federated users locally. This includes creating a shadow project if a project doesn’t already exist for the federated user to be part of.

Because there is no way to sync project_id’s in the mapping, the SP will create the project with its own unique project_id. Meaning when the federated user has authenticated, their Swift storage endpoint from keystone will be different; in essence, as far as Swift is concerned, they will have access, but to a completely different Swift account. Let’s use an example: say there is a project on the IdP called ProjectA.

           project_name        project_id
  IdP      ProjectA            75294565521b4d4e8dc7ce77a25fa14b
  SP       ProjectA            cb0d5805d72a4f2a89ff260b15629799

Here we have a ProjectA on both the IdP and SP. The one on the SP would be considered a shadow project to map the federated user to. However the project_id’s are different, because they are uniquely generated when the project is created in each keystone environment. Taking the Object Storage endpoint in keystone as our example from before we get:

 

          Object Storage Endpoint
  IdP     http://swiftproxy:8080/v1/AUTH_75294565521b4d4e8dc7ce77a25fa14b
  SP      http://swiftproxy:8080/v1/AUTH_cb0d5805d72a4f2a89ff260b15629799

So when talking to Swift you’ll be accessing different accounts, AUTH_75294565521b4d4e8dc7ce77a25fa14b and AUTH_cb0d5805d72a4f2a89ff260b15629799 respectively. This means objects you write in one federated environment will be placed in a completely different account, so you won’t be able to access them from elsewhere.

 

Interesting ways to approach the problem

Like I stated earlier the solution would simply be to always be able to return the same storage URL no matter which federated environment you authenticate to. But how?

  1. Make sure the same project_id/tenant_id is used for _every_ project with the same name, or at least the same name in the domains that federation mapping maps too. This means direct DB hacking, so not a good solution, we should solve this in code, not make OPs go hack databases.
  2. Have a unique id for projects/tenants that can be synced in federation mapping, also make this available in the keystone endpoint template mapping, so there is a consistent Swift account to use. Hey we already have project_id which meets all the criteria except mapping, so that would be easiest and best.
  3. Use something that _can_ be synced in a federation mapping. Like domain and project name. Except these don’t map to endpoint template mappings. But with a bit of hacking that should be fine.

Of the above approaches, 2 would be the best. 3 is good, except if you pick something mutable like the project name: if you ever change it, you’d now authenticate to a completely different swift account, meaning you’d have just lost access to all your old objects! And you may find yourself with grumpy Swift Ops who now need to do a potentially large data migration, or you’d be forced to never change your project name.

Option 2, being unique, won’t change, though it doesn’t look like a very memorable name if you’re using the project id. Maybe you could offer people a more memorable immutable project property to use. But to keep the change simple, being able to simply sync the project_id should get us everything we need.

 

When I was playing with this, it was for a presentation, so I had a time limit, a very strict one, so being a Swift developer and knowing the Swift code base I hacked together a variant on option 3 that didn’t involve hacking keystone at all. Why? Because I needed a POC and didn’t want to spend most of my time figuring out the inner workings of Keystone, when I could just do a few hacks to have a complete Swift-only version. And it worked. Though I wouldn’t recommend it. Option 3 is very brittle.

 

The brittle method – Swift only side – Option 3b

Because I didn’t have time to simply hack keystone, I took a different approach. The basic idea was to let authtoken authenticate and then finish building the storage URL on the swift side using the metadata authtoken dumps into the wsgi request env, thereby modifying the way keystoneauth authenticates slightly.

Step 1 – Give the keystoneauth middleware the ability to complete the storage url

By default we assume the incoming request will point to a complete account, meaning the object storage endpoint in keystone will end with something like:

'<uri>/v1/AUTH_%(tenant_id)s'

So let’s enhance keystoneauth with the ability to complete the account itself when given only the reseller_prefix. So I added a use_dynamic_reseller option.

If you enable use_dynamic_reseller then the keystoneauth middleware will pull the project_id from authtoken‘s meta-data dumped in the wsgi environment. This allows a simplified keystone endpoint in the form:

'<uri>/v1/AUTH_'

This shortcut makes configuration easier, but can only be reliably used when on your own account and providing a token. API elements like tempurl and public containers need the full account in the path.

This still uses project_id, so it doesn’t solve our problem, but it meant I could get rid of the $(tenant_id)s from the endpoints. Here is the commit in my github fork.

Step 2 – Extend the dynamic reseller to include completing storage url with names

Next, we extend the keystoneauth middleware a little bit more. Give it another option, use_dynamic_reseller_name, to complete the account with either project_name or domain_name and project_name, but only if you’re using keystone authentication version 3.

If you are, and want to have an account based on the name of the project, then you can enable use_dynamic_reseller_name in conjunction with use_dynamic_reseller to do so. The form used for the account would be:

<reseller_prefix><project_domain_name>_<project_name>

So using our example previously, with a reseller_prefix of AUTH_, a project_domain_name of Domain and our project name of ProjectA, this would generate the account:

AUTH_Domain_ProjectA

This patch is also in my github fork.
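For completeness, with both patches from my fork applied, the keystoneauth filter section would look roughly like this (option names as described above; the exact syntax depends on the fork):

[filter:keystoneauth]
use = egg:swift#keystoneauth
reseller_prefix = AUTH_
use_dynamic_reseller = true
use_dynamic_reseller_name = true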

Does this work? Yes! But as I’ve already mentioned in the last section, this is _very_ brittle. It also makes it confusing to know when you need to provide only the reseller_prefix or your full account name. It would be so much easier to just extend keystone to sync and create shadow projects with the same project_id. Then everything would just work without hacking.

Keystone Federated Swift – Final post coming

This is a quick post to say the final topology post is coming. It’s currently in draft form and I hope to post it soon. I just realised it’s been a while, so I thought I’d better give an update.

 

The last post goes into what auth does, what is happening in keystone, what needs to happen to really make this topology work, and then talks about the brittle POC I created to have something to demo. I’ll be discussing other better options/alternatives. But all this means it’s become much more detailed than I originally expected. I hope to get it up by mid next week.

 

Thanks for waiting.

Monasca + Swift: Sending all your Swift metrics Monasca’s way

Last week was SUSE Hackweek. A week every employee is given to go have fun hacking something or learning something they find interesting. It’s an awesome annual event that SUSE runs. It’s my second and I love it.

While snowed in at the Dublin PTG a while ago I chatted with Johannes, a monasca dev and very intelligent team mate at SUSE, and I heard that Monasca has a statsd endpoint, as part of the monasca agent, that you can fire stats at. As a Swift developer this interests me greatly. Every Swift daemon dumps a plethora of statsd metrics. So can I put the 2 together? Can I simply install monasca-agent on each storage and proxy node and then point the statsd endpoints for all swift services locally?

 

I started the week attempting to do just that. Because I’m new to monasca, and didn’t want to go attempt to set it up, I just ran a devstack + SAIO environment.

The devstack was just a simple monasca + keystone + horizon configuration and the SAIO was a standard Swift All In One.

 

Next I installed the monasca-agent on the SAIO and then updated Swift to point at it. In Swift, each config supports statsd server endpoint configuration options:

 

# You can enable StatsD logging here:
# log_statsd_host =
# log_statsd_port = 8125
# log_statsd_default_sample_rate = 1.0
# log_statsd_sample_rate_factor = 1.0
# log_statsd_metric_prefix =
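With a monasca-agent statsd listener running locally on each node, you just uncomment those options in each Swift daemon’s config and point them at localhost, something like (8125 being the usual statsd default; adjust to whatever port the agent actually listens on):

log_statsd_host = localhost
log_statsd_port = 8125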

 

So pointing swift at it is easy. I then uploaded a few objects to swift and bingo, inside Monasca’s influxdb instance I can see the Swift measurements.

 

account-auditor.passes
account-auditor.timing
account-replicator.attempts
account-replicator.no_changes
account-replicator.successes
account-replicator.timing
account-server.GET.timing
account-server.HEAD.timing
account-server.PUT.timing
account-server.REPLICATE.timing
container-auditor.passes
container-auditor.timing
container-replicator.attempts
container-replicator.no_changes
container-replicator.successes
container-replicator.timing
container-server.GET.timing
container-server.PUT.timing
container-server.REPLICATE.timing
container-updater.no_changes
container-updater.successes
container-updater.timing
monasca.collection_time_sec
monasca.thread_count
object-auditor.timing
object-replicator.partition.update.count.sdb1
object-replicator.partition.update.count.sdb2
object-replicator.partition.update.count.sdb3
object-replicator.partition.update.count.sdb4
object-replicator.partition.update.timing
object-replicator.suffix.hashes
object-server.HEAD.timing
object-server.PUT.sdb1.timing
object-server.PUT.sdb2.timing
object-server.PUT.sdb3.timing
object-server.PUT.sdb4.timing
object-server.PUT.timing
object-server.REPLICATE.timing
object-updater.timing
proxy-server.account.GET.200.first-byte.timing
proxy-server.account.GET.200.timing
proxy-server.account.GET.200.xfer
proxy-server.object.HEAD.404.timing
proxy-server.object.HEAD.404.xfer
proxy-server.object.PUT.201.timing
proxy-server.object.PUT.201.xfer
proxy-server.object.policy.1.HEAD.404.timing
proxy-server.object.policy.1.HEAD.404.xfer
proxy-server.object.policy.1.PUT.201.timing
proxy-server.object.policy.1.PUT.201.xfer

 

NOTE: This isn’t the complete list, as the measures are added when new metrics are fired, and the SAIO is a small healthy swift cluster, so there aren’t many 500 series errors etc. But it works!

 

And better yet I have access to them in grafana via the monasca datasource!

 

swift_recon check plugin

I thought that was easy, but Swift actually provides more metrics than just that. Swift has a reconnaissance API (recon) on all the wsgi servers (account, container and object servers) that you can hit either via REST or the swift-recon tool. So next I thought, I wonder how hard it would be to write a swift_recon check plugin for Monasca?
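For example, recon can be queried directly per storage node over HTTP, or summarised across a whole ring with the CLI (standard recon endpoint and flag; the port assumes the default object-server port):

curl http://<storage-node>:6200/recon/replication/object
swift-recon object -r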

Some of the recon metrics you can get aren’t really grafana friendly. But some would be awesome to have in the same place and closer to horizon where ops are looking.

 

So I went and wrote one. Like I said I couldn’t get all the metrics, but I got most:

 

swift_recon.account.account_auditor_pass_completed
swift_recon.account.account_audits_failed
swift_recon.account.account_audits_passed
swift_recon.account.account_audits_since
swift_recon.account.attempted
swift_recon.account.failure
swift_recon.account.replication_last
swift_recon.account.replication_time
swift_recon.account.success
swift_recon.container.attempted
swift_recon.container.container_auditor_pass_completed
swift_recon.container.container_audits_failed
swift_recon.container.container_audits_passed
swift_recon.container.container_audits_since
swift_recon.container.container_updater_sweep
swift_recon.container.failure
swift_recon.container.replication_last
swift_recon.container.replication_time
swift_recon.container.success
swift_recon.disk_usage.mounted
swift_recon.object.async_pending
swift_recon.object.attempted
swift_recon.object.auditor.object_auditor_stats_ALL.audit_time
swift_recon.object.auditor.object_auditor_stats_ALL.bytes_processed
swift_recon.object.auditor.object_auditor_stats_ALL.errors
swift_recon.object.auditor.object_auditor_stats_ALL.passes
swift_recon.object.auditor.object_auditor_stats_ALL.quarantined
swift_recon.object.auditor.object_auditor_stats_ALL.start_time
swift_recon.object.auditor.object_auditor_stats_ZBF.audit_time
swift_recon.object.auditor.object_auditor_stats_ZBF.bytes_processed
swift_recon.object.auditor.object_auditor_stats_ZBF.errors
swift_recon.object.auditor.object_auditor_stats_ZBF.passes
swift_recon.object.auditor.object_auditor_stats_ZBF.quarantined
swift_recon.object.auditor.object_auditor_stats_ZBF.start_time
swift_recon.object.expirer.expired_last_pass
swift_recon.object.expirer.object_expiration_pass
swift_recon.object.failure
swift_recon.object.object_updater_sweep
swift_recon.object.replication_last
swift_recon.object.replication_time
swift_recon.object.success
swift_recon.quarantined
swift_recon.unmounted

 

Some of the metric names might need tidying up, but so far, so good. Some of the really interesting metrics Swift Ops usually want to keep an eye on are when all the replicators have completed a cycle. Why? Well one example is while ring rebalancing on a large and busy cluster you want to avoid too much data movement, so when adding new drives you will raise their weights slowly. But you also want to make sure a complete replication cycle has finished before you rebalance again. So knowing when you pushed a new ring out, plus the timestamps of the last replication run, tells you when it’s safe. These are coming through nicely:

 

 

Unfortunately there are some metrics I can’t quite get though. You can use recon to get md5s of the rings and configs on each node, but I found the md5s can’t get pushed through. You can also ask recon what version of swift is installed on each node (nice in a large deployment and when upgrading), but the version number also had issues. Both of these are probably not insurmountable, but I still need to figure out how.

 

swift_handoffs check plugin

I’ve been involved in the Swift community for quite a while now, and I’d heard of another awesome metric one of the Swiftstack cores came up with to give an awesome visualisation of the Swift cluster. He even provided a gist to the community that others could use and adapt. I thought, why not make sure everyone could use it; let’s add it as another check plugin for the monasca agent.

 

Everything in Swift is treated as an object, and an object has a number of devices in the cluster which are considered primary (they store that object). When a drive gets full, or there is too much load on say an object PUT, and a primary is unavailable to meet the durability contract, another node will store the object (this node would be called a handoff for that object), and the handoff node will push the handoff object to the primary as soon as it can (the drive is replaced, or comes back online, etc).

Further, a ring in Swift is divided into logical segments called partitions, and it’s these partitions that devices are responsible for storing (or think of it as: a device has to store all objects that belong to its partitions). When we rebalance the ring, either by adding or removing drives or changing weights, these partitions shift around the cluster, either to drain a drive or to move to where there is more space. Swift is really good at keeping this movement to a minimum. So after a rebalance, nodes that used to be primaries for some partitions won’t be anymore. They’ll suddenly be handoffs, and the back-end consistency engine will move them to their new home.

So what’s interesting to note there is, it all involves handoff partitions.

 

It turns out that just watching the number of primary partitions vs the number of handoffs on each storage node gives you a great health indicator. When should I do the next rebalance? When the handoff numbers are down. Is there a build up of handoffs in one region? Maybe write affinity and WAN links are saturated, or there is some network/disk/server issue on one of the nodes around there, etc.

Here are the metrics:

 

swift_handoffs.handoffs
swift_handoffs.primary
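To give a feel for what the check is measuring, here is a rough sketch of the idea (this is not the actual plugin code; the devices root, ring path and node IP are assumptions for a default SAIO-style install, and only the object ring is shown): walk the partition directories on disk and ask the ring whether this device is a primary for each partition.

import os
from swift.common.ring import Ring

DEVICES = '/srv/node'                      # assumed devices root
ring = Ring('/etc/swift/object.ring.gz')   # assumed ring location
my_ips = {'127.0.0.1'}                     # assumed: this node's IPs

primary = handoff = 0
for dev in os.listdir(DEVICES):
    parts_dir = os.path.join(DEVICES, dev, 'objects')
    if not os.path.isdir(parts_dir):
        continue
    for part in os.listdir(parts_dir):
        if not part.isdigit():
            continue
        # The ring knows which devices should be primaries for this partition
        nodes = ring.get_part_nodes(int(part))
        if any(n['ip'] in my_ips and n['device'] == dev for n in nodes):
            primary += 1
        else:
            handoff += 1

print('swift_handoffs.primary=%d swift_handoffs.handoffs=%d' % (primary, handoff))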

 

And here is a simplified snapshot. This is my SAIO with 4 simulated nodes. This is watching the storage nodes as a whole, but you can break it down to the drive. There is a graph for each node and each Swift ring. The rise in handoffs (Object – Policy 0, SAIO[1-3]) is due to me turning off the consistency engine and then changing the weights back to a nicely balanced cluster:

See Object - Policy 0: SAIO0’s weight has increased, so the other nodes now have handoff partitions to give it. If I now went and turned the consistency engine back on, you’d see more primary partitions on SAIO0.

 

Where’s the code

UPDATE: I’ve now pushed up the checks to monasca. They can be found here:

  • https://review.openstack.org/#/c/583876/
  • https://review.openstack.org/#/c/585067/

Keystone Federated Swift – Separate Clusters + Container Sync

This is the third post in my series on Keystone Federated Swift. To bounce back to the start you can visit the first post.

Separate Clusters + Container Sync

The idea with this topology is to deploy each of your federated OpenStack clusters with its own Swift cluster, and then use another Swift feature, container sync, to push objects you create in one federated environment to another.

In this case the Keystone servers are federated. A very similar topology could be a global Swift cluster where each proxy only talks to a single region’s Keystone. That would mean a user visiting a different region would authenticate via federation and be able to use the Swift cluster, but would end up with a different account name. In both cases container sync can be used to synchronise the objects, say from the federated account to the original account, because container sync can synchronise containers in separate clusters or within the same one.

 

Setting up container sync

Setting up container sync is pretty straightforward, and is also well documented. At a high level it goes like this. First you need to set up a trust between the different clusters. This is achieved by creating a container-sync-realms.conf file; the online example is:

[realm1]
key = realm1key
key2 = realm1key2
cluster_clustername1 = https://host1/v1/
cluster_clustername2 = https://host2/v1/

[realm2]
key = realm2key
key2 = realm2key2
cluster_clustername3 = https://host3/v1/
cluster_clustername4 = https://host4/v1/

 

Each realm is a separate set of trusts, and you can have as many clusters in a realm as you want, so as you can see you can build up different realms. In our example we’d only need 1 realm, and let’s use some better names.

[MyRealm]
key = someawesomekey
key2 = anotherkey
cluster_blue = https://blueproxyvip/v1
cluster_green = https://greenproxyvip/v1

NOTE: there is nothing stopping you from only having 1 cluster defined, as you can use container sync within a cluster, or from adding more clusters to a single realm.

 

Now in our example both the green and blue clusters need to have the MyRealm realm defined in their /etc/swift/container-sync-realms.conf file. The 2 keys are there so you can do key rotation. They should be kept secret, as they are what define the trust between the clusters.

 

The next step is to make sure you have the container_sync middleware in your proxy pipeline. There are 2 parts to container sync: the backend daemon that periodically checks containers for new objects and sends changes to the other cluster, and the middleware that authenticates the requests sent by container sync daemons from other clusters. We tend to place the container_sync middleware before (to the left of) any authentication middleware.
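As a rough illustration (a real pipeline will have more middleware in it, and the rest of the ordering here is just an example), the placement looks something like this in proxy-server.conf:

[pipeline:main]
pipeline = catch_errors proxy-logging cache container_sync authtoken keystoneauth proxy-logging proxy-server

[filter:container_sync]
use = egg:swift#container_sync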

 

The last step is to tell container sync which containers to keep in sync. This is all done via container metadata, which is controlled by the user. Let’s assume we have 2 accounts: AUTH_matt on the blue cluster and AUTH_federatedmatt on the green, and we want to sync a container called mycontainer. Note, the containers don’t have to have the same name. We’d start by making sure the 2 containers have the same container sync key, which is defined by the owner of the container (this isn’t a realm key, but works in a similar way), and then telling 1 container to sync with the other.
NOTE: you can make the relationship go both ways.

 

Let’s use curl first:

$ curl -i -X POST -H 'X-Auth-Token: <token>' \
-H 'X-Container-Sync-Key: secret' \
'http://blueproxyvip/v1/AUTH_matt/mycontainer'

$ curl -i -X POST -H 'X-Auth-Token: <token>' \
-H 'X-Container-Sync-Key: secret' \
-H 'X-Container-Sync-To: //MyRealm/blue/AUTH_matt/mycontainer' \
'http://greenproxyvip/v1/AUTH_federatedmatt/mycontainer'

Or via the swift client, noting that you need to change identities to set the metadata on each account.

# To the blue cluster for AUTH_matt
$ swift  post -k 'secret' mycontainer

 

# To the green cluster for AUTH_federatedmatt
$ swift  post \
-t '//MyRealm/blue/AUTH_matt/mycontainer' \
-k 'secret' mycontainer
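And as the note above says, if you wanted the relationship to go both ways, you’d also point the blue container back at the green one, something like:

# To the blue cluster for AUTH_matt, pointing back at green
$ swift post \
-t '//MyRealm/green/AUTH_federatedmatt/mycontainer' \
-k 'secret' mycontainer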

In a federated environment, you’d just need to set a sync key on each of the containers you want to work on while you’re away (or all of them, I guess). Then when you visit the other side you can just add the sync-to metadata when you create containers there. Likewise, if you knew the name of your account on the other side, you could set up a sync-to if you needed to work on something over there.

 

To authenticate, container sync generates and compares an HMAC on both sides, where the HMAC covers both the realm and container keys, the verb, the object name, etc.
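Conceptually it looks something like the sketch below. This is only an illustration of the idea, not Swift’s exact wire format, and the field list is an assumption:

import hmac
from hashlib import sha1

def sync_sig(realm_key, user_key, method, path, x_timestamp, nonce):
    # Both clusters know the realm key and the per-container sync key,
    # so the receiving side can recompute and compare this digest.
    msg = '%s\n%s\n%s\n%s\n%s' % (method, path, x_timestamp, nonce, user_key)
    return hmac.new(realm_key.encode('utf-8'),
                    msg.encode('utf-8'), sha1).hexdigest()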

 

The obvious next question is: great, but do I need to know the name of each cluster? Well yes, but you can simply find them by asking Swift via the info call. This is done by hitting the /info endpoint with whatever tool you want. If you’re using the swift client, it’s:

$ swift info
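Or you can hit the endpoint directly with any HTTP client and look for the container_sync section of the capabilities output, which is where the realm and cluster names show up:

$ curl -s http://blueproxyvip/info | python -m json.tool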

Pros and cons

Pros

The biggest pro for this approach is that you don’t have to do anything special. Whether you have 1 Swift cluster or a bunch throughout your federated environments, all you need to do is set up a container sync trust between them and the users can sync between themselves.

 

Cons

There are a few I can think of off the top of my head:

  1. You need to manually set the metadata on each container. Which might be fine if it’s just you, but if you have an app or something it’s something else you need to think about.
  2. Container sync will move the data periodically, so you may not see it in the other container straight away.
  3. More storage is used. If it’s 1 cluster or many, the objects will exist in both accounts.

Conclusion

This is an interesting approach, but I think it would be much better to have access to the same set of objects everywhere I go and it just worked. I’ll talk about how to go about that in the next post as well as talk about 1 specific way I got working as a POC.

 

Container sync is pretty cool. SwiftStack have recently open sourced another tool, 1space, that can do something similar. 1space looks awesome, but I haven’t had a chance to play with it yet, so I’ll add it to the list of Swift things I want to play with whenever I get a chance.

Keystone Federated Swift – False Federation

This is the second post in my series of posts on Swift in a Keystone federated environment, and the first post where I’ll walk through an environment, the one I’m calling ‘False Federation’. For details on this series of posts, including the rationale, see my introductory post.

 

False Federation

This first environment doesn’t actually use Keystone federation; instead it uses an existing ability of Swift to have more than 1 authentication middleware in the proxy pipeline, which is why I’m calling it ‘False Federation’.

Swift resellers and the reseller_prefix

Swift, in an OpenStack environment, talks to Keystone for identity management through Keystone’s authtoken and Swift’s keystoneauth middlewares. However Keystone isn’t required. Swift was designed to be a complete standalone storage solution; in fact many Swift deployments use different authentication middlewares (like swauth), and sometimes custom ones. This way people can easily integrate Swift into their own environments.

If you’ve spent any time setting up authentication middlewares (like keystoneauth) in Swift, you’ve undoubtedly come across Swift’s reseller_prefix option, and maybe thought to yourself: why?

 

As I mentioned earlier, from the start Swift was designed to be an end to end standalone storage system. One of the features it has always supported is the idea of more than 1 authentication middleware in the pipeline. And if you have more than 1, then you need a way to distinguish which authentication middleware handles which account. This is what the reseller_prefix does: Swift matches the reseller_prefix prefixed to the account name against the authentication middleware that is to handle it.

This is actually a really powerful feature. It means you could resell your storage solution to other parties to manage accounts, or connect up different parts of your organisation, if for some reason you have more than 1 source you want to use as an authentication service.

Some authentication middlewares, like keystoneauth, can even cover more than 1 reseller_prefix. This is how service tokens tend to be deployed, so a service can have its own namespace of users for isolation and the data is safe from accidental deletion.
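For example, a keystoneauth filter covering 2 prefixes looks roughly like this (the values are illustrative):

[filter:keystoneauth]
use = egg:swift#keystoneauth
reseller_prefix = AUTH, SERVICE
SERVICE_service_roles = service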

And yes, it’s also possible to set an empty reseller_prefix.

 

Multiple Keystone middlewares

Having got the idea of reseller_prefixes out of the way, this is the first potential solution and the idea behind ‘False Federation’. If you have a large Swift cluster, you can simply place the required authentication middlewares for each separate OpenStack environment you want to connect it to in the same proxy pipeline.

 

NOTE: There are 2 middlewares needed to connect to a single Keystone instance, Keystone’s authtoken and then Swift’s keystoneauth. Other authentication middleware, like swauth and many custom ones, is only 1 middleware, so a little less confusing.

 

Before I get into the configuration I should also mention, before you run off and give it a go: the current upstream keystoneauth in Swift doesn’t support being placed multiple times in a pipeline. Why? Because of the way it places itself in the WSGI environment. But never fear, I have written a patch to correct this behaviour specifically for this set of experiments, and when I get a chance to clean it up and write some tests I’ll push it upstream. In the meantime you can grab hold of the patch here.

 

I’m not going into huge amounts of detail on how to connect to Keystone; the Swift documentation and installation guides cover that well. Really you’re just duplicating exactly that, but for each Keystone endpoint you want to connect to. If you need detailed instructions, then let me know. They say an image is worth a 1000 words, so here is how it’s done in 1 pretty diagram:

The run down is:

  • Edit your proxy-server.conf on each node, and create ‘[filter:authtoken]’ and ‘[filter:keystoneauth]’ sections for each Keystone endpoint, noting that the names of the filters have to be different.
  • Each ‘[filter:authtoken]’ will point to an endpoint, and its corresponding ‘[filter:keystoneauth]’ will have a different reseller_prefix, which will need to match the Object Storage endpoint in that Keystone server’s service catalog (see the project documentation).
  • You then place these filters in the proxy pipeline. When placing a pair, the authtoken must come before its keystoneauth partner, but the pair’s keystoneauth must also appear before the next authtoken (like in the picture, and the sketch below).
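Here is a rough sketch of what that proxy-server.conf ends up looking like. The hostnames, prefixes and the rest of the pipeline are made up for illustration, and a real pipeline will have more middleware in it:

[pipeline:main]
pipeline = catch_errors cache authtoken1 keystoneauth1 authtoken2 keystoneauth2 proxy-server

[filter:authtoken1]
paste.filter_factory = keystonemiddleware.auth_token:filter_factory
auth_url = https://bluekeystone:5000
# plus the usual service credentials for the blue Keystone

[filter:keystoneauth1]
use = egg:swift#keystoneauth
reseller_prefix = KEY_

[filter:authtoken2]
paste.filter_factory = keystonemiddleware.auth_token:filter_factory
auth_url = https://greenkeystone:5000
# plus the usual service credentials for the green Keystone

[filter:keystoneauth2]
use = egg:swift#keystoneauth
reseller_prefix = AUTH_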

 

NOTE: I’ve left off a bunch of middleware options in the picture to keep it small and readable.

 

Now if I send the following GET requests:
GET /v1/KEY_matt/pictures/cat.png
GET /v1/AUTH_matt/pictures/cat.png

 

The first would be authenticated on the blue keystone (or via ‘authtoken1 keystoneauth1’) and the second with the green keystone (or via ‘authtoken2 keystoneauth2’).

 

Cons

This approach was to demonstrate what Swift can already do, but there are some limitations, which as always depend on your situation. Keystone’s authtoken middleware will always go and try to authenticate, so it would add a bunch of latency to each request going through the proxy. If the Keystones are close, maybe that’s OK. But if this was a geographical cluster with Keystones all around the world then… ouch. If you were using a custom middleware, you’d just skip reseller_prefixes that don’t relate to you (like keystoneauth does).

 

Maybe you could have a different Swift proxy in each “region” that only points to the local Keystone, so you are only authenticating locally… OK. But then a user can’t come and access their data if they happen to be in a different region, even though you’re talking to the same cluster.

So really what we want to do is take advantage of Keystone federation, where we only ever have to talk to 1 instance: the local one for the region the Swift proxy lives in. That way we get the speed and the ability to access our data from anywhere.

 

Next time…

So in the next post we’ll add real Keystone federation, but assume each federated environment is its own cloud, each with its own Swift cluster. In that case we can take advantage of another Swift feature, container sync.

Then the final post will be what we really want: 1 large Swift cluster with multiple federated Keystone OpenStack clusters. But that will involve fiddling with the metadata shared during federated token exchange and needs a more detailed explanation of how Swift authentication works. So first I want to cover what Swift can do simply with the tools it comes with!

Keystone Federated Swift – A series of posts

Matt Treinish and I proposed a presentation for the OpenStack Summit in Vancouver in May. It was accepted, but on standby, which simply means we have a lightning talk slot (10 minutes), but may be bumped up to a full slot based on how other presenters go (visa issues, pull outs, etc).

Anyway, 10 minutes won’t do the topic justice, so I thought what better than to also post details as I work through them here. Some of what I say may end up in the presentation, or may not. All I know is I’ve been asked a few times how to set up Swift in a Keystone federated environment. Let’s face it, Swift scales to a global cluster no worries, however other OpenStack components may have trouble doing the same. So federating a bunch of different regions and treating them as their own clouds makes heaps of sense. Great, then what’s the best way of integrating Swift into this federated environment?

 

My current idea is to walk through 3 initial topologies. The first I’ll call ‘false federation’, where we simply use Swift’s ability to have multiple authentication middlewares as different resellers, to be able to authenticate to multiple Keystone endpoints. For those playing along at home, the keystone middleware currently doesn’t let you do this, but I have a trivial patch that fixes it… and I plan to push it upstream as soon as I have a chance to clean it up and add tests.

 

The second is separate Swift clusters in each cloud, but using Swift’s container sync to move objects, so you still have access to your data on any cloud you visit… eventually.

 

And finally, the third is what we’d all want: 1 large Swift cluster that all the clouds talk to, so no matter where you are, there your data is. Plus it gives better durability, dispersion, and everything else we want out of a Swift cluster. The trick here will be making sure the same Swift account name is used no matter which Keystone you talk to, and I assume this will come down to how you configure what you share during federated token exchange. I’ll leave this as the last post, and we still need to play to iron it out… but obviously it’s the dream.

These diagrams are obviously overly simplistic, but I hope you get the idea.

The next post will be the ‘False federation’ approach seeing as I already have a swift keystoneauth middleware patch that solves this.

Weechat – a trial

I’m a big fan of console apps. But for IRC I have been using Quassel, as it gives me a client on my phone. Recently I’ve been cleaning up my cloud accounts, and thought of the good old days when you’d simply run a console IRC client in screen or tmux.

Many years ago I used weechat, it was awesome, so I thought why not go and have a play.

To my surprise weechat has a /relay command which allows other clients to connect. One such client is WeechatAndroid, which in itself isn’t an IRC client, but rather a client that talks to an already running weechat… This is exactly what I want.

So in case any of you want to do the same, this is my current tmux + weechat setup, mostly gleaned from various internet sources.

 

Initial WeeChat setup

Once you start weechat, you can simply configure things from within it, and when you do, it writes them to its config files. So really all you need to do is place or edit the config files directly, and once set up you can easily move it. However, as you’ll always want to be connected, it’s nice to know how to change configuration while it’s running… you know, so you can play.

I’ll assume you have installed weechat from whatever distro you’re using. So we’ll start by running weechat (inside a tmux or screen session):

weechat

Now while we are inside weechat, let’s first start by installing/activating some scripts:

/script install buffers.pl go.py colorize_nicks.py urlserver.py

Noting that there are heaps of scripts you could install, but let’s explain these:

  • buffers.pl – makes a left side window listing the buffers (or channels).
  • go.py – allows you to quickly search the buffers.
  • urlserver.py – automatically shortens long urls so they don’t break if they wrap over a line.
  • colorize_nicks.py – is obvious.

The go.py script is useful. But it turns out you can also turn on mouse support with:

/mouse enable

This will then allow you to use your mouse to select a buffer, although it breaks normal copy and paste, so maybe it’s not worth the effort, but I thought I’d mention it.
Normally, of course, you’d use the weechat keybindings to move between buffers, <alt+left or right> to go to a buffer, or <alt-a> to go to the last active buffer. But go means you can easily jump around, and if you add a go.py keybinding:

/key bind meta-g /go

Then <alt+g> will launch it, so you can just type away; it’s really easy.

But we are probably getting ahead of ourselves. There are plenty of places that can tell you how to configure weechat to connect to freenode etc. I used: https://weechat.org/files/doc/devel/weechat_quickstart.en.html

Now that you have some things configured, let’s set up the relay script/plugin.

 

Setting up /relay

This actually isn’t too hard, but there was a gotcha, which is why I’m writing about it. I first simply followed WeechatAndroid’s guide, but that just sets up an insecure relay:


/relay add weechat 8001
/set relay.network.password "your-secret-password"

NOTE: 8001 is the port, so you can change this, and the password to whatever you want.

However, that’s great for a test, but we probably want SSL, so we need an SSL cert:


mkdir -p ~/.weechat/ssl
cd ~/.weechat/ssl
openssl req -nodes -newkey rsa:2048 -keyout relay.pem -x509 -days 365 -out relay.pem

NOTE: To see or change where weechat looks for the cert, check the value of ‘relay.network.ssl_cert_key’, or just run: /set relay.*

We can tell weechat to load this SSL cert without restarting by running:

/relay sslcertkey

And here’s the gotcha: you will fail to connect via SSL until you create an instance of the weechat relay protocol (the listening socket) with ssl in the name of the protocol:
/relay del weechat
/relay add ssl.weechat 8001

Or just be smart and set up the SSL version from the start.

On the weechat buffer you will see clients connecting and disconnecting, so it’s a good way to debug connection issues.

 

Weechat /relay + ssl (TL;DR)

mkdir -p ~/.weechat/ssl
cd ~/.weechat/ssl
openssl req -nodes -newkey rsa:2048 -keyout relay.pem -x509 -days 365 -out relay.pem

In weechat:

/relay sslcertkey
/relay add ssl.weechat 8001
/set relay.network.password "your-secret-password"