As some of you might know, we had an outage yesterday. We believe that there is something to learn from every mistake, so after each outage we write a post-mortem. Usually we do this internally, because the issues we run into are very specific to our infrastructure.
This time we ran into a rather nasty issue that could affect anyone running a Linux system with a lot of sessions on it, and we thought you might be interested to know about that pitfall.
At 4:40pm CEST, we got reports of Yikes errors (503/504) on SoundCloud. Around the same time, our monitoring alerted on a high rate of 503s at our caching layer, and right after that one of our L7 routing nginx instances was reported down.
We were still able to log into that system. dmesg showed:
Aug 13 14:46:52 ams-mid006.int.s-cloud.net kernel: [8623919.136122] nf_conntrack: table full, dropping packet.
Aug 13 14:46:52 ams-mid006.int.s-cloud.net kernel: [8623919.136138] nf_conntrack: table full, dropping packet.
N.B.: Our systems are set to the UTC timezone.
That wasn’t expected. The first thought was: “Someone must have changed the sysctl tunings for that”. Then we realized that this system has no need for connection tracking, so nf_conntrack shouldn’t be loaded at all. As a quick countermeasure we raised net.ipv4.netfilter.ip_conntrack_max. This fixed the situation and brought the service back up.
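For reference, a change like that is a one-liner with sysctl. The limit below is just an example value, not the one we used; you would size it to your traffic and available memory, and it is not persistent across reboots:
# raise the conntrack table limit at runtime (example value)
sysctl -w net.ipv4.netfilter.ip_conntrack_max=262144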
After bringing the site back up, we investigated what had caused the kernel to enable connection tracking. Running lsmod showed that the connection tracking and iptables modules were indeed loaded. Another look at dmesg revealed that right before the outage the ip_tables netfilter module had been loaded:
Aug 13 14:38:27 ams-mid006.int.s-cloud.net kernel: [8623415.007818] ip_tables: (C) 2000-2006 Netfilter Core Team
Aug 13 14:38:35 ams-mid006.int.s-cloud.net kernel: [8623422.444931] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
So what happened? One of our engineers was doing some preparation for scaling that layer of our infrastructure. To verify that we don’t use any specific iptables rules on that system, he ran:
iptables -L
iptables -t nat -L
Those commands themselves are pretty harmless. They just list the configured iptables rules: the first one the rules in the filter table, the second one those in the nat table. Nothing that should change any system configuration, right? Nope. Let’s try to reproduce it. Just boot up some system (I’ve tried it on my Ubuntu laptop). No iptables module should be loaded:
root@apollon:~# lsmod|grep ipt
Now just list your iptables rules:
iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
And check again for loaded modules:
root@apollon:~# lsmod|grep ipt
iptable_filter 12810 0
ip_tables 27473 1 iptable_filter
x_tables 29846 2 iptable_filter,ip_tables
Okay, that loaded some iptables modules to make it possible to add filter rules via iptables. This shouldn’t cause any problems, since without any actual rules the impact on the kernel is negligible. But now check your nat table:
root@apollon:~# iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
Completely empty as well. But now look at your kernel modules:
root@apollon:~# lsmod|grep ipt
iptable_nat 13229 0
nf_nat 25891 1 iptable_nat
nf_conntrack_ipv4 19716 3 iptable_nat,nf_nat
nf_conntrack 81926 3 iptable_nat,nf_nat,nf_conntrack_ipv4
iptable_filter 12810 0
ip_tables 27473 2 iptable_nat,iptable_filter
x_tables 29846 3 iptable_nat,iptable_filter,ip_tables
By just listing your iptables rules for the nat table, the kernel loaded nf_conntrack, which enabled connection tracking. See dmesg:
[75024.007681] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
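To make concrete what connection tracking means here: once nf_conntrack is loaded, the kernel keeps an entry for every connection it sees. On most kernels you can peek at that table under /proc (the exact path varies; older kernels expose /proc/net/ip_conntrack instead):
# show a few of the currently tracked connections
head -3 /proc/net/nf_conntrack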
On your laptop you probably don’t care; it’s even quite convenient. On a production server that handles a large number of connections, however, the fairly small default nf_conntrack table will overflow quite fast and cause dropped connections.
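A quick way to see how close a server is to that limit is to compare the number of currently tracked connections with the configured maximum (the sysctl names may differ slightly between kernel versions):
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max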
iptables doesn’t load nf_conntrack itself; it only loads ip_tables, which in turn loads the modules it depends on via the kernel’s kmod facility.
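You can inspect that dependency chain yourself without loading anything, using a modprobe dry run against the module names from the lsmod output above:
# print the insmod steps modprobe would perform, without actually loading the modules
modprobe --show-depends iptable_nat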
But since that module loader uses user-space helpers like modprobe, the auto-loading process will honour the modprobe.d/ settings. Unfortunately there is no easy way to disable the loading of a module altogether, but there is a workaround for that.
Since we don’t need iptables at all on that system, we’ve created a /etc/modprobe.d/netfilter.conf like this:
alias ip_tables off
alias iptable off
alias iptable_nat off
alias iptable_filter off
alias x_tables off
alias nf_nat off
alias nf_conntrack_ipv4 off
alias nf_conntrack off
This will make modprobe load off instead of the actual kernel module.
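Before touching iptables again, you can check that the aliases are honoured with a modprobe dry run (-n only simulates, -v prints what would be done); instead of resolving ip_tables it should now complain about a module named off:
modprobe -n -v ip_tables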
Trying to run any iptables command should now give you:
iptables -t nat -L
FATAL: Module off not found.
iptables v1.4.12: can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.