Category Archives: Networking

Complex network setup on Centos/Redhat 8 via Network Manager

Here are some notes, mostly for my future self about how to set up bond + vlan + bridge networks in Centos 8 in order to create a highly resilient virtual machine host server.

Firstly, create the bond interface and set the options:

nmcli con add type bond ifname bond0 bond.options "mode=802.3ad,miimon=100" ipv4.method disabled ipv6.method ignore connection.zone trusted

Note the 802.3ad option – this says we are going to be using the LACP protocol which requires router support, however you can look at the different mode options in the kernel documentation.

Then, we add the required interfaces into the bond – in this case enp131s0f0 and enp131s0f1

nmcli con add type ethernet ifname enp131s0f0 master bond0 slave-type bond connection.zone trusted
nmcli con add type ethernet ifname enp131s0f1 master bond0 slave-type bond connection.zone trusted

If you want to create a standard vlan interface over the top of the bond for vlan tag 123 we can do this just with 1 more command:

nmcli con add type vlan ifname bond0.123 dev bond0 id 123 ipv4.method manual ipv4.addresses a.b.c.d/24 ipv4.gateway a.b.c.d ipv4.dns 8.8.8.8 connection.zone public

Alternatively if you want to set up a bridge on this interface:

nmcli con add ifname br.123 type bridge ipv4.method manual ipv4.addresses a.b.c.d/24 ipv4.gateway a.b.c.d ipv4.dns 8.8.8.8 \
        bridge.stp no bridge.forward-delay 0 connection.zone public
nmcli con add type vlan ifname bond0.123 dev bond0 id 123 master br.123 slave-type bridge connection.zone trusted

If you don’t want the default route or DNS, simply remove the ipv4.gateway and ipv4.dns options above.

If you want to create a bridge which doesn’t have any IP address on the host (just bridges through to the host vm’s) then replace all the ipv4 settings with: ipv4.method disabled ipv6.method ignore.

Dumping all remote mysql queries to a server

For a project recently we needed to see which users were accessing which databases on a remote mysql server. After a little bit of reading and experimenting I came up with this command – hopefully it will be useful for someone in the future.

tshark -i any -n \
    -d tcp.port==3306,mysql \
    -T fields -e timestamp -e ip.src -e mysql.user -e mysql.schema -e mysql.query \
    -f "dst port 3306" \
    -Y "len(mysql.user) > 0 || len(mysql.query) > 0" 

This uses the excellent wireshark tool to capture traffic. The -f flag specifies a filter to only inspect inbound traffic to the server (as we don’t care about the responses).

The line beginning -T specifies the fields we wish to dump (in a tab-separated fashion), -d ensures that all traffic will attempt to be decoded as mysql. The final -Y flag ensures that only packets with queries or logins will be dumped, rather than TCP overhead or placeholder binding etc.

Note that user information is only given at the beginning of a session. Schema (database name) information may be on that same line, or you may have to scan the queries to look for a USE database type command as there are two different ways of setting the current database that will be accessed.

Protecting an Open DNS Resolver

As another piece of work I’ve been doing for the excellent Strongarm anti-malware team we recently converted the service so that it can be used to get instant protection wherever you are. Part of this involved my work in converting the core (customized) DNS server into an open resolver. This is usually strongly advised against as you can unwittingly become part of some very serious Denial of Service attacks, however in this blog post I show you how to implement some pretty simple restrictions and limitations to prevent this from happening so you can run a DNS open resolver without running this risk.

Here’s a copy of the article:

One of the challenges of running an open DNS resolver is that it can be used in a number of different attacks, compared to a server that is only allowed access from a known set of IPs. One of the most well known is the DNS amplification attack. As this article explains, “The fact that a DNS reply may be many times larger than a DNS query allows the attacker to achieve amplification by spoofing a relatively small query that is known to generate a large answer in response”. That means that if I can send a DNS question that takes 50 bytes, and I send it pretending to be the computer that I want to attack, and the answer to that question is 1000 bytes, then I have effectively multiplied the traffic that I can attack with by 20 times. Especially as DNSSEC (Domain Name System Security Extensions) become more common, the RRSIG and DNSKEY DNS response codes can contain a lot of data that can be used in this type of attack.

In this post, I’d like to present a couple of ways to easily protect your open DNS resolver from being involved in malware attacks like the DNS amplification attack.

Configuring a DNS Resolver

Many DNS servers, or frontends such as PowerDNS or dnsdist, have the built-in or user-configurable ability to limit some types of attacks. In the case of dnsdist, the loadbalancer sits in front of the DNS servers and monitors the traffic going to and from them in order to blacklist hosts that are abusing the platform.

However, when configuring this within Strongarm’s servers, we wanted the ultimate scalability and flexibility on our DNS infrastructure, so we decided not to use dnsdist but instead use a pure networking approach. Here are a few steps that you can take to protect your DNS infrastructure no matter whether you use a DNS loadbalancer or servers interfacing directly to the internet.

The first step you can take in protecting your server is to ensure that ANY queries cannot be used in an attack. An ANY query returns all the records of a particular domain so naturally it returns more data than a standard query. This is usually easy to configure with an option like ‘any-to-tcp’ in PowerDNS. This setting says that if the recursive server receives an ANY query, it will automatically send back a small redirect: “TCP is required”.

To understand why this helps prevent attacks we need to understand the following three things.

  1. An ANY query will usually return larger responses as it asks for all records under a particular domain.
  2. 99% of the time, an ANY query is not legitimate traffic. Usually, a host will only want a specific type of record such as A or MX.
  3. Whereas it’s easy to spoof UDP traffic, it’s virtually impossible to spoof TCP. This is because establishing a TCP connection requires a 3-way handshake. For example, if the client says “I’d like to open a connection”, and the server says “Okay, you’d like to open a connection, it’s now open”, then the client says, “Thanks, the connection is now open”. While you can spoof the initiation of the connection, when the server says “Okay, you’d like to open a connection, it’s now open,” the host that has been spoofed will reply “What?! I didn’t ask to open a connection!” and it won’t go any further.

Putting this all together, we can see that this can be a very effective preventative measure for abusing an open DNS resolver. Legitimate clients will fall back to using TCP and attackers will simply give up. We can’t use this for all connections because having to do every DNS lookup over TCP would noticeably slow down internet browsing speed, but we can do this easily enough on connections that have a high probability of being attack traffic.

In a similar vein, another useful option for many DNS servers is the ability to limit the size of a return packet over UDP. Typically, you would configure this to say, “If the return packet is more than X bytes, send a TCP redirect and only allow this over TCP.”

Firewall Limiting of Potential Attack Traffic

In addition to doing the above, we implemented a pure firewall-based approach to throttling attack traffic. To do this, we needed to configure our firewall to be stateless, as we described how to do in a previous post.

As opposed to dnsdist or other frontend servers, this allows you to deploy either on a single server or on a frontend router that covers multiple resolvers. This also should be much more efficient as all processing occurs in-kernel via netfilter rather than having to go through a program which may crash or be somehow limited in the speed at which it can process data. As we showed in a previous post this is very efficient at packet processing.

We start by creating an ‘ipset’ of IPs that we have currently blacklisted. We’ll use the ‘timeout’ option to specify that after we have added an IP into this blacklist, it will automatically expire after a certain time. We’ll also limit it to a maximum 100,000 IPs so that an attacker cannot use this to take our server offline:

ipset create throttled-ips hash:ip timeout 600 family inet maxelem 100000

Then, if an IP is on this list, we’ll block it from doing any UDP traffic to our server:

iptables -t raw -A PREROUTING -p udp -m set --match-set throttled-ips src -j DROP

Now for the clever part: we’ll look for DNS responses that are over a certain threshold packet size (700 bytes) and start monitoring them to see the rate at which someone is sending them:

iptables -N LARGE_DNS_PACKET_TRACKING # Create the destination chain
iptables -A OUTPUT -p udp --sport 53 \
        -m length --length 700:0xffff \
        -j LARGE_DNS_PACKET_TRACKING

This points to a new iptables chain called “LARGE_DNS_PACKET_TRACKING” which we’ll set up as follows:

iptables -A LARGE_DNS_PACKET_TRACKING -m hashlimit --hashlimit-mode dstip --hashlimit-dstmask 32 \
   --hashlimit-upto 50kb/min --hashlimit-burst 10 --hashlimit-name large-dns-packets --hashlimit-htable-max 100000 \
   -j ACCEPT

This first rule allows up to 50kb of large DNS responses per minute to a single IP (the 32 means a /32, i.e. a single IP address), and always allows the first 10 large response packets through. Again, it tracks, at most, 100,000 IPs in order to avoid an attack vector against our server.

After a host goes over this threshold, we’ll pass the traffic through to the next stage of the chain:

iptables -A LARGE_DNS_PACKET_TRACKING -j SET --add-set throttled-ips dstip --timeout 600 --exist

This is where the magic happens. If the client breaches the threshold set above, then it will add its IP to the ipset we created earlier, meaning that it will be blocked for 10 minutes. Finally, let’s note this in the system log and then drop the packet:

iptables -A LARGE_DNS_PACKET_TRACKING -j LOG --log-prefix "DNS-amplification protection: "
iptables -A LARGE_DNS_PACKET_TRACKING -j DROP

Conclusions

With the right protection in place, it’s not such a bad thing to run an open DNS resolver on the internet. If you look in your server’s configuration manual, you should find a few options that can also help in preventing attacks. Additionally, we recommend setting up a firewall-based system like I detailed above so that you can limit the amount of traffic you send out. Otherwise, you may easily find your server being disconnected by your ISP for being part of an attack.

Easily switch between KVM and VirtualBox virtual machines

I’ve done quite a bit of development recently in Android and also been working with a client who has a local virtual environment using Oracle/Sun’s VirtualBox vm. So, I found myself switching between the two platforms quite frequently which unfortunately requires removing and reinstalling kernel modules. So, I wrote the below shell script to switch between the two platforms. Simply put in a directory in $PATH (for me I always have ~/bin as a directory there for my user-local scripts) and call the script something like switch_vm. Use it like:

switch_vm virtualbox
switch_vm kvm

Here’s the script:

#!/bin/bash
VM=$1

if [ "$VM" = "kvm" ]; then
    sudo rmmod vboxpci vboxnetflt vboxnetadp vboxdrv
    sudo modprobe kvm_intel
fi

if [ "$VM" = "virtualbox" ]; then
    sudo rmmod kvm_intel kvm
    for i in vboxdrv vboxpci vboxnetflt vboxnetadp; do
        sudo modprobe $i
    done
fi

Testing the Performance of the Linux Firewall

Over on the Strongarm blog I’ve got an in-depth article about testing the performance of the Linux firewall. We knew it was fast, but how fast? The answer is perhaps surprisingly “significantly faster than the kernel’s own packet handling” – blocking packets to an un-bound UDP port was 2.5* faster than accepting them, and in that case a single-core EC2 server managed to process almost 300kpps. We also tested the performance of blocking DDoS attacks using ipset and/or string matching.

I’ve archived the article below:

As we mentioned in a previous post, at Strongarm, we love the Linux firewall. While a lot of effort has gone into making the firewall highly flexible and suitable for nearly every use case, there is very little information out there about the actual performance of the firewall. There are many studies on the performance of the Linux network stack, but they don’t get to the heart of the performance of the firewall component itself.

With the core part of the Strongarm product focused on making highly reliable DNS infrastructure, we thought we’d take a look at the Linux firewall, especially as it relates to UDP performance.

Preparing to Test

To begin, we wrote a small C program that sends 128-byte UDP packets very efficiently to the loopback device on a single source/destination port. We tested this with one of the latest Ubuntu 16.04 4.4-series kernels on a small single-core EC2 instance. We ran each test multiple times over a period of 30 seconds to ensure that the CPU was pegged at 100% throughout, verifying that we were CPU-bound. We then averaged the performance results to present the values shown here, although, because this was all software-based on the same box, there was very low deviation from the average between test runs.

Here are the results we found and some potential conclusions you can draw from them to optimize your own firewall.

NOTRACK Performance

First, we ran the packet generator with no firewall installed but with the connection tracking modules loaded. We saw a rate of 87k PPS (Packets Per Second). Then, we added the following rules to disable connection tracking on the UDP packets (e.g. converting our firewall to be stateless):

iptables -t raw -I PREROUTING -j NOTRACK
iptables -t raw -I OUTPUT -j NOTRACK

With this, we saw the firewall perform at 106k PPS, or a 20% speedup, as I first reported in this blog post. We then disabled connection state tracking for all future tests.

Conclusion: Where possible, use a stateless firewall.

Maximum Possible Performance

We are primarily interested in the performance of blocking UDP based attacks, so we focused on testing various DROP rules with connection state tracking disabled.

First, to get a baseline performance with regards to packet dropping, we wrote a simple rule to drop packets and put it as the first entry on the very first part of the iptables processing stack, the “raw PREROUTING” table:

iptables -t raw -I PREROUTING -p udp -i lo -j DROP

This produced a baseline performance of 280k PPS for the firewall when dropping packets! This is a massive 260% increase in performance over just having a blank firewall in place. This shows that by putting blocks earlier on in the process, you can bypass kernel work, resulting in a significant speedup.

We then did the same drop, but on the default ‘filter’ table:

iptables -I INPUT -p udp -i lo -j DROP

This produced a performance of 250k PPS, or roughly a 12% performance drop.

Conclusion: Drop unexpected packets in your firewall, rather than letting them go further down the network stack. It doesn’t matter as much if you put them in the ‘raw’ table or the default ‘filter’ table, but raw does give slightly better performance.

When we redid this with the ‘raw’ table and inserted 5 non-matching rules, the performance dropped to 250k PPS, or 12% less compared to the baseline.

Additionally, we concluded from this that you should try to put your most common rules towards the top of your firewall, as position does make a difference. However, it is not too significant.

ipset Performance

If you’re under attack and you insert a rule in the firewall for every single IP address you wish to block, these rules are processed one-at-a-time for each packet. So instead of doing this, you can use the ipset command to create a lookup table of IPs. You can specify how these are to be stored, but if you use the most efficient option (a hash) you can make operations on sets of IPs scale very well. At least, this is the theory, so let’s see how it performs in practice:

ipset create blacklist hash:ip counters family inet maxelem 2000000
iptables -t raw -A PREROUTING -m set --match-set blacklist src -j DROP

First, we just added in the loopback’s IP address so it would behave the same as the baseline. This produced a result of 255k PPS, or 10% slower than the baseline of blocking all traffic on this interface. Then, we added another 65k entries to the blacklist using the following small script:

perl -E 'for $a (0..255){ for $b (0..255) { say "add blacklist 1.1.$a.$b" } }' |ipset restore

The result was again 255k PPS. We added another 200k entries and the performance stayed the same.

Conclusion: Using ipset’s to do operations on groups of IPs is very efficient and scales well, especially for blacklisting groups of IP addresses.

String Matching Performance

One of the more advanced abilities of the Linux firewall is filtering packets based on looking at the contents (packet inspection). I have often have seen DNS DDoS attacks performed against a particular domain. In this case, matching the string and blocking all packets containing this string is one of the easiest ways to stop an attack in its tracks. But, I assumed that performing a string search in every packet would be pretty inefficient:

iptables -t raw -I PREROUTING -p udp -m string --algo kmp --hex-string 'foo.com' -j DROP

(Note that this is just fake traffic. Real DNS traffic doesn’t encode domains with dot separators)

There are two possible algorithms (kmp and bm) in the string matching module. The UDP generator was sending 128-byte packets. The results are: bm handling 255k PPS traffic (10% lower than baseline) and kmp: handling 235k PPS (17% lower).

Conclusion: String-based packet filtering was much more efficient than we expected and a great way to stop DDoS attacks where a common string can be found. You’ll want to use the ‘bm’ algorithm for best performance.

Hashlimit Performance

Finally, we tested the performance of hashlimit, which is a great first layer of defense against DDoS attacks. We will say “Limit each /16 (65,000 server) block on the internet to only be allowed to send 1000 PPS of UDP traffic”:

iptables -t raw -A PREROUTING -p udp -m hashlimit --hashlimit-mode srcip --hashlimit-srcmask 16 --hashlimit-above 1000/sec --hashlimit-name drop-udp-bursts -j DROP

Bearing in mind that our server only handles 106k PPS of traffic not blocked by the firewall, we would expect a ~1% performance drop just because we’re now allowing 1k PPS through, so we’ll use a baseline of, say, 275k PPS. With this rule, we saw 225k PPS processed by the server (1k PPS accepted, the rest rejected), so 20% lower performance, but it will give quite a bit of protection against a DDoS attack.

Conclusion: Consider using the hashlimit module to do basic automatic anti-DDoS protection on your servers. The 20% overhead is generally worth it.

Summary of Results

Method Performance (PPS) Cost (%)
Accepting all packets
Connection tracking disabled 106k
Connection tracking enabled 87k 20%
Rejecting all packets
Dropping at very start of raw table 280k
Dropping after 5 rules 250k 12%
Dropping in normal (filter) table 250k 12%
Ipset (1-250k entries) 255k 10%
String matching: bm 255k 10%
String matching: kmp 235k 17%
Hashlimit (1k PPS accepted) 225k 20%

Drawing Final Conclusions

Linux iptables are incredibly flexible, and for the most part, very scalable. Even on one of the smallest single-core virtual machines in EC2, the Linux firewall can handle filtering over 250k PPS. In fact, if you put rules right at the entry of the netfilter stack, you can reject packets over twice as fast as the kernel itself can process them. Even some of the modules which have greater algorithmic complexity, for example hashlimit and string matching, performed much better than we expected.