Category Archives: Linux

Successfully downloading big files from Dropbox via Linux command-line

Recently, someone was trying to send me a 20GB virtual machine image over Dropbox. I tried a couple of times to download it using Chrome, but each time it got to 6-8GB and then came up with a connection error. Clicking on the resume button failed and then removed the file (!). Very strange, as I didn’t have any connection issues, but perhaps a route changed somewhere. I saw a number of Dropbox users complaining about this on the internet. Obviously there are other approaches, such as adding the file to your own Dropbox account and using their local program to do the sync, but because I’m just on a standard free account I couldn’t add such a large file.

Because I was using btrfs and snapper I still had a version of the half-completed download around, so I tried seeing whether standard Linux tools would be able to continue the download where it left off. It turns out that simply using wget -c enables you to resume the download (it dropped a couple of times along the way, but restarting it with the same command let the whole file download just fine). So, to download a large Dropbox file even if your internet connection is a bit flaky, simply copy the Dropbox download link and paste it into the terminal (it may require the ?dl=1 parameter after it) like:
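A minimal sketch of that command, wrapped in two small helper functions. The function names, the share URL and the file name are placeholders rather than anything Dropbox-specific; the only real pieces are wget's -c (continue) and -O (output file) options.

```shell
# Hypothetical helper: make sure a Dropbox share link asks for a direct download.
dropbox_direct_url() {
    case "$1" in
        *dl=0*) printf '%s\n' "$1" | sed 's/dl=0/dl=1/' ;;  # flip preview -> download
        *dl=1*) printf '%s\n' "$1" ;;                        # already a direct link
        *\?*)   printf '%s\n' "$1&dl=1" ;;                   # other parameters present
        *)      printf '%s\n' "$1?dl=1" ;;                   # no parameters at all
    esac
}

# resume_dropbox <share-url> <local-file>
# -c continues from the current size of <local-file>; rerun after every drop.
resume_dropbox() {
    wget -c -O "$2" "$(dropbox_direct_url "$1")"
}

# Example (placeholder URL):
#   resume_dropbox 'https://www.dropbox.com/s/XXXX/image.vmdk?dl=0' image.vmdk
```

If the connection drops, running the same resume_dropbox command again picks up from wherever the local file ends.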

Apache configuration for WKWebView API service (CORS)

Switching from UIWebView to WKWebView is great, but as it performs stricter CORS checks than standard Cordova/Phonegap it can seem at first that remote API calls are broken in your app.

Basically, before WKWebView does any AJAX request, it first sends an HTTP OPTIONS query and looks at the Access-Control-* headers that are returned to determine whether it is able to access the service. Most browsers can be made to allow all AJAX requests via a simple “Access-Control-Allow-Origin: *” header, but WKWebView is more picky. It requires that you state which methods are allowed (GET, POST, etc.) and which headers are allowed (e.g. if you are using JSON AJAX requests you probably need to use a “Content-Type: application/json” header in your main request).

Rather than having to update your API service, you can work around this in a general way using the following Apache config:
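A minimal sketch of such a config, assuming mod_headers and mod_rewrite are enabled. The method and header lists are examples only; trim them to what your API actually needs.

```apache
# Send the CORS headers on every response, including error and 204 responses
Header always set Access-Control-Allow-Origin "*"
Header always set Access-Control-Allow-Methods "GET, POST, PUT, DELETE, OPTIONS"
Header always set Access-Control-Allow-Headers "Content-Type, Authorization"
Header always set Access-Control-Max-Age "1000"

# Answer preflight (OPTIONS) requests directly from Apache
RewriteEngine On
RewriteCond %{REQUEST_METHOD} OPTIONS
RewriteRule ^(.*)$ $1 [R=204,L]
```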

Note the last line answers any HTTP OPTIONS request with blank content and returns it straight away. Most API services would cause a lot of CPU processing just to handle a single request whether it is a true request or an OPTIONS query, so we just answer this straight from Apache without bothering to send it through to the API. The R=204 is a trick to specify that we don’t return any content (HTTP 204 code means “Success, but no content”). Otherwise if we used something like R=200 it would return a page talking about internal server error, but with a 200 response which is more bandwidth, more processing and more confusing for any users.

Programming ESP8266 from the CHIP

The CHIP is a powerful $9 computer. I saw them online and ordered 5 of them some time ago as part of a potential home automation project, and because it’s always useful to have some small Linux devices around with GPIO ability. I’ve recently been playing a lot with ESP8266 devices (more on this in some future blog posts), and I’ve been using the CHIP to program them via a breadboard, the serial port header connectors (exposed as ttyS0) and esptool.py. So far so good.

However, I want to put the CHIP devices into small boxes around the house and use something like find-lf for internal location tracking based on Wifi signals emitted from phones and other devices to figure out who’s in which room. Whilst the CHIP has 2 wifi devices (wlan0, wlan1) it doesn’t allow one to run in monitor mode while the other is connected to an AP. This means we need an extra Wifi card to be in monitor mode, and as I had a number of ESP8266’s lying around, I thought I’d write a small program to just print MAC and RSSI (signal strength) via the serial port.

As these devices will be in sealed boxes I don’t want to have to go fiddling around with connectors on a breadboard to update the ESP8266 firmware, so I came up with a minimal design to allow reprogramming the ESP8266 on-the-fly from CHIP devices (it should work on anything with a few GPIO ports). The ESP8266 does have OTA update functionality, but as these devices will be in monitor mode I can’t use that. As the CHIP works at 3.3v, the same as ESP8266 chips, this was pretty straightforward, involving 6 cables and 2 resistors. There were a few steps and gotchas to be aware of first, though.

The main issue preventing this from working is that when the CHIP first boots up, the U-Boot software listens for input for 2 seconds via ttyS0 (the serial port exposed on the header, not the USB one). When power first comes on, the ESP8266 always outputs some bootloader messages via the serial port, which interrupts U-Boot and means that the CHIP would never boot. Fortunately the processor has a number of different UARTs, including a second one that can optionally be exposed via the headers. You can read all about the technical details on this thread. In short, to expose the second serial port you need to download this dtb from dropbox and use it to replace /boot/sun5i-r8-chip.dtb. You then need to download this small program to enable the port and run it on every boot. This worked fine for me on the 4.4.13-ntc-mlc kernel. You can then use the pins found listed here to connect to the tx/rx of the ESP8266 serial and it won’t affect the boot-up of the CHIP.

The other nice thing about using ttyS2 rather than ttyS0 is that there are hardware flow control ports exposed (RTS, CTS) which I had hoped could be integrated into esptool to automatically handle the reset. Unfortunately it looks like esptool uses different hardware flow control ports to signal the ESP8266 bootloader mode/reboot so I had to connect these ports to GPIOs and trigger from there.

After doing this, wire the ESP8266 (I’m using the ESP-12 board, but it should be the same for any other board) to the CHIP in the following manner:

ESP8266 pin     CHIP connector
VCC             3.3v
Gnd             Gnd
CH_PD / EN      XIO-P6
GPIO0           XIO-P7 via a resistor (eg 3.3k)
GPIO15          – via resistor (eg 3.3k)
TX              LCD-D3
RX              LCD-D2

Note that on some ESP boards the TX/RX labels are swapped, so if you don’t see anything try flipping the cables around.


I then wrote a small program (called restart_esp.py) to trigger different mode reboots of the ESP8266 from the CHIP:
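As a rough shell equivalent of the idea (not the script itself), the sketch below drives the two pins through the legacy sysfs GPIO interface. The pin numbers are assumptions: on the 4.4.13-ntc-mlc kernel XIO-P0 is usually GPIO 1013, which would make XIO-P6 1019 and XIO-P7 1020, but check this on your own kernel. The function names are hypothetical.

```shell
GPIO_ROOT="${GPIO_ROOT:-/sys/class/gpio}"
EN_PIN="${EN_PIN:-1019}"        # XIO-P6 -> CH_PD / EN (assumed number)
GPIO0_PIN="${GPIO0_PIN:-1020}"  # XIO-P7 -> GPIO0 (assumed number)

# gpio_set <pin> <0|1>: export the pin if needed and drive it as an output.
gpio_set() {
    [ -d "$GPIO_ROOT/gpio$1" ] || echo "$1" > "$GPIO_ROOT/export"
    if [ "$2" = 1 ]; then echo high; else echo low; fi > "$GPIO_ROOT/gpio$1/direction"
}

# restart_esp <flash|run>: pick the boot mode with GPIO0, then pulse EN.
restart_esp() {
    case "$1" in
        flash) gpio_set "$GPIO0_PIN" 0 ;;  # GPIO0 low at reset = serial bootloader
        run)   gpio_set "$GPIO0_PIN" 1 ;;  # GPIO0 high at reset = normal boot
        *)     echo "usage: restart_esp flash|run" >&2; return 1 ;;
    esac
    gpio_set "$EN_PIN" 0   # disable the chip
    sleep 1
    gpio_set "$EN_PIN" 1   # re-enable it; the boot mode is sampled now
}
```

Run as root, restart_esp flash leaves the ESP8266 waiting in its serial bootloader, and restart_esp run boots the installed firmware again.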

Then you can easily flash your ESP8266 from the CHIP using a command like:
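For example, assuming the ESP8266 is on the second UART (/dev/ttyS2) and has already been put into bootloader mode, a flash command could look like the sketch below. The wrapper name, the default port and the firmware file name are placeholders; write_flash, --port and --baud are standard esptool.py options.

```shell
ESPTOOL="${ESPTOOL:-esptool.py}"
ESP_PORT="${ESP_PORT:-/dev/ttyS2}"

# flash_esp <firmware.bin>: write an image at offset 0x00000.
# The ESP8266 must already be in bootloader mode (GPIO0 low while EN is pulsed).
flash_esp() {
    "$ESPTOOL" --port "$ESP_PORT" --baud 115200 write_flash 0x00000 "$1"
}

# Example: flash_esp firmware.bin
```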

Percent signs in crontab

As this little-known ‘feature’ of cron has now bitten me several times I thought I should write a note about it, both so that I’m more likely to remember it in future and so that other people can learn about it. I remember a few years ago when I was working for Webfusion we had some cronjobs to maintain the databases, and had some error message that kept popping up that we wanted to remove periodically. We set up a command looking something like:

but it was not executing. Following on from that, today I had some code to automatically create snapshots of a certain btrfs filesystem (however I recommend that for serious snapshotting you use the excellent (if a bit hard to use) snapper tool):

But it was not executing… Looking at the syslog output we see that cron is running a truncated version of it:

Looking in the crontab manual we see:

D’oh. Fortunately the fix is simple:
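crontab(5) describes the behaviour: an unescaped % in the command field is turned into a newline, and everything after the first % is passed to the command as standard input. Escaping each percent sign with a backslash avoids this. A hypothetical snapshot job (the schedule and paths are made up for illustration) would look like:

```
# Broken (shown commented out): cron cuts the command at the first %,
# so the shell only ever sees "... $(date +"
# 0 * * * * btrfs subvolume snapshot -r /data /data/.snapshots/$(date +%Y%m%d-%H%M)

# Fixed: every % is escaped with a backslash
0 * * * * btrfs subvolume snapshot -r /data /data/.snapshots/$(date +\%Y\%m\%d-\%H\%M)
```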

I’m yet to meet anyone who is using this feature to pipe data into a process run from crontab. I’m also yet to meet even very experienced sysadmins who have noticed this behaviour making this a pretty good interview question for a know-it-all sysadmin candidate!

Making a BTRFS read-only snapshot writable

For the past few years I’ve been using btrfs on most filesystems that I create. Whilst it’s pretty slow on rotating disk media, now that most of my hardware is SSD-based there’s not much of a performance penalty (as long as you’re not using quotas to track filesystem usage). The massive advantage is the ability to have proper snapshot history (unlike any LVM snapshotting hacks that you may suggest) going back a long time with very little overhead. With a tool like snapper (which admittedly is tricky to get set up) you can automatically rotate your snapshots and easily recover any files that you accidentally changed or deleted. Alongside always using git for code repositories, this has saved my skin repeatedly!

Anyway, by default snapper creates read-only snapshots. But when trying to diagnose some database server file corruption I recently experienced, I wanted to change a btrfs snapshot from read-only to read-write so I could update some files. After spending a while looking around in the manual and on Stack Overflow I couldn’t see any way to do this with the kernel/toolchain versions that I was using.

Then, the solution struck me. Simply create a read-write snapshot of the read-only snapshot and work off that. Sometimes it’s very easy to look at the more complicated way of doing things and forget about some of the easier solutions that there might be!
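In command form this is a one-liner, since btrfs subvolume snapshot creates a writable snapshot unless -r is given. The paths below are placeholders, and the wrapper function is only there for illustration.

```shell
BTRFS="${BTRFS:-btrfs}"

# make_writable_copy <read-only-snapshot> <new-writable-snapshot>
# Without -r, the new snapshot is read-write; the source stays untouched.
make_writable_copy() {
    "$BTRFS" subvolume snapshot "$1" "$2"
}

# Example (placeholder paths, run as root):
#   make_writable_copy /.snapshots/42/snapshot /.snapshots/42-rw
```

Newer btrfs-progs releases can also flip the flag in place with btrfs property set <path> ro false, but working on a writable copy leaves the original snapshot intact.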

Protecting an Open DNS Resolver

As another piece of work I’ve been doing for the excellent Strongarm anti-malware team we recently converted the service so that it can be used to get instant protection wherever you are. Part of this involved my work in converting the core (customized) DNS server into an open resolver. This is usually strongly advised against as you can unwittingly become part of some very serious Denial of Service attacks, however in this blog post I show you how to implement some pretty simple restrictions and limitations to prevent this from happening so you can run a DNS open resolver without running this risk.

Here’s a copy of the article:

One of the challenges of running an open DNS resolver is that it can be used in a number of different attacks, compared to a server that is only allowed access from a known set of IPs. One of the most well known is the DNS amplification attack. As this article explains, “The fact that a DNS reply may be many times larger than a DNS query allows the attacker to achieve amplification by spoofing a relatively small query that is known to generate a large answer in response”. That means that if I can send a DNS question that takes 50 bytes, and I send it pretending to be the computer that I want to attack, and the answer to that question is 1000 bytes, then I have effectively multiplied the traffic that I can attack with by a factor of 20. Especially as DNSSEC (Domain Name System Security Extensions) becomes more common, the RRSIG and DNSKEY record types can contain a lot of data that can be used in this type of attack.

In this post, I’d like to present a couple of ways to easily protect your open DNS resolver from being involved in malware attacks like the DNS amplification attack.

Configuring a DNS Resolver

Many DNS servers, or frontends such as PowerDNS or dnsdist, have the built-in or user-configurable ability to limit some types of attacks. In the case of dnsdist, the loadbalancer sits in front of the DNS servers and monitors the traffic going to and from them in order to blacklist hosts that are abusing the platform.

However, when configuring this within Strongarm’s servers, we wanted the ultimate scalability and flexibility on our DNS infrastructure, so we decided not to use dnsdist but instead use a pure networking approach. Here are a few steps that you can take to protect your DNS infrastructure no matter whether you use a DNS loadbalancer or servers interfacing directly to the internet.

The first step you can take in protecting your server is to ensure that ANY queries cannot be used in an attack. An ANY query returns all the records of a particular domain so naturally it returns more data than a standard query. This is usually easy to configure with an option like ‘any-to-tcp’ in PowerDNS. This setting says that if the recursive server receives an ANY query, it will automatically send back a small redirect: “TCP is required”.

To understand why this helps prevent attacks we need to understand the following three things.

  1. An ANY query will usually return larger responses as it asks for all records under a particular domain.
  2. 99% of the time, an ANY query is not legitimate traffic. Usually, a host will only want a specific type of record such as A or MX.
  3. Whereas it’s easy to spoof UDP traffic, it’s virtually impossible to spoof TCP. This is because establishing a TCP connection requires a 3-way handshake: the client says “I’d like to open a connection”, the server says “Okay, you’d like to open a connection, it’s now open”, and then the client says “Thanks, the connection is now open”. While you can spoof the initiation of the connection, when the server says “Okay, you’d like to open a connection, it’s now open,” the host that has been spoofed will reply “What?! I didn’t ask to open a connection!” and it won’t go any further.

Putting this all together, we can see that this can be a very effective preventative measure for abusing an open DNS resolver. Legitimate clients will fall back to using TCP and attackers will simply give up. We can’t use this for all connections because having to do every DNS lookup over TCP would noticeably slow down internet browsing speed, but we can do this easily enough on connections that have a high probability of being attack traffic.

In a similar vein, another useful option for many DNS servers is the ability to limit the size of a return packet over UDP. Typically, you would configure this to say, “If the return packet is more than X bytes, send a TCP redirect and only allow this over TCP.”
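As an illustration, in the PowerDNS Recursor both behaviours map to settings in recursor.conf. The threshold value is an arbitrary example, and other servers use different option names.

```
# recursor.conf (PowerDNS Recursor)
# Answer ANY queries over UDP with a truncated reply so the client retries over TCP
any-to-tcp=yes
# Truncate UDP answers larger than this many bytes, forcing a TCP retry
udp-truncation-threshold=1232
```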

Firewall Limiting of Potential Attack Traffic

In addition to doing the above, we implemented a pure firewall-based approach to throttling attack traffic. To do this, we needed to configure our firewall to be stateless, as we described how to do in a previous post.

As opposed to dnsdist or other frontend servers, this allows you to deploy either on a single server or on a frontend router that covers multiple resolvers. This also should be much more efficient as all processing occurs in-kernel via netfilter rather than having to go through a program which may crash or be somehow limited in the speed at which it can process data. As we showed in a previous post this is very efficient at packet processing.

We start by creating an ‘ipset’ of IPs that we have currently blacklisted. We’ll use the ‘timeout’ option to specify that after we have added an IP into this blacklist, it will automatically expire after a certain time. We’ll also limit it to a maximum 100,000 IPs so that an attacker cannot use this to take our server offline:

Then, if an IP is on this list, we’ll block it from doing any UDP traffic to our server:
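A minimal sketch of these two steps. The set name dns_blacklist, the 600-second timeout (a ten-minute block) and the use of the filter table's INPUT chain are assumptions; with a stateless setup you may prefer to place the rule elsewhere. The run helper only exists so that the commands can be printed with DRY_RUN=1 before being applied as root.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

setup_dns_blacklist() {
    # Entries expire after 600 s; the set never holds more than 100,000 IPs.
    run ipset create dns_blacklist hash:ip timeout 600 maxelem 100000
    # Drop all UDP traffic coming from any address currently in the set.
    run iptables -A INPUT -p udp -m set --match-set dns_blacklist src -j DROP
}

# Example: DRY_RUN=1 setup_dns_blacklist   (prints the two commands)
```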

Now for the clever part: we’ll look for DNS responses that are over a certain threshold packet size (700 bytes) and start monitoring them to see the rate at which someone is sending them:

This points to a new iptables chain called “LARGE_DNS_PACKET_TRACKING” which we’ll set up as follows:

This first rule allows up to 50kb of large DNS responses per minute to a single IP (the 32 means a /32, i.e. a single IP address), and always allows the first 10 large response packets through. Again, it tracks, at most, 100,000 IPs in order to avoid an attack vector against our server.

After a host goes over this threshold, we’ll pass the traffic through to the next stage of the chain:

This is where the magic happens. If the client breaches the threshold set above, then it will add its IP to the ipset we created earlier, meaning that it will be blocked for 10 minutes. Finally, let’s note this in the system log and then drop the packet:
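Pulling the tracking steps together, a minimal sketch might look like the following. It assumes an ipset called dns_blacklist already exists, uses the filter table's OUTPUT chain, and expresses the limit as a packet rate (50 large responses per minute with a burst of 10) because, as far as I am aware, hashlimit's byte-based rates are given per second. The hashlimit name and log prefix are made up, and the run helper just lets you preview the commands with DRY_RUN=1.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

setup_large_dns_tracking() {
    run iptables -N LARGE_DNS_PACKET_TRACKING
    # Send DNS responses of 700 bytes or more to the tracking chain.
    run iptables -A OUTPUT -p udp --sport 53 -m length --length 700:65535 \
        -j LARGE_DNS_PACKET_TRACKING
    # Per destination IP (/32): accept while under the rate, with a burst of 10.
    run iptables -A LARGE_DNS_PACKET_TRACKING -m hashlimit \
        --hashlimit-name large_dns --hashlimit-mode dstip --hashlimit-dstmask 32 \
        --hashlimit-upto 50/minute --hashlimit-burst 10 \
        --hashlimit-htable-max 100000 -j ACCEPT
    # Over the limit: blacklist the client, log it, and drop the packet.
    run iptables -A LARGE_DNS_PACKET_TRACKING -j SET --add-set dns_blacklist dst
    run iptables -A LARGE_DNS_PACKET_TRACKING -j LOG --log-prefix "DNS-LIMIT: "
    run iptables -A LARGE_DNS_PACKET_TRACKING -j DROP
}

# Example: DRY_RUN=1 setup_large_dns_tracking   (prints the six commands)
```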

Conclusions

With the right protection in place, it’s not such a bad thing to run an open DNS resolver on the internet. If you look in your server’s configuration manual, you should find a few options that can also help in preventing attacks. Additionally, we recommend setting up a firewall-based system like I detailed above so that you can limit the amount of traffic you send out. Otherwise, you may easily find your server being disconnected by your ISP for being part of an attack.

Easily switch between KVM and VirtualBox virtual machines

I’ve done quite a bit of development recently in Android and have also been working with a client who has a local virtual environment using Oracle/Sun’s VirtualBox VM. So, I found myself switching between the two platforms quite frequently, which unfortunately requires removing and reinstalling kernel modules. So, I wrote the below shell script to switch between the two platforms. Simply put it in a directory in $PATH (I always have ~/bin there for my user-local scripts) and call the script something like switch_vm. Use it like:

Here’s the script:
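A minimal sketch of such a script. The module names are assumptions (kvm_intel on Intel hosts, kvm_amd on AMD; vboxdrv, vboxnetflt and vboxnetadp for VirtualBox) and vary between versions, so adjust the lists to what lsmod shows on your machine. The kvm and vbox argument names are also hypothetical. Set DRY_RUN=1 to print the modprobe calls instead of running them.

```shell
# Modules are listed in unload order (dependents first); modprobe resolves
# dependencies itself when loading.
KVM_MODULES="${KVM_MODULES:-kvm_intel kvm}"
VBOX_MODULES="${VBOX_MODULES:-vboxnetflt vboxnetadp vboxdrv}"

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# swap_modules <modules to unload> <modules to load>
swap_modules() {
    for m in $1; do run modprobe -r "$m"; done
    for m in $2; do run modprobe "$m"; done
}

# switch_vm <kvm|vbox>: make the named hypervisor the active one (run as root).
switch_vm() {
    case "$1" in
        kvm)  swap_modules "$VBOX_MODULES" "$KVM_MODULES" ;;
        vbox) swap_modules "$KVM_MODULES" "$VBOX_MODULES" ;;
        *)    echo "usage: switch_vm kvm|vbox" >&2; return 1 ;;
    esac
}

# When saved as ~/bin/switch_vm, finish the file with:  switch_vm "$@"
```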