Linuxbox.co.uk is the home of Peter Smith, a UK-based Linux consultant, web developer and author, specialising in performance and security. This blog is an overflow for snippets of information not worthy of full articles on the main site. For rates and availability visit http://linuxbox.co.uk/

MySQL/MariaDB open_files_limit on CentOS 7

In MySQL (and by extension MariaDB), “errno: 24” indicates too many open files. The solution would seem simple: increase open_files_limit. This variable is read-only, so you won’t be able to increase it on-the-fly from the MySQL CLI; rather you need to edit open_files_limit in the [mysqld] section of /etc/my.cnf. However, on CentOS 7 (and most likely other distros that use systemd), systemd imposes its own file limit and an extra step is needed…
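
The my.cnf side of things is just a single line (the value below is illustrative, and matches the LimitNOFILE setting used later):

[mysqld]
open_files_limit = 10000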

First, create /etc/systemd/system/mariadb.service.d/limits.conf with the following contents:

[Service]
LimitNOFILE=10000

Save, and reload systemd:

systemctl --system daemon-reload

Now restart Maria:

systemctl restart mariadb

Log in to MySQL over the CLI and you should find open_files_limit has increased (to the value you've specified in my.cnf or the LimitNOFILE setting, whichever is lower).
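
To confirm the change took effect, something along these lines should do (assuming the unit is called mariadb and you can run mysql as root):

systemctl show mariadb | grep LimitNOFILE
mysql -e "SHOW GLOBAL VARIABLES LIKE 'open_files_limit';"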

vBulletin 5 and PHP 7

If you’re running vBulletin, the performance improvements of PHP7 make switching to it a worthwhile task. At the time of writing PHP7 isn’t officially supported in vBulletin 5 yet (apparently it’s work in progress), but so far I’ve only encountered one issue, and it was easy enough to fix:
function name must be a string at xxxx

The issue stems from a change in the order of evaluation when accessing indirect variables. See https://secure.php.net/manual/en/migration70.incompatible.php

This occurs in two places in vBulletin 5:

/includes/vb5/template/bbcode.php
/core/includes/class_bbcode.php

Search for the line:

$pending_text= $this->$tag_info['callback']($open['data'], $open['option']);

and replace with:

$function = $tag_info['callback'];
$pending_text = $this->$function($open['data'], $open['option']);

So far this is the only issue that I've encountered. Don't forget that APC no longer exists in PHP7, so make sure your config.php doesn't use this as the datastore.
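
If you want to double-check your install before and after patching, a quick grep does the job (the config.php locations are assumptions – adjust to wherever yours live):

grep -Fn '$this->$tag_info' includes/vb5/template/bbcode.php core/includes/class_bbcode.php
grep -in 'apc' includes/config.php core/includes/config.php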

Multiple PHP versions in Plesk, CentOS 6.6

Supporting multiple PHP versions in a standard Apache + mod_php setup has always been a bit fiddly. The quick fix is to install the additional PHP version as a CGI, but performance is lousy, and FastCGI is a much better (if slightly more involved to set up) option.

With recent versions of Plesk now allowing PHP to run as either mod_php, CGI or FastCGI on a per-domain basis, things just got a lot easier. Once you have the FastCGI infrastructure in place, adding additional versions of PHP is made much simpler (and more sensible). You still have to go through the chore of compiling PHP from source, though, which is where David Behan’s shell script comes in. It installs the usual build dependencies and sets PHP compile options that should suit most users.

The only problem I’ve hit with the script is mcrypt.h missing on CentOS 6.6. You’ll find it in the libmcrypt-devel RPM, but this isn’t carried in the default CentOS repos. Instead you can grab this package from:

http://download.fedoraproject.org/pub/epel/6/i386/ or
http://download.fedoraproject.org/pub/epel/6/x86_64/

depending on your platform, or add the EPEL repo. But for a single RPM it’s easier just to wget it.
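
For example (the package filenames below are illustrative – check the EPEL 6 directory for the current versions, and note that libmcrypt-devel also needs the libmcrypt base package):

wget http://download.fedoraproject.org/pub/epel/6/x86_64/libmcrypt-2.5.8-9.el6.x86_64.rpm
wget http://download.fedoraproject.org/pub/epel/6/x86_64/libmcrypt-devel-2.5.8-9.el6.x86_64.rpm
yum localinstall libmcrypt-*.rpm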

Migrating Ubuntu 10.04 to 12.04 (or 14.04)

With Ubuntu 10.04 LTS reaching end-of-life next month (in April 2015), I’ve had several clients ask me about the best way forward.

With so many memories from 10 or 15 years ago of the perils of attempting a distro upgrade over SSH (especially with Fedora, which isn't really suited to server use IMHO), I'm always a bit wary of the prospect. I often suggest to clients that now might be a good time to consider renewing their hosting contract: typically the hosting company will be offering more powerful servers for the same price that the client is currently paying for their five-year-old machine.

The good news is that upgrading Ubuntu 10.04 LTS is relatively painless:

apt-get install update-manager-core

Next, edit /etc/update-manager/release-upgrades and make sure it contains Prompt=lts.
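
If you'd rather set it non-interactively, something like this should do the job (assuming the stock file, where the option is capitalised as Prompt):

sed -i 's/^Prompt=.*/Prompt=lts/' /etc/update-manager/release-upgrades

Then run: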

apt-get update
apt-get upgrade
apt-get autoremove
do-release-upgrade

After a couple of hours you should then be at Ubuntu 12.04 (confirm with cat /etc/lsb-release). You can now repeat the process to move to 14.04 LTS, or stick with 12.04 LTS which is supported until April 2017.

Plesk 12 on RHEL 7.1

Attempting to install Plesk 12 on a fresh, minimal RHEL 7.1 server throws the following error:

Exception: Failed to solve dependencies:
php-mbstring-5.4.16-21.el7.x86_64 requires php-common(x86-64) = 5.4.16-21.el7
ERROR: The Yum utility failed to install the required packages.
Attention! Your software might be inoperable.
Please, contact product technical support.

A solution is to add the stock repos from CentOS. Create /etc/yum.repos.d/CentOS-Base.repo and add the following content:

[base]
name=CentOS-7 - Base
mirrorlist=http://mirrorlist.centos.org/?release=7&arch=$basearch&repo=os
baseurl=http://mirror.centos.org/centos/7/os/$basearch/
enabled=1
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5
priority=1
#released updates 
[updates]
name=CentOS-7 - Updates
mirrorlist=http://mirrorlist.centos.org/?release=7&arch=$basearch&repo=updates
#baseurl=http://mirror.centos.org/centos/7/updates/$basearch/
enabled=1
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5
priority=1
#packages used/produced in the build but not released
[addons]
name=CentOS-7 - Addons
mirrorlist=http://mirrorlist.centos.org/?release=7&arch=$basearch&repo=addons
#baseurl=http://mirror.centos.org/centos/7/addons/$basearch/
enabled=0
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5
priority=1
#additional packages that may be useful
[extras]
name=CentOS-7 - Extras
mirrorlist=http://mirrorlist.centos.org/?release=7&arch=$basearch&repo=extras
#baseurl=http://mirror.centos.org/centos/7/extras/$basearch/
enabled=1
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5
priority=1
#additional packages that extend functionality of existing packages
[centosplus]
name=CentOS-7 - Plus
mirrorlist=http://mirrorlist.centos.org/?release=7&arch=$basearch&repo=centosplus
#baseurl=http://mirror.centos.org/centos/7/centosplus/$basearch/
enabled=1
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5
priority=2
#contrib - packages by Centos Users
[contrib]
name=CentOS-7 - Contrib
mirrorlist=http://mirrorlist.centos.org/?release=7&arch=$basearch&repo=contrib
#baseurl=http://mirror.centos.org/centos/7/contrib/$basearch/
enabled=0
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5
priority=2
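
Once the file is saved, it's worth refreshing the yum metadata so the new repositories are picked up:

yum clean all
yum makecache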

Now re-run the Plesk autoinstaller, and the problem should be solved.

The Great PCI DSS Scam

If you run an e-commerce site and handle card payments yourself (rather than using a third party such as PayPal, SagePay etc), the phrase "PCI DSS compliance" is likely to make your heart sink. This annual ritual is essentially a security audit forced on you by the Payment Card Industry (basically a group of banks) who want to ensure that you are taking adequate measures to keep their customers' card details safe. DSS stands for Data Security Standard: a set of requirements for anyone who processes cards. If you want to be able to process cards issued by any of the major card brands (Visa, AMEX, MasterCard etc), you need to be certified as DSS compliant.

So how do you become PCI DSS compliant? The task is handled by a QSA (Qualified Security Assessor) – a third-party security assessor, approved by the PCI, who performs a security scan of your server and gives you a pass or fail. You can then wave this certificate around and say "look, I'm PCI DSS compliant, let me process credit cards".

Things Aren’t So Simple In Practice …

The problem with PCI DSS is that the security benefits are debatable, and the whole thing seems to be as much about generating money for the card industry and the QSAs who feed off it as it is about security. Everyone profits except you, the website owner. Let's look at these points in more detail.

First, the security scan. If you think that a QSA is a team of dedicated ethical hackers who thrive on the buzz of finding ingenious ways to penetrate your security, you've got it all wrong; they are businesses who are in it for the money. That impressive-looking 100-page report that you received from the QSA is actually just a Nessus report which they've tarted up by adding their logo to the bottom of each page (for those of you who don't know, Nessus is a very good, freely available security scanner). I've seen clients quoted thousands of pounds for such a scan, which involves little more than entering an IP address and clicking 'go'.

Nessus is a great tool, but there is only so much it can reliably tell from a remote scan. For instance, Nessus will grab the version number from the Apache banner and compare it against its extensive list of known exploits. If it finds a match, it reports it; for instance, "You appear to be running Apache x.x. This version of Apache has a vulnerability in mod_env which allows a remote attacker to gain a remote shell on your server". Great, except banner strings aren't reliable. If you use a distro like RHEL, CentOS, Fedora, Debian or Ubuntu (and these distros make up the vast majority of the server market) that implements security back-porting, the exploit could have been fixed, but the Apache version number will not have been changed. QSAs don't know what back-porting is though, and will report that you *have* got a vulnerability. The onus is then on you to prove you haven't (for instance, by quoting the RPM changelog). Should it not be the other way around? If you think I have an exploit which allows you to gain a remote shell on my box, prove it; don't force me to waste time disproving it.
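
For what it's worth, the changelog check takes seconds on an RPM-based distro (the CVE number below is purely illustrative):

rpm -q --changelog httpd | grep -i cve
rpm -q --changelog httpd | grep -i cve-2011-3368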

Or you can just turn your Apache banner off; then the scan doesn't report any problems with your Apache version, even if you are running an old, exploit-ridden version of Apache. If the QSAs were in any way professional, they would notice you were trying to trick them by turning the banner off and would manually intervene: attempt to establish the Apache version in other ways (for instance, by looking at the default Apache that ships with your distro), check whether that has any known exploits, then explain this to you. But they don't, because they are stupid and lazy, and only interested in your money.
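
Incidentally, it only takes a moment to see what your own banner gives away (the hostname below is a placeholder):

curl -sI http://www.example.com/ | grep -i '^Server:'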

A couple of years ago, I remember how one of my clients failed a PCI DSS scan because of (amongst other things) an exploit in Apache's mod_proxy_ftp. mod_proxy_ftp isn't really used that much, and it would be rare to find it enabled on a web server (but yes, I appreciate that many security issues come about because things have been accidentally turned on). Needless to say, my client didn't have it enabled. Convincing the QSA that we weren't vulnerable BECAUSE WE DIDN'T HAVE THE BLOODY THING LOADED proved to be tedious. In the end we had to send them a copy of httpd.conf (even though we could have been loading the module in one of the other Apache config files). Again, shouldn't the onus be on the QSA to prove that we are vulnerable, not on us to prove that we aren't?
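
For reference, demonstrating that a module isn't loaded is a one-liner (the binary may be apachectl or apache2ctl depending on your distro):

httpd -M 2>/dev/null | grep -i proxy_ftp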

So, the QSA has performed a scan of your server, and they've sent you a pretty report showing the areas you failed on (because you will fail). You then have the choice of asking the QSA to remedy the problems (so let's get this straight: they do a half-arsed scan of your server, finding dozens of non-existent security holes, then charge you even more money to prove that the alleged security holes don't actually exist), or hiring someone like me. I have half a dozen clients who come to me each year for help with their DSS scan, and I always feel bad that they are even having to pay me. Over the years I've only ever seen one genuine issue (it's one that comes up a lot): use of weak SSL ciphers in Apache and SSL-aware services such as POP3S and IMAPS; the rest of the time it's just a big pile of false positives, and I have to spend an hour or two trawling through RPM changelogs to prove that the alleged vulnerabilities have been fixed.

Some of the alleged vulnerabilities are just plain silly. SSL exploits which the change logs show were fixed over a decade ago, bugs that are only present on x86 architecture when the target server is actually x64, exploits which don’t affect the distro in question etc. Then there are web exploits. This is one area where I’ve always found Nessus to be weak, perhaps because there is so much variety in how people set up their websites. For instance, the scan might check for the presence of /cgi-bin/oldbuggyscript.pl. If it doesn’t receive a 404, it assumes the script exists. If you’re doing something funky with your .htaccess, such as redirecting back to your homepage or sending a custom 404 page (but forgetting to actually send a 404 header), you’ll fail.
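
If you want to check how your own server behaves, something like this shows the actual status code returned for a non-existent path (the URL is a placeholder):

curl -s -o /dev/null -w '%{http_code}\n' http://www.example.com/cgi-bin/oldbuggyscript.pl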

Again, the QSA could easily eliminate many of these false positives by manually reviewing the report and using a bit of common sense. "Is it likely that the client is running oldbuggyscript.pl, a popular script back in the 90s which has been known to be vulnerable for over 15 years? No, so let's investigate a bit more and see if this really is the case". But they don't; instead they hand the client a report full of scare-mongering, and with any luck the client – who may not be particularly technically-minded – will say "thank God you found all these problems for me. I won't be able to sleep at night until I know I'm safe from hackers. Here's a big pile of money, please fix these problems".

False Security?

Now, at this point you may be thinking, “isn’t it better to be safe than sorry? If there’s even a slight chance that an exploit exists, it should be investigated”. That’s perfectly true, but it’s also not fair on webmasters who have to pay someone (be it someone like me, or the QSA), or waste their own time, to refute a bunch of obvious false-positives.

There's also the question of just how thorough a DSS scan is, and whether it fosters a false sense of security. At first glance it seems pretty thorough – we've seen how even quite unlikely bugs will be flagged. But I've never seen a PCI DSS scan that does any form of basic password brute-forcing: for instance, pick the names of a dozen or so common system accounts (admin, root, test, webmaster etc) and try to access them over SSH using a list of 1,000 or so common passwords. Your root password could be 'letmein' but a PCI DSS scan won't spot it.

For a proper audit, there needs to be some degree of manual interaction. Sure, tools like Nessus are great for quickly probing for thousands of known bugs, but they still need to be operated by someone with half a brain. Nessus won’t tell you if there’s a link in 34pt font on your home page saying “click here to download all customer details from the database”. Nessus won’t tell you if, when MySQL reaches max connections (perhaps because in your pen testing, you’ve flooded Apache with requests), the PHP code spits out an error which reveals the username and password it is trying to connect to MySQL with. Nessus won’t tell you if the form data on your checkout page can be tampered with to allow a customer to place an order without paying. Stuff like that needs a human to find.

Similarly, there is only so much you can learn from a remote scan. One of my clients had an e-commerce site that emailed him when a customer placed an order; the (unencrypted) email contained the customer's card details. That's a whopping big security hole, but not one a PCI DSS scan would spot.

So is PCI DSS creating a false sense of security? If, after you finally pass a DSS scan, you think “right, that’s me safe from hackers for another year”, then definitely. It’s easy to be PCI DSS compliant but still wide open to exploitation. If you view PCI DSS as just one part of the security jigsaw, you’re thinking more prudently. Sadly, the majority of webmasters seem to think the former.

The Solution?

Is there a better solution? Not really. Webmasters are always going to be lax about security because it takes time and money – two resources which are always in short supply. In an ideal world e-commerce webmasters would be concerned enough about security that they would initiate regular security auditing themselves, but that isn’t going to happen, so perhaps some kind of enforced auditing is necessary; but it should be a hell of a lot better than the present system of greedy banks and bottom-feeding QSAs.

Actually, perhaps there is a solution. If an e-commerce site fails to keep your data adequately secure, the law is on your side. If you suffered monetary loss (for instance, if a fraudster got your card details), you could launch a Small Claim in the County Court (though it then comes down to the discretion of the judge as to whether the site owner was indeed negligent in failing to keep his server secure, and that is going to vary a lot from one judge to another). In the UK we also have the ICO (Information Commissioner's Office), a government body which deals with enforcing data protection laws. If a webmaster is slap-happy with a customer's personal details (storing them for longer than necessary, storing non-required private information, failing to adequately secure them), the ICO have the power to heavily fine the webmaster. Not that the ICO really seem to use these powers. I've had dealings with the ICO in the past, such as the time Barclaycard blatantly lied to me about holding my account statements from over 6 years ago (I was reclaiming credit card penalty charges; and yes, I won: they settled out of court and paid up with 29% interest). Even though the ICO have received dozens of similar complaints about Barclaycard, they did little more than give them a nudge (I suspect the conversation was something along the lines of "Now then old chap, we don't want any trouble, so if you could give Mr Smith his personal data it would be jolly much appreciated").

I digress. If the ICO had cojones and webmasters were aware of the legal implications of failing to stay secure, perhaps PCI DSS wouldn't be needed. The banks wouldn't be quite so rich, e-commerce webmasters wouldn't be quite so poor, and the whole industry of charlatan QSAs that PCI DSS has spawned would be gone. We can only dream …


Large Scale Apache DDoS

“all the chinese people with shit computers have got home and turned their virus ridden machines on”

Three days ago one of my regular clients called me saying load was high on his webserver – could I take a look? The Apache logs showed that the server was being hammered with HTTP POST requests.

It was easy to spot them as non-legitimate. All were targeted at the default vhost (just the server IP address), and all had the same UserAgent. I grepped through that day's Apache log and counted over 180,000 unique IPs issuing these requests. Surely that couldn't be right? Huge botnets like this certainly exist, but I thought they were mainly used for spam or DoSing high-profile sites. This server hosts the website of a pretty harmless small company (they manufacture stationery), and even if one of their customers did have a gripe with them, it's unlikely that the customer would have access to such a huge botnet.

A recent worm targeting Plesk servers via an HTTP POST exploit was still fresh in my mind (I had a few new clients as a result of that), and I wondered if perhaps this wasn't an intentional DoS, but a worm that had gone wrong, resulting in all the infected hosts probing this particular server. I fired up tcpdump to inspect the HTTP POST data, but it was blank.
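
For anyone wanting to do the same, a capture along these lines is enough to eyeball the POST bodies (the interface name is an assumption):

tcpdump -i eth0 -A -s 0 'tcp port 80'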

So it seemed like someone was simply trying to tie up Apache slots by bombarding it with empty POST requests. The question was how to stop it. I started with a simple shell script to feed the list of IPs that I’d extracted into iptables, but as the number of entries in the firewall (ok, packet filter) grew, the rate at which iptables could add them started to drop off quite seriously, to the point where it was only adding 3 or 4 per second. With 180,000 addresses, this was going to take a long time.
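
That first pass was essentially a loop along these lines (the filename is illustrative):

while read ip; do
    iptables -A INPUT -s "$ip" -j DROP
done < bad_ips.txt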

I decided to change tactics. Scanning through the list, the IPs appeared to be clustered slightly around netblocks. I took the first two octets of each address and generated a list of netblocks to filter in iptables, eg:

iptables -A INPUT -s %IP%.0.0/16 -j DROP
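
Generating that list from the raw IPs is a one-liner along these lines (filenames are illustrative):

awk -F. '{print $1"."$2".0.0/16"}' bad_ips.txt | sort -u > blocks_16.txt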

The result was 9,500 netblocks, a much more manageable number. It was still a bit of a gamble: we didn't have time to look up the start and end addresses of the netblock each IP belonged to, so firewalling off x.x.0.0/16 was just a guess. With this being a UK company, we didn't care too much about blocking off overseas traffic (not ideal, but an acceptable solution), but there was still a danger that, while 1.1.1.1 might belong to a two-bit Korean ISP, 1.1.2.1 might be a UK netblock. We took the risk.

An hour later these rules had been processed, but Apache was still being hit moderately heavily by new IP addresses. I wrote a perl script to tail the Apache log and add them to the firewall, but at a rate of around 5 per second, iptables was still struggling to keep up. I gave it a couple of hours, in the hope that it would pick up the majority of the addresses, but later that day Apache was still being hammered. In addition, my client had discovered that several legitimate people (including the owner of the site) had been firewalled too – clearly filtering off whole B-classes was too aggressive. So again, the hunt was on for a more effective solution.

I started looking into iptables' ability to match strings inside the TCP packet payload. One option was to block all HTTP POST requests (which would interfere with the operation of the site to some extent, but was better than no site at all), but blocking the UserAgent seemed a better bet; the POST requests all looked like this:

defaulthost:80 46.185.243.239 - - [28/Jul/2013:16:57:06 +0100] "POST / HTTP/1.1" 301 360 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

My client agreed that blocking anyone using IE 6 on Windows XP wasn't the end of the world, and should have a negligible impact on legitimate users.

So we tried this:

iptables -I INPUT -m string --algo bm --string "MSIE 6.0; Windows NT 5.1; SV1" -j REJECT

The trouble was, by the time the UserAgent had been sent inside the HTTP request, the TCP connection had already been opened. Apache was still clogged up handling those connections.

I started looking at the tcpdump output again, wondering if there was anything in the TCP headers which would uniquely identify these DoS agents. One thing that struck me was the TTL: an HTTP request from Thailand would almost certainly have gone through more hops than a request from a client in the UK (the server is hosted in Germany). I considered blocking any traffic with a TTL lower than a certain value, but decided against it: over the years, different operating systems have used different default TTLs, and while we can usually guess at what the original TTL was (if a packet arrives with a TTL of 20 it's much more likely that the client is 12 hops away and it started at 32, rather than it having started at 64 with the client 44 hops away), it didn't seem reliable enough.

In the end we went back to my original list of subnets, and compared this against a freely available list of IP blocks by country at https://www.countryipblocks.net/country_selection.php. From that we were able to determine which of my list of B-class networks definitely didn’t contain any UK addresses, and firewall them off.

While that was running, I thought it would be interesting to see what else I could find out about these bots. I picked half a dozen at random and nmap’d them. Nothing interesting, and no common pattern. Perhaps the DDoS agent runs over UDP, or uses port knocking or a reverse tunnel. I also used p0f to do some fingerprinting; all the hosts were identified as ‘Windows XP/2000’ (although p0f doesn’t always get it right).

Even with a reliable set of B-class networks blocked, the DoS continued, if anything worse than it had been before. As my client elegantly put it:

 all the chinese people with shit computers have got home and turned their virus ridden machines on

The server is hosted by 1and1. Based on previous experiences, asking them for help would probably be a head-banging experience, but out of desperation we tried. They pointed out that we had access to a Cisco firewall via the 1and1 control panel, to which we could add a whopping 25 rules. I calculated some netmasks which would allow us to drop as much unwanted traffic as possible (there's a nice calculator at https://www.dan.me.uk/ipsubnets) within the limited number of rules we had, and ended up blocking off 100 million hosts or so, all in Asia/Africa/South America. I then removed these rules from iptables as they were unnecessary and no doubt slowing it down.

Although this had certainly helped a little, we were still being bombarded with traffic from thousands of B-class networks, and iptables wasn't coping well. We were seeing high CPU usage in kernel processes, and network traffic was moving at a snail's pace.

I started looking into the iptables scalability issue, and discovered that when rules are added the whole ruleset is copied into userspace, modified, then copied back to kernelspace. No wonder it can be quick to restore an existing ruleset, but slow to add new rules one by one. Along the way I also discovered ipset, which promised to be a more scalable solution for blocking large numbers of hosts. When I get the chance I'd like to do some more research on ipset: how it stores data, how it improves over iptables etc. Perhaps I'll even write it up in a blog post, but for the moment if you want to learn more take a look here: http://daemonkeeper.net/781/mass-blocking-ip-addresses-with-ipset/.
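
A minimal ipset setup looks something like this (the set name and input file are illustrative, and older ipset releases use the -N/nethash syntax rather than create):

ipset create blacklist hash:net
while read net; do ipset add blacklist "$net"; done < blocks_16.txt
iptables -I INPUT -m set --match-set blacklist src -j DROP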

Ipset performed really well, but only solved half the problem. Yes, we no longer had high server load and sluggish networking because of iptables, but we were still being overwhelmed with traffic clogging up Apache connection slots. Even though we were now automatically firewalling off any request with the MSIE 6.0 / Windows NT 5.1 UserAgent, the number of unique hosts was staggering. By this point we had counted over 250,000.

In the end we moved the site to CloudFlare, who are now bearing the brunt of the attack. Sorry to disappoint; you were probably hoping that this post would conclude with a brilliantly cunning technique that we'd come up with to fend off the attack, and everyone lived happily ever after. On this occasion, they didn't.

Moving to WordPress

After years of resistance, I’ve finally followed the trend and started up a blog (I was beginning to feel I was perhaps the only person on the planet who didn’t have a blog).

For a while I thought about rolling my own solution: partly because I didn't need any of the advanced features offered by WP, partly because – in my line of work – I seem to deal with hacked WP sites every week, and didn't much fancy the idea of running code with such a poor security track record (although in fairness a lot of the issues arise in sloppy plugins). Finally I decided to take the plunge.

Over the years I've done countless WP installs and upgrades, but never admined a site. The first task was integrating my existing HTML/CSS theme, which proved fairly painless. Getting a list of all posts in the sidebar proved trickier, with Google providing plenty of advice on showing posts in a particular category, but nothing on showing all posts. In the end I decided that I'd probably just have one category anyway, so went with:

<ul>
<?php $my_query = new WP_Query('cat=1'); ?>
<?php while ($my_query->have_posts()) : $my_query->the_post(); ?>
<li><a href="<?php the_permalink(); ?>" title="<?php the_title(); ?>"><?php the_title(); ?></a></li>
<?php endwhile; ?>
</ul>

Now all I have to do is worry about spam, security and performance. Fun.
