performance

Vagrant web development - is VMware better than VirtualBox?

[Update 2015-08-25: I reran some of the tests using two different settings in VirtualBox. First, I explicitly set KVM as the paravirtualization mode (it was saved as 'Legacy' by default, due to a bug in VirtualBox 5.0.0), which showed impressive performance improvements, making VirtualBox perform 1.5-2x faster, and bringing some benchmarks to a dead heat with VMware Fusion. I also set the virtual network card to use 'virtio' instead of emulating an Intel PRO/1000 MT card, but this made little difference in raw network throughput or any other benchmarks.]

My Mac spends the majority of the day running at between one and a dozen VMs. I do all my development (besides iOS or Mac dev) running code inside VMs, and for many years I used VirtualBox, a free virtualization tool, along with Vagrant and Ansible, to build and manage all these VMs.

DrupalCamp St. Louis 2015 finished, session videos available online!

DrupalCamp St. Louis 2015 was held this past weekend, June 20-21, 2015, at SLU LAW in downtown St. Louis. We had nine sessions and a great keynote on Saturday, and a full sprint day on Sunday.

DrupalCamp St. Louis 2015 Registration
The view coming off the elevators at SLU LAW.

Every session was recorded (slides + audio), and you can view all the sessions online:

The Camp went very well, with almost sixty participants this year! We had a great time, learned a lot together, and enjoyed some great views of downtown St. Louis (check out the picture below!), and we can't wait until next year's DrupalCamp St. Louis (to be announced)!

Introducing the Dramble - Raspberry Pi 2 cluster running Drupal 8

Dramble - 6 Raspberry Pi 2 model Bs running Drupal 8 on a cluster
Version 0.9.3 of the Dramble—running Drupal 8 on 6 Raspberry Pis

I've been tinkering with computers since I was a kid, but in the past ten or so years, mainstream computing has become more and more locked down, enclosed, lightweight, and, well, polished. I even wrote a blog post about how, nowadays, most computers are amazing. Long gone are the days when I had to worry about line voltage, IRQ settings, diagnosing bad capacitors, and replacing 40-pin cables that went bad!

But I'm always tempted back into my earlier years of more hardware-oriented hacking when I pull out one of my Raspberry Pi B+/A+ or Arduino Unos. These devices are as raw of modern computers as you can get—requiring you to actual touch the silicone chips and pins to be able to even use the devices. I've been building a temperature monitoring network that's based around a Node.js/Express app using Pis and Arduinos placed around my house. I've also been working a lot lately on a project that incorporates three of my current favorite technologies: The Raspberry Pi 2 model B (just announced earlier this month), Ansible, and Drupal!

In short, I'm building a cluster of Raspberry Pis, and designating it a 'Dramble'—a 'bramble' of Raspberry Pis running Drupal 8.

Getting Gigabit Networking on a Raspberry Pi 2, 3 and B+

tl;dr You can get Gigabit networking working on any current Raspberry Pi (A+, B+, Pi 2 model B, Pi 3 model B), and you can increase the throughput to at least 300+ Mbps (up from the standard 100 Mbps connection via built-in Ethernet).

Note about model 3 B+: The Raspberry Pi 3 model B+ includes a Gigabit wired LAN adapter onboard—though it's still hampered by the USB 2.0 bus speed (so in real world use you get ~224 Mbps instead of ~950 Mbps). So if you have a 3 B+, there's no need to buy an external USB Gigabit adapter if you want to max out the wired networking speed!

NFS, rsync, and shared folder performance in Vagrant VMs

It's been a well-known fact that using native VirtualBox or VMWare shared folders is a terrible idea if you're developing a Drupal site (or some other site that uses thousands of files in hundreds of folders). The most common recommendation is to switch to NFS for shared folders.

NFS shared folders are a decent solution, and using NFS does indeed speed up performance quite a bit (usually on the order of 20-50x for a file-heavy framework like Drupal!). However, it has it's downsides: it requires extra effort to get running on Windows, requires NFS support inside the VM (not all Vagrant base boxes provide support by default), and is not actually all that fast—in comparison to native filesystem performance.

I was developing a relatively large Drupal site lately, with over 200 modules enabled, meaning there were literally thousands of files and hundreds of directories that Drupal would end up scanning/including on every page request. For some reason, even simple pages like admin forms would take 2+ seconds to load, and digging into the situation with XHProf, I found a likely culprit:

Diagnosing Disk I/O issues: swapping, high IO wait, congestion

One one small LEMP VPS I manage, I noticed munin graphs that showed anywhere between 5-50 MB/second of disk IO. Since the VM has an SSD instead of traditional spinning hard drive, performance wasn't too bad, but all that disk I/O definitely slowed things down.

I wanted to figure out what was the source of all the disk I/O, so I used the following techniques to narrow down the culprit (spoilers: it was MySQL, which was using some swap space because it was tuned to use a little too much memory).

iotop

First up was iotop, a handy top-like utility for monitoring disk IO in real-time. Install it via yum or apt, then run it with the command sudo iotop -ao to see an aggregated summary of disk IO over the course of the utility's run. I let it sit for a few minutes, then checked back in to find:

rsync in Vagrant 1.5 improves file performance and Windows usage

I've been using Vagrant for almost all development projects for the past two years, and for projects where I'm the only developer, Vagrant + VirtualBox has worked great, since I'm on a Mac. I usually use NFS shared folders so I can keep project data (Git/SVN repositories, assets, etc.) on my local computer, and share them to a folder on the VM, and not suffer the performance penalty of using VirtualBox's native shared folders.

However, this solution only scaled well to other Mac and Linux users with whom I shared development responsibilities. Windows users were left in a bit of a lurch. To extend an olive branch, I hackishly added SMB support by installing and configuring an SMB share from within the VM only on windows hosts, so Windows devs could mount the SMB share and work on files in their native editors.

Vagrant - NFS shared folders for Mac/Linux hosts, Samba shares for Windows

[Edit: I'm not using rsync shared folders (a new feature in 1.5+) instead of SMB/NFS - please see this post for more info: rsync in Vagrant 1.5 improves file performance and Windows usage].

[Edit 2: Some people have reported success using the vagrant-winnfsd plugin to use NFS in Windows.]

I've been using Vagrant to provision local development and testing VMs for a couple years, and on my Mac, NFS shared folders (which are supported natively by VirtualBox) work great; they're many, many times faster than native shared folders. To set up an NFS share in your Vagrantfile, just make sure the nfs-utils package is installed on the managed VM, and add the following:

    config.vm.synced_folder "~/Sites/shared", "/shared",
      :nfs => !is_windows,
      id: "shared"

Note: See JJG-Ansible-Windows to see how you can get the is_windows variable.

Boost Expire module being deprecated; how to switch to Cache Expiration

BoostI'm a huge fan of Boost for Drupal; the module generates static HTML pages for nodes and other pages on your Drupal site so Apache can serve anonymous visitors the static pages without touching PHP or Drupal, thus allowing a normal web server (especially on cheaper shared hosting) to serve thousands instead of tens of visitors per second (or worse!).

For Drupal 7, though, Boost was rewritten and substantially simplified. This was great in that it made Boost more stable, faster, and easier to configure, but it also meant that the integrated cache expiration functionality was dumbed down and didn't really exist at all for a long time. I wrote the Boost Expire module to make it easy for sites using Boost to have the static HTML cache cleared when someone created, updated, or deleted a node or comment, among other things.

Moving Server Check.in functionality to Node.js increased per-server capacity by 100x

Just posted a new blog post to the Server Check.in blog: Moving functionality to Node.js increased per-server capacity by 100x. Here's a snippet from the post:

One feature that we just finished deploying is a small Node.js application that runs in tandem with Drupal to allow for an incredibly large number of servers and websites to be checked in a fraction of the time that we were checking them using only PHP, cron, and Drupal's Queue API.

If you need to do some potentially slow tasks very often, and they're either network or IO-bound, consider moving those tasks away from Drupal/PHP to a Node.js app. Your server and your overloaded queue will thank you!

Read more.

tl;dr Node.js is awesome for running through a large number of network or IO-bound tasks that would otherwise become burdensome at scale using Drupal's Queue API.