tutorial

Set up faceted Apache Solr search on Drupal 8 (2016 - deprecated)

Note: A lot has changed in Drupal 8 and the Search API module ecosystem since this post was written in May 2016... I wrote a new blog post for Faceted Solr Search in Drupal 8, so please read that if you're just getting started. I'm leaving this up as a historical reference, as the general process and architecture are the same, but many details are different.

In Drupal 8, Search API Solr is the consolidated successor to both the Apache Solr Search and Search API Solr modules in Drupal 7. I thought I'd document the process of setting up the module on a Drupal 8 site, connecting to an Apache Solr search server, and configuring a search index and search results page with search facets, since the process has changed slightly from Drupal 7.

Migrate a custom JSON feed in Drupal 8 with Migrate Source JSON

June 2016 Update: Times change fast! Already, the migrate_source_json module mentioned in the post has been (mostly) merged directly into the migrate_plus module, so if you're building a new migration now, you should use the migrate_plus JSON plugin if at all possible. See Mike Ryan's blog post Drupal 8 plugins for XML and JSON migrations for more info.

Recently I needed to migrate a small set of content into a Drupal 8 site from a JSON feed, and since documentation for this particular scenario is slightly thin, I decided I'd post the entire process here.

I was given a JSON feed available over the public URL http://www.example.com/api/products.json which looked something like:

Drupal VM - Quick Introduction Video

After months of having this on my todo list, I've finally had the time to record a quick introduction video for Drupal VM. Watch the video below, then a transcript below the video:

Drupal VM is a local development environment for Drupal that's built with Vagrant and Ansible. It helps you build and maintain Drupal sites using best practices and the best tools. In this quick overview, I'll show you where you can learn more about Drupal VM, then show you a simple Drupal VM setup.

The Drupal VM website gives a general overview of the project and links to:

How to shoot a large event (photography gear / workflow)

Jeff Geerling shooting photos with Nikon at Steubenville Youth Conference
Shooting with a Nikon D7100 and 70-200mm f/2.8 VR (photo by Sid Hastings).

I love taking pictures. Specifically, I love taking pictures at meaningful events where people show a range of emotions, and enjoy interesting environments and situations. I've been honored to help at a few large events year after year, such as the Ordination Masses for the Archdiocese of St. Louis, or the Steubenville St. Louis Mid-America youth conference, and I thought I'd try to write an article detailing my workflow with tips and techniques for other photographers getting into solo event photography.

How to Build Your Own Raspberry Pi Cluster ('Bramble')

Rasbperry Pi Dramble

One of the first questions I'm asked by those who see the Dramble is, "How do I build my own?" Since I've been asked the question many times, I put together a detailed parts list, and maintain it on the Dramble's project wiki on GitHub: Raspberry Pis and Accessories.

For a little over $400, you can have the exact same setup, with six Raspberry Pi 2s, a network switch, a rack inside which you can mount the Pis, microSD cards for storage, a 6-port USB power supply, and all the required cables and storage!

Raspberry Pi RGB LED boards

Ansible 101 - on a Cluster of Raspberry Pi 2s

Ansible 101 - Raspberry Pi Dramble cluster

Over the course of this year, I've acquired six Raspberry Pi model 2 B computers, and configured them in a cluster (or 'bramble') so I can use them to test different infrastructure configurations, mostly for running Drupal 8. All the Ansible playbooks and instructions for building the cluster are available on the GitHub project page for the Raspberry Pi Dramble.

Each Raspberry Pi has its own RGB LED board that's wired into the GPIO pins, so they're controlled by software. I can demonstrate different ways of managing the cluster via Ansible, and I finally took the time to make a video, Ansible 101 - on a cluster of Raspberry Pi 2s, which shows how it all works together:

Solr for Drupal Developers, Part 3: Testing Solr locally

In earlier Solr for Drupal Developers posts, you learned about Apache Solr and it's history in and integration with Drupal. In this post, I'm going to walk you through a quick guide to getting Apache Solr running on your local workstation so you can test it out with a Drupal site you're working on.

The guide below is for those using Mac or Linux workstations, but if you're using Windows (or even if you run Mac or Linux), you can use Drupal VM instead, which optionally installs Apache Solr alongside Drupal.

As an aside, I am writing this series of blog posts from the perspective of a Drupal developer who has worked with large-scale, highly customized Solr search for Mercy (example), and with a variety of small-to-medium sites who are using Hosted Apache Solr, a service I've been running as part of Midwestern Mac since early 2011.

Installing Apache Solr in a Virtual Machine

Apache Solr can be run directly from any computer that has Java 1.7 or later, so technically you could run it on any modern Mac, Windows, or Linux workstation natively. But to keep your local workstation cleaner, and to save time and hassle (especially if you don't want to kludge your computer with a Java runtime!), this guide will show you how to set up an Apache Solr virtual machine using Vagrant, VirtualBox, and Ansible.

Let's get started:

Solr for Drupal Developers, Part 2: Solr and Drupal, A History

Drupal has included basic site search functionality since its first public release. Search administration was added in Drupal 2.0.0 in 2001, and search quality, relevance, and customization was improved dramatically throughout the Drupal 4.x series, especially in Drupal 4.7.0. Drupal's built-in search provides decent database-backed search, but offers a minimal set of features, and slows down dramatically as the size of a Drupal site grows beyond thousands of nodes.

In the mid-2000s, when most custom search solutions were relatively niche products, and the Google Search Appliance dominated the field of large-scale custom search, Yonik Seeley started working on Solr for CNet Networks. Solr was designed to work with Lucene, and offered fast indexing, extremely fast search, and as time went on, other helpful features like distributed search and geospatial search. Once the project was open-sourced and released under the Apache Software Foundation's umbrella in 2006, the search engine became one of the most popular engines for customized and more performant site search.

As an aside, I am writing this series of blog posts from the perspective of a Drupal developer who has worked with large-scale, highly customized Solr search for Mercy (example), and with a variety of small-to-medium sites who are using Hosted Apache Solr, a service I've been running as part of Midwestern Mac since early 2011.

Timeline of Apache Solr and Drupal Solr Integration

If you can't view the timeline, please click through and read this article on Midwestern Mac's website directly.

A brief history of Apache Solr Search and Search API Solr

Only two years after Apache Solr was released, the first module that integrated Solr with Drupal, Apache Solr Search, was created. Originally, the module was written for Drupal 5.x, but it has been actively maintained for many years and was ported to Drupal 6 and 7, with some relatively major rewrites and modifications to keep the module up to date, easy to use, and integrated with all of Apache Solr's new features over time. As Solr gained popularity, many Drupal sites started switching from using core search or the Views module to using Apache Solr.

Solr for Drupal Developers, Part 1: Intro to Apache Solr

It's common knowledge in the Drupal community that Apache Solr (and other text-optimized search engines like Elasticsearch) blow database-backed search out of the water in terms of speed, relevance, and functionality. But most developers don't really know why, or just how much an engine like Solr can help them.

I'm going to be writing a series of blog posts on Apache Solr and Drupal, and while some parts of the series will be very Drupal-centric, I hope I'll be able to illuminate why Solr itself (and other search engines like it) are so effective, and why you should be using them instead of simple database-backed search (like Drupal core's Search module uses by default), even for small sites where search isn't a primary feature.

As an aside, I am writing this series of blog posts from the perspective of a Drupal developer who has worked with large-scale, highly customized Solr search for Mercy (example), and with a variety of small-to-medium sites who are using Hosted Apache Solr, a service I've been running as part of Midwestern Mac since early 2011.

Why not Database?

Apache Solr's wiki leads off it's Why Use Solr page with the following:

If your use case requires a person to type words into a search box, you want a text search engine like Solr.

At a basic level, databases are optimized for storing and retrieiving bits of data, usually either a record at a time, or in batches. And relational databases like MySQL, MariaDB, PostgreSQL, and SQLite are set up in such a way that data is stored in various tables and fields, rather than in one large bucket per record.

In Drupal, a typical node entity will have a title in the node table, a body in the field_data_body table, maybe an image with a description in another table, an author whose name is in the users table, etc. Usually, you want to allow users of your site to enter a keyword in a search box and search through all the data stored across all those fields.

Drupal's Search module avoids making ugly and slow search queries by building an index of all the search terms on the site, and storing that index inside a separate database table, which is then used to map keywords to entities that match those keywords. Drupal's venerable Views module will even enable you to bypass the search indexing and search directly in multiple tables for a certain keyword. So what's the downside?

Mainly, performance. Databases are built to be efficient query engines—provide a specific set of parameters, and the database returns a specific set of data. Most databases are not optimized for arbitrary string-based search. Queries where you use LIKE '%keyword%' are not that well optimized, and will be slow—especially if the query is being used across multiple JOINed tables! And even if you use the Search module or some other method of pre-indexing all the keyword data, relational databases will still be less efficient (and require much more work on a developer's part) for arbitrary text searches.

If you're simply building lists of data based on very specific parameters (especially where the conditions for your query all utilize speedy indexes in the database), a relational database like MySQL will be highly effective. But usually, for search, you don't just have a couple options and maybe a custom sort—you have a keyword field (primarily), and end users have high expectations that they'll find what they're looking for by simply entering a few keywords and clicking 'Search'.

How to Build a Simple $50 Standing Desk

Detail of standing desk surface

I'm no stranger to experimenting with my workspace; since I work from a computer for at least 8 hours a day, I try to find ways to prevent RSI and joint pain. I've tried most everything—from über-expensive fancy mesh desk chairs to ergonomic keyboards and vertical mice. But nothing has made as large (and quick) a difference as working from a standing desk.