
AnsibleFest 2018 is a Wrap! Slides from my presentation and notes

AnsibleFest 2018 is in the books, and it was a great conference! I was able to attend the 'Contributors Summit' in Austin on Monday and remotely on Thursday, and I learned quite a bit! I also presented Make your Ansible playbooks maintainable, flexible, and scalable on both days of the conference. Slides from that session are available below, but you'll have to wait for the actual video to be uploaded to see the fun little gimmick I added for the live presentation.

Things I learned at the AnsibleFest Austin 2018 Contributor's Summit

AnsibleFest Austin 2018 is about to get started (with a huge party tonight, then a keynote to kick off two full days of sessions tomorrow), and the days immediately before and after the 'Fest mark the 6th "Contributor's Summit", a "working session with the core team and key contributors to discuss important issues affecting the Ansible community".

AnsibleFest 2018 Austin Contributors Summit

As with most conference-related events, the best part of the day is getting to meet with and talk to people you work with online, but there are also usually lots of little tidbits discussed during the sessions which aren't yet widely known. Some of the most exciting things I learned today include:

Getting AWS STS Session Tokens for MFA with AWS CLI and kubectl for EKS automatically

I've been working on some projects which require MFA for all access, including CLI access and things like using kubectl with Amazon EKS. One super-annoying aspect of requiring MFA for CLI operations is that every day or so you have to request a new STS session token—and for that token to work, you also have to update the temporary Access Key ID and Secret Access Key in an AWS profile.

I had a little bash function that would allow me to input a token code from my MFA device and it would spit out the values to put into my .aws/credentials file, but it was still tiring copying and pasting three values every single morning.

So I wrote a neat little executable Ansible playbook which does everything for me:

To use it, you can download the contents of that file to /usr/local/bin/aws-sts-token, make the file executable (chmod +x /usr/local/bin/aws-sts-token), and run the aws-sts-token command.
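
As a minimal sketch (the MFA device ARN and profile names below are placeholders you'd adjust for your own account), the playbook boils down to something like this:

#!/usr/bin/env ansible-playbook
---
- hosts: localhost
  connection: local
  gather_facts: false

  vars_prompt:
    - name: token_code
      prompt: "MFA token code"
      private: false

  vars:
    # Placeholder values—use your own MFA device ARN and profile names.
    mfa_device_arn: "arn:aws:iam::123456789012:mfa/your-user"
    source_profile: default
    sts_profile: sts

  tasks:
    - name: Request temporary credentials from AWS STS.
      command: >
        aws sts get-session-token
        --serial-number {{ mfa_device_arn }}
        --token-code {{ token_code }}
        --profile {{ source_profile }}
        --output json
      register: sts_result
      changed_when: false

    - name: Write the temporary credentials into ~/.aws/credentials.
      ini_file:
        path: ~/.aws/credentials
        section: "{{ sts_profile }}"
        option: "{{ item.option }}"
        value: "{{ (sts_result.stdout | from_json).Credentials[item.key] }}"
        mode: '0600'
      loop:
        - { option: aws_access_key_id, key: AccessKeyId }
        - { option: aws_secret_access_key, key: SecretAccessKey }
        - { option: aws_session_token, key: SessionToken }

With something like that in place, a single aws-sts-token run each morning replaces all the copying and pasting.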

Fixing 'UNREACHABLE' SSH error when running Ansible playbooks against Ubuntu 18.04 or 16.04

Ubuntu 16.04 and 18.04 (and likely future versions) often don't have Python 2 installed by default. Sometimes Python 3 is installed, available at /usr/bin/python3, but for many minimal images I've used, there's no preinstalled Python at all.

Therefore, when you run Ansible playbooks against new VMs running Ubuntu, you might be greeted with an 'UNREACHABLE!' error, because Ansible can't find the Python interpreter it needs on the remote host.
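
The fix boils down to bootstrapping Python before Ansible tries to use any modules or gather facts. Here's a minimal sketch of one common approach, using a raw task (the apt package name varies between Ubuntu releases, so adjust as needed):

- hosts: all
  gather_facts: false

  pre_tasks:
    - name: Install Python if it isn't already present (raw requires no Python).
      raw: test -e /usr/bin/python || (apt-get -y update && apt-get install -y python-minimal)

    - name: Gather facts now that Python is available.
      setup:

Alternatively, if the image ships with Python 3, you can point Ansible at it by setting ansible_python_interpreter=/usr/bin/python3 in your inventory.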

Speaking about Playbooks at AnsibleFest Austin 2018

I'm excited to announce I'll be presenting the session Make your Ansible Playbooks Flexible, Maintainable, and Stable at AnsibleFest Austin in the first week of October.

AnsibleFest Austin email promo

I've spent a lot of time building, maintaining, and in a few cases, completely restructuring Ansible playbooks over the past five years. I hope to distill a lot of the lessons I've learned into this presentation, and I hope anyone else who is as passionate about infrastructure automation as I am can get a lot out of it.

As usual, I'll post slides—and hopefully video as well—from the presentation after it's over. Hope to see you in Austin!

Reboot and wait for reboot to complete in Ansible playbook

September 2018 Update: Ansible 2.7 (to be released around October 2018) will include a new reboot module, which makes reboots a heck of a lot simpler (whether managing Windows, Mac, or Linux!):

- name: Reboot the server and wait for it to come back up.
  reboot:

That's it! Much easier than the older technique I used in Ansible < 2.7!

One pattern I often need to implement in my Ansible playbooks is "configure-reboot-configure", where you change some setting that requires a reboot to take effect, and you have to wait for the reboot to take place before continuing on with the rest of the playbook run.

For example, on my Raspberry Pi Dramble project, before installing Docker and Kubernetes, I need to make sure the Raspberry Pi's /boot/cmdline.txt file contains a couple cgroup features so Kubernetes runs correctly. But after adding these options, I also have to reboot the Pi.
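
In those pre-2.7 days, the reboot-and-wait portion looked something like the following sketch (the timeouts are guesses you tune per environment):

- name: Reboot the server.
  shell: sleep 2 && shutdown -r now "Reboot triggered by Ansible"
  async: 1
  poll: 0

- name: Wait for the server to come back up.
  wait_for_connection:
    delay: 30
    timeout: 300

With Ansible 2.7's reboot module, both of those tasks collapse into the single task shown at the top of this post.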

Hosted Apache Solr's Revamped Docker-based Architecture

I started Hosted Apache Solr almost 10 years ago, in late 2008, so I could more easily host Apache Solr search indexes for my Drupal websites. I realized I could host search indexes for other Drupal websites too, if I added some basic account management features and a PayPal subscription plan—so I built a small subscription management service on top of my then-Drupal 6-based Midwestern Mac website and started selling a few Solr subscriptions.

Back then, the latest and greatest Solr version was 1.4, and now-popular automation tools like Chef and Ansible didn't even exist. So when a customer signed up for a new subscription, the pipeline for building and managing the customer's search index went like this:


Original Hosted Apache Solr architecture, circa 2009.

Use Ansible's YAML callback plugin for a better CLI experience

Ansible is a great tool for automating IT workflows, and I use it to manage hundreds of servers and cloud services on a daily basis. One of my small annoyances with Ansible, though, is its default CLI output—whenever there's a command that fails, or a command or task that succeeds and dumps a bunch of output to the CLI, the default visible output is not very human-friendly.

For example, in a Django installation example from chapter 3 of my book Ansible for DevOps, there's an ad-hoc command to install Django on a number of CentOS app servers using Ansible's yum module. Here's how it looks in the terminal when you run that task the first time, using Ansible's default display options, and there's a failure:

Ansible 2.5 default callback plugin

...it's not quickly digestible—and this is one of the shorter error messages I've seen!
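
The remedy is switching Ansible's stdout callback to the bundled yaml callback plugin (available in Ansible 2.5 and later). The relevant ansible.cfg settings are roughly:

[defaults]
# Use the YAML callback plugin for human-readable output.
stdout_callback = yaml
# Use the same callback for ad-hoc ansible commands, too.
bin_ansible_callbacks = True

With that in place, module output and error messages print as indented YAML instead of one long wall of JSON.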

Properly deploying updates to or shutting down Jenkins

One of my most popular Ansible roles is the geerlingguy.jenkins role, and for good reason—Jenkins is pretty much the premier open source CI tool, and has been used for many years by Ops and Dev teams all over the place.

As Jenkins (or any other CI tool) is adopted more fully for automating all aspects of infrastructure work, you begin to realize how important the Jenkins server(s) become to your daily operations. And then you realize you need CI for your CI. And you need version control and deployment processes for things like Jenkins updates, job updates, etc. The geerlingguy.jenkins role helps a lot with the main component—automating the Jenkins install and configuration—and on top of that you can add a task that copies a config.xml file for each job definition into your $JENKINS_HOME, to ensure every job and every configuration is in code...
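
As a hypothetical sketch of that last idea (the paths, ownership, and jenkins_jobs list here are assumptions, not lifted from the role), the job-copying tasks might look like:

- name: Ensure a directory exists for each job.
  file:
    path: "{{ jenkins_home }}/jobs/{{ item }}"
    state: directory
    owner: jenkins
    group: jenkins
  loop: "{{ jenkins_jobs }}"

- name: Copy each job's config.xml into $JENKINS_HOME.
  copy:
    src: "jobs/{{ item }}/config.xml"
    dest: "{{ jenkins_home }}/jobs/{{ item }}/config.xml"
    owner: jenkins
    group: jenkins
  loop: "{{ jenkins_jobs }}"

Jenkins only picks up new or changed job definitions after a reload or restart—which is exactly where a proper, safe shutdown and restart process becomes important.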

Getting Munin-node to monitor Nginx and Apache, the easy way

Since this is something I think I've bumped into at least eight times in the past decade, I thought I'd document, comprehensively, how I get Munin to monitor Apache and/or Nginx using the apache_* and nginx_* Munin plugins that come with Munin itself.

Besides the obvious action of symlinking the plugins into Munin's plugins folder, you should—to avoid any surprises—explicitly configure env.url for all Apache and Nginx servers. For example, in your munin-node plugin configuration (on RedHat/CentOS, that's /etc/munin/plugin-conf.d), add a file named something like apache or nginx:

# For Nginx:
[nginx*]
env.url http://localhost/nginx_status

# For Apache:
[apache*]
env.url http://localhost/server-status?auto
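
And since I end up doing this with Ansible anyway, here's a minimal sketch of both steps for Nginx (the plugin paths assume a stock RedHat/CentOS or Debian munin-node package, and munin-node needs a restart afterwards):

- name: Symlink the bundled Nginx plugins into the active plugins folder.
  file:
    src: "/usr/share/munin/plugins/{{ item }}"
    dest: "/etc/munin/plugins/{{ item }}"
    state: link
  loop:
    - nginx_request
    - nginx_status

- name: Force env.url for all Nginx plugins.
  copy:
    dest: /etc/munin/plugin-conf.d/nginx
    content: |
      [nginx*]
      env.url http://localhost/nginx_status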

Now, something that often trips me up—especially since I maintain a variety of servers and containers, with some running ancient forms of CentOS, while others are running more recent builds of Debian, Fedora, or Ubuntu—is that localhost doesn't always mean what you'd think it means.