What do we need Chef backup and restore for?

In my last blog post, which you can find here, I wrote about how to build a Chef server using Terraform and a git repository. Thanks to infrastructure as code, you can bring up another server whenever you want, but what about availability during upgrades? This post is the first step in a series of Chef-related posts aimed at providing 100% uptime for your entire Chef-dependent infrastructure during upgrades. Let’s learn how to back up and restore a Chef server.

Today we are going to cover:

  • Requirements before we start.
  • How to back up a Chef server to S3, using TwinDB software.
  • How restoration works and when.
  • Zero downtime review.


Requirements

This list is relatively short and will help you get going quickly.

  1. An S3 bucket where the backups will go (a dedicated one is preferred).
  2. The AWS access key ID and secret access key of a dedicated backup user. This user needs read-write access to the bucket above.
  3. TwinDB Backup software. In our case it comes from our Artifactory repository.
  4. Chef cookbooks (links are included throughout the post).

How and what to back up

If you followed the previous article about setting up the Chef server, this will be very easy. All you need is to back up /var/opt/chef-backup and /etc/letsencrypt. Fortunately, the TwinDB software configuration is straightforward and built for this very purpose. This is the relevant part of the template file, which is available on GitHub:

backup_dirs=/var/opt/chef-backup /etc/letsencrypt

# Destination

AWS_ACCESS_KEY_ID=<%= @aws_access_key_id %>
AWS_SECRET_ACCESS_KEY=<%= @aws_secret_access_key %>
AWS_DEFAULT_REGION=<%= @aws_default_region %>
BUCKET=<%= @bucket %>




As you can see, it requires AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, which you should have prepared for your dedicated backup user.
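
To see what ends up on disk, here is a minimal sketch that renders the same four template lines with Ruby's standard ERB library. The render_backup_config helper and the sample values are hypothetical and only for illustration; in a real run, Chef's template resource does the rendering for you.

```ruby
require 'erb'

# The destination lines of the template, copied from above.
def backup_template
  <<~'CFG'
    AWS_ACCESS_KEY_ID=<%= @aws_access_key_id %>
    AWS_SECRET_ACCESS_KEY=<%= @aws_secret_access_key %>
    AWS_DEFAULT_REGION=<%= @aws_default_region %>
    BUCKET=<%= @bucket %>
  CFG
end

# Hypothetical helper: sets each variable as an instance variable on a
# throwaway object, so the template can read it as @name.
def render_backup_config(vars)
  context = Object.new
  vars.each { |name, value| context.instance_variable_set("@#{name}", value) }
  ERB.new(backup_template).result(context.instance_eval { binding })
end

# Sample values only; never commit real credentials.
puts render_backup_config(
  aws_access_key_id: 'AKIAEXAMPLE',
  aws_secret_access_key: 'not-a-real-secret',
  aws_default_region: 'us-east-1',
  bucket: 'my-chef-backups'
)
```

Rendering the template locally like this is a quick way to catch a misspelled variable name before it reaches a server.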

Building on the previous post, you can use the following IAM configuration, which is already included in that example.

data "aws_iam_policy_document" "chef_server_access" {
  statement {
    actions   = [ /* elided */ ]
    resources = [ /* elided */ ]
  }
}

resource "aws_s3_bucket" "backups" {
  bucket = "chef_backup"
}

Other than this, all you need is the twindb-backup config file, which is set up in the chef-backup.rb recipe. Here we use variables coming from node['twindb-backup']. In the previous blog post, I described how to use the attributes.

# Remove the twindb-backup cron file
file '/etc/cron.d/twindb-backup' do
    action :delete
end

# Rotate the backup log weekly, keeping four rotations
logrotate_app 'chef-server-backup' do
    path      '/var/log/chef-server-backup.log'
    frequency 'weekly'
    rotate    4
    options %w(nocompress missingok)
end

directory '/etc/twindb'

# Render the config; it holds credentials, so keep it root-only
template '/etc/twindb/twindb-backup.cfg' do
    source 'twindb-backup.cfg.erb'
    sensitive true
    owner 'root'
    group 'root'
    mode '600'
    variables(
        aws_access_key_id: node['twindb-backup']['aws_access_key_id'],
        aws_secret_access_key: node['twindb-backup']['aws_secret_access_key'],
        aws_default_region: node['twindb-backup']['aws_region'],
        bucket: node['twindb-backup']['backups_bucket']
    )
end
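
The template resource above reads four attributes, and the config is only useful when all of them are set. As a rough sketch, a pre-flight check could look like the following plain Ruby. The missing_backup_attributes helper is hypothetical and not part of the cookbook, and the hash stands in for node['twindb-backup'].

```ruby
# Keys the template resource reads from node['twindb-backup'].
def required_backup_keys
  %w(aws_access_key_id aws_secret_access_key aws_region backups_bucket)
end

# Hypothetical helper: returns the keys that are absent or blank.
def missing_backup_attributes(attrs)
  required_backup_keys.select { |key| attrs[key].to_s.strip.empty? }
end

# Sample attributes only, standing in for node['twindb-backup'].
attrs = {
  'aws_access_key_id' => 'AKIAEXAMPLE',
  'aws_secret_access_key' => 'not-a-real-secret',
  'aws_region' => 'us-east-1'
  # 'backups_bucket' is deliberately left out
}

missing = missing_backup_attributes(attrs)
puts "Missing attributes: #{missing.join(', ')}" unless missing.empty?
```

In a recipe, one option would be to raise when the returned list is not empty, so that a half-configured node fails early instead of writing a broken config file.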

How to restore from backup

Now that our backup is configured and running, how do we restore it?

This has two separate components. First, restore the server itself. This code is available in our chef-server GitHub repository. The gist of it, from chef-server.rb, is as follows:

# Restore from the bucket unless the organization already exists
execute 'restore_chef_server' do
  command "chef-server-wrapper restore-chef-server #{node['twindb-backup']['backups_bucket']}"
  environment node.run_state['execute_environment']
  not_if  "chef-server-ctl org-show #{node['chef-server']['org_short_name']}"
  timeout 7200
  action :run
end

# The two reconfigure resources run only when notified
execute 'reconfigure_chef_server' do
  command 'chef-server-ctl reconfigure'
  action :nothing
end

execute 'reconfigure_chef_manage' do
  command 'chef-manage-ctl reconfigure --accept-license'
  action :nothing
end

# Install Chef Manage once, then reconfigure both services
execute 'install_chef_manage' do
  command 'chef-server-ctl install chef-manage'
  not_if "which chef-manage-ctl"
  action :run
  notifies :run, 'execute[reconfigure_chef_server]', :immediately
  notifies :run, 'execute[reconfigure_chef_manage]', :immediately
end
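
The not_if guard is what makes the restore safe to leave in the recipe: when not_if is given a string, Chef runs it as a shell command and skips the resource if the command exits with status 0. The sketch below models that decision in plain Ruby. The restore_needed? helper, the shell_succeeds? helper and the injectable runner are hypothetical names introduced for illustration; they are not part of the cookbook.

```ruby
# Runs a shell command quietly; true only when it exits with status 0.
def shell_succeeds?(cmd)
  system(cmd, out: File::NULL, err: File::NULL) == true
end

# Hypothetical stand-in for the guard: a restore is needed only when
# `chef-server-ctl org-show <org>` does NOT succeed.
def restore_needed?(org_short_name, runner: method(:shell_succeeds?))
  !runner.call("chef-server-ctl org-show #{org_short_name}")
end

# Fake runners so the sketch works without a Chef server installed.
org_exists  = ->(_cmd) { true }   # simulates exit status 0
org_missing = ->(_cmd) { false }  # simulates a non-zero exit status

puts restore_needed?('myorg', runner: org_exists)   # prints false
puts restore_needed?('myorg', runner: org_missing)  # prints true
```

On a freshly built server the organization does not exist yet, so the restore runs; on every later Chef run the guard succeeds and the restore is skipped.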

Once this is done, you will need to restore the certificates as well. This is available in chef-restore.rb:

cert_file = "/etc/letsencrypt/live/chef-server.#{node['certbot']['zones'][0]}/cert.pem"

# Restore the certificates only if they are not on disk yet
execute 'restore_certificates' do
    command "chef-server-wrapper restore-certificates #{node['twindb-backup']['backups_bucket']}"
    environment node.run_state['execute_environment']
    not_if { File.exist?(cert_file) }
    notifies :run, 'execute[assert_certificate_symlink]', :immediately
    action :run
end

# Fail the run if the restored cert.pem is not a symlink
execute 'assert_certificate_symlink' do
  command "test -L #{cert_file}"
  only_if { File.exist?(cert_file) }
  action :nothing
end
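
Certbot keeps the files under /etc/letsencrypt/live as symlinks into its archive directory, which is presumably why the recipe asserts test -L after the restore. The following sketch shows the same check in plain Ruby against a temporary directory. The certificate_symlink? helper and the file names are hypothetical, and the "flattened" case only simulates what a restore that does not preserve links might leave behind.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper mirroring `test -L`: true only for a symlink.
def certificate_symlink?(path)
  File.symlink?(path)
end

Dir.mktmpdir do |dir|
  archive = File.join(dir, 'archive', 'cert1.pem')
  live    = File.join(dir, 'live', 'cert.pem')
  FileUtils.mkdir_p([File.dirname(archive), File.dirname(live)])
  File.write(archive, 'dummy certificate')

  # Healthy layout: live/cert.pem points at the archived file.
  File.symlink(archive, live)
  puts certificate_symlink?(live)   # prints true

  # Simulated bad restore: the link was replaced by a regular copy.
  FileUtils.rm(live)
  FileUtils.cp(archive, live)
  puts certificate_symlink?(live)   # prints false
end
```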

Zero-downtime upgrades

Alright, so now you can take backups and restore them. Why and how should you be doing this?

We primarily use this to be able to kill a Chef server at any moment and bring up a new one. Whether you need to upgrade the OS, move the instance, or do anything else that would require restarting the server, you can just bring up a new one instead.

In the next article, I will write about how to set up a Chef server in an Auto Scaling Group. Why, you ask? Great question! I will show you how to configure Chef and Terraform so that when a new AMI comes out, the Chef server is updated automatically, with no downtime.

