Backup and Restore a Standalone or Frontend install

Periodic backups of Chef Infra Server are essential to managing and maintaining a healthy configuration and ensuring the availability of important data for restoring your system, if required. The backup takes around 4 to 5 minutes for each GB of data on a t3.2xlarge AWS EC2 instance.

Requirements

Chef Infra Server 14.11.36 or later

chef-server-ctl

For the majority of use cases, chef-server-ctl backup is the recommended way to take backups of the Chef Infra Server. Use the following commands for managing backups of Chef Infra Server data, and for restoring those backups.

backup

The backup subcommand is used to back up all Chef Infra Server data. This subcommand:

Requires rsync to be installed on the Chef Infra Server before running the command
Requires a chef-server-ctl reconfigure before running the command
Should not be run in a Chef Infra Server configuration with an external PostgreSQL database; use knife ec backup instead
Puts the initial backup in the /var/opt/chef-backup directory as a tar.gz file; move this backup to a new location for safe keeping
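
The last point, moving the archive out of /var/opt/chef-backup, can be scripted. The following is a minimal sketch rather than a supported tool: the function names latest_backup and archive_backup and the destination directory are illustrative assumptions, and the archive is matched loosely by a name ending in "gz".

```shell
#!/bin/sh
# Hypothetical helper: print the newest backup archive in a directory.
# Assumes archive names end in "gz" and contain no newlines.
latest_backup() {
  backup_dir=${1:-/var/opt/chef-backup}
  ls -t "$backup_dir"/*gz 2>/dev/null | head -n 1
}

# Hypothetical helper: copy the newest archive to a safe location,
# preserving its timestamps and permissions.
archive_backup() {
  src=$(latest_backup "$1")
  dest_dir=$2
  if [ -z "$src" ]; then
    echo "no backup archive found in $1" >&2
    return 1
  fi
  mkdir -p "$dest_dir" && cp -p "$src" "$dest_dir"/
}
```

For example, archive_backup /var/opt/chef-backup /mnt/backups/chef copies the most recent archive, where /mnt/backups/chef stands in for whatever long-term storage location you use.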

Options

This subcommand has the following options:

-y, --yes: Use to specify if the Chef Infra Server can go offline during tar.gz-based backups.
--pg-options: Use to specify and pass additional options to PostgreSQL during backups. See the PostgreSQL documentation for more information.
-c, --config-only: Backup the Chef Infra Server configuration without backing up data.
-t, --timeout: Set the maximum amount of time in seconds to wait for shell commands (default 600). This option should be set to greater than 600 for backups taking longer than 10 minutes.
-h, --help: Show help message.

Syntax

This subcommand has the following syntax:

chef-server-ctl backup
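
As a rough sketch, the figure of 4 to 5 minutes per GB can be turned into a starting value for the --timeout option. The helper names estimate_backup_timeout and run_backup are illustrative assumptions, and the throughput figure was quoted for a t3.2xlarge instance, so treat the result as a value to tune rather than a guarantee.

```shell
#!/bin/sh
# Hypothetical helper: derive a --timeout value (in seconds) from the data
# size in GB, assuming the upper figure of 5 minutes per GB.
estimate_backup_timeout() {
  size_gb=$1
  estimate=$((size_gb * 5 * 60))
  # Never drop below the default of 600 seconds.
  if [ "$estimate" -lt 600 ]; then
    estimate=600
  fi
  echo "$estimate"
}

# Hypothetical wrapper: run the backup with the estimated timeout.
run_backup() {
  chef-server-ctl backup --timeout "$(estimate_backup_timeout "$1")"
}
```

Calling run_backup 10 for a data set of about 10 GB passes --timeout 3000 to the backup subcommand.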

restore

The restore subcommand is used to restore Chef Infra Server data from a backup that was created by the backup subcommand. This subcommand may also be used to add Chef Infra Server data to a newly installed server. Do not run this command in a Chef Infra Server configuration that uses an external PostgreSQL database; use knife ec restore instead. This subcommand:

Requires rsync to be installed on the Chef Infra Server before running the command
Requires a chef-server-ctl reconfigure before running the command

Ideally, the restore server will have the same FQDN as the server that you backed up. If the restore server has a different FQDN, then:

Replace the FQDN in the /etc/opscode/chef-server.rb.
Replace the FQDN in the /etc/opscode/chef-server-running.json.
Delete the old SSL certificate, key and -ssl.conf file from /var/opt/opscode/nginx/ca.
If you use a CA-issued certificate instead of a self-signed certificate, copy the CA-issued certificate and key into /var/opt/opscode/nginx/ca.
Update the /etc/chef/client.rb file on each client to point to the new server FQDN.
Run chef-server-ctl reconfigure.
Run chef-server-ctl restore.
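
The two FQDN replacements at the start of this list can be sketched as a small script. This is a minimal sketch: the function names replace_fqdn and update_server_fqdn and the example hostnames are assumptions. It keeps a .bak copy of each file and escapes the dots in the old name so that they match literally.

```shell
#!/bin/sh
# Hypothetical helper: replace every occurrence of an old FQDN with a new
# one in a file, keeping the original as FILE.bak.
replace_fqdn() {
  old=$1
  new=$2
  file=$3
  # Escape dots so "old.example.com" does not also match "oldXexample.com".
  old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
  cp -p "$file" "$file.bak" &&
    sed "s/$old_re/$new/g" "$file.bak" > "$file"
}

# Hypothetical wrapper: apply the replacement to both files named above.
update_server_fqdn() {
  replace_fqdn "$1" "$2" /etc/opscode/chef-server.rb &&
    replace_fqdn "$1" "$2" /etc/opscode/chef-server-running.json
}
```

For example, update_server_fqdn old.example.com new.example.com rewrites both files; the certificate steps and the client.rb update on each client still need to be done separately.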

Options

This subcommand has the following options:

-c, --cleanse: Use to remove all existing data on the Chef Infra Server; it will be replaced by the data in the backup archive.
-d DIRECTORY, --staging-dir DIRECTORY: Use to specify the path to an empty directory to be used during the restore process. This directory must have enough disk space to expand all data in the backup archive.
--pg-options: Use to specify and pass additional options to PostgreSQL during the restore. See the PostgreSQL documentation for more information.
-t, --timeout: Set the maximum amount of time in seconds to wait for shell commands. Set to greater than 600 for backups that take longer than 10 minutes. Default: 600.
-h, --help: Show help message.

Syntax

This subcommand has the following syntax:

chef-server-ctl restore PATH_TO_BACKUP (options)

Examples

chef-server-ctl restore /path/to/tar/archive.tar.gz
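
Because --staging-dir expects an empty directory, a pre-flight check can catch mistakes before the restore starts. The sketch below is illustrative only: is_empty_dir and restore_with_staging are hypothetical helper names, and it does not check that the directory has enough free disk space.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the path is an existing, empty directory.
is_empty_dir() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Hypothetical wrapper: refuse to restore into a missing or non-empty staging directory.
restore_with_staging() {
  archive=$1
  staging=$2
  if ! is_empty_dir "$staging"; then
    echo "staging directory must exist and be empty: $staging" >&2
    return 1
  fi
  chef-server-ctl restore "$archive" --staging-dir "$staging"
}
```

For example, restore_with_staging /path/to/tar/archive.tar.gz /var/tmp/chef-restore runs the restore only when /var/tmp/chef-restore (an assumed path) is empty.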

Backup and restore a Chef Backend install

Warning

Chef Backend is deprecated and no longer under active development. Contact your Chef account representative for information about migrating to Chef Automate HA.

This document is no longer maintained.

In a disaster recovery scenario, the backup and restore processes allow you to restore a data backup into a newly built cluster. The restore process is not intended for recovering an individual machine in the Chef Backend cluster or for a point-in-time rollback of an existing cluster.

Backup

Restoring your data in an emergency requires existing backups in the .tar format of:

The Chef Backend cluster data
The Chef Infra Server configuration file

To make backups for use in future disaster scenarios:

1. On a follower Chef Backend node, create the back-end data backup with: chef-backend-ctl backup
2. On a Chef Infra Server node, create the server configuration backup with: chef-server-ctl backup --config-only
3. Move the tar archives created in steps (1) and (2) to a long-term storage location

Restore

The restore process requires Chef Infra Server 14.11.36 or later.

Restoring Chef Backend for a Chef Infra Server cluster has two steps:

Restore the back-end services
Restore the front-end services

Backend Restore

Restoring the back-end services creates a new cluster. Select one node as the leader and restore the backup on that node first. Use the IP address of the leader node as the value for the --publish_address option.
```
chef-backend-ctl restore --publish_address my.company.ip.address /path/to/backup.tar.gz
```
For example:
```
chef-backend-ctl restore --publish_address 198.51.100.0 /backups/2021/backup.tar.gz
```
The restore process creates a new cluster and generates a JSON secrets file for setting up communication between the nodes. Locate the file at /etc/chef-backend/chef-backend-secrets.json and copy it to each node as /tmp/chef-backend-secrets.json.
Join follower nodes to your new Chef Backend cluster. For each follower node, run the join-cluster subcommand to establish communication in the cluster. The command uses:
1. The IP address of the new leader node.
2. The IP address of the joining follower node, passed with the --publish_address option.
3. The secrets option -s with the /tmp/chef-backend-secrets.json file on the node.
The join-cluster command is:
```
chef-backend-ctl join-cluster --accept-license --yes --quiet IP_OF_LEADER_NODE --publish_address IP_OF_FOLLOWER_NODE -s /tmp/chef-backend-secrets.json
```
For example:
```
chef-backend-ctl join-cluster --accept-license --yes --quiet 198.51.100.0 --publish_address 203.0.113.0 -s /tmp/chef-backend-secrets.json
```
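
With several followers, it can help to generate the exact join-cluster command for each node before logging in to it. The following is a minimal sketch: join_cluster_cmd is a hypothetical helper, the second follower address is an assumption added for illustration, and the script only prints the commands. Each printed command must still be run on the follower it names.

```shell
#!/bin/sh
# Hypothetical helper: print the join-cluster command for one follower.
join_cluster_cmd() {
  leader_ip=$1
  follower_ip=$2
  secrets=${3:-/tmp/chef-backend-secrets.json}
  printf 'chef-backend-ctl join-cluster --accept-license --yes --quiet %s --publish_address %s -s %s\n' \
    "$leader_ip" "$follower_ip" "$secrets"
}

# Print one command per follower (example addresses only).
for follower in 203.0.113.0 203.0.113.1; do
  join_cluster_cmd 198.51.100.0 "$follower"
done
```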

Generate the configuration for the front end from the new cluster:

chef-backend-ctl gen-server-config chefserver.internal > /tmp/chef-server.rb

Frontend Restore

Note

The Chef Infra Server HA install documentation includes a second process for generating and reconfiguring the front-end configuration file.

Restore Chef Infra Server from the configuration backup that you created with chef-server-ctl backup --config-only:
```
chef-server-ctl restore /path/to/chef-server-backup.tar.gz
```
Copy the generated configuration file /tmp/chef-server.rb to the front-end node and use it to replace /etc/opscode/chef-server.rb.
Run reconfigure to apply the changes.
```
chef-server-ctl reconfigure
```
Run the reindex command to repopulate your search index:
```
chef-server-ctl reindex --all
```
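
The front-end steps above can be chained so that a failure at any point stops the sequence. This is a sketch under stated assumptions: restore_frontend is a hypothetical function name, and it assumes the generated /tmp/chef-server.rb has already been copied to the front-end node.

```shell
#!/bin/sh
# Hypothetical wrapper: restore, install the generated config, reconfigure,
# and reindex, stopping at the first command that fails.
restore_frontend() {
  archive=$1
  generated_config=${2:-/tmp/chef-server.rb}
  target_config=${3:-/etc/opscode/chef-server.rb}
  chef-server-ctl restore "$archive" &&
    cp "$generated_config" "$target_config" &&
    chef-server-ctl reconfigure &&
    chef-server-ctl reindex --all
}
```

For example, restore_frontend /path/to/chef-server-backup.tar.gz uses the default configuration paths from the steps above.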

Note

If knife search does not return the expected results and data is present in the Chef Infra Server after reindex, then verify the search index configuration.

Verify

The best practice for maintaining useful backups is to periodically verify your backup by restoring:

One Chef Backend node
One Chef Infra Server node

Verify that you can execute knife commands and Chef Infra Client runs against these restored nodes.
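
A basic knife check against the restored server might look like the sketch below. The function name verify_restore is hypothetical, and the sketch assumes knife is already configured on the workstation to talk to the restored Chef Infra Server. It does not cover Chef Infra Client runs, which still need to be checked on a node.

```shell
#!/bin/sh
# Hypothetical smoke test: succeeds only if both knife commands succeed.
verify_restore() {
  # List nodes through the server API, then query the search index.
  knife node list > /dev/null &&
    knife search node '*:*' -i > /dev/null
}
```

A non-zero exit status from verify_restore indicates that either the node listing or the search query failed.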

Troubleshoot

The restore process requires Chef Infra Server 14.11.36 or later.

For a quick fix you can edit /opt/opscode/embedded/lib/ruby/gems/2.7.0/gems/chef-server-ctl-1.1.0/bin/chef-server-ctl and add the following methods:

# External Solr/ElasticSearch Commands
def external_status_opscode_solr4(_detail_level)
  solr = external_services['opscode-solr4']['external_url']
  begin
    Chef::HTTP.new(solr).get(solr_status_url)
    puts "run: opscode-solr4: connected OK to #{solr}"
  rescue StandardError => e
    puts "down: opscode-solr4: failed to connect to #{solr}: #{e.message.split("\n")[0]}"
  end
end

def external_cleanse_opscode_solr4(perform_delete)
  log <<-EOM
   Cleansing data in a remote Solr4 instance is not currently supported.
  EOM
end

def solr_status_url
  case running_service_config('opscode-erchef')['search_provider']
  when "elasticsearch"
    "/chef"
  else
    "/admin/ping?wt=json"
  end
end