Chef Automate Backups¶
Chef Automate provides tools for creating, managing, and restoring backup archives and Elasticsearch snapshots of your Chef Automate data.
automate-ctl create-backup
will create a compressed backup archive of the PostgreSQL database, configuration files, user keys, license file, git repository data, Chef Compliance profiles, and RabbitMQ queues. It also uses the snapshot capability of Elasticsearch to create incremental snapshots of your Chef Automate Elasticsearch indexes. Together, backup archives and Elasticsearch snapshots make it possible to take complete backups of a Chef Automate cluster without disrupting service.
automate-ctl list-backups
will list existing backup archives and snapshots in either human or machine readable format.
automate-ctl delete-backups
will delete specific backups or snapshots. It’s also capable of taking backup and snapshot limit parameters to prune the backups to specified limits.
automate-ctl restore-backup
will perform full or partial restorations of a backup archive or Elasticsearch snapshot.
Configuration¶
By default the Chef Automate cluster is configured to store near-complete backup archives and snapshots on the local filesystem. Backups you create will include all Chef Automate data and configuration except for the RabbitMQ queues. This is a safe default because the RabbitMQ queues are commonly quite small and backing them up requires taking the Chef Automate cluster offline. As they are not required to restore a functional Chef Automate cluster, the service disruption is rarely worth the value of the RabbitMQ queues.
All backup commands can be configured by changing the default settings in /etc/delivery/delivery.rb. Several configuration options can also be set at runtime by passing the appropriate command line switch. Configuration options passed via command line flags always supersede the default configuration.
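For example, even when a staging directory is configured in delivery.rb, the documented --staging-dir flag supersedes it for a single run (the paths here are illustrative):

```shell
# delivery.rb may set backup['staging_dir'] = '/var/opt/delivery/staging' (example value).
# The command line flag overrides that setting for this invocation only:
automate-ctl create-backup --staging-dir /mnt/ephemeral/staging
```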
The Chef Automate optional settings page contains a full reference of all backup configuration options that are available.
Local Backups¶
Local storage mode is the default configuration for both backup archives and snapshots. Backups are created and exported into the /var/opt/delivery/backups and /var/opt/delivery/elasticsearch_backups directories. You can configure the storage locations by setting the backup['location'] and backup['elasticsearch']['location'] options in delivery.rb.
When using local backups it is advised to mount a remote backup storage device to the aforementioned locations.
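For example, an NFS export could be mounted over the default backup locations before taking backups; the server name and export paths below are illustrative:

```shell
# Mount a remote NFS export over the default local backup directories.
# Replace backup-server and the export paths with your own.
mount backup-server:/export/chef-automate/backups /var/opt/delivery/backups
mount backup-server:/export/chef-automate/elasticsearch_backups /var/opt/delivery/elasticsearch_backups
```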
The staging directory is a local directory used to temporarily store the backup archive, database dump, and configuration data during the backup procedure. When left unconfigured, the Ruby temporary directory is used; it is usually nested in /tmp on Linux systems, but the value of the TMPDIR environment variable is also honored. You can configure the staging directory with the backup['staging_dir'] setting in delivery.rb.
Note
The create-backup command clears any existing files from the staging directory at the beginning of the backup procedure. Only use a directory that does not contain any other system data.
S3 Backups¶
Using Amazon Web Services (AWS) S3 as a storage location for both Chef Automate backup archives and the Elasticsearch snapshot repository is natively supported. In this mode the backup archives and snapshots will be uploaded to the bucket of your choice.
To enable this functionality, first configure the machine with access to the desired S3 bucket using either an instance profile with a valid S3 policy or a standard shared credentials file located at /root/.aws/credentials.
Below is an example Amazon Web Services (AWS) instance profile policy with the permissions required to create and use an S3 bucket called example-backups. A policy with these permissions is sufficient for the backup commands to function as expected.
{
"Statement": [
{
"Action": [
"s3:CreateBucket",
"s3:ListBucket",
"s3:GetBucketLocation",
"s3:ListBucketMultipartUploads",
"s3:ListBucketVersions"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::example-backups"
]
},
{
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:AbortMultipartUpload",
"s3:ListMultipartUploadParts"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::example-backups/*"
]
}
],
"Version": "2012-10-17"
}
Next, configure Chef Automate to use S3 for both the backups and snapshots. For example:
backup['bucket'] = 'example-backups'
backup['region'] = 'us-west-2'
backup['type'] = 's3'
backup['elasticsearch']['bucket'] = 'example-backups'
backup['elasticsearch']['region'] = 'us-west-2'
backup['elasticsearch']['type'] = 's3'
$ automate-ctl reconfigure
Note
Using the same bucket for backup archives and snapshots is supported but both must be configured independently.
SSE-S3 (AES256) server-side encryption is supported and enabled by default for both backup archives and snapshots. Backup archives can also be encrypted with SSE-KMS or SSE-C, though snapshots are currently limited to SSE-S3.
Note
While the backup utility currently supports encrypting backups with SSE-S3, SSE-KMS, and SSE-C, only SSE-S3 is currently supported for restoration.
See below for valid examples of delivery.rb configurations for server-side encryption.
# Elasticsearch snapshot SSE-S3 AES256
backup['elasticsearch']['server_side_encryption'] = true # default
backup['elasticsearch']['server_side_encryption'] = false
# Backup archive SSE-S3 AES256
backup['server_side_encryption'] = 'AES256' # default
# Backup archive SSE-KMS
backup['server_side_encryption'] = 'aws:kms'
backup['ssekms_key_id'] = 'XXXX'
# Backup archive SSE-C
backup["sse_customer_algorithm"] = "AES256"
backup["sse_customer_key"] = "XXXX"
backup["sse_customer_key_md5"] = "XXXX"
Backup Cron¶
To enable a backup cron job that creates new backups and prunes older backups and snapshots, configure the following settings in delivery.rb:
backup['cron']['enabled'] = true
backup['cron']['max_archives'] = 7
backup['cron']['max_snapshots'] = 7
backup['cron']['notation'] = "0 0 0/1 1/1 * ? * "
If omitted, the default max_archives, max_snapshots, and notation settings will create daily backups and keep the most recent seven. Any standard cron notation is supported. If you wish to keep all backups or snapshots, set the max_snapshots and/or max_archives options to nil.
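For example, a delivery.rb fragment that keeps daily snapshot pruning but disables archive pruning by setting max_archives to nil:

```ruby
backup['cron']['enabled'] = true
backup['cron']['max_archives'] = nil  # keep every backup archive
backup['cron']['max_snapshots'] = 7   # prune snapshots to the most recent seven
```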
Create Backups¶
The create-backup subcommand is used to create Chef Automate backups. By default, it creates Automate backup archives and Elasticsearch snapshots.
Syntax
$ automate-ctl create-backup [NAME] [options]
--chef-server-config Back up the Chef Server config if present
--digest [int] The SHA digest length to output. 256, 384, and 512 are valid
--force Agree to all warnings and prompts
--name [string] The output name of the backup
--no-census Do not back up Chef Automate's census data
--no-compliance-profiles Do not back up Chef Automate's compliance profiles
--no-config Do not back up Chef Automate's configuration directory
--no-db Do not back up Chef Automate's database
--no-elasticsearch Do not snapshot Chef Automate's Elasticsearch
--no-git Do not back up Chef Automate's git repositories
--no-license Do not back up Chef Automate's license file
--no-notifications Do not back up Chef Automate's notifications rulestore
--no-wait Do not wait for non-blocking backup operations
--no-wait-for-lock Do not wait for Elasticsearch lock
--quiet Do not output non-error information
--rabbit Back up Chef Automate's RabbitMQ queues
--retry-limit Maximum number of times to retry archive uploads to S3
--staging-dir [string] The path to use for temporary files during backup
-h, --help Show the usage message
The NAME value is optional. If omitted, a default name based on the current time will be used.
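For example, a backup can be given an explicit name instead of the generated timestamp default (the name here is illustrative):

```shell
# Create a backup named pre-upgrade-backup rather than the timestamp default
automate-ctl create-backup pre-upgrade-backup
```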
Warning
In rare circumstances, jobs that are running at the time of backup creation may be left in an unrecoverable state. For this reason, it’s recommended to take a backup when no critical jobs are running.
Note
create-backup should be run outside of root-only directories like /root, as it uses chpst to run as the chef-pgsql user, which cannot operate with a current working directory owned by root.
Examples
- Complete backup:
$ automate-ctl create-backup
- Elasticsearch snapshot only:
$ automate-ctl create-backup --no-census --no-config --no-db --no-license --no-git
- Automate archive only:
$ automate-ctl create-backup --no-elasticsearch
List Backups¶
The list-backups command is used to list Chef Automate backup archives and Elasticsearch snapshots in either human- or machine-readable output.
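For example, using the --automate and --elasticsearch flags that also appear in the restore examples later in this document:

```shell
# List all backup archives and Elasticsearch snapshots
automate-ctl list-backups

# List only backup archives
automate-ctl list-backups --automate

# List only Elasticsearch snapshots
automate-ctl list-backups --elasticsearch
```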
Delete Backups¶
The delete-backups command is used to delete Chef Automate backup archives and Elasticsearch snapshots. The command matches a given regular expression and prompts the user to confirm deletion of each matched backup or snapshot. It can also be passed maximum archive and snapshot limits and prune the backup repositories to conform to those limits.
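A sketch of typical usage follows. The regular expression is illustrative, and the pruning flag names are assumptions that should be verified against automate-ctl delete-backups --help:

```shell
# Delete backups and snapshots from October 2016, confirming each match
automate-ctl delete-backups '2016-10-.*'

# Prune the repositories down to fixed limits
# (flag names below are assumptions; confirm with --help)
automate-ctl delete-backups --max-archives 7 --max-snapshots 7
```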
Restore Backups¶
The restore-backup command is used to fully or partially restore a Chef Automate cluster from backup archives and/or Elasticsearch snapshots.
Note
Backups created with the older automate-ctl backup-data command are not supported by this command. If you wish to restore an older backup, install the version of Chef Automate that took the backup and use automate-ctl restore-data.
Local Backups¶
Follow the process below for an example of restoring a Chef Automate cluster from a local backup archive and a shared filesystem Elasticsearch snapshot:
- Copy the Chef Automate backup archive to a directory that is large enough to expand the archive, e.g.:
scp user@backup-server:2016-10-14-08-38-55-chef-automate-backup.zst /mnt/ephemeral/
- Install the same version of Chef Automate that was used to take the backup. If the versions do not match, you will be prompted with a compatibility warning but can still proceed with the restore if you choose to do so.
dpkg -i delivery.deb
- Mount the Elasticsearch shared filesystem to the same mount point.
mount backup-server:/export/chef-automate/elasticsearch_backups /var/opt/delivery/elasticsearch_backups
- Restore the backup archive and snapshot:
$ automate-ctl restore-backup /mnt/ephemeral/2016-10-14-08-38-55-chef-automate-backup.zst 2016-10-14-08-38-55-chef-automate-backup --staging-dir /mnt/ephemeral/restore
Note
Specifying a staging directory is not mandatory, but when one is given, all existing data in it will be cleared.
S3 Backups¶
Follow the process below for an example of restoring a Chef Automate cluster from a backup archive and Elasticsearch snapshot in Amazon Web Services (AWS) S3:
- Install the same version of Chef Automate that was used to take the backup. If the versions do not match you can still proceed with the restore but we cannot guarantee compatibility.
dpkg -i delivery.deb
- Restore the backup archive and snapshot by specifying the region, bucket, backup artifact name and snapshot name:
$ automate-ctl restore-backup us-east-1:your-s3-bucket:2016-10-14-08-38-55-chef-automate-backup.zst 2016-10-14-08-38-55-chef-automate-backup
Partial Restoration¶
It is possible to restore only specific data from a Chef Automate backup artifact. Below is an example of restoring only the PostgreSQL database and git repositories from a backup archive in S3:
- Determine the archive you want to restore:
automate-ctl list-backups --automate
- Restore it:
$ automate-ctl restore-backup us-east-1:your-s3-bucket:2016-10-14-08-38-55-chef-automate-backup.zst --no-census --no-license --no-config
It is also possible to restore a functional Chef Automate cluster to a specific Elasticsearch snapshot. Below is an example of restoring only an Elasticsearch snapshot:
- Determine the snapshot you want to restore:
automate-ctl list-backups --elasticsearch
- Restore it:
automate-ctl restore-backup 2016-10-14-08-38-55-chef-automate-backup