Automate upgrades with Vault Enterprise
Enterprise Only
The functionality described in this tutorial is available only in Vault Enterprise.
Challenge
A Vault version upgrade is a delicate operation in any production environment, and it's important to have best practices in place that simplify the process where possible.
Solution
Vault Enterprise provides automated version upgrades with the autopilot feature when using Integrated Storage. The feature allows you to start new Vault nodes alongside the nodes running the older version and automatically switch to the new nodes after they reach quorum.
Autopilot handles the leader election process for you and ensures that the new leader is elected from among the new nodes, so that removing the older version nodes from the datacenter does not trigger another leader election.
Prerequisites
To test the automated upgrades feature explained in this tutorial you will need:
- A Vault Enterprise cluster with three nodes running Vault Enterprise 1.11.0 or later.
- Three extra nodes with Vault Enterprise 1.11.0 or later to use as the new servers after the upgrade.
You will also need a text editor, the curl executable to test the API
endpoints, and optionally the jq command to format the output for curl.
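One way to confirm that the tools listed above are installed is a small shell check. This is a minimal sketch: the function name missing_tools is hypothetical, and it assumes the Vault CLI is installed under the name vault.

```shell
#!/usr/bin/env bash
# Print the name of every given tool that is not found on PATH.
# (missing_tools is an illustrative helper, not part of the tutorial repository.)
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# vault and curl are required by the steps in this tutorial.
missing="$(missing_tools vault curl)"
if [ -n "$missing" ]; then
  echo "Missing required tools:" $missing
fi

# jq is optional; it only formats the JSON returned by curl.
if [ -n "$(missing_tools jq)" ]; then
  echo "jq not found: API output will not be filtered or pretty-printed"
fi
```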
Scenario introduction
To learn about the new autopilot behavior, start an initial 3 node cluster (Note Step 1 diagram). Then, start an additional 3 nodes with an automatic upgrade version specified, and add them to the cluster (Note Step 2 diagram).
You will run a script that performs the following steps:
- Initialize and unseal vault_1 (http://127.0.0.1:8100). The root token creates a transit key that enables the other Vault servers to auto-unseal. This Vault server is not a part of the cluster.
- Initialize and unseal vault_2 (http://127.0.0.1:8200). This Vault starts as the cluster leader.
- Start vault_3 (http://127.0.0.1:8300). It automatically joins the cluster via retry_join.
- Start vault_4 (http://127.0.0.1:8400). It automatically joins the cluster via retry_join.
If this is your first time setting up a Vault cluster with integrated storage, go through the Vault HA Cluster with Integrated Storage tutorial.
Set up an initial cluster
- Retrieve the configuration by cloning the hashicorp/learn-vault-raft repository from GitHub.
  $ git clone https://github.com/hashicorp-education/learn-vault-raft
  This repository holds supporting content for all the Vault learn tutorials. The content specific to this tutorial is in a sub-directory.
- Change the working directory to learn-vault-raft/raft-auto-upgrade/local.
  $ cd learn-vault-raft/raft-auto-upgrade/local
- Set the setup_1.sh file to executable.
  $ chmod +x setup_1.sh
- Execute the setup_1.sh script to spin up a Vault cluster.
  $ ./setup_1.sh
  [vault_1] Creating configuration
    - creating /git/learn-vault-raft/raft-autopilot/local/config-vault_1.hcl
  [vault_2] Creating configuration
    - creating /git/learn-vault-raft/raft-autopilot/local/config-vault_2.hcl
    - creating /git/learn-vault-raft/raft-autopilot/local/raft-vault_2
  ...snip...
  [vault_3] starting Vault server @ http://127.0.0.1:8300
  Using [vault_1] root token (hvs.tqKc9An04pQY5H1uysw02Xn6) to retrieve transit key for auto-unseal
  [vault_4] starting Vault server @ http://127.0.0.1:8400
  Using [vault_1] root token (hvs.tqKc9An04pQY5H1uysw02Xn6) to retrieve transit key for auto-unseal
  You can find the server configuration files and the log files in the working directory.
- Use your preferred text editor and open the config-vault_2.hcl file to examine the generated server configuration for vault_2.
  config-vault_2.hcl
  storage "raft" {
    path = "/learn-vault-raft/raft-auto-upgrade/local/raft-vault_2/"
    node_id = "vault_2"
  }
  listener "tcp" {
    address = "127.0.0.1:8200"
    cluster_address = "127.0.0.1:8201"
    tls_disable = true
  }
  seal "transit" {
    address = "http://127.0.0.1:8100"
    # token is read from VAULT_TOKEN env
    # token = ""
    disable_renewal = "false"
    key_name = "unseal_key"
    mount_path = "transit/"
  }
  disable_mlock = true
  cluster_addr = "http://127.0.0.1:8201"
- Review the generated server configuration for vault_3.
  config-vault_3.hcl
  storage "raft" {
    path = "/learn-vault-raft/raft-auto-upgrade/local/raft-vault_3/"
    node_id = "vault_3"
    retry_join {
      leader_api_addr = "http://127.0.0.1:8200"
    }
  }
  ...snip...
  The retry_join configuration block has the vault_3 and vault_4 nodes automatically joining the cluster.
- Export an environment variable for the vault CLI to address the vault_2 server.
  $ export VAULT_ADDR=http://127.0.0.1:8200
- Verify the cluster members.
  $ vault operator raft list-peers
  Node       Address           State       Voter
  ----       -------           -----       -----
  vault_2    127.0.0.1:8201    leader      true
  vault_3    127.0.0.1:8301    follower    false
  vault_4    127.0.0.1:8401    follower    false
- View the autopilot's upgrade state information.
  $ curl -s --header "X-Vault-Token: $(cat root_token-vault_2)" \
      $VAULT_ADDR/v1/sys/storage/raft/autopilot/state | jq -r ".data.upgrade_info"
  Output:
  {
    "status": "idle",
    "target_version": "1.11.0",
    "target_version_voters": [
      "vault_2",
      "vault_3",
      "vault_4"
    ]
  }
  Notice that the status field of the upgrade info is idle.
  If you have the watch command (or similar), you can follow the upgrade status as you proceed to add more nodes.
  $ watch -n 0.5 'curl -H "X-Vault-Token: $(cat root_token-vault_2)" $VAULT_ADDR/v1/sys/storage/raft/autopilot/state | jq -r ".data.upgrade_info"'
  This checks the autopilot state every half second.
Add new nodes
When autopilot detects that the count of nodes on the new version equals or exceeds the count of nodes on the older version, it begins promoting the new nodes to voters and demoting the older version nodes to non-voters.
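The condition described above can be written as a simple comparison. The sketch below only restates that rule for illustration; it is not how autopilot is implemented, and the function name upgrade_can_start is hypothetical.

```shell
#!/usr/bin/env bash
# Succeed (exit status 0) when the number of new-version nodes is at least
# the number of older-version nodes, which is the condition stated above.
# (upgrade_can_start is an illustrative name, not a Vault command.)
upgrade_can_start() {
  local new_count="$1" old_count="$2"
  [ "$new_count" -ge "$old_count" ]
}

# With three older-version nodes, compare two and then three new nodes.
for new in 2 3; do
  if upgrade_can_start "$new" 3; then
    echo "new=$new old=3: promotion can begin"
  else
    echo "new=$new old=3: still waiting for new voters"
  fi
done
```

In this tutorial the cluster starts with three older-version nodes, so promotion begins once all three new nodes have joined.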
- Use your preferred text editor and open the config-vault_5.hcl file to examine the generated server configuration for vault_5.
  config-vault_5.hcl
  storage "raft" {
    path = "/learn-vault-raft/raft-auto-upgrade/local/raft-vault_5/"
    node_id = "vault_5"
    autopilot_upgrade_version = "1.12.0.1"
    retry_join {
      leader_api_addr = "http://127.0.0.1:8200"
    }
  }
  ...snip...
  To specify an automatic upgrade target version, add the autopilot_upgrade_version parameter in the storage stanza where its value is a SemVer compatible version string of your choosing.
  Vault Configuration
  The vault_5, vault_6 and vault_7 nodes have the autopilot_upgrade_version parameter configured. This implies that those nodes have a specific target Vault version.
- Set the setup_2.sh file to executable.
  $ chmod +x setup_2.sh
- Execute the setup_2.sh script to add three additional nodes to the cluster.
  $ ./setup_2.sh
  [vault_5] starting Vault server @ http://127.0.0.1:8500
  Using [vault_1] root token (hvs.tqKc9An04pQY5H1uysw02Xn6) to retrieve transit key for auto-unseal
  [vault_6] starting Vault server @ http://127.0.0.1:8600
  Using [vault_1] root token (hvs.tqKc9An04pQY5H1uysw02Xn6) to retrieve transit key for auto-unseal
  [vault_7] starting Vault server @ http://127.0.0.1:8700
  Using [vault_1] root token (hvs.tqKc9An04pQY5H1uysw02Xn6) to retrieve transit key for auto-unseal
- Follow the autopilot's upgrade status as it progresses.
  $ curl -s --header "X-Vault-Token: $(cat root_token-vault_2)" \
      $VAULT_ADDR/v1/sys/storage/raft/autopilot/state | jq -r ".data.upgrade_info"
  Or,
  $ watch -n 0.5 'curl -H "X-Vault-Token: $(cat root_token-vault_2)" $VAULT_ADDR/v1/sys/storage/raft/autopilot/state | jq -r ".data.upgrade_info"'
  The status changes from idle to await-new-voters.
  {
    "other_version_voters": [
      "vault_2",
      "vault_3",
      "vault_4"
    ],
    "status": "await-new-voters",
    "target_version": "1.12.0.1",
    "target_version_non_voters": [
      "vault_5",
      "vault_6"
    ]
  }
  The status will change to promoting as autopilot promotes the 3 new nodes to be voters. Then the status will change to demoting, as autopilot demotes 2 out of the 3 older version nodes to be non-voters. Then the status will change to leader-transfer as the leader changes from vault_2 to vault_5.
  {
    "other_version_non_voters": [
      "vault_3",
      "vault_4"
    ],
    "other_version_voters": [
      "vault_2"
    ],
    "status": "leader-transfer",
    "target_version": "1.12.0.1",
    "target_version_voters": [
      "vault_5",
      "vault_6",
      "vault_7"
    ]
  }
  Finally, the status changes to await-server-removal.
  {
    "other_version_non_voters": [
      "vault_2",
      "vault_3",
      "vault_4"
    ],
    "status": "await-server-removal",
    "target_version": "1.12.0.1",
    "target_version_voters": [
      "vault_5",
      "vault_6",
      "vault_7"
    ]
  }
Autopilot Status
 The progression of autopilot statuses during an upgrade
looks like: idle → await-new-voters → promoting → demoting →
leader-transfer → await-server-removal → idle.
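If watch is not available, or you want a script to block until the upgrade reaches a given status, a small polling loop can do the same job. The following is a sketch under assumptions: the function names wait_for_upgrade_status and current_upgrade_status are hypothetical, and the second one assumes the root_token-vault_2 file and the VAULT_ADDR variable used earlier in this tutorial, plus jq.

```shell
#!/usr/bin/env bash
# Poll until the upgrade status equals the wanted value.
#   $1 = wanted status, $2 = maximum number of polls, $3 = seconds between polls,
#   remaining arguments = a command that prints the current status.
# Returns 0 if the status was reached, 1 if the polls ran out.
wait_for_upgrade_status() {
  local want="$1" tries="$2" delay="$3" status i=0
  shift 3
  while [ "$i" -lt "$tries" ]; do
    status="$("$@")"
    echo "upgrade status: $status"
    if [ "$status" = "$want" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Read only the status field from the autopilot state endpoint.
# Assumes root_token-vault_2 exists in the working directory (hypothetical helper).
current_upgrade_status() {
  curl -s --header "X-Vault-Token: $(cat root_token-vault_2)" \
    "$VAULT_ADDR/v1/sys/storage/raft/autopilot/state" | jq -r ".data.upgrade_info.status"
}

# Example call against a running cluster (not executed here):
#   wait_for_upgrade_status await-server-removal 120 1 current_upgrade_status
```

Passing the status command as an argument keeps the loop independent of how the status is fetched, so the same loop works with a different token file or address.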
Remove non-voter nodes
Once the autopilot upgrade status changes to await-server-removal, you can
remove the older version non-voting nodes from the cluster. 
- List the current peers before removing any nodes.
  $ vault operator raft list-peers
  Node       Address           State       Voter
  ----       -------           -----       -----
  vault_2    127.0.0.1:8201    follower    false
  vault_3    127.0.0.1:8301    follower    false
  vault_4    127.0.0.1:8401    follower    false
  vault_5    127.0.0.1:8501    leader      true
  vault_6    127.0.0.1:8601    follower    true
  vault_7    127.0.0.1:8701    follower    true
- Export an environment variable for the vault CLI to address the vault_5 server, which is now the leader.
  $ export VAULT_ADDR=http://127.0.0.1:8500
- Remove vault_2 from the cluster.
  $ vault operator raft remove-peer vault_2
  Peer removed successfully!
- Remove vault_3 from the cluster.
  $ vault operator raft remove-peer vault_3
  Peer removed successfully!
- Remove vault_4 from the cluster.
  $ vault operator raft remove-peer vault_4
  Peer removed successfully!
- Verify non-voter node removal from the cluster.
  $ vault operator raft list-peers
  Node       Address           State       Voter
  ----       -------           -----       -----
  vault_5    127.0.0.1:8501    leader      true
  vault_6    127.0.0.1:8601    follower    true
  vault_7    127.0.0.1:8701    follower    true
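With more than a few nodes, the individual remove-peer commands above could be driven from the list-peers output instead of typed by hand. The sketch below assumes the four-column table layout shown in this tutorial (Node, Address, State, Voter); the function name non_voters is hypothetical. Only run such a loop once the upgrade status is await-server-removal, because before that point a new node that has not yet been promoted is also a non-voter.

```shell
#!/usr/bin/env bash
# Read `vault operator raft list-peers` table output on stdin and print
# the node IDs whose Voter column (4th field) is "false".
# (non_voters is an illustrative helper, not a Vault command.)
non_voters() {
  awk '$4 == "false" { print $1 }'
}

# Demonstration using the peer list shown before the removal steps.
non_voters <<'EOF'
Node       Address           State       Voter
----       -------           -----       -----
vault_2    127.0.0.1:8201    follower    false
vault_3    127.0.0.1:8301    follower    false
vault_4    127.0.0.1:8401    follower    false
vault_5    127.0.0.1:8501    leader      true
vault_6    127.0.0.1:8601    follower    true
vault_7    127.0.0.1:8701    follower    true
EOF
# prints vault_2, vault_3 and vault_4, one per line

# Against a running cluster (not executed here):
#   vault operator raft list-peers | non_voters | while read -r node; do
#     vault operator raft remove-peer "$node"
#   done
```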
Autopilot configuration
Vault Enterprise enables automated upgrade migrations by default.
$ vault operator raft autopilot get-config
Output:
Key                                   Value
---                                   -----
Cleanup Dead Servers                  false
Last Contact Threshold                10s
Dead Server Last Contact Threshold    24h0m0s
Server Stabilization Time             10s
Min Quorum                            0
Max Trailing Logs                     1000
Disable Upgrade Migration             false
To disable automated upgrade migrations, set the -disable-upgrade-migration
parameter to true.
$ vault operator raft autopilot set-config -disable-upgrade-migration=true
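To confirm the setting from a script, you can read the Disable Upgrade Migration row from the get-config output. This is a minimal sketch that assumes the key/value table layout shown above; the function name upgrade_migration_setting is hypothetical.

```shell
#!/usr/bin/env bash
# Read `vault operator raft autopilot get-config` output on stdin and print
# the value of the "Disable Upgrade Migration" row (the last field on that line).
# (upgrade_migration_setting is an illustrative helper, not a Vault command.)
upgrade_migration_setting() {
  awk '/^Disable Upgrade Migration/ { print $NF }'
}

# Demonstration using the default configuration shown above.
upgrade_migration_setting <<'EOF'
Key                                   Value
---                                   -----
Cleanup Dead Servers                  false
Last Contact Threshold                10s
Dead Server Last Contact Threshold    24h0m0s
Server Stabilization Time             10s
Min Quorum                            0
Max Trailing Logs                     1000
Disable Upgrade Migration             false
EOF
# prints: false

# Against a running cluster (not executed here):
#   vault operator raft autopilot get-config | upgrade_migration_setting
```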
Clean up
The cluster.sh script provides a clean operation that removes all services,
configuration, and modifications to your local system.
Clean up your local workstation.
$ ./cluster.sh clean
Found 1 Vault service(s) matching that name
[vault_1] stopping
...snip...
Removing log file /git/learn-vault-raft/raft-autopilot/local/vault_5.log
Removing log file /git/learn-vault-raft/raft-autopilot/local/vault_6.log
Clean complete
Next steps
In this tutorial, you upgraded your Vault datacenter by using autopilot's automated upgrades functionality. Automated upgrades let you automatically upgrade a cluster of Vault nodes to a new version as updated server nodes join the cluster. Once the number of nodes on the new version is equal to or greater than the number of nodes on the older version, autopilot promotes the newer version nodes to voters, demotes the older version nodes to non-voters, and begins a leadership transfer from the older version leader to one of the newer version nodes. After the leadership transfer completes, you can remove the older version non-voting nodes from the cluster.

