Integrate brownfield applications with Vault Agent and AppRole
Author(s): Mohamed Jouini, Pablo Diaz
This pattern demonstrates how to implement HashiCorp Vault Agent with AppRole authentication for brownfield applications that you cannot modify, providing complete lifecycle management from initial setup through Day N operations. It covers HashiCorp's validated deployment patterns, explains critical decision points between security and operational simplicity, and details the implementation of automated lifecycle management using orchestration tools. This pattern delivers enterprise-grade secrets management without application code changes, automated credential rotation with zero application downtime, and addresses the Secret Zero challenge to ensure secure initial authentication.
Target audience
This pattern targets organizations implementing the three-zone AppRole architecture:
- Orchestrator administrator: Manages Ansible/Jenkins automation, creates rotation playbooks, schedules jobs, and handles monitoring/emergency response.
- Vault administrator: Manages Vault cluster, configures AppRole auth methods and policies, creates administrator tokens for orchestrator access.
- System administrator: Installs Vault Agent on target systems, manages file permissions and systemd services, coordinates deployment.
- Application owner: Owns brownfield applications, understands restart patterns, and validates file-based secret delivery integration.
Prerequisites
We assume that readers are generally familiar with fundamental Vault operations, design principles, and configuration constructs such as AppRole authentication, tokens, policies, and secrets engines. We also expect familiarity with Ansible playbooks, YAML configuration, and automation concepts.
To complete the steps outlined in this guide, you need the following:
- Orchestrator administrator:
- Ansible control node with HashiCorp Collection
- Vault authentication with Secret ID generation and rotation permissions
- Access to scheduling systems and monitoring/alerting integration
- Vault administrator:
- Privileged access to Vault cluster (v1.12+ recommended) with auth method and policy management permissions
- Authority to create administrative tokens for orchestrator authentication
- System administrator:
- Root/sudo access to target systems for Vault Agent installation
- Network connectivity to Vault infrastructure and SSH access for deployment
- Application owner:
- Understanding of application configuration and restart procedures
- Authority to validate file-based secret integration requirements
Background and best practices
Organizations with brownfield applications face a fundamental challenge in implementing enterprise-grade secrets management where direct Vault integration is not feasible. This situation commonly occurs when developer teams resist integration overhead due to resource constraints, business-critical applications have unacceptable modification risk, and compliance requirements demand centralized secrets management despite legacy architecture constraints.
Current problematic approaches include persistent Secret IDs stored indefinitely on disk with infrequent rotation cycles, manual credential management processes prone to human error, static credential files with overly broad file system permissions, and insufficient audit trails for credential access patterns. These anti-patterns reduce security posture and create compliance gaps that audit teams consistently flag as high-risk findings.
HashiCorp Vault Agent with AppRole authentication addresses these challenges through a transparent proxy architecture, where the agent sits between Vault and brownfield applications, enabling seamless enterprise secrets management without requiring application code changes. The agent handles authentication, token lifecycle management, and secret delivery through familiar file-based configuration patterns that operations teams already understand and manage.
Key architectural principles include maintaining strict separation of concerns between orchestration, secrets management, and application layers. The pattern ensures zero application impact through transparent file-based operations and implements defense-in-depth security with multiple layers of protection. It also preserves operational familiarity using standard tools and procedures.
Validated architecture
HashiCorp Vault Agent AppRole Lifecycle Management: Complete Workflow and Decision Tree

The validated architecture employs a security-focused three-zone approach that addresses the complete AppRole lifecycle from initial setup through operational scenarios:
Zone 1: Trusted Orchestrator (Ansible, Jenkins, GitLab CI, or other CD pipeline tools) manages the complete Secret ID lifecycle through both Vault APIs and target system access. On the Vault side, it handles administrative Vault token management, response wrapping for secure Secret Zero delivery, and Secret ID generation and time to live (TTL) monitoring. On the target systems, it deploys Role IDs and wrapped Secret IDs, restarts Vault Agent services, conducts health monitoring, manages file system operations, coordinates emergency response procedures, and proactively prevents expiration-related service disruptions. This document uses Ansible playbooks in all examples to demonstrate the orchestrator implementation patterns.
Zone 2: HashiCorp Vault Infrastructure provides centralized secrets management with AppRole authentication, dynamic secrets engines for database credentials as needed, and other sensitive data. It also features policy-based access control with least privilege principles, comprehensive audit logging for compliance requirements, and response wrapping services for secure credential delivery. The infrastructure enforces HTTPS encryption for all communications, Classless Inter-Domain Routing (CIDR) based access restrictions, and token and Secret ID TTL boundaries.
Zone 3: Target Environment (Application Systems) contains Vault Agent operating as a transparent proxy with its Auto-Auth module automatically handling the complete AppRole authentication workflow (reading credentials, unwrapping tokens, authenticating with Vault, renewing tokens, and re-authenticating when needed), template engine for dynamic secret file rendering, token sink for secure token storage, and brownfield applications that remain entirely unchanged. The agent manages file system layout with appropriate permissions, handles network resilience through retry logic, and provides application transparency through atomic file updates.
The workflow illustrates critical decision points, including Secret ID persistence choices that impact all operational scenarios, token lifecycle management with automatic renewal until the Maximum TTL boundaries, and emergency response procedures for credential expiration events.
People and process considerations
This pattern requires coordination across multiple organizational roles with defined responsibilities and interfaces. Orchestrator administrators focus on Ansible automation, playbook development, and emergency response coordination, requiring skills in automation frameworks and monitoring systems. Vault administrators manage cluster operations and policy frameworks, making critical decisions about AppRole configurations and administrative token management. System administrators handle target system operations, including Vault Agent deployment and file system management, while application owners provide operational requirements and validate integration without code modifications.
Establish clear ownership boundaries through documented procedures, implement comprehensive training programs tailored to each role's specific responsibilities, plan emergency escalation paths with defined response times and authority levels, and maintain cross-team communication through regular reviews of operational metrics and shared documentation of lessons learned.
Success factors include starting with conservative security settings and adjusting them based on operational evidence rather than assumptions. Implementing comprehensive monitoring from day one provides visibility into all lifecycle events. Planning emergency response procedures with validated response times and tested automation is also crucial. Maintaining team knowledge through comprehensive documentation and regular training updates is essential.
Understand the critical Secret ID persistence decision
The Secret Zero challenge and why response wrapping is essential

Organizations implementing AppRole authentication face a fundamental bootstrap security problem known as Secret Zero: how do you securely deliver the initial secret credential without having a pre-shared secret? This challenge becomes critical in brownfield environments where:
- Manual delivery is impractical at scale across hundreds or thousands of systems
- Static pre-shared secrets violate security principles and create audit compliance issues
- Network-based delivery channels (SSH, HTTPS) require secure credential distribution themselves
- File-based delivery without protection enables credential interception and replay attacks
Response wrapping solves Secret Zero by providing three critical security guarantees that eliminate bootstrap vulnerabilities:
- Tamper Evidence: Any attempt to read wrapped content is cryptographically detectable through Hash-based Message Authentication Code (HMAC) signatures
- Time Boundaries: Short TTL (5 minutes) minimizes exposure windows and prevents credential sprawl
- Single-Use Consumption: Each wrapped token unwraps only once, preventing replay attacks
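The single-use and time-bound guarantees can be checked before a wrapped token is consumed. The following is a minimal shell sketch, assuming the vault CLI is on the PATH and VAULT_ADDR is exported; the function name verify_wrap_origin and the expected path argument are illustrative and not part of the pattern. It queries the sys/wrapping/lookup endpoint, which does not consume the token, and compares the reported creation path with the path the orchestrator is expected to have used.

```shell
#!/bin/sh
# Sketch: validate a response-wrapping token before unwrapping it.
# The function name and the expected path are illustrative assumptions.

# verify_wrap_origin TOKEN EXPECTED_PATH
# Returns 0 only if Vault reports that TOKEN was created at EXPECTED_PATH.
verify_wrap_origin() {
    token="$1"
    expected_path="$2"

    # The lookup fails if the token has expired or was already unwrapped,
    # which is the signal that someone else may have consumed it.
    if ! actual_path=$(vault write -field=creation_path sys/wrapping/lookup token="$token"); then
        echo "lookup failed: token expired, already unwrapped, or invalid" >&2
        return 1
    fi

    # A token created at any other path was not produced by the Secret ID endpoint.
    if [ "$actual_path" != "$expected_path" ]; then
        echo "unexpected creation path: $actual_path" >&2
        return 1
    fi
    return 0
}
```

For a role named myapp (a placeholder), a caller might run verify_wrap_origin "$WRAPPED_TOKEN" "auth/approle/role/myapp/secret-id" and unwrap only when it returns 0; a failed lookup is worth treating as a possible interception.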
The fundamental persistence decision problem
However, response wrapping creates an operational challenge: wrapped tokens are inherently single-use, so they cannot cover restart scenarios or re-authentication when a token reaches its maximum TTL. This forces organizations to make a critical architectural decision about how to handle agent restarts, VM reboots, service maintenance, and, crucially, token re-authentication at the maximum TTL.
The core issue: After unwrapping a Secret ID, you must decide whether to:
- Persist the unwrapped Secret ID on disk, which enables agent restarts and resilience but increases the risk if someone steals credentials from disk.
- Keep it ephemeral in memory only, which offers maximum security with no local persistence but means the orchestrator must generate and deliver a new wrapped Secret ID and trigger an agent restart every time the token reaches its maximum TTL or the Secret ID expires.

Critical Authentication Dependency: When Vault Agent tokens reach their maximum TTL boundary and Vault can no longer renew them, the agent must re-authenticate with Vault using both Role ID and Secret ID. This re-authentication process requires:
- Role ID: Available from role_id_file_path (see agent config file): always persistent, not sensitive
- Secret ID: Available from secret_id_file_path (see agent config file): must be on disk (persistent mode) OR delivered by orchestrator (ephemeral mode)
If Secret ID is not available during token re-authentication, the service fails and requires emergency intervention.
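A pre-flight check can catch a missing credential file before re-authentication fails. The following is a minimal shell sketch; the function name check_approle_files is hypothetical, and the paths in the usage note are the ones this guide uses for role_id_file_path and secret_id_file_path. It only confirms that both files exist and are non-empty; it does not confirm that the Secret ID is still valid in Vault.

```shell
#!/bin/sh
# Sketch: confirm both AppRole credential files are present and non-empty.
# check_approle_files is an illustrative name, not part of Vault Agent.

# check_approle_files ROLE_ID_FILE SECRET_ID_FILE
# Returns 0 if both files exist with content, 1 otherwise.
check_approle_files() {
    status=0
    for f in "$1" "$2"; do
        # -s is true only for an existing file with size greater than zero
        if [ ! -s "$f" ]; then
            echo "missing or empty credential file: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

A monitoring job could call check_approle_files /opt/vault-agent/auth/role_id /opt/vault-agent/auth/secret_id and raise an alert on a non-zero exit code.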
The Core Technical Reality:
# Step 1: Orchestrator delivers wrapped Secret ID
echo "hvs.CAESIJ8Kqz9aTlyq3h7qPe2fiKF8CsVn9E..." > /opt/vault-agent/auth/secret_id
# Step 2: Agent unwraps token (SINGLE USE - token becomes invalid)
vault unwrap hvs.CAESIJ8Kqz9aTlyq3h7qPe2fiKF8CsVn9E...
# Result: abc123-def456-ghi789-jkl012-mno345-pqr678
# Step 3: Wrapped token is now permanently invalid
vault unwrap hvs.CAESIJ8Kqz9aTlyq3h7qPe2fiKF8CsVn9E... # FAILS - already used
Why ephemeral mode cannot address all use cases without complex automation
The single-use nature of wrapped Secret IDs means that every Vault Agent or VM restart, every token re-authentication at Max TTL, every Secret ID Max TTL expiry, and every Secret ID Max Uses exhaustion requires a fresh wrapped Secret ID delivered by the orchestrator, followed by an immediate agent restart to process the new credential. This creates operational challenges that demand mature, event-driven automation:
Event-driven monitoring requirements
- Real-time restart detection: Continuous monitoring of agent and VM lifecycle events (systemd journal, process states) to trigger immediate credential delivery.
- Proactive Secret ID lifecycle monitoring: Tracking Secret ID Max TTL and Max Uses across all agents to prevent expiration-related outages (required for both modes)
- Event processing pipelines: Sub-3-minute response times from detection to wrapped Secret ID delivery and agent restart
- Automated decision engines: Differentiating between planned maintenance, emergency failures, and routine Secret ID rotation events
- Multi-tier redundancy: Ensuring monitoring systems themselves do not become single points of failure during mass restart scenarios
Complex CI/CD pipeline dependencies
- State management systems: Tracking which agents need new credentials due to restarts, TTL expiry, or use exhaustion
- Validation and rollback procedures: Ensuring successful credential delivery and agent restart completion
- Cross-environment coordination: Handling development, staging, and production simultaneously during planned rotations
- Emergency response automation: Generating and delivering new wrapped Secret IDs within minutes for unplanned events
Advanced automation maturity prerequisites
Organizations must have achieved enterprise-grade DevOps maturity to handle the operational complexity:
- Event-driven architectures: Message queues, event processing, and automated workflows for real-time response
- High-availability orchestration: Redundant systems and automatic failover to prevent delivery failures
- Comprehensive monitoring stacks: Real-time alerting and correlation engines across multiple data streams

The validated persistent mode solution
This pattern implements a persistent mode with an automated unwrapping approach that maintains the security benefits of response wrapping while providing operational simplicity:
Security Benefits Preserved:
- Response wrapping is still used for initial Secret Zero delivery with tamper evidence and time limits
- Regular automated rotation prevents long-term credential exposure before Secret ID Max TTL or Max Uses
- Proper file permissions and monitoring provide defense-in-depth security controls
- Comprehensive audit trails maintain compliance visibility and change tracking
Operational Complexity Eliminated:
- Standard restart procedures work without requiring orchestrator intervention
- Token re-authentication at Max TTL works automatically using persistent Secret ID (when still valid)
- Simplified monitoring requirements with proactive rotation instead of real-time event processing
- Existing operational skills can handle troubleshooting and maintenance procedures
- Reduced infrastructure dependencies with fewer critical systems and failure points

Decision matrix: Capability assessment
| Requirement | Ephemeral Mode (Wrapped Tokens Only) | Persistent Mode (Automated Unwrapping) |
|---|---|---|
| Initial security | Response wrapping with tamper evidence | Response wrapping with tamper evidence |
| Agent/VM restart handling | Requires a new wrapped Secret ID delivery | Uses a persistent unwrapped Secret ID |
| Secret ID Max TTL expiry | New Secret ID generation + wrapped delivery/Agent restart required | New Secret ID generation + wrapped delivery required |
| Secret ID Max Uses exhausted | New Secret ID generation + wrapped delivery/Agent restart required | New Secret ID generation + wrapped delivery required |
| Token Max TTL re-auth | Requires orchestrator intervention for a new wrapped Secret ID | Automatic re-authentication using persistent Secret ID (if valid) |
| Infrastructure | Event-driven monitoring + CI/CD pipelines | Standard automation + periodic monitoring |
| Team skills | Advanced DevOps + event processing expertise | Standard Linux administration + basic automation |
| Monitoring complexity | Real-time event correlation, sub-minute response | Periodic health checks, 5-30 minute intervals |
| Resource overhead | High: Continuous monitoring, frequent deliveries | Low: Periodic rotation, standard operations |
| Failure scenarios | Complex: Event detection + pipeline + delivery | Simple: File system + network + process |
| TTL/Uses exhaustion handling | Requires immediate orchestrator response | Handled by proactive rotation scheduling |
| Anti-pattern prevention | Prevents all three anti-patterns | Prevents all three anti-patterns |
When to choose each approach
Choose Persistent Mode When:
- You prefer standard operational maturity over complex event-driven automation
- Resource constraints limit investment in continuous monitoring infrastructure
- Frequent restart scenarios occur due to maintenance, updates, or scaling operations
- You prioritize operational simplicity to ensure consistent implementation and maintenance
Consider Ephemeral Mode Only When:
- Enterprise-grade event-driven infrastructure is already implemented and maintained
- Advanced automation teams with 24/7 operations capability are available
- Comprehensive monitoring systems already exist with real-time event processing capability
- Security requirements justify resource investment for complex automation
Key Insight: Both approaches prevent dangerous anti-patterns while using response wrapping for secure Secret Zero delivery. The difference lies in operational complexity: persistent mode provides practical security that is consistently implemented and maintained, while ephemeral mode requires enterprise-grade automation infrastructure that many organizations struggle to implement correctly.
Complete workflow: Day 0 to day N operations
Recommended Vault TTL configuration matrix
| Configuration | High-Security Production | Standard Production | Staging | Development |
|---|---|---|---|---|
| Token TTL | 1h | 2h | 4h | 8h |
| Token max TTL | 4h | 8h | 12h | 24h |
| Secret ID TTL | 24h | 24h | 48h | 7d |
| Secret ID max uses | 3 | 10 | 15 | 20 |
| Rotation frequency | 6h | 12h | 24h | 5d |
| Emergency SLA | 15m | 30m | 1h | Best effort |
| Monitoring interval | 1m | 5m | 15m | 30m |
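The relationship between Secret ID TTL and rotation frequency in the matrix can be checked with simple arithmetic: rotation must happen before the Secret ID expires, and the difference is the safety margin. The following shell sketch works under that assumption; the helper names to_seconds and rotation_margin are illustrative, and only the s, m, h, and d suffixes that appear in the matrix are handled.

```shell
#!/bin/sh
# Sketch: compute how much lifetime an old Secret ID has left when its
# replacement is due. Helper names are illustrative assumptions.

# to_seconds DURATION -> prints seconds for values such as 15m, 6h, 7d
to_seconds() {
    value=${1%[smhd]}      # strip one trailing unit character
    unit=${1#"$value"}     # whatever was stripped is the unit
    case "$unit" in
        s) echo "$value" ;;
        m) echo $((value * 60)) ;;
        h) echo $((value * 3600)) ;;
        d) echo $((value * 86400)) ;;
        *) echo "unsupported duration: $1" >&2; return 1 ;;
    esac
}

# rotation_margin SECRET_ID_TTL ROTATION_FREQUENCY
# Prints the seconds remaining on the old Secret ID at rotation time.
rotation_margin() {
    ttl=$(to_seconds "$1") || return 1
    freq=$(to_seconds "$2") || return 1
    echo $((ttl - freq))
}
```

For the High-Security Production column, rotation_margin 24h 6h prints 64800 (18 hours of remaining validity); for Development, rotation_margin 7d 5d prints 172800 (2 days). A margin of zero or less would mean the Secret ID can expire before its replacement arrives.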
Day -1: Prerequisites
Infrastructure Requirements:
- Ansible orchestrator configured with a highly privileged Vault token (able to create and regenerate Secret IDs, create policies, and so on).
- The HashiCorp Vault cluster is running and accessible
- Target systems with root/sudo access.
- Network connectivity between all components
- Python hvac library installed on Ansible control node: pip install hvac
- community.hashi_vault collection installed: ansible-galaxy collection install community.hashi_vault
Day 0: Initial setup
Phase 0: Vault Agent setup
Please refer to the official Vault Agent documentation for more details.
- Initial Vault Agent setup and directory structure creation:
#!/bin/bash
# Create vault-agent user if it does not exist
id vault-agent &>/dev/null || useradd --system --home /opt/vault-agent --shell /bin/false vault-agent
# Create required directories with proper permissions
for DIR in /opt/vault-agent /opt/vault-agent/auth /opt/vault-agent/config /opt/vault-agent/secrets /opt/vault-agent/templates /var/log/vault-agent; do
mkdir -p "$DIR"
chown vault-agent:vault-agent "$DIR"
chmod 0755 "$DIR"
done
- Ensure the Vault server CA certificate exists in /etc/ssl/certs or you import it into the system trust store.
- Run an unauthenticated GET request to verify Vault is reachable and you trust TLS (replace <vault-server> as appropriate):
curl --cacert /etc/ssl/certs/vault-ca.crt https://<vault-server>:8200/v1/sys/health
Phase 1: Vault Agent configuration
Configure Vault Agent with AppRole authentication and template-based secret rendering.
The Vault Agent configuration controls authentication behavior, secret rendering, and operational resilience. Pay special attention to these critical parameters that define your operational mode:
Authentication Mode Control:
- remove_secret_id_file_after_reading = false: Persistent mode - Secret ID remains on disk for autonomous re-authentication
- remove_secret_id_file_after_reading = true: Ephemeral mode - Secret ID deleted after reading, requires orchestrator for every restart
Response Wrapping:
- secret_id_response_wrapping_path = "sys/wrapping/unwrap": Only needed in EPHEMERAL mode for automatic unwrapping
- Comment this out for PERSISTENT mode (unwrapping handled by ExecStartPre script)
Sink Configuration:
- Required if your application needs direct Vault API access instead of file-based templates
- Token written to /opt/vault-agent/token with restrictive 0600 permissions
# Create the Vault Agent configuration file
cat > /opt/vault-agent/config/vault-agent.hcl << 'EOF'
# Vault Agent Configuration - Persistent Mode
pid_file = "/opt/vault-agent/vault-agent.pid"
vault {
address = "https://vault.company.com:8200"
retry {
num_retries = 3
}
}
auto_auth {
method "approle" {
mount_path = "auth/approle"
config = {
role_id_file_path = "/opt/vault-agent/auth/role_id"
secret_id_file_path = "/opt/vault-agent/auth/secret_id"
remove_secret_id_file_after_reading = false # PERSISTENT MODE
# remove_secret_id_file_after_reading = true # EPHEMERAL MODE
# Enable response wrapping unwrap only for EPHEMERAL MODE.
# Keep it commented out in PERSISTENT MODE (the ExecStartPre script unwraps instead).
# secret_id_response_wrapping_path = "sys/wrapping/unwrap"
}
retry {
num_retries = 10
retry_backoff = "30s"
retry_max_backoff = "5m"
}
}
# Required if the application does not use templates and needs to connect
# directly to the Vault server to fetch data.
sink "file" {
config = {
path = "/opt/vault-agent/token"
mode = 0600
}
}
}
template {
source = "/opt/vault-agent/templates/api-keys.tpl"
destination = "/opt/vault-agent/secrets/api-keys.conf"
perms = 0640
}
EOF
# Set proper ownership and permissions
chown vault-agent:vault-agent /opt/vault-agent/config/vault-agent.hcl
chmod 640 /opt/vault-agent/config/vault-agent.hcl
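Because two settings jointly define the operational mode, a quick consistency check on the rendered configuration file can catch a mismatch. The following shell sketch is an illustration only: the function names agent_mode, uses_agent_unwrap, and check_mode_consistency are hypothetical, the parsing is a simple line match rather than a real HCL parser, and it assumes that an absent remove_secret_id_file_after_reading falls back to Vault Agent's default of true.

```shell
#!/bin/sh
# Sketch: report the operational mode implied by a Vault Agent config file.
# Function names are illustrative; parsing is line-based, not full HCL.

# agent_mode CONFIG_FILE -> prints persistent, ephemeral, or unknown
agent_mode() {
    # Use the last uncommented assignment; lines starting with # never match.
    setting=$(grep -E '^[[:space:]]*remove_secret_id_file_after_reading[[:space:]]*=' "$1" \
        | tail -n 1 \
        | sed -E 's/#.*$//; s/.*=[[:space:]]*//; s/[[:space:]]*$//')
    case "$setting" in
        false)   echo "persistent" ;;
        true|"") echo "ephemeral" ;;   # assumed default when the key is absent
        *)       echo "unknown" ;;
    esac
}

# uses_agent_unwrap CONFIG_FILE -> exit 0 if the agent itself is set to unwrap
uses_agent_unwrap() {
    grep -Eq '^[[:space:]]*secret_id_response_wrapping_path[[:space:]]*=' "$1"
}

# check_mode_consistency CONFIG_FILE
# Fails when persistent mode is combined with agent-side unwrapping, since
# this guide delegates unwrapping to the ExecStartPre script in that mode.
check_mode_consistency() {
    if [ "$(agent_mode "$1")" = "persistent" ] && uses_agent_unwrap "$1"; then
        echo "persistent mode with secret_id_response_wrapping_path set in $1" >&2
        return 1
    fi
    return 0
}
```

Running check_mode_consistency /opt/vault-agent/config/vault-agent.hcl after each configuration change is one way to confirm that the file matches the mode you intended.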
The Vault Agent systemd service follows HashiCorp's production deployment recommendations for security hardening and reliability. This configuration implements standard best practices documented in Run Vault as a service.
Note: This is the base configuration for ephemeral mode. The persistent mode configuration (shown later) adds an ExecStartPre script for automatic Secret ID unwrapping, which is the unique element specific to this pattern.
# Create systemd service file (without enabling)
cat > /etc/systemd/system/vault-agent.service << 'EOF'
[Unit]
Description=HashiCorp Vault Agent
Documentation=https://www.vaultproject.io/docs/agent
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/opt/vault-agent/config/vault-agent.hcl
StartLimitBurst=3
StartLimitIntervalSec=60
[Service]
Type=notify
User=vault-agent
Group=vault-agent
ProtectSystem=full
ProtectHome=read-only
PrivateTmp=yes
PrivateDevices=yes
SecureBits=keep-caps
AmbientCapabilities=CAP_IPC_LOCK
CapabilityBoundingSet=CAP_SYSLOG CAP_IPC_LOCK
NoNewPrivileges=yes
ExecStart=/usr/local/bin/vault agent -config=/opt/vault-agent/config/vault-agent.hcl
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
RestartSec=5
TimeoutStopSec=30
LimitNOFILE=65536
LimitMEMLOCK=infinity
[Install]
WantedBy=multi-user.target
EOF
# Reload systemd to recognize the new service (but don't enable yet)
systemctl daemon-reload
Vault Agent templates use Go template syntax with Consul Template functions to dynamically render secrets. Key concepts:
- {{- with secret "path" -}}: Fetches secret from Vault (hyphens trim whitespace)
- Template syntax for KV v2: nested .Data.data structure, where .Data is the API response field and .data holds the stored key-value pairs
- Templates re-render automatically when secrets change or tokens renew
- The agent writes output files atomically to prevent application from reading partial content
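# Create the secret template file
# Replace ${APP_NAME} in the secret path with your actual application name (e.g., myapp).
# This note stays outside the template: the leading {{- trims the preceding newline,
# so a comment line inside the template would be joined with the API_KEY line.
cat > /opt/vault-agent/templates/api-keys.tpl << 'EOF'
{{- with secret "kv/data/${APP_NAME}/config" -}}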
API_KEY={{ .Data.data.api_key }}
API_SECRET={{ .Data.data.api_secret }}
{{- end -}}
EOF
# Set proper ownership and permissions
chown vault-agent:vault-agent /opt/vault-agent/templates/api-keys.tpl
chmod 644 /opt/vault-agent/templates/api-keys.tpl
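Application owners can validate the rendered file before pointing the application at it. The following shell sketch is one possible check, not part of Vault Agent: the function name validate_rendered is hypothetical, and the check for the literal text <no value> assumes the Go template default for a missing key when the template does not treat missing keys as errors.

```shell
#!/bin/sh
# Sketch: verify that a rendered secrets file contains every expected key.
# validate_rendered is an illustrative name.

# validate_rendered FILE KEY...
# Returns 0 if FILE is non-empty and has a KEY=<value> line for each KEY.
validate_rendered() {
    file="$1"
    shift

    if [ ! -s "$file" ]; then
        echo "missing or empty: $file" >&2
        return 1
    fi

    # Assumed Go template placeholder for a field that did not resolve
    if grep -Fq '<no value>' "$file"; then
        echo "unresolved field in: $file" >&2
        return 1
    fi

    for key in "$@"; do
        # The key must start a line and be followed by a non-empty value
        if ! grep -Eq "^${key}=.+" "$file"; then
            echo "key not rendered: $key" >&2
            return 1
        fi
    done
    return 0
}
```

For the template in this guide, the call would be validate_rendered /opt/vault-agent/secrets/api-keys.conf API_KEY API_SECRET, run after the agent has started and rendered the file.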
Phase 2: Vault Infrastructure
Note: The code snippets are Ansible tasks. Place them on your Ansible control server so that you can execute them as part of a playbook. For example, you can create a file like AppRoleConfig.yaml in your Ansible directory and run it to configure Vault.
Enable AppRole Authentication Method:
- name: "Enable AppRole authentication method"
  community.hashi_vault.vault_write:
    url: "{{ vault_url }}"
    auth_method: token
    token: "{{ vault_admin_token }}"
    path: "sys/auth/approle"
    data:
      type: approle
      description: "AppRole authentication for applications"
  register: approle_enable_result
  failed_when:
    - approle_enable_result.failed
    - "'already in use' not in approle_enable_result.msg | default('')"
Create Application Policies:
- name: "Create application policy"
  community.hashi_vault.vault_write:
    url: "{{ vault_url }}"
    auth_method: token
    token: "{{ vault_admin_token }}"
    path: "sys/policies/acl/{{ app_name }}-policy"
    data:
      policy: |
        # KV v2 secrets
        path "kv/data/{{ app_name }}/*" {
          capabilities = ["read"]
        }
        # Allow token renewal
        path "auth/token/renew-self" {
          capabilities = ["update"]
        }
        # Allow token lookup for validation
        path "auth/token/lookup-self" {
          capabilities = ["read"]
        }
Phase 2: AppRole configuration
The following parameters define both security boundaries and operational behavior:
- token_ttl: Base token lifetime before requiring renewal (recommend: 1 h-24 h)
- token_max_ttl: Maximum token lifetime before forced re-authentication (recommend: 7-30 days)
- secret_id_ttl: Secret ID expiration (recommend: 30-90 days)
- secret_id_num_uses: How many times you can use Secret ID (recommend: 0 for unlimited)
- bind_secret_id: true: Critical: Requires Secret ID for authentication (never turn off)
- token_bound_cidrs: Network restrictions for additional security layer
- name: "Configure AppRole"
  community.hashi_vault.vault_write:
    url: "{{ vault_url }}"
    auth_method: token
    token: "{{ vault_admin_token }}"
    path: "auth/approle/role/{{ app_name }}"
    data:
      policies: ["{{ app_name }}-policy"]
      token_ttl: "{{ env_config.token_ttl }}"
      token_max_ttl: "{{ env_config.token_max_ttl }}"
      secret_id_ttl: "{{ env_config.secret_id_ttl }}"
      secret_id_num_uses: "{{ env_config.secret_id_num_uses }}"
      bind_secret_id: true
      token_bound_cidrs: "{{ app_cidrs | default([]) }}"
Phase 3: Application secrets preparation
After configuring the AppRole, populate the KV v2 secrets engine with the application's required configuration secrets before distributing credentials.
# Create application configuration secrets in KV v2
vault kv put kv/${APP_NAME}/config \
api_key="your-secure-api-key-here" \
api_secret="your-secure-api-secret-here"
Phase 4: Role ID Distribution
- name: "Day N - Vault Agent Operations"
  hosts: vault_agents
  become: yes
  tasks:
    - name: "Get Role ID"
      community.hashi_vault.vault_read:
        url: "{{ vault_url }}"
        auth_method: token
        token: "{{ vault_admin_token }}"
        path: "auth/approle/role/{{ app_name }}/role-id"
      register: role_id_result
      delegate_to: localhost
    - name: "Deploy Role ID"
      copy:
        content: "{{ role_id_result.data.data.role_id }}"
        dest: /opt/vault-agent/auth/role_id
        mode: '0644' # is not sensitive data
        owner: vault-agent
        group: vault-agent
Phase 5: Secret ID Generation
- name: "Day N - Vault Agent Operations"
  hosts: vault_agents
  become: yes
  tasks:
    - name: "Generate wrapped Secret ID"
      community.hashi_vault.vault_write:
        url: "{{ vault_url }}"
        auth_method: token
        token: "{{ vault_admin_token }}"
        path: "auth/approle/role/{{ app_name }}/secret-id"
        wrap_ttl: "5m"
        data:
          ttl: "{{ env_config.secret_id_ttl }}"
          num_uses: "{{ env_config.secret_id_num_uses }}"
          metadata:
            created_by: "ansible"
            environment: "{{ environment }}"
            timestamp: "{{ ansible_date_time.iso8601 }}"
      register: wrapped_secret
      delegate_to: localhost
    - name: "Deploy wrapped Secret ID"
      copy:
        content: "{{ wrapped_secret.data.wrap_info.token }}"
        dest: /opt/vault-agent/auth/secret_id
        mode: '0600'
        owner: vault-agent
        group: vault-agent
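For ad hoc testing outside Ansible, the Role ID lookup and wrapped Secret ID generation from Phases 4 and 5 can be approximated with the vault CLI. This is a hedged sketch, not a replacement for the playbooks: it assumes VAULT_ADDR and VAULT_TOKEN are exported with suitable privileges, the function names are illustrative, and it omits the ttl, num_uses, and metadata fields that the playbook sets.

```shell
#!/bin/sh
# Sketch: CLI counterparts of the Role ID read and wrapped Secret ID write.
# Function names are illustrative; VAULT_ADDR and VAULT_TOKEN must be set.

# fetch_role_id APP_NAME -> prints the Role ID of the named AppRole
fetch_role_id() {
    vault read -field=role_id "auth/approle/role/$1/role-id"
}

# generate_wrapped_secret_id APP_NAME [WRAP_TTL]
# Prints only the wrapping token; the Secret ID itself never reaches stdout.
generate_wrapped_secret_id() {
    vault write -wrap-ttl="${2:-5m}" -field=wrapping_token -force \
        "auth/approle/role/$1/secret-id"
}
```

A manual test might run generate_wrapped_secret_id myapp (myapp is a placeholder role name) and copy the output to /opt/vault-agent/auth/secret_id on a test host within the 5-minute wrap TTL.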
Day 1+: Ongoing operations and lifecycle management
Persistent mode
When Vault Agent starts in persistent mode, it requires two essential credential files specified in its configuration file:
role_id_file_path = "/opt/vault-agent/auth/role_id"(non-sensitive, persistent)secret_id_file_path = "/opt/vault-agent/auth/secret_id"(sensitive, rotated by the orchestrator)
The Secret ID is a short-lived credential whose rotation the orchestrator manages according to the TTL policy defined in the Vault TTL Configuration Matrix.
The persistent mode workflow creates a robust, self-healing system that balances security with operational simplicity through a three-step process:
- Wrapped Secret ID Delivery: The orchestrator generates a tamper-evident wrapped Secret ID with a 5-minute TTL and delivers it to /opt/vault-agent/auth/secret_id, replacing any existing content. This wrapped token provides security guarantees during transit and storage.
- Automatic Unwrapping and Persistence: When the Vault Agent service starts or restarts, the ExecStartPre script automatically detects the wrapped token format, unwraps it using the Vault API to obtain the actual Secret ID, and replaces the file content with the unwrapped credential. This persistent Secret ID enables autonomous operation.
- Continuous Authentication Lifecycle: The agent leverages the persistent Secret ID to handle all authentication scenarios autonomously - including token renewals, re-authentication after token expiry, and recovery from service or system restarts - without requiring orchestrator intervention. The orchestrator manages security by delivering fresh wrapped Secret IDs according to defined TTL policies.
- Pre-Start Unwrapping Script Configuration:
The systemd service must include a pre-start script to handle wrapped Secret ID tokens:
# Create the script directory (not part of the Phase 0 layout) and the unwrapping script
mkdir -p /opt/vault-agent/bin
cat > /opt/vault-agent/bin/unwrap-secret-id.sh << 'EOF'
#!/bin/bash
SECRET_ID_FILE="/opt/vault-agent/auth/secret_id"
# Export VAULT_ADDR so that the vault CLI invoked below can see it
export VAULT_ADDR="${VAULT_ADDR:-https://vault.company.com:8200}"
# Check if the secret_id file contains a wrapped token (starts with hvs.CAESIJ)
# The pattern '^hvs\.CAESIJ' matches the service token format that Vault uses for
# response-wrapping tokens; the hvs. prefix was introduced in Vault 1.10.
# An already unwrapped Secret ID does not match this pattern.
# If Vault changes its token format in future versions, this pattern may need to be updated.
if [[ -f "$SECRET_ID_FILE" ]] && grep -q "^hvs\.CAESIJ" "$SECRET_ID_FILE"; then
echo "Wrapped Secret ID detected, unwrapping..."
# Read the wrapped token
WRAPPED_TOKEN=$(cat "$SECRET_ID_FILE")
# Unwrap the token to get the actual Secret ID
mkdir -p /opt/vault-agent/logs
UNWRAPPED_SECRET_ID=$(vault unwrap -field=secret_id "$WRAPPED_TOKEN" 2>>/opt/vault-agent/logs/unwrap-secret-id.err)
if [[ $? -eq 0 ]] && [[ -n "$UNWRAPPED_SECRET_ID" ]]; then
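# Replace the wrapped token with the unwrapped Secret ID via a temp file and rename.
# umask 077 keeps the temp file, and therefore the final file, private to vault-agent.
(umask 077; printf '%s\n' "$UNWRAPPED_SECRET_ID" > "$SECRET_ID_FILE.tmp")
mv "$SECRET_ID_FILE.tmp" "$SECRET_ID_FILE"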
echo "Secret ID successfully unwrapped and stored"
else
echo "Failed to unwrap Secret ID - using existing content"
fi
else
echo "No wrapped token found - using existing Secret ID"
fi
EOF
# Make script executable
chmod 755 /opt/vault-agent/bin/unwrap-secret-id.sh
chown vault-agent:vault-agent /opt/vault-agent/bin/unwrap-secret-id.sh
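The detection step in the script distinguishes a wrapped token from a plain Secret ID by its prefix. The following shell sketch isolates that decision so it can be tested on its own. It rests on two assumptions that are not stated in this guide: that wrapping tokens carry the hvs. prefix, and that Vault-generated Secret IDs are UUID-formatted (custom Secret IDs may not be). The function name classify_secret_id_content is hypothetical.

```shell
#!/bin/sh
# Sketch: classify the content of the secret_id file.
# Assumes hvs.-prefixed tokens and UUID-formatted Secret IDs.

# classify_secret_id_content STRING -> prints wrapped, secret_id, or unknown
classify_secret_id_content() {
    case "$1" in
        hvs.*)
            echo "wrapped"
            ;;
        *)
            # 8-4-4-4-12 hexadecimal groups, the usual UUID layout
            if printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'; then
                echo "secret_id"
            else
                echo "unknown"
            fi
            ;;
    esac
}
```

A result of unknown (for example, an empty or truncated file) is a useful trigger for alerting, because the agent cannot authenticate with such content.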
- Updated systemd service configuration with automatic unwrapping:
This configuration extends the base systemd service with pattern-specific modifications for persistent mode operation:
What is different from base configuration:
- ExecStartPre=/opt/vault-agent/bin/unwrap-secret-id.sh: Automatically unwraps response-wrapped Secret IDs before agent startup
- Restart=always: CHANGED from on-failure to ensure recovery from any termination, not just failures
Why These Changes Matter:
The ExecStartPre script is the key enabler for combining response wrapping security (tamper-evidence, time-bounds) with operational autonomy (no orchestrator needed for restarts).
The base systemd hardening features remain unchanged from HashiCorp's standard recommendations.
# Update the systemd service to include the unwrapping script
cat > /etc/systemd/system/vault-agent.service << 'EOF'
[Unit]
Description=HashiCorp Vault Agent
Documentation=https://www.vaultproject.io/docs/agent
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/opt/vault-agent/config/vault-agent.hcl
StartLimitBurst=3
StartLimitIntervalSec=60
[Service]
Type=notify
User=vault-agent
Group=vault-agent
ProtectSystem=full
ProtectHome=read-only
PrivateTmp=yes
PrivateDevices=yes
SecureBits=keep-caps
AmbientCapabilities=CAP_IPC_LOCK
CapabilityBoundingSet=CAP_SYSLOG CAP_IPC_LOCK
NoNewPrivileges=yes
# ExecStartPre script
ExecStartPre=/opt/vault-agent/bin/unwrap-secret-id.sh
ExecStart=/usr/local/bin/vault agent -config=/opt/vault-agent/config/vault-agent.hcl
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=always
RestartSec=5
TimeoutStopSec=30
LimitNOFILE=65536
LimitMEMLOCK=infinity
[Install]
WantedBy=multi-user.target
EOF
# Reload systemd configuration
systemctl daemon-reload
systemctl enable vault-agent.service
- Vault Agent Operational Behavior:
- Token Renewal: Automatically renews tokens before token_ttl expiration without re-authentication
- Automatic Re-authentication: Uses stored Role ID + Secret ID when token renewal fails or reaches token_max_ttl
- Restart Resilience: Agent and VM restarts use existing credential files for seamless re-authentication
- Automatic Recovery: Restart=always ensures service recovery from failures
- Operational Scenarios and Handling:
| Scenario | Vault Agent Behavior | Recovery Method |
|---|---|---|
| Token Renewal | Automatic renewal until token reaches token_max_ttl | Native capability - no intervention |
| Re-authentication | Uses stored Role ID + Secret ID when renewal fails | Automatically using persistent files |
| Agent Restart | Reads credential files and re-authenticates | Automatically using existing files |
| VM Restart | Service starts and authenticates using stored files | Automatic recovery |
| Secret ID Expiry | Authentication fails, waits for a new Secret ID | Orchestrator delivers a new wrapped Secret ID |
| Maintenance Mode | Orchestrator pauses rotation during maintenance windows | Manual orchestrator control |
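The Secret ID expiry row depends on the orchestrator placing a new wrapped Secret ID where the pre-start script expects it. The sketch below illustrates one way that delivery step could look if run on the target host. It makes several assumptions: the role name `brownfield-app` is hypothetical, the caller already holds a Vault token permitted to generate Secret IDs for that role, and the 5-minute wrap TTL is borrowed from the ephemeral workflow description.

```shell
#!/bin/bash
# Sketch: deliver a response-wrapped Secret ID for the pre-start script to unwrap.
ROLE_NAME="${ROLE_NAME:-brownfield-app}"   # hypothetical role name
WRAP_TTL="${WRAP_TTL:-5m}"                 # assumed wrap TTL
SECRET_ID_FILE="${SECRET_ID_FILE:-/opt/vault-agent/auth/secret_id}"

generate_wrapped_secret_id() {
  # -f sends an empty payload; -wrap-ttl makes Vault return a wrapping token
  # instead of the Secret ID itself; -field prints only that token.
  vault write -wrap-ttl="$WRAP_TTL" -field=wrapping_token -f \
    "auth/approle/role/${ROLE_NAME}/secret-id"
}

deliver_wrapped_secret_id() {
  local token tmp
  token=$(generate_wrapped_secret_id) || return 1
  [[ -n "$token" ]] || return 1
  tmp="${SECRET_ID_FILE}.tmp"
  # Write with owner-only permissions, then swap into place atomically.
  ( umask 077; printf '%s\n' "$token" > "$tmp" ) || return 1
  mv "$tmp" "$SECRET_ID_FILE"
}
```

Calling `deliver_wrapped_secret_id` followed by `systemctl restart vault-agent.service` would trigger the ExecStartPre unwrapping on the next start. In practice the orchestrator would more likely perform the same steps remotely through an Ansible task than through a local script.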
Ephemeral mode
- Automatic Unwrapping and In-Memory Storage:
- Wrapped Token Detection: On startup, Vault Agent automatically detects wrapped Secret ID content in the configured path.
- Native Unwrapping: Agent uses the built-in sys/wrapping/unwrap endpoint to unwrap the Secret ID.
- Memory-Only Storage: The agent keeps the unwrapped Secret ID in memory and never writes it to disk.
- File Cleanup: Even with remove_secret_id_file_after_reading = true, the wrapped file may remain; however, the wrapping token it contains can be unwrapped only once, so the file is unusable afterward.
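The behavior in the list above is driven by the AppRole settings in the agent's auto_auth stanza. As a rough illustration, the sketch below prints such a stanza from a shell function. The function name, the role name `brownfield-app`, and the role_id file location are assumptions made for this example; `secret_id_response_wrapping_path` is the AppRole auto-auth option that tells the agent the file holds a wrapping token rather than a plain Secret ID.

```shell
#!/bin/bash
# Sketch: print an auto_auth stanza for ephemeral mode.
# ROLE_NAME and AUTH_DIR are illustrative assumptions.
ROLE_NAME="${ROLE_NAME:-brownfield-app}"
AUTH_DIR="${AUTH_DIR:-/opt/vault-agent/auth}"

render_ephemeral_auto_auth() {
  cat << EOF
auto_auth {
  method "approle" {
    config = {
      role_id_file_path                   = "${AUTH_DIR}/role_id"
      secret_id_file_path                 = "${AUTH_DIR}/secret_id"
      # Ask the agent to remove the delivered file after reading it.
      remove_secret_id_file_after_reading = true
      # The file holds a wrapping token created at this path; the agent unwraps it.
      secret_id_response_wrapping_path    = "auth/approle/role/${ROLE_NAME}/secret-id"
    }
  }
}
EOF
}
```

The printed stanza would need to be merged into the existing agent configuration file alongside the vault and template blocks; it is not a complete configuration on its own.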
- Ephemeral Mode Workflow Overview:
The ephemeral mode workflow prioritizes maximum security through strict credential lifecycle management:
- Orchestrated Secret ID Delivery: The orchestrator generates a fresh wrapped Secret ID and deploys it to the target system, ensuring tamper-evident delivery with minimal exposure time through the 5-minute wrap TTL.
- Automatic Unwrapping and Memory-Only Storage: Vault Agent automatically detects and unwraps the Secret ID using its native unwrapping capability, storing the credential in memory while the wrapped token becomes permanently unusable after a single use.
- Event-Driven Recovery Management: All restart scenarios (VM reboot, service failure, token re-authentication) require orchestrator intervention to generate and deliver fresh wrapped Secret IDs, enabling zero-persistence security at the cost of operational complexity and real-time monitoring requirements.
- Authentication and Secret Fetching:
Standard Operation After Successful Startup:
Once authenticated with the in-memory Secret ID, Vault Agent operates as expected:
- Authenticates to Vault using Role ID + unwrapped Secret ID
- Fetches application secrets from configured KV paths
- Renders secrets through the template engine to the application configuration files
- Handles automatic token renewal until token reaches token_max_ttl
- Vault Agent Operational Behavior:
With ephemeral mode configuration, Vault Agent provides:
- Token Renewal: Automatically renews tokens before token_ttl expiration without re-authentication
- Limited Re-authentication: Cannot re-authenticate after token expiry due to in-memory-only Secret ID storage
- No Restart Resilience: Agent and VM restarts require orchestrator intervention for new Secret ID delivery
- No Automatic Recovery: Restart=no prevents uncontrolled restarts; all recovery is orchestrator-managed
- Operational Scenarios and Handling:
| Scenario | Vault Agent Behavior | Recovery Method |
|---|---|---|
| Token Renewal | Automatic renewal until token reaches token_max_ttl | Native capability - no intervention |
| Re-authentication | Cannot re-authenticate - Secret ID consumed and not persisted | Orchestrator delivers new wrapped Secret ID + restart |
| Agent Restart | Service fails to start - no valid Secret ID available | Orchestrator delivers new wrapped Secret ID + restart |
| VM Restart | Service fails to start - no credential persistence | Orchestrator delivers new wrapped Secret ID + restart |
| Secret ID Expiry | Authentication fails, service becomes non-functional | Orchestrator delivers new wrapped Secret ID + restart |
| Token Max TTL | Re-authentication fails, service becomes non-functional | Orchestrator delivers new wrapped Secret ID + restart |
| Maintenance Mode | Orchestrator coordinates all operations during maintenance | Manual orchestrator control with event queuing |
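Most rows in the table above share the same recovery method: deliver a new wrapped Secret ID, then start the service. The sketch below shows what a single recovery check might look like when run on the target host. It is illustrative only: the function name and the role name `brownfield-app` are hypothetical, the caller is assumed to hold a Vault token allowed to generate Secret IDs, and a real deployment would trigger this from monitoring events rather than by hand.

```shell
#!/bin/bash
# Sketch: recovery check for ephemeral mode. Names and paths are illustrative.
SERVICE="${SERVICE:-vault-agent.service}"
ROLE_NAME="${ROLE_NAME:-brownfield-app}"   # hypothetical role name
SECRET_ID_FILE="${SECRET_ID_FILE:-/opt/vault-agent/auth/secret_id}"

recover_if_down() {
  # Nothing to do while the agent is running.
  if systemctl is-active --quiet "$SERVICE"; then
    echo "agent active"
    return 0
  fi
  local token
  # Request a fresh Secret ID wrapped with the 5-minute TTL used in this pattern.
  token=$(vault write -wrap-ttl=5m -field=wrapping_token -f \
    "auth/approle/role/${ROLE_NAME}/secret-id") || return 1
  [[ -n "$token" ]] || return 1
  # Write the wrapping token with owner-only permissions; the agent unwraps it on start.
  ( umask 077; printf '%s\n' "$token" > "$SECRET_ID_FILE" ) || return 1
  systemctl start "$SERVICE" && echo "agent started with fresh wrapped Secret ID"
}
```

Because Restart=no leaves a stopped agent stopped, the time between failure and a call such as `recover_if_down` is the window during which the application receives no refreshed secrets, which is why the pattern calls for real-time monitoring in this mode.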
Conclusion
Ephemeral versus persistent mode:
- Restart Behavior: Ephemeral requires orchestrator intervention for all restarts
- Monitoring Complexity: Real-time event processing and sub-minute response times required
- Infrastructure Dependencies: High-availability orchestration and monitoring systems are essential
- Security Posture: Maximum security through zero credential persistence
- Operational Overhead: Significant automation infrastructure and 24/7 operations capability required
When to Choose Ephemeral Mode:
- Enterprise-grade event-driven infrastructure already implemented
- Advanced DevOps teams with comprehensive monitoring capabilities
- Security requirements justify complex automation investment
- Real-time event processing systems are available and maintained
This ephemeral mode implementation provides maximum security through zero credential persistence while requiring sophisticated operational automation to handle all restart and recovery scenarios.