Mustafa ERBAY

Originally published at mustafaerbay.com.tr

Modern Approaches to Secret Rotation: Securing Your Systems

This morning when I sat down at my computer, an alert from last night's journald logs caught my attention: a service was getting an invalid credentials error while trying to connect with what it assumed was the current database password. When I checked, I saw that a secret rotation had been triggered earlier than expected. This got me thinking about modern secret rotation practices.

In the past, I would have tried to resolve this manually, stopping and restarting services, perhaps temporarily reverting to the old key. But now, we need to make our systems more automated and secure. In this post, drawing from my own experiences, I will talk about how I adopt smarter and safer approaches to secret rotation.

Why Should We Rotate Secrets?

Secret rotation is a fundamental step for the security of our systems. Two things matter most: how long a secret (API key, database password, certificate, etc.) remains valid, and how it is replaced when that period ends. If a secret is leaked or compromised, a long validity period gives an attacker that much more time to use it.

For example, back when I was working on a production ERP system, we noticed that an API key we used for supply chain integration had been leaked. Fortunately, thanks to the rotation policy we used, we were able to change the key quickly and prevent a potential data breach. The incident demonstrated once again that regular secret rotation is not just a "good practice" but a "necessity."

ℹ️ The Importance of Secret Rotation

Secret rotation is a critical security measure to prevent unauthorized access, reduce the risk of data breaches, and strengthen your overall security posture. When not done regularly, the impact of a compromised secret can be devastating.

Today, many security vulnerabilities stem from old or leaked secrets continuing to be used for long periods. Therefore, we must choose our rotation strategies carefully and automate them.

Traditional Secret Rotation Approaches and Their Pain Points

For many years, secret rotation was generally done manually or through simple scripts. While these approaches might initially seem sufficient for small-scale systems, they can cause serious problems in growing and complex infrastructures.

To give an example, in an enterprise software project, we had to change database passwords manually every three months. This process was usually done during a maintenance window, but sometimes it was postponed or overlooked due to planning errors or technical hitches. Once, a fellow developer accidentally delayed the rotation by a week. During that one-week period, a potential security vulnerability existed. Fortunately, no other leaks occurred at that time, but this incident showed us how risky manual processes can be.

⚠️ Risks of Manual Secret Rotation

Manual processes carry serious security risks because they are prone to human error, time-consuming, and require regular follow-up. Missing or incorrectly implementing rotation times can leave systems vulnerable.

Furthermore, updating each service's configuration can turn into a nightmare in distributed systems. Forgetting to update a single service, or misconfiguring it, can bring the whole system down. Problems like these are why automation becomes unavoidable.

Modern Approach: Automation and Centralized Management

Modern approaches focus on automating secret rotation and managing it from a centralized location. This both reduces security risks and eases the operational burden. Solutions like HashiCorp Vault, AWS Secrets Manager, and Azure Key Vault stand out in this area. These tools, besides securely storing secrets, offer advanced features like automatic rotation, access control, and auditing.

I have started using these kinds of tools in my own projects as well. For example, I manage the database passwords of several web services hosted on my own VPS with AWS Secrets Manager. I set the rotation policy to 30 days, and using Secrets Manager's integration with AWS Lambda, I ensure that the relevant services automatically restart with the new key whenever a rotation occurs. This way, I no longer have to deal with manually changing passwords, and my system always stays up-to-date and secure.

💡 Benefits of Automation

Automatic secret rotation greatly reduces the room for human error, ensures regular and timely updates, increases operational efficiency, and raises the overall security level.

When setting up this integration, I performed extensive tests to make sure the Lambda function triggered the correct services and that the services started healthily post-rotation. Reviewing systemd unit files and service dependencies played a critical role in this process.
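As a rough sketch of the setup described above, a 30-day schedule can be attached to a secret with the AWS CLI. The secret name and Lambda ARN in the example call are placeholders I made up, and AWS_BIN is only a helper variable so the function can be previewed with echo. The Lambda function itself (the part that changes the database password and restarts the services) is assumed to exist already.

```shell
#!/bin/sh
set -eu

# AWS_BIN can be pointed at `echo` to preview the command without calling AWS.
AWS_BIN="${AWS_BIN:-aws}"

# enable_rotation SECRET_ID LAMBDA_ARN [DAYS]
# Attaches a rotation Lambda to a secret and schedules automatic rotation.
enable_rotation() {
    secret_id="$1"
    lambda_arn="$2"
    days="${3:-30}"
    "$AWS_BIN" secretsmanager rotate-secret \
        --secret-id "$secret_id" \
        --rotation-lambda-arn "$lambda_arn" \
        --rotation-rules "AutomaticallyAfterDays=$days"
}

# Example call (placeholder values, not real resources):
# enable_rotation "prod/web/db-password" \
#     "arn:aws:lambda:eu-central-1:123456789012:function:rotate-db-password"
```

One thing worth checking before running this against a live secret: as far as I know, rotate-secret also starts a rotation right away by default, which is exactly the kind of "earlier than expected" rotation that can surprise a running service.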

Technical Implementation: Vault and Service Integration

HashiCorp Vault is a highly powerful tool for secret rotation. It supports many different secret engine types and offers customizable rotation policies. Let's walk through a simple example of how we can rotate a database secret using Vault.

First, we need to enable the database secret engine in Vault. Let's say we are using PostgreSQL:

vault secrets enable database
# Optional mount-wide lease defaults; roles can set shorter TTLs
vault secrets tune -default-lease-ttl=72h -max-lease-ttl=144h database/
vault write database/config/my-postgres-db \
    plugin_name=postgresql-database-plugin \
    connection_url="postgresql://{{username}}:{{password}}@host.docker.internal:5432/mydatabase?sslmode=disable" \
    username="root" \
    password="<your_root_password>" \
    allowed_roles="my-app-role"

Here, connection_url specifies how Vault will communicate with the database. The username and password fields are the privileged credentials Vault will use to connect, and Vault substitutes them into the {{username}} and {{password}} placeholders of the URL. The lease TTLs are not part of the connection configuration: the mount-wide defaults are set with vault secrets tune, and each role can set its own shorter values. Note that sslmode=disable is only acceptable for a local test setup like this one.

Now, let's define a role that our application will use:

vault write database/roles/my-app-role \
    db_name=my-postgres-db \
    creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';" \
    default_ttl="30m" \
    max_ttl="60m"

This configuration ensures that Vault creates a new user and password on every request. default_ttl and max_ttl specify how long these generated temporary credentials will be valid.

As a rotation strategy, we can use Vault's own rotation mechanism or an external one. With Vault's own mechanism, Vault itself changes the password of a long-lived database account after a set period. However, it is generally preferable for applications to pull short-lived, dynamically generated credentials directly from Vault.

ℹ️ Dynamic Credential Generation

Tools like Vault generate credentials dynamically, ensuring that each application has its own unique and short-lived credentials. This minimizes risk in the event that a single credential is leaked.
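Short-lived credentials also mean the application has to come back for new ones before the lease runs out. Here is a minimal sketch of that timing, assuming the my-app-role role from above. The two-thirds ratio is my own assumption (a common rule of thumb), not something Vault prescribes, and both function names are hypothetical.

```shell
#!/bin/sh
set -eu

# refresh_after_seconds LEASE_DURATION
# Prints how many seconds to wait before fetching fresh credentials.
# The two-thirds ratio is an assumption, not a Vault setting.
refresh_after_seconds() {
    echo $(( $1 * 2 / 3 ))
}

# fetch_db_creds ROLE
# Prints the JSON response, which contains .data.username, .data.password,
# .lease_id and .lease_duration (in seconds).
fetch_db_creds() {
    vault read -format=json "database/creds/$1"
}

# Example loop (needs a reachable Vault and jq):
# while true; do
#     creds=$(fetch_db_creds my-app-role)
#     ttl=$(echo "$creds" | jq -r .lease_duration)
#     # ...hand .data.username and .data.password to the application here...
#     sleep "$(refresh_after_seconds "$ttl")"
# done
```

With the 30-minute default_ttl defined earlier, this would fetch a new username and password every 20 minutes, leaving a 10-minute margin before the old pair expires.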

If we want to use Vault's built-in rotation for a long-lived account instead (in the database secrets engine this is called a static role), we can adjust settings like rotation_period in the configuration of that role.
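For the database secrets engine, rotation_period is set on a static role, which maps to a database user that already exists. The sketch below assumes the my-postgres-db connection from earlier. The role name, the database user, and the 24h period in the example are hypothetical, and VAULT_BIN is only there so the commands can be previewed with echo.

```shell
#!/bin/sh
set -eu

# VAULT_BIN can be pointed at `echo` to preview the commands.
VAULT_BIN="${VAULT_BIN:-vault}"

# create_static_role ROLE_NAME DB_USERNAME PERIOD
# Registers an existing database user whose password Vault rotates every PERIOD.
create_static_role() {
    "$VAULT_BIN" write "database/static-roles/$1" \
        db_name=my-postgres-db \
        username="$2" \
        rotation_period="$3"
}

# read_static_creds ROLE_NAME
# Reads the current password of the static role.
read_static_creds() {
    "$VAULT_BIN" read "database/static-creds/$1"
}

# Example (hypothetical role and user names):
# create_static_role my-static-role app_user 24h
# read_static_creds my-static-role
```

For this to work, the static role name also has to be listed in allowed_roles on the connection configuration, next to my-app-role.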

Integration and Security Tips

Rotating secrets is one thing, but ensuring this process happens securely and without disrupting our systems is another. Here are some key points we need to consider:

  1. Zero Trust and Least Privilege: Every service should have only as much access as it needs. Credentials obtained from Vault or a similar tool must be specific to that service and that task.
  2. Access Control (RBAC): Strictly control who can access secrets, when, and under what conditions. Vault's Policy system is highly effective in this regard.
  3. Audit Logs: Every secret access and rotation process must be logged. These logs are vital for monitoring and analyzing security events. You can centrally collect and analyze these logs using logging tools like journald or SIEM solutions.
  4. Rollback Strategy: You must have a plan for what to do if an error occurs during automatic rotation. This could be automatically reverting to the old key or triggering an alert for manual intervention. Once, due to an error in a systemd unit file, the service couldn't start with the new key, and I had to slow down the process with a workaround like sleep 360. Then I fixed the error and continued where I left off.
  5. Lease Time Adjustment: Keep the lease time of dynamically generated credentials as short as possible. This limits the lifetime of a leaked credential. For example, durations like 30 minutes or 1 hour can be a good starting point.
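The rollback idea from the list above can be sketched roughly like this: write the new secret, restart the service, run a health check, and restore the previous file if the check fails. The function name, the file path, and the myapp.service unit are all hypothetical, and the restart and health commands are passed in as strings so you can plug in whatever fits your setup.

```shell
#!/bin/sh
set -eu

# apply_secret ENV_FILE NEW_CONTENT RESTART_CMD HEALTH_CMD
# Writes the new secret, restarts the service, and restores the previous
# file if the health check fails. Returns 0 on success, 1 after a rollback.
apply_secret() {
    env_file="$1"; new_content="$2"; restart_cmd="$3"; health_cmd="$4"

    cp -p "$env_file" "$env_file.bak"      # keep the previous secret
    printf '%s\n' "$new_content" > "$env_file"

    if sh -c "$restart_cmd" && sh -c "$health_cmd"; then
        rm -f "$env_file.bak"
        return 0
    fi

    echo "health check failed, rolling back" >&2
    mv "$env_file.bak" "$env_file"
    sh -c "$restart_cmd" || true
    return 1
}

# Example (hypothetical unit name and path):
# apply_secret /etc/myapp/db.env "DB_PASSWORD=$NEW_PASSWORD" \
#     "systemctl restart myapp.service" \
#     "systemctl is-active --quiet myapp.service"
```

Keep in mind that restoring the file only helps if the old credential is still accepted by the database, so the rotation itself has to leave an overlap window in which both the old and the new secret are valid.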

You can configure your applications to authenticate to Vault using Vault's auth/approle mechanism. This allows applications to securely pull credentials by creating a unique role and secret ID for each of them.

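#!/usr/bin/env bash
# Log in to Vault with AppRole and fetch database credentials (simple example)
set -euo pipefail

VAULT_ADDR="http://your-vault-address:8200"   # use https:// outside local testing
APP_ROLE_ID="your-app-role-id"
APP_SECRET_ID="your-app-secret-id"

# Build the login payload with jq so special characters are escaped correctly
LOGIN_PAYLOAD=$(jq -n --arg role_id "$APP_ROLE_ID" --arg secret_id "$APP_SECRET_ID" \
    '{role_id: $role_id, secret_id: $secret_id}')

# Get token (-f makes curl fail on HTTP errors instead of passing on an error body)
TOKEN=$(curl -sf \
    -X POST \
    -d "$LOGIN_PAYLOAD" \
    "$VAULT_ADDR/v1/auth/approle/login" | jq -r .auth.client_token)

# Pull database credentials
DB_CREDS=$(curl -sf \
    -H "X-Vault-Token: $TOKEN" \
    "$VAULT_ADDR/v1/database/creds/my-app-role" | jq -c .data)

DB_USERNAME=$(echo "$DB_CREDS" | jq -r .username)
DB_PASSWORD=$(echo "$DB_CREDS" | jq -r .password)

# Do not print the password; it would end up in logs such as journald
echo "Username: $DB_USERNAME (password received: ${#DB_PASSWORD} characters)"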

This code snippet gives an idea of how your application can retrieve credentials from Vault. In real-world applications, this process is done in a more secure and automated manner.

🔥 Important Security Warning

The above code example is simplified. In real production environments, sensitive values like APP_SECRET_ID should not be hard-coded in scripts. Deliver them through a secure channel (for example, Vault's response wrapping or your deployment tool's secret injection), keep them short-lived, and never write them to logs.
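The role ID and secret ID used in the login example have to be created by someone with more privileges than the application. Here is a minimal sketch of that side, assuming an application called my-app (a name I made up) that should only read credentials for my-app-role. The policy name and the TTL values are illustrative assumptions, and VAULT_BIN is only there so the commands can be previewed with echo.

```shell
#!/bin/sh
set -eu

# VAULT_BIN can be pointed at `echo` to preview the commands.
VAULT_BIN="${VAULT_BIN:-vault}"

# app_policy DB_ROLE
# Prints a policy that only allows reading credentials for one database role.
app_policy() {
    cat <<EOF
path "database/creds/$1" {
  capabilities = ["read"]
}
EOF
}

# setup_approle APP_NAME DB_ROLE
# Creates the policy and an AppRole bound to it with short-lived tokens.
setup_approle() {
    app_policy "$2" | "$VAULT_BIN" policy write "$1-policy" -
    "$VAULT_BIN" write "auth/approle/role/$1" \
        token_policies="$1-policy" \
        token_ttl=30m \
        token_max_ttl=60m \
        secret_id_ttl=24h
}

# Example (run by an operator or CI job, not by the application itself):
# vault auth enable approle
# setup_approle my-app my-app-role
# vault read auth/approle/role/my-app/role-id
# vault write -wrap-ttl=120s -f auth/approle/role/my-app/secret-id
```

The last command returns the secret ID inside a wrapping token instead of in plain text, so whatever delivers it to the application never sees the actual value.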

Future Trends: AI-Powered Secret Management

Looking to the future, AI is likely to play a larger role in the secret management space. AI can identify suspicious access attempts by detecting anomalies, analyze risk profiles, and even automatically optimize rotation policies.

For example, if an AI model detects that a service is querying an API or a database table it doesn't normally use, it can flag this as a potential security breach and automatically trigger a rotation of the associated secret. This offers a proactive security approach.

AI could also, through techniques such as "prompt engineering," produce smarter guidance on how secrets are generated or used. For example, an AI agent might suggest the most appropriate password complexity and length for a specific task.

Currently, in the financial calculator projects I am developing, I am working on AI-based data validation and anomaly detection. These experiences allow me to see the potential of AI in the secret management domain more clearly.

💡 The Potential of AI

Artificial intelligence can make secret rotation processes smarter, more proactive, and more secure. The opportunities offered by AI in areas like anomaly detection, risk analysis, and automatic policy optimization are vast.

Although such advanced systems are not yet widespread, they will significantly impact our secret rotation strategies in the future. Therefore, it is important to closely follow and adapt to these technologies.

Conclusion: The Continuous Improvement Cycle

Secret rotation is not a one-time operation, but a process that needs continuous improvement. Moving from traditional manual methods to centralized and automated solutions is essential to counter today's complex cybersecurity threats.

Tools like HashiCorp Vault and AWS Secrets Manager are powerful platforms that facilitate this transition. By using these tools, you can securely store your secrets, enable automatic rotation, and strictly control access.

Remember, security is a marathon, not a sprint. Understanding the lifecycle of every single secret you use in your systems, reducing risks, and continuously adopting more secure practices will provide the greatest benefit in the long run.

As I mentioned in my previous post on preventing brute-force attacks with Fail2ban on Linux systems, keeping security measures constantly updated and being proactive against potential threats is the responsibility of all of us.
