I’ve been running a HashiCorp stack for a while now - Vault, Consul and Nomad across a small cluster of machines.
Over time, I’d built up a pretty reasonable setup, but it was always a bit… scattered. Terraform here, manual config there, a few docker-compose files lying around.
The thing is, I’m running all of this on self-hosted hardware. There’s no AWS ECS or Google Cloud Run to fall back on. So I needed a way to bootstrap and manage the orchestrator itself that wouldn’t become a maintenance burden.
I wanted to consolidate everything into a proper, reproducible Terraform-based setup that I could actually understand and, more importantly, rebuild when something inevitably went wrong.
The goal was simple: one reproducible, Terraform-driven setup for the whole stack.
This is how I built konvad-stack.
Before I get into the actual structure, I should explain why Docker…
When you’re running on self-hosted hardware, you don’t have the nice managed services that cloud providers give you. No ECS, no Cloud Run, no managed K8s. You’ve got bare metal and you need to make it useful.
The thing I like about this approach is I can start with a fresh VM, install Docker, and Terraform handles everything else. All the information about what’s deployed where is in Terraform state - I can see exactly what’s running on each host without having to SSH in and poke around.
Docker gives me a consistent unit of deployment, versioning by image digest, and a bootstrap story that starts and ends with “install Docker”.
The tradeoff is I’m using host network mode to avoid the complexity of overlay networks. But honestly, that’s been fine. The services need to talk to each other anyway, and it makes TLS certs simpler since I don’t need to worry about container IPs.
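To make that concrete, here’s a hedged sketch of a host-networked service container using the kreuzwerker/docker provider - the names and paths are illustrative, not my actual config:

resource "docker_container" "consul" {
  name         = "consul"
  image        = docker_image.consul.image_id
  network_mode = "host"            # bind straight to host ports, no overlay
  restart      = "unless-stopped"

  volumes {
    host_path      = "/consul/config"    # config rendered onto the host
    container_path = "/consul/config"
  }
}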
The entire stack is built around a strict dependency chain. Nothing works without the layer below it.
┌─────────────────────────────────────────────────────────┐
│                          Nomad                          │
│  (Jobs run here, using Vault secrets + Consul service)  │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│                         Consul                          │
│     (Service discovery, KV store, Connect for mesh)     │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│                          Vault                          │
│           (PKI, secrets engine, auth backends)          │
└─────────────────────────────────────────────────────────┘
You literally cannot deploy Nomad without Consul, and you cannot deploy Consul without Vault. This is a design decision I made (and though it may be common, I’ve heard of setups where Consul is deployed first).
Everything is organised under environments/ with a prod environment that contains separate directories for each stage of deployment:
environments/prod/
├── images/ # Docker image definitions
├── vault_setup/ # Initial Vault cluster deployment
├── vault_configure/ # Vault PKI, secrets engines, auth
├── consul/ # Consul server deployment
├── consul_configure/ # Consul datacenter config, roles
├── nomad/ # Nomad server deployment
├── nomad_configure/ # Nomad datacenter config
├── nomad_client_*/ # Nomad client clusters
└── deployment_roles/ # Non-nomad service deployment permissions
Each directory is a separate Terraform state, and subsequent stages pull their inputs from the outputs of the ones before them. The separation keeps the blast radius small… which is nice when something breaks. It also provides a boundary for providers - one stage creates Vault, the next configures a Vault provider against it, which creates a hard dependency at plan time on Vault actually running.
Also means I can destroy and recreate just the Nomad clients without touching the core cluster. Which I’ve had to do. More than once.
Before anything else, I need to build the Docker images that will be used across the cluster. These are stored in a separate Terraform state and referenced by other modules.
The images include the stack components themselves - Vault, Consul, Nomad - plus supporting containers like vault-agent and local-kms (more on that shortly).
I build these images once and reference them by digest. The image hash IS the version.
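As a sketch of what consuming those images looks like (assuming the images state exposes digest-pinned references as outputs - names here are illustrative):

data "terraform_remote_state" "images" {
  backend = "local"    # whichever backend the images state actually lives in
  config = {
    path = "../images/terraform.tfstate"
  }
}

locals {
  # Something like "registry.example.local/vault@sha256:…" - the digest is the version
  vault_image = data.terraform_remote_state.images.outputs.vault_image
}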
This is where everything begins. The vault_setup module deploys Vault nodes to each host.
Each Vault node gets its own container, its own TLS certificates, and a local KMS sidecar for auto-unseal.
The KMS thing… why run a KMS container when Vault could just use Shamir keys? Well, I wanted auto-unseal without actually depending on AWS KMS or similar. So there’s a local-kms container running on each Vault host that provides a KMS-compatible API.
Vault talks to this local KMS for unseal keys. No manual unseal when a node restarts, no dependency on external services, the unseal keys are stored in the KMS container’s data volume. If the host dies, I just restore the KMS data volume and Vault unseals automatically.
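I won’t reproduce the real config, but a Vault seal stanza pointed at a local KMS-compatible endpoint looks roughly like this (key ID, port and region are illustrative; the region only exists to satisfy the AWS SDK):

seal "awskms" {
  endpoint   = "http://127.0.0.1:8080"                  # the local-kms container
  kms_key_id = "bc436485-5092-42b8-92a3-0aa8b93536dc"   # key created in local-kms
  region     = "us-east-1"                              # required by the SDK, unused locally
}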
The thing is, Vault needs to be bootstrapped before anything else can happen. I can’t just deploy all three services at once… Vault must be running, initialised, and unsealed before Consul can even start to configure itself.
Once Vault is running, I configure the PKI, the secrets engines, and the auth backends.
This stage outputs all the connection details that other services will need - addresses, mount paths, role names, etc.
Consul servers are deployed next, each with a layered container setup.
On a Consul server, there are actually two containers:
┌─────────────────────────────────────┐
│       Consul Server Container       │
│  - Listens on 8500/8501             │
│  - Reads config from /consul/config │
│  - Mounts vault-agent volume        │
└──────────────┬──────────────────────┘
               │
┌──────────────▼──────────────────────┐
│    Vault Agent + Consul Template    │
│  - Authenticates to Vault           │
│  - Renders Consul config templates  │
│  - Manages token renewal            │
└─────────────────────────────────────┘
The vault-agent container runs consul-template, which authenticates to Vault, renders the Consul config templates, and manages token renewal.
This pattern is repeated everywhere - no static credentials in config files, everything is dynamically rendered.
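A minimal sketch of the agent side, assuming AppRole auth and a template rendered into the Consul config directory (Vault Agent’s template stanza embeds consul-template; all paths here are hypothetical):

auto_auth {
  method "approle" {
    config = {
      role_id_file_path   = "/vault-agent/role_id"
      secret_id_file_path = "/vault-agent/secret_id"
    }
  }
}

template {
  source      = "/vault-agent/templates/consul.hcl.tpl"   # pulls secrets from Vault
  destination = "/consul/config/consul.hcl"               # read by the Consul container
}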
With Consul running, I configure the datacenter - ACL roles and policies, plus the rest of the datacenter-level settings.
This outputs the datacenter configuration that Nomad will need.
Nomad servers follow the same pattern as Consul - a main container with a vault-agent sidecar for dynamic configuration.
Nomad needs both of the layers below it: the servers are configured to use Vault for workload identity and Consul for service discovery, creating a fully integrated stack.
Nomad clients are deployed similarly to the servers, with a few client-specific differences. The pattern is the same though - main container plus vault-agent sidecar, all configs dynamically rendered.
Since I’m running on self-hosted machines and not in some magical cloud, I need to actually get config files onto the boxes.
I use a pattern that goes like this:
resource "null_resource" "service_config" {
triggers = {
config = local.config_content
}
connection {
type = "ssh"
user = var.docker_host.username
host = var.docker_host.fqdn
private_key = file(var.docker_host.private_key)
bastion_host = var.docker_host.bastion_host
bastion_user = var.docker_host.bastion_user
}
provisioner "file" {
content = local.config_content
destination = "/service/config/file.hcl"
}
}
resource "docker_container" "service" {
# ... container config ...
lifecycle {
ignore_changes = [image, log_opts]
replace_triggered_by = [
null_resource.service_config,
null_resource.container_image,
]
}
}
So what happens is:
The null_resource has a trigger based on the config content
The file provisioner runs via SSH to upload the new config
The docker_container has replace_triggered_by pointing to the null_resource

It’s a bit of a hack… okay, it’s absolutely a hack. But it works, and it means I don’t need to run Ansible or some other config management tool. Terraform handles both the config deployment AND the container lifecycle.
The container image trigger handles the image digest issue - when the image name changes to a new digest, the container gets recreated even though Terraform thinks it’s already at the right version.
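That image trigger is just another null_resource keyed on the digest - something like this sketch (the local is hypothetical, standing in for a digest-pinned reference from the images state):

resource "null_resource" "container_image" {
  triggers = {
    # New digest => new trigger value => container gets replaced
    image = local.vault_image
  }
}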
This pattern is repeated everywhere - Vault, Consul, Nomad, clients, everything.
The authentication setup… there’s a few different methods depending on what we’re talking about.
Gitlab CI uses JWT auth to authenticate to Vault:
Gitlab CI ──JWT──> Vault ──(AppRole)──> Consul/Nomad tokens
The JWT is signed by Gitlab and verified by Vault. This gives the CI pipeline a Vault token with specific policies that allow it to deploy services.
Each service gets its own JWT role in Vault, bound to its Gitlab project path. So company/service-X can only authenticate as service-X.
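Here’s a hedged sketch of one of those roles via the Vault provider (the backend name matches the gitlab_jwt mount mentioned later; everything else is illustrative):

resource "vault_jwt_auth_backend_role" "service_x" {
  backend        = "gitlab_jwt"
  role_name      = "service-X"
  role_type      = "jwt"
  user_claim     = "project_path"
  token_policies = ["service-X-deployment"]

  # Only this exact project path can authenticate as this role
  bound_claims = {
    project_path = "company/service-X"
  }
}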
Services running in Nomad use workload identity. Nomad signs a JWT for each task that includes metadata like job ID, task name, namespace, etc.
This JWT is used to authenticate to Vault and obtain a short-lived token scoped to the task’s own secrets.
The policies are templated with the JWT metadata, so each task can only access secrets under its own path:
path "service_secrets/data/global/dc1/{{identity.entity.aliases.jwt_backend.metadata.nomad_job_id}}/{{identity.entity.aliases.jwt_backend.metadata.nomad_task}}/*" {
capabilities = ["read"]
}
This means no long-lived tokens - everything is short-lived and automatically renewed. If a task crashes and restarts, it just gets a new JWT and carries on.
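For reference, this is roughly what requesting a Vault-bound identity looks like on the job side in Nomad 1.7+ (a sketch - values are illustrative):

task "app" {
  vault {}                    # ask Nomad to derive a Vault token for this task

  identity {
    name = "vault_default"    # the identity Nomad exchanges for the Vault token
    aud  = ["vault.io"]
    ttl  = "1h"
  }
}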
For VMs that aren’t part of the container-based setup, I use a JWT-based bootstrap system built around a single binary.
This removes the need for complex vault-agent + consul-template chains on VMs. The binary does everything and then execs into the Consul process, becoming the Consul agent itself.
The service_role module is where all of this comes together for actual deployments.
When I want to deploy a new service, I call this module once with the service name and Gitlab project path. It creates the full set of policies and roles across Vault, Consul and Nomad (detailed below), plus a Harbor robot account for pulling images.
The module outputs a complex object with everything the deployment Terraform needs:
output "service_role" {
value = {
vault_approle_deployment_role_id = "..."
vault_approle_deployment_secret_id = "..."
vault_consul_engine_path = "consul-dc1"
vault_consul_role_name = "my-service"
vault_nomad_engine_path = "nomad-global"
vault_nomad_role_name = "my-service"
vault = {
ca_cert = "..."
address = "https://vault.svc.example.local:8200"
}
consul = {
address = "https://consul.svc.example.local:8501"
datacenter = "dc1"
root_cert_public_key = "..."
}
nomad = {
address = "https://nomad.svc.example.local:4646"
region = "global"
datacenter = "dc1"
}
}
}
The deployment Terraform then uses this output to configure all its providers - each deployment gets exactly what it needs and nothing more.
The thing that makes this work is the separation between deployment-time and runtime permissions. I wrote about this in SecureVaultConsulNomadDeployments, but the basic idea is:
Deployment permissions (what Terraform needs): generating Consul and Nomad tokens, writing the application’s secrets, submitting jobs.

Runtime permissions (what the application needs): reading its own secrets, registering itself with Consul.
These are completely different. The deployment Terraform needs broad permissions to set everything up, but the application should only be able to access its own stuff.
The deployment starts with Gitlab CI authenticating to Vault using JWT. Gitlab signs a JWT for each job, Vault verifies it against Gitlab’s public keys, and boom - a Vault token with specific policies.
I use the project_path claim for authentication rather than project_id. The reason is readability: I can see at a glance that company/service-X is the correct binding for the service-X role, whereas project_id = 5231 tells me nothing. Plus all these projects are internal, each owner understands the importance of their project name/path, and a project path can’t be stolen unless the original project is renamed.
Once authenticated, the deployment Terraform uses a Vault secret that contains all the configuration it needs:
data "vault_kv_secret_v2" "config" {
mount = "deployment_secrets_kv"
name = "konvad/services/global/dc1/my-service"
}
locals {
config = merge(
data.vault_kv_secret_v2.config.data,
{
"consul" = jsondecode(data.vault_kv_secret_v2.config.data.consul)
"nomad" = jsondecode(data.vault_kv_secret_v2.config.data.nomad)
"vault" = jsondecode(data.vault_kv_secret_v2.config.data.vault)
}
)
}
This secret contains everything - Vault addresses, Consul endpoints, role names, policy names, domain names. The deployment Terraform doesn’t need any hardcoded values. It just reads this secret and configures its providers.
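The provider wiring then becomes a straight read from that merged local - roughly like this (field names assumed to match the secret’s layout above):

provider "consul" {
  address    = local.config.consul.address
  datacenter = local.config.consul.datacenter
}

provider "nomad" {
  address = local.config.nomad.address
  region  = local.config.nomad.region
}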
Each service gets its own set of policies and roles in Vault, Consul, and Nomad:
Vault:
deployment_policy - for Terraform to deploy the service
application_policy - for the application to read its secrets
token_role - to generate short-lived tokens for the application

Consul:

deployment_policy - for Terraform to register services and intentions
application_policy - for the application to register itself

Nomad:

deployment_policy - for Terraform to submit jobs
application_policy - for workload identity

The Harbor container registry gets a robot account for each service too, so Terraform can pull images during deployment without needing a shared credential.
This granularity means that if one service is compromised, the blast radius is limited to that service’s secrets and registration. It can’t touch anything else.
The Vault deployment policy gives Terraform exactly what it needs:
# Generate Consul token for service registration
path "consul-dc1/creds/my-service-deployment" {
  capabilities = ["read"]
}

# Generate Nomad token for job submission
path "nomad-global/creds/my-service-deployment" {
  capabilities = ["read"]
}

# Write application secrets
path "service_secrets/data/global/dc1/my-service/*" {
  capabilities = ["read", "list", "create", "update", "delete"]
}
The Consul deployment policy allows:
write on the service itself (for registration)
write on intentions (for service mesh config)

The Nomad deployment policy allows:

submit-job and dispatch-job in the namespace
csi-write-volume if the service needs volumes (most don't)

Vault has several auth methods configured:
AppRole (approle) - For services that need long-lived credentials (like the consul-template sidecars)
JWT (gitlab_jwt) - For Gitlab CI authentication. Each project gets a role bound to its project_path, so company/service-X can only authenticate as service-X
JWT (jwt_nomad_global_dc1) - For Nomad workload identity. Nomad signs JWTs for each task, Vault verifies them and maps the claims to policies
Consul secrets engine (consul-dc1) - Generates dynamic Consul tokens with specific policies
Nomad secrets engine (nomad-global) - Generates dynamic Nomad tokens with specific policies
This means no static tokens anywhere… everything is dynamically generated with the minimum required permissions.
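Consuming those engines from a deployment is just a read against the creds path - for instance (paths as in the deployment policy above; the field names come from Vault’s Consul and Nomad secrets engines):

data "vault_generic_secret" "consul_token" {
  path = "consul-dc1/creds/my-service-deployment"    # dynamic, short-lived
}

data "vault_generic_secret" "nomad_token" {
  path = "nomad-global/creds/my-service-deployment"
}

The resulting data["token"] and data["secret_id"] values then feed the Consul and Nomad providers for that one deployment.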
Every component communicates over TLS with proper certificate verification. No “trust me bro” HTTP traffic anywhere.
The PKI setup works like this: an offline, air-gapped root CA signs an intermediate CA hosted in Vault, and that intermediate issues all the service certificates. So the hierarchy is:
┌────────────────────────────────────────┐
│      Offline Root CA (air-gapped)      │
└──────────────────┬─────────────────────┘
                   │
       ┌───────────▼───────────┐
       │ Vault Intermediate CA │
       │   (signed by root)    │
       └───────────┬───────────┘
                   │
     ┌─────────────┼─────────────┐
     │             │             │
┌────▼────┐   ┌────▼────┐   ┌────▼────┐
│  Vault  │   │ Consul  │   │  Nomad  │
│  certs  │   │  certs  │   │  certs  │
└─────────┘   └─────────┘   └─────────┘
Each service gets certificates from this intermediate CA, including the service’s main hostname, localhost for local connections, the service’s IP address, and any relevant alt names.
For the Nomad client clusters, there’s a Traefik PKI role that allows generating wildcard certificates for specific domains. This means services running on that cluster can get certificates for things like *.example.local without needing individual certs for every service.
The role is scoped to specific domains that are allowed for that cluster, so one cluster can’t generate certs for another cluster’s services.
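A hedged sketch of such a role with the Vault provider (mount and domain names made up):

resource "vault_pki_secret_backend_role" "traefik" {
  backend = "pki_int"
  name    = "traefik-dc1"

  allowed_domains             = ["example.local"]
  allow_subdomains            = true    # covers *.example.local
  allow_wildcard_certificates = true
  allow_bare_domains          = false
}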
There’s a few patterns in the Terraform that make this whole thing manageable.
Instead of outputting individual values, I group related outputs into objects. So I can pass one big object instead of wiring up half a dozen variables every time.
output "vault_cluster" {
description = "Vault cluster configuration"
value = {
address = vault_cluster_address
ca_cert = var.ca_cert
pki_mount_path = vault_mount.pki.path
consul_static_mount_path = vault_mount.consul_static.path
service_secrets_mount_path = vault_mount.service_secrets.path
gitlab_jwt_auth_backend_path = vault_jwt_auth_backend.gitlab.path
}
}
When the next stage needs to use Vault, it just passes vault_cluster as one object. Makes the module calls cleaner… and I can add new outputs without breaking every downstream module.
Each layer is composed of smaller modules. The consul_server module isn’t one monolith:
module "consul_server" {
source = "../../../modules/consul/server"
datacenter = module.dc1
vault_cluster = data.terraform_remote_state.vault_configure.outputs.vault_cluster
root_cert = module.consul_certificate_authority
docker_images = data.terraform_remote_state.images.outputs
docker_host = local.hosts["banana"]
}
And consul_server itself is composed of:
container - the actual Docker container
image - the image build
vault_approle - the AppRole for consul-template

I can test and reuse pieces independently. If I need a Consul client somewhere else, I just use the consul/client module without having to think about how it works internally.
Different modules need to interact with different Docker hosts, so I use provider passing:
module "vault_node" {
source = "../../../modules/vault/node"
docker_host = var.docker_host
providers = {
docker = docker.vault
vault.vault-adm = vault.vault-adm
}
}
Each host gets its own Docker provider configured with the right SSH keys and bastion settings. The module doesn’t need to know HOW to connect to the host - it just uses whatever docker provider it’s handed and assumes it’s configured correctly.
I could use the same vault/node module to deploy to a completely different set of hosts just by changing the provider configuration.
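Each of those provider configurations is just an SSH-backed Docker endpoint - something like this (host and bastion values illustrative):

provider "docker" {
  alias    = "vault"
  host     = "ssh://deploy@vault-1.example.local:22"
  ssh_opts = ["-o", "ProxyJump=bastion.example.local"]
}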
Everything runs in Docker containers on the hosts. Has pros and cons.
The host network thing… it avoids having to deal with Docker overlay networks and makes TLS certificates simpler. But yeah, everything shares the host network namespace, which isn’t ideal from a security perspective.
For a homelab though? It’s fine. For production? I’d probably think harder about it.
There’s a few things I’d reconsider:
The replace_triggered_by pattern is a bit hacky. But it works, and the alternatives are worse.

But honestly? The stack works. It’s reproducible, it’s secure, and it follows a clear dependency chain. I can rebuild the entire thing from scratch in a few hours if I need to. That’s not nothing.
So that’s konvad-stack… a Terraform-based deployment of Vault, Consul and Nomad using Docker containers on self-hosted hardware.
The patterns that make it work: a strict bottom-up dependency chain, vault-agent sidecars rendering every config, a clean split between deployment-time and runtime permissions, and small composable modules wired together with grouped outputs.
It’s not perfect, but it’s mine, and it actually works. Which is more than I can say for some of the infrastructure I’ve dealt with over the years…