Recipe: DriftWatch Fleet
Point it at 40 servers. Get back tamper-evident proof of what changed. Merkle-chained, DSSE-signed, webhook-alerting.
The demo
pluck driftwatch ssh://web-{01..40}.prod.example.com \
--config /etc/nginx \
--signing-key ./keys/driftwatch.pem \
--audit-log ./audit/nginx.jsonl \
--webhook slack:https://hooks.slack.com/services/T.../B.../...
One command. Forty SSH hosts. DriftWatch:
- Opens a connection pool to every host (keyed by user@host:port|bastion|auth-fp).
- Hashes /etc/nginx/** on each host – with semantic normalisation so comment edits don't cause false-positive drift.
- Writes an Ed25519 DSSE (Dead Simple Signing Envelope) to ./audit/nginx.jsonl, Merkle-chained to the previous entry via prevHash.
- Fires the Slack webhook for any host whose hash changed since the last baseline.
- Exits 0 if the fleet is clean, 1 if drift was detected.
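To make the normalisation step concrete, here is a minimal sketch of hashing a config so that comment and whitespace edits do not change the digest. It is not DriftWatch's implementation: the function names `normalizeConfig` and `semanticHash` are hypothetical, it assumes nginx-style `#` comments, and it does not handle `#` inside quoted strings.

```typescript
import { createHash } from "node:crypto";

// Hypothetical normaliser, NOT DriftWatch's actual code.
// ASSUMPTION: "#" always starts a comment (naive: ignores "#" inside quotes).
function normalizeConfig(text: string): string {
  return text
    .split(/\r?\n/)
    .map((line) => line.replace(/#.*$/, "")) // strip comments
    .map((line) => line.trim().replace(/\s+/g, " ")) // fold whitespace
    .filter((line) => line.length > 0) // drop lines that are now empty
    .join("\n");
}

// SHA-256 over the normalised text rather than the raw bytes.
function semanticHash(text: string): string {
  return createHash("sha256").update(normalizeConfig(text), "utf8").digest("hex");
}

const original = "worker_processes 4;\n";
const cosmeticEdit = "# tuned by ops\nworker_processes    4;\n";
const realEdit = "worker_processes 8;\n";

console.log(semanticHash(original) === semanticHash(cosmeticEdit)); // true
console.log(semanticHash(original) === semanticHash(realEdit)); // false
```

A byte-level sha256 would flag the cosmetic edit as drift; hashing the normalised form is what keeps it quiet.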
What makes this different
Config-drift detection is a well-trodden category – Chef, Ansible, Puppet, Salt, and every SRE's crontab have their own takes. What DriftWatch does that none of those do:
| Feature | Why it matters |
|---|---|
| DSSE-signed attestations | Every drift report is an in-toto-compatible attestation you can verify with cosign verify-blob or any DSSE-aware tool. Chain-of-custody baked in. |
| Merkle-chained audit log | Every entry embeds the prevHash of the prior entry. A single excised line breaks the chain. pluck driftwatch verify-log <path> walks the chain and fails if any entry has been tampered with. |
| Semantic config hashing | sha256-of-bytes triggers on comment edits, whitespace, include reordering. DriftWatch normalises (strips comments, sorts includes, folds whitespace) before hashing so you only alert on real changes. |
| Fleet expansion at the URI layer | ssh://web-{01..40} expands before any sockets open. Bastion chains (ssh://jumphost/web-01) are part of the pool key, so ssh://prod-jump/web-01 and ssh://web-01 never share a connection. |
| Backpressure + reconnect | Runaway hosts get bounded queues with drop-oldest semantics. Dead hosts retry with exponential backoff + jitter (default 5 attempts, 1s–60s). |
| Zero dependencies for verification | The signed DSSE envelope is a plain JSON blob. pluck driftwatch verify-log is a pure function – no runtime, no plug-ins, no hosted service. |
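The reconnect behaviour in the table (5 attempts, delays between 1s and 60s) can be illustrated with a short sketch. The names `backoffCeiling`, `backoffDelay`, and `withRetry` are hypothetical, and the jitter strategy (uniform between the base delay and the capped exponential ceiling) is an assumption; the source does not say which jitter scheme DriftWatch uses.

```typescript
// Hypothetical helpers. The defaults mirror the figures quoted in the table.
interface BackoffOptions {
  attempts: number;
  baseMs: number;
  maxMs: number;
}

const DEFAULTS: BackoffOptions = { attempts: 5, baseMs: 1_000, maxMs: 60_000 };

// Largest delay allowed before retry `retry` (0-based): base * 2^retry, capped at maxMs.
function backoffCeiling(retry: number, opts: BackoffOptions = DEFAULTS): number {
  return Math.min(opts.maxMs, opts.baseMs * 2 ** retry);
}

// ASSUMPTION: jitter is uniform in [baseMs, ceiling]. `random` is injectable for tests.
function backoffDelay(
  retry: number,
  opts: BackoffOptions = DEFAULTS,
  random: () => number = Math.random,
): number {
  const ceiling = backoffCeiling(retry, opts);
  return opts.baseMs + Math.floor(random() * (ceiling - opts.baseMs + 1));
}

// Runs `task` up to opts.attempts times, sleeping a jittered delay between failures.
async function withRetry<T>(task: () => Promise<T>, opts: BackoffOptions = DEFAULTS): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.attempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < opts.attempts - 1) {
        const delay = backoffDelay(attempt, opts);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}

console.log([0, 1, 2, 3, 4, 5, 6].map((n) => backoffCeiling(n)));
// [ 1000, 2000, 4000, 8000, 16000, 32000, 60000 ]
```

Jitter matters for a fleet: without it, forty hosts that dropped at the same moment would all retry at the same moment too.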
Setting up the fleet
Pick a working directory and generate a keypair:
mkdir -p ~/.pluck/driftwatch/{keys,audit}
pluck keys generate --name prod --dir ~/.pluck/driftwatch/keys
# Writes ~/.pluck/driftwatch/keys/prod.pem (private) and prod.pub.pem (public)
Commit prod.pub.pem to the repo so anyone can verify signatures later. Keep prod.pem in a secrets manager.
Define a fleet manifest (optional – brace expansion works without this):
# fleets/prod.yaml
fleet: prod-web
hosts:
- ssh://web-{01..40}.prod.example.com
- ssh://api-{a,b,c}.prod.example.com
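For readers wondering what the two brace forms in the manifest expand to, here is a minimal sketch of the expansion. `expandBraces` is a hypothetical helper, not Pluck's parser; it assumes shell-like rules (zero-padding only when a bound has a leading zero) and does not support nested braces.

```typescript
// Hypothetical sketch of fleet URI expansion. Handles {01..40} ranges and {a,b,c} lists.
// ASSUMPTIONS: no nested braces; padding follows the shell convention.
function expandBraces(pattern: string): string[] {
  const match = /\{([^{}]+)\}/.exec(pattern);
  if (!match) return [pattern];

  const [group, body] = match;
  const prefix = pattern.slice(0, match.index);
  const suffix = pattern.slice(match.index + group.length);

  let alternatives: string[];
  const range = /^(\d+)\.\.(\d+)$/.exec(body);
  if (range) {
    const [, lo, hi] = range;
    // Pad to the wider bound only if either bound is written with a leading zero.
    const padded = /^0\d/.test(lo) || /^0\d/.test(hi);
    const width = padded ? Math.max(lo.length, hi.length) : 0;
    const start = Number(lo);
    const end = Number(hi);
    const step = start <= end ? 1 : -1;
    alternatives = [];
    for (let n = start; n !== end + step; n += step) {
      alternatives.push(String(n).padStart(width, "0"));
    }
  } else {
    alternatives = body.split(",");
  }

  // Recurse so patterns with several groups expand to their cross product.
  return alternatives.flatMap((alt) => expandBraces(prefix + alt + suffix));
}

const fleet = [
  ...expandBraces("ssh://web-{01..40}.prod.example.com"),
  ...expandBraces("ssh://api-{a,b,c}.prod.example.com"),
];
console.log(fleet.length); // 43
console.log(fleet[0]); // ssh://web-01.prod.example.com
```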
First run – baseline
The first run captures the baseline hashes:
pluck driftwatch --fleet fleets/prod.yaml \
--config /etc/nginx \
--signing-key ~/.pluck/driftwatch/keys/prod.pem \
--audit-log ~/.pluck/driftwatch/audit/nginx.jsonl \
--baseline
DriftWatch writes one DSSE-enveloped entry per host with the initial content hash. The audit log looks like:
{"payloadType":"application/vnd.in-toto+json","payload":"eyJfdHlwZSI6Imh0dHBzOi8vaW4tdG90by5pby9TdGF0ZW1lbnQvdjAuMSIsInN1Yml...","signatures":[{"keyid":"9f3a8b1c","sig":"..."}],"prevHash":null}
{"payloadType":"application/vnd.in-toto+json","payload":"...","signatures":[...],"prevHash":"sha256:0a1b2c3d..."}
Each payload is a base64-encoded in-toto Statement with:
- _type: "https://in-toto.io/Statement/v0.1"
- subject: [{ name: "ssh://web-01.prod.example.com/etc/nginx", digest: { sha256: "..." } }]
- predicateType: "https://pluck.run/drift/v1"
- predicate: the full host snapshot (file list, semantic hash, tool version, capturing host, signer fingerprint, timestamp)
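Because each log line is plain JSON with a base64 payload, reading an entry needs nothing beyond a JSON parser. The sketch below decodes one line into its in-toto Statement. The interface and function names are hypothetical, the sample entry is built in the snippet purely for illustration (digest and signature elided as in the text), and no signature is checked here.

```typescript
import { Buffer } from "node:buffer";

// Shapes inferred from the log excerpt above; field names beyond those shown are not assumed.
interface DsseEnvelope {
  payloadType: string;
  payload: string; // base64
  signatures: { keyid: string; sig: string }[];
  prevHash: string | null;
}

interface InTotoStatement {
  _type: string;
  subject: { name: string; digest: Record<string, string> }[];
  predicateType: string;
  predicate: unknown;
}

// Parses one JSONL line and returns the decoded Statement (no signature check).
function decodeStatement(logLine: string): InTotoStatement {
  const envelope = JSON.parse(logLine) as DsseEnvelope;
  if (envelope.payloadType !== "application/vnd.in-toto+json") {
    throw new Error(`unexpected payloadType: ${envelope.payloadType}`);
  }
  const json = Buffer.from(envelope.payload, "base64").toString("utf8");
  return JSON.parse(json) as InTotoStatement;
}

// Illustrative entry only: values copied from the description, predicate left empty.
const sample: InTotoStatement = {
  _type: "https://in-toto.io/Statement/v0.1",
  subject: [{ name: "ssh://web-01.prod.example.com/etc/nginx", digest: { sha256: "..." } }],
  predicateType: "https://pluck.run/drift/v1",
  predicate: {},
};
const sampleLine = JSON.stringify({
  payloadType: "application/vnd.in-toto+json",
  payload: Buffer.from(JSON.stringify(sample), "utf8").toString("base64"),
  signatures: [{ keyid: "9f3a8b1c", sig: "..." }],
  prevHash: null,
});

console.log(decodeStatement(sampleLine).subject[0].name);
// ssh://web-01.prod.example.com/etc/nginx
```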
Continuous monitoring
Run DriftWatch on a schedule (systemd timer, cron, or the hosted /v1/schedules API):
# A crontab entry must fit on one line; cron does not support backslash continuations.
*/15 * * * * pluck driftwatch --fleet /etc/pluck/fleets/prod.yaml --config /etc/nginx --signing-key /etc/pluck/keys/prod.pem --audit-log /var/log/pluck/nginx.jsonl --webhook slack:https://hooks.slack.com/...
Every 15 minutes, DriftWatch:
- Reuses the SSH connection pool (LRU-evicted when idle).
- Pulls /etc/nginx/** in parallel across the fleet with --parallel (default 10).
- Compares the semantic hash to the baseline.
- Emits a signed DSSE entry for every host (drift or clean) – the chain is complete.
- Fires the webhook only on drift, formatted for the target platform (Slack blocks, PagerDuty incidents, Opsgenie alerts, generic webhook).
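The compare, log, and alert steps above follow one rule: every host produces a log entry, but only drifted hosts produce an alert. A minimal sketch of that split follows. `compareToBaseline` and its types are hypothetical, the hashes are placeholder strings, and treating a host with no baseline as drift is an assumption the source does not address.

```typescript
type HostStatus = "clean" | "drift";

interface HostResult {
  host: string;
  status: HostStatus;
  baseline: string | undefined;
  current: string;
}

// Returns one entry per host (for the audit log) plus the drifted subset (for the webhook).
function compareToBaseline(
  baseline: ReadonlyMap<string, string>,
  current: ReadonlyMap<string, string>,
): { entries: HostResult[]; alerts: HostResult[]; exitCode: 0 | 1 } {
  const entries: HostResult[] = [];
  current.forEach((hash, host) => {
    const expected = baseline.get(host);
    // ASSUMPTION: a host with no recorded baseline counts as drift.
    const status: HostStatus = expected === hash ? "clean" : "drift";
    entries.push({ host, status, baseline: expected, current: hash });
  });
  const alerts = entries.filter((entry) => entry.status === "drift");
  // Mirrors the documented exit codes: 0 if the fleet is clean, 1 if drift was detected.
  return { entries, alerts, exitCode: alerts.length > 0 ? 1 : 0 };
}

// Placeholder hashes for illustration.
const baselineHashes = new Map([["web-01", "aaa"], ["web-02", "bbb"]]);
const currentHashes = new Map([["web-01", "aaa"], ["web-02", "ccc"]]);

const report = compareToBaseline(baselineHashes, currentHashes);
console.log(report.entries.length); // 2
console.log(report.alerts.map((a) => a.host)); // only web-02
console.log(report.exitCode); // 1
```

Logging clean hosts as well as drifted ones is what lets an auditor distinguish "nothing changed" from "nothing was checked".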
Verifying the audit log
Any auditor can verify the chain without Pluck installed – the DSSE envelopes are standalone:
# With the standalone CLI (backlog – see IDEAS.md for `@sizls/verify`):
npx @sizls/verify chain ./audit/nginx.jsonl --public-key prod.pub.pem
# Or with Pluck today:
pluck driftwatch verify-log ./audit/nginx.jsonl
# Exit code: 0 valid, 1 broken chain, 2 input error
# Or with standard tooling (cosign / dsse-verify):
cosign verify-blob --bundle entry.json --key prod.pub.pem
verify-log walks the chain from the first entry, confirms each prevHash matches the sha256 of the prior entry, verifies every Ed25519 signature against the public key, and reports any discrepancy.
A single tampered entry – even a one-byte edit – breaks the chain and fails the verify. That's the whole point.
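The walk described above is small enough to sketch in full. This is not the verify-log source. It assumes that prevHash is "sha256:" plus the hex digest of the previous raw log line (the text says only "the sha256 of the prior entry") and that signatures cover the standard DSSE pre-authentication encoding (PAE). The names `pae`, `lineHash`, `verifyLog`, and `makeEntry` are hypothetical, and the keypair is generated in the snippet purely for the demo.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

interface Envelope {
  payloadType: string;
  payload: string; // base64
  signatures: { keyid: string; sig: string }[]; // sig is base64
  prevHash: string | null;
}

// DSSE PAE: "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
function pae(payloadType: string, payload: Buffer): Buffer {
  const typeBytes = Buffer.from(payloadType, "utf8");
  return Buffer.concat([
    Buffer.from(`DSSEv1 ${typeBytes.length} `, "utf8"),
    typeBytes,
    Buffer.from(` ${payload.length} `, "utf8"),
    payload,
  ]);
}

// ASSUMPTION: the chain hash covers the raw JSONL line of the previous entry.
function lineHash(line: string): string {
  return "sha256:" + createHash("sha256").update(line, "utf8").digest("hex");
}

function verifyLog(lines: string[], publicKey: KeyObject): { ok: boolean; failedAt?: number; reason?: string } {
  let expectedPrev: string | null = null;
  for (let i = 0; i < lines.length; i++) {
    const entry = JSON.parse(lines[i]) as Envelope;
    if (entry.prevHash !== expectedPrev) return { ok: false, failedAt: i, reason: "prevHash mismatch" };
    const message = pae(entry.payloadType, Buffer.from(entry.payload, "base64"));
    // Require at least one signature, and every signature must verify (Ed25519 takes a null digest).
    const signed =
      entry.signatures.length > 0 &&
      entry.signatures.every((s) => verify(null, message, publicKey, Buffer.from(s.sig, "base64")));
    if (!signed) return { ok: false, failedAt: i, reason: "bad signature" };
    expectedPrev = lineHash(lines[i]);
  }
  return { ok: true };
}

// Demo: build a two-entry log with a throwaway key, then verify it.
const demoKeys = generateKeyPairSync("ed25519");
function makeEntry(statement: object, prevHash: string | null): string {
  const payloadType = "application/vnd.in-toto+json";
  const payload = Buffer.from(JSON.stringify(statement), "utf8");
  const sig = sign(null, pae(payloadType, payload), demoKeys.privateKey).toString("base64");
  const envelope: Envelope = { payloadType, payload: payload.toString("base64"), signatures: [{ keyid: "demo", sig }], prevHash };
  return JSON.stringify(envelope);
}

const firstEntry = makeEntry({ host: "web-01" }, null);
const secondEntry = makeEntry({ host: "web-02" }, lineHash(firstEntry));

console.log(verifyLog([firstEntry, secondEntry], demoKeys.publicKey).ok); // true
console.log(verifyLog([secondEntry], demoKeys.publicKey).reason); // prevHash mismatch
```

Removing the first line leaves the second entry pointing at a hash that no longer exists in the file, which is how an excised entry is detected.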
Programmatic use
Today the driftwatch orchestrator is exposed via the CLI only – the one-liner under The demo is the canonical entry point. A standalone driftwatch() function is tracked on the IDEAS backlog; until it lands, you can compose the primitives that driftwatch itself uses:
import { createPluck } from "@sizls/pluck";

// privateKeyPem, baselineHash, and slack are assumed to be defined by the caller.
const pluck = createPluck({ signingKey: privateKeyPem });

// 1. Read each config file over SSH with fleet brace expansion
const result = await pluck(
  "ssh://web-{01..40}.prod.example.com/etc/nginx/nginx.conf",
);

// 2. Hash + compare against the baseline you store yourself
const sha = result.metadata.contentSha256;
if (sha !== baselineHash) {
  await slack.post(
    "#sre-alerts",
    `Drift on ${result.metadata.host}: sha ${sha}`,
  );
}
The CLI's added value is the Merkle-chained audit log, Ed25519-signed entries, backoff under fleet churn, and the verify-log replay. All of that is on the roadmap for a standalone programmatic API; for now, shell out to pluck driftwatch when you need those guarantees.
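Shelling out from Node might look like the sketch below. It uses only the flags shown in this recipe and the documented exit codes (0 clean, 1 drift). The wrapper names `buildArgs`, `interpretExit`, and `runDriftwatch` are hypothetical, and mapping every other exit code to "error" is an assumption.

```typescript
import { spawnSync } from "node:child_process";

interface DriftwatchOptions {
  fleet: string;
  config: string;
  signingKey: string;
  auditLog: string;
  webhook?: string;
}

type FleetStatus = "clean" | "drift" | "error";

// Builds the argument list from flags that appear elsewhere in this recipe.
function buildArgs(options: DriftwatchOptions): string[] {
  const args = [
    "driftwatch",
    "--fleet", options.fleet,
    "--config", options.config,
    "--signing-key", options.signingKey,
    "--audit-log", options.auditLog,
  ];
  if (options.webhook) args.push("--webhook", options.webhook);
  return args;
}

// Documented: 0 = fleet clean, 1 = drift detected.
// ASSUMPTION: anything else (including a null status from a signal) is an error.
function interpretExit(code: number | null): FleetStatus {
  if (code === 0) return "clean";
  if (code === 1) return "drift";
  return "error";
}

// Runs the CLI synchronously, passing its output straight through to this process.
function runDriftwatch(options: DriftwatchOptions): FleetStatus {
  const result = spawnSync("pluck", buildArgs(options), { stdio: "inherit" });
  if (result.error) return "error"; // e.g. pluck is not on PATH
  return interpretExit(result.status);
}

const exampleArgs = buildArgs({
  fleet: "fleets/prod.yaml",
  config: "/etc/nginx",
  signingKey: "./keys/driftwatch.pem",
  auditLog: "./audit/nginx.jsonl",
});
console.log(exampleArgs.join(" "));
```

Passing arguments as an array rather than a shell string avoids quoting problems with paths and webhook URLs.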
Cost comparison
For a 40-host fleet with 15-minute polling:
| Tool | Monthly cost | Signed? | Chain-of-custody? |
|---|---|---|---|
| Datadog Infra | $31 × 40 = $1,240/mo | No | No |
| New Relic Infra | $0.25/GB × ingest = ~$400-800 | No | No |
| Nagios XI | $1,995/yr + ops = ~$500/mo fully loaded | No | No |
| Pluck DriftWatch | $0 (OSS, self-hosted) | Yes (Ed25519 + DSSE) | Yes (Merkle) |
The cost story is real, but it's table stakes. The differentiator is the cryptographic chain of custody – none of the commercial tools in the table above produces signed, tamper-evident drift reports.
What's next
- Concepts: Connect – SSH connection pooling, fleet URI expansion.
- Concepts: Act – the signing primitives DriftWatch reuses.
- Reference: CLI – pluck tail, pluck query, the other SSH-fleet primitives.
- Recipe: Snitch Privacy – the URL equivalent: signed forensic audit of a single page.