Recipe: DriftWatch Fleet

Point it at 40 servers. Get back tamper-evident proof of what changed. Merkle-chained, DSSE-signed, webhook-alerting.


The demo

Shell
pluck driftwatch "ssh://web-{01..40}.prod.example.com" \
  --config /etc/nginx \
  --signing-key ./keys/driftwatch.pem \
  --audit-log ./audit/nginx.jsonl \
  --webhook slack:https://hooks.slack.com/services/T.../B.../...

One command. Forty SSH hosts. DriftWatch:

  1. Opens a connection pool to every host (keyed by user@host:port|bastion|auth-fp).
  2. Hashes /etc/nginx/** on each host – with semantic normalisation so comment edits don't cause false-positive drift.
  3. Writes an Ed25519-signed DSSE (Dead Simple Signing Envelope) entry to ./audit/nginx.jsonl, Merkle-chained to the previous entry via prevHash.
  4. Fires the Slack webhook for any host whose hash changed since the last baseline.
  5. Exits 0 if the fleet is clean, 1 if drift was detected.
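
The semantic normalisation in step 2 can be pictured roughly like this. It is a minimal sketch, not DriftWatch's actual normaliser: it assumes nginx-style `#` comments, treats `include` directives as order-independent by hoisting and sorting them, and the names `normaliseConfig` and `semanticHash` are made up for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical normaliser: strip comments, fold whitespace, sort includes.
// Naive on purpose: a `#` inside a quoted string would be cut off too, and
// includes are hoisted out of whatever block they appeared in.
function normaliseConfig(text: string): string {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.replace(/#.*$/, ""))          // drop `#` comments
    .map((line) => line.replace(/\s+/g, " ").trim())  // fold whitespace
    .filter((line) => line.length > 0);               // drop blank lines

  // Sort include directives so that reordering them is not reported as drift.
  const includes = lines.filter((l) => l.startsWith("include ")).sort();
  const rest = lines.filter((l) => !l.startsWith("include "));
  return [...includes, ...rest].join("\n");
}

// Hash the normalised text rather than the raw bytes.
function semanticHash(text: string): string {
  return createHash("sha256").update(normaliseConfig(text), "utf8").digest("hex");
}

const before = "worker_processes  4;\ninclude b.conf;\ninclude a.conf;\n";
const commented = "# a comment\nworker_processes 4;\ninclude a.conf;\ninclude b.conf;\n";
const changed = "worker_processes 8;\ninclude a.conf;\ninclude b.conf;\n";

console.log(semanticHash(before) === semanticHash(commented)); // true
console.log(semanticHash(before) === semanticHash(changed));   // false
```

A comment, extra whitespace, and reordered includes all hash the same here, while changing a directive value does not.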

What makes this different

Config-drift detection is a well-trodden category – Chef, Ansible, Puppet, Salt, and every SRE's crontab have their own takes. Here is what DriftWatch does that none of them do:

| Feature | Why it matters |
| --- | --- |
| DSSE-signed attestations | Every drift report is an in-toto-compatible attestation you can verify with cosign verify-blob or any DSSE-aware tool. Chain-of-custody baked in. |
| Merkle-chained audit log | Every entry embeds the prevHash of the prior entry. A single excised line breaks the chain. pluck driftwatch verify-log <path> walks the chain and fails if any entry has been tampered with. |
| Semantic config hashing | A plain sha256 of the raw bytes fires on comment edits, whitespace changes, and include reordering. DriftWatch normalises (strips comments, sorts includes, folds whitespace) before hashing, so you only alert on real changes. |
| Fleet expansion at the URI layer | ssh://web-{01..40} expands before any sockets open. Bastion chains (ssh://jumphost/web-01) are part of the pool key, so ssh://prod-jump/web-01 and ssh://web-01 never share a connection. |
| Backpressure + reconnect | Runaway hosts get bounded queues with drop-oldest semantics. Dead hosts retry with exponential backoff + jitter (default 5 attempts, 1s–60s). |
| Zero dependencies for verification | The signed DSSE envelope is a plain JSON blob. pluck driftwatch verify-log is a pure function – no runtime, no plug-ins, no hosted service. |
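
To make the reconnect row concrete, here is one way a retry schedule with those defaults could look. The source only states the bounds (5 attempts, 1s–60s). The doubling factor, the jitter strategy (a uniform pick between the base and the current ceiling), and every name below are assumptions for illustration.

```typescript
// Hypothetical retry schedule: ceiling doubles from 1 s and is capped at 60 s.
interface BackoffOptions {
  attempts: number; // total tries before giving up
  baseMs: number;   // smallest delay
  maxMs: number;    // largest delay
}

const DEFAULTS: BackoffOptions = { attempts: 5, baseMs: 1000, maxMs: 60000 };

// Upper bound for the delay before retry number `retry` (0-based).
function backoffCeiling(retry: number, opts: BackoffOptions = DEFAULTS): number {
  return Math.min(opts.maxMs, opts.baseMs * 2 ** retry);
}

// Jittered delay in [baseMs, ceiling]; `random` is injectable for testing.
function backoffDelay(
  retry: number,
  opts: BackoffOptions = DEFAULTS,
  random: () => number = Math.random,
): number {
  const ceiling = backoffCeiling(retry, opts);
  return Math.floor(opts.baseMs + random() * (ceiling - opts.baseMs));
}

// Run `task`, sleeping for a jittered delay between failed attempts.
async function withRetry<T>(task: () => Promise<T>, opts: BackoffOptions = DEFAULTS): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.attempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < opts.attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt, opts)));
      }
    }
  }
  throw lastError;
}

console.log([0, 1, 2, 3, 4, 5, 6].map((r) => backoffCeiling(r)));
// [ 1000, 2000, 4000, 8000, 16000, 32000, 60000 ]
```

The jitter matters for a fleet: without it, forty hosts that dropped at the same moment would all reconnect at the same moment too.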

Setting up the fleet

Pick a working directory and generate a keypair:

Shell
mkdir -p ~/.pluck/driftwatch/{keys,audit}
pluck keys generate --name prod --dir ~/.pluck/driftwatch/keys
# Writes ~/.pluck/driftwatch/keys/prod.pem (private) and prod.pub.pem (public)

Commit prod.pub.pem to the repo so anyone can verify signatures later. Keep prod.pem in a secrets manager.

Define a fleet manifest (optional – brace expansion works without this):

YAML
# fleets/prod.yaml
fleet: prod-web
hosts:
  - ssh://web-{01..40}.prod.example.com
  - ssh://api-{a,b,c}.prod.example.com

First run – baseline

The first run captures the baseline hashes:

Shell
pluck driftwatch --fleet fleets/prod.yaml \
  --config /etc/nginx \
  --signing-key ~/.pluck/driftwatch/keys/prod.pem \
  --audit-log ~/.pluck/driftwatch/audit/nginx.jsonl \
  --baseline

DriftWatch writes one DSSE-enveloped entry per host with the initial content hash. The audit log looks like:

jsonl
{"payloadType":"application/vnd.in-toto+json","payload":"eyJfdHlwZSI6Imh0dHBzOi8vaW4tdG90by5pby9TdGF0ZW1lbnQvdjAuMSIsInN1Ymp...","signatures":[{"keyid":"9f3a8b1c","sig":"..."}],"prevHash":null}
{"payloadType":"application/vnd.in-toto+json","payload":"...","signatures":[...],"prevHash":"sha256:0a1b2c3d..."}

Each payload is a base64-encoded in-toto Statement with:

  • _type: "https://in-toto.io/Statement/v0.1"
  • subject: [{ name: "ssh://web-01.prod.example.com/etc/nginx", digest: { sha256: "..." } }]
  • predicateType: "https://pluck.run/drift/v1"
  • predicate: the full host snapshot (file list, semantic hash, tool version, capturing host, signer fingerprint, timestamp)
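
If you want to inspect an entry by hand, decoding it is a few lines. The sketch below assumes only the fields listed above. The type names are illustrative, the predicate is left untyped because its exact schema is not spelled out here, and the sample entry is built locally with a placeholder digest rather than taken from a real log.

```typescript
import { Buffer } from "node:buffer";

// Shapes follow the fields listed above; the type names are illustrative.
interface DsseEnvelope {
  payloadType: string;
  payload: string; // base64-encoded Statement
  signatures: { keyid: string; sig: string }[];
  prevHash: string | null;
}

interface DriftStatement {
  _type: string;
  subject: { name: string; digest: { sha256: string } }[];
  predicateType: string;
  predicate: Record<string, unknown>;
}

// Decode one JSONL line into its envelope and the Statement inside it.
// This only decodes; it does not verify the signature or the chain.
function decodeEntry(line: string): { envelope: DsseEnvelope; statement: DriftStatement } {
  const envelope = JSON.parse(line) as DsseEnvelope;
  const json = Buffer.from(envelope.payload, "base64").toString("utf8");
  return { envelope, statement: JSON.parse(json) as DriftStatement };
}

// Build a sample line locally so the example runs without a real audit log.
const sample: DriftStatement = {
  _type: "https://in-toto.io/Statement/v0.1",
  subject: [
    { name: "ssh://web-01.prod.example.com/etc/nginx", digest: { sha256: "0".repeat(64) } }, // placeholder digest
  ],
  predicateType: "https://pluck.run/drift/v1",
  predicate: {},
};
const line = JSON.stringify({
  payloadType: "application/vnd.in-toto+json",
  payload: Buffer.from(JSON.stringify(sample), "utf8").toString("base64"),
  signatures: [],
  prevHash: null,
});

const { statement } = decodeEntry(line);
console.log(statement.subject[0].name); // ssh://web-01.prod.example.com/etc/nginx
```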

Continuous monitoring

Run DriftWatch on a schedule (systemd timer, cron, or the hosted /v1/schedules API):

Shell
# crontab entry: cron has no backslash line continuation, so the command stays on one line
*/15 * * * * pluck driftwatch --fleet /etc/pluck/fleets/prod.yaml --config /etc/nginx --signing-key /etc/pluck/keys/prod.pem --audit-log /var/log/pluck/nginx.jsonl --webhook slack:https://hooks.slack.com/...

Every 15 minutes, DriftWatch:

  1. Reuses the SSH connection pool (LRU-evicted when idle).
  2. Pulls /etc/nginx/** in parallel across the fleet with --parallel (default 10).
  3. Compares the semantic hash to the baseline.
  4. Emits a signed DSSE entry for every host (drift or clean), so the chain records every run with no gaps.
  5. Fires the webhook only on drift, formatted for the target platform (Slack blocks, PagerDuty incidents, Opsgenie alerts, generic webhook).
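
Steps 3 to 5 boil down to a comparison and an exit code. A minimal sketch of that logic, assuming hashes are kept per host in a plain map; `compareToBaseline` is a hypothetical helper, and treating a host with no baseline as "not drift" is an assumption rather than documented behaviour:

```typescript
// Hypothetical helper: compare current semantic hashes to the stored baseline.
type HashByHost = Record<string, string>;

interface DriftReport {
  drifted: string[]; // hosts whose hash differs from the baseline
  missing: string[]; // hosts with no baseline entry yet (assumed: not drift)
  exitCode: 0 | 1;   // mirrors the CLI: 0 clean, 1 drift detected
}

function compareToBaseline(baseline: HashByHost, current: HashByHost): DriftReport {
  const drifted: string[] = [];
  const missing: string[] = [];
  for (const host of Object.keys(current).sort()) {
    if (!Object.prototype.hasOwnProperty.call(baseline, host)) {
      missing.push(host);
    } else if (baseline[host] !== current[host]) {
      drifted.push(host);
    }
  }
  return { drifted, missing, exitCode: drifted.length > 0 ? 1 : 0 };
}

const baseline: HashByHost = { "web-01": "aaa", "web-02": "bbb" };
const current: HashByHost = { "web-01": "aaa", "web-02": "ccc", "web-03": "ddd" };

const report = compareToBaseline(baseline, current);
console.log(report.drifted);  // [ 'web-02' ]
console.log(report.missing);  // [ 'web-03' ]
console.log(report.exitCode); // 1
```

Only the hosts in `drifted` would trigger a webhook; every host still gets its own signed entry in the log.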

Verifying the audit log

Any auditor can verify the chain without Pluck installed – the DSSE envelopes are standalone:

Shell
# With the standalone CLI (not released yet; on the backlog, see IDEAS.md for `@sizls/verify`):
npx @sizls/verify chain ./audit/nginx.jsonl --public-key prod.pub.pem

# Or with Pluck today:
pluck driftwatch verify-log ./audit/nginx.jsonl
# Exit code: 0 valid, 1 broken chain, 2 input error

# Or with standard tooling (cosign / dsse-verify):
cosign verify-blob --bundle entry.json --key prod.pub.pem

verify-log walks the chain from the first entry, confirms each prevHash matches the sha256 of the prior entry, verifies every Ed25519 signature against the public key, and reports any discrepancy.

A single tampered entry – even a one-byte edit – breaks the chain and fails the verify. That's the whole point.
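
To show why both an excised line and a one-byte edit are caught, here is a small self-contained chain walker. It is a sketch under stated assumptions, not the verify-log implementation: it assumes prevHash is `sha256:` plus the hex digest of the previous raw JSONL line (the exact bytes Pluck hashes are not specified here), that `sig` is base64, and that signatures cover the standard DSSE pre-authentication encoding. The helper names and the throwaway demo key are hypothetical.

```typescript
import { Buffer } from "node:buffer";
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

interface Entry {
  payloadType: string;
  payload: string;
  signatures: { keyid: string; sig: string }[];
  prevHash: string | null;
}

// DSSE pre-authentication encoding: "DSSEv1 <len(type)> <type> <len(body)> <body>".
function pae(payloadType: string, body: Buffer): Buffer {
  const type = Buffer.from(payloadType, "utf8");
  return Buffer.concat([
    Buffer.from(`DSSEv1 ${type.length} `, "utf8"), type,
    Buffer.from(` ${body.length} `, "utf8"), body,
  ]);
}

// Assumed link format: "sha256:" + hex digest of the raw JSONL line.
const lineHash = (line: string): string =>
  "sha256:" + createHash("sha256").update(line, "utf8").digest("hex");

// Returns the 0-based index of the first bad entry, or -1 if the chain is intact.
function verifyChain(lines: string[], publicKey: KeyObject): number {
  let expectedPrev: string | null = null;
  for (let i = 0; i < lines.length; i++) {
    const entry = JSON.parse(lines[i]) as Entry;
    if (entry.prevHash !== expectedPrev) return i; // link to prior entry is broken
    const message = pae(entry.payloadType, Buffer.from(entry.payload, "base64"));
    const signed = entry.signatures.some((s) =>
      verify(null, message, publicKey, Buffer.from(s.sig, "base64")));
    if (!signed) return i; // no valid Ed25519 signature on this entry
    expectedPrev = lineHash(lines[i]);
  }
  return -1;
}

// Demo with a throwaway key so the example runs on its own.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

function appendEntry(log: string[], statement: object): void {
  const payloadType = "application/vnd.in-toto+json";
  const body = Buffer.from(JSON.stringify(statement), "utf8");
  const sig = sign(null, pae(payloadType, body), privateKey).toString("base64");
  log.push(JSON.stringify({
    payloadType,
    payload: body.toString("base64"),
    signatures: [{ keyid: "demo", sig }],
    prevHash: log.length > 0 ? lineHash(log[log.length - 1]) : null,
  }));
}

const log: string[] = [];
appendEntry(log, { host: "web-01", drift: false });
appendEntry(log, { host: "web-02", drift: false });
appendEntry(log, { host: "web-03", drift: true });

console.log(verifyChain(log, publicKey));              // -1 (intact)
console.log(verifyChain([log[0], log[2]], publicKey)); // 1 (middle entry excised)
```

Dropping the middle entry fails on the prevHash link, while editing a payload in place fails on the signature. The two checks cover different kinds of tampering.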


Programmatic use

Today the driftwatch orchestrator is exposed via the CLI only – the one-liner under The demo is the canonical entry point. A standalone driftwatch() function is tracked on the IDEAS backlog; until it lands, you can compose the primitives that driftwatch itself uses:

TypeScript
import { createPluck } from "@sizls/pluck";

// privateKeyPem, baselineHash, and slack come from your own code.
const pluck = createPluck({ signingKey: privateKeyPem });

// 1. Read each config file over SSH with fleet brace expansion
const result = await pluck(
  "ssh://web-{01..40}.prod.example.com/etc/nginx/nginx.conf",
);

// 2. Compare the reported content hash against the baseline you store yourself
const sha = result.metadata.contentSha256;
if (sha !== baselineHash) {
  await slack.post(
    "#sre-alerts",
    `Drift on ${result.metadata.host}: sha ${sha}`,
  );
}

The CLI's added value is the Merkle-chained audit log, Ed25519-signed entries, backoff under fleet churn, and the verify-log replay. All of that is on the roadmap for a standalone programmatic API; for now, shell out to pluck driftwatch when you need those guarantees.
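
Shelling out from TypeScript can stay small. The sketch below uses only the flags shown earlier in this recipe and the documented exit codes (0 clean, 1 drift). The wrapper names are hypothetical, and treating any other status as an error is an assumption.

```typescript
import { spawnSync } from "node:child_process";

interface DriftwatchOptions {
  fleet: string;      // path to the fleet manifest
  config: string;     // remote directory to hash
  signingKey: string; // path to the private key PEM
  auditLog: string;   // path to the JSONL audit log
  webhook?: string;   // optional webhook target
}

type Outcome = "clean" | "drift" | "error";

// Build argv from the flags used elsewhere in this recipe.
function driftwatchArgs(o: DriftwatchOptions): string[] {
  const args = [
    "driftwatch",
    "--fleet", o.fleet,
    "--config", o.config,
    "--signing-key", o.signingKey,
    "--audit-log", o.auditLog,
  ];
  if (o.webhook) args.push("--webhook", o.webhook);
  return args;
}

// Map the documented exit codes onto an outcome; anything else is an error.
function interpretExit(status: number | null): Outcome {
  if (status === 0) return "clean";
  if (status === 1) return "drift";
  return "error";
}

// Run the CLI synchronously, passing its output straight through.
function runDriftwatch(o: DriftwatchOptions): Outcome {
  const child = spawnSync("pluck", driftwatchArgs(o), { stdio: "inherit" });
  return interpretExit(child.error ? null : child.status);
}

console.log(driftwatchArgs({
  fleet: "fleets/prod.yaml",
  config: "/etc/nginx",
  signingKey: "keys/prod.pem",
  auditLog: "audit/nginx.jsonl",
}).join(" "));
// driftwatch --fleet fleets/prod.yaml --config /etc/nginx --signing-key keys/prod.pem --audit-log audit/nginx.jsonl
```

Passing the arguments as an array (rather than building a shell string) keeps paths and webhook URLs from being re-parsed by a shell.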


Cost comparison

For a 40-host fleet with 15-minute polling:

| Tool | Monthly cost | Signed? | Chain-of-custody? |
| --- | --- | --- | --- |
| Datadog Infra | $31 × 40 = $1,240/mo | No | No |
| New Relic Infra | $0.25/GB × ingest = ~$400-800 | No | No |
| Nagios XI | $1,995/yr + ops = ~$500/mo fully loaded | No | No |
| Pluck DriftWatch | $0 (OSS, self-hosted) | Yes (Ed25519 + DSSE) | Yes (Merkle) |

The cost story is real, but it's table stakes. The differentiator is the cryptographic chain of custody – none of the tools compared above produces signed, tamper-evident drift reports.

