Whistle

Whistle is an anonymous-source intake pipeline for AI-related disclosures. It produces a public-log-anchored signed bundle without recording any operator identity that could later be subpoenaed.

Posture: 🟣 Red & Blue (dual-use)   ·   Status: alpha

What it does

Whistle is intended for an internal source at an AI lab who has documentation of a policy violation (for example, training data captured from users who had explicitly opted out) and needs to disclose it to journalists without leaving an identity trail.

The submitter packages documents into a JSON bundle. Whistle redacts content that matches secret patterns, applies operator-flagged scrubs, and refuses to ship if the writing style is too distinctive (stylometry can fingerprint authors from rare trigrams). Whistle then generates a fresh ephemeral signing key in memory, signs the redacted bundle, discards the key, and routes the signed package to the chosen press desks. The signature anchors what was sent and when on a public log; no operator-identity record is retained.
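
The sign-then-discard step can be sketched with Node's built-in crypto module. This is a minimal illustration, not Whistle's implementation: the function name sealWithEphemeralKey, the SealedBundle shape, and the use of SHA-256 for the digest and the fingerprint are assumptions. Only the ephemeral Ed25519 key and the 32-byte digest come from this page (see For developers).

```typescript
import { createHash, generateKeyPairSync, sign } from "node:crypto";

// Hypothetical result shape for illustration; not Whistle's actual type.
interface SealedBundle {
  digestHex: string;      // SHA-256 of the redacted bytes (assumed hash)
  signatureB64: string;   // Ed25519 signature over the raw 32-byte digest
  publicKeyB64: string;   // SPKI DER public key, kept so others can verify
  fingerprintHex: string; // SHA-256 of the public key, 64 hex characters
}

function sealWithEphemeralKey(redacted: Uint8Array): SealedBundle {
  // A fresh keypair per call: the private key exists only inside this function.
  const { publicKey, privateKey } = generateKeyPairSync("ed25519");
  const digest = createHash("sha256").update(redacted).digest();
  // Ed25519 hashes internally, so the algorithm argument is null.
  const signature = sign(null, digest, privateKey);
  const spki = publicKey.export({ type: "spki", format: "der" });
  // Only public material is returned; the private key is never serialized.
  return {
    digestHex: digest.toString("hex"),
    signatureB64: signature.toString("base64"),
    publicKeyB64: spki.toString("base64"),
    fingerprintHex: createHash("sha256").update(spki).digest("hex"),
  };
}

const bundleBytes = new TextEncoder().encode('{"schemaVersion":1}');
const firstSeal = sealWithEphemeralKey(bundleBytes);
const secondSeal = sealWithEphemeralKey(bundleBytes);
// Same bytes give the same digest, but each call uses a different key.
console.log(firstSeal.digestHex === secondSeal.digestHex);
console.log(firstSeal.fingerprintHex === secondSeal.fingerprintHex);
```

JavaScript cannot guarantee that key material is wiped from memory, so "discarded" here only means the private key is never exported and goes out of scope when the function returns.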

Who would use it

  • A research engineer at a frontier AI lab who finds the company is training on opted-out enterprise conversations.
  • A trust-and-safety reviewer who sees the company hid a model-card discrepancy from regulators.
  • A contracted red-teamer whose NDA forbids public disclosure but who has observations of a serious safety incident.
  • An employee at a hyperscaler who finds the firm is providing a custom model to a sanctioned customer.
  • A union steward at an AI-labelling vendor with observations of training-data laundering.

What you'll need

  • Node.js 20 or newer.
  • The Pluck CLI: npm i -g @sizls/pluck-bureau-cli.
  • The Tor Browser, or a Tor proxy on your local machine. Whistle does not carry the bytes itself – it signs them. You route them over Tor.
  • A laptop you trust. Before bundling, strip EXIF from photos, scrub PDF metadata, and rename screenshots so their filenames reveal nothing.
  • A pre-written summary of the observations in your own words. Edit it for stylometric blandness – short sentences, common phrasing.

Step-by-step

Build a JSON bundle that describes the observations. Keep the summary short and factual.

JSON
{
  "schemaVersion": 1,
  "summary": "Vendor X trained on opted-out enterprise conversations.",
  "evidence": [
    { "kind": "cassette-rekor-uuid", "value": "9f3a8b1c..." },
    { "kind": "screenshot-hash", "value": "a1b2c3..." }
  ]
}
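
Before submitting, it can help to check the bundle shape locally. The sketch below is an assumption-laden example: the field names are taken from the bundle above, but the type names, the parseBundle helper, and the specific rules (non-empty summary, string-valued evidence fields) are hypothetical and are not Whistle's own schema validation.

```typescript
// Field names follow the example bundle; the validation rules are assumptions.
interface EvidenceItem {
  kind: string;
  value: string;
}

interface WhistleBundle {
  schemaVersion: 1;
  summary: string;
  evidence: EvidenceItem[];
}

function parseBundle(json: string): WhistleBundle {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("bundle must be a JSON object");
  }
  const raw = data as Record<string, unknown>;
  if (raw.schemaVersion !== 1) throw new Error("schemaVersion must be 1");
  if (typeof raw.summary !== "string" || raw.summary.trim() === "") {
    throw new Error("summary must be a non-empty string");
  }
  if (!Array.isArray(raw.evidence)) throw new Error("evidence must be an array");
  const evidence = raw.evidence.map((item: unknown, index: number): EvidenceItem => {
    const entry = (item ?? {}) as Record<string, unknown>;
    if (typeof entry.kind !== "string" || typeof entry.value !== "string") {
      throw new Error(`evidence[${index}] needs string "kind" and "value"`);
    }
    return { kind: entry.kind, value: entry.value };
  });
  return { schemaVersion: 1, summary: raw.summary, evidence };
}

const parsedBundle = parseBundle(
  JSON.stringify({
    schemaVersion: 1,
    summary: "Vendor X trained on opted-out enterprise conversations.",
    evidence: [{ kind: "screenshot-hash", value: "a1b2c3..." }],
  }),
);
console.log(parsedBundle.evidence.length); // 1
```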

Submit it. Pick a category. Pick which press desks you want it routed to.

Shell
pluck bureau whistle submit ./submission.json \
  --category policy-violation \
  --routing "propublica,bellingcat,404media,eff" \
  --k-floor 5 \
  --stylometric 0.3

Whistle prints the redaction audit so you can see what was scrubbed, the ephemeral signing-key fingerprint (different on every run), and the submission ID.

whistle/submit: policy-violation submission sealed.
  submissionId:    7a1c...d903
  routedTo:        propublica, bellingcat, 404media, eff
  anonFingerprint: 4f2c1d...e0a9
  redaction:       k=8, style=0.12
  manual scrubs:   2
  secret scrubs:   2
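
One way to read the redaction line against the --k-floor and --stylometric flags is as a pair of threshold checks. The sketch below is a guess at that logic, not Whistle's code: the direction of each comparison is inferred from the sample run (k=8 against a floor of 5, style=0.12 against a threshold of 0.3), the handling of exact boundary values is an assumption, and the names refusalReasons, GateOptions, and stylometricMax are hypothetical.

```typescript
// Hypothetical gate; comparison directions are inferred from the sample output.
interface RedactionScores {
  kAnonymity: number;  // how many plausible authors the text blends into
  stylometric: number; // higher means more distinctive writing
}

interface GateOptions {
  kFloor: number;         // corresponds to --k-floor
  stylometricMax: number; // corresponds to --stylometric
}

function refusalReasons(scores: RedactionScores, options: GateOptions): string[] {
  const reasons: string[] = [];
  if (scores.kAnonymity < options.kFloor) {
    reasons.push(`k=${scores.kAnonymity} is below the floor of ${options.kFloor}`);
  }
  if (scores.stylometric > options.stylometricMax) {
    reasons.push(`style=${scores.stylometric} exceeds the threshold of ${options.stylometricMax}`);
  }
  return reasons;
}

const gate: GateOptions = { kFloor: 5, stylometricMax: 0.3 };
const sampleRun = refusalReasons({ kAnonymity: 8, stylometric: 0.12 }, gate);
const riskyRun = refusalReasons({ kAnonymity: 3, stylometric: 0.45 }, gate);
console.log(sampleRun.length); // 0, so the submission would ship
console.log(riskyRun.length);  // 2, so the submission would be refused
```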

Now notarize the body to the public Sigstore Rekor log using your operator's notarize tooling, and route the bytes themselves over Tor to the press webhook. The signature travels separately from the bytes: verifiers fetch the body Rekor stored, recompute the hash, and check the ephemeral key's signature.

If you want to add a press desk that isn't in the default list, register it before submitting:

Shell
pluck bureau whistle route <submission-uuid> \
  --add-target "https://desk.example/api/whistle" \
  --add-id "desk-example" \
  --accepts "training-data,policy-violation"

A journalist who later receives the leak runs:

Shell
pluck bureau whistle verify-submission <rekor-uuid>

The result is a cryptographic anchor proving that the bundle is exactly what the source sent at that timestamp, signed by a key that no longer exists anywhere.

Run it yourself

Drop this into a Node 20+ project (npm install @sizls/pluck-bureau-whistle tsx):

TypeScript
// index.ts
import { createWhistleSystem } from "@sizls/pluck-bureau-whistle";

async function main() {
  // Whistle's signer is an EPHEMERAL keypair generated per submission.
  // The system config does not need an operator signingKey.
  const system = createWhistleSystem({
    disablePausePoll: true,
    disableLogging: true,
  });

  try {
    const summary = JSON.stringify({
      schemaVersion: 1,
      summary: "Vendor X trained on opted-out enterprise conversations.",
      evidence: [
        { kind: "screenshot-hash", value: "a1b2c3d4e5f6789a0b1c2d3e4f506172" },
        { kind: "policy-version", value: "v3.2 (2026-02-01)" },
      ],
    });

    const result = system.submit(new TextEncoder().encode(summary), {
      category: "policy-violation",
      routing: { requested: ["propublica", "bellingcat"] },
    });

    console.log(`whistle/submit: ${result.submission.category} sealed.`);
    console.log(`  submissionId:    ${result.submission.submissionId.slice(0, 16)}...`);
    console.log(`  routedTo:        ${result.submission.routedTo.join(", ")}`);
    console.log(`  anonFingerprint: ${result.submission.anonSignerFingerprint.slice(0, 16)}...`);
    console.log(`  redaction:       k=${result.redaction.kAnonymity} style=${result.redaction.stylometric}`);
    console.log(`  secret scrubs:   ${result.redaction.secretReplacements}`);
    console.log(`  total submissions: ${system.facts.submissions().length}`);
  } finally {
    await system.shutdown();
  }
}

main().catch((err) => { console.error(err); process.exit(1); });

Run with npx tsx index.ts (tsx was installed locally above, so it is not on your PATH). Expected output:

whistle/submit: policy-violation sealed.
  submissionId:    7a1c0d3e7a8b9012...
  routedTo:        propublica, bellingcat
  anonFingerprint: 4f2c1d3e7a8b9012...
  redaction:       k=passes style=passes
  secret scrubs:   0
  total submissions: 1

(In production, after system.submit(...) you notarize the submission body to Sigstore Rekor and route the bytes to the press desk over Tor – see Operator Duties → Whistleblower-source protection.)

What you get

The journalist gets the redacted bundle plus a public-log anchor. If they publish a story citing it, anyone in the world (the company's own lawyers, an academic at MIT, a regulator at the FTC) can re-run the verifier and confirm the bundle has not been edited since the moment it was sealed. The vendor cannot credibly claim the document was altered after the fact, because the public log fixes the signed hash at the time of sealing.

You, the source, get a list of the press desks the bundle was dispatched to and a redaction audit confirming what was scrubbed. There is no operator account, no persisted key, no return address. If the press desk gets compelled, all they can hand over is what they received: bytes plus a signature from a key that no longer exists.

What it can't do

Whistle protects against trivial deanonymization. It does not make you invisible to a serious adversary. Concretely:

  • Stylometry guards reject obvious-fingerprint text but cannot save you from quoted internal jargon, project codenames, or unique phrasing you forgot to scrub.
  • File metadata is not scrubbed. EXIF in screenshots, embedded PDF authorship, document revision history – Whistle does not touch these. Strip them yourself.
  • Network-layer correlation. Tor protects against most adversaries; against a state-level adversary watching both ends of the network, Tor alone is not enough.
  • The press desk side. Whistle assumes the desks run zero-log ingestion. If a press desk is compromised or compelled, the bytes you sent could be retrieved.
  • Legal exposure. If the observations show that you bypassed access controls to obtain them, liability under the US Computer Fraud and Abuse Act (18 U.S.C. § 1030) and the UK Computer Misuse Act 1990 is real. Talk to a lawyer first when the category is policy-violation or safety-incident.

A real-world example

A staff engineer at an AI lab finds an internal document showing the company trained a customer-support model on conversations from users who had selected the "do not use my data" option. The engineer has the document, the policy text, and three internal messages from the data-pipelines team acknowledging the contamination. She writes a summary in plain language, runs pluck bureau whistle submit with --routing "propublica,404media", checks the redaction audit, confirms that her writing style passed the rare-trigram threshold, and routes the signed bytes over Tor to ProPublica's intake. Six weeks later, ProPublica publishes. The company disputes the account. ProPublica references the Rekor UUID; the public verifier confirms the document matches what was sealed before publication, signed by a one-time key. The submitter remains unidentified.


For developers

Predicate URI

https://pluck.run/Whistle.Submission/v1

Submissions sign a raw 32-byte digest with the ephemeral Ed25519 key. They are not yet wrapped in DSSE/in-toto, so cosign's verify-attestation cannot consume them today. The Pluck-specific verifier (pluck bureau whistle verify-submission <rekor-uuid>) reads the body from Rekor, recomputes the canonical hash, and checks the anon-key signature. DSSE wrapping is on the roadmap.
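
The check the verifier performs (recompute the hash, then check the anon-key signature) can be sketched with Node's built-in crypto module. This is an illustration under assumptions, not the Pluck verifier: verifySealedBody is a hypothetical helper, the hash is assumed to be SHA-256 over the raw body bytes (the page does not spell out the canonicalization), the public key is assumed to arrive as base64 SPKI DER, and nothing here talks to Rekor. The demo generates its own inputs so the block runs standalone.

```typescript
import { createHash, createPublicKey, generateKeyPairSync, sign, verify } from "node:crypto";

// Hypothetical helper: all three inputs are assumed to have been fetched already.
function verifySealedBody(
  body: Uint8Array,
  signatureB64: string,
  publicKeySpkiB64: string,
): boolean {
  // Recompute the 32-byte digest from the body (SHA-256 is an assumption).
  const digest = createHash("sha256").update(body).digest();
  const publicKey = createPublicKey({
    key: Buffer.from(publicKeySpkiB64, "base64"),
    format: "der",
    type: "spki",
  });
  // Ed25519 takes no separate hash algorithm, hence null.
  return verify(null, digest, publicKey, Buffer.from(signatureB64, "base64"));
}

// Demo inputs standing in for what a verifier would read from the log.
const demoKeys = generateKeyPairSync("ed25519");
const demoBody = new TextEncoder().encode("redacted bundle bytes");
const demoSignature = sign(
  null,
  createHash("sha256").update(demoBody).digest(),
  demoKeys.privateKey,
).toString("base64");
const demoPublicKey = demoKeys.publicKey
  .export({ type: "spki", format: "der" })
  .toString("base64");

const untouchedOk = verifySealedBody(demoBody, demoSignature, demoPublicKey);
const tamperedOk = verifySealedBody(
  new TextEncoder().encode("redacted bundle bytes!"),
  demoSignature,
  demoPublicKey,
);
console.log(untouchedOk); // true
console.log(tamperedOk);  // false: one changed byte changes the digest
```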

Programs composed

attest, notarize, press, broadcast. Whistle reuses Tripwire's secret-pattern redactor and layers k-anonymity + stylometric refusal on top.
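
As a rough picture of what a secret-pattern redactor does, the sketch below replaces matches with a placeholder and counts them. It is not Tripwire's redactor: the pattern list, the placeholder format, and the redactSecrets helper are illustrative assumptions, and the real pattern set is not shown on this page. The sample string uses AWS's published example access key ID.

```typescript
// Illustrative patterns only; Tripwire's actual pattern list is not shown here.
const SECRET_PATTERNS: ReadonlyArray<{ label: string; pattern: RegExp }> = [
  { label: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/g },
  { label: "github-token", pattern: /\bghp_[A-Za-z0-9]{36}\b/g },
  {
    label: "private-key-block",
    pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g,
  },
];

function redactSecrets(text: string): { text: string; secretReplacements: number } {
  let secretReplacements = 0;
  let redacted = text;
  for (const { label, pattern } of SECRET_PATTERNS) {
    // The callback runs once per match, so it doubles as a counter.
    redacted = redacted.replace(pattern, () => {
      secretReplacements += 1;
      return `[REDACTED:${label}]`;
    });
  }
  return { text: redacted, secretReplacements };
}

const scrubbed = redactSecrets("deploy used AKIAIOSFODNN7EXAMPLE for the bucket");
console.log(scrubbed.text);               // deploy used [REDACTED:aws-access-key-id] for the bucket
console.log(scrubbed.secretReplacements); // 1
```

The count mirrors the "secret scrubs" figure in the redaction audit; how Whistle itself tallies replacements is not documented here.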

Threat model

  • Ephemeral keys never persist by default. --persist-key requires --accept-persist-warning and --persist-path.
  • The submitter sees a different signing-key fingerprint on every CLI invocation; tests cover non-reuse.
  • The redactor is at least as aggressive as Tripwire's redact.ts; the Whistle layer adds k-anonymity refusal + rare-trigram stylometric refusal.
  • Strict ISO 8601 UTC, full 64-hex fingerprints, strict base64 signatures, raw 32-byte digest signing.
  • AbortSignal threading on every CLI action.
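
The format constraints in the list above could be enforced with checks along these lines. This is a sketch under assumptions: the page names the constraints but not their exact grammar, so the millisecond-precision timestamp form, lowercase hex, and the 64-byte signature length (the standard Ed25519 size) are choices made for illustration, and all four function names are hypothetical.

```typescript
// Assumed formats: the page names the constraints but not the exact grammar.

// Accepts only the form Date.prototype.toISOString() produces, e.g.
// 2026-02-01T00:00:00.000Z. Round-tripping rejects offsets and impossible dates.
function isStrictUtcTimestamp(value: string): boolean {
  const parsed = new Date(value);
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString() === value;
}

// Full fingerprint: exactly 64 lowercase hex characters, no truncation.
function isFullFingerprint(value: string): boolean {
  return /^[0-9a-f]{64}$/.test(value);
}

// Strict base64: standard alphabet, canonical padding, decoding to 64 bytes.
function isStrictBase64Signature(value: string): boolean {
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(value)) return false;
  const bytes = Buffer.from(value, "base64");
  // Re-encoding catches non-canonical input that Buffer would silently accept.
  return bytes.length === 64 && bytes.toString("base64") === value;
}

// Raw digest: exactly 32 bytes, not a hex or base64 string.
function isRawDigest(value: Uint8Array): boolean {
  return value.length === 32;
}

console.log(isStrictUtcTimestamp("2026-02-01T00:00:00.000Z"));      // true
console.log(isStrictUtcTimestamp("2026-02-01T01:00:00.000+01:00")); // false
console.log(isFullFingerprint("4f2c1d"));                           // false (truncated)
console.log(isStrictBase64Signature(Buffer.alloc(64).toString("base64"))); // true
console.log(isRawDigest(new Uint8Array(32)));                       // true
```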

Library surface

TypeScript
import {
  generateAnonKey,
  submitWhistle,
  defaultRedactionPolicy,
} from "@sizls/pluck-bureau-whistle";

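// The payload is the bundle as bytes, encoded as in the Run it yourself example.
const bodyBuffer = new TextEncoder().encode(
  JSON.stringify({
    schemaVersion: 1,
    summary: "Vendor X trained on opted-out enterprise conversations.",
    evidence: [{ kind: "policy-version", value: "v3.2 (2026-02-01)" }],
  }),
);

const anonKey = generateAnonKey();  // never persisted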

const result = submitWhistle({
  payload: bodyBuffer,
  category: "policy-violation",
  policy: defaultRedactionPolicy(),
  routing: { requested: ["propublica", "bellingcat"] },
  anonKey,
});

Verify a published cassette

Shell
pluck bureau whistle verify-submission <rekor-uuid>
