
Getting Started

MCP-First Pipeline

Pluck is built MCP-first. Every pipeline phase is exposed as an MCP (Model Context Protocol) tool, spec-compliant against MCP 2024-11-05; mutations are dry-run by default, and every act produces an Ed25519-signed receipt. Add one line to your agent config and your Claude / Cursor / Continue session can connect to 30 sources, extract structured data, shape it against Zod schemas, mutate with reversibility, and sense signals humans can't perceive.


Why MCP-first

The 2026 agent ecosystem runs on MCP. Claude Desktop, Cursor, Continue, the Claude Code CLI, and every serious agent runtime now speak the Model Context Protocol. Pluck is not a data pipeline library with "MCP support tacked on" – the MCP surface is the primary distribution story, and every piece of the pipeline is designed to compose through it.

What that means in practice:

  • Every phase is an MCP tool. pluck_extract, pluck_act, pluck_sense, pluck_snitch, pluck_probe, pluck_context, pluck_dowse, pluck_speak, pluck_radio – 9 tools covering the full pipeline surface.
  • Spec-compliant. Pluck's MCP server targets spec version 2024-11-05. Notifications return no response (JSON-RPC 2.0 §4.1). ping keepalive is implemented. Stdio transport handles CRLF line endings and empty lines correctly.
  • Safe by default. pluck_act defaults to dryRun: true so agents cannot accidentally mutate state. An agent must pass dryRun: false explicitly to execute.
  • Signed. Every mutation the agent performs produces an Ed25519-signed receipt. The receipt is verifiable without Pluck installed – just the public key.

If you are writing an agent today, adding Pluck means adding one stdio server to your MCP config. That's it.


Install

Shell
npm install @sizls/pluck-mcp

The package ships a pluck-mcp bin that runs the MCP server over stdio. Point your agent's MCP config at it.


Wire up an agent

Claude Desktop

Add to your ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

JSON
{
  "mcpServers": {
    "pluck": {
      "command": "npx",
      "args": ["-y", "@sizls/pluck-mcp"]
    }
  }
}

Restart Claude Desktop. The 9 Pluck tools appear automatically.

Cursor

In .cursor/mcp.json at your project root (or ~/.cursor/mcp.json globally):

JSON
{
  "mcpServers": {
    "pluck": {
      "command": "npx",
      "args": ["-y", "@sizls/pluck-mcp"]
    }
  }
}

Claude Code

Shell
claude mcp add pluck -- npx -y @sizls/pluck-mcp

Continue / other clients

Any MCP-spec client that supports stdio transport works the same way – run npx -y @sizls/pluck-mcp and the server handshakes over stdin/stdout.


Scoping the tool surface

Agents that only need a subset of Pluck's 9 tools can narrow the visible surface. Less tool-list noise means smaller system prompts and cheaper per-turn cost:

JSON
{
  "mcpServers": {
    "pluck": {
      "command": "npx",
      "args": ["-y", "@sizls/pluck-mcp"],
      "env": {
        "PLUCK_MCP_TOOLS": "extract,act"
      }
    }
  }
}

PLUCK_MCP_TOOLS is an allowlist (comma-separated, with or without the pluck_ prefix). PLUCK_MCP_EXCLUDE is a denylist. Exclude wins when the same tool appears in both.
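Combining the two: in the config below, pluck_act is named in the allowlist but removed again by the denylist, so the agent sees only pluck_extract and pluck_sense (the values are illustrative):

```json
{
  "mcpServers": {
    "pluck": {
      "command": "npx",
      "args": ["-y", "@sizls/pluck-mcp"],
      "env": {
        "PLUCK_MCP_TOOLS": "extract,act,sense",
        "PLUCK_MCP_EXCLUDE": "act"
      }
    }
  }
}
```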


The 9 tools

Each tool is listed with what your agent can do and its default safety posture:

  • pluck_extract – Pull structured content from any URL / DB / file / API. Returns typed text in any of 12 output formats. Safety: read-only.
  • pluck_act – Perform one of 27 actions across 9 actors: HTTP (post/put/patch/delete), GraphQL (mutate/query), browser (6 Playwright actions), browser-agent (LLM-driven agent-navigate), shell-write (fs:*, exec:command), email (send / send-with-attachment), AWS (s3/dynamodb/sqs/sns/lambda), GCP (storage/pubsub/firestore/functions), Azure (blob/servicebus/cosmos/functions). Every call produces a signed receipt. Safety: dryRun: true by default; browser-agent adds a 4-layer response-policy gate (allowedDomains, allowedActions, actionBudget, humanInLoop); cloud function URLs pass through a strict safeHttpsUrl host-suffix allowlist.
  • pluck_sense – Analyse an audio / video / text / image source for signals below human perception. 37 sensors across spectral (fft / spectrogram / pitch / tempo / chromagram / mfcc), decoded (dtmf / morse / fsk / psk / am-demod / fm-demod / ssb-demod), band (ultrasonic / infrasonic), diagnostic (noise-floor), identity (birdsong / rppg / animalsong), physiological (heartbeat / breathing), periodicity + anomaly, text-domain (cipher-classify / cipher-crack-caesar / cipher-crack-vigenere / steganography-text), image-domain (ela / heatmap / moire / flicker / rolling-shutter), plus CV-domain (faces / scene / ocr-text-regions / thermal / ground-anomaly). Three optional peers: sharp, face-api.js, @xenova/transformers. Safety: read-only.
  • pluck_snitch – Privacy audit any URL. Returns a forensic report covering trackers, fingerprinting, ultrasonic beacons, and PII leaks. Sign it by passing a signing key in the environment. Safety: read-only.
  • pluck_probe – Pre-flight introspection: source type, content type, estimated cost, recommended format. Zero full-pipeline cost. Safety: read-only.
  • pluck_context – "Where am I?" Schema.org, OG tags, robots.txt, sitemaps, known connectors that match, PII likelihood. Safety: read-only.
  • pluck_dowse – Zero-config signal reconnaissance: runs all sensors in fast mode against a signal source and ranks findings. Safety: read-only.
  • pluck_speak – Inverse of sense: encode JSON as ultrasonic, DTMF, or Morse. Safety: producer-only (returns base64 WAV).
  • pluck_radio – Decode RF protocols from SDR captures. Ships with the ADS-B aircraft decoder today; FM/AM/SSB demodulators available via pluck_sense. Safety: read-only.
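Since pluck_speak hands its waveform back as base64, consuming it is a two-line decode. A minimal sketch of saving it to disk – the wavBase64 field name is an assumption about the response shape, and the sample value is just a 12-byte RIFF/WAVE header, not real audio:

```typescript
import { writeFileSync } from "node:fs";

// Hypothetical pluck_speak result – the only documented fact is that the
// payload is a base64-encoded WAV.
const result = { wavBase64: "UklGRiQAAABXQVZF" };

// Decode the base64 string back into raw WAV bytes and write them out.
const bytes = Buffer.from(result.wavBase64, "base64");
writeFileSync("speak-output.wav", bytes);

// Well-formed WAV files open with the ASCII magic "RIFF" ... "WAVE".
console.log(bytes.subarray(0, 4).toString("ascii")); // "RIFF"
```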

Every tool's inputSchema is valid JSON Schema draft-7 and is what the agent sees when Pluck joins the session. Agents discover the tools dynamically – you don't have to document anything for Claude or Cursor; the server describes itself.

Shape is deliberately absent from the MCP tool list. Shape runs in-process between extract and act; it's a type-level contract, not an agent-facing tool. See Concepts: Shape for the pattern.


Zero-key LLM extraction (MCP sampling)

pluck_extract accepts a strategy argument – "css" | "regex" | "llm" | "hybrid". When the agent asks for "llm" or "hybrid" and no PLUCK_LLM_API_KEY is set in the server's environment, Pluck routes the extraction through the MCP host's own sampling/createMessage endpoint. The host's LLM answers the prompt; Pluck structures the call and parses the JSON response.

The practical effect:

  • No OPENAI_API_KEY / ANTHROPIC_API_KEY in the server env.
  • No rate-limit or cost on Pluck's side – the host's quota pays.
  • Model hints respected – pass llm: { model: "claude-opus-4-7" } and the host sees it as a modelPreferences.hints[] entry per the sampling spec.
JSON
{
  "tool": "pluck_extract",
  "arguments": {
    "uri": "https://news.ycombinator.com/item?id=42",
    "strategy": "llm",
    "prompt": "Extract the title, author, and number of comments.",
    "schema": {
      "type": "object",
      "properties": {
        "title":    { "type": "string" },
        "author":   { "type": "string" },
        "comments": { "type": "number" }
      },
      "required": ["title", "author", "comments"]
    }
  }
}

When the server answers initialize, its capability set includes sampling: {} – Claude Desktop / Cursor / Continue all see the server is sampling-aware and will answer the outbound request from their own model. Hosts that don't support sampling get a clean error back instead of a hang – SamplingClient enforces a 60-second timeout on every outbound request.
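The timeout guarantee can be sketched generically. This is the standard Promise.race pattern, not Pluck's actual SamplingClient source:

```typescript
// Generic request-timeout pattern: race the outbound request against a
// timer, so a host that never answers yields an error instead of a hang.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([p, timeout]).finally(() => {
    if (timer) clearTimeout(timer);
  });
}

// A host that never answers sampling/createMessage:
const never = new Promise<string>(() => {});
withTimeout(never, 50).catch((e) => console.log(e.message));
```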

This is the deepest "MCP-first" wedge in the tool. No other MCP server in the ecosystem inverts the usual client → server flow for LLM calls; every other LLM-aware server requires the user to bring a second API key.


Direct JSON-RPC (driving the server yourself)

Every MCP tool is reachable via vanilla JSON-RPC 2.0 over stdio. This is what Claude Desktop / Cursor / Continue send internally, and it's the easiest way to debug tool behaviour without an agent in the loop.

Shell
# Start the server; it reads JSON-RPC frames from stdin. For a one-shot
# call, pipe a frame in with echo; for live debugging, keep it attached
# to your terminal and type frames by hand.
npx -y @sizls/pluck-mcp

Once running, send frames on stdin:

jsonc
// 1) Handshake – every session starts here.
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05"}}

// Server responds with `capabilities: { tools: {}, sampling: {} }`.

// 2) Discover the tools – the JSON Schemas the agent reasons about.
{"jsonrpc":"2.0","id":2,"method":"tools/list"}

// 3) Call a tool. Arguments match the inputSchema from tools/list.
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "pluck_extract",
    "arguments": {
      "uri": "https://news.ycombinator.com",
      "format": "markdown"
    }
  }
}

Every response is a single line of JSON. Errors follow JSON-RPC conventions ({ error: { code, message } }); tool invocations that throw are wrapped as isError: true content blocks per MCP spec.
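For illustration, the two failure shapes look roughly like this (the values are representative, not captured output; -32601 is the standard JSON-RPC "method not found" code):

```jsonc
// Protocol-level failure: a JSON-RPC error object.
{"jsonrpc":"2.0","id":4,"error":{"code":-32601,"message":"Method not found"}}

// Tool-level failure: a successful JSON-RPC response whose result is
// flagged isError, per the MCP spec.
{"jsonrpc":"2.0","id":5,"result":{"isError":true,"content":[{"type":"text","text":"..."}]}}
```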

Concrete curl-style smoke test

Drop this into a one-off script when you want to confirm a server binary is healthy without wiring a full agent:

TypeScript
import { spawn } from "node:child_process";

const server = spawn("npx", ["-y", "@sizls/pluck-mcp"], {
  stdio: ["pipe", "pipe", "inherit"],
});

const send = (frame: Record<string, unknown>) => {
  server.stdin.write(JSON.stringify(frame) + "\n");
};

server.stdout.on("data", (buf) => {
  for (const line of buf.toString().split("\n").filter(Boolean)) {
    console.log("←", JSON.parse(line));
  }
});

send({ jsonrpc: "2.0", id: 1, method: "initialize", params: { protocolVersion: "2024-11-05" } });
send({ jsonrpc: "2.0", id: 2, method: "tools/list" });
send({
  jsonrpc: "2.0",
  id: 3,
  method: "tools/call",
  params: {
    name: "pluck_probe",
    arguments: { uri: "https://news.ycombinator.com" },
  },
});

setTimeout(() => server.kill(), 3000);

Expected output: three responses (initialize handshake, tools/list with 9 entries, probe result with sourceType, contentType, and estimatedCost).


Example prompts

Once Pluck is wired in, natural-language prompts just work:

"Extract the top 10 posts from https://news.ycombinator.com and summarise the themes." – The agent calls pluck_extract("https://news.ycombinator.com"), gets structured HN data, summarises.

"Audit https://example.com for tracker leaks and produce a signed report." – The agent calls pluck_snitch("https://example.com") and returns signed forensic findings.

"What's in this WAV file? Check for hidden touch-tones and anything above 18 kHz." – The agent calls pluck_dowse("./mystery.wav") first, or pluck_sense({ uri: "./mystery.wav", detect: ["dtmf", "ultrasonic"] }) directly.

"Dry-run a DELETE on https://api.example.com/users/42 – what would happen?" – The agent calls pluck_act({ uri, action: "delete", dryRun: true }) and shows the preview receipt.

"Actually delete it now – here's my signing key." – The agent calls pluck_act({ uri, action: "delete", dryRun: false }) and returns the real signed receipt.


Safety model

Pluck's MCP surface is designed around three assumptions about agent behaviour:

  1. Agents make mistakes. Default everything to dry-run. Force the agent (or the human in the loop) to opt in to real mutations.
  2. Agents must leave evidence. Every pluck_act call produces a SignedReceipt with a canonical, verifiable signature. Reviewers verify receipts offline with the public key alone.
  3. Agents shouldn't exceed your policy. The underlying createPluck({ policy: "./.pluckpolicy.yaml" }) config gates every action; pluck_act honours it. An agent attempting to delete a production URL against a deny rule gets POLICY_DENIED and cannot override.

The stdio transport trusts the host OS – Pluck's MCP server does not ship its own auth. Access control is whatever your agent client enforces. Configure Pluck credentials (API keys, SSH keys, signing keys) in the same environment where you launch the server.
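Receipt verification really does need only the public key. A sketch using Node's built-in Ed25519 support – the receipt fields and the JSON.stringify canonicalisation are assumptions for illustration, not Pluck's actual SignedReceipt layout:

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Stand-in for a Pluck signing key – in practice the key pair already exists.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// Hypothetical receipt body; only the Ed25519 signing itself is documented.
const receipt = { action: "delete", uri: "https://api.example.com/users/42", dryRun: false };
const payload = Buffer.from(JSON.stringify(receipt)); // canonical form assumed

// Ed25519 in node:crypto takes a null digest argument.
const signature = sign(null, payload, privateKey);

// A reviewer holding only the public key checks the receipt offline.
console.log(verify(null, payload, publicKey, signature)); // true
```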


The one-line pitch

A Claude Code user opens a new session. They type:

"Add Pluck to my MCP config."

Pluck joins the session. The agent now has:

  • Eyes – 30+ connectors for reading any URL / DB / file / API.
  • Hands – signed, reversible, policy-gated mutations with dry-run by default.
  • Ears + eyes – 37 sensors including ultrasonic, rPPG heart-rate from video, heartbeat/breathing from audio, DTMF, Morse, FSK, PSK, chromagram, MFCC, classical cipher classification + cracking, invisible-character steganography detection (including the Unicode tag block used in ASCII-smuggling prompt-injection), image forensics (ELA tampering, rolling-shutter deepfake signature, moiré screen-recording detection, AC-light flicker banding), and ML-backed CV (face detection + single-frame liveness heuristic, scene classification, pre-OCR text-region detection, thermal hotspots, satellite ground-anomaly change-detection, broader bioacoustic ID beyond birds). Plus live-streaming via createSensorStream for mic / SDR / SIP feeds.
  • Conscience – every act produces an Ed25519-signed receipt. pluck.undo(receipt) reverses it.

No other tool in the ecosystem offers that bundle through a single MCP server.


What's next

The pipeline concepts – five exposed over MCP, plus Shape, which runs in-process:

  • Concepts: Connect – URI → connector → typed bytes. 30 connectors ship.
  • Concepts: Navigate – prepare bytes between connect and extract. Readability, Playwright, agent-driven.
  • Concepts: Extract – bytes → text, segments, data.
  • Concepts: Shape – loose data → Zod contract. Runs in-process; no MCP tool.
  • Concepts: Act – signed receipts + undo + policy + idempotency, in depth.
  • Concepts: Sense – 37 sensors across audio / video / text / image / CV; three optional peers (sharp, face-api.js, @xenova/transformers).

Reference and recipes:


Full runnable example

The smallest no-agent MCP integration – spawns @sizls/pluck-mcp as a subprocess, issues the initialize handshake, asks for tools/list, and calls the pluck_extract tool against a live URL. Exactly the three frames Claude Desktop / Cursor / Continue send internally. Opens in a fresh StackBlitz sandbox.

