Getting Started

Install Pluck, run your first pipeline in under five minutes, and get pointers to every concept.


Install

Shell
npm install @sizls/pluck

Node 20 or 22 is required. Everything else (@sizls/pluck-cli, @sizls/pluck-mcp, @sizls/pluck-api) is optional – pick the surface you want.

Shell
# CLI
npm install -g @sizls/pluck-cli

# MCP server (agent integration)
npm install @sizls/pluck-mcp

# REST API server
npm install @sizls/pluck-api

Your first pluck

TypeScript
import { pluck } from "@sizls/pluck";

const result = await pluck("https://news.ycombinator.com");

console.log(result.output("markdown"));

That single call runs the full pipeline:

  1. Connect – match https://news.ycombinator.com to the HackerNews connector (registered before the generic HTTP catch-all).
  2. Navigate – default direct mode; bytes flow through unchanged.
  3. Extract – the HN extractor pulls top stories with title, score, comments, author.
  4. Output – .output("markdown") renders a clean markdown document.
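
Step 1 depends on registration order: a specific connector is checked before the generic catch-all. The sketch below illustrates that first-match-wins idea in isolation. The `Connector` type, the `registry` array, and `resolveConnector` are hypothetical names chosen for illustration and are not Pluck's internals.

```typescript
// Hypothetical first-match-wins registry; names are illustrative, not Pluck's API.
type Connector = {
  name: string;
  matches: (uri: URL) => boolean;
};

// Order matters: the specific connector is registered before the catch-all.
const registry: Connector[] = [
  { name: "hackernews", matches: (u) => u.hostname === "news.ycombinator.com" },
  { name: "http", matches: (u) => u.protocol === "http:" || u.protocol === "https:" },
];

function resolveConnector(uri: string): string {
  const parsed = new URL(uri);
  const hit = registry.find((c) => c.matches(parsed));
  if (!hit) throw new Error(`No connector for ${uri}`);
  return hit.name;
}

console.log(resolveConnector("https://news.ycombinator.com")); // hackernews
console.log(resolveConnector("https://example.com/page"));     // http
```

If the two entries were swapped, the catch-all would claim every HTTPS URL and the specific connector would never run.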

Try other formats:

TypeScript
result.output("json");    // full PluckResult as JSON
result.output("csv");     // tabular rows
result.output("text");    // plain text
result.preset("rag"); // chunks ready for a vector store

The CLI

Same result, one line of shell:

Shell
pluck https://news.ycombinator.com --format markdown
pluck https://news.ycombinator.com --format json | jq '.items[:5]'
pluck https://news.ycombinator.com --preset rag

pluck --help lists every subcommand. pluck <cmd> --help gives per-command flags.


Read anything

30 connectors ship. Swap the URI, keep the API:

TypeScript
await pluck("postgres://localhost/app?limit=10");   // database row stream
await pluck("reddit://r/typescript/hot");           // subreddit
await pluck("rss://blog.example.com/feed.xml");     // RSS feed
await pluck("s3://my-bucket/data/daily.csv");       // S3 object
await pluck("./earnings-call.mp3");                 // audio file (transcribed)
await pluck("./document.pdf");                      // PDF (parsed)
await pluck("kafka://broker/my-topic");             // Kafka stream (async iterable)

Private / authenticated surfaces land in v0.5 via pluck oauth login <service>; the 30-connector table marks which are public-only today. See Reference: Connectors for the full list.


Shape the output

Pluck's shape phase pins the result to a Zod schema – extract gives loose data, shape validates and narrows it:

TypeScript
import { pluck, spotifyTrack } from "@sizls/pluck";

const track = await pluck("https://open.spotify.com/track/3n3Ppam7vgaVa1iaRUc9Lp", {
  shape: { schema: spotifyTrack },
});

if (track.shape?.valid) {
  console.log(track.shape.data.title);  // typed as string | undefined
}

Six social shape templates ship out of the box (spotifyTrack, twitchClip, instagramPost, tiktokPost, vimeoVideo, twitterTweet). Bring your own Zod schema for everything else – see Concepts: Shape.
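
The difference between loose extract output and shaped data can be illustrated without any library. The sketch below hand-rolls roughly what a schema's parse step does. `Track`, `ShapeResult`, and `shapeTrack` are hypothetical names; the real shape phase uses Zod schemas rather than this hand-written guard.

```typescript
// Dependency-free stand-in for schema validation; Pluck's shape phase uses Zod instead.
type Track = { title: string; durationMs?: number };

type ShapeResult<T> =
  | { valid: true; data: T }
  | { valid: false; issues: string[] };

function shapeTrack(loose: unknown): ShapeResult<Track> {
  if (typeof loose !== "object" || loose === null) {
    return { valid: false, issues: ["expected an object"] };
  }
  const record = loose as Record<string, unknown>;
  const issues: string[] = [];
  if (typeof record.title !== "string") issues.push("title must be a string");
  if (record.durationMs !== undefined && typeof record.durationMs !== "number") {
    issues.push("durationMs must be a number when present");
  }
  if (issues.length > 0) return { valid: false, issues };
  // Only the validated fields survive; unknown extras are dropped.
  return {
    valid: true,
    data: {
      title: record.title as string,
      durationMs: record.durationMs as number | undefined,
    },
  };
}

const shaped = shapeTrack({ title: "Example", durationMs: 215000, extra: "ignored" });
if (shaped.valid) {
  console.log(shaped.data.title); // narrowed to string inside this branch
}
```

Checking the `valid` flag before touching `data` mirrors the `track.shape?.valid` guard in the example above.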


Act with signed receipts

Writing is as easy as reading, except every mutation produces an Ed25519-signed receipt:

TypeScript
import { createPluck, verifyChain } from "@sizls/pluck";

// createPluck wires a durable signingKey once; later act() calls inherit it.
const pluck = createPluck({
  signingKey: process.env.PLUCK_SIGNING_KEY,
});

const result = await pluck.act("https://api.example.com/todos", {
  action: "post",
  input: { title: "Buy milk" },
  dryRun: true,
});

console.log(result.signedReceipt?.signature);
console.log(result.signedReceipt?.signedBy);

const chain = verifyChain([result.signedReceipt!], {
  publicKeys: [process.env.PLUCK_PUBLIC_KEY!],
});
console.log(chain.summary);

Generating your keys

Generate a durable Ed25519 keypair once with the CLI and keep the private key in your secret manager:

Shell
pluck keys generate --name pluck --dir ./keys
# Writes ./keys/pluck.pem (private) and ./keys/pluck.pub.pem (public)

Commit pluck.pub.pem alongside your code so anyone can verify receipts offline. Load pluck.pem into PLUCK_SIGNING_KEY via your secret manager of choice.

Dry-run is the default when calling through MCP, so agents can't mutate state by surprise. See Concepts: Act for receipts, undo, policy, and idempotency.


Sense signals humans can't perceive

37 sensors (audio, video, text, image, and CV) plus live streaming. The audio, text, and video sensors have zero native dependencies; the image and CV sensors rely on three optional peer dependencies (sharp, face-api.js, @xenova/transformers):

TypeScript
const call = await pluck.sense("./call-recording.wav", {
  detect: ["dtmf", "ultrasonic", "anomaly"],
});

console.log(call.sensed?.features.dtmf);
// { digits: "212-555-0199", decodedAt: [0.1, 0.3, 0.5, ...] }

Pluck is the only JS/TS pipeline library with built-in DSP at this depth: FFT, spectrogram, DTMF, pitch, tempo, chromagram, MFCC, ultrasonic beacons, infrasonic, noise-floor, FSK / PSK, AM / FM / SSB demodulation, Morse, rPPG (remote photoplethysmography) heart-rate from video, heartbeat / breathing from audio, birdsong ID, periodicity, and anomaly detection. createSensorStream adds live mic / SDR / SIP audio. See Concepts: Sense.
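
To give a feel for what DTMF detection involves, here is a minimal sketch using the Goertzel algorithm, a standard way to measure the power of a few known frequencies. It is not necessarily how Pluck's "dtmf" sensor is implemented; `goertzelPower`, `detectDigit`, and `synthDigit` are hypothetical helper names, and the 8 kHz sample rate is an assumption.

```typescript
// Minimal DTMF detector using the Goertzel algorithm (illustrative only).
const SAMPLE_RATE = 8000; // assumed sample rate in Hz
const ROWS = [697, 770, 852, 941];     // standard DTMF row frequencies
const COLS = [1209, 1336, 1477, 1633]; // standard DTMF column frequencies
const KEYPAD = ["123A", "456B", "789C", "*0#D"];

// Power of one target frequency in the sample block.
function goertzelPower(samples: Float64Array, freq: number): number {
  const coeff = 2 * Math.cos((2 * Math.PI * freq) / SAMPLE_RATE);
  let prev = 0;
  let prev2 = 0;
  for (let i = 0; i < samples.length; i++) {
    const s = samples[i] + coeff * prev - prev2;
    prev2 = prev;
    prev = s;
  }
  return prev * prev + prev2 * prev2 - coeff * prev * prev2;
}

// Index of the candidate frequency with the most power.
function strongest(samples: Float64Array, freqs: number[]): number {
  let best = 0;
  let bestPower = -Infinity;
  freqs.forEach((f, i) => {
    const p = goertzelPower(samples, f);
    if (p > bestPower) {
      bestPower = p;
      best = i;
    }
  });
  return best;
}

function detectDigit(samples: Float64Array): string {
  return KEYPAD[strongest(samples, ROWS)][strongest(samples, COLS)];
}

// Synthesize a 50 ms tone pair so the example needs no audio file.
function synthDigit(digit: string, durationSec = 0.05): Float64Array {
  const row = KEYPAD.findIndex((r) => r.includes(digit));
  if (digit.length !== 1 || row < 0) throw new Error(`Not a DTMF key: ${digit}`);
  const col = KEYPAD[row].indexOf(digit);
  const out = new Float64Array(Math.round(SAMPLE_RATE * durationSec));
  for (let i = 0; i < out.length; i++) {
    const t = i / SAMPLE_RATE;
    out[i] = 0.5 * Math.sin(2 * Math.PI * ROWS[row] * t) + 0.5 * Math.sin(2 * Math.PI * COLS[col] * t);
  }
  return out;
}

console.log(detectDigit(synthDigit("5")));
```

This sketch always returns some key, even for silence. A production detector would also apply power thresholds and check the balance between the row and column tones before accepting a digit.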


Run at fleet scale

pluck.fleet({...}) coordinates N identities × M targets with proxy rotation, per-target rate limits, a reputation circuit breaker, and an Ed25519-signed audit chain:

TypeScript
const fleet = pluck.fleet({
  count: 100,
  proxies: loadedProxies,
  audit: { signingKey: () => process.env.PLUCK_SIGNING_KEY!, sink: "./audit.ndjson" },
});

const results = await fleet.broadcast(async (p, member) =>
  p("https://api.example.com/public-data", { identity: member.identity }),
);

await fleet.destroy();  // drains + flushes

Three workflow coordinators: fleet.broadcast (one task, every member), fleet.plan (per-member input → task), fleet.pipeline (staged execution with failure short-circuit). Everything flows through a pluggable Substrate so the in-process default can later be swapped for a Kite-backed backend. See Concepts: Fleet.
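
Per-target rate limiting is often built on token buckets, one bucket per host. The sketch below shows that technique in isolation under stated assumptions: `TokenBucket` and `PerTargetLimiter` are hypothetical classes, the capacity and refill numbers are arbitrary, and nothing here claims to match how pluck.fleet enforces its limits.

```typescript
// Hypothetical per-target limiter built on token buckets; illustrative only.
class TokenBucket {
  private tokens: number;
  private last: number;
  private readonly capacity: number;
  private readonly refillPerSec: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSec: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
    this.tokens = capacity;
    this.last = now();
  }

  // Returns true and spends one token if the request may proceed.
  tryTake(): boolean {
    const t = this.now();
    const refill = ((t - this.last) / 1000) * this.refillPerSec;
    this.tokens = Math.min(this.capacity, this.tokens + refill);
    this.last = t;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

class PerTargetLimiter {
  private readonly buckets = new Map<string, TokenBucket>();
  private readonly make: () => TokenBucket;

  constructor(make: () => TokenBucket) {
    this.make = make;
  }

  // One bucket per host, created lazily on first use.
  allow(url: string): boolean {
    const host = new URL(url).host;
    let bucket = this.buckets.get(host);
    if (!bucket) {
      bucket = this.make();
      this.buckets.set(host, bucket);
    }
    return bucket.tryTake();
  }
}

// A fake clock keeps the demo deterministic.
let clockMs = 0;
const limiter = new PerTargetLimiter(() => new TokenBucket(2, 1, () => clockMs));

const burst = [1, 2, 3].map(() => limiter.allow("https://api.example.com/public-data"));
console.log(burst); // [ true, true, false ]
clockMs += 1000;
console.log(limiter.allow("https://api.example.com/public-data")); // true after one token refills
```

Injecting the clock is a design choice for testability; in real use the default `Date.now()` clock applies.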


Run heterogeneous agents

pluck.runtime({...}) orchestrates N agents that each drive their own LLM and own tool surface. Pluck's verbs (connect / extract / shape / act / sense / probe / context / dowse) are auto-registered as typed tools per agent's manifest:

TypeScript
// Assumes openaiProvider, anthropicProvider, and apiKey are already in scope.
const runtime = pluck.runtime({
  agents: [
    { id: "researcher", systemPrompt: "summarise", provider: openaiProvider({ model: "gpt-5", apiKey }), tools: ["extract", "context"] },
    { id: "actor", systemPrompt: "post the summary", provider: anthropicProvider({ model: "claude-sonnet-4-6", apiKey }), tools: ["act"], budget: { maxCostUsd: 1.0 } },
  ],
  signingKey: process.env.PLUCK_SIGNING_KEY,
});

const result = await runtime.run({
  goal: "Read the URL and post a summary.",
  plan: { entry: "researcher", edges: [{ from: "researcher", to: "actor" }], exits: ["actor"] },
});

Each run gets per-agent budget caps (turns / tool calls / tokens / cost), a declarative handoff graph with when() predicates, and an Ed25519-signed trace per turn and tool call. The default tool surface is read-only; "act" requires explicit opt-in. See Concepts: Runtime.
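
The handoff graph can be pictured as a small state machine. The sketch below walks a plan shaped like the `plan` object above, with each agent reduced to a plain function. The `when` predicate signature, the `route` helper, and the turn cap are assumptions made for illustration, not the runtime's actual behavior.

```typescript
// Hypothetical walk over a handoff plan; the `when` signature is an assumption.
type Edge = { from: string; to: string; when?: (lastOutput: string) => boolean };
type Plan = { entry: string; edges: Edge[]; exits: string[] };

// Each agent is reduced to a function from input text to output text.
type AgentFn = (input: string) => string;

function route(plan: Plan, agents: Record<string, AgentFn>, goal: string, maxTurns = 10): string[] {
  const visited: string[] = [];
  let current = plan.entry;
  let payload = goal;
  for (let turn = 0; turn < maxTurns; turn++) {
    const agent = agents[current];
    if (!agent) throw new Error(`Unknown agent: ${current}`);
    payload = agent(payload);
    visited.push(current);
    if (plan.exits.indexOf(current) !== -1) return visited;
    // Follow the first outgoing edge whose predicate passes (or that has none).
    const output = payload;
    const from = current;
    const next = plan.edges.find((e) => e.from === from && (e.when ? e.when(output) : true));
    if (!next) throw new Error(`No handoff from ${current}`);
    current = next.to;
  }
  throw new Error("Turn budget exhausted");
}

const path = route(
  {
    entry: "researcher",
    edges: [{ from: "researcher", to: "actor", when: (out) => out.length > 0 }],
    exits: ["actor"],
  },
  { researcher: (g) => `summary of: ${g}`, actor: (s) => `posted: ${s}` },
  "Read the URL and post a summary.",
);
console.log(path.join(" -> ")); // researcher -> actor
```

The turn cap plays the role of a budget: a cyclic graph whose predicates never reach an exit fails loudly instead of looping forever.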


Wire into an agent (MCP)

Claude Desktop, Cursor, Claude Code, Continue – one config, nine tools:

JSON
{
  "mcpServers": {
    "pluck": {
      "command": "npx",
      "args": ["-y", "@sizls/pluck-mcp"]
    }
  }
}

Your agent now has eyes (30 connectors), hands (signed + reversible + policy-gated actions), and ears (37 sensors + live streaming). See MCP-First Pipeline for the full setup.


The pipeline in one picture

Connect    → Navigate → Extract    → Shape → Act    → Sense   → Output
30           7          5            1       9        37        12
connectors   modes      extractors   phase   actors   sensors   formats

Every verb is a phase. Every phase has a registry you can extend. Every extension point has a define<Phase>() typed helper. Every mutation produces a signed receipt. Every receipt chains to its upstream receipts (see SignedReceipt.parentSig).
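
Receipt chaining can be illustrated with Node's built-in crypto module. In the sketch below each receipt signs its own payload together with its parent's signature, so altering or reordering an earlier receipt invalidates the chain. The `Receipt` shape, `signReceipt`, and `verifyReceipts` are hypothetical; only the `signature` and `parentSig` field names echo this page, and Pluck's real receipt format and verifyChain behavior may differ.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical receipt shape; Pluck's real SignedReceipt has its own fields.
type Receipt = { action: string; parentSig: string | null; signature: string };

// The signed bytes cover both the action and the link to the parent.
function payloadOf(action: string, parentSig: string | null): Buffer {
  return Buffer.from(JSON.stringify({ action, parentSig }));
}

function signReceipt(action: string, parent: Receipt | null, privateKey: KeyObject): Receipt {
  const parentSig = parent ? parent.signature : null;
  // Ed25519 takes no separate digest algorithm, so the first argument is null.
  const signature = sign(null, payloadOf(action, parentSig), privateKey).toString("base64");
  return { action, parentSig, signature };
}

function verifyReceipts(chain: Receipt[], publicKey: KeyObject): boolean {
  return chain.every((r, i) => {
    const expectedParent = i === 0 ? null : chain[i - 1].signature;
    return (
      r.parentSig === expectedParent &&
      verify(null, payloadOf(r.action, r.parentSig), publicKey, Buffer.from(r.signature, "base64"))
    );
  });
}

const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const first = signReceipt("post /todos", null, privateKey);
const second = signReceipt("delete /todos/1", first, privateKey);

console.log(verifyReceipts([first, second], publicKey)); // true
// Tampering with an earlier receipt breaks verification.
console.log(verifyReceipts([{ ...first, action: "post /admin" }, second], publicKey)); // false
```

Verification needs only the public key, which is why a committed public key is enough for anyone to check a chain offline.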


Where to go next

Read a concept page to understand each phase:

  • Concepts: Connect – URIs, connectors, streaming, safety guarantees.
  • Concepts: Navigate – 7 navigation modes from direct passthrough to LLM-driven browsing.
  • Concepts: Extract – 5 extractors, 5 strategies (auto / css / regex / llm / hybrid).
  • Concepts: Shape – Zod validation with drift detection and per-field provenance.
  • Concepts: Act – signed receipts, undo, policy, idempotency.
  • Concepts: Sense – 37 sensors across audio / video / text / image / CV.
  • Concepts: Fleet – pluck.fleet({...}): N identities × M targets, proxy pool, signed audit chain, Substrate-backed.
  • Concepts: Runtime – pluck.runtime({...}): heterogeneous agents, per-agent provider, MCP-per-agent surface, signed trace, handoff graph.
  • Concepts: Output – 12 formats, 6 presets, Markdoc templates.

