Getting Started
Install Pluck, run your first pipeline in under five minutes, and get pointers to every concept.
Install
npm install @sizls/pluck
Node 20 or 22 is required. Everything else (@sizls/pluck-cli, @sizls/pluck-mcp, @sizls/pluck-api) is optional – pick the surface you want.
# CLI
npm install -g @sizls/pluck-cli
# MCP server (agent integration)
npm install @sizls/pluck-mcp
# REST API server
npm install @sizls/pluck-api
Your first pluck
import { pluck } from "@sizls/pluck";
const result = await pluck("https://news.ycombinator.com");
console.log(result.output("markdown"));
That single call runs the full pipeline:
- Connect – match https://news.ycombinator.com to the HackerNews connector (registered before the generic HTTP catch-all).
- Navigate – default direct mode; bytes flow through unchanged.
- Extract – the HN extractor pulls top stories with title, score, comments, author.
- Output – .output("markdown") renders a clean markdown document.
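The Connect step depends on registration order: a specific connector has to be checked before the generic HTTP catch-all. A minimal sketch of first-match resolution is below. The `Connector` interface, the `registry` array, and `resolveConnector` are hypothetical names for illustration, not Pluck's actual internals.

```typescript
// Hypothetical connector shape; Pluck's real connector interface may differ.
interface Connector {
  name: string;
  matches(uri: URL): boolean;
}

// Registration order is match priority: specific connectors first, catch-all last.
const registry: Connector[] = [
  { name: "hackernews", matches: (u) => u.hostname === "news.ycombinator.com" },
  { name: "http", matches: (u) => u.protocol === "http:" || u.protocol === "https:" },
];

// Return the name of the first connector that claims the URI.
function resolveConnector(uri: string): string {
  const parsed = new URL(uri);
  const hit = registry.find((c) => c.matches(parsed));
  if (!hit) throw new Error(`no connector for ${uri}`);
  return hit.name;
}

console.log(resolveConnector("https://news.ycombinator.com")); // hackernews
console.log(resolveConnector("https://example.com/page")); // http
```

If the two registry entries were swapped, the catch-all would claim every HTTP URL and the specific connector would never run, which is why the order matters.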
Try other formats:
result.output("json"); // full PluckResult as JSON
result.output("csv"); // tabular rows
result.output("text"); // plain text
result.preset("rag"); // chunks ready for a vector store
The CLI
Same result, one line of shell:
pluck https://news.ycombinator.com --format markdown
pluck https://news.ycombinator.com --format json | jq '.items[:5]'
pluck https://news.ycombinator.com --preset rag
pluck --help lists every subcommand. pluck <cmd> --help gives per-command flags.
Read anything
30 connectors ship. Swap the URI, keep the API:
await pluck("postgres://localhost/app?limit=10"); // database row stream
await pluck("reddit://r/typescript/hot"); // subreddit
await pluck("rss://blog.example.com/feed.xml"); // RSS feed
await pluck("s3://my-bucket/data/daily.csv"); // S3 object
await pluck("./earnings-call.mp3"); // audio file (transcribed)
await pluck("./document.pdf"); // PDF (parsed)
await pluck("kafka://broker/my-topic"); // Kafka stream (async iterable)
Private / authenticated surfaces land in v0.5 via pluck oauth login <service>; the 30-connector table marks which are public-only today. See Reference: Connectors for the full list.
Shape the output
Pluck's shape phase pins the result to a Zod schema – extract gives loose data, shape validates and narrows it:
import { pluck, spotifyTrack } from "@sizls/pluck";
const track = await pluck("https://open.spotify.com/track/3n3Ppam7vgaVa1iaRUc9Lp", {
  shape: { schema: spotifyTrack },
});
if (track.shape?.valid) {
  console.log(track.shape.data.title); // typed as string | undefined
}
Six social shape templates ship out of the box (spotifyTrack, twitchClip, instagramPost, tiktokPost, vimeoVideo, twitterTweet). Bring your own Zod schema for everything else – see Concepts: Shape.
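To make "validates and narrows" concrete, here is a dependency-free sketch of the idea. It uses a hand-written validator as a stand-in for a Zod schema. The `Track` fields, `ShapeResult`, and `shapeTrack` are assumptions for illustration and are not part of Pluck's API.

```typescript
// Hypothetical target type; the real spotifyTrack schema may have other fields.
interface Track {
  title: string;
  durationMs?: number;
}

type ShapeResult<T> =
  | { valid: true; data: T }
  | { valid: false; issues: string[] };

// Validate loose input and narrow it to Track, dropping undeclared keys.
function shapeTrack(loose: unknown): ShapeResult<Track> {
  if (typeof loose !== "object" || loose === null) {
    return { valid: false, issues: ["expected an object"] };
  }
  const { title, durationMs } = loose as Record<string, unknown>;
  const issues: string[] = [];
  if (typeof title !== "string") issues.push("title: expected string");
  if (durationMs !== undefined && typeof durationMs !== "number") {
    issues.push("durationMs: expected number");
  }
  if (typeof title !== "string" || issues.length > 0) {
    return { valid: false, issues };
  }
  const data: Track = { title };
  if (typeof durationMs === "number") data.durationMs = durationMs;
  return { valid: true, data };
}

const shaped = shapeTrack({ title: "Example", durationMs: 1000, extra: "dropped" });
if (shaped.valid) console.log(shaped.data.title); // Example
```

The discriminated union on `valid` is what lets the compiler expose `data` only after the check, which mirrors the `track.shape?.valid` guard in the example above.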
Act with signed receipts
Writing is as easy as reading, except every mutation produces an Ed25519-signed receipt:
import { createPluck, verifyChain } from "@sizls/pluck";
// createPluck wires a durable signingKey once; later act() calls inherit it.
const pluck = createPluck({
  signingKey: process.env.PLUCK_SIGNING_KEY,
});
const result = await pluck.act("https://api.example.com/todos", {
  action: "post",
  input: { title: "Buy milk" },
  dryRun: true,
});
console.log(result.signedReceipt?.signature);
console.log(result.signedReceipt?.signedBy);
const chain = verifyChain([result.signedReceipt!], {
  publicKeys: [process.env.PLUCK_PUBLIC_KEY!],
});
console.log(chain.summary);
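For readers new to Ed25519, the sketch below shows what signing and verifying a receipt involves, using only Node's built-in `node:crypto` module. The `ReceiptBody` shape and the JSON serialization are assumptions for illustration; Pluck's actual receipt format and `verifyChain` internals are not shown here.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Hypothetical receipt body; the fields of Pluck's SignedReceipt may differ.
interface ReceiptBody {
  action: string;
  target: string;
  dryRun: boolean;
}

const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const body: ReceiptBody = {
  action: "post",
  target: "https://api.example.com/todos",
  dryRun: true,
};

// Sign the serialized body. Ed25519 takes no separate digest, hence the null algorithm.
const payload = Buffer.from(JSON.stringify(body));
const signature = sign(null, payload, privateKey);

// Anyone holding the public key can check the receipt offline.
const ok = verify(null, payload, publicKey, signature);

// Changing any field invalidates the signature.
const tampered = Buffer.from(JSON.stringify({ ...body, dryRun: false }));
const tamperedOk = verify(null, tampered, publicKey, signature);

console.log(ok, tamperedOk); // true false
```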
Generating your keys
Generate a durable Ed25519 keypair once with the CLI and keep the private key in your secret manager:
pluck keys generate --name pluck --dir ./keys
# Writes ./keys/pluck.pem (private) and ./keys/pluck.pub.pem (public)
Commit pluck.pub.pem alongside your code so anyone can verify receipts offline. Load pluck.pem into PLUCK_SIGNING_KEY via your secret manager of choice.
Dry-run is the default when calling through MCP, so agents can't surprise-mutate state. See Concepts: Act for receipts + undo + policy + idempotency.
Sense signals humans can't perceive
37 sensors (audio, video, text, image, CV) plus live streaming. The audio, text, and video sensors have zero native dependencies; image and CV rely on three optional peers (sharp, face-api.js, @xenova/transformers):
const call = await pluck.sense("./call-recording.wav", {
  detect: ["dtmf", "ultrasonic", "anomaly"],
});
console.log(call.sensed?.features.dtmf);
// { digits: "212-555-0199", decodedAt: [0.1, 0.3, 0.5, ...] }
Pluck is the only JS/TS pipeline library with built-in DSP at this depth – FFT, spectrogram, DTMF, pitch, tempo, chromagram, MFCC, ultrasonic beacons, infrasonic, noise-floor, FSK / PSK, AM / FM / SSB demodulation, Morse, rPPG (remote photoplethysmography) heart-rate from video, heartbeat / breathing from audio, birdsong ID, periodicity, and anomaly detection. Plus createSensorStream for live mic / SDR / SIP audio. See Concepts: Sense.
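As an illustration of the kind of DSP involved, the sketch below decodes a single DTMF key with the Goertzel algorithm, a standard way to measure the energy at a few known frequencies without a full FFT. This is a textbook technique, not necessarily how Pluck's `dtmf` sensor is implemented. The helper names (`goertzel`, `decodeDigit`, `tone`) and the 8000 Hz / 205-sample block size are assumptions chosen for the example.

```typescript
// Standard DTMF grid: one row tone plus one column tone per key.
const ROWS = [697, 770, 852, 941];
const COLS = [1209, 1336, 1477, 1633];
const KEYS = ["123A", "456B", "789C", "*0#D"];

// Goertzel: energy of one target frequency in a block of samples.
function goertzel(samples: number[], freq: number, sampleRate: number): number {
  const coeff = 2 * Math.cos((2 * Math.PI * freq) / sampleRate);
  let prev = 0;
  let prev2 = 0;
  for (const x of samples) {
    const s = x + coeff * prev - prev2;
    prev2 = prev;
    prev = s;
  }
  return prev * prev + prev2 * prev2 - coeff * prev * prev2;
}

// Index of the candidate frequency carrying the most energy.
function strongest(samples: number[], freqs: number[], sampleRate: number): number {
  let best = 0;
  let bestEnergy = -Infinity;
  freqs.forEach((f, i) => {
    const energy = goertzel(samples, f, sampleRate);
    if (energy > bestEnergy) {
      bestEnergy = energy;
      best = i;
    }
  });
  return best;
}

// Pick the loudest row and column tone and look up the key (no silence detection).
function decodeDigit(samples: number[], sampleRate: number): string {
  const row = strongest(samples, ROWS, sampleRate);
  const col = strongest(samples, COLS, sampleRate);
  return KEYS[row].charAt(col);
}

// Synthesize a clean test tone for one key.
function tone(key: string, sampleRate = 8000, length = 205): number[] {
  const r = KEYS.findIndex((row) => row.includes(key));
  if (r < 0) throw new Error(`not a DTMF key: ${key}`);
  const c = KEYS[r].indexOf(key);
  return Array.from({ length }, (_, n) => {
    const t = n / sampleRate;
    return Math.sin(2 * Math.PI * ROWS[r] * t) + Math.sin(2 * Math.PI * COLS[c] * t);
  });
}

console.log(decodeDigit(tone("5"), 8000)); // 5
```

A production detector would also need energy thresholds, twist checks, and timing logic to separate keypresses from speech and silence; this sketch only covers the frequency-identification core.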
Run at fleet scale
pluck.fleet({...}) coordinates N identities × M targets with proxy rotation, per-target rate limits, a reputation circuit breaker, and an Ed25519-signed audit chain:
const fleet = pluck.fleet({
  count: 100,
  proxies: loadedProxies,
  audit: { signingKey: () => process.env.PLUCK_SIGNING_KEY!, sink: "./audit.ndjson" },
});
const results = await fleet.broadcast(async (p, member) =>
  p("https://api.example.com/public-data", { identity: member.identity }),
);
await fleet.destroy(); // drains + flushes
Three workflow coordinators: fleet.broadcast (one task, every member), fleet.plan (per-member input → task), fleet.pipeline (staged execution with failure short-circuit). Everything flows through a pluggable Substrate so the in-process default can later be swapped for a Kite-backed backend. See Concepts: Fleet.
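The circuit-breaker idea can be sketched in a few lines: stop sending work to a target once it has failed too many times in a row. The `CircuitBreaker` class below is a hypothetical, simplified illustration (consecutive-failure counting only) and does not reflect how Pluck scores reputation internally.

```typescript
// Hypothetical breaker: opens after `threshold` consecutive failures per target.
class CircuitBreaker {
  private failures = new Map<string, number>();
  private threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // True while the target is still below the failure threshold.
  allows(target: string): boolean {
    return (this.failures.get(target) ?? 0) < this.threshold;
  }

  // A success resets the count; a failure increments it.
  record(target: string, ok: boolean): void {
    const next = ok ? 0 : (this.failures.get(target) ?? 0) + 1;
    this.failures.set(target, next);
  }
}

const breaker = new CircuitBreaker(3);
const target = "api.example.com";
const outcomes = [false, false, false, true]; // simulated request results
const attempted: boolean[] = [];

for (const ok of outcomes) {
  if (!breaker.allows(target)) {
    attempted.push(false); // breaker is open: skip the request entirely
    continue;
  }
  attempted.push(true);
  breaker.record(target, ok);
}

console.log(attempted); // [ true, true, true, false ]
```

A real breaker would usually add a cool-down and a half-open probe so a recovered target can rejoin; that is omitted here to keep the sketch short.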
Run heterogeneous agents
pluck.runtime({...}) orchestrates N agents that each drive their own LLM and own tool surface. Pluck's verbs (connect / extract / shape / act / sense / probe / context / dowse) are auto-registered as typed tools per agent's manifest:
const runtime = pluck.runtime({
  agents: [
    { id: "researcher", systemPrompt: "summarise", provider: openaiProvider({ model: "gpt-5", apiKey }), tools: ["extract", "context"] },
    { id: "actor", systemPrompt: "post the summary", provider: anthropicProvider({ model: "claude-sonnet-4-6", apiKey }), tools: ["act"], budget: { maxCostUsd: 1.0 } },
  ],
  signingKey: process.env.PLUCK_SIGNING_KEY,
});
const result = await runtime.run({
  goal: "Read the URL and post a summary.",
  plan: { entry: "researcher", edges: [{ from: "researcher", to: "actor" }], exits: ["actor"] },
});
Per-agent budget caps (turns / tool calls / tokens / cost), declarative handoff graph with when() predicates, Ed25519-signed trace per turn + tool call. The default tool surface is read-only; "act" requires explicit opt-in. See Concepts: Runtime.
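To show how a declarative handoff graph with `when()` predicates might be walked, here is a small synchronous sketch. The `Edge`, `Plan`, and `runPlan` definitions are hypothetical and mirror only the `entry` / `edges` / `exits` fields shown above. Agents are plain functions rather than LLM calls, and a single global turn cap stands in for the per-agent budgets.

```typescript
// Hypothetical types; Pluck's plan and budget shapes may differ.
interface Edge {
  from: string;
  to: string;
  when?: (output: string) => boolean;
}
interface Plan {
  entry: string;
  edges: Edge[];
  exits: string[];
}
type Agent = (input: string) => string;

// Walk the graph from the entry agent until an exit agent has run.
function runPlan(
  plan: Plan,
  agents: Record<string, Agent>,
  goal: string,
  maxTurns: number,
): { trace: string[]; output: string } {
  const trace: string[] = [];
  let current = plan.entry;
  let output = goal;
  for (let turn = 0; turn < maxTurns; turn++) {
    const agent = agents[current];
    if (!agent) throw new Error(`unknown agent ${current}`);
    output = agent(output);
    trace.push(current);
    if (plan.exits.includes(current)) return { trace, output };
    // Take the first outgoing edge whose predicate passes (no predicate means always).
    const next = plan.edges.find((e) => e.from === current && (e.when?.(output) ?? true));
    if (!next) throw new Error(`no handoff from ${current}`);
    current = next.to;
  }
  throw new Error("turn budget exhausted");
}

const run = runPlan(
  {
    entry: "researcher",
    edges: [{ from: "researcher", to: "actor", when: (o) => o.length > 0 }],
    exits: ["actor"],
  },
  {
    researcher: (goal) => `summary of: ${goal}`,
    actor: (summary) => `posted: ${summary}`,
  },
  "Read the URL",
  4,
);

console.log(run.trace); // [ 'researcher', 'actor' ]
console.log(run.output); // posted: summary of: Read the URL
```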
Wire into an agent (MCP)
Claude Desktop, Cursor, Claude Code, Continue – one config, nine tools:
{
  "mcpServers": {
    "pluck": {
      "command": "npx",
      "args": ["-y", "@sizls/pluck-mcp"]
    }
  }
}
Your agent now has eyes (30 connectors), hands (signed + reversible + policy-gated actions), and ears (37 sensors + live streaming). See MCP-First Pipeline for the full setup.
The pipeline in one picture
Connect    → Navigate → Extract    → Shape → Act    → Sense   → Output
30           7          5            1       9        37        12
connectors   modes      extractors   phase   actors   sensors   formats
Every verb is a phase. Every phase has a registry you can extend. Every extension point has a define<Phase>() typed helper. Every mutation produces a signed receipt. Every receipt chains to its upstream receipts (see SignedReceipt.parentSig).
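The chaining idea can be illustrated with Node's built-in `node:crypto`: each receipt's signature covers both its own payload and its parent's signature, so reordering or removing a link is detectable. Only the `parentSig` field name comes from the text above. The `Receipt` shape, `issue`, and `verifyLinks` are hypothetical and are not Pluck's implementation.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical receipt; every field except parentSig is illustrative.
interface Receipt {
  payload: string;
  parentSig: string | null;
  signature: string;
}

// Bytes that get signed: the payload bound to the parent's signature.
function signedBytes(payload: string, parentSig: string | null): Buffer {
  return Buffer.from(JSON.stringify({ payload, parentSig }));
}

// Create a receipt that points at its parent (or null for the first link).
function issue(payload: string, parent: Receipt | null, key: KeyObject): Receipt {
  const parentSig = parent ? parent.signature : null;
  const signature = sign(null, signedBytes(payload, parentSig), key).toString("base64");
  return { payload, parentSig, signature };
}

// Check every signature and that each receipt points at the one before it.
function verifyLinks(chain: Receipt[], key: KeyObject): boolean {
  return chain.every((r, i) => {
    const expectedParent = i === 0 ? null : chain[i - 1].signature;
    return (
      r.parentSig === expectedParent &&
      verify(null, signedBytes(r.payload, r.parentSig), key, Buffer.from(r.signature, "base64"))
    );
  });
}

const keys = generateKeyPairSync("ed25519");
const first = issue("create todo", null, keys.privateKey);
const second = issue("undo create", first, keys.privateKey);

console.log(verifyLinks([first, second], keys.publicKey)); // true
console.log(verifyLinks([second, first], keys.publicKey)); // false
```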
Where to go next
Read a concept page to understand each phase:
- Concepts: Connect – URIs, connectors, streaming, safety guarantees.
- Concepts: Navigate – 7 navigation modes from direct passthrough to LLM-driven browsing.
- Concepts: Extract – 5 extractors, 5 strategies (auto / css / regex / llm / hybrid).
- Concepts: Shape – Zod validation with drift detection and per-field provenance.
- Concepts: Act – signed receipts, undo, policy, idempotency.
- Concepts: Sense – 37 sensors across audio / video / text / image / CV.
- Concepts: Fleet – pluck.fleet({...}) – N identities × M targets, proxy pool, signed audit chain, Substrate-backed.
- Concepts: Runtime – pluck.runtime({...}) – heterogeneous agents, per-agent provider, MCP-per-agent surface, signed trace, handoff graph.
- Concepts: Output – 12 formats, 6 presets, Markdoc templates.
Look up an API:
- Reference: Connectors – every URI scheme Pluck understands.
- Reference: CLI – 37 commands grouped by purpose.
- Reference: API – REST routes under /v1/.
Run a killer recipe:
- Recipe: Snitch Privacy – one-line signed forensic privacy audit.
- Recipe: DriftWatch Fleet – SSH fleet drift with Merkle-chained audit log.
- Recipe: Shape Spotify – typed Spotify ETL with drift detection.
Go deep on MCP:
- MCP-First Pipeline – every phase exposed as an MCP tool, one-line agent integration.
Help + community
- GitHub: sizls/pluck – issues, PRs, discussions.
- npm: @sizls/pluck, @sizls/pluck-cli, @sizls/pluck-mcp, @sizls/pluck-api.
- Why Pluck exists: /blog/why-pluck – the opinionated essay.
Welcome.