Three pre-flight primitives for understanding a URI without running the full pipeline. Check, classify, reconnoitre.

`pluck.probe(uri)` – metadata-only

probe is HEAD-only. It opens the connection, resolves the connector, and returns what the pipeline would do without actually doing it:

```typescript
const probe = await pluck.probe("https://api.example.com/items");
// {
//   uri: "https://api.example.com/items",
//   connector: "http",
//   contentType: "application/json",
//   size: 45678,
//   estimatedCost: 0.001,
//   recommendedFormat: "json",
//   recommendedMode: "direct",
// }
```

Use it to budget an agent before it acts, to route expensive plucks to background queues, or to preflight-check URLs in CI. Probe runs through the same connector registry as pluck() – if the URI doesn't match any connector, probe returns undefined.
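The budgeting and routing uses above can be sketched as a small decision function. This is a minimal sketch, not part of Pluck: the `ProbeResult` interface only mirrors the fields in the sample output, and `routePluck`, `Route`, and the size threshold are hypothetical names and values chosen for illustration. It also assumes `size` is in bytes and that `estimatedCost` is in the same unit as the caller's budget; the source does not state either.

```typescript
// Shape assumed from the sample probe output; not an official Pluck type.
interface ProbeResult {
  uri: string;
  connector: string;
  contentType: string;
  size: number; // assumed to be bytes
  estimatedCost: number; // assumed to share a unit with the budget
  recommendedFormat: string;
  recommendedMode: string;
}

type Route = "skip" | "background" | "inline";

// Hypothetical helper: decide how to handle a URI from its probe result.
function routePluck(
  probe: ProbeResult | undefined,
  remainingBudget: number,
  backgroundSizeThreshold = 1_000_000, // illustrative value
): Route {
  // probe returns undefined when no connector matches the URI.
  if (probe === undefined) return "skip";
  // Refuse work the caller cannot afford.
  if (probe.estimatedCost > remainingBudget) return "skip";
  // Send large payloads to a background queue.
  if (probe.size > backgroundSizeThreshold) return "background";
  return "inline";
}

const probeSample: ProbeResult = {
  uri: "https://api.example.com/items",
  connector: "http",
  contentType: "application/json",
  size: 45678,
  estimatedCost: 0.001,
  recommendedFormat: "json",
  recommendedMode: "direct",
};

console.log(routePluck(probeSample, 0.01)); // inline
```

Keeping the decision in a pure function means the same rule can run in an agent loop or in a CI preflight step without touching the network again.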

CLI:

```shell
pluck probe https://api.example.com/items
pluck probe postgres://db/users --json
```

`pluck.context(uri)` – "where am I?"

context is probe plus the full inspection stack – schema.org, Open Graph, Twitter cards, robots.txt, sitemaps, the known extractors that would claim the source, and a PII likelihood estimate:

```typescript
const ctx = await pluck.context("https://example.com/blog/my-post");
// {
//   uri: "...",
//   connector: "http",
//   contentType: "text/html",
//   schemaOrg: { "@type": "Article", "headline": "...", "author": "..." },
//   openGraph: { "og:title": "...", "og:image": "...", "og:description": "..." },
//   twitterCard: { "twitter:card": "summary_large_image", ... },
//   robots: { allowed: true, crawlDelay: 1 },
//   sitemaps: ["https://example.com/sitemap.xml"],
//   extractors: ["html"],                 // which extractors would claim this
//   piiLikelihood: "low" | "medium" | "high",
// }
```

context is what an agent calls first when it receives an unfamiliar URL. The result tells it what the page is, who wrote it, whether robots.txt disallows scraping, what data shapes the page emits, and whether it is likely to contain PII.
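One way an agent might act on that result is a gate that collects reasons not to proceed. The following is a sketch under stated assumptions: `ContextResult` covers only the fields used here and is modelled on the sample output, while `decideFromContext` and `Decision` are hypothetical names, not Pluck APIs. Treating a "high" PII likelihood as a blocker is an illustrative policy choice.

```typescript
// Subset of the sample context output; not an official Pluck type.
interface ContextResult {
  uri: string;
  robots: { allowed: boolean; crawlDelay?: number };
  extractors: string[];
  piiLikelihood: "low" | "medium" | "high";
}

interface Decision {
  proceed: boolean;
  reasons: string[]; // why the agent should not proceed, if any
}

// Hypothetical gate: proceed only when nothing in the context objects.
function decideFromContext(ctx: ContextResult): Decision {
  const reasons: string[] = [];
  if (!ctx.robots.allowed) reasons.push("robots.txt disallows fetching");
  if (ctx.extractors.length === 0) reasons.push("no extractor would claim this source");
  if (ctx.piiLikelihood === "high") reasons.push("high PII likelihood");
  return { proceed: reasons.length === 0, reasons };
}

const ctxSample: ContextResult = {
  uri: "https://example.com/blog/my-post",
  robots: { allowed: true, crawlDelay: 1 },
  extractors: ["html"],
  piiLikelihood: "low",
};

console.log(decideFromContext(ctxSample).proceed); // true
```

Returning the reasons alongside the boolean lets the agent log or report why a URL was skipped.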

CLI:

```shell
pluck context https://example.com/blog/my-post
```

`pluck.dowse(uri)` – signal reconnaissance

dowse is the Sense-phase equivalent of context. Point it at an audio / video / signal source; Pluck runs every sensor in fast mode and returns ranked findings:

```typescript
const findings = await pluck.dowse("./mystery.wav");
// {
//   uri: "./mystery.wav",
//   topFinding: {
//     sensor: "dtmf",
//     confidence: 0.94,
//     summary: "5 DTMF tones detected at 0.1s, 0.3s, 0.5s, 0.8s, 1.1s: \"21234\"",
//   },
//   findings: [
//     { sensor: "dtmf", confidence: 0.94, summary: "..." },
//     { sensor: "ultrasonic", confidence: 0.61, summary: "carrier at 19.2kHz" },
//     { sensor: "anomaly", confidence: 0.40, summary: "burst at 42.1s-43.8s" },
//     // … 11 more
//   ],
// }
```

dowse is the "what's interesting in this file?" command: think of it as file(1) for signals. Use it as the first call on any unknown WAV / MP3 / MP4, and it will tell you which sensors are worth running next at full resolution.
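Choosing which sensors to run next can be sketched as a filter over the ranked findings. This is an illustrative sketch: the `Finding` and `DowseResult` interfaces are inferred from the sample output, and `sensorsToRunNext` and its 0.5 default threshold are hypothetical, not part of Pluck.

```typescript
// Shapes assumed from the sample dowse output; not official Pluck types.
interface Finding {
  sensor: string;
  confidence: number;
  summary: string;
}

interface DowseResult {
  uri: string;
  topFinding: Finding;
  findings: Finding[];
}

// Hypothetical helper: sensors worth a full-resolution run, best first.
function sensorsToRunNext(result: DowseResult, minConfidence = 0.5): string[] {
  return result.findings
    .filter((f) => f.confidence >= minConfidence) // filter copies, so the input is not mutated
    .sort((a, b) => b.confidence - a.confidence)
    .map((f) => f.sensor);
}

const dtmf: Finding = { sensor: "dtmf", confidence: 0.94, summary: "..." };
const dowseSample: DowseResult = {
  uri: "./mystery.wav",
  topFinding: dtmf,
  findings: [
    dtmf,
    { sensor: "ultrasonic", confidence: 0.61, summary: "carrier at 19.2kHz" },
    { sensor: "anomaly", confidence: 0.4, summary: "burst at 42.1s-43.8s" },
  ],
};

console.log(sensorsToRunNext(dowseSample).join(", ")); // dtmf, ultrasonic
```

Raising or lowering `minConfidence` trades follow-up compute against the risk of missing a weak signal.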

CLI:

```shell
pluck dowse ./mystery.wav
pluck dowse ./mystery.wav --json
```

When to use each

| If you want to... | Call |
| --- | --- |
| Check whether a URL is reachable + what type it is | `pluck.probe` |
| Get the full semantic context of a web page | `pluck.context` |
| Find out what's hiding in a signal source | `pluck.dowse` |
| Run the actual pipeline | `pluck(uri)` |

All three are read-only and idempotent, so they are safe to call without side effects. They all honour the same SSRF guard, AbortSignal, and bounded-read limits as the main pipeline.

What's next

Concepts: Connect – the registry that probe and context resolve against.
Concepts: Sense – the 22 sensors that dowse runs.
Recipe: Snitch Privacy – a full forensic composition using context + dowse internals.
