Introspection

Three pre-flight primitives for understanding a URI without running the full pipeline. Check, classify, reconnoitre.


pluck.probe(uri) – metadata-only

probe is HEAD-only. It opens the connection, resolves the connector, and returns what the pipeline would do without actually doing it:

TypeScript
const probe = await pluck.probe("https://api.example.com/items");
// {
//   uri: "https://api.example.com/items",
//   connector: "http",
//   contentType: "application/json",
//   size: 45678,
//   estimatedCost: 0.001,
//   recommendedFormat: "json",
//   recommendedMode: "direct",
// }

Use it to budget an agent before it acts, to route expensive plucks to background queues, or to preflight-check URLs in CI. Probe runs through the same connector registry as pluck() – if the URI doesn't match any connector, probe returns undefined.
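As a rough illustration of the budgeting and routing use cases above, the sketch below turns a probe result into a routing decision. The `ProbeResult` shape is copied from the example output and may not match Pluck's real type; `routeProbe`, the `Route` type, and the thresholds are hypothetical and not part of Pluck's API.

```typescript
// Shape copied from the example probe output; Pluck's actual type may differ.
interface ProbeResult {
  uri: string;
  connector: string;
  contentType: string;
  size: number;
  estimatedCost: number;
  recommendedFormat: string;
  recommendedMode: string;
}

type Route = "skip" | "background" | "direct";

// Hypothetical helper. remainingBudget is assumed to be in the same units as
// estimatedCost; the byte threshold is an illustrative default.
function routeProbe(
  probe: ProbeResult | undefined,
  remainingBudget: number,
  backgroundAboveBytes: number = 1_000_000,
): Route {
  // probe returns undefined when no connector matches the URI.
  if (probe === undefined) return "skip";
  // Refuse work the caller cannot afford.
  if (probe.estimatedCost > remainingBudget) return "skip";
  // Send large payloads to a background queue, small ones inline.
  return probe.size > backgroundAboveBytes ? "background" : "direct";
}
```

A caller would pass the awaited result of pluck.probe(uri) as the first argument and act on the returned route.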

CLI:

Shell
pluck probe https://api.example.com/items
pluck probe postgres://db/users --json

pluck.context(uri) – "where am I?"

context is probe plus the full inspection stack – schema.org, Open Graph, Twitter cards, robots.txt, sitemaps, the known extractors that would claim the source, and a PII likelihood estimate:

TypeScript
const ctx = await pluck.context("https://example.com/blog/my-post");
// {
//   uri: "...",
//   connector: "http",
//   contentType: "text/html",
//   schemaOrg: { "@type": "Article", "headline": "...", "author": "..." },
//   openGraph: { "og:title": "...", "og:image": "...", "og:description": "..." },
//   twitterCard: { "twitter:card": "summary_large_image", ... },
//   robots: { allowed: true, crawlDelay: 1 },
//   sitemaps: ["https://example.com/sitemap.xml"],
//   extractors: ["html"],                 // which extractors would claim this
//   piiLikelihood: "low" | "medium" | "high",
// }

context is what an agent calls first when it receives an unfamiliar URL. The result tells it what the page is, who wrote it, whether robots.txt disallows scraping, what data shapes it emits, and whether it is likely to contain PII.
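One way an agent might act on a context result is sketched below. The `ContextResult` interface lists only the fields this check needs, taken from the example output; `decideFetch`, `FetchDecision`, and the policy itself are hypothetical, and treating `crawlDelay` as seconds is an assumption.

```typescript
// Subset of the example context output; Pluck's actual type may differ.
interface ContextResult {
  uri: string;
  robots: { allowed: boolean; crawlDelay?: number };
  extractors: string[];
  piiLikelihood: "low" | "medium" | "high";
}

interface FetchDecision {
  proceed: boolean;
  delayMs: number;
  reason: string;
}

// Hypothetical policy: respect robots.txt, avoid likely-PII pages unless the
// caller opts in, and skip sources no extractor would claim.
function decideFetch(ctx: ContextResult, allowPii: boolean = false): FetchDecision {
  if (!ctx.robots.allowed) {
    return { proceed: false, delayMs: 0, reason: "robots.txt disallows this URL" };
  }
  if (ctx.piiLikelihood === "high" && !allowPii) {
    return { proceed: false, delayMs: 0, reason: "high PII likelihood" };
  }
  if (ctx.extractors.length === 0) {
    return { proceed: false, delayMs: 0, reason: "no extractor claims this source" };
  }
  // crawlDelay is assumed to be in seconds, following the common
  // robots.txt Crawl-delay convention.
  const delayMs = (ctx.robots.crawlDelay ?? 0) * 1000;
  return { proceed: true, delayMs, reason: "ok" };
}
```

Returning a reason alongside the boolean lets the agent log why it declined a URL instead of silently dropping it.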

CLI:

Shell
pluck context https://example.com/blog/my-post

pluck.dowse(uri) – signal reconnaissance

dowse is the Sense-phase equivalent of context. Point it at an audio / video / signal source; Pluck runs every sensor in fast mode and returns ranked findings:

TypeScript
const findings = await pluck.dowse("./mystery.wav");
// {
//   uri: "./mystery.wav",
//   topFinding: {
//     sensor: "dtmf",
//     confidence: 0.94,
//     summary: "5 DTMF tones detected at 0.1s, 0.3s, 0.5s, 0.8s, 1.1s: \"21234\"",
//   },
//   findings: [
//     { sensor: "dtmf", confidence: 0.94, summary: "..." },
//     { sensor: "ultrasonic", confidence: 0.61, summary: "carrier at 19.2kHz" },
//     { sensor: "anomaly", confidence: 0.40, summary: "burst at 42.1s-43.8s" },
//     // … 11 more
//   ],
// }

dowse is the "what's interesting in this file?" command: file(1) for signals. Use it as the first call on any unknown WAV / MP3 / MP4; it tells you which sensors are worth running next at full resolution.
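Choosing which sensors to run next can be as simple as filtering the ranked findings. The sketch below assumes the result shape shown in the example output; `sensorsToRunNext` and its default threshold and limit are hypothetical, not part of Pluck.

```typescript
// Shapes copied from the example dowse output; Pluck's actual types may differ.
interface Finding {
  sensor: string;
  confidence: number;
  summary: string;
}

interface DowseResult {
  uri: string;
  topFinding: Finding;
  findings: Finding[];
}

// Hypothetical helper: keep findings at or above a confidence threshold and
// return the sensor names, most confident first.
function sensorsToRunNext(
  result: DowseResult,
  minConfidence: number = 0.6,
  limit: number = 3,
): string[] {
  return [...result.findings] // copy so the caller's array is not reordered
    .filter((f) => f.confidence >= minConfidence)
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, limit)
    .map((f) => f.sensor);
}
```

With the three findings listed in the example output and the default threshold of 0.6, this selects "dtmf" and "ultrasonic" and drops "anomaly".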

CLI:

Shell
pluck dowse ./mystery.wav
pluck dowse ./mystery.wav --json

When to use each

If you want to... | Call
Check whether a URL is reachable + what type it is | pluck.probe
Get the full semantic context of a web page | pluck.context
Find out what's hiding in a signal source | pluck.dowse
Run the actual pipeline | pluck(uri)

All three are read-only and idempotent, so they are safe to call without worrying about side effects. They all honour the same SSRF guard, AbortSignal, and bounded-read limits as the main pipeline.

