Introspection
Three pre-flight primitives for understanding a URI without running the full pipeline. Check, classify, reconnoitre.
pluck.probe(uri) – metadata-only
probe fetches metadata only (a HEAD request, in HTTP terms). It opens the connection, resolves the connector, and returns what the pipeline would do without actually doing it:
const probe = await pluck.probe("https://api.example.com/items");
// {
// uri: "https://api.example.com/items",
// connector: "http",
// contentType: "application/json",
// size: 45678,
// estimatedCost: 0.001,
// recommendedFormat: "json",
// recommendedMode: "direct",
// }
Use it to budget an agent before it acts, to route expensive plucks to background queues, or to preflight-check URLs in CI. probe runs through the same connector registry as pluck(), so if the URI doesn't match any connector, probe returns undefined.
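As a rough illustration of the budgeting and routing idea, here is a minimal sketch that takes a probe result and decides whether to skip the pluck, run it inline, or hand it to a background queue. The `ProbeResult` shape is copied from the example output above; `RoutePolicy`, `routePluck`, and the thresholds are hypothetical names and values, not part of Pluck. The sketch does not import Pluck, so it runs standalone against a fixture.

```typescript
// Shape of the probe result, copied from the example output above.
interface ProbeResult {
  uri: string;
  connector: string;
  contentType: string;
  size: number;
  estimatedCost: number;
  recommendedFormat: string;
  recommendedMode: string;
}

type Route = "skip" | "inline" | "background";

// Hypothetical policy knobs; tune these for your own workload.
interface RoutePolicy {
  remainingBudget: number; // assumed to be in the same unit as estimatedCost
  maxInlineBytes: number; // larger sources are sent to a queue
}

function routePluck(probe: ProbeResult | undefined, policy: RoutePolicy): Route {
  // probe returns undefined when no connector matches the URI.
  if (probe === undefined) return "skip";
  // Do not start work the agent cannot pay for.
  if (probe.estimatedCost > policy.remainingBudget) return "skip";
  return probe.size > policy.maxInlineBytes ? "background" : "inline";
}

// Fixture mirroring the example output; in real use this would come from pluck.probe(uri).
const sample: ProbeResult = {
  uri: "https://api.example.com/items",
  connector: "http",
  contentType: "application/json",
  size: 45678,
  estimatedCost: 0.001,
  recommendedFormat: "json",
  recommendedMode: "direct",
};
const policy: RoutePolicy = { remainingBudget: 0.05, maxInlineBytes: 1000000 };

console.log(routePluck(sample, policy)); // prints: inline
```

Keeping the decision in a pure function means the same policy can be reused in an agent loop and in a CI preflight check.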
CLI:
pluck probe https://api.example.com/items
pluck probe postgres://db/users --json
pluck.context(uri) – "where am I?"
context is probe plus the full inspection stack – schema.org, Open Graph, Twitter cards, robots.txt, sitemaps, the known extractors that would claim the source, and a PII likelihood estimate:
const ctx = await pluck.context("https://example.com/blog/my-post");
// {
// uri: "...",
// connector: "http",
// contentType: "text/html",
// schemaOrg: { "@type": "Article", "headline": "...", "author": "..." },
// openGraph: { "og:title": "...", "og:image": "...", "og:description": "..." },
// twitterCard: { "twitter:card": "summary_large_image", ... },
// robots: { allowed: true, crawlDelay: 1 },
// sitemaps: ["https://example.com/sitemap.xml"],
// extractors: ["html"], // which extractors would claim this
// piiLikelihood: "low", // one of "low" | "medium" | "high"
// }
context is what an agent calls first when it receives an unfamiliar URL. The result tells it what the page is, who wrote it, whether robots.txt disallows scraping, what data shapes the page emits, and whether it is likely to contain PII.
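One way an agent might act on that result is a small gate that runs before the real pluck. The sketch below is an assumption-laden illustration: `ContextResult` is a subset of the fields shown in the example output, `decideFetch` and `FetchDecision` are hypothetical names, and treating `crawlDelay` as seconds is a guess based on the robots.txt Crawl-delay convention. It does not call Pluck.

```typescript
// Subset of the context result, with field names taken from the example output above.
interface ContextResult {
  uri: string;
  robots: { allowed: boolean; crawlDelay?: number };
  piiLikelihood: "low" | "medium" | "high";
  extractors: string[];
}

// Hypothetical decision record for the calling agent.
interface FetchDecision {
  proceed: boolean;
  reason: string;
  delayMs: number;
}

function decideFetch(ctx: ContextResult): FetchDecision {
  if (!ctx.robots.allowed) {
    return { proceed: false, reason: "robots.txt disallows this path", delayMs: 0 };
  }
  if (ctx.piiLikelihood === "high") {
    return { proceed: false, reason: "high PII likelihood, needs review", delayMs: 0 };
  }
  if (ctx.extractors.length === 0) {
    return { proceed: false, reason: "no extractor would claim this source", delayMs: 0 };
  }
  // Assumption: crawlDelay is expressed in seconds.
  const delayMs = (ctx.robots.crawlDelay ?? 0) * 1000;
  return { proceed: true, reason: "ok", delayMs };
}

// Fixture mirroring the example output; in real use this would come from pluck.context(uri).
const ctx: ContextResult = {
  uri: "https://example.com/blog/my-post",
  robots: { allowed: true, crawlDelay: 1 },
  piiLikelihood: "low",
  extractors: ["html"],
};

const decision = decideFetch(ctx);
console.log(decision.proceed, decision.delayMs); // prints: true 1000
```

Whether "high" PII likelihood should block outright or only flag the source is a policy choice; the sketch blocks to stay on the conservative side.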
CLI:
pluck context https://example.com/blog/my-post
pluck.dowse(uri) – signal reconnaissance
dowse is the Sense-phase equivalent of context. Point it at an audio / video / signal source; Pluck runs every sensor in fast mode and returns ranked findings:
const findings = await pluck.dowse("./mystery.wav");
// {
// uri: "./mystery.wav",
// topFinding: {
// sensor: "dtmf",
// confidence: 0.94,
// summary: "5 DTMF tones detected at 0.1s, 0.3s, 0.5s, 0.8s, 1.1s: \"21234\"",
// },
// findings: [
// { sensor: "dtmf", confidence: 0.94, summary: "..." },
// { sensor: "ultrasonic", confidence: 0.61, summary: "carrier at 19.2kHz" },
// { sensor: "anomaly", confidence: 0.40, summary: "burst at 42.1s-43.8s" },
// // … 11 more
// ],
// }
dowse is the "what's interesting in this file?" command: file(1) for signals. Use it as the first call on any unknown WAV / MP3 / MP4; it tells you which sensors are worth running next at full resolution.
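Picking the follow-up sensors from a dowse result can be as simple as a confidence cut-off. The following sketch assumes the `findings` shape shown in the example output; `sensorsWorthRunning` and the 0.5 default threshold are hypothetical, and the fixture stands in for a real `pluck.dowse` call.

```typescript
// Finding shape taken from the example output above.
interface Finding {
  sensor: string;
  confidence: number;
  summary: string;
}

// Only the field this sketch needs from the dowse result.
interface DowseFindings {
  findings: Finding[];
}

// Returns sensor names at or above the threshold, most confident first.
function sensorsWorthRunning(result: DowseFindings, minConfidence = 0.5): string[] {
  return result.findings
    .filter((f) => f.confidence >= minConfidence) // filter copies, so the input is not reordered
    .sort((a, b) => b.confidence - a.confidence)
    .map((f) => f.sensor);
}

// Fixture mirroring the three findings listed in the example output.
const dowsed: DowseFindings = {
  findings: [
    { sensor: "anomaly", confidence: 0.4, summary: "burst at 42.1s-43.8s" },
    { sensor: "dtmf", confidence: 0.94, summary: "..." },
    { sensor: "ultrasonic", confidence: 0.61, summary: "carrier at 19.2kHz" },
  ],
};

console.log(JSON.stringify(sensorsWorthRunning(dowsed))); // prints: ["dtmf","ultrasonic"]
```

A lower threshold trades extra sensor runs for fewer missed signals, so the right value depends on how expensive the full-resolution pass is.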
CLI:
pluck dowse ./mystery.wav
pluck dowse ./mystery.wav --json
When to use each
| If you want to... | Call |
|---|---|
| Check whether a URL is reachable + what type it is | pluck.probe |
| Get the full semantic context of a web page | pluck.context |
| Find out what's hiding in a signal source | pluck.dowse |
| Run the actual pipeline | pluck(uri) |
All three are read-only, idempotent, and safe to call without worrying about side-effects. They all honour the same SSRF guard, AbortSignal, and bounded-read limits as the main pipeline.
What's next
- Concepts: Connect – the registry that probe and context resolve against.
- Concepts: Sense – the 22 sensors that dowse runs.
- Recipe: Snitch Privacy – a full forensic composition using context + dowse internals.