The seventh phase of the Pluck pipeline: twelve built-in formats, six presets, and arbitrary template strings. Formats go through a single .output() method, presets and templates through the sibling .preset() and .template() methods, and all three work the same on every PluckResult, no matter which upstream phase produced it.

The mental model

Whichever upstream phases a run goes through – connect, navigate, extract, shape, act, sense – what you get back is a PluckResult. The result carries the raw fields (text, data, segments, receipt, sensed, etc.), but consumers rarely want raw data. They want markdown for their docs site, JSON for their database, SRT for their subtitles, embedding-ready chunks for their vector store.

The output phase is where that rendering lives. It is a single method on the result, with the same signature whatever the upstream source:

TypeScript

import { pluck } from "@sizls/pluck";

const result = await pluck("https://news.ycombinator.com");

result.output("markdown"); // "# Top | Hacker News\n\n…"
result.output("json");     // "{\n  \"items\": [ … ]\n}"
result.output("csv");      // "title,url,score,…"
result.output("text");     // plain text

Output is a phase, not a standalone verb. You run it by calling .output() on the returned PluckResult; there is nothing extra to import.

The 12 formats

Format	What you get	Typical use
`markdown`	Clean markdown – title, body, metadata.	Docs sites, GitHub issues, LLM input.
`json`	Full `PluckResult` as pretty-printed JSON.	Database upsert, inspection, logs.
`text`	Plain text, no markup.	Email, SMS, terminal.
`html`	HTML body with metadata table.	Email, dashboards.
`xml`	Structured XML.	Legacy integrations, RSS exporters.
`yaml`	YAML dump.	Config files, CI artifacts.
`csv`	Rows with header. Uses `result.data` when present, else best-effort columns from `segments` / `text`.	Spreadsheets, BI tools.
`sql`	`INSERT INTO ...` statements.	Database seeds.
`embeddings`	Text chunks ready to hand to an embedding model.	Vector stores, RAG.
`srt`	SubRip subtitle format from `result.segments`.	Video pipelines.
`vtt`	WebVTT subtitle format from `result.segments`.	Video pipelines.
`template`	Render via `result.template(string)` – Markdoc-style interpolation against the result.	Custom reports, newsletters.
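To make the `srt` row concrete, here is a minimal sketch of how timed segments could map onto SubRip cues. The Segment shape (start and end in seconds, plus text) and both function names are assumptions for illustration; Pluck's real segment type and built-in formatter may differ.

```typescript
// Hypothetical segment shape (times in seconds). Not Pluck's real type.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm (comma before the milliseconds).
function srtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Each cue is a 1-based index, a time range, and the text; cues are separated by a blank line.
function segmentsToSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) => `${i + 1}\n${srtTimestamp(seg.start)} --> ${srtTimestamp(seg.end)}\n${seg.text}`)
    .join("\n\n");
}
```

WebVTT follows the same cue layout but starts with a `WEBVTT` header line and uses a dot instead of a comma before the milliseconds, which is why both rows read from the same `result.segments` field.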

The exact union is exported as OutputFormat:

TypeScript

type OutputFormat =
  | "json" | "markdown" | "text"
  | "srt" | "vtt" | "csv"
  | "html" | "xml" | "yaml"
  | "sql" | "embeddings" | "template"
  | (string & {});   // escape hatch for custom formatters
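The `(string & {})` member is a common TypeScript idiom: the literal members stay visible to editor autocomplete, while any other string still type-checks. The sketch below shows one way a caller might tell built-in keys from custom ones at runtime. BUILT_IN_FORMATS, BuiltInFormat, OutputFormatSketch, and isBuiltInFormat are hypothetical names, not Pluck exports.

```typescript
// The twelve literals from the union above, as a runtime list (hypothetical helper).
const BUILT_IN_FORMATS = [
  "json", "markdown", "text", "srt", "vtt", "csv",
  "html", "xml", "yaml", "sql", "embeddings", "template",
] as const;

type BuiltInFormat = (typeof BUILT_IN_FORMATS)[number];

// Same shape as OutputFormat: literals for autocomplete,
// plus `string & {}` so custom format keys are still accepted.
type OutputFormatSketch = BuiltInFormat | (string & {});

// Type guard: narrows an arbitrary format key to the built-in literals.
function isBuiltInFormat(format: OutputFormatSketch): format is BuiltInFormat {
  return (BUILT_IN_FORMATS as readonly string[]).indexOf(format) !== -1;
}

const builtIn: OutputFormatSketch = "markdown";     // one of the literals
const custom: OutputFormatSketch = "slack-blocks";  // custom key, still type-checks
console.log(isBuiltInFormat(builtIn), isBuiltInFormat(custom)); // true false
```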

Every format is implemented as a Formatter – a pure function that takes a PluckResult and returns a string. Built-ins are registered in the formatter registry at instance creation; custom formats are one createPluck({ formatters: [defineOutput({ … })] }) call away (see Custom output formats below).
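As a rough illustration of the "pure function plus registry" idea, the sketch below keeps formatters in a Map and dispatches on the format key. ResultSketch, the two toy built-ins, and the error message are assumptions made for the example; they are not Pluck's actual internals.

```typescript
// Hypothetical, trimmed-down result shape; the real PluckResult has more fields.
interface ResultSketch {
  text: string;
  metadata: { title?: string };
}

// A formatter is a pure function: result in, string out.
type Formatter = (result: ResultSketch) => string;

// Registry keyed by format name, seeded with two toy built-ins.
const registry = new Map<string, Formatter>([
  ["text", (r) => r.text],
  ["json", (r) => JSON.stringify(r, null, 2)],
]);

// Dispatch: look the formatter up by key and apply it to the result.
function output(result: ResultSketch, format: string): string {
  const formatter = registry.get(format);
  if (!formatter) {
    throw new Error(`Unknown output format: ${format}`);
  }
  return formatter(result);
}
```

Because every formatter has the same signature, adding a format in this sketch is only a matter of adding a registry entry; the dispatch function never changes.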

Presets live on a sibling surface (result.preset(name)) and templates on a third (result.template(source)). The next section, "Presets and templates – same engine, different surface", explains how they relate.

Presets and templates – same engine, different surface

Presets and templates both render through the same Markdoc-style template engine. The only difference is discoverability:

Preset – a named template. The name belongs to a string-literal union (PresetName), so IDEs autocomplete it and the CLI lists it in --help. The template body lives in the Pluck source tree.
Template – an inline template string you pass at the call site. No registration, no name.

Under the hood, result.preset("blog") is literally result.template(BUILTIN_BLOG_TEMPLATE) – both paths go through createTemplateFormatter(source).render(result).

TypeScript

// Preset – discoverable, typed, reusable.
result.preset("blog");

// Equivalent template – inline, one-off, unnamed.
result.template(`
# {{ metadata.title }}

{{#each segments}}
- {{ text }}
{{/each}}
`);

The six built-in preset names are typed as PresetName:

TypeScript

type PresetName = "blog" | "notes" | "social" | "rag" | "dataset" | "report";

Rule of thumb: reach for a template when the shape is one-off; reach for a preset when the shape is going to be reused enough that it deserves a name. Presets live in the Pluck source tree today; the registration surface for user-defined presets is on the backlog.
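The "preset is a named template" relationship can be sketched as a name lookup in front of a shared render function. Everything except the PresetName union is hypothetical here: the placeholder template bodies, makeSurfaces, and toyRender only illustrate the shape, and the real preset templates are not reproduced.

```typescript
type PresetName = "blog" | "notes" | "social" | "rag" | "dataset" | "report";

// Any engine that renders a template source against a result.
type Render<R> = (source: string, result: R) => string;

// Placeholder bodies for illustration only; the real templates ship inside Pluck.
const PRESET_SOURCES: Record<PresetName, string> = {
  blog: "BLOG: {{ text }}",
  notes: "NOTES: {{ text }}",
  social: "SOCIAL: {{ text }}",
  rag: "RAG: {{ text }}",
  dataset: "DATASET: {{ text }}",
  report: "REPORT: {{ text }}",
};

// Both surfaces funnel into the same render function; a preset only adds a name lookup.
function makeSurfaces<R>(render: Render<R>) {
  return {
    template: (result: R, source: string): string => render(source, result),
    preset: (result: R, name: PresetName): string => render(PRESET_SOURCES[name], result),
  };
}

// Toy engine that substitutes a single {{ text }} placeholder.
const toyRender: Render<{ text: string }> = (source, result) =>
  source.replace("{{ text }}", result.text);

const surfaces = makeSurfaces(toyRender);
const viaPreset = surfaces.preset({ text: "hello" }, "blog");
const viaTemplate = surfaces.template({ text: "hello" }, PRESET_SOURCES.blog);
console.log(viaPreset === viaTemplate); // true
```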

Templates in detail

Templates are Markdoc-style strings with {{ interpolation }} and basic loops:

TypeScript

const result = await pluck("https://news.ycombinator.com");

result.template(`
# {{ metadata.title }}

{{#each segments}}
- [{{ text }}]({{ meta.url }}) – {{ meta.score }} points
{{/each}}

Pulled at {{ metadata.fetchedAt }}.
`);

The template sees the full PluckResult – access any field by name. Missing fields render as empty strings; undefined lookups don't throw.
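The lookup rules described above (dotted field access, loops, missing fields rendering as empty strings) can be approximated in a few lines. This is a simplified sketch, not Pluck's engine: lookup, interpolate, and renderTemplate are hypothetical names, loops cannot be nested, and fields inside a loop body resolve against the current item only.

```typescript
// Resolve a dotted path like "metadata.title"; returns undefined if any hop is missing.
function lookup(scope: unknown, path: string): unknown {
  let current: unknown = scope;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Replace {{ path }} placeholders; missing or null values render as "".
function interpolate(source: string, scope: unknown): string {
  return source.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (_match, path: string) => {
    const value = lookup(scope, path);
    return value === undefined || value === null ? "" : String(value);
  });
}

// Expand non-nested {{#each path}}...{{/each}} blocks first, then interpolate the rest.
function renderTemplate(source: string, result: unknown): string {
  const expanded = source.replace(
    /\{\{#each\s+([\w.]+)\s*\}\}([\s\S]*?)\{\{\/each\}\}/g,
    (_match, path: string, body: string) => {
      const items = lookup(result, path);
      if (!Array.isArray(items)) return ""; // a missing list renders nothing
      return items.map((item) => interpolate(body, item)).join("");
    },
  );
  return interpolate(expanded, result);
}
```

One known gap in this two-pass approach: if an item's text itself contains `{{ … }}`, the second pass will try to interpolate it. A production engine would parse the template into a tree instead of chaining string replacements.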

Custom output formats – `defineOutput`

The formatter registry is open. Register your own for any format your team cares about – a Slack Block Kit JSON blob, a ServiceNow payload, your company's internal XML flavour. The typed helper matches the rest of the pipeline's define* family (defineConnector, defineExtractor, defineActor, defineSensor):

TypeScript

import { defineOutput, createPluck } from "@sizls/pluck";

const slackBlocks = defineOutput({
  name: "slack-blocks",
  format: "slack-blocks",
  render(result) {
    return JSON.stringify({
      blocks: [
        { type: "header", text: { type: "plain_text", text: result.metadata.title } },
        { type: "section", text: { type: "mrkdwn", text: result.text.slice(0, 300) } },
      ],
    });
  },
});

const pluck = createPluck({ formatters: [slackBlocks] });
const result = await pluck("https://news.example.com/post");
result.output("slack-blocks"); // string

Custom formatters are prepended to the registry and win over built-ins with the same format key. The format field is the argument you pass to result.output(format) – keep it short, lowercase, kebab-case.
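The "prepended and wins" rule amounts to a first-match lookup over a list with custom entries at the front. The sketch below assumes a simplified entry shape and a single toy built-in; FormatterEntry, buildRegistry, and resolve are hypothetical names used only to show the precedence.

```typescript
// Simplified entry shape for the example.
interface FormatterEntry {
  format: string; // the key passed to output(format)
  render: (result: { text: string }) => string;
}

// One toy built-in, standing in for the real set.
const builtIns: FormatterEntry[] = [
  { format: "text", render: (r) => r.text },
];

// Custom entries go first, so a first-match lookup prefers them over built-ins.
function buildRegistry(custom: FormatterEntry[]): FormatterEntry[] {
  return [...custom, ...builtIns];
}

// Return the first entry whose key matches, or undefined if none does.
function resolve(entries: FormatterEntry[], format: string): FormatterEntry | undefined {
  return entries.find((entry) => entry.format === format);
}

// A custom formatter that reuses the built-in "text" key and therefore overrides it.
const shout: FormatterEntry = { format: "text", render: (r) => r.text.toUpperCase() };
console.log(resolve(buildRegistry([shout]), "text")?.render({ text: "hi" })); // HI
```

Reusing a built-in key is therefore a silent override, which is one more reason to pick a distinct kebab-case key unless the override is intentional.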

Output in the CLI

The CLI exposes the same surface through --format:

Shell

pluck https://news.ycombinator.com --format markdown
pluck https://api.example.com/items --format json | jq .
pluck ./interview.mp3 --format srt -o subs.srt
pluck run workflow.yaml --format preset:rag

Pipes work – every format is a string, and stdout is the default sink unless -o <path> is passed.
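The `preset:rag` value in the last command suggests that one flag carries both formats and presets. How the CLI actually parses it is not documented here, so the sketch below is only a guess at the convention: a `preset:` prefix selects a preset, and anything else is treated as a format key. FormatSelection and parseFormatFlag are hypothetical names.

```typescript
// Two possible interpretations of a --format value (assumed, not confirmed).
type FormatSelection =
  | { kind: "format"; name: string }
  | { kind: "preset"; name: string };

// Assumed convention: a "preset:" prefix selects a preset; anything else is a format key.
function parseFormatFlag(value: string): FormatSelection {
  const prefix = "preset:";
  if (value.startsWith(prefix)) {
    return { kind: "preset", name: value.slice(prefix.length) };
  }
  return { kind: "format", name: value };
}

console.log(parseFormatFlag("preset:rag").kind); // preset
console.log(parseFormatFlag("markdown").kind);   // format
```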

Why output is a separate phase

It isn't, strictly – rendering a format is pure (result in, string out), and the pipeline proper finishes at whichever upstream phase ran last (extract, shape, act, sense). Output is still a phase in the type system (PluckPhase includes "output") because treating it as one gives the system three properties:

Formatters are swappable. Registering a custom formatter via defineOutput is the same shape of API as defineConnector, defineExtractor, defineActor, defineSensor.
Errors are attributable. If a format throws ("this source has no segments, can't render VTT"), the pipeline error carries phase: "output" – so traces, monitors, and replay all know where the failure happened.
The .output() API is uniform. No matter which upstream phase ran, the user-facing method signature is the same. That's a real DX win when the pipeline has six other phases that each produce different result shapes.
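The error-attribution property can be pictured as a thin wrapper around each formatter. The sketch below is an assumption about the mechanism, not Pluck's code: PhaseError and withOutputPhase are hypothetical, and the toy vtt formatter exists only to trigger the failure quoted above.

```typescript
// Hypothetical error type; Pluck's real pipeline error may carry more fields.
class PhaseError extends Error {
  readonly phase: string;
  readonly original: unknown;

  constructor(phase: string, message: string, original: unknown) {
    super(message);
    this.name = "PhaseError";
    this.phase = phase;
    this.original = original;
  }
}

type Formatter<R> = (result: R) => string;

// Wrap a formatter so that anything it throws is re-raised tagged with phase "output".
function withOutputPhase<R>(format: string, formatter: Formatter<R>): Formatter<R> {
  return (result) => {
    try {
      return formatter(result);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      throw new PhaseError("output", `Format "${format}" failed: ${reason}`, err);
    }
  };
}

// Toy formatter: refuses to render when there are no segments.
const vtt = withOutputPhase("vtt", (result: { segments?: unknown[] }) => {
  if (!result.segments || result.segments.length === 0) {
    throw new Error("this source has no segments, can't render VTT");
  }
  return "WEBVTT\n";
});
```

With a wrapper like this, a trace or monitor only needs to read the `phase` field to know the failure came from rendering rather than from fetching or extracting.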

The seven phases at a glance

Output is one of seven phases. The table is compact on purpose – if you've ever asked "wait, don't navigate and act both drive Playwright?" or "isn't a preset just a template with a name?", this table answers it.

#	Phase	Input	Output	One-line role	What makes it distinct
1	Connect	URI	`ConnectResult` (raw bytes + metadata)	Pull raw bytes from any source.	Matches URI schemes. No interpretation – just "here are the bytes."
2	Navigate	`ConnectResult`	`NavigateResult` (cleaner bytes, same shape)	Prepare content so extract can read it.	Passive. Reads pages, dismisses modals, waits for SPA renders. No side effects on the source.
3	Extract	`NavigateResult`	`ExtractResult` (loose structured data)	Pull structured data out of content.	Five strategies: auto / css / regex / llm / hybrid. Outputs are "loose" – any shape.
4	Shape	`ExtractResult`	`PluckResult<T>` (typed data)	Pin loose data to a Zod schema.	Validates + narrows. Drift detection on schema mismatch.
5	Act	URI + action + input	`PluckResult` + signed receipt	Perform a mutation on the source.	Active. Every call signs a receipt, runs policy, honours idempotency. Side effects required.
6	Sense	`NavigateResult`	`PluckResult` + `sensed.features`	Extract signal features below human perception.	DSP. Runs against audio / video / text / image content, not DOM.
7	Output	`PluckResult`	`string`	Render the result into a consumer-ready format.	Pure. Same result goes in, different strings come out.

The "wait, isn't this the same as…?" checklist

Navigate vs. Act. Both have browser-backed modes that load URLs and click things. Navigate uses them to read a page (the SPA-rendering / interact / agent modes return bytes for extract). Act uses them to write (the browser and browser-agent actors produce signed receipts). If there's no mutation and no receipt, it's navigate. If the operation is one you'd want to undo, it's act. See the Navigate page for details.
Preset vs. Template. Presets are named templates. Both render through the same engine. Use a preset when the shape is reusable enough to earn a name ("blog", "rag"); use a template when it's one-off.
Output format vs. Preset. Formats (markdown, json, srt) are code-based renderers – they inspect result fields and emit a string according to a spec. Presets are template strings – they substitute {{ field }} placeholders into a pre-baked skeleton. Reach for a format when you need a standard (JSON, CSV, SRT). Reach for a preset when you need a shape of prose (blog post, RAG chunks, executive report).

What's next

Getting Started – install + first pluck + your first .output() call.
Reference: Connectors – which URIs feed the pipeline.
Reference: CLI – every --format and preset on the CLI.
