Studio

Time-travel devtools for every Pluck pipeline run. Record once, scrub through every phase, compare across runs, share a URL. Built in, no extra install.


The demo

Shell
pluck https://news.ycombinator.com
# → standard pipeline run, prints markdown to stdout.

pluck studio
# → opens http://localhost:4620 in your browser, showing the most recent trace
#   with every phase expandable, every event timestamped, every intermediate
#   result inspectable.

Studio is the closest thing Pluck has to a GUI. It turns every pipeline run into a replayable record – useful for debugging ("why did my extract return an empty string?"), demos ("look at this drift detection"), and regression ("commit this trace as a fixture, fail CI when it changes").


What gets captured

Every pluck() call produces a StudioTrace containing:

Field – what it holds
id – ULID identifying this run.
uri – The URI the pipeline ran against.
startedAt / completedAt – ISO timestamps.
durationMs – Total wall time.
phases – One entry per phase that fired. Every phase has its own events, timings, inputs, outputs.
result – The final PluckResult (or null on error).
error – The full PluckError, including phase, code, and cause.
metadata – Tool version, runtime, signing-key fingerprint, and the connector / extractor / actor / sensor that ran.

The shape is frozen – backward-compatible additions only. Traces recorded today replay tomorrow.
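
In TypeScript terms, the captured shape can be sketched like this – an illustrative interface whose field names follow the table above, not the authoritative types exported by @sizls/pluck:

```typescript
// Illustrative sketch of the trace shape described above; not the
// authoritative types from @sizls/pluck.
interface PhaseEntry {
  name: string;                              // e.g. "connect", "extract"
  startedAt: string;                         // ISO timestamp
  durationMs: number;
  events: { at: string; message: string }[]; // per-phase event log
  input?: unknown;
  output?: unknown;
}

interface StudioTrace {
  id: string;                                // ULID for this run
  uri: string;
  startedAt: string;
  completedAt: string;
  durationMs: number;
  phases: PhaseEntry[];
  result: unknown;                           // final PluckResult, null on error
  error: { phase: string; code: string; cause?: unknown } | null;
  metadata: Record<string, string>;          // tool version, runtime, ...
}

const trace: StudioTrace = {
  id: "01HQZ9EXAMPLE",
  uri: "https://example.com",
  startedAt: "2024-01-01T00:00:00.000Z",
  completedAt: "2024-01-01T00:00:01.000Z",
  durationMs: 1000,
  phases: [],
  result: null,
  error: { phase: "extract", code: "EMPTY_RESULT" },
  metadata: { version: "0.0.0" },
};
```

Because the shape only gains fields over time, code written against a sketch like this keeps working as traces evolve.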


Phases in Studio

Every phase that fires emits events. Studio groups them in a timeline:

  1. Connect – URI matched → http connector → DNS resolved → socket opened → 200 OK → 45KB read.
  2. Navigate – direct mode → passthrough in 0.1ms.
  3. Extract – html extractor → strategy: auto → 23 selectors ran → 6 fields populated → confidence 0.82.
  4. Shape – schema matched → 2 fields stripped → onDrift fired.
  5. Act – actor found → confirmation strategy auto → dry-run preview → signed receipt issued.
  6. Sense – 3 sensors ran → DTMF found 5 digits at 0.1s–0.6s → ultrasonic carrier at 19.2 kHz.
  7. Output – .output("markdown") ran the markdown formatter in 1.2ms.

Every event is clickable. Every intermediate ConnectResult, NavigateResult, ExtractResult, ShapeResult, SignedReceipt, SenseResult is in the trace and inspectable. Errors bubble up with phase attribution so you know exactly where the pipeline broke.
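
Phase attribution means a trace can answer "where did it break?" mechanically. A minimal sketch, assuming the trace layout from "What gets captured" (the `timeline` helper and the trimmed `Trace` type here are hypothetical, not part of Pluck):

```typescript
// Trimmed-down trace slice for illustration; real traces carry far more.
type Phase = { name: string; durationMs: number };
type Trace = {
  phases: Phase[];
  error: { phase: string; code: string } | null;
};

// Render a one-line timeline, flagging the phase the error was attributed to.
function timeline(trace: Trace): string {
  return trace.phases
    .map((p) => {
      const failed = trace.error?.phase === p.name ? " ✗" : "";
      return `${p.name} (${p.durationMs}ms)${failed}`;
    })
    .join(" → ");
}

const broken: Trace = {
  phases: [
    { name: "connect", durationMs: 120 },
    { name: "extract", durationMs: 480 },
  ],
  error: { phase: "extract", code: "EMPTY_RESULT" },
};
// timeline(broken) → "connect (120ms) → extract (480ms) ✗"
```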


Recording

Traces record automatically when Pluck is configured with a studioDir:

TypeScript
import { createPluck } from "@sizls/pluck";

const pluck = createPluck({
  studio: {
    enabled: true,
    dir: "./.pluck/traces",
  },
});

await pluck("https://example.com");
// Writes .pluck/traces/01HQZ9... .plucktrace.json

The CLI writes traces to ~/.pluck/traces/ by default – configurable via --studio-dir.


Replay

Deterministic replay rewinds the pipeline against the captured trace:

TypeScript
import { replay } from "@sizls/pluck";

const result = await replay("./trace-01HQZ9.plucktrace.json");
// Same result as the original run – no network, no side-effects.

The CLI equivalent:

Shell
pluck replay ./trace-01HQZ9.plucktrace.json
pluck replay ./trace-01HQZ9.plucktrace.json --format json
pluck replay ./trace-01HQZ9.plucktrace.json --json     # shortcut

Replay is bit-for-bit deterministic when the trace captured every side-effect. Connect phase reads come from the trace; act phase dry-runs use the recorded response. This is exactly what CI needs for contract-testing pipelines.
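
The CI check itself reduces to comparing the replayed result against the one stored in the fixture. A sketch of that comparison, assuming plain JSON-shaped results (`firstDiff` is a hypothetical helper, not part of Pluck):

```typescript
// Hypothetical fixture check: compare a replayed result against the
// recorded one and report the first diverging path, or null if equal.
function firstDiff(a: unknown, b: unknown, path = "$"): string | null {
  if (typeof a !== typeof b) return path;
  if (a !== null && b !== null && typeof a === "object" && typeof b === "object") {
    const keys = new Set([...Object.keys(a), ...Object.keys(b)]);
    for (const key of keys) {
      const diff = firstDiff((a as any)[key], (b as any)[key], `${path}.${key}`);
      if (diff) return diff;
    }
    return null;
  }
  return Object.is(a, b) ? null : path;
}

const recorded = { title: "HN front page", items: 30 };
const replayed = { title: "HN front page", items: 29 };
// firstDiff(recorded, replayed) → "$.items"
```

A non-null return fails the build and names the first diverging field, which is usually enough to spot the drift.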

Pair with pluck record <uri> to seed fixtures in bulk:

Shell
pluck record https://news.ycombinator.com -o ./fixtures/hn.plucktrace.json
# Run pipeline, write full trace to the fixture file.

# Then in CI:
pluck replay ./fixtures/hn.plucktrace.json

Flow recording

pluck record-flow goes one step further – it launches Playwright, lets you click through a workflow, and emits a runnable recipe:

Shell
pluck record-flow https://shop.example.com -o checkout.yaml
# ↑ Opens Playwright. You click through the checkout flow by hand.
# Pluck captures every selector + input and emits:
#
# name: shop-checkout
# match: https://shop.example.com/*
# steps:
#   - navigate: /cart
#   - click: "button[data-test=checkout]"
#   - fill: "input[name=email]" → "${email}"
#   - click: "button[type=submit]"
# ...

pluck run checkout.yaml --var email=test@example.com --dry-run
# Replay the flow, any time, any user.

This is the "zero-code authoring" demo – non-developers click through a flow, get back a versionable YAML recipe. The same recipe plays back deterministically forever.
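
Filling the `${email}` placeholder from `--var` flags is plain string interpolation over the recorded steps. A sketch, assuming the `${name}` syntax shown in the emitted YAML (`fillVars` is a hypothetical helper, not Pluck's API):

```typescript
// Hypothetical placeholder filler for recorded recipe steps.
// Unknown placeholders are left intact so a missing --var is easy to spot.
function fillVars(step: string, vars: Record<string, string>): string {
  return step.replace(/\$\{(\w+)\}/g, (match, name: string) => vars[name] ?? match);
}

// fillVars('fill: "input[name=email]" → "${email}"', { email: "test@example.com" })
// → 'fill: "input[name=email]" → "test@example.com"'
```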


Studio server

The pluck studio CLI boots a local HTTP server (default port 4620) that serves a self-contained HTML view of traces:

Shell
pluck studio                          # open the most recent trace
pluck studio ./trace.plucktrace.json  # open a specific file
pluck studio --list                   # list every trace in the studio dir
pluck studio --open                   # also shell out to xdg-open / open / start
pluck studio --studio-dir ./traces    # override the default trace directory

The HTML bundle is generated by buildStudioHtml() from @sizls/pluck – zero external assets, works offline, embeddable. The server is also exposed programmatically:

TypeScript
import { startStudioServer } from "@sizls/pluck";

const server = await startStudioServer({
  traceDir: "./.pluck/traces",
  port: 4620,
});
// ... server.close() when done

Shareable URLs

Studio traces can be shared one-click from the hosted dashboard. The POST /v1/traces/:id/share API returns a one-time URL that any viewer can open without an account. Perfect for "look at this weird edge case" messages in Slack.
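
A client call against that endpoint might look like the following. Only the method and path (POST /v1/traces/:id/share) come from this page; the base URL, bearer auth, and the `url` field in the response are assumptions:

```typescript
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical share client. Auth scheme and response shape are assumed.
async function shareTrace(
  baseUrl: string,
  traceId: string,
  token: string,
  fetchImpl: FetchLike = fetch,
): Promise<string> {
  const res = await fetchImpl(`${baseUrl}/v1/traces/${traceId}/share`, {
    method: "POST",
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`share failed: ${res.status}`);
  const body = (await res.json()) as { url: string };
  return body.url; // one-time viewer URL, no account needed
}
```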


Why it matters

Data pipelines are the opposite of observable. A scrape fails, an LLM extraction drifts, a receipt verification breaks – and the pipeline just… moves on. Studio flips that: every pipeline run is inspectable after the fact, and the same artifact is a regression fixture, a demo asset, and a customer-support tool.

No other data-pipeline library ships a time-travel debugger. Combined with signed receipts + drift detection + MCP-first integration, Studio is the "Cursor moment" for web data – the moment a developer sees the timeline scrubbing every phase, rewinding, forking, and re-running, and gets it.


Ready to build?

Install Pluck and follow the Quick Start guide to wire MCP-first data pipelines into your agents and fleets in minutes.

Get started →