Spool
v3.1 · OpenTelemetry-native

See the slow query. Not the stack trace.

Spool attaches a span to every SQL statement your service runs — so a 1.4s endpoint is one click from the exact query, its plan, the lock it waited on, and the deploy that regressed it.

$ npm i @spool/node — auto-instruments pg, prisma, drizzle
spool › trace 7f3a91 · GET /orders/:id
span · dur
handler.getOrder · 1,412ms
└ auth.verifySession · 12ms
└ SELECT orders JOIN line_items · 1,188ms
└ Seq Scan on line_items (no index) · 1,041ms
└ redis.get sess:cache · 3ms
└ serialize.json · 8ms
root cause · missing index line_items(order_id) regressed in deploy #2,418 · 6d ago
Last 24h · prod-us-east
Queries traced · 48,206,114 · +8.2% vs Tue
Slow queries (>200ms) · 1,907 · +14.0% vs Tue
P95 DB time · 38.4 ms · −4.1ms vs Tue
Cost / 1M spans · $0.0061 · −$0.0004 vs Tue
DB time / min peak 41.2s
5 regions · 1.2s ingest lag 99.98% SLA
Tracing production databases at
Vantage FORECAST GLIDE Mercury Brightwave nimbus. Coast/ Halcyon Strata
Capabilities

Everything your DBA checks by hand, attached to the request.

Auto-instrumentation, the bound plan, lock waits, and N+1 detection — wired to the same trace context as your HTTP spans. One install, no agent, no sampling guesswork.

Bound plan capture

The EXPLAIN you never have to re-run.

Spool captures the planner output with the live bind parameters, so the plan you read is the plan that ran — slow rows highlighted, scan types flagged.

EXPLAIN ANALYZE · trace 7f3a91 1,188ms · 1 row
-- bound params: order_id = 90412
SELECT o.*, li.sku, li.qty
FROM orders o
JOIN line_items li ON li.order_id = o.id
WHERE o.id = $1;

Nested Loop  (rows=1 width=212)
  -> Index Scan on orders  (0.04ms)
  -> Seq Scan on line_items  (1041ms)
       Filter: order_id = $1 · 4.2M rows scanned
Suggested: CREATE INDEX ON line_items(order_id)
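As a rough illustration of how scan types could be flagged, the sketch below walks a plan tree in the shape Postgres returns from EXPLAIN (FORMAT JSON) and collects slow sequential scans. The `findSeqScans` helper and the 100ms threshold are assumptions for illustration, not Spool's implementation.

```typescript
// Subset of the node shape produced by Postgres EXPLAIN (ANALYZE, FORMAT JSON).
interface PlanNode {
  "Node Type": string;
  "Relation Name"?: string;
  "Actual Total Time"?: number; // milliseconds
  Plans?: PlanNode[];
}

// Hypothetical helper: collect every Seq Scan node at or above minMs.
function findSeqScans(root: PlanNode, minMs: number = 100): PlanNode[] {
  const hits: PlanNode[] = [];
  const visit = (node: PlanNode): void => {
    const elapsed = node["Actual Total Time"] ?? 0;
    if (node["Node Type"] === "Seq Scan" && elapsed >= minMs) {
      hits.push(node);
    }
    for (const child of node.Plans ?? []) {
      visit(child);
    }
  };
  visit(root);
  return hits;
}

// Plan mirroring the trace shown above.
const examplePlan: PlanNode = {
  "Node Type": "Nested Loop",
  Plans: [
    { "Node Type": "Index Scan", "Relation Name": "orders", "Actual Total Time": 0.04 },
    { "Node Type": "Seq Scan", "Relation Name": "line_items", "Actual Total Time": 1041 },
  ],
};

for (const scan of findSeqScans(examplePlan)) {
  console.log(`Seq Scan on ${scan["Relation Name"]} took ${scan["Actual Total Time"]}ms`);
}
```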
N+1 detection

Find the 240-query loop.

Spool collapses repeated statements within a trace and tells you which line of code fanned them out.

N+1 SELECT * FROM users WHERE id = $1 ×240
DUP SELECT plan FROM teams WHERE id = $1 ×18
OK orderRepo.findById:42 → batch loader ×1
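One way to picture the collapsing step: group a trace's statements by normalized text and call site, then label each group by repeat count. This is a minimal sketch; the `StatementSpan` shape and both thresholds are assumptions, not Spool's span schema or detection rules.

```typescript
// Hypothetical shape of one traced statement.
interface StatementSpan {
  fingerprint: string; // normalized SQL text
  callSite: string;    // file or function plus line that issued the statement
  durationMs: number;
}

interface RepeatGroup {
  fingerprint: string;
  callSite: string;
  count: number;
  totalMs: number;
  label: "N+1" | "DUP" | "OK";
}

// Illustrative thresholds only.
const N_PLUS_ONE_MIN = 50;
const DUP_MIN = 2;

function collapseRepeats(spans: StatementSpan[]): RepeatGroup[] {
  const groups = new Map<string, RepeatGroup>();
  for (const span of spans) {
    const key = `${span.fingerprint} @ ${span.callSite}`;
    let group = groups.get(key);
    if (!group) {
      group = { fingerprint: span.fingerprint, callSite: span.callSite, count: 0, totalMs: 0, label: "OK" };
      groups.set(key, group);
    }
    group.count += 1;
    group.totalMs += span.durationMs;
  }
  const result = Array.from(groups.values());
  for (const group of result) {
    if (group.count >= N_PLUS_ONE_MIN) group.label = "N+1";
    else if (group.count >= DUP_MIN) group.label = "DUP";
  }
  // Largest fan-out first.
  return result.sort((a, b) => b.count - a.count);
}
```

Grouping on the call site as well as the statement is what lets a report point at the loop that issued the queries rather than only at the query text.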
Lock waits

See who held the row.

Every block records the blocking PID, the lock mode, and the statement that held it — no pg_locks archaeology.

⏳ blocked 842ms — ShareLock on tuple
holder pid 4471 UPDATE inventory SET qty…
Install

Three lines.

// auto-patches pg + prisma
import { spool } from "@spool/node";
spool.init({ service: "api" });
Drivers

Already speaks your stack.

node-postgres · auto · v8+
Prisma · auto · v5+
Drizzle · auto · 0.30+
SQLAlchemy · py · beta
Regression alerts

Page on the deploy. Skip the noise.

Spool diffs each query's P95 across deploys and routes a regression to the exact commit — high-severity to PagerDuty, the rest to Slack.
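A deploy diff of this kind can be sketched as: compute each query's P95 before and after a deploy, compare the two, and pick a destination by how large the jump is. The nearest-rank percentile, the ratio thresholds, and all names below are assumptions for illustration, not Spool's alerting rules.

```typescript
type Route = "pagerduty" | "slack" | "none";

interface RegressionReport {
  shape: string;
  commit: string;
  baselineP95: number;
  currentP95: number;
  route: Route;
}

// Nearest-rank P95 over a non-empty list of durations in milliseconds.
function p95(samples: number[]): number {
  if (samples.length === 0) throw new Error("p95 needs at least one sample");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100);
  return sorted[rank - 1];
}

// Assumed severity thresholds: 2x pages, 1.2x notifies.
const PAGE_RATIO = 2.0;
const NOTIFY_RATIO = 1.2;

function diffDeploy(shape: string, commit: string, baseline: number[], current: number[]): RegressionReport {
  const baselineP95 = p95(baseline);
  const currentP95 = p95(current);
  const ratio = baselineP95 > 0 ? currentP95 / baselineP95 : Infinity;
  let route: Route = "none";
  if (ratio >= PAGE_RATIO) route = "pagerduty";
  else if (ratio >= NOTIFY_RATIO) route = "slack";
  return { shape, commit, baselineP95, currentP95, route };
}
```

Comparing by ratio rather than absolute delta keeps a 1ms query that doubles from being ignored, at the cost of paging on small absolute changes; a real rule would likely combine both.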

PagerDuty Slack GitHub Checks Linear Webhook
last 7d · prod
Total DB time · 14.2 h · ↘ −2.4h / 7d
Slowest P95 · 1,188 ms · ↗ +402ms / 7d
Distinct queries · 3,914 · 22 services
SELECT orders JOIN line_items · 1,188ms · 41%
UPDATE inventory SET qty · 214ms · 19%
SELECT users WHERE id = $1 · 0.9ms · 12%
INSERT events VALUES … · 1.4ms · 9%
Chart: P95 of top query / day (baseline vs. current)
Query Explorer

Rank every query by the time it actually costs you.

Spool groups statements by their normalized shape, then sorts by total DB time, not call count. The query firing 4,000 times an hour at 1.2s (4,800s of DB time) outranks the one firing a million times at 0.4ms (400s), because it's the one burning your latency budget.
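The ranking itself is simple arithmetic: total DB time is call count times mean duration. A minimal sketch follows, with made-up query names and numbers that are illustrative only, not measurements.

```typescript
interface QueryStats {
  shape: string;
  callsPerHour: number;
  meanMs: number;
}

interface RankedQuery extends QueryStats {
  totalMsPerHour: number;
}

// Sort by total DB time (calls x mean duration), largest first.
function rankByTotalDbTime(stats: QueryStats[]): RankedQuery[] {
  return stats
    .map((q) => ({ ...q, totalMsPerHour: q.callsPerHour * q.meanMs }))
    .sort((a, b) => b.totalMsPerHour - a.totalMsPerHour);
}

// Hypothetical workload.
const workload: QueryStats[] = [
  { shape: "session lookup", callsPerHour: 1_000_000, meanMs: 0.4 },
  { shape: "order detail join", callsPerHour: 5_000, meanMs: 1_200 },
  { shape: "inventory update", callsPerHour: 20_000, meanMs: 15 },
];

for (const q of rankByTotalDbTime(workload)) {
  console.log(`${q.shape}: ${(q.totalMsPerHour / 1000).toFixed(0)}s of DB time per hour`);
}
```

Sorting the same list by `callsPerHour` would put the session lookup first, which is the ordering the total-time view is meant to avoid.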

  • Normalized fingerprinting: placeholders like $1 and $2 are folded and literals stripped, so one query shape is one row.
  • Drill from a query to the 50 slowest traces that ran it, sampled tail-first.
  • Compare two deploys side-by-side — every query's P50/P95/P99 delta.
  • Native OpenTelemetry — your Grafana and Datadog taps keep working.
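To make the fingerprinting bullet concrete, here is a deliberately naive sketch that folds placeholders and literals into `?` with regular expressions. The `fingerprint` function is hypothetical; a production normalizer would use a real SQL parser, and this one ignores comments, dollar-quoted strings, and quoted identifiers.

```typescript
// Naive normalizer: an illustration of the idea, not a SQL parser.
function fingerprint(sql: string): string {
  return sql
    .replace(/'(?:[^']|'')*'/g, "?")       // single-quoted string literals, including '' escapes
    .replace(/\$\d+/g, "?")                // positional placeholders such as $1, $2
    .replace(/\b\d+(?:\.\d+)?\b/g, "?")    // bare numeric literals
    .replace(/\s+/g, " ")                  // collapse whitespace and newlines
    .trim();
}

console.log(fingerprint("SELECT * FROM users WHERE id = $1"));
console.log(fingerprint("SELECT * FROM users WHERE id = 42"));
```

Both calls produce the same string, so the two statements would land in the same row.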
Read the Query Explorer guide
Pricing

Pay per span. Not per host.

A span is one instrumented statement. Bundles roll over for 90 days. No agent license, no per-seat tax on the engineers reading the traces.

Hobby

free forever
$0 / month

For a side project or one service in staging.

  • 1,000,000 spans / mo
  • 3-day trace retention
  • Bound plan capture + N+1
  • Community Discord
Start free
Most popular

Team

incl. deploy diffing
$49 / month · billed annually

For teams shipping a Postgres-backed product to real traffic.

  • 250,000,000 spans / mo
  • 30-day retention + deploy diffing
  • Lock-wait + regression alerts
  • Slack + PagerDuty + GitHub Checks
  • SLA: P95 ingest < 2s
Start 14-day trial

Scale

BYOC
$299 / month base · usage on top

For regulated workloads and 10B+ spans a month.

  • Bring-your-own-cloud ingest
  • SOC 2 Type II + HIPAA + SCIM
  • Query-text redaction at the agent
  • Solution architect + 1h SLA
Talk to sales
Customers

Engineers who got tired of guessing which query.

"Spool found a Seq Scan that had been hiding behind a 200 OK for fourteen months. Adding one index cut our P95 checkout from 1.4s to 90ms."
Priya Anand
Staff Engineer · Vantage
"The deploy diffing is the feature I didn't know I needed. Every PR now shows me which query its migration made 40% slower before it merges."
Kenji Mori
Platform Lead · Forecast
"We dropped two APM agents and our log bill in half. Spool's per-span price means we finally trace 100% of traffic instead of sampling 5%."
Ines Calderón
Head of Eng · Brightwave
FAQ

Questions a senior engineer asks.

Everything else is in the docs. The SDK is open source under Apache-2.0 — read exactly what it patches before you install it.

What's the per-query overhead of auto-instrumentation?

We wrap the driver's submit path, not the protocol. Measured overhead is P50 28µs, P99 71µs per statement — captured plans are fetched out-of-band from a sampled tail, never on your request path.

How is a "span" defined for billing?

One executed statement = one span, regardless of rows returned. A transaction with 6 statements is 6 spans. We bill in $0.0061-per-1M increments and bundles roll over month-to-month for 90 days.
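The span arithmetic can be sketched as below. The $0.0061 rate and the one-statement-one-span rule come from the answer above; rounding usage up to whole 1M increments is an assumption about what "increments" means, and the function names are hypothetical.

```typescript
const USD_PER_MILLION_SPANS = 0.0061; // rate quoted in the answer above
const SPANS_PER_INCREMENT = 1_000_000;

// One executed statement is one span, so a transaction counts once per statement.
function spansForTransactions(transactions: number, statementsPerTransaction: number): number {
  return transactions * statementsPerTransaction;
}

// Assumption: usage is rounded up to the next whole 1M-span increment.
function usageCostUsd(spans: number): number {
  const increments = Math.ceil(spans / SPANS_PER_INCREMENT);
  return increments * USD_PER_MILLION_SPANS;
}

// 1,000 transactions of 6 statements each is 6,000 spans.
const exampleSpans = spansForTransactions(1_000, 6);
console.log(exampleSpans, usageCostUsd(exampleSpans).toFixed(4));
```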

Can I redact query text and parameters before they leave my VPC?

Yes. On Scale, the collector normalizes statements and strips literals before export, so only the fingerprint and timing leave your network. Bind parameters are hashed locally and never transmitted.

How do you capture EXPLAIN without doubling my query load?

We only run EXPLAIN (ANALYZE, BUFFERS) on tail-sampled slow statements — by default the slowest 0.5% over 200ms, capped at 4 per query shape per minute. Never on the hot path.
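The per-shape cap is the easiest part of that policy to illustrate. The sketch below applies the 200ms threshold and the limit of 4 per query shape per minute using a fixed one-minute window; it leaves out the "slowest 0.5%" selection, and the `ExplainSampler` class is a hypothetical stand-in, not Spool's sampler.

```typescript
interface WindowState {
  windowStartMs: number;
  count: number;
}

class ExplainSampler {
  private readonly slowMs: number;
  private readonly capPerMinute: number;
  private readonly windows: Map<string, WindowState> = new Map();

  constructor(slowMs: number = 200, capPerMinute: number = 4) {
    this.slowMs = slowMs;
    this.capPerMinute = capPerMinute;
  }

  // Returns true if this statement is eligible for an out-of-band EXPLAIN.
  shouldExplain(shape: string, durationMs: number, nowMs: number): boolean {
    if (durationMs <= this.slowMs) return false; // only statements over the threshold

    const state = this.windows.get(shape);
    if (!state || nowMs - state.windowStartMs >= 60_000) {
      // First slow hit for this shape, or the previous window has expired.
      this.windows.set(shape, { windowStartMs: nowMs, count: 1 });
      return true;
    }
    if (state.count >= this.capPerMinute) return false; // cap reached for this minute
    state.count += 1;
    return true;
  }
}
```

A fixed window can admit up to twice the cap across a window boundary; a sliding window or token bucket would be stricter if that matters.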

Does it work with PgBouncer in transaction-pooling mode?

Yes. Spool instruments above the pooler at the driver level, so trace context survives connection multiplexing. We carry the trace ID in an application_name tag so it correlates with pg_stat_activity too.
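For a sense of what such a tag might look like, the sketch below builds an application_name value that carries a trace ID while staying within the 63 characters Postgres keeps for that setting in a standard build. The `service spool:<trace-id>` layout and the function name are assumptions; the actual tag format is not documented here.

```typescript
// Postgres keeps at most NAMEDATALEN - 1 = 63 characters of application_name in a standard build.
const MAX_APPLICATION_NAME_LENGTH = 63;

// Hypothetical tag layout: "<service> spool:<32-hex-char trace id>".
function applicationNameFor(service: string, traceId: string): string {
  if (!/^[0-9a-f]{32}$/.test(traceId)) {
    throw new Error("traceId must be 32 lowercase hex characters");
  }
  const suffix = ` spool:${traceId}`;
  const room = Math.max(0, MAX_APPLICATION_NAME_LENGTH - suffix.length);
  // Truncate the service name, never the trace ID, so correlation still works.
  return service.slice(0, room) + suffix;
}

console.log(applicationNameFor("api", "0af7651916cd43dd8448eb211c80319c"));
// prints "api spool:0af7651916cd43dd8448eb211c80319c"
```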

Install it before lunch. Find the slow query by 2pm.

Free up to a million spans a month, no credit card. Three lines, no agent, no sampling config.

$ npm i @spool/node