Marcus Tobin. Distributed systems engineer, consulting since 2024.
I work with high-performance teams on database internals, query planning, and concurrency. Available for short engagements — usually 4 to 12 weeks, hands on keyboard, embedded with one team.
Selected work
Five roles. Each row links to a longer write-up where it exists; otherwise the bullet points are the whole story.
-
2024 – Now · 2.2 yr
Forecast · Founding engineer
Built the realtime query engine for a financial scenario-planning product. Three engineers; I led data & storage.
- Designed a column-store + incremental-view planner; P95 1.18s on the median dashboard at 12,408 scenarios.
- Wrote the Rust async runtime adapter on top of Tokio that the team still uses (1,402 LOC, 0 unsafe).
- Shipped the first paying customer 11 weeks after I joined; ARR crossed $1.4M at the close of Q1.
Rust · Tokio · column-store · Postgres
-
2021 – 2024 · 3.0 yr
Linnea · Staff engineer
Led a 14-month Postgres-to-Cockroach migration for the analytics tier. Zero customer-visible downtime.
- Cut warehouse cost 42% against the rolling 6-month baseline by replanning hot dashboards.
- Authored Linnea's internal "on-call playbook for query slowness"; still mandatory reading for staff+ candidates.
- Drove the SLO program from idea to operating discipline; 0.0064% error budget burn on the read path through 2023.
Postgres · CockroachDB · SLO · migration
-
2018 – 2021 · 3.0 yr
Halcyon · Senior engineer
Owned the distributed tracing platform serving 412 internal services.
- Reduced ingest cost-per-span to $0.0064 while doubling retention to 14 days.
- Open-sourced our sampling library (halcyon-rs/sample); 2,184 GitHub stars, used in production at four named companies.
- Mentored five engineers from mid to senior — three are now staff-level elsewhere.
OpenTelemetry · Rust · Kafka
-
2016 – 2018 · 2.0 yr
Stratos Protocol · Engineer
Designed and shipped a custom k-d tree index for geo queries.
- Replaced an off-the-shelf R-tree, dropping P99 nearest-neighbor latency from 84 ms to 11 ms on the production fleet.
- Patent: US 11,408,124 — Coarse-to-fine geospatial bucketing for moving point sets. (Assignee not me; thoughts on patents available on request.)
k-d tree · C++ · geo
-
2014 – 2016 · 2.0 yr
Atrium · Engineer
First job out of CMU. Wrote a lot of CRUD, shipped the first production GraphQL gateway, learned the trade.
- Built the company's first read-replica failover system — still in service eleven years later.
- Took the on-call beeper, on average, every fifth weekend. Learned what production actually feels like.
Go · GraphQL · first job
Talks, writing & open source
Conference talks I have recordings of, essays I still stand behind, and the OSS work that survived its own pull requests.
-
2025 · 10
SREcon · "Six P99s that lied to me, and what I did about it"
A 28-minute talk on misleading latency histograms, with the corrected dashboards open in a second window. Slides + recording linked.
SREcon · Dublin
-
2024 · 09
PgCon · "Custom indexes you can actually keep in production"
On building and operating GiST and SP-GiST opclasses, drawing on the Stratos k-d tree work and three regrets I had at Linnea.
PgCon · Ottawa
-
2023 · 09
Strange Loop · "Concurrency for people who hate Rust"
A reluctant defense of async/await for non-systems engineers. The most viewed talk I've given (218,408 views and counting).
Strange Loop · STL
-
2022 · 05
PostgresOpen Silicon Valley · "Reading EXPLAIN like a novel"
A tutorial on plan trees, joins, and the four lies the planner tells you on a Tuesday afternoon.
Tutorial · 90 min
-
2026 · 02
Your error budget is a budget
On treating SLO error budgets like real money, with a worked example from Forecast's first quarter.
Essay · 11 min
-
2025 · 04
The wrong P95 is worse than no P95
On weighted vs. simple histograms; the talk above is the short version of this essay.
Essay · 16 min
-
2024 · 11
Notes on running Cockroach on bare metal
Capacity tuning, NUMA, RAID layout: what survived from the Linnea migration, and what I'd do differently today.
Notes · 22 min
-
2023 · 08
Don't hire me to do this
A short essay on when consulting is the wrong answer. I republished it when I went independent.
Essay · 5 min
About
I have written code professionally for about twelve years. I am picky about the work I take on because most companies don't actually need someone like me; they need to hire five mid-level engineers and run a real on-call rotation. If you have an actual hard distributed-systems problem, and you want someone with deep query engine and concurrency experience for four to twelve weeks, this is the page to start from.
I prefer being one of three on an engagement, not the lone outside expert. I don't ghost-write code or attend status meetings. I will tell you when the answer is to hire, not to consult.
Outside work I cycle a lot, repair old typewriters, and run a small reading group for engineers on database papers from the 1980s. All of this is on Tuesday nights.
- 2010 – 2014 · Carnegie Mellon · BS Computer Science · honors thesis · query optimization
- 2014 · Postgres summer of code · planner · mentor: Peter Geoghegan
- → Write down the question before the code.
- → A profiler beats a hot take.
- → If it doesn't fit on a 2-page resume, it didn't ship.
- → Two engineers, not five, beats five not three.
dated · 2026.04.18 · Pittsburgh
Booked through Q3 2026. Available for a 6-week engagement starting October 6.
Current rotation: 3 days a week on the Forecast planner, 2 days reviewing query plans for a Series-C team in Berlin (NDA). Open to a fourth project starting Q4 — please write before September.
Write a paragraph. Tell me the team size, the problem in plain prose, the deadline, and the budget range. I read every email and reply within two business days; if I'm not the right fit I'll usually suggest someone who is.