Compass
v3.8 · eBPF + OTel

See every hop in your cluster. Without rebuilding your stack.

Compass picks up traces, RPS, and error rates from your existing Envoy and Linkerd proxies using eBPF: no code changes, no sidecar redeploys. Sub-1% CPU overhead per node, and none added to your application pods.

$ helm install compass compass/agent --namespace observability
cluster · prod-us-east-1 · 58 services
live · last 5m
[Topology map: ingress · envoy, gateway, auth, orders, catalog, pricing, postgres, redis. Edge labels: 2.4k rps, 418 rps, 1.8k, 214. Error labels: 0.04%, orders 1.2% err, pricing 4.8% err. Legend: healthy · degraded · error]
cluster · last 5m
live
Services tracked
58
+2 this week
Spans / second
142,408
peak 184k
Mesh latency overhead P95
0.84 ms
−0.18ms / 7d
Active alerts
3
1 sev-2, 2 sev-3
Spans / sec last 24h
EKS · Linkerd 2.14 DaemonSet 12 nodes
Observing clusters in production at
Crucible CINDER FOUNDRY & CO Cargo Strata lattice/cloud Volt Beacon Labs
Capabilities

The four panes platform teams open on a bad day.

Topology, traces, latency, and saturation, joined on workload labels rather than service names. Works on EKS, GKE, AKS, and on-prem k3s.

Trace flame chart

A request, 7 services, one line of vision.

Drilled to span; colored by service. The slow leg of every request is one click away.

trace 7f1a4c · POST /checkout · 1.18s P95 of last 1k
ingress · envoy
1180ms
↳ gateway · /checkout
1128ms
↳ auth · /verify
128ms
↳ orders · /create
566ms
↳ pricing · /quote
378ms
↳ pg.query SELECT … pricing_rules
284ms
↳ catalog · /sku
92ms
↳ payments · /charge
354ms
0ms · 295ms · 590ms · 885ms · 1.18s
Latency heatmap

Services × time. Hot cells = bad day.

P95 latency per service, binned in 15-second buckets.
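
For a feel of what one heatmap cell holds, here is a minimal Python sketch of the binning. It assumes spans arrive as (service, unix timestamp, latency in ms) tuples and uses a nearest-rank P95; the function names are hypothetical and this is not Compass's actual aggregation code.

```python
from collections import defaultdict

BUCKET_SECONDS = 15  # bucket width taken from the heatmap description


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    # ceil(0.95 * n) done in integer arithmetic to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def heatmap_cells(spans):
    """Group (service, unix_ts, latency_ms) tuples into P95 cells.

    Returns {(service, bucket_start_ts): p95_latency_ms}.
    """
    buckets = defaultdict(list)
    for service, ts, latency_ms in spans:
        bucket_start = int(ts // BUCKET_SECONDS) * BUCKET_SECONDS
        buckets[(service, bucket_start)].append(latency_ms)
    return {cell: p95(latencies) for cell, latencies in buckets.items()}
```

Each key in the result maps to one cell: a service row and a 15-second column.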

mTLS status

Mutual TLS, but make it visible.

Pair-wise cert status across the mesh. A yellow cell flags an expiring cert before Istio does.

[mTLS grid: rows ingress, gateway, orders; columns gw, auth, ord, cat, pri, pay. Legend: verified · cert < 7d · no path]
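
The three legend states can be sketched as a small classifier. This is an illustration only: `pair_status` and its arguments are hypothetical names, and it assumes the peer certificate's expiry time is already known for each service pair.

```python
from datetime import datetime, timedelta, timezone

EXPIRY_WARNING = timedelta(days=7)  # matches the "cert < 7d" legend entry


def pair_status(not_after, now, has_path=True):
    """Classify one source -> destination pair for the mTLS grid.

    not_after: expiry of the peer certificate (timezone-aware), or None.
    In this sketch an already-expired cert also lands in "cert < 7d".
    """
    if not has_path or not_after is None:
        return "no path"
    if not_after - now < EXPIRY_WARNING:
        return "cert < 7d"
    return "verified"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(pair_status(now + timedelta(days=3), now))
```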
Retry budget

Cap the retry-storm. Globally.

# retries.yaml
budget:
  ratio: 0.2
  min_per_sec: 10
  ttl: "5s"
backoff: exponential
cap: 3
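
One way to read those settings, sketched in Python. The semantics here are assumptions rather than documented Compass behavior: retries are allowed up to `ratio` times the requests seen in the last `ttl` seconds, plus a floor of `min_per_sec` per second, and `cap` is taken to be the maximum retries per request. The class and function names and the 0.1s backoff base are hypothetical.

```python
from collections import deque


class RetryBudget:
    """Sliding-window retry budget (assumed semantics, see lead-in)."""

    def __init__(self, ratio=0.2, min_per_sec=10, ttl=5.0):
        self.ratio = ratio
        self.min_per_sec = min_per_sec
        self.ttl = ttl
        self._requests = deque()  # timestamps of original requests
        self._retries = deque()   # timestamps of granted retries

    def _expire(self, now):
        # Drop events that are older than the ttl window.
        for window in (self._requests, self._retries):
            while window and now - window[0] >= self.ttl:
                window.popleft()

    def record_request(self, now):
        self._expire(now)
        self._requests.append(now)

    def try_retry(self, now):
        """Return True and spend budget if a retry is allowed at `now`."""
        self._expire(now)
        allowance = self.ratio * len(self._requests) + self.min_per_sec * self.ttl
        if len(self._retries) < allowance:
            self._retries.append(now)
            return True
        return False


def backoff_delays(cap=3, base=0.1):
    """Exponential delays in seconds for up to `cap` retries of one request."""
    return [base * 2 ** attempt for attempt in range(cap)]
```

With the floor set to zero, ten requests in the window allow two retries; the third is refused until older events age out.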
Pod resources

Sub-1% CPU per node. Verified.

The eBPF agent runs as a DaemonSet, not a sidecar. Average CPU 0.6%, memory 38 MB resident per node, measured across 24h on a 12-node EKS cluster.

CPU avg
0.61%
Mem RSS
38 MB
P99 hook
14 µs
prod · last 15m
Traces / sec
38,402
↗ +8.2% / 24h
P95 e2e latency
1.18 s
↘ −214ms / 7d
Error rate
0.084%
↗ pricing 4.8%
Sample · POST /checkout · 7 services 1.18s
envoy
1180ms
gateway
1140ms
auth
128ms
orders
566ms
pricing
378ms
catalog
92ms
payments
354ms
Zero-config tracing

Don't touch the code. See the traces.

Compass attaches at the kernel — eBPF tracepoints on the socket layer pick up every TLS-terminated request inside the mesh. No SDK, no instrumentation library, no rebuild.

  • Speaks HTTP/1.1, HTTP/2, gRPC, Postgres, Kafka, and Redis out of the box.
  • Reads W3C traceparent from the wire when your app sets one.
  • Stitches synthetic trace IDs across uninstrumented hops via timing + headers.
  • Exports OTLP to Jaeger, Tempo, Honeycomb — Compass UI is optional.
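
For reference, the W3C `traceparent` header mentioned above has a fixed version-00 layout: version, trace ID, parent ID, and flags. A minimal Python sketch of reading it follows; `parse_traceparent` and `TraceContext` are hypothetical names, not part of Compass, and the sketch accepts version 00 only.

```python
import re
from typing import NamedTuple, Optional

# version - 32 hex trace-id - 16 hex parent-id - 2 hex flags (lowercase hex)
_TRACEPARENT = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


class TraceContext(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Parse a version-00 traceparent header; return None if invalid."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    if version != "00":
        return None  # this sketch does not handle other versions
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None  # all-zero IDs are invalid per the spec
    return TraceContext(trace_id, parent_id, bool(int(flags, 16) & 0x01))
```

A header that fails to parse is treated as absent, which is the point where a synthetic trace ID would take over.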
Read the eBPF guide
Pricing

Per cluster. Not per host.

Run as many pods as you like. Compass charges by cluster footprint and span retention, not node count.

Cluster

free forever
$0
/ month

For one cluster with up to 30 services.

  • 1 cluster · unlimited nodes
  • 7-day trace retention
  • Topology + flame chart
  • Community Slack
Install Helm chart
Most popular

Multi-cluster

incl. alerting
$199
/ month · billed annually

For teams running 2–10 clusters per region.

  • 10 clusters · cross-cluster joins
  • 30-day trace retention
  • SLO + burn-rate alerting
  • OTLP export · Grafana plugin
  • SLA: 99.95% query API
Start 14-day trial

Fleet

air-gap
$499
/ month base · usage on top

For platform teams with 50+ clusters and regulated workloads.

  • Unlimited clusters · multi-tenant
  • Air-gapped deploy · self-host
  • Signed audit log · SOC 2 · SSO
  • Customer-managed encryption
Talk to sales
Customers

Platform teams running real clusters at real scale.

"We rolled Compass out across 31 EKS clusters in three hours. No code changes. The CPU overhead on our 8k-pod fleet is 0.6%, end of story."
YA
Yusuf Abara
CTO · Mercury
"The eBPF approach means our app teams don't need to touch their Dockerfiles. Compass detected a retry storm in pricing 14 seconds before our SLO burn alarm did."
HS
Hana Suzuki
Platform Lead · Glide
"Our previous mesh dashboard required redeploying every sidecar. Compass dropped in as a DaemonSet and we got better fidelity that night."
IC
Ines Calderón
Head of Platform · Brightwave
FAQ

Frequently asked, honestly answered.

Most of our team is ex-Cilium and ex-Datadog. If something here doesn't satisfy, ping [email protected].

Does it require sidecar injection or can I use Envoy at the ingress?

Neither is required. The agent runs as a DaemonSet and reads kernel events directly. If you already have Envoy or Linkerd injected, Compass also consumes their stats endpoint for richer L7 metadata — but it's strictly additive. A pod with no proxy still gets full topology and RPS coverage.

What's the CPU/memory overhead per pod?

The agent itself is per-node, not per-pod. On a typical mixed workload we measure 0.6% CPU and 38 MB RSS per node averaged over 24h, and a P99 hook latency of 14µs. There is zero overhead added to the application pods themselves — they don't run any Compass code.

Can I export traces to my existing Jaeger?

Yes. The agent's primary protocol is OTLP — point it at Jaeger, Tempo, Honeycomb, or any OTLP-compatible collector and you can use Compass purely as a capture layer. Our UI is optional and you only pay for what you query.

How do you handle multi-tenancy in a shared cluster?

Namespaces are the tenant boundary. The agent applies a tenant label to every span based on the source pod's namespace, and our RBAC enforces row-level filters at query time. You can scope an API token to a single namespace, multiple, or the whole cluster.
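
As a rough illustration of that scoping model, here is a Python sketch of a query-time filter. The `Span` and `ApiToken` types and the `visible_spans` function are hypothetical stand-ins, not the Compass API; the only assumption carried over from the answer above is that each span is labeled with its source pod's namespace.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Span:
    name: str
    namespace: str  # source pod's namespace, used as the tenant label


@dataclass(frozen=True)
class ApiToken:
    # None means cluster-wide scope; otherwise the set of allowed namespaces.
    namespaces: Optional[FrozenSet[str]] = None


def visible_spans(token: ApiToken, spans: Iterable[Span]) -> List[Span]:
    """Apply the token's namespace scope as a row-level filter."""
    if token.namespaces is None:
        return list(spans)
    return [s for s in spans if s.namespace in token.namespaces]
```

A token scoped to `frozenset({"payments"})` sees only spans whose source pod runs in the `payments` namespace.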

Does it work on EKS Fargate?

Fargate denies kernel-level access, so eBPF mode is out. We ship a Fargate-compatible userspace agent that consumes the Envoy admin endpoint and proxy access logs. Coverage is slightly lower (no DB-protocol parsing without proxy support) but topology, RPS, and HTTP latency are intact.

Helm install · 90 seconds

One DaemonSet. A full mesh map.

Free for a single cluster, forever. No credit card, no contract — just a Helm chart and a topology graph in your browser.