v3.8 · eBPF + OTel

See every hop in your cluster. Without rebuilding your stack.

Compass picks up traces, RPS, and error rates from your existing Envoy and Linkerd proxies using eBPF: no code changes, no sidecar redeploys. Sub-1% CPU overhead per node.

Start tracing free · Read the architecture

$ helm install compass compass/agent --namespace observability

cluster · prod-us-east-1 · 58 services

live · last 5m

healthy · degraded · error

Services tracked

+2 this week

Spans / second

142,408

peak 184k

Mesh latency overhead P95

0.84 ms

−0.18ms / 7d

Active alerts

1 sev-2, 2 sev-3

Spans / sec · last 24h

EKS · Linkerd 2.14 · DaemonSet · 12 nodes

Observing clusters in production at

Crucible · CINDER · FOUNDRY & CO · Cargo · Strata · lattice/cloud · Volt · Beacon Labs

Capabilities

The four panes platform teams open on a bad day.

Topology, traces, latency, and saturation, joined on workload labels rather than service names. Works on EKS, GKE, AKS, and on-prem k3s.

Trace flame chart

A request, 7 services, one line of vision.

Drilled to span; colored by service. The slow leg of every request is one click away.

trace 7f1a4c · POST /checkout · 1.18s P95 of last 1k

ingress · envoy

1180ms

↳ gateway · /checkout

1128ms

↳ auth · /verify

128ms

↳ orders · /create

566ms

↳ pricing · /quote

378ms

↳ pg.query SELECT … pricing_rules

284ms

↳ catalog · /sku

92ms

↳ payments · /charge

354ms

0ms · 295ms · 590ms · 885ms · 1.18s
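One way to read the chart above is by self time: a span's duration minus the time spent in its direct children. The Python sketch below is illustrative only. The durations come from the sample trace, but the parent/child nesting is an assumption (the chart's indentation is not recoverable here), and it assumes children run one after another rather than in parallel.

```python
# Durations (ms) are from the sample trace; the parent links are an ASSUMED nesting.
SPANS = {
    "ingress":  (1180, None),
    "gateway":  (1128, "ingress"),
    "auth":     (128, "gateway"),
    "orders":   (566, "gateway"),
    "pricing":  (378, "orders"),
    "pg.query": (284, "pricing"),
    "catalog":  (92, "orders"),
    "payments": (354, "gateway"),
}

def self_times(spans):
    """Return each span's duration minus the summed duration of its direct children."""
    result = {name: duration for name, (duration, _) in spans.items()}
    for name, (duration, parent) in spans.items():
        if parent is not None:
            result[parent] -= duration
    return result

def slowest_leg(spans):
    """Name of the span with the largest self time."""
    times = self_times(spans)
    return max(times, key=times.get)
```

Under that assumed nesting, `slowest_leg(SPANS)` picks out the span where the request actually waits, rather than the outermost span that merely contains everything else.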

Latency heatmap

Services × time. Hot cells = bad day.

P95 latency per service, binned in 15-second buckets.
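As a rough illustration of the binning described above, here is a minimal Python sketch. The input shape `(service, timestamp, latency_ms)` and the nearest-rank percentile method are assumptions made for the example, not a description of how Compass computes its cells.

```python
from collections import defaultdict

BUCKET_SECONDS = 15  # bin width stated above

def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]

def heatmap_cells(samples):
    """samples: iterable of (service, unix_ts_seconds, latency_ms).

    Returns {(service, bucket_start_seconds): p95_latency_ms}.
    """
    cells = defaultdict(list)
    for service, ts, latency_ms in samples:
        bucket_start = int(ts // BUCKET_SECONDS) * BUCKET_SECONDS
        cells[(service, bucket_start)].append(latency_ms)
    return {key: p95(vals) for key, vals in cells.items()}
```

Each returned key is one cell of the services × time grid; a hot cell is simply a high P95 value for that service in that 15-second window.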

mTLS status

Mutual TLS, but make it visible.

Pairwise cert status across the mesh. A yellow cell flags a cert expiring within 7 days, before Istio does.

[mTLS matrix · axis labels: auth, ord, cat, pri, pay / ingress, gateway, orders]

Legend: verified · cert < 7d · no path

Retry budget

Cap the retry storm. Globally.

# retries.yaml
budget:
  ratio: 0.2
  min_per_sec: 10
  ttl: "5s"
backoff: exponential
cap: 3
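To make the budget fields concrete, here is a hedged Python sketch of a sliding-window retry budget. The semantics are an assumption modeled on common mesh retry budgets: within the trailing `ttl` window, retries may not exceed `ratio` × original requests plus a floor of `min_per_sec` × `ttl`. The `RetryBudget` class is hypothetical, and `backoff` and `cap` are left out.

```python
from collections import deque

class RetryBudget:
    """Sliding-window retry budget. ASSUMED semantics, not Compass's documented behavior."""

    def __init__(self, ratio=0.2, min_per_sec=10, ttl=5.0):
        self.ratio = ratio
        self.min_per_sec = min_per_sec
        self.ttl = ttl
        self._requests = deque()  # timestamps of original requests
        self._retries = deque()   # timestamps of granted retries

    def _expire(self, now):
        # Drop events that have aged out of the trailing ttl window.
        for window in (self._requests, self._retries):
            while window and window[0] <= now - self.ttl:
                window.popleft()

    def record_request(self, now):
        self._expire(now)
        self._requests.append(now)

    def try_retry(self, now):
        """Return True and spend budget if a retry is allowed at time `now`."""
        self._expire(now)
        allowed = self.ratio * len(self._requests) + self.min_per_sec * self.ttl
        if len(self._retries) < allowed:
            self._retries.append(now)
            return True
        return False
```

Passing the clock in as `now` keeps the sketch deterministic; a real caller would pass `time.monotonic()`.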

Pod resources

Sub-1% CPU per node. Verified.

The eBPF agent runs as a DaemonSet, not a sidecar. Average CPU 0.6%, memory 38 MB resident, measured across 24h on a 12-node EKS cluster.

CPU avg

0.61%

Mem RSS

38 MB

P99 hook

14 µs

prod · last 15m

Traces / sec

38,402

↗ +8.2% / 24h

P95 e2e latency

1.18 s

↘ −214ms / 7d

Error rate

0.084%

↗ pricing 4.8%

Sample · POST /checkout · 7 services · 1.18s

envoy

1180ms

gateway

1140ms

auth

128ms

orders

566ms

pricing

378ms

catalog

92ms

payments

354ms

Zero-config tracing

Don't touch the code. See the traces.

Compass attaches at the kernel — eBPF tracepoints on the socket layer pick up every TLS-terminated request inside the mesh. No SDK, no instrumentation library, no rebuild.

Speaks HTTP/1.1, HTTP/2, gRPC, Postgres, Kafka, Redis out of the box.
Reads W3C traceparent from the wire when your app sets one.
Stitches synthetic trace IDs across uninstrumented hops via timing + headers.
Exports OTLP to Jaeger, Tempo, Honeycomb — Compass UI is optional.
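For readers unfamiliar with the `traceparent` header mentioned above, this minimal Python sketch parses the version `00` format from the W3C Trace Context spec (`version-traceid-parentid-flags`). It shows the wire format only and is not Compass's parser. `TraceContext` and `parse_traceparent` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# Version 00: 32 hex chars of trace-id, 16 of parent-id, 2 of flags, all lowercase.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

class TraceContext(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool

def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Return the parsed context, or None if the header is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # The spec treats all-zero IDs as invalid.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    # Bit 0 of trace-flags is the "sampled" flag.
    return TraceContext(trace_id, parent_id, bool(int(flags, 16) & 0x01))
```

When the header is missing or invalid, a tracer has no trace ID to reuse, which is the case the synthetic trace IDs above are meant to cover.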

Read the eBPF guide

Pricing

Per cluster. Not per host.

Run as many pods as you like. Compass charges by cluster footprint and span retention, not node count.

Cluster

free forever

/ month

For one cluster with up to 30 services.

1 cluster · unlimited nodes
7-day trace retention
Topology + flame chart
Community Slack

Install Helm chart

Multi-cluster

incl. alerting

$199

/ month · billed annually

For teams running 2–10 clusters per region.

10 clusters · cross-cluster joins
30-day trace retention
SLO + burn-rate alerting
OTLP export · Grafana plugin
SLA: 99.95% query API

Start 14-day trial

Fleet

air-gap

$499

/ month base · usage on top

For platform teams with 50+ clusters and regulated workloads.

Unlimited clusters · multi-tenant
Air-gapped deploy · self-host
Signed audit log · SOC 2 · SSO
Customer-managed encryption

Talk to sales

Customers

Platform teams running real clusters at real scale.

"We rolled Compass out across 31 EKS clusters in three hours. No code changes. The CPU overhead on our 8k-pod fleet is 0.6%, end of story."

Yusuf Abara

CTO · Mercury

"The eBPF approach means our app teams don't need to touch their dockerfiles. Compass detected a retry storm in pricing 14 seconds before our SLO burn alarm did."

Hana Suzuki

Platform Lead · Glide

"Our previous mesh dashboard required redeploying every sidecar. Compass dropped in as a DaemonSet and we got better fidelity that night."

Ines Calderón

Head of Platform · Brightwave

FAQ

Frequently asked, honestly answered.

Most of our team is ex-Cilium and ex-Datadog. If something here doesn't satisfy, ping [email protected].

Does it require sidecar injection or can I use Envoy at the ingress?

Neither is required. The agent runs as a DaemonSet and reads kernel events directly. If you already have Envoy or Linkerd injected, Compass also consumes their stats endpoint for richer L7 metadata — but it's strictly additive. A pod with no proxy still gets full topology and RPS coverage.

What's the CPU/memory overhead per pod?

The agent itself is per-node, not per-pod. On a typical mixed workload we measure 0.6% CPU and 38 MB RSS per node averaged over 24h, and a P99 hook latency of 14µs. There is zero overhead added to the application pods themselves — they don't run any Compass code.

Can I export traces to my existing Jaeger?

Yes. The agent's primary protocol is OTLP — point it at Jaeger, Tempo, Honeycomb, or any OTLP-compatible collector and you can use Compass purely as a capture layer. Our UI is optional and you only pay for what you query.

How do you handle multi-tenancy in a shared cluster?

Namespaces are the tenant boundary. The agent applies a tenant label to every span based on the source pod's namespace, and our RBAC enforces row-level filters at query time. You can scope an API token to a single namespace, multiple, or the whole cluster.
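The scoping model above can be pictured with a small Python sketch. The `Span` and `ApiToken` types and the `visible_spans` helper are hypothetical, written only to illustrate namespace-scoped, row-level filtering; they are not Compass's API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional

@dataclass(frozen=True)
class Span:
    name: str
    tenant: str  # assumed to equal the source pod's namespace

@dataclass(frozen=True)
class ApiToken:
    # None means cluster-wide scope; otherwise the namespaces the token may read.
    namespaces: Optional[FrozenSet[str]] = None

def visible_spans(token: ApiToken, spans: Iterable[Span]) -> List[Span]:
    """Row-level filter: keep only spans whose tenant label is within the token's scope."""
    if token.namespaces is None:
        return list(spans)
    return [s for s in spans if s.tenant in token.namespaces]
```

A token scoped to `{"checkout"}` would see only spans labeled with that namespace, while a token with `namespaces=None` sees the whole cluster.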

Does it work on EKS Fargate?+

Fargate denies kernel-level access, so eBPF mode is out. We ship a Fargate-compatible userspace agent that consumes the Envoy admin endpoint and proxy access logs. Coverage is slightly lower (no DB-protocol parsing without proxy support) but topology, RPS, and HTTP latency are intact.