Phase 4 · End-to-end Linux compile path landed

A distributed compiler cache an auditor will actually approve.

ccache is great on your laptop. sccache adds a daemon and a remote cache. distcc farms compiles across machines. They all assume the worker is trusted shared-kernel infrastructure. hpcc assumes it isn't.

Isolation: Firecracker · KVM
Network: vsock only · no NIC
Identity: OCI image digest
Wire: gRPC · zstd · 5–10×

§ 01 · Why hpcc exists

A security review isn't asking "is namespace isolation technically sufficient?" It's asking "is this a boundary auditors recognise?"

A bwrap sandbox is not. A KVM boundary is. That single distinction is the entire reason hpcc exists — and the reason every existing OSS distributed compiler stalls at the regulated-enterprise door.

ccache · sccache · distcc

The worker is trusted shared-kernel infrastructure. Isolation is best-effort: a bubblewrap namespace, a Linux user, a chroot. Defensible on a laptop, indefensible in a regulated multi-tenant environment.

// gVisor was considered and rejected: it's a userspace kernel intercepting syscalls, not the kernel+KVM boundary a security review actually recognises.

hpcc

The worker is hostile-by-default, multi-tenant, and on the audit trail. One Firecracker microVM per tenant session. Separate kernel. KVM boundary. No NIC. Every compile produces a single audit row reproducible from one line of a TSV.

// the host↔guest channel is one vsock device carrying a single bidirectional gRPC stream.
// there is no exfiltration argument to have.

§ 02 · Where the conversation usually ends

No competing OSS distributed compiler ships hardware-virtualised per-tenant isolation.

A cell-by-cell view of the boundary your auditor can recognise — and the operational pieces that fall out of it.

ccache · sccache-dist · distcc · hpcc

Worker isolation
  ccache: none (local user, no remote)
  sccache-dist: bwrap (shared kernel namespace)
  distcc: none (worker is the host)
  hpcc: Firecracker · KVM (per-tenant microVM, separate kernel)

Network on worker
  sccache-dist: full NIC
  distcc: full NIC
  hpcc: no NIC · vsock only

Toolchain identity
  ccache: compiler binary hash
  sccache-dist: compiler binary hash
  distcc: operator-managed
  hpcc: OCI image digest

Preprocessing
  ccache: client-side
  sccache-dist: client- or server-side
  distcc: client-side
  hpcc: client-side · canonical bytes via image-pinned env + auto-injected flags

Audit trail
  existing tools: logs
  hpcc: per-job row · 9-tuple · reproducible

Reproducibility
  ccache / sccache-dist / distcc: by ceremony
  hpcc: auto-injected flags · pinned env

Cache poisoning
  ccache: laptop-trusted
  sccache-dist: client writes cache
  hpcc: paranoid mode · worker-only writes

Windows path
  hpcc: Hyper-V isolated · MSVC behind same RPC

§ 03 · What ships in the box

Nine decisions that fall out of "the worker is hostile-by-default."

F.01

One Firecracker microVM per tenant session

Driven directly by hpcc — no firecracker-containerd dependency. The VM stays warm across compiles, snapshotted on idle timeout. Separate kernel. KVM boundary.

F.02

The VM has no NIC

There is no exfiltration argument to have, because there is no network device. Full stop. Host↔guest is one vsock device carrying a single bidirectional gRPC stream.
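
A minimal sketch of what "no NIC, one vsock device" can look like as a Firecracker-style VM config. The helper name, paths, boot arguments, and machine sizing are hypothetical, and this is not hpcc's actual launch code. The point is what is absent: there is no "network-interfaces" section at all.

```python
def build_vm_config(kernel_path: str, rootfs_path: str,
                    vsock_uds_path: str, guest_cid: int = 3) -> dict:
    """Build a Firecracker-style config dict with a vsock device and no NIC.

    All paths and sizing values are illustrative assumptions.
    """
    return {
        "boot-source": {
            "kernel_image_path": kernel_path,
            "boot_args": "console=ttyS0 reboot=k panic=1",
        },
        "drives": [{
            "drive_id": "rootfs",
            "path_on_host": rootfs_path,
            "is_root_device": True,
            "is_read_only": True,  # assumption: immutable toolchain image
        }],
        "machine-config": {"vcpu_count": 2, "mem_size_mib": 2048},
        # The only host<->guest channel: a single vsock device.
        "vsock": {"guest_cid": guest_cid, "uds_path": vsock_uds_path},
        # Deliberately no "network-interfaces" key: the guest gets no NIC.
    }
```

An auditor can check the boundary by reading the config: if the network section is missing, there is no device to argue about.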

F.03

Container image digest is the toolchain identity

No "hash the gcc binary" dance. 50 developers sharing one image produce one cache bucket. CI and laptops cannot silently diverge.
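
A sketch of how an image digest can stand in for the toolchain in a cache key. The function name and the exact key layout are assumptions for illustration, not hpcc's real key derivation.

```python
import hashlib


def cache_key(image_digest: str, preprocessed_source: bytes,
              flags: list[str]) -> str:
    """Derive a cache key from image digest, preprocessed bytes, and flags.

    Each part is length-prefixed so that adjacent parts cannot be
    confused with one another. The layout is a hypothetical example.
    """
    parts = (
        image_digest.encode(),
        hashlib.sha256(preprocessed_source).digest(),
        "\0".join(flags).encode(),
    )
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()
```

Two machines that pull the same image and preprocess to the same bytes land in the same bucket; a different image digest yields a different key, so divergence is never silent.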

F.04

Route-only scheduler

The scheduler is a lookup service — never touches compile payloads or artifact bytes. The client dials the worker directly. No proxy hop, no bottleneck, no scheduler holding cache keys it doesn't need.
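
To make "lookup service" concrete, here is a hedged sketch of a route-only scheduler. The class names, the `tls_trust` field, and the hash-based worker choice are all assumptions; the real scheduler may pick workers quite differently. What matters is the API surface: a tenant goes in, an address comes out, and no method accepts payload bytes.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    worker_addr: str  # where the client should dial directly
    tls_trust: str    # trust info for that worker (hypothetical format)


class RouteOnlyScheduler:
    """Answers 'which worker?' and nothing else."""

    def __init__(self, workers: list[Route]) -> None:
        if not workers:
            raise ValueError("at least one worker is required")
        self._workers = list(workers)

    def route(self, tenant: str) -> Route:
        # Deterministic choice so a tenant keeps landing on the same worker.
        digest = hashlib.sha256(tenant.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._workers)
        return self._workers[index]
```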

F.05

Reproducibility by default

Auto-injected -Werror=date-time, -ffile-prefix-map, -frandom-seed; pinned locale, timezone, hostname inside the VM. Byte-identical outputs by default — not by ceremony.
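
A rough sketch of the injection step, assuming the three GCC/Clang flags named above and a pinned environment. The helper name, the seed derivation, and the concrete environment values are illustrative assumptions; hostname pinning would happen inside the VM and is not modelled here.

```python
import hashlib

# Assumed pinned environment for every compile.
PINNED_ENV = {"LC_ALL": "C", "TZ": "UTC"}


def reproducible_command(argv: list[str], src_root: str,
                         source_path: str) -> tuple[list[str], dict[str, str]]:
    """Append reproducibility flags to a compiler command line.

    -Werror=date-time     rejects __DATE__ / __TIME__ / __TIMESTAMP__
    -ffile-prefix-map     rewrites the checkout path out of the output
    -frandom-seed         fixes the seed used for generated symbol names
    """
    # Assumption: derive the seed from the source path so it is stable.
    seed = hashlib.sha256(source_path.encode()).hexdigest()[:16]
    injected = [
        "-Werror=date-time",
        f"-ffile-prefix-map={src_root}=.",
        f"-frandom-seed={seed}",
    ]
    # Appended last so the injected flags follow the user's own flags.
    return [*argv, *injected], dict(PINNED_ENV)
```

Because the injected flags depend only on the source path and the checkout root, two developers with different home directories end up with the same mapped paths in the object file.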

F.06

Per-job audit row

(image_digest, source_digest, flags, output_digest, tenant, worker, vm, duration, exit) — every compile, reproducible from a single line. The table format auditors want to see.

F.07

Structured miss explanations

hpcc explain <file> names which header or which flag changed the cache key. Not a debug log you have to grep.

F.08

Per-call zstd on the wire

Preprocessed C++ compresses 5–10×. The single largest perf lever, and it's on by default.
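
A small sketch of per-call compression. Note the substitution: zlib from the Python standard library stands in for zstd here so the example has no third-party dependency, and the function names are hypothetical. The ratio it reports will differ from zstd's, so treat it as a shape, not a benchmark.

```python
import zlib


def encode_call(payload: bytes, level: int = 6) -> bytes:
    """Compress one call's payload before it goes on the wire."""
    return zlib.compress(payload, level)


def decode_call(frame: bytes) -> bytes:
    """Restore the original payload on the receiving side."""
    return zlib.decompress(frame)


def compression_ratio(payload: bytes) -> float:
    """Original size divided by compressed size for one payload."""
    return len(payload) / len(encode_call(payload))
```

Preprocessed translation units repeat the same expanded headers over and over, which is why they compress so well regardless of the exact codec.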

F.09

Paranoid mode

Cache reads and writes happen only on the worker. Clients never touch cache stores, never hold remote-store credentials. A compromised laptop cannot poison the cache.
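
One way to picture paranoid mode, as a sketch under stated assumptions: the class name and the in-memory dict (standing in for the remote store) are hypothetical, and deriving the key on the worker from the bytes it actually received is an assumption about how poisoning is prevented, not a confirmed detail of hpcc.

```python
import hashlib
from typing import Callable


class ParanoidWorker:
    """Owns the cache store; clients only ever submit compile requests."""

    def __init__(self, compile_fn: Callable[[bytes], bytes]) -> None:
        self._compile = compile_fn
        # Stand-in for the remote store. Credentials would live only here.
        self._store: dict[str, bytes] = {}

    def handle(self, image_digest: str,
               preprocessed: bytes) -> tuple[bytes, bool]:
        """Return (artifact, cache_hit) for one compile request."""
        # The key is computed worker-side, never accepted from the client,
        # so a client cannot bind a chosen key to a malicious artifact.
        key = hashlib.sha256(
            image_digest.encode() + b"\0" + preprocessed
        ).hexdigest()
        cached = self._store.get(key)
        if cached is not None:
            return cached, True
        artifact = self._compile(preprocessed)
        self._store[key] = artifact
        return artifact, False
```

The client-facing surface has no "put" at all: the only way bytes enter the store is as the output of a compile the worker ran itself.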

§ 04 · Architecture

Route-only scheduler. Workers dispatch into per-tenant VMs over vsock.

The scheduler returns a worker address and TLS trust info — it never touches compile payloads. The client dials the worker directly. The worker dispatches into a tenant-pinned Firecracker VM via a single bidirectional gRPC stream over a vsock device. No NIC ever appears in the guest.

[Figure: architecture diagram]
Components: CLIENT (hpcc wrap → gcc / clang / cl + daemon over loopback) · SCHEDULER (route-only, never sees payload) · REMOTE CACHE (S3 / MinIO / R2, cache/ prefix, LRU) · WORKER (linux host) · FIRECRACKER µVM (tenant-pinned, hpcc-agent PID 1, OCI rootfs (ext4), no NIC, vsock cid:3, bidi gRPC: AgentService.Exec)
Steps: 1. route? · 2. signed JWT · 3. gRPC · zstd · per-call · 4. paranoid mode: worker-only
Legend: direct payload path (zstd, per-call) · control / cache plane · response · vsock

§ 05 · The audit row

A reproducible compile is one line of a table.

Every job emits a 9-tuple. hpcc inspect <hash> reads it back. hpcc explain <file> tells you which header or flag broke the key — not a log to grep.

job · audit row · 2026-05-10T14:22:08Z
image_digest: sha256:3f7c…ba12
source_digest: sha256:9ea1…44d0
flags: -O2 -std=c++20 -fPIC +-Werror=date-time +-ffile-prefix-map
output_digest: sha256:1d8f…cc7a
tenant: fixed-income.eq-deriv
worker: w-eu-w1-04 · 7.5 GiB
vm: fc-7831 · cid:3 · warm
duration: 1.84s · cache miss → store
exit: 0 · ok
byte-identical · reproducible · tsv · 1 line
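
The 9-tuple maps naturally onto one TSV line. The sketch below assumes every field is stored as a string and that the field order follows the tuple listed in F.06; the class and function names are hypothetical, not hpcc's schema.

```python
from dataclasses import astuple, dataclass, fields


@dataclass(frozen=True)
class AuditRow:
    image_digest: str
    source_digest: str
    flags: str
    output_digest: str
    tenant: str
    worker: str
    vm: str
    duration: str
    exit: str


def to_tsv(row: AuditRow) -> str:
    """Serialise one audit row as a single tab-separated line."""
    values = astuple(row)
    for value in values:
        # Tabs or newlines inside a field would break the one-line format.
        if "\t" in value or "\n" in value:
            raise ValueError("fields must not contain tabs or newlines")
    return "\t".join(values)


def from_tsv(line: str) -> AuditRow:
    """Parse a line written by to_tsv back into an AuditRow."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != len(fields(AuditRow)):
        raise ValueError(f"expected {len(fields(AuditRow))} fields, got {len(parts)}")
    return AuditRow(*parts)
```

Round-tripping through `to_tsv` and `from_tsv` returns an equal row, which is the property that lets a single line of the file stand in for the whole job.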

When a build that should have hit the cache misses, the question stops being "where do I start?" and starts being "what changed?" hpcc answers the second question directly.

terminal · hpcc explain MarketRiskBook.cpp.o
$ hpcc explain build/MarketRiskBook.cpp.o
scanning 142 inputs against last hit (2026-05-09T11:08Z)…

source hash unchanged (MarketRiskBook.cpp · 9ea1…44d0)
image digest unchanged (toolchain:gcc-13.3 · 3f7c…ba12)
138 of 140 transitive headers identical

flag changed · cache key broken here
was: -DRISK_BUILD=2026.05.08-rc3
now: -DRISK_BUILD=2026.05.09-rc1

header changed · vendored/curve_solver.hpp
@line 41 · added: constexpr int kMaxIters = 256;

// 1 missed compile · 7m 12s spent · 4 follow-on misses suppressed
$ _
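
The comparison behind an explanation like the one above can be sketched as a diff of two key-input snapshots. The snapshot layout (a dict with `image_digest`, `source_digest`, `flags`, and a `headers` map of path to digest) and the function name are assumptions for illustration, not hpcc's internals.

```python
def explain_miss(last_hit: dict, current: dict) -> list[str]:
    """List every key input that differs between two snapshots."""
    reasons: list[str] = []
    if last_hit["image_digest"] != current["image_digest"]:
        reasons.append("image digest changed")
    if last_hit["source_digest"] != current["source_digest"]:
        reasons.append("source changed")

    old_flags, new_flags = set(last_hit["flags"]), set(current["flags"])
    for flag in sorted(old_flags - new_flags):
        reasons.append(f"flag removed: {flag}")
    for flag in sorted(new_flags - old_flags):
        reasons.append(f"flag added: {flag}")

    old_headers, new_headers = last_hit["headers"], current["headers"]
    for path in sorted(old_headers.keys() | new_headers.keys()):
        # A missing header compares as None, so additions and removals count.
        if old_headers.get(path) != new_headers.get(path):
            reasons.append(f"header changed: {path}")
    return reasons
```

An empty list means every recorded input matched, so the miss has to come from something outside the snapshot.
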
§ 06 · Roadmap

Three phases done, one in progress, one to go.

The cache loop and the daemon are table stakes — sccache does those well. hpcc's bet is on phase 4: the place existing tools can't follow without rebuilding their isolation model from scratch.

§ 07 · Maintainer
Signed-off-by: Afshin Arani <@aarani>

Senior software engineer. Spends his days inside the kind of environments hpcc was built for, which is the short answer to "why does this exist?" Reviews, issues, and well-formed bug reports are welcome on GitHub.

solo maintainer · AGPL licensed · no external funding