
independent inference research

Schedulers, systems notes, and reproducible measurements for shared inference.

KVWarden Gate 2: TTFT under load is 1.14× the solo value. 26× better than FIFO.

Coconut Labs works on the shared layer of inference: scheduling, fairness, cache pressure, and the measurements that keep claims honest.

The lab is small by design. Fewer abstractions between the benchmark, the note, and the code.

The quiet tenant should still have a name.

the lab

Two engineers, close to the work.

Coconut Labs is intentionally small. The work happens in the open at github.com/coconut-labs and shows up here when there is a result worth standing behind.

How we work

Building something at this layer? Write us.

latest note · 2026-04-19 (7 weeks ago)
14 commits this week
kvwarden gate 2 · 1.14× solo · 26× better than fifo
4 repos tracked
6 rfc open
updated 11d ago