about

A small lab for shared inference.

Coconut Labs is Shrey Patel and Jay Patel. We work on inference systems, schedulers, and the small pieces of infrastructure that decide whether shared compute behaves like a tool or a tax.

The lab is intentionally small: two people, close to the measurements. A result should have a harness. A claim should have a number. A page should leave enough quiet around the thing it is trying to say.

The first public thread is KVWarden: tenant fairness on shared inference. Weft follows the same line, closer to local Apple Silicon inference. Both projects are about making the shared layer more honest under pressure.

People

Shrey Patel

Co-founder · Engineer

Engineer and writer. Builds inference middleware between LLMs and the GPUs they run on. Currently building Coconut Labs.

Jay Patel

Co-founder · Engineer

Engineer focused on inference reliability and tenant fairness on shared hardware. Co-founder of Coconut Labs.

How we work

Honest scale

Coconut Labs is Shrey Patel and Jay Patel. Two people, close to the work, no fake-team plurals beyond that.

Specific numbers

A result earns attention when the units are visible. Latency, pressure, and failure modes stay in frame.

Boring infrastructure

Schedulers, harnesses, and traces are not decoration. They are the artifact.

Ship the code first

A post can be elegant. The repository still has to run.

Slow web

Pages should feel like edited pages. Motion is allowed when it carries meaning.