OSI-open · Local inference and "what runs on my machine"

openlake

openlake-project

Rust object-storage engine on io_uring that moves data along the shortest path from NVMe to GPU memory for training and inference.

1.6k stars (as of 2026-06-26)

Overview

What is openlake?

A Rust object-storage engine built on io_uring that aims to move data along the shortest path from NVMe to GPU memory, using GPUDirect Storage and RDMA, a thread-per-core design, and SIMD erasure coding. It is S3-wire-compatible and targets three workloads: training checkpoints, inference model and KV-cache loads, and vector/RAG segment reads.
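To make "S3-wire-compatible" concrete, here is a minimal sketch of the kind of request an S3 client issues when it reads part of a checkpoint: a path-style GetObject with a byte range. It is an illustration under stated assumptions, not openlake's documented API. The endpoint address, bucket name, and object key are hypothetical placeholders, and request signing (which S3 endpoints normally require) is omitted. The request is built but never sent.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical endpoint: host and port are placeholders, not openlake defaults.
ENDPOINT = "http://localhost:9000"


def ranged_get(bucket: str, key: str, start: int, length: int) -> Request:
    """Build (but do not send) a path-style S3 GetObject request for a byte range."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be positive")
    # Path-style addressing: <endpoint>/<bucket>/<key>, with the key percent-encoded.
    url = f"{ENDPOINT}/{quote(bucket)}/{quote(key)}"
    end = start + length - 1  # HTTP byte ranges are inclusive on both ends
    return Request(url, method="GET", headers={"Range": f"bytes={start}-{end}"})


# Hypothetical bucket and key, chosen only for illustration.
req = ranged_get("checkpoints", "run-42/step 1000.pt", 0, 4096)
```

Because the wire format is the same, an existing S3 client would in principle only need its endpoint URL changed to target a compatible store. Whether a given framework exposes that setting depends on the framework.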

Analysis

Pros & Cons

Pros

Genuine low-level systems engineering (io_uring, GPUDirect/RDMA, SIMD erasure coding) under Apache-2.0
S3-wire-compatible, so it drops into PyTorch, vLLM, Ray, Triton, FAISS and Milvus stacks without rewrites
A tightly scoped, clearly explained niche: the NVMe-to-VRAM data path

Cons

Very early (v0.4.0, source-build-only, no stability statement)
Effectively Linux-only (io_uring) and dependent on specialised RDMA/GPUDirect hardware to realise its claims
Headline throughput multipliers come from internal benchmarks that have not been independently verified

License

Apache-2.0 (OSI-open)

When it is interesting

When NVMe-to-GPU data loading is your training or inference bottleneck and you run RDMA/GPUDirect-capable Linux hardware.

When it is too early

For production use or commodity (non-RDMA) setups; it is pre-1.0 with niche hardware requirements.

Context

Commercial alternative & related

Commercial counterpart: VAST Data

This repo featured in the 2026-07 edition of the Open-Source AI Radar.

Similar repositories

oMLX

jundot

16.6k

macOS-native LLM inference server for Apple Silicon with continuous batching and SSD-tiered caching.

OSI-open · Local inference and "what runs on my machine"

apfel

Arthur-Ficial

5.8k

Expose the on-device Apple Intelligence model on macOS 26 as a zero-setup OpenAI-compatible local API.

OSI-open · Local inference and "what runs on my machine"

shimmy

Michael-A-Kuykendall

5.3k

Pure-Rust local inference engine with an OpenAI-compatible API, shipped as one binary.

OSI-open · Local inference and "what runs on my machine"