OSI-openLocal inference and "what runs on my machine"

apfel

Arthur-Ficial

Expose the on-device Apple Intelligence model on macOS 26 as a zero-setup OpenAI-compatible local API.

5.8k stars(as of 2026-06-14)View on GitHub Homepage

Overview

What is apfel?

apfel wraps Apple's on-device Foundation Models framework (the ~3B model shipping with macOS 26 / Tahoe) as a CLI, REPL, and OpenAI-compatible HTTP server on localhost. No model download, no API key, no cloud. Supports tool calling, MCP, JSON output and nine languages.

Analysis

Pros & Cons

Pros

Zero model download and zero cost - uses the model already baked into macOS 26
OpenAI-compatible server, so existing integrations work unchanged
MCP support and tool calling enable on-device agentic workflows

Cons

Requires macOS 26 (Tahoe), only on developer betas at the time of writing
4,096-token context window is small versus most open-weight models
Quality is bound by Apple's on-device ~3B model - not for complex reasoning

License

License

MIT (OSI-open)

When it is interesting

Developers on macOS 26 who want a fully offline, zero-cost LLM endpoint for prototyping and privacy-sensitive automation.

When it is too early

If you need macOS 15 support today, a larger context window, or stronger model quality.

Context

Commercial alternative & related

Commercial counterpart: OpenAI / Anthropic API

This repo featured in the 2026-07 edition of the Open-Source AI Radar.

Similar repositories

oMLX

jundot

macOS-native LLM inference server for Apple Silicon with continuous batching and SSD-tiered caching.

OSI-openLocal inference and "what runs on my machine"

shimmy

Michael-A-Kuykendall

Pure-Rust local inference engine with an OpenAI-compatible API, shipped as one binary.

OSI-openLocal inference and "what runs on my machine"

whichllm

Andyyyy64

CLI that detects your hardware and ranks local LLMs that will run well on it, scored against real benchmarks.

OSI-openLocal inference and "what runs on my machine"