OSI-open

Local inference and "what runs on my machine"

whichllm

Andyyyy64

CLI that detects your hardware and ranks local LLMs that will run well on it, scored against real benchmarks.

2.8k stars (as of 2026-06-05)

Overview

What is whichllm?

A CLI that detects your hardware (GPU, CPU, RAM) and ranks the local LLMs that will actually run well on it. Rankings are scored against real benchmarks (LiveBench, Artificial Analysis, Aider, Arena ELO) rather than parameter count alone.
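To make the idea concrete, here is a minimal sketch of "filter by what fits, then rank by benchmark score". It is not whichllm's code or API: the `Model` fields, the flat 20% memory overhead, the model names, and the scores are all made-up assumptions for illustration. The `~` and `?` markers follow the convention the tool is described as using for estimated and missing data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Model:
    name: str                      # hypothetical model name
    params_b: float                # parameter count, in billions
    bits_per_weight: float         # quantization width, e.g. 4.0 for a 4-bit quant
    score: Optional[float]         # benchmark score; None when no data exists
    score_estimated: bool = False  # True when the score is inferred, not measured


def estimated_memory_gb(m: Model, overhead: float = 1.2) -> float:
    """Rough footprint: weight bytes plus a flat overhead factor (an assumption)."""
    return m.params_b * m.bits_per_weight / 8 * overhead


def marker(m: Model) -> str:
    """'?' for no data, '~' for an estimated score, '' for a measured one."""
    if m.score is None:
        return "?"
    return "~" if m.score_estimated else ""


def rank(models: List[Model], vram_gb: float) -> List[Model]:
    """Keep models whose estimated footprint fits, best score first, no-data last."""
    fitting = [m for m in models if estimated_memory_gb(m) <= vram_gb]
    return sorted(fitting, key=lambda m: (m.score is None, -(m.score or 0.0)))


# Invented candidates; the numbers are placeholders, not real benchmark results.
CANDIDATES = [
    Model("model-a", params_b=7, bits_per_weight=4.0, score=60.0),
    Model("model-b", params_b=70, bits_per_weight=4.0, score=80.0),
    Model("model-c", params_b=13, bits_per_weight=4.0, score=65.0, score_estimated=True),
    Model("model-d", params_b=3, bits_per_weight=8.0, score=None),
]

ranked = rank(CANDIDATES, vram_gb=12.0)
for m in ranked:
    shown = "n/a" if m.score is None else f"{m.score:.1f}"
    print(f"{m.name:8} {shown}{marker(m)}  (~{estimated_memory_gb(m):.1f} GB)")
```

In this sketch the 70B candidate is dropped because its estimated footprint exceeds the 12 GB budget, even though it has the highest score. Changing `vram_gb` is a crude stand-in for the "which GPU should I buy" question.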

Analysis

Pros & Cons

Pros

Evidence-based ranking from multiple leaderboards, not a size heuristic
Confidence markers (~ for estimated, ? for no data) - honest about uncertainty
Scriptable JSON output, plus GPU simulation for purchase planning

Cons

Speed figures are estimates, not measured guarantees
Ollama integration needs manual HuggingFace ID mapping
Early 0.x phase (v0.5.8)

License

MIT (OSI-open)

When it is interesting

Deciding what to run, or which GPU to buy, before you commit.

When it is too early

If you need measured throughput rather than estimates.

This repo featured in the 2026-06 edition of the Open-Source AI Radar.

Similar repositories

oMLX

jundot

16.6k stars

macOS-native LLM inference server for Apple Silicon with continuous batching and SSD-tiered caching.

OSI-open

Local inference and "what runs on my machine"

apfel

Arthur-Ficial

5.8k stars

Expose the on-device Apple Intelligence model on macOS 26 as a zero-setup OpenAI-compatible local API.

OSI-open

Local inference and "what runs on my machine"

shimmy

Michael-A-Kuykendall

5.3k stars

Pure-Rust local inference engine with an OpenAI-compatible API, shipped as one binary.

OSI-open

Local inference and "what runs on my machine"