Skip to main content
AI Tool Radar
Open Source

Open-Source AI Radar: 19 Rising GitHub Repos (June 2026)

Emerging, fast-growing open-source AI projects on GitHub - with star counts, honest license labels, maturity and real caveats. June 2026 edition.

20 min read2026-06-05By Roland Hentschel
open source aigithub trendingopen weightself-hosted ailocal llm

Most "best open-source AI" lists recycle the same household names. This one does the opposite: it tracks rising, niche repositories that are growing fast right now but have not yet become the default answer. Think of it as a radar for what is gaining altitude, not a museum of what already landed.

This is the first monthly edition. The method is deliberately hybrid: a measurable shortlist from the GitHub Search API (star growth, recent activity, AI relevance), then human editorial selection and honest framing. The full methodology lives on the Open-Source Radar hub.

Update (June 7, 2026): We expanded this edition from 13 to 19 repos, adding three new clusters - computer-use agents, MCP servers and security agents - plus fresh entries in voice and document AI. The six additions are all OSI-open; their star counts are as of June 7, the original 13 as of June 5.

One thing this edition takes seriously that most lists ignore: "open source" is not one thing. A repo can be truly OSI-licensed, or "open weight" with usage restrictions, or merely source-available. Each tool below gets its real license label, not a blanket "open source" tag.

The three license tiers#

TierWhat it meansExamples in this list
OSI-openApache/MIT/BSD, free for any use incl. commercialgraphify, claude-mem, VoxCPM, zvec, OpenMemory, dograh, deta Surf, Rapid-MLX, whichllm, chrome-devtools-mcp, langextract, UI-TARS-desktop, voicebox, strix, n8n-mcp
Open weight, with conditionsWeights downloadable, but the model license adds limitschandra (model: OpenRAIL-M), supertonic (model: OpenRAIL-M)
Source-availableCode visible, but not a free-use licenseOmniVoice Studio (FSL-1.1)

All star and fork numbers below are as shown on GitHub on June 5, 2026 (the six mid-June additions: June 7, 2026). Fork-to-star ratios were healthy across the board (4:1 to 17:1), which argues against fake-star inflation.

Local inference and "what runs on my machine"#

shimmy (Michael-A-Kuykendall/shimmy) - 5.3k stars#

A pure-Rust inference engine with an OpenAI-API-compatible endpoint, shipped as a single binary: no Python, no llama.cpp. It runs on Vulkan, D3D12 and Metal, so CUDA is not required, and auto-discovers models from HuggingFace, Ollama and LM Studio.

Pros

  • Single binary, no Python or C++ toolchain
  • Broad GPU coverage without a CUDA dependency
  • Drop-in OpenAI API for local models

Cons

  • The Airframe GPU core cannot be built from source by the public - a real caveat for an 'open' tool
  • One model per server instance, no multi-model
  • MoE not yet implemented; performance claims (startup <100ms vs Ollama) are unverified project claims

License: Apache-2.0 per the badges (the README text says MIT - a genuine inconsistency worth checking before you rely on it). When it is interesting: OpenAI-API drop-in on mixed GPU hardware without Python. When it is too early: if you need to audit or build the GPU core yourself, or want multi-model serving.

Rapid-MLX (raullenchai/Rapid-MLX) - 2.7k stars#

A local OpenAI-compatible inference server for Apple Silicon built on MLX, designed to plug into coding agents like Cursor and Claude Code. It ships with tool-calling, prompt caching and 3,300+ tests.

Pros

  • Serious engineering signals: 3,300+ tests, a doctor diagnostic, broad model support
  • Clean Ollama/llama.cpp replacement on Apple Silicon
  • Apache-2.0, fully OSI-open

Cons

  • macOS / Apple Silicon only - no Linux, Windows or NVIDIA
  • Officially Beta (PyPI development status 4) despite a high version number
  • The '4.2x faster than Ollama' headline has no disclosed benchmark conditions - and PyPI states a more modest '2-4x'

License: Apache-2.0 (OSI-open). When it is interesting: Apple Silicon users running local inference for coding agents. When it is too early: any non-Apple hardware, or if you need reproducible speed guarantees rather than a marketing headline.

whichllm (Andyyyy64/whichllm) - 2.8k stars#

A CLI that detects your hardware (GPU, CPU, RAM) and ranks the local LLM that will actually run well on it, scored against real benchmarks (LiveBench, Artificial Analysis, Aider, Arena ELO) rather than parameter count alone.

Pros

  • Evidence-based ranking from multiple leaderboards, not a size heuristic
  • Confidence markers (~ for estimated, ? for no data) - honest about uncertainty
  • Scriptable JSON output, plus GPU simulation for purchase planning

Cons

  • Speed figures are estimates, not measured guarantees
  • Ollama integration needs manual HuggingFace ID mapping
  • Early 0.x phase (v0.5.8)

License: MIT (OSI-open). When it is interesting: deciding what to run, or which GPU to buy, before you commit. When it is too early: if you need measured throughput rather than estimates.

Open voice and text-to-speech#

This was the strongest cluster on the radar this month, and the one with real commercial reference points.

VoxCPM (OpenBMB/VoxCPM) - 26.1k stars#

A tokenizer-free TTS system from OpenBMB. VoxCPM2 (2B parameters) covers 30 languages including German, supports voice design from a text description (no reference audio), and streams in real time.

Pros

  • Apache-2.0 including the weights - genuinely free to use commercially
  • 30 languages with voice design and cloning
  • Dedicated inference engines with an OpenAI-compatible audio endpoint

Cons

  • Needs a GPU (~8 GB VRAM, CUDA 12+); Linux is the primary target
  • The README itself notes voice-design results vary between runs
  • Real-time factor depends heavily on hardware

License: Apache-2.0, code and weights - cleanly OSI-open. When it is interesting: self-hosters with a GPU who want true commercial freedom. When it is too early: CPU-only setups or anyone who needs a managed API.

supertonic (supertone-inc/supertonic) - 11.3k stars#

A very fast on-device TTS that runs natively through ONNX, with a compact ~99M-parameter model. It covers 31 languages, runs on CPU without a GPU, and even does browser inference via WebGPU.

Pros

  • Runs on CPU - Raspberry Pi, mobile, browser, no network needed
  • Real-time (a whole web page narrated in under a second)
  • SDKs across Python, Node, browser, Java, C++, Swift, iOS, Rust, Flutter

Cons

  • No built-in voice cloning in the open variant (fixed voice only)
  • The model is OpenRAIL-M, so this is open-weight with use restrictions, not fully OSI-open
  • 78 open issues at time of writing

License: code MIT, model OpenRAIL-M (open weight, with conditions). When it is interesting: edge, on-device or browser TTS where latency and privacy matter. When it is too early: if you need out-of-the-box cloning or want to avoid the OpenRAIL-M use clauses commercially. The natural managed upsell is the vendor's own Supertone Play/API.

OmniVoice Studio (debpalash/OmniVoice-Studio) - 6.2k stars#

A desktop app for local dictation, zero-shot voice cloning from a 3-second clip, and video dubbing - all on-device. It markets itself as "the open-source ElevenLabs alternative," and it is a real, multi-OS app with native installers.

Pros

  • Finished desktop app with native installers (macOS, Windows, Linux, Docker)
  • Fully local, no API keys, broad engine choice (CosyVoice, MLX-Audio, VoxCPM2)
  • Each release becomes Apache-2.0 two years after publication

Cons

  • Active beta (v0.3.5) - things break between releases by the maintainer's own admission
  • Despite the 'open-source' label, the license is FSL-1.1: source-available, not OSI-open
  • Non-commercial / no 'competing use' until the 2-year Apache conversion

License: FSL-1.1-ALv2 - source-available, free for private and non-commercial use only. This is the most important correction to its own "open-source" framing. When it is interesting: local, data-sovereign dubbing and dictation with no API costs. When it is too early: any commercial use without a purchased license, or production reliability.

If you want a supported, managed option instead of self-hosting a beta, the commercial original it benchmarks against is ElevenLabs:

Sponsored
E

ElevenLabs

4.6

dograh (dograh-hq/dograh) - 4.2k stars#

An open-source, self-hostable platform for building production voice agents (inbound and outbound calls) through a visual workflow builder. It positions itself as a self-hosted alternative to Vapi and Retell.

Pros

  • Self-hostable and free, no vendor lock-in versus Vapi/Retell
  • Telephony integrations (Twilio, Vonage, Telnyx, Cloudonix)
  • BSD-2-Clause, permissive and OSI-open

Cons

  • Docker setup required; hardware specs are not documented
  • Brings no engine of its own - quality depends on your own LLM/STT/TTS keys
  • Voice-agent reliability is on you to validate

License: BSD-2-Clause (OSI-open). When it is interesting: teams building voice agents who want data control and can run Docker. When it is too early: if you expect plug-and-play without your own model keys. The commercial originals here are Vapi and Retell (we have no affiliate relationship with either).

voicebox (jamiepine/voicebox) - 29.5k stars#

A local-first "voice studio" desktop app from Jamie Pine (of Spacedrive) that combines text-to-speech, zero-shot voice cloning from a few seconds of audio, and dictation (a global hotkey plus Whisper STT) - and exposes an MCP/REST server so agents can speak in a cloned voice. It runs fully on-device across macOS, Windows and Linux with seven swappable TTS engines.

Pros

  • MIT code with mostly MIT/Apache model weights - genuinely OSI-open and fully local
  • Covers both halves of the voice loop: TTS output and dictation/STT input, with native MCP integration for agents
  • Broad hardware support (Apple Silicon MLX, CUDA, ROCm, DirectML, Intel Arc, CPU)

Cons

  • Very young: repo created January 2026, v0.5.0, 433 open issues, several core features still on the roadmap
  • Voice-cloning abuse risk with no consent framework - the homepage promotes non-consenting celebrity presets (Freeman, Johansson, Obama)
  • Performance and privacy claims ('150x realtime on CPU', 'nothing leaves your device') are the project's own, unverified

License: MIT (OSI-open). When it is interesting: private, on-device TTS, cloning and dictation with agent integration. When it is too early: production use, or anywhere the cloning ethics and a four-month-old codebase are a concern. voicebox bills itself as a free, open-source alternative to ElevenLabs - the managed counterpart shown above for OmniVoice applies here too.

Agent memory and code knowledge#

claude-mem (thedotmack/claude-mem) - 80.8k stars#

A persistent memory layer across agent sessions. An observer agent automatically captures tool use and decisions, generates semantic summaries, and makes them available to future sessions, with visible token costs and <private> tags for sensitive content.

Pros

  • Apache-2.0, OSI-open
  • Installs as a Claude Code plugin in one command
  • Privacy tags and token-cost transparency built in

Cons

  • Primarily tuned for Claude Code; other agents are more marketing than support
  • Heavy dependency stack: Node 18+, Bun, SQLite, Chroma, uv
  • Version 13.x suggests a history of breaking changes

License: Apache-2.0 (OSI-open). When it is interesting: heavy Claude Code users losing context across sessions. When it is too early: if you primarily use other agents, or do not want the Bun + Chroma + uv stack. No commercial variant exists.

OpenMemory (CaviraOSS/OpenMemory) - 4.2k stars#

A local, self-hosted long-term memory store ("cognitive memory engine") for LLM apps, pitched as an alternative to RAG. It models memory in sectors (episodic, semantic, procedural) with a temporal knowledge graph and explainable recall traces.

Pros

  • Local-first, self-hosted (SQLite or Postgres), no lock-in
  • Connectors (GitHub, Notion, Drive) and migration from Mem0/Zep
  • Python and JS SDKs, integrations with LangChain, CrewAI, AutoGen, MCP

Cons

  • Smallest community here, and no release since December 2025 - the least actively maintained
  • The homepage says MIT, but it is actually Apache-2.0 (sloppy, if minor)
  • Comparison benchmarks are the project's own

License: Apache-2.0 (OSI-open). When it is interesting: a local, explainable memory layer with ready connectors. When it is too early: if active maintenance matters to you (six months without a release).

graphify (safishamsi/graphify) - 59.6k stars#

An AI-coding-assistant skill that turns a folder of code, SQL schemas, docs, PDFs and images into a queryable knowledge graph, invoked with /graphify across roughly 20 agents (Claude Code, Codex, Cursor, Gemini CLI, Aider and more).

Pros

  • Very broad agent support, no Neo4j or server needed
  • Outputs HTML visualization, JSON graph, Obsidian vault, architecture diagrams
  • MIT, OSI-open; local AST extraction via tree-sitter

Cons

  • The semantic step sends data to your agent's model API (cost and privacy)
  • Pre-1.0 (v0.8.31) - formats and APIs can still shift
  • Python 3.10+ required

License: MIT (OSI-open). When it is interesting: making a codebase or document set navigable as a graph, if you already use one of the agents. When it is too early: production-critical pipelines, given the pre-1.0 status. A commercial layer (Penpax) is on a waitlist, not yet live.

Vectors, documents and extraction#

zvec (alibaba/zvec) - 9.8k stars#

A lightweight, in-process vector database from Alibaba - a library you embed directly, not a server. It supports dense and sparse vectors, hybrid search, write-ahead logging, and bindings for Python, Node and Dart/Flutter.

Pros

  • No server or config - a real embedded vector DB
  • Multi-language bindings, runs anywhere from notebooks to edge
  • Apache-2.0, and 'battle-tested within Alibaba Group'

Cons

  • 'Billions of vectors, sub-millisecond latency' is a vendor claim, not independently verified
  • v0.4.0 - early, only 7 releases
  • C++ core may need a build toolchain on some platforms

License: Apache-2.0 (OSI-open). When it is interesting: an embedded vector DB for edge, desktop apps or local RAG. When it is too early: large production vector workloads while it is still v0.x.

chandra (datalab-to/chandra) - 11.1k stars#

An OCR/document model from the makers of Marker and Surya (Datalab). It converts images and PDFs into structured HTML, Markdown or JSON while preserving layout, across 90+ languages, including complex tables, forms and handwriting.

Pros

  • Very broad: tables, forms, handwriting, 90+ languages
  • Usable both locally (HuggingFace) and as a hosted API
  • Backed by an established team (Marker/Surya)

Cons

  • The model is Modified OpenRAIL-M: free only for research, personal use, and startups under $2M - not unrestricted OSI-open
  • A GPU is effectively required for local use
  • Benchmark claims are self-reported

License: code Apache-2.0, model Modified OpenRAIL-M (open weight, with a revenue/use condition). Worth checking carefully before commercial use. When it is interesting: demanding document digitization with a GPU or via the API. When to be careful: commercial self-use above the $2M threshold. Datalab offers a managed API (pay-per-page, $5 free credits), but we found no public affiliate program for it.

langextract (google/langextract) - 36.8k stars#

A Python library from Google that uses an LLM to pull structured information out of unstructured text, then grounds every extraction back to its exact location in the source ("source grounding") and renders an interactive HTML view. It calls no model itself - you bring a provider: Gemini (default), OpenAI, or local models via Ollama (no API key needed).

Pros

  • Apache-2.0, permissive and OSI-open, no copyleft
  • Provider-agnostic: cloud (Gemini/OpenAI/Vertex) or fully local via Ollama with no API key
  • Source-grounding and an out-of-the-box HTML visualization are a genuine differentiator

Cons

  • For cloud models it needs an external LLM API: running token costs, and your text leaves your machine (local only via Ollama)
  • The README states plainly 'this is not an officially supported Google product' - no SLA
  • Accuracy is the project's own claim and depends on the chosen model, prompt and examples

License: Apache-2.0 (OSI-open). When it is interesting: turning documents, reports or notes into structured data with traceable provenance. When it is too early: if you need a supported product with guarantees, or cannot send text to a cloud model and do not want to run Ollama locally. No commercial variant exists, and the LLM providers it needs carry no consumer affiliate program.

AI notebook#

deta Surf (deta/surf) - 3.4k stars#

A local-first "AI notebook" desktop app that pulls local files and web content (sites, YouTube, tweets, PDFs) into notebooks, with smart notes, source-linked citations, and "Surflets" - AI-generated interactive mini-apps.

Pros

  • Local-first: your data stays on your device
  • Cross-platform (macOS, Windows, Linux) with flexible model choice incl. local LLMs
  • Apache-2.0, open core on GitHub

Cons

  • Open beta (1.4.7-beta) - not final
  • No pricing communicated, so the monetization model is unclear
  • Marketing claims ('trillions of pages') are not measurable

License: Apache-2.0 (OSI-open). When it is interesting: research and note-taking with local data and free model choice. When it is too early: production-critical workflows, or if long-term cost planning matters.

Computer-use and autonomous agents#

A new cluster on the radar: tools that do not just answer, but act - driving a GUI, or probing an app like an attacker. Powerful, and genuinely riskier than the rest, so the maturity and safety caveats matter more here.

UI-TARS-desktop (bytedance/UI-TARS-desktop) - 36.2k stars#

A native desktop app (Windows/macOS, plus a browser build) for a GUI / computer-use agent: it takes a screenshot, a vision-language model reads the interface, and the agent drives mouse and keyboard from a natural-language instruction. It is powered by the open-weight UI-TARS model (e.g. UI-TARS-1.5-7B, run locally) or ByteDance's Seed series, and ships alongside Agent TARS, an MCP-based CLI/web sibling that works with any provider.

Pros

  • Both the app and the base model (UI-TARS-1.5-7B) are real Apache-2.0 - commercially free and fully self-hostable
  • Large, active community (36k+ stars) with a peer-reviewed paper behind it, and cross-platform
  • Flexible: run local or cloud, plus the Agent TARS stack with MCP and free provider choice

Cons

  • The open 7B model is Apache-2.0, but the strongest models (Doubao-1.5-UI-TARS, Seed-1.5-VL) are proprietary and paid via ByteDance's VolcEngine API - top performance means cloud lock-in, and the free remote operator was discontinued in August 2025
  • Computer use is inherently risky: an agent with full mouse/keyboard/browser control is exposed to prompt injection and misclicks - run it in a sandbox or VM
  • Pre-1.0 (v0.3.0) with 403 open issues; local hardware requirements for the 7B model are not documented

License: app and open model both Apache-2.0 (OSI-open) - but the highest-performing models are a proprietary, paid cloud backend. When it is interesting: an open, self-hostable computer-use agent for automation experiments. When it is too early: unsandboxed or production use, or if you need the top models without a paid VolcEngine plan. The commercial counterparts are Anthropic's Claude Computer Use and OpenAI's Operator (no affiliate relationship with either).

strix (usestrix/strix) - 25.9k stars#

A framework of autonomous "AI hacker" agents that test an application dynamically the way a pentester would. Each agent gets a full toolkit (HTTP proxy, Playwright browser, terminal, Python runtime, recon) and reports validated proof-of-concepts for issues like IDOR, SQL and command injection, SSRF, XSS, auth and JWT flaws, and business-logic bugs. It runs locally (Python 3.12+, a Docker sandbox, strix --target ...) and requires an external LLM key.

Pros

  • Validates findings with real proof-of-concepts rather than signature matches, which the project claims cuts false positives (its own claim)
  • Broad tool and vulnerability coverage out of the box, with multi-agent orchestration
  • Apache-2.0, local execution, CI/CD integration and provider-agnostic LLM support

Cons

  • PyPI marks it development status Alpha despite the 'v1.0' tag - treat production security gates with care
  • 'AI hackers' and 'zero false positives' are project claims; autonomous offensive tools still need human validation, and you may only test systems you own or are authorized to test
  • Agentic pentests burn a lot of tokens (running LLM cost, unquantified by the project) and require Docker

License: Apache-2.0 (OSI-open). When it is interesting: developers who want continuous, PoC-backed security testing they can self-host. When it is too early: as an unattended production gate, given the alpha status. Strix has its own commercial cloud (the open project is the lead-gen layer), but we found no affiliate program; the established AppSec counterpart is Snyk.

MCP servers for coding agents#

The Model Context Protocol became one of the fastest-moving corners of open-source AI this year: small servers that give agents real, structured capabilities instead of guesswork. Two stood out.

chrome-devtools-mcp (ChromeDevTools/chrome-devtools-mcp) - 43.0k stars#

An MCP server from the Chrome DevTools team that gives coding agents (Claude Code, Cursor, Copilot, Codex and more) control of a real, running Chrome instance via Puppeteer, plus DevTools-grade inspection: performance traces, network and console analysis, source-mapped stack traces and screenshots. Install with npx chrome-devtools-mcp@latest; it needs Google Chrome or Chrome for Testing.

Pros

  • Official Chrome DevTools team project with very high adoption (43k stars, ~2.76M weekly npm downloads, daily commits)
  • Gives agents real DevTools powers - performance traces and network/console inspection, not just click automation
  • Broad client support (Claude Code as CLI and plugin, Cursor, Copilot, Codex, Cline) and Apache-2.0 with no usage strings

Cons

  • By design it exposes all browser content to the MCP client, including potentially sensitive data - secure it deliberately
  • Telemetry is on by default (usage stats; performance tools can send trace URLs to Google's CrUX API) - opt out with flags
  • Officially only Google Chrome / Chrome for Testing; other Chromium browsers are not guaranteed. 'Reliable automation' is the project's claim, not an independent benchmark

License: Apache-2.0 (OSI-open). When it is interesting: giving a coding agent real browser debugging and performance insight. When it is too early: sensitive environments where exposing browser content to an external client is unacceptable without hardening. Free tool, no commercial variant.

n8n-mcp (czlonkowski/n8n-mcp) - 21.6k stars#

An MCP server (an independent project, not by n8n) that gives an AI assistant structured access to n8n's node documentation, properties and operations, so the model can build, validate and deploy n8n workflows correctly. It exposes tools like search_nodes and validate_workflow, is self-hostable (npx/Docker/Railway), and optionally connects to your own n8n instance for management. Clients include Claude Desktop and Code, Cursor, Windsurf, VS Code and Codex.

Pros

  • Clean MIT license on the server code (the n8n Sustainable Use License applies to the n8n platform, not this server)
  • Very actively maintained and widely adopted (21.6k stars, daily commits, a 6:1 star/fork ratio)
  • Broad client support, plus a zero-setup hosted option for trying it out

Cons

  • Most value depends on having an n8n instance (with its own Sustainable Use License) - without it you only get the docs and validation tools
  • The hosted variant is freemium (free tier capped at 100 tool calls/day); the open server stays free, but convenience is monetized
  • Coverage metrics (1,851 nodes, 99% property coverage, 5,418 passing tests) are the project's own claims

License: MIT (OSI-open). When it is interesting: building or debugging n8n automations from inside a coding agent. When it is too early: if you do not already run n8n, since the deploy features need an instance. The commercial counterpart is n8n itself - its cloud plans are where a managed alternative lives.

Info

Methodology and what this post does not cover. Candidates came from the GitHub Search API (young repos, star growth, AI relevance), then editorial selection. Star and fork counts are as displayed on GitHub on June 5, 2026 (the six mid-June additions on June 7, 2026), and are not independently audited. We deliberately excluded the well-known incumbents (Ollama, ComfyUI, vLLM, llama.cpp) - this radar is about what is rising, not what already won. Performance figures attributed to a project (e.g. "4.2x faster") are that project's own claims, not our measurements. Affiliate disclosure: this post has no affiliate relationship with any of the open-source repos listed; the only commercial link is to ElevenLabs, shown as the managed alternative to a self-hosted voice tool. Licenses and maturity change fast, so verify each repo's current LICENSE and release status before relying on it.

The next edition follows next month. If a repo here matures or commercializes, we will track that too - early coverage is the point of a radar.


Roland Hentschel

Roland Hentschel

AI & Web Technology Expert

Web developer and AI enthusiast helping businesses navigate the rapidly evolving landscape of AI tools. Testing and comparing tools so you don't have to.

Tools Covered in This Post

More from the Blog