forge
antoinezambelli
Python reliability layer that makes self-hosted and small LLMs dependable at tool-calling, with guardrails.
What is forge?
A Python reliability layer for self-hosted LLM tool-calling: you supply the tools and the model invokes them in any order. Guardrails include rescue parsing of malformed tool calls, retry and error tracking, response validation, context compaction, and a synthetic respond tool for small models. It deploys as a transparent OpenAI/Anthropic-compatible proxy, a workflow runner, or composable middleware, and it states explicitly that it is not a full agent orchestrator.
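To make the guardrail idea concrete, here is a minimal sketch of rescue parsing combined with retry and error tracking. It is not forge's API: the `TOOLS` registry, `rescue_parse`, `run_with_retries`, and the `{"name": ..., "arguments": {...}}` call shape are all hypothetical names assumed for illustration, and the model is stood in for by any callable that maps a prompt to text.

```python
import json
import re
from typing import Callable, Optional

# Hypothetical tool registry; the names are illustrative, not forge's API.
TOOLS: dict[str, Callable[..., str]] = {
    "add": lambda a, b: str(a + b),
}


def rescue_parse(raw: str) -> Optional[dict]:
    """Try to recover a {"name": ..., "arguments": {...}} call from messy output."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span, e.g. JSON wrapped in prose.
        match = re.search(r"\{.*\}", raw, re.DOTALL)
        if match is None:
            return None
        try:
            call = json.loads(match.group(0))
        except json.JSONDecodeError:
            return None
    # Validate the shape before trusting it: known tool, dict of arguments.
    if not isinstance(call, dict) or call.get("name") not in TOOLS:
        return None
    if not isinstance(call.get("arguments"), dict):
        return None
    return call


def run_with_retries(model: Callable[[str], str], prompt: str, max_retries: int = 2) -> str:
    """Call the model, rescue-parse its reply, and retry with a corrective hint."""
    errors = []
    for attempt in range(max_retries + 1):
        call = rescue_parse(model(prompt))
        if call is None:
            errors.append(f"attempt {attempt}: unparseable tool call")
            prompt += "\nYour last reply was not a valid tool call. Reply with JSON only."
            continue
        return TOOLS[call["name"]](**call["arguments"])
    raise RuntimeError("; ".join(errors))
```

In this sketch a reply such as `Sure! {"name": "add", "arguments": {"a": 1, "b": 2}}` is still recovered, while a reply with no JSON at all triggers a retry with a corrective hint appended to the prompt.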
Pros & Cons
Pros
- Targets a real niche: making small and self-hosted models reliable at tool-calling
- Strong engineering signals: 865 unit tests, an eval harness and three deployment modes
- Honest scoping: it openly says it is guardrails middleware, not a full orchestrator
Cons
- A narrow remit (a single agentic loop); the 'multi-step workflows' framing oversells a reliability layer
- Headline accuracy gains are self-reported on the author's own benchmark and have not been independently verified
- Python 3.12+ and a self-hosted backend raise the setup bar
License
MIT (OSI-open)
When it is interesting
Running self-hosted or small LLMs and needing dependable tool-calling without adopting a heavy agent framework.
When it is too early
If your decision depends on the cited accuracy numbers holding up, or you want full multi-agent orchestration.
This repo featured in the 2026-07 edition of the Open-Source AI Radar.
Nanobot
HKUDS
Lightweight personal AI agent for tools, chats and workflows - one binary, multi-channel.
DeepTutor
HKUDS
Agent-native personalized learning workspace - tutoring, quizzes, research and RAG knowledge bases.
OpenFang
RightNow-AI
Open-source Agent Operating System in Rust - autonomous capabilities, 40 channels, WASM sandbox.