AI Tool Radar
OSI-open | Agent frameworks and runtimes

forge

antoinezambelli

Python reliability layer that makes self-hosted and small LLMs dependable at tool-calling, with guardrails.

2.1k stars (as of 2026-06-26)

What is forge?

A Python reliability layer for tool-calling with self-hosted LLMs. You supply the tools and the model invokes them in any order, while forge adds guardrails: rescue parsing of malformed tool calls, retry and error tracking, response validation, a synthetic respond tool for small models, and context compaction. It deploys as a transparent OpenAI/Anthropic-compatible proxy, as a workflow runner, or as composable middleware, and it states explicitly that it is not a full agent orchestrator.
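To make two of those guardrails concrete, here is a minimal sketch of what rescue parsing and retry-with-error-feedback can look like in general. It is not forge's code or API: the names TOOLS, rescue_parse, validate and run_with_retries are hypothetical, and the {"name": ..., "arguments": {...}} call shape is an assumption made for illustration.

```python
from __future__ import annotations

import json
import re
from typing import Any, Callable

# Hypothetical tool registry; illustrative only, not forge's API.
TOOLS: dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
}


def rescue_parse(raw: str) -> dict[str, Any] | None:
    """Try to recover a tool-call object from messy model output."""
    try:
        obj = json.loads(raw)  # strict parse first
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span, e.g. JSON wrapped in prose.
        match = re.search(r"\{.*\}", raw, re.DOTALL)
        if match is None:
            return None
        # Drop trailing commas before a closing brace or bracket.
        candidate = re.sub(r",\s*([}\]])", r"\1", match.group(0))
        try:
            obj = json.loads(candidate)
        except json.JSONDecodeError:
            return None
    return obj if isinstance(obj, dict) else None


def validate(call: dict[str, Any] | None) -> str | None:
    """Return an error message, or None if the call looks usable."""
    if call is None:
        return "could not parse a tool call"
    name = call.get("name")
    if not isinstance(name, str) or name not in TOOLS:
        return f"unknown tool: {name!r}"
    if not isinstance(call.get("arguments"), dict):
        return "arguments must be an object"
    return None


def run_with_retries(generate: Callable[[str | None], str], max_retries: int = 2) -> Any:
    """Call generate(last_error) until a valid tool call is produced, then run it."""
    error: str | None = None
    for _ in range(max_retries + 1):
        call = rescue_parse(generate(error))
        error = validate(call)
        if error is None:
            return TOOLS[call["name"]](**call["arguments"])
    raise RuntimeError(f"tool call failed after retries: {error}")
```

In this sketch, generate stands in for a model call that receives the previous error message (or None on the first attempt), so a small model gets a chance to correct itself instead of the loop failing on the first malformed reply.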

Pros & Cons

Pros

  • Targets a real niche: making small and self-hosted models reliable at tool-calling
  • Strong engineering signals: 865 unit tests, an eval harness and three deployment modes
  • Honest scoping: it openly says it is guardrails middleware, not a full orchestrator

Cons

  • A narrow remit (a single agentic loop); the 'multi-step workflows' framing oversells a reliability layer
  • Headline accuracy gains are self-reported on the author's own benchmark and have not been independently verified
  • Python 3.12+ and a self-hosted backend raise the setup bar

License

MIT (OSI-open)

When it is interesting

Running self-hosted or small LLMs and needing dependable tool-calling without adopting a heavy agent framework.

When it is too early

If your decision depends on the cited accuracy numbers, which are self-reported, or if you want full multi-agent orchestration.

This repo featured in the 2026-07 edition of the Open-Source AI Radar.