AI Tool Radar
OSI-open
Local inference and "what runs on my machine"

needle

cactus-compute

26M-parameter open-weights model for single-shot function calling on phones, watches and glasses.

2.6k stars (as of 2026-06-26)

What is needle?

A 26-million-parameter 'Simple Attention Network' for single-shot function and tool calling on resource-constrained devices like phones, watches and glasses. It takes a user query plus JSON tool schemas and emits the matching function call, and ships with weights, a dataset-generation pipeline, a CLI, a Python library and a web playground.
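To make the input and output shape concrete, here is a minimal sketch of the "query plus JSON tool schemas in, function call out" contract. It does not use needle's actual API: the schema layout (the common JSON-Schema style), the `name`/`arguments` output keys, and the helper `validate_call` are all assumptions for illustration, and the model's output is stood in for by a hard-coded string.

```python
import json

# Hypothetical tool schemas in the widely used JSON-Schema style.
# needle's exact schema format may differ; this is an assumption.
TOOLS = [
    {
        "name": "set_timer",
        "description": "Start a countdown timer.",
        "parameters": {
            "type": "object",
            "properties": {"minutes": {"type": "integer"}},
            "required": ["minutes"],
        },
    },
    {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
]

# Map JSON-Schema primitive type names to Python types.
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate_call(raw_call: str, tools: list) -> dict:
    """Parse an emitted call and check it against the declared tool schemas."""
    call = json.loads(raw_call)
    schema = next((t for t in tools if t["name"] == call.get("name")), None)
    if schema is None:
        raise ValueError(f"unknown tool: {call.get('name')!r}")

    params = schema["parameters"]
    args = call.get("arguments", {})

    missing = [key for key in params.get("required", []) if key not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")

    for key, value in args.items():
        spec = params["properties"].get(key)
        if spec is None:
            raise ValueError(f"unexpected argument: {key!r}")
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if isinstance(value, bool) and spec["type"] != "boolean":
            raise ValueError(f"argument {key!r} should be {spec['type']}")
        if not isinstance(value, JSON_TYPES[spec["type"]]):
            raise ValueError(f"argument {key!r} should be {spec['type']}")
    return call


if __name__ == "__main__":
    query = "set a timer for 10 minutes"
    # Stand-in for what a single-shot function-calling model might emit.
    model_output = '{"name": "set_timer", "arguments": {"minutes": 10}}'
    print(query, "->", validate_call(model_output, TOOLS))
```

Validating the emitted call before dispatching it is a design choice worth keeping with any small model: a single-shot caller has no second turn in which to correct a wrong tool name or a missing argument.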

Pros & Cons

Pros

  • Code and weights are both MIT-licensed with no extra use conditions, which is rare for an on-device model
  • Tiny (26M params), so it can run on phones, watches and glasses; the weights and the dataset-generation pipeline are both open
  • Complete tooling out of the box: CLI, Python API, web playground and local finetuning on consumer Mac/PC

Cons

  • At 26M params it is a narrow single-shot function-caller, not conversational or general-purpose
  • Headline speed and benchmark-win numbers are unverified vendor claims, some measured on Cactus's own hardware
  • No formal releases or versioning, and explicitly described as an 'experimental run'

License

MIT (OSI-open) - model license: MIT

Both the code and the model weights are MIT (verified against the LICENSE file and the Hugging Face model card), with no extra use conditions - unusually clean for an on-device model.

When it is interesting

Ultra-cheap, fully open, finetunable on-device tool calling on constrained hardware like wearables.

When it is too early

If you need conversation, multi-turn reasoning, or a stable versioned release.

This repo featured in the 2026-07 edition of the Open-Source AI Radar.