agent-device
callstack
CLI that lets AI agents drive and verify real iOS, Android, desktop and TV apps via accessibility-tree snapshots.
What is agent-device?
A CLI that lets AI agents drive and verify real apps across iOS, Android, tvOS, Android TV, macOS and Linux desktop, including apps built with React Native, Expo and Flutter. It exposes accessibility-tree snapshots with stable element refs and semantic selectors, both designed to fit in LLM context. It handles taps, typing, scrolling and gestures, captures evidence (screenshots, video, logs, network traffic, traces), and records and replays .ad scripts for CI.
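To make the idea of "accessibility-tree snapshots with stable element refs and semantic selectors" concrete, here is a minimal conceptual sketch in Python. It is not agent-device's actual API or output format: the `Node` class, the `@e1`-style refs, and the `walk`, `render` and `find` helpers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One element of a hypothetical accessibility tree."""
    role: str
    label: str = ""
    children: List["Node"] = field(default_factory=list)


def walk(root: Node) -> List[Tuple[str, int, Node]]:
    """Return (ref, depth, node) triples, numbering nodes depth-first as e1, e2, ..."""
    entries: List[Tuple[str, int, Node]] = []

    def visit(node: Node, depth: int) -> None:
        entries.append((f"e{len(entries) + 1}", depth, node))
        for child in node.children:
            visit(child, depth + 1)

    visit(root, 0)
    return entries


def render(root: Node) -> str:
    """Render the tree as compact indented text, one element per line."""
    return "\n".join(
        f'{"  " * depth}@{ref} {node.role} "{node.label}"'
        for ref, depth, node in walk(root)
    )


def find(root: Node, role: str, label: str) -> Optional[str]:
    """Semantic lookup: return the ref of the first node matching role and label."""
    for ref, _, node in walk(root):
        if node.role == role and node.label == label:
            return ref
    return None


# Illustrative screen, not taken from any real app.
screen = Node("window", "Login", [
    Node("textfield", "Email"),
    Node("textfield", "Password"),
    Node("button", "Sign in"),
])

print(render(screen))
print(find(screen, "button", "Sign in"))
```

In this sketch the refs stay the same only while the tree itself is unchanged. The text rendering is far smaller than a screenshot, which is the property that makes this style of snapshot practical to place in an LLM prompt, and an agent can name its target by ref or by role and label instead of by screen coordinates.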
Pros & Cons
Pros
- Actively maintained by Callstack (v0.17.10, 100+ releases) with broad real-device coverage
- LLM-first design: stable accessibility refs and semantic selectors, MIT-licensed
- One workflow spans mobile, desktop and TV plus React Native, Expo and Flutter
Cons
- Still pre-1.0 (0.17.x), so the CLI and API surface can shift
- Heavy local prerequisites (Xcode, Android SDK/ADB, Node 22/24+), not zero-config
- A separate paid 'agent-device Cloud' offering exists
- Replay auto-healing is labelled experimental
License
MIT (OSI-open)
When it is interesting
Building AI agents that must operate or verify real mobile, desktop or TV apps, including in CI.
When it is too early
If you need a frozen, stable API, or would depend on the experimental auto-healing replay for production-critical work.
Commercial alternative & related
- Commercial counterpart: BrowserStack
This repo was featured in the 2026-07 edition of the Open-Source AI Radar.
UI-TARS-desktop
bytedance
Native desktop app for a GUI/computer-use agent powered by the open-weight UI-TARS model.
strix
usestrix
Framework of autonomous AI hacker agents for dynamic application security testing.
Page Agent
alibaba
In-page JavaScript GUI agent: control any webpage with natural language, no headless browser or extension.