AI Tool Radar
OSI-open | Computer-use and autonomous agents

Page Agent

alibaba

In-page JavaScript GUI agent - control any webpage with natural language, no headless browser or extension.

18.5k stars (as of 2026-06-14)

What is Page Agent?

Page Agent is a client-side TypeScript library that drops into any webpage and lets LLMs control the UI via text-based DOM manipulation - no Python, no headless browser, no extension required. An optional Chrome extension enables multi-tab workflows, and a beta MCP server enables agent integration.
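
To make "text-based DOM manipulation" concrete, here is a minimal sketch of the general idea: flatten the interactive elements of a page into an indexed text snapshot an LLM can read, then apply an action the model returns by index. This is not Page Agent's actual API. `UINode`, `Action`, `snapshot`, and `applyAction` are hypothetical names, and a plain object tree stands in for the real DOM so the sketch runs outside a browser.

```typescript
// Hypothetical stand-in for a DOM node (assumption: not a Page Agent type).
interface UINode {
  tag: string;
  text?: string;
  value?: string;
  children?: UINode[];
}

// Hypothetical action format an LLM might be asked to return.
type Action =
  | { kind: "click"; index: number }
  | { kind: "type"; index: number; text: string };

const INTERACTIVE_TAGS = ["a", "button", "input", "select", "textarea"];

// Collect interactive nodes in document order (depth-first).
function collectInteractive(root: UINode, out: UINode[] = []): UINode[] {
  if (INTERACTIVE_TAGS.indexOf(root.tag) !== -1) out.push(root);
  for (const child of root.children ?? []) collectInteractive(child, out);
  return out;
}

// Render the indexed text snapshot that would be sent to the model.
function snapshot(root: UINode): string {
  return collectInteractive(root)
    .map((node, i) => {
      const label = node.text ?? node.value ?? "";
      return label ? `[${i}] <${node.tag}> ${label}` : `[${i}] <${node.tag}>`;
    })
    .join("\n");
}

// Apply an action by index and return a short log line.
function applyAction(root: UINode, action: Action): string {
  const target = collectInteractive(root)[action.index];
  if (!target) {
    throw new Error(`No interactive element at index ${action.index}`);
  }
  if (action.kind === "type") {
    target.value = action.text;
    return `typed "${action.text}" into <${target.tag}>`;
  }
  return `clicked <${target.tag}>`;
}

// Usage: a tiny search form; the paragraph is skipped because it is not interactive.
const page: UINode = {
  tag: "form",
  children: [
    { tag: "input", value: "" },
    { tag: "p", text: "Type a product name" },
    { tag: "button", text: "Search" },
  ],
};

console.log(snapshot(page));
console.log(applyAction(page, { kind: "type", index: 0, text: "shoes" }));
```

An index-based snapshot like this keeps the prompt small and gives the model stable handles to refer to, but it only sees elements that exist as DOM nodes.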

Pros & Cons

Pros

  • Zero server-side infrastructure - runs entirely in-page, deployable as a script tag
  • 32 versioned releases and active CI/CD suggest a disciplined release process
  • Bring-your-own-LLM design avoids API lock-in

Cons

  • Text-based DOM approach may struggle on canvas-heavy or very dynamic SPAs
  • MCP server is still beta
  • Alibaba origin may raise supply-chain concerns in some Western orgs

License

MIT (OSI-open)

When it is interesting

Embedding a natural-language copilot directly in a web product without backend infrastructure.

When it is too early

You need reliable multi-page orchestration - multi-tab flows require the beta extension.

Commercial alternative & related

  • Commercial counterpart: Anthropic Computer Use / Browserbase

This repo featured in the 2026-07 edition of the Open-Source AI Radar.