AI Tool Radar
OSI-open | Vectors, documents and extraction

PixelRAG

StarTrail-org

Pixel-native retrieval: renders documents as screenshots and searches over the images with a vision embedding model.

5.4k stars (as of 2026-06-26)

What is PixelRAG?

A retrieval system that renders documents (web pages, PDFs, images) as screenshots and retrieves over the images directly with a fine-tuned vision-language embedding model, instead of parsing HTML or text. It ships a CLI, a Claude Code plugin, and a hosted search API over a pre-built index of millions of Wikipedia pages. It comes from the same Berkeley lab (StarTrail-org) as LEANN.
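To make the retrieval idea concrete, here is a minimal sketch of nearest-neighbour search over page-screenshot embeddings. It is not PixelRAG's API: the function names (`cosine`, `top_k`), the file names, and the three-dimensional toy vectors are all hypothetical. A real system would get its vectors from a vision-language embedding model and would likely use a FAISS index rather than a linear scan.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Vector,
          index: List[Tuple[str, Vector]],
          k: int = 3) -> List[Tuple[str, float]]:
    """Return the k screenshots whose embeddings are most similar to the query."""
    scored = [(page_id, cosine(query_vec, vec)) for page_id, vec in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index: hypothetical screenshot names with made-up embeddings.
# In practice each vector would come from embedding a rendered page image.
index = [
    ("page-a.png", [1.0, 0.0, 0.0]),
    ("page-b.png", [0.0, 1.0, 0.0]),
    ("page-c.png", [0.7, 0.7, 0.0]),
]

# Made-up query embedding; a real one would come from the same model.
results = top_k([0.9, 0.1, 0.0], index, k=2)
for page_id, score in results:
    print(f"{page_id}\t{score:.3f}")
```

The point of the sketch is that the query is matched against embeddings of rendered pages, so nothing in the pipeline depends on extracting text from the HTML or PDF.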

Pros & Cons

Pros

  • Apache-2.0 with code, FAISS indexes and LoRA adapter weights published openly, not just an API
  • Credible authorship: the same Berkeley lab behind the already-trusted LEANN
  • Usable today with a one-line install plus a live hosted API and a Claude Code plugin

Cons

  • Very young (v0.3.0, ~28 days old), so the API and CLI surface will likely churn
  • Headline accuracy and cost numbers (e.g. +18% over text RAG) are unverified project claims
  • Local indexing needs a GPU and a heavy vision model; the convenient path leans on the hosted endpoint

License

Apache-2.0 (OSI-open)

When it is interesting

Retrieval over visually rich documents (tables, charts, layouts) where HTML-to-text parsing loses signal.

When it is too early

If you need a stable, versioned, fully self-hosted retrieval stack rather than a fast-moving v0.x plus a hosted index.

This repo featured in the 2026-07 edition of the Open-Source AI Radar.