ProofShot - Visual proof for AI-built code

March 24, 2026 Source

Summary

ProofShot is an open-source CLI tool that records video, screenshots, and error logs while an AI agent interacts with a browser, then packages everything into a standalone HTML proof artifact. The HN discussion suggests that the real value lies less in ProofShot itself than in the broader shift from “generate UI” to “validate UI”. Most commenters already get similar results with Playwright MCP or Chrome DevTools MCP, but a gap remains in visual and semantic validation, which DOM assertions cannot cover.

Key Insight

The tool’s core loop: launch dev server, open headless Chromium, record agent actions, trim dead time, output a standalone HTML viewer with synced video + action timeline + screenshots.
HN consensus: Playwright already does most of this. Multiple commenters (onion2k, theshrike79, mohsen1, vunderba) confirm they give Claude Code or Cursor direct Playwright/browser access and it works well for debugging layout issues, console errors, and even GLSL shaders.
Chrome DevTools MCP vs Playwright MCP: boomskats prefers DevTools MCP (lighter, shorter loop, works with Electron); nunodonato says DevTools MCP clutters context and prefers playwright-cli (not MCP) for efficiency.
Playwright MCP has screenshots built in (roxolotl), and Anthropic added a /plugins shortcut in Claude Code for easy Playwright MCP setup (vunderba).
The real gap is visual/semantic validation (mrothroc’s key insight): Playwright checks structural DOM properties, but it cannot tell you whether the page looks right. Using a multimodal LLM to evaluate screenshots against design mocks catches a completely different class of error (wrong colors, shifted layout, overlapping components). These are “stochastic gates” as opposed to “structural gates”: the two overlap very little, so you want both.
Desktop apps and mobile are unsolved: alkonaut points out that without a DOM (e.g., drawing applications), screenshot diffing is your only option. For mobile, deepwalker.xyz and accessibility API XML dumps were mentioned as alternatives. iOS simulator tooling is still lacking.
PR workflow use case (sd9): attaching visual proof to PRs is valuable even for human-written code. One suggested integration is with GitLab MRs, where an agent screenshots any open PR and adds the screenshot as a comment.
Mozilla pioneered screenshot-based regression testing for Gecko about 25 years ago. The concept is the same, now enhanced by vision models that analyze semantics instead of diffing pixels.
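
The "trim dead time" step in the core loop can be illustrated with a small sketch. This is not ProofShot's actual algorithm, which the discussion does not describe. It assumes the recorder has a list of action timestamps in seconds and keeps a padded window of video around each one. The names keep_intervals, trimmed_time, and the pad parameter are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def keep_intervals(action_times: List[float], duration: float,
                   pad: float = 1.0) -> List[Interval]:
    """Return merged (start, end) windows of video to keep around each action.

    Assumption: anything farther than `pad` seconds from an action is dead time.
    """
    windows: List[Interval] = []
    for t in sorted(action_times):
        start = max(0.0, t - pad)
        end = min(duration, t + pad)
        if windows and start <= windows[-1][1]:
            # Overlaps the previous window, so extend it instead of adding a new one.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


def trimmed_time(t: float, windows: List[Interval]) -> float:
    """Map an original timestamp to its position on the trimmed timeline."""
    elapsed = 0.0
    for start, end in windows:
        if t < start:
            # t fell inside a cut region; snap to the start of the next kept window.
            return elapsed
        if t <= end:
            return elapsed + (t - start)
        elapsed += end - start
    return elapsed


if __name__ == "__main__":
    kept = keep_intervals([2.0, 2.5, 10.0], duration=12.0, pad=1.0)
    print(kept)
    print(trimmed_time(10.0, kept))
```

Mapping timestamps onto the trimmed timeline is what would keep an action timeline in sync with the shortened video, which is why the sketch returns intervals and not just a total duration.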
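
To make the difference between a structural gate and a visual gate concrete, here is a minimal sketch. It deliberately uses plain pixel diffing (the older baseline technique) as a stand-in for the visual gate, not a multimodal LLM, because no specific model API is named in the discussion. The DOM is modeled as a hypothetical dictionary and images as nested lists of RGB tuples; none of these names come from Playwright or ProofShot.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def structural_gate(dom: Dict[str, dict], required_ids: List[str]) -> bool:
    """Pass if every required element id exists and is marked visible."""
    return all(dom.get(i, {}).get("visible", False) for i in required_ids)


def pixel_diff_ratio(a: Image, b: Image, tolerance: int = 0) -> float:
    """Fraction of pixels with any channel differing by more than `tolerance`."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have identical dimensions")
    total = sum(len(row) for row in a)
    changed = sum(
        1
        for ra, rb in zip(a, b)
        for pa, pb in zip(ra, rb)
        if any(abs(ca - cb) > tolerance for ca, cb in zip(pa, pb))
    )
    return changed / total if total else 0.0


def visual_gate(baseline: Image, candidate: Image, max_ratio: float = 0.01) -> bool:
    """Pass if the candidate screenshot stays within the allowed diff ratio."""
    return pixel_diff_ratio(baseline, candidate) <= max_ratio


if __name__ == "__main__":
    blue, red = (0, 0, 255), (255, 0, 0)
    dom = {"submit": {"visible": True}}          # structure is intact
    baseline = [[blue, blue], [blue, blue]]      # expected rendering
    candidate = [[blue, red], [blue, blue]]      # one pixel has the wrong color
    print(structural_gate(dom, ["submit"]))
    print(visual_gate(baseline, candidate))
```

In this toy case the structural check passes while the visual check fails, which is the non-overlap described above. A vision model would replace pixel_diff_ratio when the question is whether the page matches a design mock, not whether it matches a previous screenshot exactly.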