ProofShot - Visual proof for AI-built code
Summary
ProofShot is an open-source CLI tool that records video, screenshots, and error logs while an AI agent interacts with a browser, then packages everything into a standalone HTML proof artifact. The Hacker News (HN) discussion suggests the real value lies less in ProofShot itself than in the broader shift from “generate UI” to “validate UI”. Most commenters already get similar results with Playwright MCP or Chrome DevTools MCP, but a gap remains in visual/semantic validation, which DOM assertions cannot cover.
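To make "standalone HTML proof artifact" concrete, here is a minimal sketch that assumes the assets are inlined as base64 data URIs so the file has no external dependencies. It is not ProofShot's actual implementation: `ProofBundle` and `render_standalone_html` are hypothetical names, and the video and action timeline are left out for brevity.

```python
import base64
import html
from dataclasses import dataclass, field


@dataclass
class ProofBundle:
    """Hypothetical container for the artifacts of one recorded session."""
    title: str
    screenshots: list[bytes] = field(default_factory=list)   # raw PNG bytes
    console_errors: list[str] = field(default_factory=list)  # captured log lines


def render_standalone_html(bundle: ProofBundle) -> str:
    """Return a single HTML document with every asset inlined."""
    # Each screenshot becomes a data URI, so no image files ship alongside.
    images = "\n".join(
        '<img alt="step {}" src="data:image/png;base64,{}">'.format(
            index + 1, base64.b64encode(png).decode("ascii")
        )
        for index, png in enumerate(bundle.screenshots)
    )
    # Escape log text so error messages cannot inject markup into the viewer.
    errors = "\n".join(
        "<li>{}</li>".format(html.escape(message))
        for message in bundle.console_errors
    )
    return (
        "<!doctype html>\n"
        '<html><head><meta charset="utf-8"><title>{title}</title></head>\n'
        "<body>\n<h1>{title}</h1>\n{images}\n<ul>\n{errors}\n</ul>\n"
        "</body></html>\n"
    ).format(title=html.escape(bundle.title), images=images, errors=errors)
```

The resulting string can be written to a single `.html` file and attached to a PR or sent to a reviewer as-is.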
Key Insight
- The tool’s core loop: launch dev server, open headless Chromium, record agent actions, trim dead time, output a standalone HTML viewer with synced video + action timeline + screenshots.
- HN consensus: Playwright already does most of this. Multiple commenters (onion2k, theshrike79, mohsen1, vunderba) confirm they give Claude Code or Cursor direct Playwright/browser access and it works well for debugging layout issues, console errors, and even GLSL shaders.
- Chrome DevTools MCP vs Playwright MCP: boomskats prefers DevTools MCP (lighter, shorter loop, works with Electron); nunodonato says DevTools MCP clutters context and prefers playwright-cli (not MCP) for efficiency.
- Playwright MCP has screenshots built in (roxolotl), and Anthropic added a /plugins shortcut in Claude Code for easy Playwright MCP setup (vunderba).
- The real gap is visual/semantic validation (mrothroc’s key insight): Playwright checks structural DOM properties; it cannot tell you if the page looks right. Using a multimodal LLM to evaluate screenshots against design mocks catches a completely different class of error (wrong colors, shifted layout, overlapping components). These are “stochastic gates” vs. “structural gates” - very little overlap, you want both.
- Desktop apps and mobile are unsolved: alkonaut points out that without a DOM (e.g., drawing applications), screenshot diffing is your only option. For mobile, deepwalker.xyz and accessibility API XML dumps were mentioned as alternatives. iOS simulator tooling is still lacking.
- PR workflow use case (sd9): attaching visual proof to PRs is valuable even for human-written code. One integration mentioned is with GitLab MRs, where an agent screenshots any open MR and adds the screenshot as a comment.
- Mozilla pioneered screenshot-based regression testing for Gecko ~25 years ago - same concept, now enhanced by vision models for semantic analysis instead of pixel diffing.
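The "structural gate" versus "stochastic gate" distinction above can be sketched as two independent checks. This is an illustration under stated assumptions, not code from the discussion: the DOM is reduced to a hypothetical dict of selector to properties, screenshots are plain lists of RGB tuples, and `pixel_diff_ratio` stands in for the visual judge. In the approach described by commenters, that judge would be a multimodal LLM comparing a screenshot against a design mock, which is why it is passed in as a callable.

```python
from typing import Callable, Sequence

Pixel = tuple[int, int, int]
Image = Sequence[Pixel]


def structural_gate(dom: dict[str, dict], required: Sequence[str]) -> list[str]:
    """Deterministic check: each required selector exists and is visible."""
    failures = []
    for selector in required:
        node = dom.get(selector)
        if node is None:
            failures.append(f"{selector}: missing")
        elif not node.get("visible", False):
            failures.append(f"{selector}: not visible")
    return failures


def pixel_diff_ratio(actual: Image, expected: Image, tolerance: int = 0) -> float:
    """Fraction of pixels whose largest channel difference exceeds tolerance."""
    if len(actual) != len(expected):
        raise ValueError("images must have the same number of pixels")
    if not actual:
        return 0.0
    changed = sum(
        1
        for a, e in zip(actual, expected)
        if max(abs(x - y) for x, y in zip(a, e)) > tolerance
    )
    return changed / len(actual)


def visual_gate(
    actual: Image,
    expected: Image,
    judge: Callable[[Image, Image], float],
    threshold: float = 0.01,
) -> list[str]:
    """Fail when the judge's mismatch score exceeds the threshold."""
    score = judge(actual, expected)
    if score <= threshold:
        return []
    return [f"visual mismatch score {score:.3f} exceeds {threshold}"]


if __name__ == "__main__":
    # The DOM is structurally fine, but half the pixels have the wrong color.
    dom = {"#submit": {"visible": True}}
    mock = [(255, 255, 255)] * 4
    shot = [(255, 255, 255)] * 2 + [(255, 0, 0)] * 2
    print(structural_gate(dom, ["#submit"]))
    print(visual_gate(shot, mock, pixel_diff_ratio))
```

In the demo, the structural gate reports no failures while the visual gate does, which is the "very little overlap" point: a page can pass every DOM assertion and still look wrong. Swapping `pixel_diff_ratio` for a vision-model judge changes what counts as a mismatch without changing the gate's shape.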