# ProofShot - Visual proof for AI-built code

> ProofShot records video and screenshots while AI agents interact with browsers, packaging everything into standalone HTML proof artifacts.

Published: 2026-03-24
URL: https://daniliants.com/insights/proofshot-visual-proof-for-ai-built-code/
Tags: ai-coding, visual-testing, playwright, browser-automation, proof-artifacts, code-review, ui-validation

---

## Summary

ProofShot is an open-source CLI tool that records video, screenshots, and error logs while an AI agent interacts with a browser, packaging everything into a standalone HTML proof artifact. The HN discussion suggests the real value lies not in ProofShot itself but in the broader shift from "generate UI" to "validate UI": most commenters already achieve similar results with Playwright MCP or Chrome DevTools MCP, but a gap remains for the visual/semantic validation that DOM assertions cannot cover.

## Key Insights

- The tool's core loop: launch the dev server, open headless Chromium, record agent actions, trim dead time, and output a standalone HTML viewer with synced video, action timeline, and screenshots.
- **HN consensus: Playwright already does most of this.** Multiple commenters (onion2k, theshrike79, mohsen1, vunderba) confirm they give Claude Code or Cursor direct Playwright/browser access and it works well for debugging layout issues, console errors, and even GLSL shaders.
- **Chrome DevTools MCP vs Playwright MCP**: boomskats prefers DevTools MCP (lighter, shorter loop, works with Electron); nunodonato says DevTools MCP clutters context and prefers playwright-cli (not MCP) for efficiency.
- **Playwright MCP has screenshots built in** (roxolotl), and Anthropic added a `/plugins` shortcut in Claude Code for easy Playwright MCP setup (vunderba).
- **The real gap is visual/semantic validation** (mrothroc's key insight): Playwright checks structural DOM properties; it cannot tell you whether the page *looks* right. Using a multimodal LLM to evaluate screenshots against design mocks catches a completely different class of error (wrong colors, shifted layout, overlapping components). These "stochastic gates" and "structural gates" overlap very little, so you want both.
- **Desktop apps and mobile are unsolved**: alkonaut points out that without a DOM (e.g., drawing applications), screenshot diffing is your only option. For mobile, deepwalker.xyz and accessibility API XML dumps were mentioned as alternatives. iOS simulator tooling is still lacking.
- **PR workflow use case** (sd9): attaching visual proof to PRs is valuable even for human-written code; one described integration has an agent screenshot any open GitLab MR and add the image as a comment.
- Mozilla pioneered screenshot-based regression testing for Gecko roughly 25 years ago - the same concept, now enhanced by vision models doing semantic analysis instead of pixel diffing.
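The packaging step of the core loop above, bundling captures into a single self-contained HTML file, can be sketched in a few lines of Python. The data-URI approach and the `package_proof` name are illustrative assumptions, not ProofShot's actual code:

```python
import base64
import pathlib

def package_proof(screenshots, out="proof.html"):
    """Toy sketch of the 'standalone HTML artifact' step: inline each
    screenshot as a base64 data URI so the file has no external assets.
    (ProofShot's real viewer also syncs a video and an action timeline.)"""
    parts = ["<!doctype html><title>Proof</title>"]
    for label, png_bytes in screenshots:
        uri = base64.b64encode(png_bytes).decode("ascii")
        parts.append(f"<figure><img src='data:image/png;base64,{uri}'>"
                     f"<figcaption>{label}</figcaption></figure>")
    pathlib.Path(out).write_text("".join(parts))
    return out

# Usage with a dummy byte payload standing in for a real PNG.
out = package_proof([("after login", b"\x89PNG")])
print(out)  # → proof.html
```

Inlining everything is what makes the artifact shareable: reviewers can open one file from a PR comment with no server or asset directory.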
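mrothroc's structural-vs-stochastic split can be made concrete with a toy harness. Everything here is a hypothetical stand-in: `ask_vlm`, the DOM dict shape, and the verdict format are assumptions for illustration, not any real API:

```python
def structural_gate(dom):
    """Deterministic checks on DOM-like structure (the Playwright side)."""
    failures = []
    if dom.get("h1") != "Checkout":            # assumed expected heading
        failures.append("missing or wrong <h1>")
    if not dom.get("submit_visible", False):
        failures.append("submit button hidden")
    return failures

def stochastic_gate(screenshot_path, mock_path, ask_vlm):
    """Semantic screenshot-vs-mock check via an injected multimodal-LLM call."""
    verdict = ask_vlm(screenshot_path, mock_path)   # hypothetical interface
    return [] if verdict["matches"] else verdict["issues"]

# Usage with a stubbed VLM so the sketch runs standalone.
dom = {"h1": "Checkout", "submit_visible": True}
stub_vlm = lambda shot, mock: {"matches": False,
                               "issues": ["primary button color differs from mock"]}

problems = structural_gate(dom) + stochastic_gate("shot.png", "mock.png", stub_vlm)
print(problems)  # → ['primary button color differs from mock']
```

The point of the split shows in the output: the structural gate passes (the DOM is correct) while the stochastic gate still flags a visual mismatch, which is exactly the class of error DOM assertions cannot see.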
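A minimal version of the classic pixel-diff gate (the Mozilla-era approach, before vision models) might look like the following. The tolerance and threshold values are arbitrary assumptions, and the nested-list "screenshots" stand in for decoded image data:

```python
def pixel_diff_ratio(a, b, tolerance=8):
    """Fraction of pixels whose per-channel difference exceeds tolerance.

    a, b: equal-sized 2D grids of (r, g, b) tuples standing in for
    decoded screenshots (a real pipeline would decode PNGs first).
    """
    total = diff = 0
    for row_a, row_b in zip(a, b):
        for px_a, px_b in zip(row_a, row_b):
            total += 1
            if any(abs(ca - cb) > tolerance for ca, cb in zip(px_a, px_b)):
                diff += 1
    return diff / total if total else 0.0

baseline = [[(255, 255, 255)] * 4 for _ in range(4)]
candidate = [row[:] for row in baseline]
candidate[0][0] = (200, 40, 40)  # one changed pixel, e.g. a shifted border

ratio = pixel_diff_ratio(baseline, candidate)
print(ratio)  # → 0.0625 (1 of 16 pixels differs)
assert ratio <= 0.10  # regression gate at an assumed 10% threshold
```

This illustrates why pixel diffing alone frustrates teams: a one-pixel anti-aliasing change and a genuinely broken layout can produce similar ratios, which is the gap semantic evaluation by a vision model is meant to close.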