# Karpathy-style overnight metric-gated agent loops for code

> Metric-gated overnight agent loop: Claude Code keeps only diffs that improve a numeric KPI (bundle size, Lighthouse, test count) and pass veto tests.

Published: 2026-04-26
URL: https://daniliants.com/insights/karpathy-autonomous-ai-workflow-clawed-code-skill/
Tags: autonomous-agents, claude-code, overnight-loops, metric-driven, code-optimization, karpathy, dev-tools

---

## Summary

A short TikTok pitches a "Clawed Code" skill by Udit Goenka that runs a Claude Code agent in an endless loop while you sleep: it applies changes, runs tests, and keeps only the diffs that improve a defined metric. It's pitched as a way to optimize bundle size or SEO scores, or to run security audits, unattended.

## Key Insight

- Core pattern: metric-gated agent loop. The agent only commits work where a numeric KPI (bundle bytes, Lighthouse score, test pass count, audit findings) improves vs baseline. Anything that doesn't move the metric is discarded.
- The novelty is not the AI - it's the discard rule. Most agent tools accept any "plausible" change; this only accepts measurable wins, which sidesteps the usual "AI made it worse but it looks confident" failure mode.
- Works best when you have a fast, deterministic measurement (CI test suite, `webpack-bundle-analyzer`, Lighthouse CLI, `npm audit`). Useless without one - the loop has no compass.
- Overnight execution is the real unlock: supervised agent sessions are expensive in attention, while unattended runs cost only tokens. Karpathy-style "let it cook" only pays off if the guardrails (metric + rollback) are tight.
- Risk: agent gaming the metric (e.g. removing tests to "improve" pass rate, lazy-loading everything to shrink initial bundle while breaking UX). Always pair the optimization metric with at least one veto metric (e.g. e2e tests must still pass, Core Web Vitals must not regress).
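The gate described above - commit only when the KPI improves *and* the veto checks still pass - can be sketched in a few lines. This is a minimal illustration, not the actual "Clawed Code" skill: `propose`, `measure`, and `veto_ok` are hypothetical stand-ins for the agent's edit step, the metric command (bundle bytes, Lighthouse score, etc.), and the veto suite (e2e tests, Core Web Vitals budget).

```python
import random

def run_loop(baseline, propose, measure, veto_ok, iterations=20):
    """Metric-gated agent loop: keep a candidate only if the numeric KPI
    improves AND every veto check still passes; otherwise discard it.
    Lower metric = better (e.g. bundle bytes)."""
    state, best = baseline, measure(baseline)
    for _ in range(iterations):
        candidate = propose(state)
        score = measure(candidate)
        if score < best and veto_ok(candidate):
            state, best = candidate, score  # commit the measurable win
        # else: roll back - a non-improving or veto-failing diff is never kept
    return state, best

# --- toy stand-ins (all hypothetical) ---
rng = random.Random(0)
baseline = [120, 340, 90]  # pretend module sizes in KB

def propose(mods):
    # "agent edit": shrink a random module, possibly to zero
    out = list(mods)
    i = rng.randrange(len(out))
    out[i] = max(0, out[i] - rng.randint(0, 50))
    return out

def measure(mods):
    return sum(mods)  # stand-in for total bundle bytes

def veto_ok(mods):
    # stand-in for the veto suite: deleting a module entirely
    # "improves" the metric but would break the app, so it is rejected
    return all(m > 0 for m in mods)

final, size = run_loop(baseline, propose, measure, veto_ok)
```

Note the two-condition gate: `score < best` alone is exactly the metric-gaming trap from the last bullet; the `veto_ok` call is what stops the loop from shrinking the bundle by deleting things the metric can't see.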