GSD 2.0 vs Claude Code: Head-to-Head Agentic Build

Tags: claude-code, gsd, agentic-coding, anthropic, api-costs, max-plan, orchestration
Originally from vm.tiktok.com


Summary

GSD2 started as an orchestration layer on top of Claude Code and has since split off into a standalone agentic CLI built on the Pi SDK. In a head-to-head expense-tracker build, Claude Code (on the Max Plan) finished in 4m38s using <1% of a 5-hour quota. GSD2 (on the Anthropic API, with Opus 4.6 for planning and Sonnet 4.6 for execution) took ~1.5 hours, hung multiple times, and burned ~$28 in API costs for a worse-looking result.
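To put the run's numbers side by side, here is a minimal arithmetic sketch. It uses only the approximate figures reported in these notes (4m38s, ~1.5 hours, ~$28, and the $200/month Max Plan price); the function names are made up for illustration, and nothing here was measured independently.

```python
# Figures as reported in the notes; the GSD2 values are approximate.
CLAUDE_CODE_SECONDS = 4 * 60 + 38   # 4m38s
GSD2_SECONDS = 1.5 * 60 * 60        # ~1.5 hours
GSD2_API_COST_USD = 28.0            # ~$28 for the single build
MAX_PLAN_MONTHLY_USD = 200.0        # Max Plan price cited in the notes


def slowdown(slow_seconds, fast_seconds):
    """How many times longer the slow run took than the fast one."""
    return slow_seconds / fast_seconds


def runs_per_budget(budget_usd, cost_per_run_usd):
    """Whole number of runs at this cost that fit inside a budget."""
    return int(budget_usd // cost_per_run_usd)


ratio = slowdown(GSD2_SECONDS, CLAUDE_CODE_SECONDS)
runs = runs_per_budget(MAX_PLAN_MONTHLY_USD, GSD2_API_COST_USD)
print(f"GSD2 took about {ratio:.1f}x as long as Claude Code")
print(f"${MAX_PLAN_MONTHLY_USD:.0f} of raw API spend covers {runs} builds at this cost")
```

On these figures the GSD2 run was roughly 19x slower, and about seven builds like it would cost as much in API spend as a month of the Max Plan.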

Key Insight

  • Subsidy gap is huge. Claude Max Plan at $200/month is equivalent to roughly $2,500-$5,000 in API credits. Any tool that pulls you out of Claude Code/Codex/Antigravity onto raw API spend has to massively outperform to be worth it.
  • Don’t use Max Plan OAuth in third-party CLIs. Anthropic has been explicit: using Max Plan OAuth outside Claude Code (in tools like GSD2, or through Antigravity OAuth flows) risks an account ban. This is the same pattern that hit OpenClaw users.
  • GSD2’s architecture is sound, even if the economics aren’t. Core idea: break the project into phases, tasks, and sub-agents, with the “iron rule” that each task must fit in a single context window. Reasoning: even with a 1M-token context in Opus 4.6 / Sonnet 4.6, models perform best near token zero, not at token 700k.
  • GSD2 ships a budget ceiling (a per-project dollar cap) and step vs. auto modes, but in this test it still spent ~40 minutes hung (without burning tokens) and needed three full restarts.
  • Two-terminal pattern. GSD2 expects a workhorse terminal (running auto) and a discussion terminal that writes to disk; the workhorse picks up changes from the disk state. Useful pattern even outside GSD2 for any agentic tool that reads project state from files.
  • Visual quality favored Claude Code in this run despite both meeting the spec (expense form, list, dashboard, monthly summary, dark mode, dummy data).
  • When GSD2 might still win: if you’re already API-only (no Max Plan), you like the explicit phase/task/context scaffolding, and the project’s scale justifies the overhead.
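The “iron rule” and the budget ceiling described above can be sketched as a simple pre-flight check. This is a hypothetical illustration, not GSD2's actual API: the `Task` type, the `plan_run` function, the 200k per-task token cap, and all of the estimates are assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical cap for illustration: keep each task far below the
# 1M-token window, since the notes say models do best near token zero.
TASK_TOKEN_LIMIT = 200_000


@dataclass
class Task:
    name: str
    estimated_tokens: int       # rough guess at context the task needs
    estimated_cost_usd: float   # rough guess at API spend for the task


def plan_run(tasks, budget_ceiling_usd, task_token_limit=TASK_TOKEN_LIMIT):
    """Sort tasks into runnable, needs-splitting, and deferred.

    Returns (runnable, needs_split, deferred, planned_spend_usd).
    """
    runnable, needs_split, deferred = [], [], []
    spent = 0.0
    for task in tasks:
        if task.estimated_tokens > task_token_limit:
            # Iron rule: a task that cannot fit one context window
            # gets broken down further instead of being run.
            needs_split.append(task.name)
        elif spent + task.estimated_cost_usd > budget_ceiling_usd:
            # Budget ceiling: never plan past the per-project cap.
            deferred.append(task.name)
        else:
            spent += task.estimated_cost_usd
            runnable.append(task.name)
    return runnable, needs_split, deferred, spent


# Made-up estimates for the expense-tracker spec.
tasks = [
    Task("expense form", 60_000, 2.0),
    Task("dashboard", 350_000, 9.0),
    Task("monthly summary", 80_000, 3.5),
    Task("dark mode", 40_000, 1.0),
]
runnable, needs_split, deferred, spent = plan_run(tasks, budget_ceiling_usd=6.0)
print(runnable, needs_split, deferred, spent)
```

With these made-up estimates, the dashboard task is flagged for splitting because it exceeds the per-task cap, and dark mode is deferred because running it would push planned spend past the $6 ceiling.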