Category: tech > ai
139 insights in this category - page 3 of 7.
7 levels of Claude Code and RAG
A 7-level maturity model for giving Claude Code durable memory. Most users should stop at level 4 (Obsidian vault with index hierarchy); real RAG adds cost and fra...
Aperture by Tailscale: Identity-Based AI Gateway for LLM Requests
Tailscale's Aperture (alpha) is a centralized AI gateway using Tailscale identity to route LLM requests with spending limits, access control, and telemetry.
Artifacts: versioned storage that speaks Git
Cloudflare launched Artifacts, a Git-protocol versioned filesystem for AI agents, plus open-source ArtifactFS that cuts multi-GB repo startup from 2 min to 1...
Automate work with routines - Claude Code Docs
Claude Code Routines are cloud-hosted Claude sessions triggered by schedule, HTTP API, or GitHub events, replacing Claude-in-CI and cron-plus-scripts patterns.
Caveman: Claude Code skill cuts output tokens 65% via caveman-speak
Caveman is a Claude Code skill that responds in caveman-speak (no articles, no filler) to cut output tokens ~65% on average without losing technical accuracy.
Codex for (almost) everything
OpenAI expanded Codex into a full desktop agent: drives the Mac cursor, runs 90+ plugins, parallel agents, image generation, and scheduled cross-day automati...
Darkbloom: Private AI Inference on Apple Silicon
Darkbloom routes encrypted AI requests to idle Apple Silicon Macs, the Airbnb of GPU compute. ~50% cheaper than OpenRouter, with hardware attestation.
Fireworks AI - Fastest Inference for Generative AI
Fireworks AI is an inference platform for open-source generative models. Its marketing cites latency drops from 2s to 350ms but gives no pricing or benchmarks.
Friends Don't Let Friends Use Ollama
Ollama wraps llama.cpp but skipped attribution, forked ggml badly, and pivoted to VC-backed cloud. llama.cpp delivers up to 1.8x throughput on the same hardw...
Gemini for macOS - your native AI desktop app
Google shipped a native Gemini macOS app with one-keystroke Option+Space access and optional screen sharing. Free, Apple Silicon only, macOS Sequoia 15.0+.
exo: Cluster Macs to Run Frontier AI Models Locally
exo clusters Apple Silicon Macs into a distributed AI inference pool, running DeepSeek v3.1 671B and Kimi K2 locally with RDMA over Thunderbolt 5.
Graphify: Knowledge Graph Skill for AI Coding Assistants
Graphify builds a multi-modal knowledge graph (Tree-sitter + LLM extraction, Leiden communities) so AI coding assistants grasp large codebases at 71x fewer t...
Introducing Agent Lee - a new interface to the Cloudflare stack
Cloudflare Agent Lee is an in-dashboard AI assistant using Codemode to turn MCP tools into a TypeScript API, with Durable Object proxy gating writes by elici...
Opus 4.7 explained in 30 seconds
A 30-second rundown of Opus 4.7: gains on coding benchmarks, 3x higher screenshot resolution, new X high reasoning tier, and a /ultra-review slash command.
Qwen3.6-35B-A3B: Agentic Coding Power, Now Open to All
Alibaba open-sourced Qwen3.6-35B-A3B, a 35B MoE with 3B active params scoring 73.4 on SWE-bench Verified and integrating with Claude Code via OpenAI-compatib...
Claude Code /routines: Server-Side Scheduled Tasks
Claude Code /routines (via /schedule) runs scheduled tasks on Anthropic's servers, not your terminal. Supported triggers are cron, API calls, and GitHub webhooks.
Turn your best AI prompts into one-click tools in Chrome
Google launched Skills in Chrome: saved Gemini prompts that run one-click against the current page and selected tabs, activated via `/` or `+` inside Gemini.
Crawl4AI: Async Web Crawler for LLM-Friendly Markdown Extraction
Crawl4AI is an open-source async crawler that extracts LLM-friendly markdown, with concurrent crawling, anti-bot bypass, and AI-powered structured extraction.
Why Chinese AI Is Suddenly So Good (ft. DeepSeek, Seedance 2.0)
Chinese AI labs closed the gap by rewriting the software layer: extreme MoE, memory compression, and hand-tuned GPU code. Douyin adds a video data moat.
ThePopeBot: git-native autonomous AI agent scaffolding
Open-source scaffolding for 24/7 autonomous AI coding agents via GitHub Actions and Docker. Each task gets its own branch, runs in isolation, opens a PR, and auto-merges.