All Posts

Tech blog about vibe coding and full-stack development

Retrospective

Running Local LLMs on a Mac Studio Taught Me How to Use Slow Models Powerfully

After moving from dual RTX 3090 Ti GPUs to an M3 Ultra, I found that for individuals and small teams, slow local LLMs work best as queue-based workers rather than as real-time chat replacements.

Feb 4, 2026 · 10 min read