Same prompt. Four AI engines. One MacBook.
A while back I ran a little benchmark: give a few AI models the exact same task and see what they build. This is the rematch, with two brand-new local models added to the lineup. The task was the same as last time: build an animated northern-lights scene in a single HTML file, from one prompt. Four engines took a swing at it. Three of them never touched the internet.
The results
Every engine got the identical prompt, on the same 128 GB MacBook Pro:
| Engine | Where it ran | Tokens | Time |
|---|---|---|---|
| Qwen3.6 27B (new, 4-bit MLX) | Local (Apple Silicon) | 5,262 | 163s |
| DeepSeek V4 Flash (ds4 engine) | Local (Apple Silicon) | 3,879 | 115s |
| Cloud Claude (Max plan) | Cloud (a data center) | ~2,900 | 110s |
| Gemma 31B (4-bit MLX) | Local (Apple Silicon) | 2,001 | 83s |
Four completely different auroras. Gemma 31B finished first and wrote the most compact code. The new Qwen3.6 27B took the longest but, to my eye, painted the prettiest sky of the bunch: flowing aurora bands over layered mountains, with real depth. And it did it without sending a single token to the cloud. Three of the four ran completely offline. After the model is downloaded, that is zero dollars a month and zero data leaving the laptop.
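Token counts and wall-clock times tell different stories, so it can help to divide one by the other. The snippet below is a rough sketch, not part of the original benchmark: it takes the figures from the table and computes end-to-end tokens per second. It assumes the reported time covers the whole run (which may include model load and prompt processing), and the cloud token count is only approximate, so treat the numbers as ballpark. The function and variable names are my own.

```python
# Figures copied from the results table above.
# The cloud token count (~2,900) is approximate.
RESULTS = [
    ("Qwen3.6 27B", 5262, 163),
    ("DeepSeek V4 Flash", 3879, 115),
    ("Cloud Claude", 2900, 110),
    ("Gemma 31B", 2001, 83),
]


def tokens_per_second(tokens: int, seconds: float) -> float:
    """End-to-end throughput: total tokens divided by wall-clock seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


def ranked_by_throughput(results):
    """Return (engine, tokens/s) pairs, highest throughput first."""
    rates = [(name, tokens_per_second(tok, sec)) for name, tok, sec in results]
    return sorted(rates, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, rate in ranked_by_throughput(RESULTS):
        print(f"{name:<18} {rate:5.1f} tok/s")
```

By this rough measure, Gemma 31B finished first mainly because it wrote the least, not because it generated tokens fastest.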
Grab the new local models (free)
- Qwen3.6 27B: abliterated, MLX 4-bit (fits a 32 GB Mac)
- Gemma 4 12B: abliterated, MLX 4-bit (the lightweight tier)
- The full collection: Abliterated MLX for Apple Silicon
Run it yourself
These plug straight into claude-code-local: run Claude Code with a local model, no API key. Point MLX_MODEL at the repo and go.
Credit where it’s due
The hard part, the abliteration, was done by huihui-ai (Qwen3.6) and OpenYourMind (Gemma 4 12B). I only did the MLX conversion and quantization so Mac users get a one-command pull. Go follow their work.
The music in the video was generated locally with Song Forge, and the narration is local text-to-speech. Everything you see and hear was made on one MacBook.
