10x faster models and the consulting angle for AI

Hey, I’m Ben. I build stuff with agents, even though I’m not technical. Here’s everything I’m reading and tinkering with. If you want to start building or level up your ‘vibe-coding’ skills, join our community.


Hey folks,

Google is back on top of the benchmark charts with Gemini 3.1 Pro. It’s impressive on paper and genuinely strong at reasoning tasks and creating SVGs, but there’s a speed issue. Many folks are really enjoying it for frontend work, once they’re able to get it working. And again, there’s some drama: a lot of people got their Google accounts banned for using their Google AI/Antigravity subscription to run Gemini 3.1 Pro with OpenClaw.

A 2.5-year-old hardware startup, Taalas, built a chip that has the weights of Llama 3.1 baked into the hardware, which lets them reach output speeds of ~17k tokens/second. For comparison, Groq is at ~600 tokens/second, and Cerebras is at ~2k tokens/second. The model on the chip (they call it “silicon llama”) largely can’t be rewritten, but it supports custom context window sizes and LoRA fine-tuning. I compared the same model on their chat demo and Groq’s playground. As expected, it is dumber on Taalas’s demo (due to low-quality quantisation), but at this stage, the proof that “any AI model can be made 10x faster and cost 20x less” is more important. They plan to release a reasoning model version very soon, with frontier LLMs planned too.
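To put those speeds in perspective, here’s a quick back-of-the-envelope sketch in Python. The tokens/second values are the rough figures quoted above; the 1,000-token response length and the function names are assumptions for illustration, and it ignores time-to-first-token and network latency.

```python
# Approximate output speeds quoted above, in tokens per second.
SPEEDS_TOKENS_PER_SEC = {
    "Taalas": 17_000,
    "Cerebras": 2_000,
    "Groq": 600,
}

# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 1_000


def seconds_to_generate(tokens: int, tokens_per_sec: float) -> float:
    """Time to stream `tokens` output tokens at a steady rate."""
    return tokens / tokens_per_sec


def speedup_over(baseline: str, fast: str = "Taalas") -> float:
    """How many times faster `fast` is than `baseline` on raw output speed."""
    return SPEEDS_TOKENS_PER_SEC[fast] / SPEEDS_TOKENS_PER_SEC[baseline]


for name, speed in SPEEDS_TOKENS_PER_SEC.items():
    elapsed = seconds_to_generate(RESPONSE_TOKENS, speed)
    print(f"{name}: {elapsed:.2f}s for {RESPONSE_TOKENS} tokens")

for baseline in ("Cerebras", "Groq"):
    print(f"Taalas vs {baseline}: {speedup_over(baseline):.1f}x")
```

On these figures, Taalas comes out at 8.5x Cerebras and roughly 28x Groq on raw output speed.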

OpenAI is partnering with 4 major consulting firms (BCG, McKinsey, Accenture and Capgemini) to get enterprises onto its new platform, “Frontier”, which lets you create AI coworkers. Weren’t consulting shops supposed to die with AI?

Claude Code updates: built-in support for git worktrees for parallel agents, previews of running apps in CC desktop, and a new security scanning feature in beta.

Why’s there always a meeting bot in your Zoom call? Blame Recall.ai. They power every meeting AI app, from Cluely to Hubspot to Clickup. Recall.ai handles the hard part: getting recording data across meeting platforms. Get started with $100 in credits *


🌐 What I’m consuming


⚙️ Tools and demos

  • AssemblyAI Universal-3 Pro - Prompt your speech model to get jargon, speakers, and formatting right the first time. Free to try through Feb.*
  • here.now - Free, instant web hosting for agents, static elements only.
  • mdnb - a markdown notebook for macOS.
  • Rork Max - One-shot almost any app for iPhone, or any Apple device (including watches, TVs and Vision Pro). I’m an investor.
  • Interpreter - Desktop agent that can fill PDFs, edit your Excel and Word docs, and learn new skills. Runs locally, works with any model.
  • Wideframe - AI agent that speeds up the 75% of video work happening outside the editor.
  • Typefully has a new writing assistant to help you write better (not just more).
  • Trajectory Explorer by Raindrop - Every decision your agent made, searchable in seconds.
  • FasterGH - GitHub with instant navigation and a modern UI. (repo)
  • Quipslop - A live game where different models try their best to be funny. (repo)
  • Shiori - A beautifully simple read-it-later app.
  • I was looking for a way to add a “browser” to a web app I’m working on. Came across hyperbeam and lifo.sh.

🥣 Dev Dish


🍦 Afters


Enjoy this newsletter? Forward it to a friend.

That’s it for today. Feel free to comment and share your thoughts. 👋