Helpful LLocal Models

March 8, 2025

AI 🤖 | Dimensino 👾 | Little Guys 🥹 | Permanently Moved 🔊

Tutorial engines are coming. And they’ll run atop local AI models embedded in our devices at the OS level. What’s the bet that future versions of the Mac Studio will offer something like Framework’s modular scalability? Imagine supercompute clusters in every office, or even in every home.

Full Show Notes: https://thejaymo.net/2025/03/08/2505-helpful-llocal-models/

Experience.Computer: https://experience.computer/

Worldrunning.guide: https://worldrunning.guide/

Subscriber Zine! https://startselectreset.com/

Permanently Moved is a personal podcast 301 seconds in length, written and recorded by @thejaymo

Subscribe to the Podcast: https://permanentlymoved.online/

@thejaymo https://permanentlymoved.online/2505-helpful-llocal-models

I continue to pay close attention to local AI models.

Most mainstream coverage focuses on frontier models and hype: OpenAI, Anthropic, DeepSeek etc. But it rarely explores the wider industry and the context. I mean, AI, and frankly all online services, are embedded within a vast stack of global industrial compute technologies.

Musk’s Grok 3 was trained on the world’s largest, and first fully water-cooled, data centre cluster. That alone is worth an article on its engineering and technical innovation, regardless of what you think of the man.

Also, yes, Microsoft just pulled out of its planned data centre leases. You could report that this implies the AI race has peaked, but it also signals excess compute capacity ahead. Which means prices are going to come down.

It wouldn’t surprise me if, in a few years, we see dedicated compute clusters for training or updating checkpointed open-source AI models at prices that most large organisations could justify as reasonable CapEx. Model crèches.

But even still, local AI is the more important area to be tracking. For the first time in a decade, I’m excited about hardware again.

Last week Framework announced their desktop computer, which comes with 128GB of RAM and a rapid AMD Ryzen chip with integrated graphics. For just £1,999. And crucially, you can chain multiple units together, forming your own local compute cluster. Scaling capability at a significantly lower cost.

The new Mac Studio was also just announced, which, when fully maxed out with an M3 Ultra, 512GB of RAM, and 16TB of storage, is priced at $14,099.00. This is a lot of money. But the new Mac Studio can comfortably run a 4-bit DeepSeek R1 model locally, with room to spare.

Think about that: you can now run models on a desktop machine. Models which a month ago, needed an entire data centre.
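
As a rough back-of-the-envelope check on that claim, here is a minimal sketch in Python. It assumes DeepSeek R1 has around 671 billion parameters (its published total, not a figure from this episode) and counts the model weights only. The memory a running model also needs for context and overhead is left out, so treat the numbers as a lower bound rather than a spec.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal gigabytes."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Assumption: DeepSeek R1's published total parameter count.
R1_PARAMS_BILLIONS = 671
# From the episode: a fully maxed-out Mac Studio.
MAC_STUDIO_RAM_GB = 512

for bits in (16, 8, 4):
    needed = weight_memory_gb(R1_PARAMS_BILLIONS, bits)
    verdict = "fits" if needed < MAC_STUDIO_RAM_GB else "does not fit"
    print(f"{bits}-bit: ~{needed:.1f} GB of weights, {verdict} in {MAC_STUDIO_RAM_GB} GB")
```

Under those assumptions only the 4-bit version squeezes in, at roughly 335.5 GB of weights, which is why quantisation matters so much for running big models on a desktop.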

What’s the bet that future versions of the Mac Studio, or even the Mac mini, will offer something like Framework’s modular scalability, enabling them to be chained together? Imagine supercompute clusters in every office, or even at home.

The new Mac Studio specs also now give open-source model makers a target memory size and hardware platform to optimise for.

The same is also true for the new iPhone 16e, which just got a RAM bump to 8GB. And with its updated processor, Apple now has a new minimum-spec platform for local, on-device Apple Intelligence. Alongside Google’s Tensor platform, this shows there’s an industry-wide push toward making devices AI inference platforms rather than cloud clients.

Last year, I talked about us heading towards ‘maximal intelligence at all levels’, and said that local inference will emerge and immediately sink below the user interface, rather than being directly exposed to the user as a chatbot.

I’ve recently been really impressed by how useful Brave Search’s AI overviews have become, and find myself opening links directly from the references list. I’ve also been making use of Perplexity’s Deep Research tool for more complicated searches, and now that I’ve tried OpenAI’s Deep Research for myself, I sort of have a sense of where things might be heading.

My own use cases for deep search so far have mostly been for creating souped-up tutorials, as I’m currently learning a new piece of software.

And I’ve decided that help systems are where we’re going to see a ton of innovation.

Every single application we all use has a built-in help system. Some are better than others. Microsoft Excel, for example, has some of the best help documentation of any consumer software. But even with great documentation, finding exactly what you need is often a pain.

Powerful local AI is going to be built directly into operating systems, and will let us ask software for help directly.

How often in your daily lives do you use the help menu in a piece of software? I use it all the time, but maybe that’s because I spend my life not knowing what I’m doing. But recently, in Google Sheets I’ve been asking Gemini to debug a formula for me—and it will point out a missing IF statement or incorrect nested clause.

Soon, app help data will load as a ‘knowledge object’ atop base intelligence, letting you ask how to do something—or why it’s not working.

For example:

“How do I implement an anamorphic lens effect in Blender?”
Or:
“I’m using this software to do X, Y, and Z. I know I need these specific plugins, which require some JavaScript knowledge. Give me a step-by-step guide on how to implement them.”

These are things I’ve asked both deep research tools recently and got back extremely useful results. In both cases, I was able to follow the steps they gave me quite far, before switching to a referenced YouTube video that got me across the finish line.
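
To make the ‘knowledge object’ idea a little more concrete, here is a toy sketch in Python of how an app’s help pages might be matched to a question and handed to a local model. Everything in it is hypothetical: the `HelpTopic` class, the sample help pages, and the prompt wording are made up for illustration, the matching is simple word overlap, and no real model or operating system API is called.

```python
import re
from dataclasses import dataclass


@dataclass
class HelpTopic:
    title: str
    body: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_topics(question: str, topics: list[HelpTopic], k: int = 2) -> list[HelpTopic]:
    """Rank help topics by how many words they share with the question."""
    asked = tokenize(question)
    scored = [(len(asked & tokenize(f"{t.title} {t.body}")), t) for t in topics]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [topic for _, topic in matches[:k]]


def build_prompt(question: str, topics: list[HelpTopic]) -> str:
    """Assemble the text a local model would be asked to answer from."""
    context = "\n".join(f"- {t.title}: {t.body}" for t in best_topics(question, topics))
    return f"Answer using only these help pages:\n{context}\n\nQuestion: {question}"


# Hypothetical help pages for an imaginary spreadsheet app.
HELP = [
    HelpTopic("Nested IF formulas", "Each IF needs a condition, a value if true and a value if false."),
    HelpTopic("Freeze panes", "Keep header rows visible while scrolling."),
    HelpTopic("Exporting to PDF", "Use File, then Export, to save the sheet as a PDF."),
]

print(build_prompt("Why is my nested IF formula not working?", HELP))
```

A real system would presumably use something smarter than word overlap, but the shape is the point: the app supplies its documentation, and the base intelligence on the device does the explaining.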

Tutorials—YouTube, blogs, forums—aren’t going away, but the ecosystem is definitely going to change. And all software is going to need better documentation—but when does it not?

Apple has its emerging on-device AI strategy, as does Google, and I think Microsoft has probably cancelled all that cloud compute because really good local chain-of-thought is coming soon.

Software won’t necessarily become more useful with all this, but it will become more helpful. Giving computers the ability to explain themselves is a UX game-changer. Lightweight debugging will only improve and be built into everything.

Tutorial engines are coming.

And they’ll run atop local AI models embedded in our devices at the OS level.

The entire AI conversation will be very different when AI moves out of the cloud and onto our machines.