Are better models better? ⊗ A case for extrospective technologies ⊗ AI as electricity, AI as magic

No.343 — Archigram keeps the dream alive ⊗ The META Trending Trends ⊗ AI to revolutionise fundamental physics ⊗ Inventing the Renaissance

Jebel Kissu, in northwestern Sudan. Image by the USGS on Unsplash.

Are better models better?

I quite like this one by Benedict Evans, as it neatly contrasts two common attitudes towards LLM hallucinations: some treat them as a temporary nuisance, while others fixate on them and end up dismissing the whole field as a result. Instead, Evans emphasizes that probabilistic models like LLMs generate responses from statistical patterns, so they lack the ability to provide the definitive answers certain tasks require. For example, when Evans prompted ChatGPT 4o about the number of elevator operators in the US in 1980, the model produced incorrect answers despite various prompts and guidance. Though simple, his test underscores the inherent uncertainty of probabilistic systems, which approximate likely answers rather than delivering exact results. Deterministic systems, on the other hand, are designed to produce consistent, precise outputs, making them better suited to tasks where accuracy is critical.

People often expect AI to deliver precise, factual answers, when in reality these systems are designed to produce statistically likely outputs rather than definitive truths. As models improve, they may become more convincing but not necessarily more correct. There’s a technical aspect, but it’s also a question of expectations. For example, the first iPod was expected to break if dropped on the floor, while the mobile phones of the day were not; by the time we got to the iPhone, we expected a dropped phone to break. Expectations shifted. Which will change faster: will we reset our expectations of AI, or will the models be adapted first, for instance through agents, which give deterministic tools to LLMs? And if we do change our expectations, to what? How will AI labs’ business models work if we come to expect an AI that is fanciful but imprecise, “unprofessional”?

More → In his yearly presentation, Evans also runs through an interesting exercise: AI either keeps scaling at the rate of the last few years, or it doesn’t, and he sketches potential outcomes for a few use cases. The whole thing is good, but the bit I mention starts at slide 62.

However, there is also a broad class of task that we would like to be able to automate, that’s boring and time consuming and can’t be done by traditional software, where the quality of the result is not a percentage, but a binary. For some tasks, the answer is not better or worse: it's right or not right. […]

Plenty of people have opinions, but so far, we don’t know, and for the time being, ‘error rates’ (if that’s even the right way to think about this) are not a gap that will get closed with a bit more engineering, the way the iPhone got copy/paste or dialup was replaced by broadband: as far as we know, they are a fundamental property of the technology. […]

This is one way to think about ‘agentic’ systems (which might be the Next Big Thing or might be forgotten in six months) - the LLM turns everything else into an API call. Which way around is better? Should you control the LLM within something predictable, or give the LLM predictable tools?
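To make that last question concrete, here is a minimal sketch of the “give the LLM predictable tools” option, in plain Python. Everything here is invented for illustration: `fake_llm` is a hypothetical stand-in for a real model call, and the lookup table holds a placeholder figure, not real census data. The point is the shape of the architecture: the probabilistic part only routes the question, and the exact answer comes from a deterministic tool.

```python
# Sketch: the LLM routes a question to a deterministic tool.
# `fake_llm` is a hypothetical stand-in for a real model call;
# the figure in CENSUS_TABLE is a placeholder, not real data.

CENSUS_TABLE = {
    ("elevator operators", 1980): 21_000,  # placeholder value for illustration
}

def count_occupation(occupation: str, year: int) -> int:
    """Deterministic tool: an exact lookup, same answer every time."""
    return CENSUS_TABLE[(occupation.lower(), year)]

TOOLS = {"count_occupation": count_occupation}

def fake_llm(question: str) -> dict:
    """Stand-in for the probabilistic model: instead of guessing a
    number, it emits a structured call to a deterministic tool."""
    return {"tool": "count_occupation",
            "args": {"occupation": "elevator operators", "year": 1980}}

def answer(question: str) -> str:
    call = fake_llm(question)                     # model picks the tool
    result = TOOLS[call["tool"]](**call["args"])  # tool supplies the fact
    return f"{result:,}"

print(answer("How many elevator operators were employed in the US in 1980?"))
```

The other direction Evans mentions, controlling the LLM within something predictable, simply flips who sits at the top of the call stack: deterministic software decides when to call the model, rather than the model deciding when to call the software.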

The internet can’t discover: A case for extrospective technologies

Christopher Butler argues that, while the internet serves as vital infrastructure, it is fundamentally an introspective technology, reflecting existing knowledge rather than discovering new information about the physical world. He believes, and I tend to agree, that we need to focus more resources on extrospective technologies, like the James Webb Space Telescope, which provide genuine windows into the unknown and expand our understanding of the universe—or simply our oceans. As Butler says, “as we face unprecedented environmental challenges and explore new frontiers in space, we need more windows into the physical world, not better mirrors of our digital one.” In short: “It’s time to rebalance our technological investment.”

In a bit of a sidestep, I’d like to connect this to an interview I’ve mentioned many times here and in conversations, about viewing AI as collective intelligence. Holly Herndon sees it “as a kind of aggregate human intelligence. It’s trained on all of us.” If the internet is introspective, then so is AI: a way of conversing with our collective knowledge. We could build more extrospective tech to extend that knowledge.

The internet is, fundamentally, an introspective technology; it is a mirror, showing us only what we’ve already put into it. […]

When we marvel at AI’s capabilities, we’re really just admiring an increasingly sophisticated form of introspection. […]

Unlike the internet, which can only reflect what we already know about ourselves, Webb is quite literally a window for looking outward. It extends our vision not just beyond our natural capabilities, but beyond what any human has ever seen before.

AI as electricity, AI as magic

David Mattin presents a framework of “AI as electricity” and “AI as magic.” He proposes that, while intelligence will become ubiquitous and commoditised, the real value will lie in the applications that deliver “AI magic” to end users, rather than in the creation of the AI itself. To his mind, the winners will be those with “deep wells of user data, which will allow them to craft AI experiences that resonate with people. And those with vast distribution, which allows them to deliver this magic at scale.” In other words, tech companies like Meta, Google, Microsoft, and Apple, who possess vast user data and distribution networks, will further consolidate their power.

I’d like to put his idea alongside Evans’ presentation mentioned above and add a tweak. On slide 63, Evans’ scaling options lead to either “LLMs are just another API” or “everything else is an API.” In other words, either LLMs are infrastructure (“electricity”) that software, websites, and apps fetch intelligence from, or the LLMs do everything and fetch information from websites for their own use. In the first case the magic happens in software, which might (per Mattin) or might not be provided by big tech. In the latter, foundation models are both electricity and magic. A third possibility would be LLMs as infrastructure with no magic at all: just software perceived as more conversational, easier to use, and more powerful.
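The two directions are easiest to see as a question of which layer sits on top. A rough sketch, with a stubbed `llm` function standing in for any model (all names and logic here are invented for illustration):

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for any model call."""
    return f"<model output for: {prompt!r}>"

def order_status(order_id: int) -> str:
    """Ordinary deterministic software."""
    return f"order {order_id} shipped"

# A) "LLMs are just another API": software is in charge and calls
#    the model like any other service, e.g. to polish its output.
def notify_customer(order_id: int) -> str:
    status = order_status(order_id)            # deterministic logic
    return llm(f"Rewrite politely: {status}")  # model adds the magic

# B) "Everything else is an API": the model is in charge, and
#    ordinary software is reduced to tools it can call.
def agent(goal: str) -> str:
    tool_choice = llm(f"Pick a tool for: {goal}")  # model decides
    # (a real agent would parse tool_choice; we show only the shape)
    return order_status(42)

print(notify_customer(42))
print(agent("tell the customer where order 42 is"))
```

Mattin’s “magic” lives in `notify_customer` in the first pattern and in the model itself in the second; the no-magic third option is pattern A with the model’s contribution reduced to interface polish.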

AI as electricity imagines a world in which intelligence is ubiquitous. In other words, in which it is a commodity. A world in which there are a plethora of ‘good enough’ LLMs, many of them open-source. As with any commodity business, the result for suppliers will be a race to the bottom on price. […]

Instead, and contrary to the AI narrative across the last few years, much of the money will be made not by those who create the AI, but by those who use it to deliver magic to end users. The value, in other words, will be at the app layer.


§ After a 50-year pause, Archigram keeps the dream alive. Book review and historical overview of the mythic architecture group. “Archigram were the great mod dreamers, a London collective that took square aim at all the square ideas prevailing in architecture—and blew them up on the pages of their proto-zine, in exhibitions, and elsewhere with anarchic glee.” Their mode was Dadaist-Futurist-Advertising Supplement. As aesthetic radicals, they loathed the drab torpor of late rigor-mortis Modernism.


§ A programming note. I’m mostly staying away from the Truskian coup so far. Jason Kottke, on the other hand, decided to stay with the trouble. When you have the energy for it, have a look; his curation works as well for the dire as it usually does for the more whimsical. “I have pivoted to posting almost exclusively about the coup happening in the United States right now. My focus will be on this crisis for the foreseeable future. I don’t yet know to what extent other things will make it back into the mix. I still very much believe that we need art and beauty and laughter and distraction and all of that, but I also believe very strongly that this situation is too important and potentially dangerous to ignore.”

Futures, Fictions & Fabulations

  • Don’t take a position on signals of change. “So we say, hold your views about the future very loosely. It keeps you open to adjusting your understanding. It keeps you actively searching for more information, whatever that information might be telling you. For example, when you find a weak signal, there is no need to either promote it as the next great thing, or dismiss it as way out there. In today’s vernacular, ‘hold space’ for the signal in a neutral fashion, and watch and learn.”
  • The META Trending Trends: 02025. Matt Klein’s yearly meta-analysis. “TL;DR: Trend reports, in aggregate, should not be seen as sources of emergent or disruptive thinking, but rather culprits of repackaging established nomenclature.”
  • Taming the high frontier: Five works featuring space colonies. The idea is still alive and well today; see Bezos’ vision of space colonisation. “As you know, proposals to establish permanent communities in space did not begin with O’Neill. Nor was he the first person to focus on settling asteroids. But it was his book, The High Frontier, that really captured popular attention.”

Algorithms, Automations & Augmentations

  • AI to revolutionise fundamental physics and ‘could show how universe will end’. “Prof Mark Thomson, the British physicist who will assume leadership of Cern on 1 January 2026, says machine learning is paving the way for advances in particle physics that promise to be comparable to the AI-powered prediction of protein structures that earned Google DeepMind scientists a Nobel prize in October.”
  • Frontier AI systems have surpassed the self-replicating red line. I’m sure this is fine. “We for the first time discover that two AI systems driven by Meta's Llama31-70B-Instruct and Alibaba's Qwen25-72B-Instruct, popular large language models of less parameters and weaker capabilities, have already surpassed the self-replicating red line. In 50% and 90% experimental trials, they succeed in creating a live and separate copy of itself respectively.”
  • Not every AI prompt deserves multiple seconds of thinking: how Meta is teaching models to prioritize. Feels like something they might have thought of doing before launch. “A new technique presented by researchers at Meta AI and the University of Illinois Chicago trains models to allocate inference budgets based on the difficulty of the query. This results in faster responses, reduced costs, and better allocation of compute resources.” A toy sketch of the routing idea follows this list.
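To illustrate the general idea only (this is not Meta’s method; the paper trains the model itself to make the allocation, and the heuristic and budgets below are invented), a difficulty-aware router might look something like this:

```python
# Toy illustration of difficulty-aware inference budgets.
# The difficulty heuristic and the budget tiers are invented for
# this example; a real system would learn the allocation.

def estimate_difficulty(query: str) -> float:
    """Crude stand-in: longer, multi-step questions score as harder."""
    hints = ("prove", "step by step", "why", "derive")
    score = min(len(query) / 200, 1.0)
    return min(score + 0.5 * any(h in query.lower() for h in hints), 1.0)

def reasoning_budget(query: str) -> int:
    """Map estimated difficulty to a cap on 'thinking' tokens."""
    d = estimate_difficulty(query)
    if d < 0.3:
        return 0      # answer directly, no deliberation
    if d < 0.7:
        return 512    # a little thinking
    return 4096       # full deliberation for hard queries

for q in ("What's the capital of France?",
          "Prove that the sum of two even numbers is even, step by step."):
    print(q, "->", reasoning_budget(q), "thinking tokens")
```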

Asides

  • Ada Palmer’s book Inventing the Renaissance is coming out soon and she’s been writing a series of posts as a countdown to it. The latest is Gonzaga vs Sanseverino: I Fart in Your General Direction, and they all include lots of visuals and insights.
  • Do aliens exist? We studied what scientists really think. “In total, 521 astrobiologists responded, and we received 534 non-astrobiologist responses. The results reveal that 86.6% of the surveyed astrobiologists responded either ‘agree’ or ‘strongly agree’ that it’s likely that extraterrestrial life (of at least a basic kind) exists somewhere in the universe.”
  • Yet more AI, sorry, but this one is a bit of a laugh. The Infinite Conversation, “an AI generated, never-ending discussion between Werner Herzog and Slavoj Žižek.” Uncanny.

Your Futures Thinking Observatory