
Your AI Is Already in the Field. Do You Know What It’s Doing?
03/03/2026
For the past two years, the AI race has been framed in almost childish terms: bigger models, bigger clusters, bigger capex, bigger valuations. The market has behaved as if intelligence were simply a function of scale, and scale a function of electricity, water, and money.
The question today is not whether AI will become more powerful. It will. The question is whether it can become more economically inclusive, more regionally distributed, and more resource-responsible before its infrastructure locks in a deeply unequal and environmentally expensive model by default. On that front, the answer is not obvious, but it is not hopeless either.
There might be a path. It simply does not look like the one Silicon Valley has trained us to admire.
The Ecological Ledger: Measuring the True Cost of Scale
Today, global data centre electricity consumption is estimated at about 415 TWh, roughly 1.5% of global electricity demand in 2024, and the International Energy Agency projects that this could rise to around 945 TWh by 2030, with data centre demand growing about 15% per year over that period. Measured at the plant rather than the rack, the same IEA analysis puts the generation needed to supply data centres at 460 TWh in 2024, rising to more than 1,000 TWh in 2030, the gap reflecting losses in transmission and distribution. (IEA)
Water is the even less discussed side of the ledger. A widely cited research paper estimated that training GPT-3 could directly evaporate around 700,000 litres of clean freshwater, and projected that global AI demand could account for 4.2 to 6.6 billion cubic metres of water withdrawal in 2027. Whatever the exact methodology, the direction is unmistakable: the hidden ecological bill of AI is no longer a rounding error. (arXiv)
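These magnitudes are easy to check. The sketch below, in Python, starts only from the figures cited above; the compounding and the litres-to-cubic-metres conversion are the only steps added, so treat it as an illustration rather than new analysis.

```python
# Sanity check of the electricity and water figures cited above.
# Inputs are the IEA estimate (415 TWh in 2024, ~15% annual growth)
# and the research paper's water numbers; the arithmetic is the only
# thing added here.

consumption_2024_twh = 415           # IEA estimate for data centres, 2024
annual_growth = 0.15                 # ~15% per year, 2024-2030

implied_2030_twh = consumption_2024_twh * (1 + annual_growth) ** 6
print(f"Implied 2030 demand: {implied_2030_twh:.0f} TWh")  # ~960, near the IEA's ~945

# Water: one GPT-3 training run vs. the projected 2027 global withdrawal.
training_run_m3 = 700_000 / 1_000    # 700,000 litres -> 700 cubic metres
projected_2027_m3 = 4.2e9            # low end of the 4.2-6.6 billion m3 range
print(f"Projected 2027 withdrawal is {projected_2027_m3 / training_run_m3:,.0f}"
      " times one GPT-3 training run")
```

Run as written, the compounded figure lands within a couple of percent of the IEA’s published 945 TWh, and the projected 2027 withdrawal comes out at roughly six million times the water bill of a single GPT-3 training run.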
And yet sustainability is only half the story. The other half is inclusion.
If AI remains dependent on a handful of hyperscale firms, massive data centre campuses, English-dominant datasets, and universal cloud subscriptions, then the future of intelligence will be concentrated in the same places where capital, compute, and geopolitical leverage are already concentrated. That is not merely an energy problem. It is a civilisational design choice.
Beyond Silicon Valley: Building Culturally Grounded Systems
An inclusive AI ecosystem would look very different. It would include frontier models, yes, but also smaller open models for underrepresented languages, sovereign deployments for public institutions, and edge systems optimised for specific tasks where latency, privacy, and energy efficiency matter more than generality. It would not assume that every school, ministry, hospital, or factory in Europe, Africa, Central Asia, or the Balkans must rent cognition from three or four American platforms forever.
We already have early evidence that another path is possible. In 2025, researchers released Sherkala Chat, an 8B open weight model designed for Kazakh, trained on 45.3 billion tokens across Kazakh, English, Russian, and Turkish, with the explicit aim of improving LLM inclusivity for Kazakh speakers. That matters. Inclusion in AI will not come only from building ever larger universal systems. It will also come from building capable, culturally grounded models for languages and societies that the mainstream AI economy still underserves. (arXiv)
This is why the emergence of Taalas is so interesting.
Not because it has “defeated” Nvidia. It has not. And not because every AI model should be hardwired into silicon. They should not.
Specialized Hardware: Printing Knowledge into Silicon
Taalas matters because it challenges a dangerous assumption: that the only serious future for AI is a universal, software-defined model running on ever larger, ever hotter, ever more capital-intensive compute stacks.
The company, based in Toronto, says it has now raised $219 million in total funding. Its approach is radical: instead of treating the model as software that runs on a general-purpose substrate, Taalas effectively prints key parts of a specific model into custom silicon. Reuters reports that the company customises only the top two metal layers of an almost complete chip and says TSMC can fabricate a chip for a specific model in about two months. The Taalas HC1 demonstrator runs Llama 3.1 8B, is built on TSMC 6nm, and the company says it delivers 17,000 tokens per second per user on that model in a 2.5 kW server. (Reuters)
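It is worth pausing on what those figures imply per token. The short calculation below takes the company’s own numbers at face value and deliberately assumes the entire 2.5 kW server serves a single user stream, so the result is an upper bound on energy per token, not a benchmark.

```python
# Energy and latency per token implied by the quoted Taalas HC1 figures.
# Pessimistic assumption: the whole 2.5 kW server serves one user stream
# at 17,000 tokens/s; any concurrency would lower the per-token energy.

server_power_w = 2_500        # 2.5 kW server, as stated by the company
tokens_per_second = 17_000    # quoted per-user throughput on Llama 3.1 8B

energy_mj_per_token = server_power_w / tokens_per_second * 1_000
latency_us_per_token = 1 / tokens_per_second * 1e6

print(f"Energy upper bound: {energy_mj_per_token:.0f} mJ per token")   # ~147 mJ
print(f"Per-token time:     {latency_us_per_token:.0f} microseconds")  # ~59 us
```

None of this validates the claims; it only shows why, if they hold, the per-token economics and latency of fixed-function silicon would look very different from general-purpose cloud inference.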
What makes this story even more resonant for our part of the world is that Taalas is led by Ljubiša Bajić, who, Fortune reported, grew up in Serbia, moved to Canada, and built his career in advanced chip design. (Fortune)
Challenging the Center: Regional Innovation as a Strategic Advantage
This is not a trivial detail. Too often, the Balkans appear in technology narratives only as markets, subcontractors, or talent pools to be harvested by others. Taalas is a reminder that world-class technical imagination can also emerge from our broader regional intellectual tradition, and challenge the centre, not merely serve it.
Still, the point is not Balkan sentiment. The point is strategic architecture.
Taalas embodies one possible answer to AI’s sustainability problem: specialisation. If a model is going to be used constantly, for a stable task, in a constrained environment, then perhaps it should not run on an expensive, overpowered, general-purpose stack designed for infinite flexibility. Perhaps some models should become something closer to silicon instinct: optimised, fast, and frugal.
That matters especially in robotics, industrial systems, defence, edge devices, and public infrastructure, where milliseconds, wattage, and local autonomy often matter more than generalised brilliance. In those settings, a smaller or fixed model may be more valuable than a frontier model that is marginally more capable but far more expensive, power hungry, and dependent on remote cloud infrastructure.
But this is not a romantic story. It is a trade-off story.
Hardwiring models into chips creates obvious risks: lock-in, shorter hardware relevance cycles, and the possibility that if model architectures change quickly, perfectly good silicon becomes prematurely obsolete. Taalas is, in effect, making a bold wager that at least some important layers of the AI stack are stabilising enough for hardware specialisation to pay off. That may prove brilliant. It may also prove early.
Even so, the company is forcing the right question onto the table. The future of AI does not have to be a monoculture.
The Road to Maturity: Efficiency as the New Benchmark of Intelligence
A mature AI ecosystem will probably be plural. Frontier cloud models will remain essential for science, reasoning, and open-ended exploration. But alongside them, we are likely to see national and sectoral models, multilingual open weight systems, distilled models at the edge, and highly optimised inference hardware for recurring workloads. That is not fragmentation. It is the beginning of adulthood.
And it is here that sustainability and inclusion start to converge.
A system that relies less on one-size-fits-all compute gigantism can also become more regionally adaptable. A system that supports smaller, open, and local models can serve more languages and institutions. A system that values fit-for-purpose inference can reduce waste. A system that measures electricity, water, and lifecycle costs honestly can finally start to align intelligence with responsibility.
Choosing Wisdom over Size: The Next Five Years
The real question is not whether AI becomes too powerful. It will. It is whether we build its infrastructure in such a narrow way that we end up with the worst of both worlds: a resource-hungry system that is still socially exclusionary.
If we are serious, then three things should follow.
First, AI providers should disclose far more clearly the energy, water, and infrastructure intensity of both training and inference. Second, governments and institutions should support smaller, multilingual, open, and sovereign AI deployments rather than assuming hyperscaler dependency is inevitable. Third, investors should stop pretending that efficiency innovation is a side story. In the next phase of AI, efficiency is not a footnote to intelligence. It is part of intelligence.
The future of AI should not be decided only by who can build the largest model or the biggest campus. It should also be shaped by who can build systems that are efficient enough to scale responsibly, inclusive enough to matter broadly, and adaptable enough to serve real societies rather than only capital markets.
That is why the most important AI question of the next five years may not be how big models become.
It may be how intelligently we choose where bigness is necessary, and where it is not.




