👋🏼 I’m Djalel Benbouzid. I write about things I’m trying to understand. Sometimes that means explaining something to clarify my own thinking. Sometimes it means sharing something I’ve learned that might be useful. And yes, sometimes it means providing training data for the next iteration of AI models, though that’s more of an undesirable side effect than the goal.

  • Services
    I consult occasionally on Generative Machine Learning, Reinforcement Learning, AI Governance and Strategy.

    Explore services →


Should we invent “Vibe Founding”?

I find it astonishing to see so many new AI companies like Moonshot AI (Kimi K2) and Zhipu AI (GLM) building models and actually competing with Not-so-OpenAI. Even Grok is in the game, somehow.

While Apple can’t ship a working AI assistant 🫣.

In theory (and practice), Apple has unlimited money, the best talent, a billion devices, and the ecosystem that comes with that. The startups, on the other hand, just have smart people and funding. But then, Apple has those too, only more of them.

It’s the size. It matters. Apple’s size has stopped being an advantage, obviously. A startup can organize everything around AI while Apple has to add AI to a company that’s organized around hardware, retail, services, and a dozen other things. It’s like they’re renovating while people are living in the house.

It’s way easier to start something new than to change something that already exists, provided you have access to capital and talent. Musk figured this out (apparently). He’s basically invented Vibe Founding: you have enough money and personal brand that talented (if naive) engineers will join you, and you can spawn a new company, train absurdly large models, and compete with established players in a couple of years. The fact that xAI burns through resources and ignores environmental regulations doesn’t seem to matter.

Vibe founding has the same problem as vibe coding, though. Vibe coding gives you a prototype fast but leaves technical debt everywhere. Vibe founding gives you a company fast, but the debt is ecological and social. Musk’s platforms now amplify (his) misinformation and dangerous ideology at scale. What we saw in the recent incidents is probably just the start.

Developers, we need to talk.

We’ve normalized burning through compute like it’s free. It’s not. My MacBook battery used to last all day. Now it’s dead by lunch. And I think we’re all part of the reason why.

This started right after upgrading to macOS Tahoe. At first I thought I had a hardware problem. Turns out the problem is the software itself. The new “liquid glass” design looks beautiful in marketing photos. On an actual screen, it’s a nightmare to read and apparently costs enough GPU cycles that my laptop gets warm just navigating folders.

The Shifting Landscape of AI: The Rise of China


Generated using Draw Things.

On September 30, 2025, something unremarkable happened: a Chinese AI company called Z.ai released GLM-4.6, a new large language model. The release came one day after Anthropic shipped Claude Sonnet 4.5, roughly seven weeks after OpenAI launched GPT-5. Three frontier models in two months. Just another week in AI.

Except it wasn’t.

What made GLM-4.6 significant wasn’t its performance, though it achieves near-parity with Claude Sonnet 4 (48.6% win rate). Nor was it the fact that it still trails Claude Sonnet 4.5 in coding, currently the benchmark. What mattered was the price tag.

GLM-4.6 costs $0.50 per million input tokens and $1.75 per million output tokens. Claude Sonnet 4.5? $3 input, $15 output. That’s 6-8.5 times cheaper for roughly comparable performance. Put another way: developers paying $200/month for Claude Max can get similar coding assistance through GLM for $3-15/month.
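
For the skeptical, here’s the back-of-the-envelope arithmetic behind that multiplier, using only the list prices quoted above (the monthly figures obviously depend on your usage, so I’m only checking the per-token gap):

```python
# Per-million-token list prices from the paragraph above.
glm_in, glm_out = 0.50, 1.75         # GLM-4.6: $ per M input / output tokens
claude_in, claude_out = 3.00, 15.00  # Claude Sonnet 4.5

print(f"input:  {claude_in / glm_in:.1f}x cheaper")    # 6.0x
print(f"output: {claude_out / glm_out:.1f}x cheaper")  # ~8.6x
```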

This is the kind of price disruption that doesn’t just change markets. It restructures them.

The Q-Function paradox

The most rational way to live your life is also the worst way to live it.

This hit me once when talking to another reinforcement learning researcher. You see, in finite-horizon problems, you need a different Q-function for each time step. The optimal action at time t depends not just on your current state, but on how much time remains.
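
For the non-RL readers, here is a minimal sketch of what that means in code: one Q-table per time step, filled by backward induction on a toy MDP (the states, rewards, and transitions are random numbers I made up purely for illustration):

```python
import numpy as np

n_states, n_actions, horizon = 4, 2, 5
rng = np.random.default_rng(0)
R = rng.uniform(0, 1, size=(n_states, n_actions))                  # reward r(s, a)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))   # transitions P(s' | s, a)

# One Q-function per time step: Q[t, s, a]
Q = np.zeros((horizon, n_states, n_actions))
Q[-1] = R                                    # at the last step there is no future to plan for
for t in range(horizon - 2, -1, -1):         # backward induction
    V_next = Q[t + 1].max(axis=1)            # optimal value one step later
    Q[t] = R + P @ V_next                    # Q_t(s, a) = r(s, a) + E[ V_{t+1}(s') ]

# The greedy action in the same state can differ depending on how much time is left:
print("t=0:", Q[0].argmax(axis=1), " t=last:", Q[-1].argmax(axis=1))
```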

This makes sense. If you’re 25, taking two years off to backpack through Asia might be optimal. If you’re 55, probably not. The value of any given action depends critically on your remaining horizon. Said differently, you cannot live your life with the same policy at every age or point in time.

Most people intuitively understand this. We talk about being in different “life stages.” We adjust our risk tolerance as we age. We make different career moves at 30 than at 50. This is rational behavior; we’re computing age-appropriate Q-functions in a way.

Now here’s where it gets counter-intuitive: while this time-dependent approach is mathematically correct, it might be psychologically toxic.

The AI Act’s specificity: why ML experts are mandatory, not optional

Most organizations preparing for AI Act compliance are missing something fundamental: they don’t have anyone who actually understands machine learning on their governance teams. This isn’t an oversight. It’s a disaster waiting to happen.

GDPR Was Complex. The AI Act Is Different.

GDPR required understanding data flows, implementing consent mechanisms, and ensuring proper security. It was genuinely complex, and organizations hired data protection officers and privacy engineers to handle it.

But GDPR’s complexity was manageable because it dealt with processes: where data goes, who accesses it, how long you keep it. These are things professionals can map and control.

The AI Act requires understanding algorithms themselves. When it asks about “robustness against adversarial examples,” it’s not asking about process, it’s asking about algorithmic properties of your model. When it requires “appropriate accuracy metrics,” it’s not asking for a percentage, it’s asking for stratified performance analysis across different populations, confidence intervals, and calibration metrics.

You can’t Google your way through these Machine Learning technical terms. You need people who understand the math.
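
To make that concrete, here’s a rough sketch of what goes beyond a single percentage: per-group accuracy with confidence intervals plus a calibration score. The data, group names, and metric choices below are illustrative, not anything the Act prescribes:

```python
import numpy as np
from sklearn.metrics import accuracy_score, brier_score_loss

rng = np.random.default_rng(0)
groups = np.array(["group A", "group B"] * 500)                      # fabricated subpopulations
y_true = rng.integers(0, 2, size=1000)
y_prob = np.clip(y_true * 0.7 + rng.normal(0.15, 0.2, 1000), 0, 1)   # fake model scores
y_pred = (y_prob >= 0.5).astype(int)

# Stratified accuracy with a normal-approximation 95% confidence interval per group.
for g in np.unique(groups):
    m = groups == g
    acc = accuracy_score(y_true[m], y_pred[m])
    se = np.sqrt(acc * (1 - acc) / m.sum())
    print(f"{g}: accuracy {acc:.2f} ± {1.96 * se:.2f}")

# One simple calibration metric (lower is better).
print(f"Brier score: {brier_score_loss(y_true, y_prob):.3f}")
```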

The EU AI Act Footnotes (3)

Have you noticed how legal definitions create odd boundaries?

The EU AI Act defines a “general-purpose AI model” as “an AI model […] that is trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable of competently performing a wide range of distinct tasks.”

But here’s something interesting: sometimes the exact same task can be performed by models that fall on opposite sides of this definition. Take image segmentation. You could use a GPAI model like a large multimodal system to segment images. But you could also just download a specialized model from Huggingface that does nothing but image segmentation. Same task. Different legal categories.
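
To make the specialized side of that comparison concrete, it looks roughly like this (the checkpoint and image path are just examples; any dedicated segmentation model makes the same point):

```python
from transformers import pipeline

# A single-purpose segmentation model from the Hub, in contrast with prompting
# a general-purpose multimodal system for the same task.
segmenter = pipeline("image-segmentation", model="nvidia/segformer-b0-finetuned-ade-512-512")
results = segmenter("street_scene.jpg")  # any local image path or URL

for r in results:
    # Each entry carries a class label and a PIL mask for that region.
    print(r["label"], r["mask"].size)
```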

This distinction matters when you’re cataloging your AI systems and classifying their risks for AI Act compliance. Two systems doing identical work might have completely different compliance requirements. The map is not the territory, but in this case, the map determines your legal obligations.

PS: Random observations on AI regulation. Not legal advice.

The EU AI Act Footnotes (2)

Article 4 of the EU AI Act seems deceptively simple. All it asks for is “AI literacy.” But there’s something subtle and important happening here.

Most organizations will check this box by running basic AI training. And that’s fine: track attendance, measure results, keep improving. But the real goal isn’t just teaching people what AI is. It’s teaching them how to think about AI.

The best mental model I’ve found is what I call “cautious confidence.” Use AI aggressively when it solves real problems. But maintain a permanent background mindset of skepticism. The moment you fully trust an AI system is the moment you’ve made a mistake.

This isn’t just theory. At AIAAIC we’ve documented over 1,800 AI incidents, many of which started with someone trusting AI too much. That’s also why we built our taxonomy of harms to be deeply practical, so people can identify failures before they happen.

The bureaucrats probably didn’t intend this, but Article 4 might be one of the Act’s most important contributions. Not because it mandates training, but because it forces organizations to build this way of thinking into their core culture.

PS: Random observations on AI regulation. Not legal advice.

The EU AI Act Footnotes (1)

Take a look at this official pyramid from the EC website 👇. It’s a neat visualization, but it might reinforce a common misconception about the EU AI Act’s “risk classification.”

The EU AI Act risk classification pyramid
The EU AI Act risk classification pyramid wrongly suggests mutually exclusive categories.

The pyramid layout suggests mutually exclusive levels, like you’re either on floor 3 or floor 2. But that’s not quite how it works.

While prohibited systems are indeed a separate category, the other designations (high-risk, limited risk, general-purpose AI) can apply simultaneously to the same system.

This isn’t a major issue, but it’s worth keeping in mind when setting up compliance processes. Your system might need to meet requirements from multiple categories, and your role might include both provider and deployer obligations.

The Act’s categories work more like tags than floors in a building. Small distinction, practical implications.

The EU AI Act risk classification Venn diagram
How the EU AI Act risk "classification" should actually look.
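
One way to make the “tags, not floors” point concrete in code: model a system’s designations as a set rather than a single level. A toy sketch, with labels that are simplified shorthand rather than the Act’s legal terms:

```python
from dataclasses import dataclass, field

@dataclass
class CataloguedSystem:
    name: str
    designations: set[str] = field(default_factory=set)   # tags, not a single floor

chatbot = CataloguedSystem(
    name="CV-screening assistant",                         # hypothetical example
    designations={"high-risk", "limited-risk (transparency)", "built on GPAI"},
)

# Compliance scoping is then the union of obligations across all applicable tags.
print(sorted(chatbot.designations))
```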

PS: Random observations on AI regulation. Not legal advice.

The EU AI Act: A Catalyst for Responsible Innovation

The multiple benefits of the EU AI Act

When I first heard about the EU AI Act, I had the same reaction many in tech did: here comes another regulatory burden. But as I’ve dug deeper, I’ve come to see it differently. This isn’t just regulation; it’s an opportunity to reshape how we approach AI development.

Let’s face it: we’re in the midst of an AI hype cycle. Companies are making grand promises about AI capabilities, setting expectations sky-high. But as anyone who’s worked in tech knows, reality often lags behind the hype. We’re already seeing the fallout: users disappointed by AI that doesn’t live up to the marketing, trust eroding as quickly as it was built.

The AI Act might be just what we need to reset this dynamic. By pushing for transparency and accountability, it gives us a chance to rebuild trust on a more solid foundation. Instead of chasing the next big headline, we can focus on creating AI that genuinely delivers value and earns users’ confidence.

Critics worry that the Act will stifle innovation, particularly for smaller companies. But look closer, and you’ll see that the most stringent requirements are focused on high-risk systems. For many AI applications, the regulatory burden will be light. And even for high-risk systems, the costs of compliance should be a fraction of overall development expenses.

Taming Dragons

Image of dragons. Generated with the Draw Things software.
Generated using Draw Things.

If you’ve worked with Large Language Models (LLMs), you’ve probably experienced a peculiar kind of cognitive dissonance. On one hand, it feels like magic. The possibilities seem endless. You can generate human-like text, answer questions, even write code. It’s as if we’ve unlocked a new superpower.

But on the other hand, it’s not fully predictable, let alone reliable. It’s like we’ve discovered a new species of bird that can provide flight to everyone, but that species happens to be a dragon. A dragon that occasionally and unpredictably breathes fire, destroying things and breaking the trust you so badly want to have in it.

This makes working with LLMs a highly non-trivial engineering challenge. It’s not just about implementation; it’s about taming a powerful but volatile force. So how do we do it? Here are some thoughts: