Another month is behind us, and with it another batch of news from the world of artificial intelligence. As usual, it was busy: new models, new architectures, record-breaking context windows, and a push on speed and price.
We chose carefully and focused on what we think moves AI forward the most.
Llama 4
Meta released Llama 4, a new family of open-weight language models. Three things stand out:
- A move to the MoE (Mixture of Experts) architecture, which activates only a small part of the model, the selected "experts", for each token it processes. The result is both higher speed and lower cost.
- Three models: Scout, the fastest; Maverick, with a million-token context window; and Behemoth, the largest, which is still in training.
- Scout handles a context window of up to 10 million tokens, an extreme jump over commonly available models. But a context window that large is still mostly theoretical: models can't yet reliably "recall" all the information at this scale.
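To make the MoE idea more concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a toy illustration, not Llama 4's actual router: the four experts, the `top_k` value, and the names `route_token` and `moe_layer` are all assumptions made up for this example.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_scores, top_k=1):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

def moe_layer(token, experts, router_scores, top_k=1):
    """Run only the chosen experts and mix their outputs by router weight."""
    weights = route_token(router_scores, top_k)
    return sum(w * experts[i](token) for i, w in weights.items())

# Hypothetical toy experts: each one just scales its input and logs that it ran.
calls = []

def make_expert(idx):
    def expert(x):
        calls.append(idx)
        return x * (idx + 1)
    return expert

experts = [make_expert(i) for i in range(4)]

# The (made-up) router scores strongly favor expert 1, so only that expert runs.
output = moe_layer(3.0, experts, router_scores=[0.0, 10.0, 0.0, 0.0], top_k=1)
print(output, calls)
```

The point of the sketch is the `calls` list: out of four experts, only one does any work for this token, which is where the speed and cost savings come from.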
GPT-4.1 models
OpenAI is shipping a new iteration of its core model, GPT-4.1. What it offers:
- it's available primarily via API,
- it comes in three variants (GPT-4.1, the faster but weaker Mini, and the fastest Nano),
- it's cheaper than GPT-4o but a bit slower; response speed is its bottleneck,
- it handles up to one million tokens,
- it follows instructions much better.
The model works well with long texts and their context. Alongside GPT-4.1, OpenAI introduced a new long-context benchmark, MRCR (Multi-round Co-reference Resolution). The GPT-4.1 Nano variant is the fastest of the three, but also the least capable.
OpenAI launches multimodal models o3 and o4-mini
These are OpenAI's most advanced reasoning models to date. Both are available to paying users and can be used via API.
- o3 reaches state-of-the-art results on genuinely hard benchmarks such as Codeforces or SWE-bench.
- o4-mini is a smaller but faster reasoning model.
- Both models are trained specifically to use tools (function calling), which hints at where you'll likely use them: in intelligent agents.
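For readers who haven't seen tool use in practice, the sketch below shows the application side of function calling: the model asks for a tool by name with JSON arguments, and your code runs it and sends the result back. Everything here is hypothetical and self-contained. The `get_weather` tool, the `dispatch` helper, and the shape of the mocked model output are assumptions for illustration, not OpenAI's actual API.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather service."""
    return f"Sunny in {city}"

# Registry of tools the application is willing to run on the model's behalf.
TOOLS = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Execute one tool request and return a JSON string to hand back to the model."""
    name = tool_call["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    args = json.loads(tool_call["arguments"])  # arguments arrive as a JSON string
    result = TOOLS[name](**args)
    return json.dumps({"result": result})

# Mocked model output (assumed shape): a tool name plus JSON-encoded arguments.
fake_model_output = {"name": "get_weather", "arguments": '{"city": "Prague"}'}
reply = dispatch(fake_model_output)
print(reply)
```

In an agent, this dispatch step sits inside a loop: the tool result goes back to the model, which either requests another tool or writes its final answer.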
Gemini 2.5 Pro does well
Google's models keep gaining users, mainly thanks to their strong price-performance-speed ratio.
Gemini 2.5 Pro is currently Google's most capable model, and for now it's available free to everyone.
OpenAI considered buying Windsurf IDE
Last week, OpenAI reportedly began talks to buy Windsurf, a competitor to Cursor, for three billion dollars. That shows two things: how much AI development tools are worth now and will be worth in the future, and, more importantly, that OpenAI is aiming long-term at end-user products.
Development in AI is in full swing and new releases keep coming every week. We track them for you and will bring you the highlights in the months ahead.
