Gemini CLI: your open-source AI agent

Now Google has its own CLI coding agent: a massive context window, a generous free tier that covers most use cases, and fully open-source. I've yet to try it, but it looks impressive.

Anthropic wins ruling on AI training in copyright lawsuit but must face trial on pirated books

A very important copyright legal precedent. TL;DR: training models on legally acquired copyrighted materials is fair use; the court found such training "quintessentially transformative."

The illegal acquisition of training data is punishable as before.

Claude Deep Research, or How I Learned to Stop Worrying and Love Multi-Agent Systems

I usually approach shiny new things with a healthy dose of skepticism. Until recently, this was precisely my attitude toward multi-agent systems. This is hardly surprising, given the immense hype surrounding them and the conspicuous absence of genuinely successful examples. Most implementations that actually worked fell into one of the following categories:

- Agentic systems following a predefined plan. These are essentially LLMs with tools, trained to automate a very specific process. This approach allows each step to be tested individually and its results verified. Such systems are typically described as a directed acyclic graph (DAG), sometimes dynamic, and developed using now-standard primitives from frameworks like LangChain and Griptape. The early implementation of Gemini Deep Research operated this way: first, a search plan was created, then the search was executed, and finally, the results were compiled.
- Solutions operating in systems with a feedback loop. Claude Code, Cursor, and other code-generating agents fall into this group. The stronger the feedback loop (that is, the better the tooling and the stricter the type checking), the greater the chance they won't completely wreck your codebase.
- Models trained using Reinforcement Learning, such as those with interleaved thinking, like OpenAI's o3. This is a separate, very interesting conversation, but even these models have a certain modus operandi defined by the specifics of their training.

Meanwhile, open-ended multi-agent systems have largely remained in the proof-of-concept stage due to their general unreliability. The community lacked a clear understanding of where and how to implement them. This was the case until Anthropic published a deeply technical article on how they developed their Deep Research system. It defined a reasonably clear framework for building such systems, and that is what we will examine today. ...
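The first category, the predefined-plan pipeline, can be sketched in a few lines. This is a minimal illustration, not anyone's actual implementation; the three functions are hypothetical stand-ins for the LLM and tool calls a real system would make:

```python
# A fixed DAG of plan -> search (fan-out) -> compile (fan-in),
# mirroring the plan/execute/compile flow described above.

def make_plan(query: str) -> list[str]:
    # A real system would ask an LLM to decompose the query; stubbed here.
    return [f"{query}: overview", f"{query}: criticisms"]

def run_search(step: str) -> str:
    # A real system would call a search tool; stubbed here.
    return f"results for [{step}]"

def compile_report(results: list[str]) -> str:
    # A real system would ask an LLM to synthesize the findings.
    return "\n".join(results)

def deep_research(query: str) -> str:
    steps = make_plan(query)                  # node 1: plan
    results = [run_search(s) for s in steps]  # node 2: execute each step
    return compile_report(results)            # node 3: compile
```

Because each node has a fixed input and output, every step can be tested and verified in isolation, which is exactly why this category worked while open-ended systems didn't.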

June 23, 2025 · 13 min

Radiology has embraced AI enthusiastically, and the labor force is growing nevertheless. The augmentation-not-automation effect of AI is despite the fact that AFAICT there is no identified “task” at which human radiologists beat AI. So maybe the “jobs are bundles of tasks” model in labor economics is incomplete. […]

Can you break up your own job into a set of well-defined tasks such that if each of them is automated, your job as a whole can be automated? I suspect most people will say no. But when we think about other people’s jobs that we don’t understand as well as our own, the task model seems plausible because we don’t appreciate all the nuances.

Arvind Narayanan

Nevertheless, my take on this is that while jobs won't be fully automated, one specialist will be able to do more work, so it all boils down to the classic supply-and-demand problem. I believe that in most areas demand will still outweigh supply, but not in all of them. See the Jevons paradox.

Cato CTRL™ Threat Research: PoC Attack Targeting Atlassian’s Model Context Protocol (MCP) Introduces New “Living off AI” Risk

Another MCP server vulnerability, this time from Atlassian. It allows prompt injection via external support tickets, giving an attacker the opportunity to exfiltrate data and wreak havoc in internal systems.

MCP's June Update: Safer, Smarter, Simpler?

The Model Context Protocol, despite its aggressive adoption (or perhaps because of it), continues to evolve. Anthropic recently updated the MCP specification, and below we'll look at the main changes.

Security Enhancements

An MCP server is now always classified as an OAuth Resource Server, and clients are required to implement Resource Indicators (RFC 8707). This is necessary to protect against attacks like the Confused Deputy. Previously, tokens requested by a client from an authorization server were "impersonal," meaning they could be used by anyone. This allowed an attacker to create a phishing MCP server, deceive a client, steal the token, and use that token to gain access to the real MCP server. ...
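To make the fix concrete, here is a minimal sketch of what an RFC 8707-style token request looks like. The endpoint, client id, and code values are placeholders; the point is the `resource` parameter, which binds the issued token to one specific MCP server so a token phished through a fake server is useless elsewhere:

```python
import urllib.parse

# Hypothetical OAuth token request with an RFC 8707 Resource Indicator.
# Only the "resource" field is new relative to a plain RFC 6749 request.
token_request = {
    "grant_type": "authorization_code",
    "code": "SplxlOBeZQQYbYS6WxSbIA",           # placeholder auth code
    "client_id": "example-mcp-client",          # hypothetical client id
    "resource": "https://mcp.example.com/mcp",  # the intended MCP server
}
body = urllib.parse.urlencode(token_request)
```

The authorization server can now issue a token scoped to that one resource, so even a stolen token cannot be replayed against the real MCP server on behalf of the attacker's phishing server.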

June 19, 2025 · 5 min

Agent mode is now generally available with MCP tools support in Visual Studio

GitHub Copilot rolled out general availability of Model Context Protocol support in its Agent mode. The usual MCP security problems are compounded by the "Always allow" option for tool usage.

Update to GitHub Copilot consumptive billing experience

It's the end of unlimited access to the best models from all leading providers in GitHub Copilot Chat. Now only GPT-4o and GPT-4.1 are unlimited.

It is another step in the global reevaluation of pricing strategies for AI-based products. The aggressive promotion phase is over. AI is becoming another kind of utility, and we can expect utility-like pricing to follow.

https://eugeneyan.com/writing/writing-faq/

This FAQ about blogging kinda resonates with my own motivation for writing.

AI makes the humanities more important, but also a lot weirder

AI is a double-edged sword in education. It helps students cheat on traditional assignments, but it also acts as a powerful new tool that can fully engage them. Education has to change, but I believe it will ultimately be for the better.