Introducing ChatGPT agent: bridging research and action

OpenAI released an agent that works on its own virtual computer rather than on yours. It uses its reasoning capabilities to plan and carry out multi-step tasks, such as building spreadsheets and slide decks of the kind you would make in Excel and PowerPoint.

While Sam Altman “feels the AGI” watching the system work, I find it incredibly clunky. Instead of concentrating on giving models native tools (MCP is a good step forward, although not without its problems), OpenAI tries to emulate hands and eyes for them, so that models can do the same things we do, only slowly and awkwardly.
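
To make the contrast concrete, here is a toy sketch in Python. Everything in it is hypothetical: `Sheet`, `set_cell_tool`, and `set_cell_via_gui` are made-up names, and neither path reflects how ChatGPT agent or MCP actually works. It only counts the round trips between model and application needed for the same one-cell edit, under the assumption that every screenshot, click, and keystroke costs one round trip.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Sheet:
    """Toy stand-in for a spreadsheet; not a real Excel API."""
    cells: Dict[str, float] = field(default_factory=dict)
    selected: Optional[str] = None


def set_cell_tool(sheet: Sheet, cell: str, value: float) -> int:
    """Native tool path: one structured call, no perception step."""
    sheet.cells[cell] = value
    return 1  # round trips between model and application


def set_cell_via_gui(sheet: Sheet, cell: str, value: float) -> int:
    """GUI emulation path: look, click, type each character, confirm, look again."""
    steps = 0
    steps += 1                 # take a screenshot to find the cell
    steps += 1                 # click the cell
    sheet.selected = cell
    typed = ""
    for ch in str(value):      # one simulated keystroke per character
        typed += ch
        steps += 1
    steps += 1                 # press Enter to commit the value
    sheet.cells[sheet.selected] = float(typed)
    steps += 1                 # take another screenshot to verify
    return steps


if __name__ == "__main__":
    a, b = Sheet(), Sheet()
    print("tool round trips:", set_cell_tool(a, "B2", 42.5))
    print("GUI round trips: ", set_cell_via_gui(b, "B2", 42.5))
```

Both paths leave the sheet in the same state; the only difference in this toy model is how many slow, error-prone steps it takes to get there.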

So I consider this type of agent a temporary workaround until we develop better machine-to-machine communication mechanisms. After that, it will mostly serve the increasingly long tail of legacy systems that never get machine-usable interfaces.

P.S. Gemini raised a point I hadn’t considered: such systems can collect the data needed to train better embodied intelligence, meaning intelligence that can act in the real world. It’s a perfectly valid point and deserves attention.