The Shift to Typed Parts: Why AI Agents Need Structured, Multi-Part Outputs
For years, the core interaction with Large Language Models (LLMs) was deceptively simple: a text prompt goes in, and a text response comes out. That foundational assumption, that all communication is a single continuous stream of text, is fast becoming a bottleneck. As AI agents take on complex, multi-step tasks, the underlying software architecture must evolve to handle inputs and outputs that are far richer and more structured.
From Simple Text to Typed Parts: The Architectural Leap
The industry standard is undergoing a major refactor. New releases, such as the LLM 0.32a0 alpha, are fundamentally changing how inputs and outputs are processed. Instead of treating the entire conversation history as a single text blob, the system now models input as a sequence of distinct 'messages', one per conversational turn. This mirrors established practice, such as the OpenAI chat completions API, and lets the model track context and turn-taking more reliably.
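The difference can be sketched in a few lines of Python. This is an illustrative model, not the actual API of the LLM library or of any provider SDK: the `Message` class and `to_chat_payload` helper are hypothetical names, though the output shape matches the familiar OpenAI-style `{"role": ..., "content": ...}` message list.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """One conversational turn: a role plus its content."""
    role: str      # "system", "user", or "assistant"
    content: str


def to_chat_payload(messages):
    """Convert Message objects into the list-of-dicts shape used by
    OpenAI-style chat completions endpoints."""
    return [{"role": m.role, "content": m.content} for m in messages]


# The old world: one undifferentiated text blob.
blob = "System: be concise. User: summarize the notes. Assistant: which ones?"

# The new world: the same history as distinct, typed turns.
history = [
    Message("system", "You are a concise assistant."),
    Message("user", "Summarize the release notes."),
    Message("assistant", "Which release?"),
    Message("user", "The latest alpha."),
]

payload = to_chat_payload(history)
```

With explicit turns, the runtime can validate roles, truncate history at turn boundaries, and attribute each span of text to a speaker, none of which is reliable when everything is concatenated into one string.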
What This Means for the Agentic Workflow
The shift to 'typed parts' is the technical backbone of true agentic workflows. Agents define measurable 'Outcomes' for complex, multi-step tasks; they are not just chatting, they are executing structured processes. Features like 'Dreaming', in which an agent inspects previous sessions to create new plans or memories, depend entirely on the ability to consume and emit structured, non-textual data. Similarly, multi-agent orchestration lets developers assemble 'fleets of agents' to solve problems that require diverse, coordinated outputs.
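What 'typed parts' means in practice can be sketched as follows. This is a minimal, hypothetical model, and the names (`TextPart`, `ToolCallPart`, `render`) are illustrative rather than any library's actual API: a model response becomes a sequence of tagged parts that structured consumers dispatch on by type, instead of one string they must parse.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class TextPart:
    """Plain prose intended for the user."""
    text: str


@dataclass
class ToolCallPart:
    """A structured request for the agent runtime to execute a tool."""
    tool: str
    arguments: dict


Part = Union[TextPart, ToolCallPart]


def render(parts: list) -> str:
    """Flatten a typed-part stream into a log line. A real agent loop
    would instead route each part: text to the UI, tool calls to the
    tool executor."""
    out = []
    for p in parts:
        if isinstance(p, TextPart):
            out.append(p.text)
        elif isinstance(p, ToolCallPart):
            out.append(f"[tool:{p.tool}]")
    return " ".join(out)


# One model reply that interleaves prose with a structured tool call.
reply = [
    TextPart("Checking the calendar."),
    ToolCallPart("calendar.lookup", {"date": "2025-03-01"}),
    TextPart("Done."),
]
```

Because the tool call arrives as data rather than as text to be regex-parsed, the runtime can execute it, attach the result as another typed part, and feed the whole structured exchange back into the next turn, which is exactly what multi-step agent loops require.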
The New Standard: Proof of Use Over Documentation
This architectural evolution reflects a deeper shift in the value proposition of AI software. The focus is moving away from 'documented quality' (perfect READMEs, comprehensive tests) toward 'proven, real-world usage.' As AI agents dramatically accelerate development, the value of a system is no longer measured by how well it is documented but by its sustained adoption and successful long-term use in complex enterprise environments. The standard of proof is shifting from a perfect codebase to a successful, functional outcome.
Ultimately, handling inputs as structured 'messages' and outputs as typed, multi-part streams is not just a technical refinement; it is the prerequisite for building reliable, autonomous, and deeply integrated AI systems that can truly operate as software components.