The Agent Is the User

This week I had a good back-and-forth on X with someone who called Notion the ultimate backend for agents. His case was the strongest version of a position I keep hearing: that post-LLMs, the data model underneath your knowledge tool no longer matters. Models wrangle unstructured data for a living, so the user is abstracted away from whatever is underneath. The only things left to compete on are security and UX. By that logic, Notion wins.

It is a sharp take from someone who has clearly spent real time here, and I came down on the other side of most of it, but the exchange did something useful. It made me sit down and map every product currently calling itself a company brain, a second brain for agents, an AI operating system, a converged AI workspace, a memory layer, or any of the other phrases that have shown up on landing pages over the last eighteen months. The map looks nothing like the conversation about it.

The public debate is still arguing file system versus wiki. Drive versus Notion. Files versus blocks. Obsidian versus Roam. That fight has been going on for ten years and the people having it have strong opinions. Fine. What almost nobody is calling out is that there are now three different products using the same vocabulary: three different primary users, three different defensibility profiles, three different odds of being around in five years. The conflation is hiding what is actually happening. Until you name the shapes, you cannot tell which company you are looking at, which competitors it really has, or whether its moat is real.

This post is the map. The names. The technical floor that makes one of the three a category instead of a feature. And the line I keep coming back to that pulls it together:

The agent is the user.

That is the whole thing. Once you accept that some products are built for a person to sit in front of, some are built for a person to run their team and their agents from, and some are built for an agent to read, the rest of the category falls into place.

Three shapes hiding under one phrase

When a vendor today says "company brain," they almost always mean one of three completely different things.

Shape one is the agent orchestration platform. Agents are the new employees, and your company needs a place to run them: departments, managers, shared context, human-in-the-loop approvals, MCP and custom skills as extension points. The data layer is incidental. The product is the orchestration. Cofounder pitches "run an entire company with agents," with agentic departments and a human in the loop for anything consequential. HQ, from Indigo AI, is the same shape narrowed onto engineering teams: a shared context layer on top of Claude Code or Codex that syncs skills, workers, and secrets across the team. Letta belongs here now too. The company that coined "LLM OS" and "stateful agents" has pointed that focus at the next generation of agent systems: Letta Code, the number one model-agnostic open-source coding harness on Terminal-Bench. Memory and persistence are the substrate. The product is the harness.

This shape has nothing to do with the memory or knowledge-graph debate. The center of gravity is the agent loop and the shared procedure library that feeds it. Whether there is a real entity graph underneath is not the question. The question is whether the team has a place to standardize the agent's skills, manage its credentials, watch what it does, and approve what matters.

Shape two is the AI-native app, specialized or everything. A unified end-user product with AI woven into every surface. Sometimes one category (Linear for issues, Attio for CRM, Superhuman for email, Granola for meetings), sometimes all of them at once. Micro is the loudest current example of the everything-app version: email, CRM, calendar, tasks, and docs, with an agent tying them together. Even where there is a real graph under the hood, it is still shape two in form, because the graph is private to the app, there is no public MCP server, and the user has to leave their existing tools to use it.

Shape three is the actual company brain. The only one of the three where the product is the data layer. No primary UI, or the UI is a dashboard for operators rather than the product itself. A typed entity model. Multi-source live connectors with permission and temporal awareness. Multimodal indexing that actually runs, not just storage. MCP as the integration path. The agent is the user.

Three shapes. One phrase. The rest of this post is what each shape really is, why two of them keep losing their most successful versions to shape three, and what makes shape three a defensible category instead of a marketing label.

The everything-app trap

The loudest collisions between the three shapes happen in shape two, so start there. The dynamics inside the AI-native app category are the hardest in the stack, and they are badly underappreciated by anyone who has not lived through trying to displace an incumbent.

The everything-app pitch is some variant of "one place for everything you do." Email, CRM, meetings, tasks, docs, chat, all in one product, AI woven through every surface. The intuition is that AI gives the everything-app a fresh shot at categories it lost in the SaaS era: a unified context, an agent that can act across tools, a single seat instead of seven. The intuition is wrong in the way most everything-app intuitions are wrong, and the failure mode is structural, not executional.

The user is not being asked to add a tool. They are being asked to leave their incumbent in every category the everything-app touches. It is not replacing one app. It is asking someone to leave Superhuman and Attio and Granola and Linear and Notion at the same time, and migrate years of accumulated state across all of them. That is not one switching cost. That is five switching costs stacked onto a single purchase decision. The everything-app graveyard is large for a reason that has nothing to do with product quality.

The stickiness in the incumbents is not habit. It is network. Slack is sticky because every external partner is already in channels with you. Linear is sticky because the engineering team's entire triage ritual lives there. Notion is sticky because every doc anyone outside the company has linked to points into it. Attio is sticky because years of pipeline state and contact history have accreted into truth. You can love an everything-app's email-plus-CRM combo and still not be able to leave Linear, because Linear is not a tool you use alone. The everything-app has to beat each incumbent on its own axis and absorb the network effect that incumbent built. Almost nothing pulls that off.

The interesting move is not from the companies trying to win the everything-app fight today. It is from the ones that already won it years ago, in a pre-AI world, and are now spending their AI budgets reaching outside their own walls.

ClickUp is the clearest example. Its AI strategy is not about being more all-in-one. Brain MAX, its desktop app, pulls context from tools ClickUp does not own. Its enterprise-search acquisition reaches across systems it does not own. The headline on its AI product page is "One AI to Replace Them All," but what it is actually selling is no longer "come live in ClickUp instead of those tools." It is "keep using those tools, and let our AI see across all of them." That is not an everything-app move. It is a context-layer move, dressed in everything-app clothing, because the everything-app alone is not enough anymore.

Notion is doing the same thing from the opposite direction. Their recent Developer Platform launch includes a feature called Workers that syncs any data source (Salesforce, Zendesk, Postgres, anything with an API) into Notion databases. The pitch: now your agents can read it, your team can see it, and everyone is working from the same shared, trusted context. Read it again. The answer to fragmented context is to bring everyone's data into Notion databases. The blocks-and-databases model did not change. The world is still expected to fit into Notion's shape, just with more import pipes attached. Anything that does not fit (a meeting recording, a video, a five-person call transcript, a thread that crosses six tools) has to be flattened into a row to participate. That is a real step forward for Notion users. It is not a company brain. It is Notion, with import pipes.

Atlassian's Rovo is the same story for Jira and Confluence. Salesforce's Agentforce is the same story for the CRM. Every category-defining app of the SaaS era is now reaching for a cross-tool context layer, because the AI tax exposed something they all spent fifteen years pretending was not true: the work crosses tools. The context has to too.

Which is shape three.

The signal is loud. The most successful everything-app on the planet is spending its AI budget building a cross-system context layer that reaches into tools it does not own. The most successful wiki on the planet is doing the same. The players who already won their wedge are admitting that the next layer of competition will not be fought inside any single app. It will be fought over which product owns the agent-readable view across all of them.

What is underneath: the agent is the user

Here is the sentence everything pivots on, and it is worth saying plainly.

A file is for a person. A block is for a person. A page is for a person. When an agent reads any of them, it is reading something shaped for someone else's job.

Shape one's job is to run the agent. Shape two's job is to be the surface a person works on. Shape three's job is to be the thing the agent reads. Those three jobs need three different primitives, and the category keeps confusing itself by pretending the primitives are interchangeable. They are not, and they get less interchangeable as the models get better, not more.

The gap is most visible in the everyday claims that do not survive contact with a real agent.

"Multimodal" mostly means stored, not indexed. Notion stores your video. Notion does not watch your video. The products that actually index non-text content are a short list: some PKM tools do voice and meeting transcription, a few do image OCR, and the headless backends built around multimodal indexing from day one do all of it. An agent that has to act on what was said in a customer call cannot be served by a system that filed the recording and never listened to it.

"Memory layer for agents" is now claimed by at least seven vendors with radically different primitives: record-of-fact (Mem0), entity graph (Cognee, Graphlit), context block (Letta), corpus (LlamaCloud, Vectara), markdown brain (GBrain). The phrase has lost its discriminating power. The meaningful question is which primitive the agent actually sees when it connects.

"Backend for agents" is the same problem one level up. Box (file system), Glean (portal), Notion (wiki), and every headless player all claim it. The phrase is fine. The question is what shape the agent finds underneath.

A file is the wrong primitive because the file is the human's primitive, not the agent's. A block is the wrong primitive because the block is shaped around how a person browses a page. A page is the wrong primitive because the page is shaped around what a person reads in one sitting. None of these carry the structure an agent needs to act with provenance, scoped permissions, incremental updates, and multi-step reasoning across sources. LLMs can wrangle anything you throw into context. But wrangling is not reasoning, and the substrate handed to the wrangler determines how much reasoning is even possible.

This is the reframe the public debate keeps missing. The category is not bifurcating into AI on top of files versus AI on top of pages. It is bifurcating into AI on top of pages, and agents on top of graphs. The second one is not a better version of the first. It is a different product, with a different consumer, and the gap widens with every model generation.

What makes shape three a real category

If shape three is just "vector DB plus glue," the rest of the stack will eat it as a feature. That is the honest read on most of what is marketed as a company brain today, and it is why the term has lost so much meaning. A category survives commoditization only if its underlying shape is hard to replicate by accident. So let me say plainly what that shape is.

A real context graph for agents (and that is the name I will use from here, because it is what the thing actually is) carries four primitives. Any backend missing one of them loses to a competitor that has all four, or to a UI vendor that grew the missing piece as a feature.

First, a typed entity and relationship model. Not chunks in a vector DB. Something an agent can reason about by name: this is a Person, this is a Company, this is a Deal, this Person works at that Company, this Company has had three commitments slip in the last quarter. The agent does not have to re-derive the schema every time it asks a question, because the schema is already there.
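Here is a minimal sketch of what "reason by name" could look like, in Python. Every name in it (Person, Company, Commitment, ContextGraph, slipped_since) is hypothetical and chosen for illustration; it is not any vendor's schema or API. The only point is that the types and relationships exist before the agent asks its question.

```python
from dataclasses import dataclass, field
from datetime import date


# Hypothetical entity types; a real schema would carry far more fields.
@dataclass(frozen=True)
class Person:
    id: str
    name: str


@dataclass(frozen=True)
class Company:
    id: str
    name: str


@dataclass(frozen=True)
class Commitment:
    id: str
    company_id: str  # relationship: Commitment -> Company
    due: date
    slipped: bool


@dataclass
class ContextGraph:
    people: dict[str, Person] = field(default_factory=dict)
    companies: dict[str, Company] = field(default_factory=dict)
    works_at: dict[str, str] = field(default_factory=dict)  # person id -> company id
    commitments: list[Commitment] = field(default_factory=list)

    def employees(self, company_id: str) -> list[Person]:
        """Follow the works_at relationship backwards from a company."""
        return [self.people[pid] for pid, cid in self.works_at.items() if cid == company_id]

    def slipped_since(self, company_id: str, since: date) -> list[Commitment]:
        """Commitments for a company that slipped and were due on or after `since`."""
        return [
            c for c in self.commitments
            if c.company_id == company_id and c.slipped and c.due >= since
        ]


# Illustrative data only.
graph = ContextGraph()
graph.people["p1"] = Person("p1", "Ada")
graph.companies["c1"] = Company("c1", "Acme")
graph.works_at["p1"] = "c1"
graph.commitments += [
    Commitment("k1", "c1", date(2026, 1, 10), slipped=True),
    Commitment("k2", "c1", date(2026, 2, 3), slipped=True),
    Commitment("k3", "c1", date(2026, 3, 20), slipped=True),
    Commitment("k4", "c1", date(2026, 3, 25), slipped=False),
]

slipped = graph.slipped_since("c1", since=date(2026, 1, 1))
print(len(slipped), [p.name for p in graph.employees("c1")])
```

The question "which commitments slipped at this Company last quarter" becomes one typed call, not a similarity search over chunks followed by the model guessing at structure.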

Second, multi-source live connectors with permission and temporal awareness. Slack, Gmail, Drive, Notion, GitHub, calendars, podcasts, support tools, ticketing systems. Kept current. Scoped to who is allowed to see what. Aware that facts have validity periods and that the world changes. The work crosses tools, the context has to too, and the connectors have to carry the security model inside them.
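A rough sketch of "the connectors carry the security model inside them," assuming each synced item keeps a copy of its source ACL and a last-synced timestamp. SyncedItem and readable_by are hypothetical names, and a plain freshness window stands in for real temporal awareness; treat it as an illustration of the idea, not an implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SyncedItem:
    source: str                 # e.g. "slack", "gmail"
    text: str
    allowed: frozenset[str]     # principals copied from the source system's ACL
    synced_at: datetime         # when the connector last confirmed this item


def readable_by(items, principal, now, max_age):
    """Return items the principal may see that were synced within max_age."""
    return [
        item for item in items
        if principal in item.allowed and now - item.synced_at <= max_age
    ]


# Illustrative data only.
now = datetime(2026, 3, 1, 12, 0, tzinfo=timezone.utc)
items = [
    SyncedItem("slack", "Pricing thread", frozenset({"alice", "bob"}), now - timedelta(hours=1)),
    SyncedItem("gmail", "Board update", frozenset({"alice"}), now - timedelta(hours=2)),
    SyncedItem("drive", "Old roadmap", frozenset({"alice", "bob"}), now - timedelta(days=30)),
]

for_bob = readable_by(items, "bob", now, max_age=timedelta(days=7))
print([item.text for item in for_bob])
```

The filter runs before anything reaches the model's context, so an agent acting for one person never sees what that person could not open in the source tool.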

Third, multimodal indexing that actually runs. Audio diarization, OCR, image embeddings, video understanding. Not blob storage with a mimeType field. If the backend does not know what was said in the meeting recording, it does not know what the deal looks like, and the agent on top of it is reasoning over a partial picture without knowing it.

Fourth, MCP as the integration path. Persistent state, BYO-LLM, and a credible MCP server are what make a context graph addressable by every agent in the stack. The agent is the user. Build for the agent, and let the model and the surface come and go.

Hit all four and you have a category. Hit one or two and you are a feature waiting to be absorbed.

The four primitives are the surface. Underneath, the shape that makes shape three a real product is four data layers, and they are load-bearing in a way most of shape two structurally cannot replicate.

Content is the immutable evidence layer. Every email, message, meeting recording, document, ticket, commit, page, post, and file. Ingested with its original structure preserved, not flattened into text for embedding. Content does not change. Its provenance is permanent. When an agent cites a claim, it can point back to the exact source.

Entities are the identity layer. The same person showing up in Slack, email, a CRM record, a calendar invite, and a meeting transcript is one entity, resolved across sources, aligned with open standards. Same for companies, products, places, projects, events. The entity layer is the spine that makes everything else queryable. Without it, the agent gets five copies of the same person and does not know which one is real.
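To make "five copies of the same person" concrete, here is a deliberately naive sketch that groups mentions from different sources under one key. Mention and resolve are hypothetical names, and matching on a normalized email address is a simplifying assumption; real resolution has to handle missing emails, aliases, and name collisions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    source: str
    display_name: str
    email: str


def resolve(mentions):
    """Group mentions into entities keyed by normalized email (a simplifying assumption)."""
    entities = defaultdict(list)
    for m in mentions:
        entities[m.email.strip().lower()].append(m)
    return dict(entities)


# Illustrative data only: the same person as four systems see her, plus one other person.
mentions = [
    Mention("slack", "jsmith", "J.Smith@acme.com"),
    Mention("gmail", "Jane Smith", "j.smith@acme.com "),
    Mention("crm", "Smith, Jane", "j.smith@acme.com"),
    Mention("calendar", "Raj Patel", "raj@acme.com"),
]

entities = resolve(mentions)
for key, group in entities.items():
    print(key, sorted(m.source for m in group))
```

Three display names collapse into one entity, and everything attached to any of them becomes queryable through that one identity.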

Facts are the assertions about the world, with validity periods, sources, and status. This deal is at $240k ARR. This person changed roles in February. This commitment was made on a call last Tuesday. Facts are first-class data. They carry provenance back to the content. They carry time. They can be superseded. They can be wrong, and the system has to know they can be wrong. An agent that does not carry time in its substrate cannot tell the difference between current truth and stale truth.
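A minimal sketch of a fact that carries time and provenance, assuming valid-time only and facts arriving in chronological order. Fact, FactStore, assert_fact, and as_of are hypothetical names for illustration, and the source ids are made up; a production system would also track when the system learned each fact.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    source_id: str                   # provenance: which content item asserted this
    valid_from: date
    valid_to: Optional[date] = None  # None means not yet superseded


class FactStore:
    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def assert_fact(self, new: Fact) -> None:
        # Assumes facts arrive in chronological order; closes any open fact it replaces.
        for old in self._facts:
            if (old.subject, old.predicate) == (new.subject, new.predicate) and old.valid_to is None:
                old.valid_to = new.valid_from
        self._facts.append(new)

    def as_of(self, subject: str, predicate: str, when: date) -> Optional[Fact]:
        """Return the fact that was valid on `when`, or None if nothing was known."""
        for f in self._facts:
            if (f.subject, f.predicate) != (subject, predicate):
                continue
            if f.valid_from <= when and (f.valid_to is None or when < f.valid_to):
                return f
        return None


# Illustrative data only.
store = FactStore()
store.assert_fact(Fact("deal:acme", "arr", "$200k", "email:123", date(2026, 1, 5)))
store.assert_fact(Fact("deal:acme", "arr", "$240k", "call:456", date(2026, 3, 2)))

stale = store.as_of("deal:acme", "arr", date(2026, 2, 1))
current = store.as_of("deal:acme", "arr", date(2026, 4, 1))
print(stale.value, current.value, current.source_id)
```

The older figure is not deleted. It is closed off, so the agent can answer both "what is the deal worth now" and "what did we believe in February," and cite the source for each.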

Conversations are the connective tissue. Both human chats and prior agent runs, where an agent already answered a question, made a recommendation, drafted a brief, or took an action on someone's behalf. Without conversations, the graph tells you what is true. With them, it tells you how the organization actually came to think so, and what has already been done about it. This is the layer almost everyone forgets, and it is the one that makes a company brain feel like it actually knows the company.

These four layers (content, entities, facts, conversations) bind into a single graph continuously derived from ingested reality, not curated by hand into forms. That graph is the context graph. Indexed across full-text, vector, graph relationships, time, and geospatial axes at once. Source-agnostic. Schema-stable. Built from day one for the job an agent actually needs it to do. None of that gets replicated by bolting a context feature onto a shape-two product, because that product's data model is already shaped around a different primary user. You cannot bolt a typed temporal graph onto a wiki any more than you can bolt a wiki onto a file system. The shape decides the ceiling.

This is the technical floor that turns shape three from a marketing label into a category. And it answers the obvious question hanging over the whole post: what stops Notion or ClickUp or anyone else from cloning this in eighteen months? A typed entity graph with bitemporal facts, multimodal indexing, and conversation history is not a feature you ship in a quarter. It is an architectural commitment you make on day one, and most of the products in shapes one and two made the opposite commitment years ago. Notion built blocks and databases. ClickUp built a flexible platform of primitives for projects, docs, and tasks. Box built files. They can grow connectors. They can grow an MCP frontage. They can ingest harder. The shape underneath stays what it is.

Where Graphlit, Zine, and Dossium sit in this

I have been building Graphlit for five years, against a thesis that looked early for a long time and is finally arriving as the category around it clarifies. "Context graph" is a recently minted phrase, but it is the most accurate name yet for what Graphlit has always been. It started as a content-centric graph: every document, message, recording, and file ingested with its structure intact. Then it became a conversation-centric graph, capturing how people and agents actually talk and work. The entities and facts are what bind those two together: the people, companies, and commitments that thread through all of it, resolved across every source. Content, conversations, and the entities and facts connecting them. That is the context graph, and it is the shape Graphlit was built around before there was a word for it.

Graphlit is shape three, on purpose, from day one: the same Observable-typed entity graph this whole post has been describing. Person, Organization, Product, Event, Place, plus customer-defined types. 30+ live connectors across Slack, Gmail, Drive, Notion, GitHub, calendars, podcasts, RSS, Discord, Linear, Jira, Zendesk, and Intercom. First-class facts with validity periods, sources, and supersession. MCP as the integration surface. And the pieces a markdown-only brain cannot match, which are the actual moat:

Full multimodal indexing. Deepgram audio diarization, OCR, image embeddings, video understanding. The backend knows what is in the meeting recording or the screen-share, not just that a file exists.
Conversation history as a first-class layer. Both human chats and agent runs: not just retrieved facts, but the discussions and tool traces that produced them. The history of how the company came to know what it knows.
Vault, with bidirectional Git sync against a markdown projection of the graph. Markdown becomes the lingua franca on top, while the underlying graph stays typed, temporal, and multimodal. The familiar surface, against a substrate that does not constrain what the agent can do underneath it.

Zine is the UX layer wrapping a Graphlit backend for an individual user: markdown-first reading and writing, like the shape-two products, but the page is a view over a typed graph rather than the source of truth. The Notion- or Obsidian-shaped seat on top of a real backend. It is shape two in form and shape three in substance.

Dossium is the B2B version for teams. Same Graphlit backend underneath, with a multi-tenant identity model, role-based access, audit, and a native agentic layer for collaborative work. It is the parallel to Notion Agents, against a substrate that was not designed around blocks and databases.

The three of them together are the answer to "who is the agent built for?" when the answer has more than one consumer. An individual reading and writing in Zine. A team collaborating in Dossium. Any other agent connecting through MCP to Graphlit directly. Same substrate. Three surfaces. The agent stays the user underneath all of them.

The dynamics nobody has mapped yet

The three shapes are not competing for the same dollar, and they are not aging at the same speed. This is the part of the picture the analyst grids have not caught up to.

Shape one (orchestration) is a winner-takes-most race inside specific agent surfaces. Letta Code is winning the model-agnostic coding harness. HQ is winning shared skills for Claude Code and Codex teams. Cofounder is going for the broader business-orchestration crown. The winners will each own a slice of agent runtime the way Slack owns chat and Linear owns issues. The market is real. The moats are skill libraries, approval workflows, and operator trust. The AI tax gets paid as platform fees rather than per-seat.

Shape two (the app) is where the most revenue is, and where most companies are going to fail. The everything-app graveyard keeps growing, and the few survivors won their wedge in a pre-AI world. Repeating that in 2026, against incumbents who now have AI as table stakes, is a much harder bet, which is why most of the new wave of AI-native everything-apps are going to hit a wall they cannot get over. The specialty apps (Linear, Notion, Attio, Superhuman, Slack, Granola) stay sticky because of network effects the everything-app cannot absorb in one purchase. The defensible AI move from this shape is the one the survivors are already making: stop trying to absorb the other tools, start ingesting from them, which is the same as building a piece of shape three from the inside.

Shape three (the context graph) is the youngest market and the one with the cleanest TAM logic. Every team eventually has more than one source of truth, every agent eventually needs to act across more than one of them, and that is a substrate-shaped problem. The everything-app vendors are quietly cloning pieces of it. The orchestration vendors are reaching for it as their MCP source of truth. The category is still defining itself, which is exactly when category-defining bets get made. The risk in shape three is commoditization: markdown-first variants are easy to copy. The defense is depth: multimodal indexing, live multi-source ingestion, conversation history including agent runs, temporal facts with provenance, identity and permissions, and a markdown surface that does not constrain the underlying graph. Anybody can ship a Postgres-plus-pgvector-plus-entity-extraction prototype in a weekend. Almost nobody can ship the full surface.

The sequencing matters. Shape two is where the current revenue is. Shape one is where the current attention is. Shape three is where the lock-in compounds, because once the agent is connected to your context graph, switching it is harder than switching any UI on top.

And the three shapes are not enemies. They are complements. The orchestrator needs the context layer to act on something. The context layer needs the orchestrator to be actuated. The app stays the comfortable surface a person sits in front of. Teams treating any two of the three as substitutes are going to ship orchestration without context, or context without orchestration, or an app without either, and each of those is half a product. The vendors that win the next five years are the ones that pick a shape, commit to its primitives, and integrate cleanly with the other two.

How to pick

The "substrate does not matter" position I started this post arguing against is correct, partially correct, and wrong. In that order, depending on which shape you mean.

If the consumer is you, in a UI, alone or with a small team: you are shopping in shape two, and the substrate honestly does not matter much. Notion or Obsidian, depending on whether you want native multiplayer or local-first portability. Both are excellent. If you want the same UX but want the page to be a view over a real graph rather than the source of truth, that is where Zine sits. This is the case where "substrate does not matter" is genuinely right.

If the consumer is your company, in a UI, with lots of internal data sources and a team: still shape two in form, but the substrate question reappears. Glean is the obvious enterprise choice on the search-and-portal end. Notion can stretch into this with Workers and connectors if you commit to fitting your world into Notion's shape. ClickUp can stretch into this if you live in projects and tasks. Dossium is where I would point you when the team wants a collaborative surface and a native agentic layer against a graph that was not built around blocks and databases.

If the consumer is an agent you are building, or an agent someone else is building on top of your data: shape three. You want a headless backend with all four primitives: typed entity model, multi-source connectors, multimodal indexing, MCP as the integration path. Graphlit, with Vault for Git-synced markdown, is the answer when you want full multimodal indexing, conversation history as first-class data, and the most live connectors in the category. Cognee, Supermemory, and Mem0 are credible narrower options. Letta is the right answer if the specific problem is stateful agents rather than retrieval.

If the consumer is the team itself, running its agents through one place: shape one. Cofounder for the broad business-orchestration case. HQ if the team is already deep in Claude Code or Codex. Letta Code if the work is engineering and the model has to stay portable.

The line that ties it together

LLMs are great at wrangling unstructured data. Agents need structured access to it. That difference is the entire reason the category is splitting. The wrangling premium falls as the models get better. The structured-access premium grows, because no amount of model improvement gives the agent identity, time, multimodal grounding, or permission scoping unless those are already in the substrate.

A file is for a person. A block is for a person. A page is for a person. Notice that all three are the same kind of thing: a document, something a human opens and reads one at a time. The agent's primitive is not a document at all. It is the context graph: typed, so the agent can reason by name; temporal, so it can tell current truth from stale; multimodal, so it knows what was said in the room and not just that a file exists; permission-aware and source-grounded, so every claim is scoped and traceable; with conversation history attached, so it carries not just what is true but how the company came to believe it. You can call it a company brain or an agent backend or a headless memory layer if you want. Underneath every one of those labels is a graph, and the shape of that graph matters more than the name on it.

The shape decides the ceiling. The shape decides whether your product is a category, a feature, or a marketing line. The shape decides whether the next agent generation makes you more valuable or less. The shape is what the public debate keeps missing, because the debate is still arguing about which UI a person should sit in front of.

The agent is the user. Build for the agent. The rest of the category falls into place from there.