Memory Index
A memory index is the lookup layer that enables fast retrieval of relevant context from agent memory without reprocessing entire histories. It organizes memory by entities, relationships, time ranges, topics, and metadata, allowing agents to query "What tasks did Alice own last week?" or "Show me decisions related to Project X" in milliseconds.
Memory indexes combine multiple indexing strategies: semantic (vector embeddings), structural (entity and relationship graphs), temporal (time-based ranges), and keyword (full-text search). This multi-dimensional indexing ensures agents can access the right memory quickly, regardless of query type.
The outcome is performant agent memory that scales to millions of facts, events, and relationships without degrading retrieval speed.
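To make those dimensions concrete, here is a minimal sketch of a single memory record carrying one field per index layer. The schema and field names are illustrative assumptions, not a fixed or prescribed data model.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class MemoryRecord:
    """One unit of agent memory, with a field for each index layer to consume."""
    id: str
    entity_type: str                                  # entity index: "person", "task", "meeting", ...
    text: str                                         # keyword index: full-text searchable content
    embedding: list[float]                            # semantic index: vector for ANN similarity search
    relationships: dict[str, str] = field(default_factory=dict)   # relationship index, e.g. {"owned_by": "person:alice"}
    attributes: dict[str, str] = field(default_factory=dict)      # filterable metadata, e.g. {"status": "blocked"}
    timestamp: datetime = field(default_factory=datetime.utcnow)  # temporal index: when the fact or event occurred
```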
Why it matters
- Enables fast retrieval: Without indexes, querying memory means scanning all data—indexes reduce lookup time from seconds/minutes to milliseconds.
- Supports diverse query types: Semantic ("similar to X"), structural ("owned by Alice"), temporal ("last week"), and keyword ("contains authentication") queries all work.
- Scales to production workloads: As memory grows to millions of entities and events, indexes ensure retrieval performance remains constant.
- Reduces latency for agents: Agents need context quickly to respond in real time—slow memory access breaks the user experience.
- Optimizes resource usage: Efficient indexes reduce compute and memory overhead compared to brute-force scans.
- Facilitates hybrid retrieval: Combine semantic similarity, relationship traversal, and temporal filtering in one query—indexes make this feasible.
How it works
Memory indexes operate through multi-layered indexing and query routing:
- Semantic Index → Vector embeddings are indexed using ANN algorithms (HNSW, IVF) for fast similarity search.
- Entity Index → Entities (people, projects, tasks) are indexed by type, attributes, and identifiers for lookup queries.
- Relationship Index → Graph edges are indexed for traversal: "Find all tasks owned by people at Acme."
- Temporal Index → Events and state changes are indexed by timestamp, enabling time-range queries: "Show changes from Nov 1-7."
- Keyword Index → Full-text search indexes content for keyword queries: "Find memory containing 'authentication error.'"
- Query Routing → When a query arrives, a router decides which of these indexes to consult and how to combine their results.
- Result Assembly → Matching memory is retrieved from relevant indexes, deduplicated, ranked, and returned.
This layered architecture keeps every query type fast: each query touches only the indexes it needs, and results are assembled once at the end, as sketched below.
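A minimal, self-contained sketch of query routing and result assembly, using plain Python dictionaries as stand-ins for the index layers. The class, method, and field names are illustrative; a production system would back each layer with a dedicated engine (an ANN index for vectors, a graph store for edges, a full-text engine for keywords).

```python
from collections import defaultdict


class MemoryIndex:
    """Toy multi-layer index: every layer is a plain dict, standing in for a
    dedicated engine (ANN index, graph store, full-text engine, ...)."""

    def __init__(self):
        self.records = {}                          # id -> record
        self.by_entity = defaultdict(set)          # entity type -> ids
        self.by_relationship = defaultdict(set)    # (relation, target) -> ids
        self.by_keyword = defaultdict(set)         # token -> ids (inverted index)
        self.by_time = []                          # (timestamp, id), kept sorted

    def add(self, rec: dict):
        """Incrementally update every affected layer when memory changes."""
        rid = rec["id"]
        self.records[rid] = rec
        self.by_entity[rec["entity"]].add(rid)
        for relation, target in rec.get("relationships", {}).items():
            self.by_relationship[(relation, target)].add(rid)
        for token in rec.get("text", "").lower().split():
            self.by_keyword[token].add(rid)
        self.by_time.append((rec["timestamp"], rid))
        self.by_time.sort()

    def query(self, entity=None, relationship=None, keyword=None, time_range=None):
        """Query routing: consult only the layers the query needs.
        Result assembly: intersect candidate sets, deduplicate, rank newest first."""
        candidates = []
        if entity is not None:
            candidates.append(self.by_entity[entity])
        if relationship is not None:               # e.g. ("owned_by", "person:alice")
            candidates.append(self.by_relationship[relationship])
        if keyword is not None:
            candidates.append(self.by_keyword[keyword.lower()])
        if time_range is not None:
            start, end = time_range
            candidates.append({rid for ts, rid in self.by_time if start <= ts <= end})
        if not candidates:
            return []
        ids = set.intersection(*candidates)        # deduplicated by construction
        return sorted((self.records[r] for r in ids),
                      key=lambda r: r["timestamp"], reverse=True)
```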
Examples & uses
Multi-modal query: "Show me Alice's tasks blocked last week"
- Entity index: Resolve "Alice" to canonical entity ID.
- Relationship index: Find tasks with `owned_by: Alice` and `status: blocked`.
- Temporal index: Filter events to "last week."
- Result: 3 tasks that match all criteria, retrieved in <50ms.
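Expressed against the toy MemoryIndex sketched under "How it works" (it is not runnable without that class), the query above routes to three layers and intersects their candidate sets. Entity IDs and dates here are invented purely for illustration.

```python
from datetime import datetime

# Assumes the MemoryIndex sketch from "How it works" is in scope.
index = MemoryIndex()
index.add({
    "id": "task:142",
    "entity": "task",
    "text": "Fix login flow",
    "relationships": {"owned_by": "person:alice", "status": "blocked"},
    "timestamp": datetime(2024, 11, 5),
})

last_week = (datetime(2024, 11, 4), datetime(2024, 11, 10))
results = index.query(
    entity="task",                                 # entity index
    relationship=("owned_by", "person:alice"),     # relationship index
    time_range=last_week,                          # temporal index
)
# A second relationship lookup, ("status", "blocked"), would narrow this further;
# the final answer is the intersection of all candidate sets.
```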
Semantic + structural query: "Meetings about authentication involving engineers"
- Semantic index: Vector search for "authentication."
- Entity index: Filter to entities with `type: meeting`.
- Relationship index: Find meetings with `attendee` relationships where `attendee.role: engineer`.
- Result: 7 relevant meetings.
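One way this kind of hybrid query can be served is a pre-filter-then-rank pattern: structural indexes narrow the candidate set, then cosine similarity over stored embeddings orders it. In this sketch the embeddings are random placeholders standing in for vectors from a real embedding model, and all record contents are invented.

```python
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy memory: structural metadata plus an embedding per record.
rng = np.random.default_rng(0)
memory = [
    {"id": "meeting:1", "type": "meeting",  "attendee_roles": {"engineer"}, "embedding": rng.random(8)},
    {"id": "meeting:2", "type": "meeting",  "attendee_roles": {"sales"},    "embedding": rng.random(8)},
    {"id": "doc:9",     "type": "document", "attendee_roles": set(),        "embedding": rng.random(8)},
]
query_vec = rng.random(8)   # stand-in for embed("authentication")

# Structural pre-filter (entity + relationship indexes), then semantic ranking.
candidates = [m for m in memory
              if m["type"] == "meeting" and "engineer" in m["attendee_roles"]]
ranked = sorted(candidates, key=lambda m: cosine(query_vec, m["embedding"]), reverse=True)
print([m["id"] for m in ranked])    # -> ['meeting:1']
```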
Time-travel query: "What was Project X's status on Oct 15?"
- Temporal index: Query state snapshot at "Oct 15."
- Entity index: Fetch Project X's state as of that date.
- Result: "In Progress, 12 tasks, 2 blockers."
Best practices
- Index all relevant dimensions: Don't just index vectors—include entities, relationships, time, and keywords for comprehensive retrieval.
- Update indexes incrementally: When memory changes, update affected indexes immediately—batch updates cause staleness.
- Monitor index performance: Track query latency by type—if temporal queries slow down, optimize the temporal index.
- Implement query caching: Frequently asked queries (e.g., "today's status") can be cached to reduce index load.
- Balance index size vs. speed: More indexes improve query diversity but increase storage and update overhead—optimize for your use cases.
- Use approximate indexes where appropriate: Exact nearest-neighbor search scales linearly with memory size—ANN indexes (HNSW, IVF) trade a small amount of recall for large speed gains.
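As one concrete instance of the last point, here is a sketch using the hnswlib library (an open-source HNSW implementation); the vectors are random stand-ins for real memory embeddings, and M, ef_construction, and ef are the knobs that trade recall for speed.

```python
import hnswlib
import numpy as np

dim, n = 128, 50_000
embeddings = np.random.rand(n, dim).astype(np.float32)   # stand-ins for stored memory embeddings

index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=n, ef_construction=200, M=16)   # build-time quality/size knobs
index.add_items(embeddings, np.arange(n))
index.set_ef(64)                                               # query-time recall/speed knob

query = np.random.rand(1, dim).astype(np.float32)
labels, distances = index.knn_query(query, k=10)               # approximate top-10 neighbors
```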
Common pitfalls
- Only indexing vectors: Semantic similarity is one query type—relationship and temporal queries need dedicated indexes.
- Stale indexes: If indexes aren't updated when memory changes, retrieval returns outdated results—implement real-time updates.
- No query optimization: Complex queries that scan multiple indexes can be slow—analyze query patterns and optimize index selection.
- Over-indexing: Indexing every possible dimension is wasteful—focus on query patterns your agents actually use.
- Ignoring index maintenance: Indexes fragment and degrade over time—implement periodic reindexing and compaction.
See also
- Agent Memory — Persistent memory accessed via indexes
- Semantic Retrieval — Meaning-based queries using semantic indexes
- Context Engine — Intelligent assembly using indexed memory
- Knowledge Graph — Structured memory indexed for relationship queries
- Temporal Memory — Time-aware memory indexed by timestamps
See how Graphlit implements Memory Indexes for fast agent retrieval → Agent Memory Platform
Ready to build with Graphlit?
Start building agent memory and knowledge graph applications with the Graphlit Platform.