Memory Index
A memory index is the lookup layer that enables fast retrieval of relevant context from agent memory without reprocessing entire histories. It organizes memory by entities, relationships, time ranges, topics, and metadata, allowing agents to query "What tasks did Alice own last week?" or "Show me decisions related to Project X" in milliseconds.
Memory indexes combine multiple indexing strategies: semantic (vector embeddings), structural (entity and relationship graphs), temporal (time-based ranges), and keyword (full-text search). This multi-dimensional indexing ensures agents can access the right memory quickly, regardless of query type.
The outcome is performant agent memory that scales to millions of facts, events, and relationships without degrading retrieval speed.
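To make those dimensions concrete, here is a minimal sketch of a single memory record carrying one field per index layer. The schema and field names are illustrative assumptions, not a fixed or prescribed data model.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class MemoryRecord:
    """One unit of agent memory, with a field for each index layer to consume."""
    id: str
    entity_type: str                                  # entity index: "person", "task", "meeting", ...
    text: str                                         # keyword index: full-text searchable content
    embedding: list[float]                            # semantic index: vector for ANN similarity search
    relationships: dict[str, str] = field(default_factory=dict)   # relationship index, e.g. {"owned_by": "person:alice"}
    attributes: dict[str, str] = field(default_factory=dict)      # filterable metadata, e.g. {"status": "blocked"}
    timestamp: datetime = field(default_factory=datetime.utcnow)  # temporal index: when the fact or event occurred
```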
Why it matters
- Enables fast retrieval: Without indexes, querying memory means scanning all data—indexes reduce lookup time from seconds/minutes to milliseconds.
- Supports diverse query types: Semantic ("similar to X"), structural ("owned by Alice"), temporal ("last week"), and keyword ("contains authentication") queries all work.
- Scales to production workloads: As memory grows to millions of entities and events, indexes ensure retrieval performance remains constant.
- Reduces latency for agents: Agents need context quickly to respond in real time—slow memory access breaks the user experience.
- Optimizes resource usage: Efficient indexes reduce compute and memory overhead compared to brute-force scans.
- Facilitates hybrid retrieval: Combine semantic similarity, relationship traversal, and temporal filtering in one query—indexes make this feasible.
How it works
Memory indexes operate through multi-layered indexing and query routing:
- Semantic Index → Vector embeddings are indexed using ANN algorithms (HNSW, IVF) for fast similarity search.
- Entity Index → Entities (people, projects, tasks) are indexed by type, attributes, and identifiers for lookup queries.
- Relationship Index → Graph edges are indexed for traversal: "Find all tasks owned by people at Acme."
- Temporal Index → Events and state changes are indexed by timestamp, enabling time-range queries: "Show changes from Nov 1-7."
- Keyword Index → Full-text search indexes content for keyword queries: "Find memory containing 'authentication error.'"
- Query Routing → When a query arrives, a router decides which of these indexes to consult and how to combine their results.
- Result Assembly → Matching memory is retrieved from relevant indexes, deduplicated, ranked, and returned.
This layered architecture keeps every query type fast: each query touches only the indexes it needs, and results are assembled once at the end, as sketched below.
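A minimal, self-contained sketch of query routing and result assembly, using plain Python dictionaries as stand-ins for the index layers. The class, method, and field names are illustrative; a production system would back each layer with a dedicated engine (an ANN index for vectors, a graph store for edges, a full-text engine for keywords).

```python
from collections import defaultdict


class MemoryIndex:
    """Toy multi-layer index: every layer is a plain dict, standing in for a
    dedicated engine (ANN index, graph store, full-text engine, ...)."""

    def __init__(self):
        self.records = {}                          # id -> record
        self.by_entity = defaultdict(set)          # entity type -> ids
        self.by_relationship = defaultdict(set)    # (relation, target) -> ids
        self.by_keyword = defaultdict(set)         # token -> ids (inverted index)
        self.by_time = []                          # (timestamp, id), kept sorted

    def add(self, rec: dict):
        """Incrementally update every affected layer when memory changes."""
        rid = rec["id"]
        self.records[rid] = rec
        self.by_entity[rec["entity"]].add(rid)
        for relation, target in rec.get("relationships", {}).items():
            self.by_relationship[(relation, target)].add(rid)
        for token in rec.get("text", "").lower().split():
            self.by_keyword[token].add(rid)
        self.by_time.append((rec["timestamp"], rid))
        self.by_time.sort()

    def query(self, entity=None, relationship=None, keyword=None, time_range=None):
        """Query routing: consult only the layers the query needs.
        Result assembly: intersect candidate sets, deduplicate, rank newest first."""
        candidates = []
        if entity is not None:
            candidates.append(self.by_entity[entity])
        if relationship is not None:               # e.g. ("owned_by", "person:alice")
            candidates.append(self.by_relationship[relationship])
        if keyword is not None:
            candidates.append(self.by_keyword[keyword.lower()])
        if time_range is not None:
            start, end = time_range
            candidates.append({rid for ts, rid in self.by_time if start <= ts <= end})
        if not candidates:
            return []
        ids = set.intersection(*candidates)        # deduplicated by construction
        return sorted((self.records[r] for r in ids),
                      key=lambda r: r["timestamp"], reverse=True)
```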
Examples & uses
Multi-modal query: "Show me Alice's tasks blocked last week"
- Entity index: Resolve "Alice" to canonical entity ID.
- Relationship index: Find tasks with `owned_by: Alice` and `status: blocked`.
- Temporal index: Filter events to "last week."
- Result: 3 tasks that match all criteria, retrieved in <50ms.
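Expressed against the toy MemoryIndex sketched under "How it works" (it is not runnable without that class), the query above routes to three layers and intersects their candidate sets. Entity IDs and dates here are invented purely for illustration.

```python
from datetime import datetime

# Assumes the MemoryIndex sketch from "How it works" is in scope.
index = MemoryIndex()
index.add({
    "id": "task:142",
    "entity": "task",
    "text": "Fix login flow",
    "relationships": {"owned_by": "person:alice", "status": "blocked"},
    "timestamp": datetime(2024, 11, 5),
})

last_week = (datetime(2024, 11, 4), datetime(2024, 11, 10))
results = index.query(
    entity="task",                                 # entity index
    relationship=("owned_by", "person:alice"),     # relationship index
    time_range=last_week,                          # temporal index
)
# A second relationship lookup, ("status", "blocked"), would narrow this further;
# the final answer is the intersection of all candidate sets.
```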
Semantic + structural query: "Meetings about authentication involving engineers"
- Semantic index: Vector search for "authentication."
- Entity index: Filter to entities with `type: meeting`.
- Relationship index: Find meetings with `attendee` relationships where `attendee.role: engineer`.
- Result: 7 relevant meetings.
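One way this kind of hybrid query can be served is a pre-filter-then-rank pattern: structural indexes narrow the candidate set, then cosine similarity over stored embeddings orders it. In this sketch the embeddings are random placeholders standing in for vectors from a real embedding model, and all record contents are invented.

```python
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy memory: structural metadata plus an embedding per record.
rng = np.random.default_rng(0)
memory = [
    {"id": "meeting:1", "type": "meeting",  "attendee_roles": {"engineer"}, "embedding": rng.random(8)},
    {"id": "meeting:2", "type": "meeting",  "attendee_roles": {"sales"},    "embedding": rng.random(8)},
    {"id": "doc:9",     "type": "document", "attendee_roles": set(),        "embedding": rng.random(8)},
]
query_vec = rng.random(8)   # stand-in for embed("authentication")

# Structural pre-filter (entity + relationship indexes), then semantic ranking.
candidates = [m for m in memory
              if m["type"] == "meeting" and "engineer" in m["attendee_roles"]]
ranked = sorted(candidates, key=lambda m: cosine(query_vec, m["embedding"]), reverse=True)
print([m["id"] for m in ranked])    # -> ['meeting:1']
```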
Time-travel query: "What was Project X's status on Oct 15?"
- Temporal index: Query state snapshot at "Oct 15."
- Entity index: Fetch Project X's state as of that date.
- Result: "In Progress, 12 tasks, 2 blockers."
Best practices
- Index all relevant dimensions: Don't just index vectors—include entities, relationships, time, and keywords for comprehensive retrieval.
- Update indexes incrementally: When memory changes, update affected indexes immediately—batch updates cause staleness.
- Monitor index performance: Track query latency by type—if temporal queries slow down, optimize the temporal index.
- Implement query caching: Frequently asked queries (e.g., "today's status") can be cached to reduce index load.
- Balance index size vs. speed: More indexes improve query diversity but increase storage and update overhead—optimize for your use cases.
- Use approximate indexes where appropriate: Exact nearest-neighbor search scales linearly with memory size—ANN indexes (HNSW, IVF) trade a small amount of recall for large speed gains.
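As one concrete instance of the last point, here is a sketch using the hnswlib library (an open-source HNSW implementation); the vectors are random stand-ins for real memory embeddings, and M, ef_construction, and ef are the knobs that trade recall for speed.

```python
import hnswlib
import numpy as np

dim, n = 128, 50_000
embeddings = np.random.rand(n, dim).astype(np.float32)   # stand-ins for stored memory embeddings

index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=n, ef_construction=200, M=16)   # build-time quality/size knobs
index.add_items(embeddings, np.arange(n))
index.set_ef(64)                                               # query-time recall/speed knob

query = np.random.rand(1, dim).astype(np.float32)
labels, distances = index.knn_query(query, k=10)               # approximate top-10 neighbors
```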
Common pitfalls
- Only indexing vectors: Semantic similarity is one query type—relationship and temporal queries need dedicated indexes.
- Stale indexes: If indexes aren't updated when memory changes, retrieval returns outdated results—implement real-time updates.
- No query optimization: Complex queries that scan multiple indexes can be slow—analyze query patterns and optimize index selection.
- Over-indexing: Indexing every possible dimension is wasteful—focus on query patterns your agents actually use.
- Ignoring index maintenance: Indexes fragment and degrade over time—implement periodic reindexing and compaction.
See also
- Agent Memory — Persistent memory accessed via indexes
- Semantic Retrieval — Meaning-based queries using semantic indexes
- Context Engine — Intelligent assembly using indexed memory
- Knowledge Graph — Structured memory indexed for relationship queries
- Temporal Memory — Time-aware memory indexed by timestamps
See how Graphlit implements Memory Indexes for fast agent retrieval → Agent Memory Platform
Ready to build with Graphlit?
Start building agent memory and knowledge graph applications with the Graphlit Platform.