
Technical Infrastructure

Vector Database

A system that stores embeddings for similarity search—useful for retrieval, but insufficient for time-aware structured memory with entities and relationships.

A vector database is a specialized storage system optimized for storing, indexing, and querying high-dimensional vector embeddings. It enables fast semantic similarity search: given a query embedding, find the most similar vectors (and their associated content) based on distance metrics like cosine similarity or Euclidean distance.
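Similarity search reduces to a distance computation over vectors. As a minimal sketch (with made-up three-dimensional embeddings standing in for real model output), cosine similarity can be computed and used to rank stored vectors against a query:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = [0.1, 0.9, 0.0]
docs = {
    "doc_a": [0.1, 0.8, 0.1],  # close to the query in vector space
    "doc_b": [0.9, 0.0, 0.1],  # far from the query
}
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked[0])  # doc_a ranks first
```

Real embeddings have hundreds or thousands of dimensions, and production databases avoid this brute-force scan with approximate indexes, but the ranking principle is the same.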

Vector databases power retrieval-augmented generation (RAG), semantic search, and recommendation systems. However, they don't provide structured memory—they lack entity linking, relationship graphs, temporal awareness, and reasoning capabilities. For agent memory, vector databases are one component (retrieval), not a complete solution.

The key takeaway is knowing when vector search is sufficient (semantic similarity tasks) and when you need additional layers (knowledge graphs, timelines, entity resolution) for robust agent memory.

Why it matters

  • Enables semantic similarity search: Find documents, messages, or data "similar in meaning" to a query—without exact keyword matches.
  • Powers RAG systems: Retrieve relevant context based on query embeddings, then pass to LLMs for grounded generation.
  • Scales to large datasets: Vector databases index millions of embeddings for fast approximate nearest neighbor (ANN) search.
  • Supports multimodal retrieval: Store embeddings for text, images, audio—enable cross-modal search (e.g., "find images similar to this text").
  • Clarifies limitations: Vector similarity is not structured memory—no entity linking, no relationships, no temporal awareness.
  • Informs hybrid architectures: Combine vector databases (semantic search) with knowledge graphs (structured reasoning) for robust agent memory.

How it works

Vector databases operate through embedding storage and similarity search:

  • Ingestion → Content (documents, messages, images) is converted into vector embeddings using models like OpenAI text-embedding-3, Cohere, or custom encoders.
  • Indexing → Embeddings are indexed using algorithms like HNSW (Hierarchical Navigable Small World) or IVF (Inverted File Index) for fast ANN search.
  • Query Embedding → When a user queries "show me documents about agent memory," the query is converted into a vector.
  • Similarity Search → The database finds the k-nearest neighbor embeddings to the query vector (e.g., top 10 most similar).
  • Result Retrieval → Associated content (document text, metadata) for the top matches is returned to the application.

This pipeline enables semantic retrieval but doesn't track entities, relationships, or time-based context.
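The pipeline above can be sketched end to end. This is a toy in-memory store, not a real vector database: the `embed` function is a stand-in bag-of-words counter over a tiny fixed vocabulary (a real system would call a learned model like OpenAI text-embedding-3), and search is a brute-force scan rather than an ANN index like HNSW:

```python
import math
from collections import Counter

VOCAB = ["agent", "memory", "vector", "refund", "policy", "login"]

def embed(text):
    # Toy embedding: term counts over a fixed vocabulary.
    # Real systems use learned embedding models instead.
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

class VectorStore:
    def __init__(self):
        self.items = []  # (doc_id, embedding, content) triples

    def ingest(self, doc_id, text):
        # Ingestion: convert content to a vector and store both together.
        self.items.append((doc_id, embed(text), text))

    def search(self, query, k=2):
        # Query embedding + brute-force k-NN by cosine similarity.
        qv = embed(query)
        scored = sorted(self.items, key=lambda it: cosine(qv, it[1]), reverse=True)
        return [(doc_id, text) for doc_id, _, text in scored[:k]]

store = VectorStore()
store.ingest("d1", "agent memory and vector search")
store.ingest("d2", "refund policy for customers")
store.ingest("d3", "login and password help")

print(store.search("documents about agent memory", k=1))
```

Note what the store does not track: there is no notion of which entity a document concerns, how documents relate, or when facts were true, which is exactly the gap the surrounding sections describe.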

Comparison & confusion to avoid

| Term | What it is | What it isn't | When to use |
| --- | --- | --- | --- |
| Vector Database | Storage for embeddings with fast similarity search | Structured memory with entities, relationships, and timelines | Semantic search and RAG—not agent memory or reasoning |
| Knowledge Graph | Structured graph of entities and relationships | Similarity-based retrieval—graphs enable traversal and reasoning | Relationship queries and multi-hop reasoning |
| Agent Memory | Durable, structured, time-aware context for agents | A retrieval system—memory includes entities, timelines, and state | Multi-session continuity, learning, ownership tracking |
| Relational Database | Tables with foreign keys and joins | Optimized for vector similarity—RDBs don't do semantic search | Transactional workloads with fixed schemas |

Examples & uses

RAG for document Q&A
1,000 company documents are chunked and embedded. User asks: "What's our refund policy?" The query is embedded, and the vector database retrieves the 5 most similar chunks. An LLM generates an answer based on retrieved context.
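The generation half of this flow is mostly prompt assembly. A hedged sketch, with hypothetical chunk text and the LLM call itself left out, showing how retrieved chunks get grounded into a prompt:

```python
def build_rag_prompt(question, retrieved_chunks):
    # Number each retrieved chunk so the LLM (and the answer) can cite it.
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(retrieved_chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical chunks returned by the vector database for the query.
chunks = [
    "Refunds are issued within 30 days of purchase.",
    "Refund requests require the original receipt.",
]
prompt = build_rag_prompt("What's our refund policy?", chunks)
print(prompt)
```

In a real pipeline the `chunks` list comes from the top-k similarity search, and `prompt` is sent to an LLM for the final grounded answer.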

Semantic search for support tickets
Thousands of support tickets are embedded. An agent searches for "authentication errors" (no exact keyword needed). The vector database returns semantically similar tickets like "login failure," "SSO not working," "password reset issues."

Recommendation system
User interactions are embedded (e.g., articles read, products viewed). When a user visits, the system queries for similar embeddings to recommend related content or products.

Limitation: No entity or temporal reasoning
User asks: "What did Alice work on last week?" Vector search finds documents mentioning Alice, but can't track ownership changes, state transitions, or time-based context—knowledge graphs and temporal memory are required.

Best practices

  • Chunk content appropriately: Embedding a 100-page document as one vector loses granularity—chunk into sections or paragraphs for precise retrieval.
  • Use hybrid search: Combine vector similarity with keyword filters or metadata (e.g., "semantically similar AND from last month").
  • Re-embed when content changes: If a document is updated, re-generate its embedding—stale embeddings degrade retrieval quality.
  • Monitor embedding drift: Model updates (e.g., new OpenAI embedding version) change vector spaces—re-embed corpus or handle version mismatches.
  • Combine with knowledge graphs: Vector search finds relevant content; knowledge graphs provide structured reasoning—use both for agent memory.
  • Implement metadata filtering: Store timestamps, authors, tags alongside embeddings—enables filtered retrieval (e.g., "similar to this AND authored by Alice").
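The hybrid-search and metadata-filtering practices above can be sketched together: pre-filter candidates on stored metadata, then rank the survivors by vector similarity. The records, two-dimensional embeddings, and field names here are all made up for illustration:

```python
import math
from datetime import date

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

# Hypothetical (embedding, metadata) records stored side by side.
records = [
    ([0.9, 0.1], {"id": "r1", "author": "Alice", "date": date(2024, 10, 5)}),
    ([0.8, 0.2], {"id": "r2", "author": "Bob",   "date": date(2024, 10, 6)}),
    ([0.1, 0.9], {"id": "r3", "author": "Alice", "date": date(2024, 9, 1)}),
]

def filtered_search(query_vec, predicate, k=5):
    # Metadata filter first, similarity ranking second.
    candidates = [(emb, meta) for emb, meta in records if predicate(meta)]
    candidates.sort(key=lambda r: cosine(query_vec, r[0]), reverse=True)
    return [meta["id"] for _, meta in candidates[:k]]

# "Semantically similar AND authored by Alice"
hits = filtered_search([1.0, 0.0], lambda m: m["author"] == "Alice")
print(hits)  # r2 is excluded by the filter; r1 outranks r3 on similarity
```

Production vector databases expose this as a filter clause on the search call rather than a Python predicate, but the pre-filter-then-rank shape is the same.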

Common pitfalls

  • Assuming vector DB equals agent memory: Vector databases provide retrieval, not structured memory—agents need entity linking, timelines, and reasoning layers.
  • No entity resolution: If "Alice Johnson" and "Alice J." are embedded separately, they're treated as different—knowledge graphs handle canonicalization.
  • Ignoring temporal context: Vector search doesn't understand "what changed last week" or "who owned this in October"—temporal memory is required.
  • Over-relying on embeddings: Semantic similarity is imperfect—hybrid approaches (vector + keyword + graph) improve accuracy.
  • Not handling embedding updates: When models improve or content changes, embeddings need refresh—plan for versioning and re-indexing.

See also


See how Graphlit combines vector search with structured memory → Agent Memory Platform

Ready to build with Graphlit?

Start building agent memory and knowledge graph applications with the Graphlit Platform.

Vector Database | Graphlit Agent Memory Glossary