Here's the good news: you don't have to choose between Reducto and Graphlit. Reducto is one of the extraction backends that Graphlit integrates, so you can get Reducto's document parsing and Graphlit's semantic infrastructure in a single platform.
Reducto specializes in document parsing — turning PDFs, spreadsheets, and presentations into structured, chunked content optimized for LLM applications. It's particularly strong at table extraction, form parsing, and intelligent chunking.
Graphlit is the semantic infrastructure layer that handles everything after extraction — embedding, entity extraction, knowledge graphs, hybrid search, and conversational AI. And when you use Graphlit, you can choose Reducto as your extraction backend.
This page explains what each platform does, when to use Reducto directly, and when the combination through Graphlit gives you more power.
Table of Contents
- TL;DR — Quick Comparison
- What Reducto Does Best
- What Graphlit Adds
- Using Reducto Through Graphlit
- When to Use Reducto Directly
- When to Use Graphlit with Reducto
- Pricing Comparison
- Integration Example
TL;DR — Quick Comparison
What Reducto Does Best
Reducto is a document parsing powerhouse. If you need to extract structured data from complex documents, Reducto delivers:
State-of-the-Art Table Extraction
Reducto's table extraction is among the best available. It handles:
- Complex merged cells
- Multi-page tables
- Nested table structures
- Tables without clear borders
Intelligent Chunking
Unlike basic text splitters, Reducto's chunking is layout-aware:
- Respects document structure (headings, sections)
- Keeps tables intact
- Preserves semantic coherence
- Optimized for RAG retrieval
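To make "layout-aware" concrete, here is a minimal TypeScript sketch of the general idea. It is not Reducto's actual algorithm. It assumes the document has already been parsed into a flat list of blocks; the `Block` type, the `chunkByLayout` function, and the `maxChars` budget are hypothetical names used only for illustration.

```typescript
// Hypothetical block model for illustration: NOT Reducto's real output format.
type Block =
  | { kind: "heading"; text: string }
  | { kind: "paragraph"; text: string }
  | { kind: "table"; text: string };

interface Chunk {
  heading: string; // nearest heading above the chunk's blocks
  blocks: Block[];
}

// Start a new chunk at every heading, and also when adding a block would
// exceed the character budget. A table is a single block, so it is never
// split across chunks, even if it is larger than the budget on its own.
function chunkByLayout(blocks: Block[], maxChars: number): Chunk[] {
  const chunks: Chunk[] = [];
  let current: Chunk = { heading: "", blocks: [] };
  let size = 0;

  const flush = (): void => {
    if (current.blocks.length > 0) chunks.push(current);
  };

  for (const block of blocks) {
    if (block.kind === "heading") {
      flush();
      current = { heading: block.text, blocks: [] };
      size = 0;
      continue;
    }
    if (size + block.text.length > maxChars && current.blocks.length > 0) {
      flush();
      current = { heading: current.heading, blocks: [] };
      size = 0;
    }
    current.blocks.push(block);
    size += block.text.length;
  }
  flush();
  return chunks;
}

const sampleBlocks: Block[] = [
  { kind: "heading", text: "Revenue" },
  { kind: "paragraph", text: "Revenue grew in Q3." },
  { kind: "table", text: "| Region | Q3 |\n| EMEA | 10 |\n| APAC | 12 |" },
  { kind: "heading", text: "Outlook" },
  { kind: "paragraph", text: "Guidance is unchanged." },
];

const layoutChunks = chunkByLayout(sampleBlocks, 30);
```

With a 30-character budget, the sample yields three chunks: the Revenue paragraph, the Revenue table (kept whole even though it exceeds the budget), and the Outlook paragraph. A basic text splitter with the same budget would cut the table in the middle of a row.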
Structured Extraction
The /extract endpoint lets you define custom schemas and pull specific fields:
{
  "invoice_number": "INV-2024-001",
  "vendor_name": "Acme Corp",
  "line_items": [
    { "description": "Widget A", "quantity": 10, "price": 99.99 }
  ],
  "total_amount": 999.90
}
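A schema that could produce output of this shape might look like the sketch below. It is written as generic JSON Schema plus a small consistency check on the result. How the schema is actually submitted to `/extract` is not shown on this page, so treat the schema layout, the `Invoice` type, and the `totalsMatch` helper as assumptions for illustration.

```typescript
// Generic JSON Schema for the invoice fields in the example output.
// Assumption: the real /extract request format may differ from this.
const invoiceSchema = {
  type: "object",
  properties: {
    invoice_number: { type: "string" },
    vendor_name: { type: "string" },
    line_items: {
      type: "array",
      items: {
        type: "object",
        properties: {
          description: { type: "string" },
          quantity: { type: "number" },
          price: { type: "number" },
        },
        required: ["description", "quantity", "price"],
      },
    },
    total_amount: { type: "number" },
  },
  required: ["invoice_number", "vendor_name", "line_items", "total_amount"],
};

interface LineItem {
  description: string;
  quantity: number;
  price: number;
}

interface Invoice {
  invoice_number: string;
  vendor_name: string;
  line_items: LineItem[];
  total_amount: number;
}

// Sanity check for downstream automation: the line items should add up
// to the extracted total. Comparing in whole cents avoids float noise.
function totalsMatch(invoice: Invoice): boolean {
  const sum = invoice.line_items.reduce(
    (acc, item) => acc + item.quantity * item.price,
    0
  );
  return Math.round(sum * 100) === Math.round(invoice.total_amount * 100);
}

// The extracted example from this page.
const extracted: Invoice = {
  invoice_number: "INV-2024-001",
  vendor_name: "Acme Corp",
  line_items: [{ description: "Widget A", quantity: 10, price: 99.99 }],
  total_amount: 999.9,
};

const invoiceIsConsistent = totalsMatch(extracted);
```

A check like this is cheap to run on every extraction and catches the most common failure in invoice automation: a misread digit in a quantity, price, or total.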
Format Support
Reducto handles 30+ file types:
- PDFs (native and scanned)
- Office documents (DOCX, XLSX, PPTX)
- Images (PNG, JPEG, TIFF)
- And more
What Graphlit Adds
Graphlit takes Reducto's excellent extraction and builds the complete AI infrastructure around it:
Automatic Embedding
Every document is embedded for vector search immediately after extraction — no separate embedding pipeline needed.
Entity Extraction
People, organizations, places, events, and products are automatically identified and linked:
Document: "Q3 Report.pdf"
→ Person: Alice Chen (CFO)
→ Organization: Acme Inc.
→ Event: Earnings Call (Oct 15, 2024)
Knowledge Graphs
Entities aren't just extracted — they're connected. Alice's mentions across all documents are linked, creating a navigable knowledge graph.
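Conceptually, linking mentions means grouping them under one node per entity. The TypeScript sketch below shows that idea with exact-match normalization. It is a simplification and not Graphlit's implementation: real entity resolution has to handle aliases and ambiguity. The `Mention` type, the `linkEntities` function, and the second document name are hypothetical.

```typescript
// Hypothetical mention record for illustration only.
interface Mention {
  document: string;
  entityType: "Person" | "Organization" | "Event";
  name: string;
}

// Group mentions under a normalized key so that the same entity found in
// different documents ends up as a single node with links to each document.
function linkEntities(mentions: Mention[]): Record<string, string[]> {
  const graph: Record<string, string[]> = {};
  for (const mention of mentions) {
    const key = mention.entityType + ":" + mention.name.trim().toLowerCase();
    const documents = graph[key] || (graph[key] = []);
    if (documents.indexOf(mention.document) === -1) {
      documents.push(mention.document);
    }
  }
  return graph;
}

const mentions: Mention[] = [
  { document: "Q3 Report.pdf", entityType: "Person", name: "Alice Chen" },
  { document: "Q3 Report.pdf", entityType: "Organization", name: "Acme Inc." },
  // "Board Minutes.pdf" is a made-up second document for the example.
  { document: "Board Minutes.pdf", entityType: "Person", name: "alice chen " },
];

const entityGraph = linkEntities(mentions);
const aliceDocuments = entityGraph["Person:alice chen"];
```

Here both spellings of the name resolve to one `Person` node that points at two documents, which is the property that makes the graph navigable.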
Hybrid Search
Search across your entire knowledge base with:
- Vector semantic search
- Keyword/BM25 search
- Graph-aware context expansion
- Entity and metadata filters
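One common way to merge a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below illustrates that technique in TypeScript. This page does not say how Graphlit fuses results, so RRF is a stand-in here, and the function name, the document IDs, and the constant `k = 60` (a conventional default) are assumptions.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) to a
// document's score, with rank starting at 1. Documents that rank well in
// several lists rise to the top.
function reciprocalRankFusion(rankings: string[][], k: number = 60): string[] {
  const scores: Record<string, number> = {};
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores[id] = (scores[id] || 0) + 1 / (k + index + 1);
    });
  }
  return Object.keys(scores).sort(
    (a, b) => (scores[b] || 0) - (scores[a] || 0)
  );
}

// Made-up result lists, best match first.
const vectorHits = ["doc-a", "doc-b", "doc-c"];
const keywordHits = ["doc-b", "doc-d", "doc-a"];

const fused = reciprocalRankFusion([vectorHits, keywordHits]);
```

In this example `doc-b` comes out first because it ranks near the top of both lists, followed by `doc-a`, `doc-d`, and `doc-c`. RRF uses only ranks, so the vector and keyword scores never need to be put on a common scale.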
RAG Conversations
Built-in conversational AI with streaming responses, conversation branching, and automatic source citations.
30+ Data Connectors
Beyond document upload:
- Communication: Slack, Discord, Teams, Email
- Development: GitHub, Linear, Jira
- Cloud Storage: Google Drive, Dropbox, SharePoint
- Media: RSS feeds, podcasts, YouTube
Using Reducto Through Graphlit
When you configure a Graphlit workflow with Reducto as the extraction backend, you get:
- Reducto's extraction quality — tables, forms, layouts all handled correctly
- Automatic embedding — no separate pipeline
- Entity extraction — people, orgs, places identified
- Knowledge graph — relationships connected
- Search index — immediately queryable
- RAG ready — conversational AI available
import { Graphlit, Types } from 'graphlit-client';
const client = new Graphlit();
// Create a workflow that uses Reducto for extraction
const workflow = await client.createWorkflow({
  name: "Reducto Extraction Workflow",
  preparation: {
    jobs: [{
      connector: {
        type: Types.FilePreparationServiceTypes.Reducto,
        reducto: {
          // Reducto-specific configuration
        }
      }
    }]
  }
});
// Ingest using Reducto extraction + full Graphlit processing
const result = await client.ingestUri(
  "https://example.com/complex-report.pdf",
  "Q3 Financial Report",
  undefined,
  undefined,
  true,
  { id: workflow.createWorkflow?.id }
);
// Document is now:
// - Extracted by Reducto (tables, forms, layout preserved)
// - Embedded for vector search
// - Entity-enriched
// - Knowledge graph connected
// - Searchable and RAG-ready
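One detail in the example above: `workflow.createWorkflow?.id` is optional, so the reference passed to `ingestUri` can carry an undefined id if workflow creation did not return one. A small guard makes that failure explicit instead of silent. This is a sketch. `CreateWorkflowResponse` is a local stand-in for the response shape used above, and `requireWorkflowId` is a hypothetical helper, not part of `graphlit-client`.

```typescript
// Local stand-in for the response shape used in the example above.
interface CreateWorkflowResponse {
  createWorkflow?: { id: string } | null;
}

// Return the workflow id, or fail loudly before any ingestion happens.
function requireWorkflowId(response: CreateWorkflowResponse): string {
  const id = response.createWorkflow ? response.createWorkflow.id : undefined;
  if (!id) {
    throw new Error("Workflow creation returned no id; refusing to ingest without a workflow");
  }
  return id;
}

// Example with a made-up id.
const workflowId = requireWorkflowId({ createWorkflow: { id: "wf-123" } });
```

The returned value can then be passed as `{ id: workflowId }` in place of `{ id: workflow.createWorkflow?.id }`.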
When to Use Reducto Directly
Use Reducto's API directly when:
- You only need extraction: Your pipeline handles everything else
- You have existing infrastructure: Vector DB, embedding pipeline, search — all built
- You need the /extract endpoint: Structured JSON extraction with custom schemas for automation workflows
- You're building document automation: Invoice processing, form extraction, data entry automation
Reducto excels as a focused extraction tool. If extraction is your only need, use it directly.
When to Use Graphlit with Reducto
Use Graphlit (with Reducto as the backend) when:
- You need the full stack: Extraction through conversation, all managed
- You want automatic entity extraction: People, organizations, events identified
- You need knowledge graphs: Relationships and connections across documents
- You have multiple data sources: Not just PDFs — Slack, GitHub, email, feeds
- You're building AI applications: Semantic search, RAG chatbots, knowledge assistants
- You want zero infrastructure: No databases to manage, no pipelines to maintain
The combination gives you Reducto's extraction quality plus Graphlit's semantic infrastructure.
Pricing Comparison
Reducto Pricing
Credits roughly correspond to pages, though complex documents may use more.
Graphlit Pricing
Graphlit credits include extraction (via Reducto or other backends) plus all downstream processing — embedding, entity extraction, search indexing, and conversations.
Cost Comparison
For a RAG application with 1,000 documents/month:
Reducto alone:
- Extraction: ~$15/month
- Plus: Vector database, embedding API, entity extraction, search infrastructure, engineering time
Graphlit with Reducto:
- Starter plan: $49/month (includes everything)
- No additional infrastructure costs
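Using only the two figures above, the comparison reduces to a break-even question: how much the surrounding infrastructure costs per month when you run Reducto on its own. The sketch below does that arithmetic in TypeScript. The constants are copied from this example (the extraction figure is approximate), engineering time is not included, and the function names are illustrative.

```typescript
// Figures from the 1,000 documents/month example above.
const REDUCTO_EXTRACTION_USD = 15; // approximate
const GRAPHLIT_STARTER_USD = 49;

// Monthly infrastructure spend at which both options cost the same.
function breakEvenInfraCost(extractionUsd: number, bundledUsd: number): number {
  return Math.max(0, bundledUsd - extractionUsd);
}

// Compare total monthly cost for a given infrastructure spend
// (vector database, embedding API, search, and so on).
function cheaperOption(infraUsd: number): "reducto-direct" | "graphlit" | "tie" {
  const direct = REDUCTO_EXTRACTION_USD + infraUsd;
  if (direct < GRAPHLIT_STARTER_USD) return "reducto-direct";
  if (direct > GRAPHLIT_STARTER_USD) return "graphlit";
  return "tie";
}

const breakEven = breakEvenInfraCost(REDUCTO_EXTRACTION_USD, GRAPHLIT_STARTER_USD);
```

With these numbers the break-even point is $34/month: if the vector database, embedding calls, and search infrastructure together cost more than that, the bundled plan is cheaper before engineering time is counted.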
Integration Example
Reducto Direct: Extraction Only
import reducto
client = reducto.Reducto(api_key="...")
# Parse document
result = client.parse.run(
    document_url="https://example.com/report.pdf",
    options={"chunking_method": "semantic"}
)
# Now you need to:
# 1. Send chunks to embedding API
# 2. Store in vector database
# 3. Build entity extraction pipeline
# 4. Create search infrastructure
# 5. Implement RAG conversations
Graphlit with Reducto: Complete Infrastructure
import { Graphlit, Types } from 'graphlit-client';
const client = new Graphlit();
// Assumes reductoWorkflowId, conversationId, and specificationId were
// created earlier (for example, the workflow from the section above)
// Ingest with Reducto extraction + automatic processing
const result = await client.ingestUri(
  "https://example.com/report.pdf",
  "Financial Report",
  undefined,
  undefined,
  true,
  { id: reductoWorkflowId } // Workflow configured for Reducto
);
// Everything is automatic:
// - Reducto extracts with excellent table handling
// - Content is embedded
// - Entities are extracted
// - Knowledge graph is updated
// - Search index is ready
// Immediately queryable
const contents = await client.queryContents({
  types: [Types.ContentTypes.File],
  search: "quarterly revenue"
});
// RAG conversation ready
const response = await client.promptConversation(
  "What were the key financial metrics?",
  conversationId,
  { id: specificationId }
);
Summary
Reducto is excellent at what it does — document parsing with state-of-the-art table extraction and intelligent chunking.
Graphlit builds the semantic infrastructure around extraction — embedding, entities, knowledge graphs, search, and conversations.
Together, you get the best of both worlds: Reducto's extraction quality powering Graphlit's AI infrastructure. No need to choose — use Reducto through Graphlit and get everything.
Explore Graphlit Features:
- Document Processing — Configure extraction backends
- Building Knowledge Graphs — Automatic entity extraction
- Complete Guide to Search — Hybrid semantic search
- Workflows and Processing — Custom processing pipelines
Learn More:
- Graphlit Documentation
- Reducto Documentation
- PDF Extraction Comparison
Great extraction is the foundation. Semantic infrastructure is what you build on top.