Azure AI Document Intelligence (formerly Form Recognizer) is Microsoft's enterprise document processing service. It's battle-tested, SOC 2 compliant, and deeply integrated with the Azure ecosystem. It's also Graphlit's default extraction backend.
When you ingest documents into Graphlit, Azure AI Document Intelligence handles the extraction by default — giving you Microsoft's enterprise-grade OCR and layout analysis, combined with Graphlit's semantic infrastructure for everything that comes after.
This page explains what each platform provides, how they work together, and when you might want to use Azure directly vs. through Graphlit.
Table of Contents
- TL;DR — Quick Comparison
- What Azure AI Document Intelligence Does
- What Graphlit Adds
- The Default Integration
- When to Use Azure Directly
- When to Use Graphlit with Azure
- Pricing Comparison
- Integration Example
TL;DR — Quick Comparison
What Azure AI Document Intelligence Does
The service excels in five areas:
Enterprise-Grade OCR
- 275+ languages supported
- Handles low-quality scans and photos
- Confidence scores for extracted text
- Bounding box coordinates for every element
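As a rough illustration of how confidence scores and bounding boxes get used downstream, here is a minimal sketch that flags low-confidence words for review and turns each polygon into a rectangle for a UI overlay. The `Word` class, the 0.8 threshold, and the sample values are all hypothetical; the only thing borrowed from Azure is the idea of a flat list of corner coordinates per element.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical stand-in for one OCR word (not the Azure SDK type)."""
    content: str
    confidence: float
    polygon: list[float]  # flat [x1, y1, x2, y2, ...] corner coordinates


def bounding_rect(polygon: list[float]) -> tuple[float, float, float, float]:
    """Return (left, top, width, height) of the axis-aligned box around a polygon."""
    xs = polygon[0::2]
    ys = polygon[1::2]
    left, top = min(xs), min(ys)
    return (left, top, max(xs) - left, max(ys) - top)


def low_confidence(words: list[Word], threshold: float = 0.8) -> list[Word]:
    """Select words whose OCR confidence falls below the review threshold."""
    return [w for w in words if w.confidence < threshold]


# Sample data is invented for illustration
words = [
    Word("Invoice", 0.99, [1.0, 1.0, 2.5, 1.0, 2.5, 1.4, 1.0, 1.4]),
    Word("T0tal", 0.62, [1.0, 2.0, 2.0, 2.0, 2.0, 2.5, 1.0, 2.5]),
]

flagged = low_confidence(words)
for w in flagged:
    print(w.content, bounding_rect(w.polygon))
```

The threshold is a judgment call: a stricter value sends more words to human review, a looser one trusts the OCR more.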
Layout Analysis
The Layout model understands document structure:
- Headers and titles
- Paragraphs and sections
- Tables (with cell merging)
- Figures and captions
- Page numbers and headers/footers
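To make the table part of layout analysis concrete, the sketch below rebuilds a Markdown table from a list of cells addressed by row and column index. The `Cell` class and the sample rows are hypothetical, and merged cells are ignored (a spanned cell simply leaves its neighbors empty), so treat this as a simplified picture of the idea and not the conversion either product performs.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical table cell addressed by zero-based row and column."""
    row_index: int
    column_index: int
    content: str


def table_to_markdown(cells: list[Cell]) -> str:
    """Lay cells out on a grid and render the first row as the header."""
    rows = max(c.row_index for c in cells) + 1
    cols = max(c.column_index for c in cells) + 1
    grid = [["" for _ in range(cols)] for _ in range(rows)]
    for c in cells:
        # Escape pipes so cell text cannot break the table syntax
        grid[c.row_index][c.column_index] = c.content.replace("|", "\\|")

    lines = ["| " + " | ".join(grid[0]) + " |"]
    lines.append("| " + " | ".join("---" for _ in range(cols)) + " |")
    for row in grid[1:]:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


# Sample cells are invented for illustration
cells = [
    Cell(0, 0, "Item"),
    Cell(0, 1, "Amount"),
    Cell(1, 0, "Consulting"),
    Cell(1, 1, "$1,200"),
]
markdown = table_to_markdown(cells)
print(markdown)
```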
Pre-built Models
Specialized models for common document types:
- Invoices: Vendor, line items, totals, due dates
- Receipts: Merchant, items, subtotal, tax, tip
- ID Documents: Name, DOB, address, ID number
- Tax Forms: W-2, 1040, 1099 data extraction
- Contracts: Parties, terms, dates, clauses
- Health Insurance Cards: Member info, coverage details
Custom Models
Train models on your specific document types:
- Template-based (fixed layouts)
- Neural (variable layouts)
- Composed models (multiple document types)
Compliance
Enterprise-ready security:
- SOC 2 Type 2
- HIPAA BAA available
- ISO 27001, 27017, 27018
- Azure Government regions
What Graphlit Adds
Graphlit takes Azure AI's extraction output and builds a complete semantic infrastructure on top of it:
Automatic Embedding
Every document is embedded for vector search immediately — no separate pipeline.
Semantic Entity Extraction
Beyond Azure's key-value pairs, Graphlit identifies Schema.org entities:
Document: "Partnership Agreement.pdf"
  → Person: Alice Chen (CEO, Acme Inc)
  → Person: Bob Smith (CFO, Widget Co)
  → Organization: Acme Inc
  → Organization: Widget Co
  → Event: Signing Date (March 15, 2024)
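If you have not worked with Schema.org types before, the entities above could be written as JSON-LD along these lines. This is only a sketch of the vocabulary (`Person`, `Organization`, `Event`, `jobTitle`, `worksFor`, `startDate` are real Schema.org terms); the `person` helper is hypothetical and the shape is not meant to mirror what Graphlit actually stores or returns.

```python
import json


def person(name: str, job_title: str, organization: str) -> dict:
    """Build a Schema.org Person node that points at its employer."""
    return {
        "@type": "Person",
        "name": name,
        "jobTitle": job_title,
        "worksFor": {"@type": "Organization", "name": organization},
    }


# Entities taken from the Partnership Agreement example
graph = {
    "@context": "https://schema.org",
    "@graph": [
        person("Alice Chen", "CEO", "Acme Inc"),
        person("Bob Smith", "CFO", "Widget Co"),
        {"@type": "Event", "name": "Signing Date", "startDate": "2024-03-15"},
    ],
}

print(json.dumps(graph, indent=2))
```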
Knowledge Graphs
Entities are connected across documents. Alice appears in meeting notes, contracts, and emails — all linked in a navigable graph.
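The linking idea can be pictured with two small indexes: one from entity to documents and one from document to entities. The sketch below uses invented document names and plain dictionaries; a real knowledge graph also has to resolve name variants and typed relationships, which this deliberately skips.

```python
from collections import defaultdict

# (document, entity) mentions; document names are hypothetical
mentions = [
    ("Partnership Agreement.pdf", "Alice Chen"),
    ("Partnership Agreement.pdf", "Bob Smith"),
    ("Q1 meeting notes", "Alice Chen"),
    ("Email: contract follow-up", "Alice Chen"),
]

entity_to_docs: dict[str, set[str]] = defaultdict(set)
doc_to_entities: dict[str, set[str]] = defaultdict(set)
for doc, entity in mentions:
    entity_to_docs[entity].add(doc)
    doc_to_entities[doc].add(entity)


def related_documents(doc: str) -> set[str]:
    """Documents that share at least one entity with the given document."""
    related: set[str] = set()
    for entity in doc_to_entities.get(doc, set()):
        related |= entity_to_docs[entity]
    related.discard(doc)
    return related


print(sorted(entity_to_docs["Alice Chen"]))
print(sorted(related_documents("Q1 meeting notes")))
```

Starting from the meeting notes, one hop through Alice reaches both the contract and the email, which is the navigation the graph is meant to enable.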
Hybrid Search
Search with:
- Vector semantic similarity
- Keyword matching
- Graph-aware context expansion
- Entity and metadata filters
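One common way to combine a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below shows that technique on invented result lists; it is an illustration of hybrid ranking in general, and the text here does not say which fusion method Graphlit uses internally.

```python
def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Score each id by the sum of 1 / (k + rank) over every ranking it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists, best match first
vector_hits = ["contract", "meeting-notes", "email"]
keyword_hits = ["email", "contract", "invoice"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Documents that rank well in both lists rise to the top, while a document found by only one method still appears further down. The constant `k = 60` is a conventional default, not a tuned value.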
RAG Conversations
Built-in conversational AI:
- Streaming responses
- Source citations
- Conversation branching
- Multi-turn context
30+ Data Connectors
Beyond document upload:
- Communication: Slack, Discord, Teams, Email
- Development: GitHub, Linear, Jira
- Cloud Storage: Google Drive, Dropbox, SharePoint, OneDrive
- Media: RSS feeds, podcasts, YouTube
The Default Integration
Azure AI Document Intelligence is Graphlit's default extraction backend. When you ingest a document without specifying a workflow, Azure AI handles extraction automatically.
import { Graphlit } from 'graphlit-client';

const client = new Graphlit();

// Default ingestion uses Azure AI Document Intelligence
const result = await client.ingestUri(
  "https://example.com/contract.pdf",
  "Partnership Agreement"
);

// Document is:
// - Extracted by Azure AI (OCR, layout, tables)
// - Converted to Markdown
// - Embedded for vector search
// - Entity-enriched
// - Knowledge graph connected
// - Search indexed
This gives you enterprise-grade extraction with zero configuration.
When to Use Azure Directly
Use Azure AI Document Intelligence directly when:
- You need custom models: Training on specific document types (invoices, forms, IDs)
- You need pre-built models: Invoice extraction, receipt parsing, ID verification
- You need raw coordinates: Bounding boxes for UI overlays or validation
- You're all-in on Azure: Deep Azure ecosystem integration (Logic Apps, Power Automate)
- You need Azure Government: FedRAMP compliance requirements
- You only need extraction: No search, conversations, or knowledge graphs
Azure AI is particularly strong for structured document automation — processing thousands of invoices or forms with consistent schemas.
When to Use Graphlit with Azure
Use Graphlit (which uses Azure AI by default) when:
- You need the full stack: Extraction through conversation, all managed
- You want automatic embeddings: No separate vector pipeline
- You need semantic entities: People, organizations, events identified
- You need knowledge graphs: Relationships across documents
- You have multiple data sources: Not just documents — Slack, email, GitHub
- You're building AI applications: Search, RAG chatbots, knowledge assistants
- You want zero infrastructure: No databases, no pipelines, no maintenance
Graphlit gives you Azure AI's extraction quality plus complete semantic infrastructure.
Pricing Comparison
Azure AI Document Intelligence Pricing
Commitment tiers available for volume discounts.
Graphlit Pricing
Graphlit credits include Azure AI extraction plus all downstream processing — embedding, entity extraction, search indexing, and conversations.
Cost Comparison
For a knowledge base with 1,000 documents/month:
Azure AI alone:
- Layout extraction: ~$10/month
- Plus: Vector database, embedding API, entity extraction, search infrastructure, engineering time
Graphlit with Azure AI:
- Starter plan: $49/month (includes everything)
- No additional infrastructure costs
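Azure bills extraction per page, so the extraction line above depends heavily on document length. The sketch below works through that arithmetic. The rate of $10 per 1,000 pages is an assumption chosen to match the ~$10 figure at one page per document, and the page counts are invented; check current Azure pricing before relying on any of these numbers.

```python
ASSUMED_USD_PER_1000_PAGES = 10.0  # assumption, not a quoted Azure price
STARTER_PLAN_USD = 49.0            # Graphlit Starter plan figure from the text


def layout_cost(
    documents: int, pages_per_document: float, usd_per_1000_pages: float
) -> float:
    """Monthly extraction cost in USD for page-based billing."""
    return documents * pages_per_document * usd_per_1000_pages / 1000


for pages in (1, 5, 20):
    cost = layout_cost(1000, pages, ASSUMED_USD_PER_1000_PAGES)
    print(f"{pages:>2} pages/doc: extraction alone is about ${cost:,.2f}/month")

print(f"Starter plan for comparison: ${STARTER_PLAN_USD:,.2f}/month")
```

The extraction figure covers OCR and layout only; the vector database, embedding API, and engineering time listed above are extra on the Azure-only path.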
Integration Example
Azure AI Direct: Extraction Only
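import os

from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest
from azure.core.credentials import AzureKeyCredential

# Endpoint and key come from your Azure resource (variable names are placeholders)
endpoint = os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"]
key = os.environ["DOCUMENTINTELLIGENCE_API_KEY"]
document_url = "https://example.com/contract.pdf"

client = DocumentIntelligenceClient(endpoint, AzureKeyCredential(key))

# Analyze the document with the Layout model; a URL source is wrapped in a request object
poller = client.begin_analyze_document(
    "prebuilt-layout",
    AnalyzeDocumentRequest(url_source=document_url),
)
result = poller.result()

# Now you need to:
# 1. Convert result to your format
# 2. Send to embedding API
# 3. Store in vector database
# 4. Build entity extraction
# 5. Create search infrastructure
# 6. Implement RAG conversations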
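To give a sense of what the first two steps involve, here is a minimal sketch of splitting extracted text into overlapping chunks before sending them to an embedding API. It assumes the extracted text is already available as one string; the `chunk_text` helper, the chunk size, and the overlap are hypothetical choices, and production pipelines usually split on paragraph or section boundaries instead of raw character counts.

```python
def chunk_text(text: str, max_chars: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size windows that overlap so context carries across chunks."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be non-negative and smaller than max_chars")

    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last window already reached the end of the text
        start += max_chars - overlap
    return chunks


# Placeholder text standing in for the extracted document content
extracted_text = "a" * 500
chunks = chunk_text(extracted_text, max_chars=200, overlap=40)
print([len(c) for c in chunks])
```

Each chunk would then be embedded and stored alongside a pointer back to its source document, which is the part of the pipeline you own when using extraction alone.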
Graphlit with Azure AI: Complete Infrastructure
import { Graphlit, Types } from 'graphlit-client';

const client = new Graphlit();

// Ingest with Azure AI extraction (default) + automatic processing
const result = await client.ingestUri(
  "https://example.com/contract.pdf",
  "Partnership Agreement"
);

// Everything automatic:
// - Azure AI extracts with enterprise-grade OCR
// - Content converted to Markdown
// - Embedded for vector search
// - Entities extracted (people, orgs, dates)
// - Knowledge graph updated
// - Search index ready

// Immediately queryable
const contents = await client.queryContents({
  types: [Types.ContentTypes.File],
  search: "partnership terms"
});

// RAG conversation ready
// (conversationId and specificationId are assumed to reference a conversation
// and an LLM specification created earlier in your project)
const response = await client.promptConversation(
  "What are the key terms of this partnership?",
  conversationId,
  { id: specificationId }
);
Summary
Azure AI Document Intelligence is Microsoft's enterprise document processing powerhouse — battle-tested OCR, layout analysis, and pre-built models with enterprise compliance.
Graphlit uses Azure AI as its default extraction backend, adding semantic infrastructure — embeddings, entity extraction, knowledge graphs, search, and conversations.
Together, you get Microsoft's enterprise extraction quality powering Graphlit's AI infrastructure. Azure AI is already integrated — just start ingesting documents and the combination works automatically.
Explore Graphlit Features:
- Document Processing — Extraction backend options
- Building Knowledge Graphs — Automatic entity extraction
- Complete Guide to Search — Hybrid semantic search
- Data Connectors — 30+ source integrations
Learn More:
- Graphlit Documentation
- Azure AI Document Intelligence Documentation
- PDF Extraction Comparison
Enterprise extraction is the foundation. Semantic infrastructure is what makes it intelligent.