Comparison

Claude OCR vs. Graphlit: Raw LLM Calls vs. Document Infrastructure

Kirk Marple
December 5, 2025

Search "Claude OCR" and you'll find developers excited about using Anthropic's vision models to extract text from documents. And they're right — Claude Sonnet's visual understanding is impressive. It can read complex tables, understand layouts, and extract text that traditional OCR misses.

But there's a significant gap between "Claude can read this PDF" and "I have a production document processing system."

Claude OCR means calling Anthropic's API with images and hoping for good results. Graphlit is document infrastructure that uses Claude (among other extractors) as one component of a complete semantic platform.

This comparison explains what you get with raw Claude API calls, what you're missing, and why document infrastructure matters.


Table of Contents

  1. TL;DR — Quick Comparison
  2. What "Claude OCR" Actually Means
  3. The DIY Claude Pipeline
  4. What's Missing from Raw API Calls
  5. What Graphlit Provides
  6. Cost Comparison
  7. When to Use Claude Directly
  8. When You Need Infrastructure

TL;DR — Quick Comparison

| Capability | Claude API (DIY OCR) | Graphlit (with Claude backend) |
|---|---|---|
| What You Get | Raw API calls, you build everything else | Complete document infrastructure |
| Vision Quality | Excellent: Claude Sonnet is best-in-class | Same Claude quality, properly integrated |
| PDF Handling | DIY: Convert pages to images, manage tokens | Automatic: Upload PDF, get Markdown |
| Multi-page Documents | DIY: Split, process, reassemble | Automatic: Handled seamlessly |
| Rate Limiting | DIY: Implement backoff, queuing | Automatic: Managed for you |
| Error Handling | DIY: Retries, fallbacks, logging | Automatic: Built-in resilience |
| Token Management | DIY: Count tokens, manage context | Automatic: Optimized chunking |
| Output Format | Whatever Claude returns | Consistent Markdown with structure |
| Vector Embeddings | DIY: Separate embedding pipeline | Automatic on ingestion |
| Entity Extraction | DIY: Additional prompts or pipelines | Automatic Schema.org entities |
| Knowledge Graphs | DIY: Build from scratch | Per-user knowledge graphs |
| Semantic Search | DIY: Vector DB, indexing, query | Hybrid search included |
| RAG Conversations | DIY: Context assembly, streaming | Built-in with streaming |
| Cost | ~$3/1M input tokens + engineering time | Usage-based credits, infrastructure included |

What "Claude OCR" Actually Means

When people say "Claude OCR," they mean using Claude's vision capabilities to extract text from images or PDFs. Claude 3.5 Sonnet and Claude Sonnet 4 can:

  • Read text from images with high accuracy
  • Understand complex table structures
  • Interpret charts and diagrams
  • Handle handwritten content (to a degree)
  • Preserve document layout in output

The vision quality is genuinely excellent. For complex documents with tables and mixed layouts, Claude often outperforms traditional OCR and even specialized document AI services.

But Claude is an LLM, not document infrastructure.


The DIY Claude Pipeline

To build "Claude OCR" into a working system, you need to:

1. PDF Processing

# Convert PDF pages to images
# Handle different PDF types (native, scanned, mixed)
# Manage image resolution and quality
# Deal with corrupted or malformed PDFs
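
As one hedged illustration of the resolution step, the helper below picks a render DPI so that a page's long edge stays within a pixel budget. The 1568-pixel default is an assumption based on commonly cited guidance for Claude image inputs and should be checked against Anthropic's current vision documentation. `render_dpi` and its parameters are hypothetical names for this sketch, not part of any library.

```python
def render_dpi(width_pt: float, height_pt: float,
               max_edge_px: int = 1568, cap_dpi: int = 300) -> int:
    """Pick a DPI so the rendered page's long edge fits within max_edge_px.

    width_pt and height_pt are the page size in PDF points (1 point = 1/72 inch).
    max_edge_px is an assumed pixel budget; cap_dpi avoids oversampling tiny pages.
    """
    if width_pt <= 0 or height_pt <= 0:
        raise ValueError("page dimensions must be positive")
    long_edge_inches = max(width_pt, height_pt) / 72.0
    # Round down so the rendered edge never exceeds the budget.
    dpi = int(max_edge_px / long_edge_inches)
    return max(1, min(cap_dpi, dpi))


# US Letter is 612 x 792 points, A4 is roughly 595 x 842 points.
letter_dpi = render_dpi(612, 792)
a4_dpi = render_dpi(595, 842)
```

The resulting value could be passed as the `dpi` argument of pdf2image's `convert_from_path`, which otherwise renders at its own default resolution.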

2. Token Management

# Claude has context limits
# Large documents exceed single-call limits
# Split documents into chunks
# Reassemble results coherently
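
A minimal sketch of the splitting step, assuming you already have a per-page token estimate from some tokenizer or heuristic of your choice. It greedily groups consecutive pages into batches that stay under a token budget. `batch_pages` and the budget value are illustrative, not taken from any SDK.

```python
from typing import List


def batch_pages(token_counts: List[int], budget: int) -> List[List[int]]:
    """Group consecutive page indexes so each batch stays within `budget` tokens.

    A single page larger than the budget still gets its own batch, so the
    caller can decide whether to downscale or split that page further.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    batches: List[List[int]] = []
    current: List[int] = []
    current_total = 0
    for index, tokens in enumerate(token_counts):
        # Start a new batch when adding this page would exceed the budget.
        if current and current_total + tokens > budget:
            batches.append(current)
            current, current_total = [], 0
        current.append(index)
        current_total += tokens
    if current:
        batches.append(current)
    return batches


example = batch_pages([400, 700, 300, 900, 200], budget=1000)
```

Keeping pages consecutive within a batch makes reassembly simpler, at the cost of slightly less efficient packing than a bin-packing approach.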

3. API Integration

# Handle rate limits (429 errors)
# Implement exponential backoff
# Manage concurrent requests
# Queue large batches
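
The backoff item can be sketched generically as follows. This is an illustration under stated assumptions: `RetryableError` is a hypothetical stand-in for whatever your client raises on HTTP 429 or transient network failures, and `call_with_backoff` is not an Anthropic SDK function.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical stand-in for a rate-limit (HTTP 429) or transient error."""


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 5,
                      base_delay: float = 1.0, max_delay: float = 30.0,
                      jitter: bool = True,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RetryableError with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads retries out so workers do not retry in lockstep.
            sleep(random.uniform(0, cap) if jitter else cap)
    raise AssertionError("unreachable")
```

In a real pipeline the `except` clause would name the client's own exception type, for example `anthropic.RateLimitError` as used in the integration example later in this article. The `sleep` parameter is injectable so the logic can be tested without waiting.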

4. Error Handling

# Retry failed requests
# Handle partial failures
# Log errors for debugging
# Implement fallback strategies
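
One way to handle partial failures, sketched under the assumption that extraction runs page by page: record a result or an error for every page, so one bad page does not discard the rest and page order is never lost. `PageResult`, `extract_all`, and `failed_pages` are hypothetical names for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class PageResult:
    page: int                    # 1-based page number
    text: Optional[str] = None   # extracted text, if the call succeeded
    error: Optional[str] = None  # error summary, if the call failed


def extract_all(pages: Iterable[bytes],
                extract: Callable[[bytes], str]) -> List[PageResult]:
    """Run `extract` on every page, capturing failures instead of aborting."""
    results: List[PageResult] = []
    for number, page in enumerate(pages, start=1):
        try:
            results.append(PageResult(number, text=extract(page)))
        except Exception as exc:  # keep going; the failure is recorded per page
            results.append(PageResult(number, error=f"{type(exc).__name__}: {exc}"))
    return results


def failed_pages(results: List[PageResult]) -> List[int]:
    """Page numbers that need a retry or a fallback extractor."""
    return [r.page for r in results if r.error is not None]
```

The list returned by `failed_pages` can drive a retry pass or a fallback strategy, such as re-rendering the page at a different resolution.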

5. Output Normalization

# Claude's output varies
# Parse and normalize responses
# Handle unexpected formats
# Convert to consistent structure
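
A small normalization pass might look like the sketch below. It assumes the most common variations are a code fence wrapped around the whole response, mixed line endings, trailing whitespace, and runs of blank lines. Real responses may vary in other ways, and `normalize_markdown` is a hypothetical helper.

```python
import re

FENCE = "`" * 3  # built programmatically so this example contains no literal fence

# Matches a response that is entirely wrapped in one fenced block.
_WRAPPER = re.compile(
    r"\A\s*" + FENCE + r"[A-Za-z]*[ \t]*\n(.*?)\n?" + FENCE + r"\s*\Z",
    re.DOTALL,
)


def normalize_markdown(raw: str) -> str:
    """Return Markdown with a consistent shape regardless of response quirks."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    match = _WRAPPER.match(text)
    if match:
        text = match.group(1)  # drop the wrapping fence, keep the content
    text = "\n".join(line.rstrip() for line in text.split("\n"))
    text = re.sub(r"\n{3,}", "\n\n", text)  # collapse runs of blank lines
    return text.strip() + "\n"
```

One known limitation of this sketch: a response that legitimately begins and ends with two different fenced blocks would be treated as wrapped, so a production version would need a stricter check.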

6. Everything Else

# Store extracted text
# Generate embeddings
# Build search index
# Create entity extraction
# Implement RAG conversations
# ...

Handling the edge cases, scaling reliably, and maintaining all of this over time adds up to months of engineering work.


What's Missing from Raw API Calls

No Standardized Output

Claude returns what it returns. Different prompts, different documents, different results. You need to build parsing and normalization.

No Multi-Page Handling

PDFs can run to hundreds of pages. Claude has context limits. You need to split, process, and reassemble while maintaining document coherence across chunks.
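
To illustrate the reassembly half of that problem: when pages are processed concurrently, results can arrive out of order or not at all. The sketch below assumes results are collected in a dictionary keyed by 1-based page number. `reassemble` is a hypothetical helper, and joining pages with a blank line is a simplification that ignores sentences or tables spanning a page break.

```python
from typing import Dict, List, Tuple


def reassemble(page_texts: Dict[int, str], total_pages: int) -> Tuple[str, List[int]]:
    """Join page texts in page order and report which pages are missing."""
    ordered: List[str] = []
    missing: List[int] = []
    for number in range(1, total_pages + 1):
        text = page_texts.get(number)
        if text is None:
            missing.append(number)  # never extracted, or extraction failed
        else:
            ordered.append(text.strip())
    return "\n\n".join(ordered), missing


document, gaps = reassemble({3: "Page three", 1: "Page one "}, total_pages=3)
```

Returning the missing page numbers alongside the text lets the caller decide whether a partial document is acceptable or whether the gaps must be filled first.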

No Rate Limit Management

Hit Anthropic's rate limits and your pipeline stops. You need queuing, backoff, and retry logic.

No Error Recovery

API calls fail. Networks time out. Claude occasionally hallucinates. Production systems need graceful degradation.

No Downstream Processing

Extraction is step one. What about:

  • Embedding for vector search?
  • Entity extraction (people, companies, dates)?
  • Knowledge graph construction?
  • Search indexing?
  • RAG conversation assembly?

No Cost Optimization

Claude vision tokens are expensive. Without proper chunking and caching, costs explode at scale.
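
A minimal caching sketch, assuming identical page images sent with the same prompt should produce reusable output. It keys an in-memory store on a SHA-256 hash of the prompt and image bytes. `ExtractionCache` is a hypothetical class, and a production version would likely use persistent storage and include the model name in the key.

```python
import hashlib
from typing import Callable, Dict


class ExtractionCache:
    """Avoid paying twice for the same page image and prompt."""

    def __init__(self, extract: Callable[[bytes], str]) -> None:
        self._extract = extract            # the expensive call, e.g. a vision request
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, image_bytes: bytes, prompt: str) -> str:
        # The NUL separator keeps prompt and image bytes from running together.
        digest = hashlib.sha256(prompt.encode("utf-8") + b"\x00" + image_bytes)
        key = digest.hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        text = self._extract(image_bytes)
        self._store[key] = text
        return text
```

The hit and miss counters give a rough view of how much repeated work the cache is saving, which helps when deciding whether caching is worth the added storage.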

No Observability

When something goes wrong (and it will), how do you debug? Production systems need logging, metrics, and tracing.
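
As a starting point for that, here is a sketch using only the standard library: a context manager that times each pipeline stage, counts successes and failures, and logs the outcome with some context. `Metrics`, `traced`, and the `ocr_pipeline` logger name are illustrative choices, not a prescribed design.

```python
import logging
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import DefaultDict, Iterator

logger = logging.getLogger("ocr_pipeline")


class Metrics:
    """In-memory counters and timings; swap for a real metrics backend."""

    def __init__(self) -> None:
        self.counts: DefaultDict[str, int] = defaultdict(int)
        self.seconds: DefaultDict[str, float] = defaultdict(float)


@contextmanager
def traced(stage: str, metrics: Metrics, **context: object) -> Iterator[None]:
    """Time a stage, count its outcome, and log it with the given context."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        metrics.counts[f"{stage}.error"] += 1
        logger.exception("stage=%s failed context=%s", stage, context)
        raise  # tracing must not swallow the failure
    else:
        metrics.counts[f"{stage}.ok"] += 1
    finally:
        elapsed = time.perf_counter() - start
        metrics.seconds[stage] += elapsed
        logger.info("stage=%s elapsed=%.3fs context=%s", stage, elapsed, context)
```

A call site would wrap each step, for example `with traced("extract", metrics, page=3): ...`, so that a failed page shows up in both the logs and the error counter.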


What Graphlit Provides

Graphlit uses Claude as one of several extraction backends — but wraps it in production infrastructure:

Automatic PDF Processing

Upload a PDF. Get Markdown. We handle:

  • Page extraction and image conversion
  • Resolution optimization
  • Multi-page document assembly
  • Corrupted file handling

Token-Optimized Processing

We chunk documents intelligently:

  • Respect Claude's context limits
  • Maintain document coherence
  • Optimize for extraction quality
  • Minimize token costs

Production Reliability

Built-in infrastructure:

  • Rate limit management with queuing
  • Automatic retries with backoff
  • Error logging and recovery
  • Processing status tracking

Consistent Output

Every document produces:

  • Clean Markdown with structure
  • Preserved tables and formatting
  • Consistent quality regardless of input

Everything After Extraction

Automatic downstream processing:

  • Vector embeddings for semantic search
  • Entity extraction (Schema.org types)
  • Knowledge graph construction
  • Hybrid search indexing
  • RAG-ready conversations

Multiple Backend Options

Claude is one choice. You can also use:

  • Azure AI Document Intelligence — Fast, reliable default
  • Reducto — Specialized for structured documents
  • Deepseek — Cost-effective for high volume

Cost Comparison

DIY Claude OCR Costs

API costs (Claude Sonnet):

  • Input: ~$3 per million tokens
  • Output: ~$15 per million tokens
  • A 10-page PDF with images: ~50K tokens = ~$0.15-0.50
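
The per-document figure can be reproduced with a few lines of arithmetic. The sketch below uses the approximate prices quoted above. The 20,000 output tokens in the upper estimate is an illustrative assumption, not a measured value, and the function names are hypothetical.

```python
INPUT_USD_PER_MTOK = 3.0    # approximate input price quoted above
OUTPUT_USD_PER_MTOK = 15.0  # approximate output price quoted above


def document_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated API cost in USD for one document."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


def monthly_cost(documents: int, input_tokens: int, output_tokens: int) -> float:
    """Estimated monthly API cost in USD for a given document volume."""
    return documents * document_cost(input_tokens, output_tokens)


# Lower bound: input tokens only. Upper bound: assumed 20,000 output tokens.
low = document_cost(50_000, 0)
high = document_cost(50_000, 20_000)
print(f"per document: ${low:.2f} to ${high:.2f}")
print(f"500 documents: ${monthly_cost(500, 50_000, 0):.2f} "
      f"to ${monthly_cost(500, 50_000, 20_000):.2f}")
```

Because output tokens cost about five times as much as input tokens at these prices, verbose extractions move the total more than page count alone would suggest.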

Infrastructure costs:

  • PDF processing compute
  • Vector database ($70-200/month)
  • Embedding API ($50-100/month)
  • Queue/worker infrastructure

Engineering costs:

  • Initial build: 2-4 months
  • Ongoing maintenance: Significant
  • Edge case debugging: Continuous

Graphlit Costs

| Plan | Cost | Includes |
|---|---|---|
| Free | 100 credits | Full platform with Claude extraction |
| Starter | $49/month | 1,000 credits |
| Pro | $149/month | 5,000 credits |
| Enterprise | Custom | Volume discounts |

Credits include extraction (Claude or other backends) plus all infrastructure — embeddings, entities, search, conversations.

Real Cost Comparison

Processing 500 documents/month:

DIY Claude:

  • API: ~$75-250/month (varies by document size)
  • Vector DB: $70/month
  • Embedding: $30/month
  • Engineering: ??? (your team's time)
  • Total: $175+ plus significant engineering time

Graphlit:

  • Pro plan: $149/month
  • Engineering: Hours of integration
  • Total: $149/month, everything included

When to Use Claude Directly

Use raw Claude API calls when:

  • One-off extraction: Single documents, manual review
  • Experimentation: Learning how vision models work
  • Custom prompts: Specific extraction needs with unique prompting
  • Existing infrastructure: You've already built the pipeline
  • Cost optimization at scale: Very high volume with custom optimization

If you're extracting a few documents manually and don't need search, embeddings, or conversations, direct API calls work fine.


When You Need Infrastructure

Use Graphlit when:

  • Production workloads: Reliability and consistency matter
  • Multiple documents: More than occasional one-offs
  • Search and retrieval: You need to find information later
  • Knowledge graphs: Entity extraction and relationships
  • RAG applications: Conversational AI over documents
  • Team collaboration: Multiple users accessing shared knowledge
  • Time constraints: You can't spend months building infrastructure

The gap between "Claude can read this" and "production document system" is infrastructure. Graphlit provides that infrastructure.


Integration Example

DIY Claude OCR

import anthropic
import base64
import io  # needed for the in-memory PNG buffer below
from pdf2image import convert_from_path

client = anthropic.Anthropic()

# Convert PDF to images
images = convert_from_path("document.pdf")

results = []
for i, image in enumerate(images):
    # Convert to base64
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    image_data = base64.b64encode(buffer.getvalue()).decode()
    
    # Call Claude (handle rate limits yourself)
    try:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            messages=[{
                "role": "user",
                "content": [
                    {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": image_data}},
                    {"type": "text", "text": "Extract all text from this document page as Markdown."}
                ]
            }]
        )
        results.append(response.content[0].text)
    except anthropic.RateLimitError:
        # Handle rate limiting (implement backoff)
        pass
    except Exception as e:
        # Handle other errors
        pass

# Combine results (handle page boundaries)
full_text = "\n\n".join(results)

# Now you need:
# - Embedding pipeline
# - Vector database
# - Entity extraction
# - Search indexing
# - RAG implementation
# - Error handling
# - ...

Graphlit with Claude Backend

import { Graphlit, Types } from 'graphlit-client';

const client = new Graphlit();

// Create workflow with Claude extraction
const workflow = await client.createWorkflow({
    name: "Claude LLM Extraction",
    preparation: {
        jobs: [{
            connector: {
                type: Types.FilePreparationServiceTypes.ModelDocument,
                modelDocument: {
                    specification: { id: claudeSpecificationId }
                }
            }
        }]
    }
});

// Ingest document — everything automatic
const result = await client.ingestUri(
    "https://example.com/complex-document.pdf",
    "Financial Report",
    undefined,
    undefined,
    true,
    { id: workflow.createWorkflow?.id }
);

// Document is now:
// - Extracted by Claude with vision
// - Multi-page handled automatically
// - Embedded for vector search
// - Entities extracted
// - Knowledge graph updated
// - Search indexed

// Query immediately
const contents = await client.queryContents({
    search: "quarterly revenue"
});

// RAG conversation ready
const response = await client.promptConversation(
    "Summarize the key financial metrics",
    conversationId,
    { id: specificationId }
);

Summary

Claude's vision capabilities are excellent. For document understanding and text extraction, Claude Sonnet is among the best available.

But "Claude OCR" isn't a product — it's an API call. Building production document processing requires:

  • PDF handling and page management
  • Token optimization and chunking
  • Rate limiting and error recovery
  • Output normalization
  • Embedding and search infrastructure
  • Entity extraction and knowledge graphs
  • RAG conversation assembly

Graphlit provides the infrastructure that turns Claude's capabilities into a production system. You get Claude's extraction quality plus everything else you need to build AI applications.

Don't build document infrastructure from scratch. Use Claude through Graphlit and focus on your application.


Claude can read documents. Graphlit turns that into infrastructure.
