Most search systems only handle text documents—PDFs, Word docs, plain text files.
Your team's knowledge lives in:
- Meeting recordings (audio, video) - architectural decisions, product reviews
- Screenshots (images) - error messages, UI designs, whiteboard sessions
- Videos (YouTube, Loom) - product demos, onboarding tutorials
- Code files (GitHub) - implementations, not just docs
- Presentations (PowerPoint, Keynote) - strategy decks, customer pitches
If your search can't handle these formats, you're missing 60%+ of your team's knowledge.
Zine's multimodal processing extracts searchable text from audio, video, images, and code—so you can search:
- "What did the CTO say about microservices?" (from meeting recording)
- "Show me error screenshots from Slack" (OCR'd images)
- "Find the authentication implementation" (syntax-aware code search)
- "Onboarding video about our deployment process" (video transcripts)
This guide covers all supported formats, how they're processed, and how to search them effectively.
Table of Contents
- Supported Formats Overview
- Audio Processing
- Video Processing
- Image Processing (OCR)
- Code Processing
- Document Processing
- Searching Multimodal Content
- Best Practices
Supported Formats Overview
All Supported Content Types
Audio Formats:
- MP3, WAV, M4A, FLAC, OGG
- Podcast recordings
- Voice memos
- Meeting audio
- Phone call recordings
Video Formats:
- MP4, MOV, AVI, WMV, MKV
- Meeting recordings (Zoom, Google Meet, Teams)
- Product demos (Loom, Screen Studio)
- YouTube videos (via URL)
- Training videos
Image Formats (with OCR):
- JPEG, PNG, GIF, TIFF, BMP, WebP
- Screenshots
- Whiteboard photos
- Diagrams and charts
- Scanned documents
- UI mockups
Code Formats (syntax-aware):
- JavaScript/TypeScript (.js, .ts, .jsx, .tsx)
- Python (.py)
- Java (.java)
- C/C++ (.c, .cpp, .h)
- Go (.go)
- Rust (.rs)
- Ruby (.rb)
- PHP (.php)
- Swift (.swift)
- Kotlin (.kt)
- 30+ languages supported
Document Formats:
- PDF (searchable and scanned)
- Microsoft Word (.doc, .docx)
- PowerPoint (.ppt, .pptx)
- Excel (.xls, .xlsx)
- Google Docs, Sheets, Slides
- Markdown (.md)
- Plain text (.txt)
- HTML (.html)
- RTF, ODT, LaTeX
Presentation Formats:
- PowerPoint slides (text + embedded images OCR'd)
- Keynote exports
- Google Slides (via Drive connector)
Audio Processing
How Audio is Processed
Step 1: Ingestion
- Upload audio file or connect feed (Zoom, Google Meet)
- Zine detects format (MP3, WAV, etc.)
Step 2: Transcription
- Speech-to-text (OpenAI Whisper or similar)
- Multi-language support (50+ languages)
- Speaker diarization (identifies who said what)
- Timestamp alignment (every utterance timestamped)
Step 3: Indexing
- Transcript becomes searchable text
- Timestamps preserved for navigation
- Speaker names extracted (if available)
Step 4: Enrichment (optional)
- Entity extraction (people, companies mentioned)
- Key topics identified
- Action items extracted
Supported Audio Sources
Direct Upload:
Upload MP3, WAV, M4A files (up to 500MB)
Meeting Recordings (via connectors):
- Zoom: Auto-sync cloud recordings
- Google Meet: Recordings saved to Drive
- Microsoft Teams: Meeting recordings from OneDrive
- Loom: Video/audio from Loom library
Podcasts:
- RSS feed connector (auto-download new episodes)
Voice Memos:
- Upload from phone (iOS Voice Memos, Android Recorder)
- Dropbox/Drive sync
Example: Meeting Recording Search
Scenario: Product roadmap review meeting (1 hour)
Uploaded: Zoom recording (meeting.mp4, 350MB)
Zine processes:
- Extracts audio track
- Transcribes (10 minutes processing)
- Identifies speakers: Alice (PM), Bob (Eng Lead), Sarah (Design)
- Timestamps every sentence
Transcript indexed:
[00:03:42] Alice: Should we prioritize mobile app in Q4?
[00:04:15] Bob: Mobile is technically complex. I'd estimate 3 months.
[00:04:58] Sarah: From design perspective, mobile needs 2 weeks prep.
[00:06:12] Alice: Let's target Q4 for beta, Q1 for full launch.
Now searchable:
- Query: "mobile app priority"
- Returns: This meeting, jump to 00:03:42
- Query: "Bob's concerns about mobile"
- Returns: Timestamp 00:04:15 ("technically complex")
- Query: "Q4 roadmap decisions"
- Returns: This meeting + other roadmap discussions
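To make the timestamped transcript concrete, here is a minimal Python sketch that parses lines in the format shown above and looks up where a term was mentioned. It is an illustration only, not Zine's implementation: `Utterance`, `parse_transcript`, `find_mentions`, and `format_ts` are hypothetical names, and it uses plain substring matching where a real search engine would use ranking.

```python
import re
from dataclasses import dataclass

# Matches lines such as "[00:03:42] Alice: Should we prioritize ..."
LINE_RE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s+([^:]+):\s+(.*)$")


@dataclass
class Utterance:
    seconds: int  # offset from the start of the recording
    speaker: str
    text: str


def parse_transcript(raw: str) -> list[Utterance]:
    """Turn '[HH:MM:SS] Speaker: text' lines into Utterance records."""
    utterances = []
    for line in raw.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            hours, minutes, secs, speaker, text = match.groups()
            offset = int(hours) * 3600 + int(minutes) * 60 + int(secs)
            utterances.append(Utterance(offset, speaker.strip(), text))
    return utterances


def find_mentions(utterances, term, speaker=None):
    """Case-insensitive substring match, optionally limited to one speaker."""
    term = term.lower()
    return [
        u for u in utterances
        if term in u.text.lower() and (speaker is None or u.speaker == speaker)
    ]


def format_ts(seconds: int) -> str:
    """Render an offset in seconds back to HH:MM:SS for a jump link."""
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


TRANSCRIPT = """\
[00:03:42] Alice: Should we prioritize mobile app in Q4?
[00:04:15] Bob: Mobile is technically complex. I'd estimate 3 months.
[00:04:58] Sarah: From design perspective, mobile needs 2 weeks prep.
[00:06:12] Alice: Let's target Q4 for beta, Q1 for full launch.
"""

if __name__ == "__main__":
    for hit in find_mentions(parse_transcript(TRANSCRIPT), "mobile", speaker="Bob"):
        print(format_ts(hit.seconds), hit.speaker, hit.text)
```

Storing the offset as an integer number of seconds keeps sorting and "jump to" links simple. The display format is only produced at the edge.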
Video Processing
How Video is Processed
Step 1: Ingestion
- Upload video file or connect feed (YouTube, Loom)
Step 2: Audio Extraction
- Extract audio track from video
- Transcribe (same process as audio)
Step 3: Visual Analysis (optional)
- Extract frames (every 5 seconds)
- OCR any text visible in video (screen shares, slides)
- Identify scenes/chapters
Step 4: Indexing
- Transcript + visual text searchable
- Thumbnail generated
- Chapter markers (if available)
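The steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Zine's pipeline: `frame_times` and `merge_by_time` are hypothetical helpers, the 5-second interval comes from Step 3, and the spoken and on-screen text are assumed to arrive as `(seconds, text)` pairs from a transcription and OCR stage that is not shown.

```python
def frame_times(duration_s: int, interval_s: int = 5) -> list[int]:
    """Offsets (in seconds) at which a frame would be sampled for OCR."""
    return list(range(0, duration_s, interval_s))


def merge_by_time(transcript, visual):
    """Combine spoken and on-screen text into one time-ordered list.

    Both inputs are lists of (seconds, text) pairs. Each output entry is
    (seconds, source, text), where source is "speech" or "screen".
    """
    tagged = [(t, "speech", text) for t, text in transcript]
    tagged += [(t, "screen", text) for t, text in visual]
    # sorted() is stable, so speech stays ahead of screen text at equal offsets.
    return sorted(tagged, key=lambda entry: entry[0])


if __name__ == "__main__":
    print(len(frame_times(15 * 60)), "frames for a 15-minute video")
    # The on-screen strings below are made-up sample OCR output.
    speech = [(12, "Here's the new checkout flow...")]
    screen = [(10, "Pay now"), (12, "Card number")]
    for entry in merge_by_time(speech, screen):
        print(entry)
```

Keeping the source tag on each entry lets a search result say whether a match was spoken or shown on screen.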
Supported Video Sources
Direct Upload:
Upload MP4, MOV, AVI (up to 2GB)
YouTube (via URL or feed):
Paste YouTube URL → Zine downloads + transcribes
Meeting Platforms:
- Zoom: Cloud recordings
- Google Meet: Drive-saved recordings
- Microsoft Teams: OneDrive recordings
Screen Recording Tools:
- Loom: Library connector
- Screen Studio: Export MP4s
- OBS recordings: Upload MP4s
Example: Product Demo Video Search
Scenario: Loom demo of new feature (15 minutes)
Uploaded: demo-checkout-flow.mp4
Zine processes:
- Transcribes narration: "Here's the new checkout flow..."
- OCRs screen content: Button labels, form fields
- Indexes both transcript + visible text
Now searchable:
- Query: "checkout flow demo"
- Returns: This video, plays from start
- Query: "payment button"
- Returns: Timestamp where button is shown + mentioned
- Query: "How do we handle errors in checkout?"
- Returns: Section of video discussing error handling
Image Processing (OCR)
How Images are Processed
Step 1: Ingestion
- Upload an image directly, or let a connector pull it from Slack, Drive, or email
Step 2: OCR (Optical Character Recognition)
- Extract all text visible in image
- Handle:
  - Screenshots (UI text, code snippets, error messages)
  - Whiteboard photos (handwritten notes, diagrams)
  - Scanned documents (PDFs, receipts)
  - Charts/graphs (axis labels, legends)
Step 3: Indexing
- Extracted text becomes searchable
- Image thumbnail preserved
- Metadata (filename, source, date)
Supported Image Sources
Direct Upload:
Upload JPEG, PNG, GIF, TIFF (up to 50MB)
Slack (via connector):
- Screenshots shared in channels
- Whiteboard photos
- Design mockups
Google Drive (via connector):
- Scanned documents
- Photos
Email (via connector):
- Inline images
- Attachments
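The direct-upload limits quoted in this guide (500MB for audio, 2GB for video, 50MB for images) can be checked before a file is sent. The following Python sketch is illustrative only: `check_upload` and `LIMITS` are hypothetical names, the extension sets cover only the formats named next to each limit, and treating 1MB as 1024 × 1024 bytes is an assumption.

```python
from pathlib import PurePath

MB = 1024 * 1024  # assumption: binary megabytes

# kind -> (extensions listed for direct upload, size limit in bytes)
LIMITS = {
    "audio": ({".mp3", ".wav", ".m4a"}, 500 * MB),
    "video": ({".mp4", ".mov", ".avi"}, 2048 * MB),
    "image": ({".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}, 50 * MB),
}


def check_upload(filename: str, size_bytes: int):
    """Return (kind, fits_limit); kind is None for unrecognized extensions."""
    ext = PurePath(filename).suffix.lower()
    for kind, (extensions, limit) in LIMITS.items():
        if ext in extensions:
            return kind, size_bytes <= limit
    return None, False


if __name__ == "__main__":
    print(check_upload("meeting.mp4", 350 * MB))
    print(check_upload("memo.wav", 600 * MB))
```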
Example: Screenshot Search
Scenario: Developer shares error screenshot in Slack
Image content:
[Screenshot of console]
Error: ECONNREFUSED 127.0.0.1:6379
at RedisClient.connect (/app/redis.js:42)
at Database.init (/app/db.js:18)
Zine processes:
- OCRs screenshot → Extracts text
- Indexes error message, stack trace
- Links to Slack message context
Now searchable:
- Query: "Redis ECONNREFUSED error"
- Returns: This screenshot + Slack thread discussing fix
- Query: "redis.js line 42"
- Returns: This screenshot + GitHub file redis.js
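A query like "redis.js line 42" can only match if the file and line number are pulled out of the OCR'd text. Here is a minimal Python sketch of that step, applied to the console text above. It is an assumption about how such linking could work, not a description of Zine's internals. `FRAME_RE` and `stack_frames` are hypothetical names, and the pattern only covers Node-style `at symbol (file:line)` frames.

```python
import re
from posixpath import basename

OCR_TEXT = """Error: ECONNREFUSED 127.0.0.1:6379
at RedisClient.connect (/app/redis.js:42)
at Database.init (/app/db.js:18)"""

# One stack frame per line: "at <symbol> (<file>:<line>)"
FRAME_RE = re.compile(r"^\s*at\s+([\w.$]+)\s+\((.+?):(\d+)\)", re.MULTILINE)


def stack_frames(text: str) -> list[dict]:
    """Extract symbol, file path, and line number from each stack frame."""
    return [
        {"symbol": m.group(1), "file": m.group(2), "line": int(m.group(3))}
        for m in FRAME_RE.finditer(text)
    ]


if __name__ == "__main__":
    for frame in stack_frames(OCR_TEXT):
        # The bare filename is what a user is likely to type in a query.
        print(basename(frame["file"]), frame["line"], frame["symbol"])
```

OCR output is noisy in practice, so a real pipeline would likely need more tolerant patterns than this one.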
Code Processing
How Code is Processed (Syntax-Aware)
Step 1: Ingestion
- GitHub connector syncs repos
- Direct file upload
Step 2: Syntax Parsing
- Language detection (JavaScript, Python, etc.)
- Parse AST (Abstract Syntax Tree)
- Identify:
  - Function definitions
  - Class declarations
  - Imports/dependencies
  - Comments
Step 3: Indexing
- Full-text search (every line)
- Symbol search (functions, classes)
- Semantic understanding (what code does)
Step 4: Enrichment
- Links to GitHub (file, line numbers)
- Links to related PRs, issues
- Links to Slack discussions mentioning this code
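To show what "parse the AST and identify symbols" means in practice, here is a small Python sketch using the standard library's `ast` module on a made-up source file. It stands in for whatever parser Zine actually uses, which this guide does not specify. `extract_symbols` and the sample `SOURCE` are hypothetical. Note that Python's `ast` discards comments, so indexing comments would need a separate pass (for example with the `tokenize` module).

```python
import ast

# Made-up sample file used only for this illustration.
SOURCE = '''
import hashlib
from dataclasses import dataclass

class UserStore:
    def create_user(self, name):
        """Create a user record."""
        return {"name": name}

def hash_password(password):
    return hashlib.sha256(password.encode()).hexdigest()
'''


def extract_symbols(source: str) -> list[tuple[str, str, int]]:
    """Return (kind, name, line) for functions, classes, and imports."""
    symbols = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols.append(("function", node.name, node.lineno))
        elif isinstance(node, ast.ClassDef):
            symbols.append(("class", node.name, node.lineno))
        elif isinstance(node, ast.Import):
            symbols.extend(("import", alias.name, node.lineno) for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            symbols.append(("import", node.module or "", node.lineno))
    # ast.walk is breadth-first, so sort by line for a readable listing.
    return sorted(symbols, key=lambda s: s[2])


if __name__ == "__main__":
    for kind, name, line in extract_symbols(SOURCE):
        print(f"{line:>3}  {kind:<8}  {name}")
```

Recording the line number with each symbol is what makes deep links to a specific line in a file possible.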
Syntax-Aware Search
Traditional search: Keyword match only
Zine code search: Understands code structure
Example:
Query: "createUser function"
Traditional search returns:
- All files containing string "createUser" (hundreds of matches)
Zine code search returns (ranked):
- Function definition: function createUser() in auth-service/users.js
- Function calls: Where createUser() is called (usage examples)
- Tests: test('createUser should...)
- Documentation: README mentioning createUser
- Slack discussions: Team discussing createUser implementation
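The ranked order above (definition, then calls, tests, documentation, and discussions) can be expressed as a simple two-level sort. This Python sketch is a guess at the idea, not Zine's ranking algorithm. `KIND_PRIORITY`, `rank_hits`, the `score` field, and the sample hit locations are hypothetical.

```python
# Lower number = shown earlier. The order mirrors the list above.
KIND_PRIORITY = {
    "definition": 0,
    "call": 1,
    "test": 2,
    "documentation": 3,
    "discussion": 4,
}


def rank_hits(hits: list[dict]) -> list[dict]:
    """Sort by kind first, then by descending text-relevance score."""
    return sorted(
        hits,
        key=lambda h: (KIND_PRIORITY.get(h["kind"], len(KIND_PRIORITY)), -h["score"]),
    )


if __name__ == "__main__":
    hits = [
        {"kind": "discussion", "where": "Slack thread", "score": 0.9},
        {"kind": "call", "where": "a calling file", "score": 0.4},
        {"kind": "definition", "where": "auth-service/users.js", "score": 0.7},
        {"kind": "call", "where": "another calling file", "score": 0.8},
    ]
    for hit in rank_hits(hits):
        print(hit["kind"], hit["where"])
```

With this key, a keyword-heavy Slack thread can never outrank the function definition, which is the behavior the example describes.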
Supported Languages
Strongly supported (syntax-aware):
- JavaScript, TypeScript, React (JSX/TSX)
- Python
- Java, Kotlin
- Go
- Rust
- C, C++
- C#
- Ruby
- PHP
- Swift
Text search (still searchable, less syntax awareness):
- Shell scripts (bash, zsh)
- SQL
- YAML, JSON, TOML
- Markdown
- HTML, CSS
Example: Code Search
Query: "authentication middleware"
Returns:
- GitHub code: middleware/auth.ts (function authenticateRequest())
- GitHub PR #234: "Add authentication middleware" (implementation)
- Slack #engineering: Discussion about auth approach
- Notion: "Auth Architecture" spec (requirements)
- GitHub issues: Bug reports mentioning auth middleware
All connected: Spec → Discussion → Implementation → Issues
Document Processing
PowerPoint/Keynote
Processing:
- Extract text from slides
- OCR embedded images (screenshots, charts)
- Extract speaker notes
- Index slide order (Slide 1, 2, 3...)
Search:
- Query: "Q4 roadmap"
- Returns: Presentation, jumps to relevant slide
PDF (Scanned Documents)
Processing:
- Detect if PDF is searchable (text layer) or scanned (images)
- If scanned: OCR every page
- Extract tables, charts (with labels)
- Index page numbers
Search:
- Query: "revenue projections"
- Returns: Financial PDF, page 14
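The "searchable or scanned" check can be pictured as a per-page decision: if a page's text layer is empty or nearly empty, it needs OCR. The Python sketch below assumes the text layer of each page has already been extracted by some PDF library (not shown). `pages_needing_ocr` and the 20-character threshold are hypothetical choices for illustration.

```python
def pages_needing_ocr(page_texts: list[str], min_chars: int = 20) -> list[int]:
    """Return 1-based page numbers whose text layer is too short to trust."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len(text.strip()) < min_chars
    ]


if __name__ == "__main__":
    # Made-up text layers: page 2 is a scanned image with no text layer.
    pages = ["Revenue projections for the next fiscal year...", "", "   \n"]
    print(pages_needing_ocr(pages))
```

Deciding page by page, not per file, handles mixed PDFs where only some pages are scans.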
Excel/Google Sheets
Processing:
- Extract text from cells
- Preserve table structure (row/column context)
- Index sheet names
Search:
- Query: "Acme Corp pricing"
- Returns: Pricing spreadsheet, Sheet: "Enterprise Customers", Row 23
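Preserving row and column context means each cell is indexed together with where it sits. Here is a minimal Python sketch using CSV text as a stand-in for a real spreadsheet. `index_rows` is a hypothetical helper, the sample data is invented, and this is not Zine's spreadsheet parser.

```python
import csv
import io


def index_rows(sheet_name: str, csv_text: str) -> list[dict]:
    """Emit one record per non-empty cell, tagged with sheet, row, and column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    records = []
    # Row 1 is the header, so data rows start at 2 (spreadsheet numbering).
    for row_number, row in enumerate(reader, start=2):
        for column, value in zip(header, row):
            if value:
                records.append(
                    {"sheet": sheet_name, "row": row_number, "column": column, "text": value}
                )
    return records


if __name__ == "__main__":
    # Invented sample data for illustration.
    data = "Customer,Plan,Price\nGlobex,Team,$500\nAcme Corp,Enterprise,$2000\n"
    for record in index_rows("Enterprise Customers", data):
        if "acme" in record["text"].lower():
            print(record)
```

Because each record carries its sheet and row, a result can point at the exact row where a match was found, as in the example above.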
Searching Multimodal Content
Unified Search (All Formats)
Query: "Redis performance"
Returns (mixed formats):
- Meeting recording (audio): Architecture review discussing Redis
- Slack screenshot (image): Redis performance graph (OCR'd labels)
- GitHub code: redis-client.ts implementation
- Notion doc (text): "Redis Configuration Guide"
- Loom video: Demo of Redis integration
All ranked by relevance, all formats unified.
Filtering by Content Type
Search only videos:
type:video Redis performance
Search only code:
type:code createUser function
Search only images:
type:image error screenshot
Search only audio/meetings:
type:audio roadmap discussion
Time-Based Filtering
Recent recordings:
after:7d meeting recording
Old presentations (may be outdated):
before:2024-01-01 roadmap presentation
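The filter syntax above splits naturally into structured filters plus free-text terms. This Python sketch shows one way such a query string could be parsed. It is not Zine's parser: `parse_query` is a hypothetical function, and reading `after:7d` as "within the last 7 days" is an assumption based on the "Recent recordings" label.

```python
import re
from datetime import date, timedelta

# key:value filters used in this guide: type:, after:, before:
FILTER_RE = re.compile(r"\b(type|after|before):(\S+)")


def parse_query(query: str, today: date):
    """Split a query into (filters, terms).

    Date filters accept a relative value like "7d" or an ISO date
    like "2024-01-01"; both are converted to datetime.date.
    """
    filters = {}
    for key, value in FILTER_RE.findall(query):
        if key == "type":
            filters["type"] = value
            continue
        relative = re.fullmatch(r"(\d+)d", value)
        if relative:
            filters[key] = today - timedelta(days=int(relative.group(1)))
        else:
            filters[key] = date.fromisoformat(value)
    terms = FILTER_RE.sub("", query).split()
    return filters, terms


if __name__ == "__main__":
    today = date(2025, 11, 13)
    print(parse_query("type:video Redis performance", today))
    print(parse_query("after:7d meeting recording", today))
    print(parse_query("before:2024-01-01 roadmap presentation", today))
```

Passing `today` in explicitly keeps the relative-date logic deterministic and easy to test.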
Best Practices
1. Name Files Descriptively
Bad:
recording.mp4
screenshot.png
slides.pptx
Good:
2025-11-13-roadmap-review-meeting.mp4
redis-error-screenshot-2025-11-13.png
Q4-product-roadmap-slides.pptx
Why: Filenames are indexed, so descriptive names give search more to match on.
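A quick way to see the difference is to look at the search terms each filename yields. This Python sketch assumes filenames are split on hyphens, underscores, spaces, and dots. `filename_terms` is a hypothetical helper, and Zine's actual tokenization may differ.

```python
import re


def filename_terms(filename: str) -> list[str]:
    """Drop the extension and split the rest into lowercase search terms."""
    stem = filename.rsplit(".", 1)[0]
    return [term.lower() for term in re.split(r"[-_\s.]+", stem) if term]


if __name__ == "__main__":
    print(filename_terms("recording.mp4"))
    print(filename_terms("2025-11-13-roadmap-review-meeting.mp4"))
```

The generic name contributes a single term, while the descriptive one contributes the date and three topic words.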
2. Use Timestamps in Results
For long recordings:
- Zine returns timestamp where match occurs
- Click timestamp → jump to exact moment in video/audio
Example: 1-hour meeting, query "mobile app"
- Returns: Timestamp 00:34:12 where mobile was discussed
- Click → plays from 00:34:12
3. Combine Multimodal with Text Search
Best queries mix formats:
Query: "authentication system"
Returns:
- Code (auth-service/)
- Meeting recordings (arch reviews)
- Slack discussions
- Notion specs
- Screenshots (auth flow diagrams)
Result: Complete picture, all formats.
4. Upload Meeting Recordings Immediately
Don't wait:
- After meetings, upload recording same day
- Zine processes in background (10-30 min)
- Searchable by next meeting
Benefit: Context preserved while fresh.
5. OCR Whiteboards and Sketches
After whiteboard sessions:
- Take photo
- Upload to Zine (or share in Slack)
- Zine OCRs handwritten notes (if legible)
Searchable: Brainstorming sessions, architecture sketches.
6. Connect YouTube for Tutorials
If your team shares YouTube tutorials:
- Connect YouTube channel/playlist via feed
- Zine auto-transcribes new videos
- Search video content like docs
7. Use Dev Mode for Code + Context
When searching code:
- Use Dev Mode (split view)
- Left: Code files
- Right: Related issues, PRs, Slack discussions
Example: Click auth.ts → See related Slack thread about auth decisions.
Next Steps
Now that you understand multimodal processing:
- ✅ Upload Meeting Recordings: Past architecture reviews, product demos
- ✅ Connect Slack: Screenshots shared in channels get OCR'd automatically
- ✅ Connect GitHub: Code becomes searchable alongside discussions
- ✅ Test Unified Search: Query something discussed in meeting + Slack + code
- ✅ Use Timeline View: See chronological narrative across all formats
Related Guides:
- Data Connectors - Connect Zoom, Slack, GitHub, Drive
- GitHub Intelligence - Deep dive on code search
- Slack Knowledge Base - Search screenshots in Slack
Learn More:
- Try Zine - Free tier available
- Schedule a demo - See multimodal search in action
Text is just one format. Your team's knowledge lives in audio, video, images, and code. Search it all.