3:47 AM. Your phone buzzes. Checkout API is down.
You're on-call. You need to:
- Check error logs (Sentry)
- Search Slack #incidents for similar issues
- Check GitHub for recent deployments
- Search Notion for runbooks
- Ask teammates if they remember this error
By the time you've gathered context, it's 4:15 AM—28 minutes wasted before you even start fixing.
Zine transforms incident response: one search returns error logs (via Sentry MCP), Slack incident history, recent GitHub changes, and runbooks. Root cause identified in 2 minutes.
This guide shows DevOps, SREs, and on-call engineers how to set up unified incident context that saves hours during critical moments.
Table of Contents
- The Incident Response Time Sink
- The Zine Incident Response Workflow
- Setup: One-Time Configuration
- During an Incident: The 2-Minute Context Gather
- Connecting Sentry MCP for Error Logs
- Post-Incident: Automated Documentation
- Real Incident Examples
- Best Practices
The Incident Response Time Sink
Traditional Incident Response Flow
Step 1: Identify the problem (2 minutes)
- Alert comes in: "Checkout API 500 errors spiking"
- Check monitoring dashboard (Datadog, New Relic, etc.)
Step 2: Check error logs (5-10 minutes)
- Open Sentry or CloudWatch
- Filter by service: checkout-api
- Filter by timeframe: Last hour
- Read stack traces, identify error patterns
Step 3: Search Slack #incidents (5-10 minutes)
- "Has this happened before?"
- Manually scroll through channel
- Find similar incident from 3 months ago
- Read 50-message thread to find resolution
Step 4: Check recent deployments (5-10 minutes)
- Open GitHub
- Check recent PRs merged to production
- Read PR descriptions, commits
- Identify suspicious changes
Step 5: Search runbooks (3-5 minutes)
- Open Notion or Confluence
- Search "checkout troubleshooting"
- Find (hopefully) relevant runbook
Step 6: Ask teammates (10-15 minutes)
- Slack: "Anyone seen checkout errors before?"
- Wait for responses
- Senior engineer shares context from memory
Total time: 30-45 minutes of context gathering before you start fixing.
During that time: Users can't check out. Revenue lost. Stress accumulates.
The Zine Incident Response Workflow
Unified Incident Context in One Query
3:47 AM Alert: Checkout API errors
3:48 AM - Open Zine, one query:
Checkout API errors OR timeouts
3:49 AM - Zine returns (in 15 seconds):
- Sentry (via MCP): 87 errors in last hour, stack trace shows Redis timeout
- Slack #incidents (2 months ago): Same error, resolution documented (increase Redis timeout)
- GitHub PR #567: Merged yesterday, "Optimize Redis cache" (modified timeout config)
- Slack #engineering (yesterday): "Concerns about new Redis timeout settings" (Alice raised this)
- Notion runbook: "Redis Troubleshooting" (timeout adjustment procedure)
- GitHub PR #601 (2 months ago): Past fix for same issue
3:49 AM - Hypothesis identified:
- Recent Redis config change (PR #567) set the timeout too aggressively
- This caused the same issue 2 months ago (PR #601 fixed it)
- Alice warned about this yesterday in Slack
3:50 AM - Fix:
- Revert Redis timeout config
- OR: Increase timeout based on runbook
- Deploy fix
Total time: 3 minutes from alert to fix deployment.
Time saved: 27-42 minutes.
Setup: One-Time Configuration
Step 1: Connect Core Tools to Zine
Required:
- Slack: Connect #incidents, #engineering, #devops channels
- GitHub: Connect repos (especially backend services)
- Notion: Connect runbooks, architecture docs
Recommended:
- Meeting recordings: Past architecture/incident review meetings
- Email: Vendor discussions, escalation threads
Initial sync: 1-3 hours (one-time)
Step 2: Connect Sentry MCP (Optional but Powerful)
Sentry offers an MCP server for error tracking.
Add Sentry MCP to Zine (as MCP client):
- Zine Settings → MCP Servers → Add Server
- Select "Sentry MCP"
- Enter Sentry API key
- Authorize
Now Zine can query Sentry errors in addition to searching Slack/GitHub.
Benefit: One query gets error logs + team discussions + code changes.
Step 3: Create Saved Views for Incidents
In Zine, create saved views:
"Recent Incidents":
source:slack channel:#incidents after:30d
"Production Changes":
source:github label:production merged after:7d
"Critical Bugs":
source:github label:critical state:open
Time saved: One-click access during incidents.
Step 4: Set Up Alert (Optional)
Create an alert for proactive monitoring:
Alert Name: "Incident Monitor"
Query:
Slack #incidents new threads
OR GitHub issues labeled 'production' OR 'outage'
OR Sentry errors increased by 50%+
from the last hour
Schedule: Hourly
Delivery: Slack DM
Benefit: Know about incidents immediately, even if you're not actively monitoring.
During an Incident: The 2-Minute Context Gather
Query Templates for Common Incidents
API Errors:
[service-name] API errors OR timeouts OR 500
Database Issues:
Database OR postgres OR mongodb slow OR timeout OR connection
Cache Problems:
Redis OR memcached OR cache timeout OR eviction
Deployment Issues:
Deployment OR deploy failed OR rollback recent
Performance Degradation:
Slow OR performance OR latency [service-name]
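If you keep these templates somewhere handy, you can fill in the service name and paste the result straight into Zine. A minimal sketch (the template strings mirror the examples above; the helper itself is a hypothetical convenience script, not part of Zine):

# incident_queries.py - fill incident query templates with a service name.
# The templates mirror the examples above; adapt them to your own services.

TEMPLATES = {
    "api_errors": "{service} API errors OR timeouts OR 500",
    "database": "Database OR postgres OR mongodb slow OR timeout OR connection",
    "cache": "Redis OR memcached OR cache timeout OR eviction",
    "deployment": "Deployment OR deploy failed OR rollback recent",
    "performance": "Slow OR performance OR latency {service}",
}

def build_query(kind: str, service: str = "") -> str:
    """Return a ready-to-paste query for the given incident type."""
    return TEMPLATES[kind].format(service=service).strip()

if __name__ == "__main__":
    # Example: print the API-error query for the checkout service.
    print(build_query("api_errors", "checkout-api"))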
What to Look For in Results
1. Past Incidents (Slack #incidents):
- "Has this happened before?"
- If yes: How was it resolved?
- Time saved: Don't re-diagnose
2. Recent Changes (GitHub):
- PRs merged in last 24-48 hours
- Changes to affected service
- Likely culprits for new bugs
3. Known Issues (GitHub Issues):
- Open issues about this service
- Known bugs or limitations
- Workarounds documented
4. Runbooks (Notion):
- Troubleshooting procedures
- Recovery steps
- Contact information for escalation
5. Team Knowledge (Slack #engineering):
- Discussions about this service
- Known gotchas or edge cases
- Expertise (who knows this system best)
Connecting Sentry MCP for Error Logs
Why Sentry MCP?
Without Sentry MCP:
- Search Zine → Get Slack/GitHub context
- Open Sentry separately → Get error logs
- Manually correlate the two
With Sentry MCP:
- Search Zine → Get Slack/GitHub context + Sentry error logs in one query
- AI correlates automatically
Setup Sentry MCP
Option 1: Connect to Zine (Recommended)
- Zine Settings → MCP Servers → Add Server
- Select "Sentry"
- Enter:
Sentry API Key: your-sentry-api-key
Sentry Organization: your-org-slug
- Save
Now when you query Zine, it can include Sentry data.
Option 2: Connect Directly in Cursor
Add Sentry MCP alongside Zine MCP in Cursor config:
{
  "mcpServers": {
    "zine": { ... },
    "sentry": {
      "type": "sse",
      "url": "sentry-mcp-endpoint",
      "apiKey": "your-sentry-key"
    }
  }
}
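Whichever option you choose, confirm the Sentry credentials work during calm hours rather than at 3 AM. A minimal standalone sanity check against Sentry's REST API (the endpoint path follows Sentry's public API docs; the environment variable names are placeholders to adapt, and this script is separate from the MCP setup itself):

# check_sentry_key.py - sanity-check the Sentry API key before you need it.
import os
import requests

SENTRY_API_KEY = os.environ["SENTRY_API_KEY"]   # placeholder env var name
SENTRY_ORG = os.environ["SENTRY_ORG"]           # e.g. "your-org-slug"

resp = requests.get(
    f"https://sentry.io/api/0/organizations/{SENTRY_ORG}/projects/",
    headers={"Authorization": f"Bearer {SENTRY_API_KEY}"},
    timeout=10,
)
resp.raise_for_status()

# Print the project slugs the key can see, so you know it has the right scope.
for project in resp.json():
    print(project["slug"])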
Querying Sentry via Zine
Query:
Checkout API Sentry errors in the last hour
Zine returns:
- Sentry errors (87 errors, stack traces, affected users)
- Slack #incidents (team discussion if any)
- GitHub recent changes (PRs merged recently)
All unified.
Post-Incident: Automated Documentation
Generate Incident Report
After resolving an incident, use Zine to generate postmortem:
Query in Zine chat:
Generate incident report for checkout API timeout on November 13, 2025
Zine AI compiles:
- What happened: Timeline of errors (Sentry data)
- Root cause: GitHub PR #567 changed Redis timeout (too aggressive)
- Team response: Slack #incidents thread (Bob identified issue, Alice deployed fix)
- Resolution: Config rolled back, errors stopped
- Follow-up actions: Update runbook, add monitoring for Redis timeouts
Export to Notion: Click "Export" → saves as Notion page in "Incident Reports" database.
Time saved: 30-45 minutes writing postmortem manually.
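If you also want a copy of the postmortem alongside your code, the same fields are easy to render as Markdown. A minimal sketch (the field names and layout are illustrative, not a Zine export format):

# postmortem.py - render an incident report skeleton as Markdown.
from datetime import date

def render_postmortem(title: str, root_cause: str, response: str,
                      resolution: str, follow_ups: list[str]) -> str:
    """Build a Markdown postmortem from the fields discussed above."""
    lines = [
        f"# Incident report: {title} ({date.today().isoformat()})",
        "## Root cause", root_cause,
        "## Team response", response,
        "## Resolution", resolution,
        "## Follow-up actions",
        *[f"- {item}" for item in follow_ups],
    ]
    return "\n\n".join(lines)

if __name__ == "__main__":
    print(render_postmortem(
        title="Checkout API timeout",
        root_cause="PR #567 set the Redis timeout too low.",
        response="Bob identified the issue; Alice deployed the fix.",
        resolution="Config rolled back; errors stopped.",
        follow_ups=["Update runbook", "Add monitoring for Redis timeouts"],
    ))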
Real Incident Examples
Example 1: Database Connection Exhaustion
Alert: 4:00 AM - API returning 500 errors
Query Zine:
Database connection errors OR pool exhausted
Returns:
- Slack #incidents (6 months ago): Same error, resolution: Increase connection pool size
- GitHub PR #234: Past fix
- GitHub PR #789: Merged 3 days ago, modified database config (potential cause)
- Notion runbook: "Database Connection Issues"
Root cause identified in 3 minutes: Recent PR reduced connection pool size (optimization attempt backfired).
Fix: Revert connection pool change.
Downtime: 8 minutes (vs. 45 minutes without Zine context).
Example 2: Redis Cache Eviction Bug
Alert: 2:00 PM - Checkout flow broken
Query Zine:
Checkout OR payment redis OR cache
Returns:
- Sentry: 143 errors, "Redis key not found"
- Slack #engineering (last week): "Changed Redis eviction policy to save memory"
- GitHub PR #567: "Update Redis config" (merged 3 days ago)
- Slack #engineering (last week): Alice warned: "This might evict active session keys"
Root cause identified in 2 minutes: New eviction policy is too aggressive, evicting session keys prematurely.
Fix: Adjust eviction policy to exclude session keys.
Alice's warning was right: the Slack context meant no one was surprised, because the team already knew this was a risk.
Example 3: Third-Party API Outage
Alert: 5:30 PM - Payment processing failing
Query Zine:
Payment API errors OR Stripe OR payment gateway
Returns:
- Slack #incidents: Bob posted "Stripe status page shows outage" 10 minutes ago
- Email (from Stripe): Incident notification received 15 minutes ago
- Notion runbook: "Third-Party Outage Response" (enable fallback payment processor)
- GitHub: Fallback implementation in payment-service
Root cause identified in 1 minute: Stripe outage (external, not our bug).
Response: Enable fallback processor, notify customers, monitor Stripe status.
No time wasted debugging our code (the Slack context immediately pointed to an external issue).
Best Practices
1. Connect Tools Before Incidents Happen
Don't wait until 3 AM:
- Set up Slack, GitHub, Sentry integration during normal hours
- Test queries during calm periods
- Create saved views for common incident types
Preparation pays off when seconds matter.
2. Document Resolutions in Slack
After fixing:
- Post resolution in Slack #incidents
- Include: Root cause, fix applied, prevention steps
Why: Next time this happens (it will), Zine finds this thread immediately.
Example post:
Checkout API timeout resolved.
Root cause: PR #567 set Redis timeout to 1000ms (too low).
Fix: Increased to 3000ms.
Prevention: Added monitoring for Redis timeouts.
Runbook updated in Notion.
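Some teams script this step so the resolution post always lands in #incidents in the same shape. A minimal sketch using Slack's chat.postMessage Web API (the bot token is a placeholder and needs the chat:write scope; this is independent of Zine):

# post_resolution.py - post a structured resolution note to #incidents.
import os
import requests

SLACK_BOT_TOKEN = os.environ["SLACK_BOT_TOKEN"]   # placeholder: xoxb-... token

message = "\n".join([
    "Checkout API timeout resolved.",
    "Root cause: PR #567 set Redis timeout to 1000ms (too low).",
    "Fix: Increased to 3000ms.",
    "Prevention: Added monitoring for Redis timeouts.",
    "Runbook updated in Notion.",
])

resp = requests.post(
    "https://slack.com/api/chat.postMessage",
    headers={"Authorization": f"Bearer {SLACK_BOT_TOKEN}"},
    json={"channel": "#incidents", "text": message},
    timeout=10,
)
resp.raise_for_status()
assert resp.json().get("ok"), resp.json()   # Slack returns ok=false on API errors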
3. Use Time Filters Strategically
Recent changes (last 24-48 hours):
after:24h deploy OR merged
Past incidents (last 6 months):
after:6mo [error-pattern]
Why: New bugs likely caused by recent changes. Historical incidents provide resolution patterns.
4. Create Incident-Specific Saved Views
"Recent Deployments":
source:github merged to:production after:48h
"Open Production Issues":
source:github label:production state:open
"Past Incidents":
source:slack channel:#incidents after:30d
Benefit: One-click access during high-stress incidents.
5. Set Up Proactive Alerts
Don't wait for pages:
- Alert when Sentry errors spike
- Alert when Slack #incidents has new thread
- Alert when GitHub issues labeled "production" are created
Result: Catch issues before they become full outages.
Next Steps
Now that you understand incident response with Zine:
- ✅ Connect Tools: Slack #incidents, GitHub, Notion runbooks
- ✅ Test During Calm: Practice queries before real incidents
- ✅ Create Saved Views: For recent changes, open issues, past incidents
- ✅ Set Up Alerts: Proactive monitoring
- ✅ Add Sentry MCP: Unified error logs + context
- ✅ Document Runbook: Update team's incident response process to include Zine
Related Guides:
- MCP Integration - Connect Sentry MCP
- Slack Knowledge Base - Set up #incidents search
- GitHub Intelligence - Track deployments
- Automated Alerts - Proactive incident monitoring
Learn More:
- Try Zine - Free tier available
- Schedule a demo - Get help setting up incident response workflow
Every minute counts during incidents. Don't waste 30 of them gathering context.