Data connectors (Feeds) automatically sync content from external sources into Graphlit. Instead of manually ingesting files, feeds continuously monitor Slack channels, Gmail inboxes, Google Drive folders, GitHub repos, and 25+ other sources, keeping your knowledge base up to date as each feed polls for new content.
This guide covers feed architecture, OAuth vs API key authentication, all connector types, polling strategies, and production patterns. By the end, you'll know how to connect any data source and build automated content pipelines.
What You'll Learn
- Feed architecture and lifecycle
- OAuth flows vs API key authentication
- Connector patterns by category (messaging, cloud storage, project management)
- Feed configuration options (readLimit, schedules, filters)
- Polling vs webhook patterns
- Production feed management
- Error handling and retry strategies
Prerequisites:
- A Graphlit project - Sign up (2 min)
- SDK installed: npm install graphlit-client (30 sec)
- OAuth apps set up for connectors you want to use (we'll show you how)
Time to complete: 80 minutes
Difficulty: Intermediate
Developer Note: All Graphlit IDs are GUIDs. Example outputs show realistic GUID format.
Table of Contents
- Feed Architecture
- Authentication Methods
- Messaging Connectors
- Cloud Storage Connectors
- Project Management Connectors
- Social Media & Web Connectors
- Feed Management
- Advanced Patterns
- Production Patterns
Part 1: Feed Architecture
What is a Feed?
A feed is a continuous sync between an external data source and Graphlit. Once created, it:
- Initial sync: Fetches existing content (e.g., last 100 Slack messages)
- Continuous monitoring: Polls for new content (e.g., every 15 minutes)
- Auto-ingestion: New content automatically appears in Graphlit
Key insight: Feeds are "set it and forget it"—no manual re-triggering needed.
✅ Quick Win: Once a feed is created, new content automatically appears in your search results and RAG responses—no additional code needed.
Feed Types
Feeds are categorized by data source:
import { FeedServiceTypes } from 'graphlit-client/dist/generated/graphql-types';
// Messaging
FeedServiceTypes.Slack
FeedServiceTypes.MicrosoftTeams
FeedServiceTypes.Discord
FeedServiceTypes.Gmail
FeedServiceTypes.OutlookEmail
FeedServiceTypes.Intercom
// Cloud Storage
FeedServiceTypes.GoogleDrive
FeedServiceTypes.OneDrive
FeedServiceTypes.SharePoint
FeedServiceTypes.Dropbox
FeedServiceTypes.Box
FeedServiceTypes.GitHub
FeedServiceTypes.AmazonS3
FeedServiceTypes.AzureStorage
// Project Management
FeedServiceTypes.Jira
FeedServiceTypes.Linear
FeedServiceTypes.Notion
FeedServiceTypes.Trello
FeedServiceTypes.GitHubIssues
FeedServiceTypes.GitHubPullRequests
// Social/Web
FeedServiceTypes.Reddit
FeedServiceTypes.Twitter
FeedServiceTypes.YouTube
FeedServiceTypes.Rss
FeedServiceTypes.Web // Web crawling
FeedServiceTypes.Sitemap
// Calendars
FeedServiceTypes.GoogleCalendar
FeedServiceTypes.OutlookCalendar
Feed Lifecycle
CREATE → ENABLED → SYNCING → INDEXED
↓
DISABLED (if paused)
↓
DELETED (if removed)
Part 2: Authentication Methods
OAuth (Recommended for Most Connectors)
OAuth lets users authorize access without sharing passwords. Graphlit manages the OAuth flow.
Connectors using OAuth:
- Slack
- Gmail / Google Drive / Google Calendar
- Microsoft (Outlook, OneDrive, SharePoint, Teams)
- GitHub
- Notion
- Jira
- Linear
OAuth flow:
- User clicks "Connect Slack"
- Redirected to Slack OAuth
- User authorizes
- Graphlit receives OAuth token
- Create feed with token
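// Example: Slack OAuth
// URLSearchParams URL-encodes each value (including the redirect URI)
const authParams = new URLSearchParams({
client_id: SLACK_CLIENT_ID,
scope: 'channels:read,channels:history',
redirect_uri: REDIRECT_URI
});
const authUrl = `https://slack.com/oauth/v2/authorize?${authParams}`;
// User visits authUrl, authorizes
// Slack redirects back with code
// Exchange code for token (redirect_uri must match the one sent above)
const tokenResponse = await fetch('https://slack.com/api/oauth.v2.access', {
method: 'POST',
headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
body: new URLSearchParams({
code,
client_id: SLACK_CLIENT_ID,
client_secret: SLACK_CLIENT_SECRET,
redirect_uri: REDIRECT_URI
})
});
// Slack reports failures in the JSON body, so check `ok` before using the token
const { ok, access_token, error } = await tokenResponse.json();
if (!ok) throw new Error(`Slack OAuth failed: ${error}`);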
// Create feed with token
const feed = await graphlit.createFeed({
name: 'My Slack Feed',
type: FeedServiceTypes.Slack,
slack: {
token: access_token,
channels: ['general', 'engineering'],
readLimit: 100
}
});
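The authorize step above can also carry a `state` value, the standard OAuth 2.0 guard against forged callbacks. The sketch below is one way to build that URL; `buildSlackAuthorizeUrl`, its option names, and the example values are hypothetical, and only the Slack authorize endpoint and the scope names come from this guide.

```typescript
// Hypothetical helper: builds the Slack authorize URL with every value URL-encoded.
interface SlackAuthorizeOptions {
  clientId: string;
  redirectUri: string;
  scopes: string[];
  state: string; // opaque anti-CSRF value; compare it when Slack redirects back
}

function buildSlackAuthorizeUrl(options: SlackAuthorizeOptions): string {
  const url = new URL('https://slack.com/oauth/v2/authorize');
  url.searchParams.set('client_id', options.clientId);
  url.searchParams.set('scope', options.scopes.join(','));
  url.searchParams.set('redirect_uri', options.redirectUri);
  url.searchParams.set('state', options.state);
  return url.toString();
}

// Example values are placeholders, not real credentials.
const exampleAuthorizeUrl = buildSlackAuthorizeUrl({
  clientId: '123.456',
  redirectUri: 'https://app.example.com/oauth/callback?source=slack',
  scopes: ['channels:read', 'channels:history'],
  state: 'random-state-value',
});
console.log(exampleAuthorizeUrl);
```

When Slack redirects back, compare the returned `state` with the value you stored for that user before exchanging the code.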
API Keys (For Services Without OAuth)
Some connectors use direct API keys:
- RSS feeds (no auth)
- Web crawling (no auth)
- S3 (access key + secret)
- Azure Storage (connection string)
// Example: S3 feed with API keys
const s3Feed = await graphlit.createFeed({
name: 'Company S3 Bucket',
type: FeedServiceTypes.AmazonS3,
amazonS3: {
accountName: 'my-company',
accessKey: process.env.AWS_ACCESS_KEY,
accessSecret: process.env.AWS_SECRET_KEY,
bucketName: 'documents',
prefix: 'pdfs/' // Optional: filter by folder
}
});
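Because the S3 example reads its credentials from environment variables, a missing variable would reach the feed as `undefined`. A minimal sketch of a fail-fast check follows; `requireEnv` is a hypothetical helper, not part of the Graphlit SDK, and it takes the environment as a parameter so it is easy to test.

```typescript
// Hypothetical helper: fail fast when a required credential is missing.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, names: string[]): Record<string, string> {
  const missing = names.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  const values: Record<string, string> = {};
  for (const key of names) {
    values[key] = env[key] as string;
  }
  return values;
}

// In an application you would pass process.env; a literal keeps the sketch self-contained.
const exampleEnv: Env = { AWS_ACCESS_KEY: 'AKIA-EXAMPLE', AWS_SECRET_KEY: undefined };

try {
  requireEnv(exampleEnv, ['AWS_ACCESS_KEY', 'AWS_SECRET_KEY']);
} catch (error) {
  console.error((error as Error).message); // Missing environment variables: AWS_SECRET_KEY
}
```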
Part 3: Messaging Connectors
Slack
Use case: Search team conversations, RAG over chat history, entity extraction from messages.
import { Graphlit } from 'graphlit-client';
import { FeedServiceTypes } from 'graphlit-client/dist/generated/graphql-types';
const graphlit = new Graphlit();
// Create Slack feed
const slackFeed = await graphlit.createFeed({
name: 'Engineering Slack',
type: FeedServiceTypes.Slack,
slack: {
token: process.env.SLACK_BOT_TOKEN,
channels: ['engineering', 'product'], // Channel names
readMessages: true, // Sync messages
readThreads: true, // Sync replies
readLimit: 500, // Last 500 messages per channel
includeAttachments: true // Sync files/images
}
});
console.log('Slack feed created:', slackFeed.createFeed.id);
// Wait for initial sync
let isDone = false;
while (!isDone) {
const status = await graphlit.isFeedDone(slackFeed.createFeed.id);
isDone = status.isFeedDone.result;
await new Promise(r => setTimeout(r, 10000)); // Check every 10s
}
console.log('✓ Slack history synced');
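The polling loop above runs forever if the feed never finishes, and it sleeps once more after the feed is already done. A minimal sketch of a bounded version follows; `waitUntilDone` and its parameters are hypothetical names, and the completion check is passed in as a function so the helper does not depend on the SDK.

```typescript
// Hypothetical helper: polls an async check until it reports done or a deadline passes.
async function waitUntilDone(
  checkDone: () => Promise<boolean>,
  intervalMs: number,
  timeoutMs: number,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await checkDone()) {
      return true; // finished: return immediately, no trailing sleep
    }
    if (Date.now() + intervalMs > deadline) {
      return false; // the next check would land past the deadline
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage with a stand-in check that reports done on the third call.
let fakeCheckCalls = 0;
const fakeCheck = async (): Promise<boolean> => {
  fakeCheckCalls += 1;
  return fakeCheckCalls >= 3;
};

waitUntilDone(fakeCheck, 10, 1000).then((done) => {
  console.log(done ? 'sync finished' : 'timed out');
});
```

With the client from this guide, the check would be something like `async () => (await graphlit.isFeedDone(feedId)).isFeedDone.result`, mirroring the loop above.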
OAuth scopes needed:
- channels:read - List channels
- channels:history - Read messages
- groups:read - Private channels (optional)
- groups:history - Private messages (optional)
What gets synced:
- All messages in specified channels
- Threaded replies
- User mentions
- Files/images attached to messages
- Reactions (optional)
💡 Pro Tip: Combine Slack feeds with entity extraction to automatically identify who's working on which projects from Slack conversations.
Gmail
Use case: Search emails, extract contacts/companies, email-based RAG.
const gmailFeed = await graphlit.createFeed({
name: 'My Gmail',
type: FeedServiceTypes.Gmail,
gmail: {
token: process.env.GMAIL_OAUTH_TOKEN,
readLimit: 100, // Last 100 emails
includeAttachments: true
}
});
OAuth scopes needed:
https://www.googleapis.com/auth/gmail.readonly
What gets synced:
- Email subject, body, sender, recipients
- Attachments (PDFs, images, etc.)
- Timestamps
- Email threads
Microsoft Teams
const teamsFeed = await graphlit.createFeed({
name: 'Engineering Team',
type: FeedServiceTypes.MicrosoftTeams,
teams: {
token: process.env.TEAMS_OAUTH_TOKEN,
teamId: 'team-guid',
channelId: 'channel-guid',
readLimit: 100
}
});
Discord
const discordFeed = await graphlit.createFeed({
name: 'Community Discord',
type: FeedServiceTypes.Discord,
discord: {
token: process.env.DISCORD_BOT_TOKEN,
guildId: 'guild-id',
channelId: 'channel-id',
readLimit: 500
}
});
Part 4: Cloud Storage Connectors
Google Drive
Use case: Sync company documents, collaborative files, shared folders.
const driveFeed = await graphlit.createFeed({
name: 'Company Drive',
type: FeedServiceTypes.GoogleDrive,
googleDrive: {
token: process.env.GOOGLE_OAUTH_TOKEN,
folderId: 'folder-id', // Optional: sync specific folder
readLimit: 1000,
includeSharedDrives: true // Include Team Drives
}
});
What gets synced:
- Google Docs (converted to markdown)
- Google Sheets (tables extracted)
- Google Slides (text extracted)
- PDFs, images, videos
- Files in subfolders
OAuth scopes needed:
https://www.googleapis.com/auth/drive.readonly
OneDrive / SharePoint
// OneDrive personal
const oneDriveFeed = await graphlit.createFeed({
name: 'My OneDrive',
type: FeedServiceTypes.OneDrive,
oneDrive: {
token: process.env.MICROSOFT_OAUTH_TOKEN,
folderId: 'folder-id', // Optional
readLimit: 500
}
});
// SharePoint (team sites)
const sharePointFeed = await graphlit.createFeed({
name: 'Company SharePoint',
type: FeedServiceTypes.SharePoint,
sharePoint: {
token: process.env.MICROSOFT_OAUTH_TOKEN,
siteId: 'site-id',
driveId: 'drive-id',
readLimit: 1000
}
});
GitHub
Use case: Sync code repos, documentation, READMEs.
const githubFeed = await graphlit.createFeed({
name: 'Company Repo',
type: FeedServiceTypes.GitHub,
github: {
token: process.env.GITHUB_PAT, // Personal Access Token
repositoryOwner: 'my-company',
repositoryName: 'main-repo',
includeBranches: ['main', 'develop']
}
});
What gets synced:
- Source code files
- README.md files
- Documentation
- Commit messages (optional)
Amazon S3
const s3Feed = await graphlit.createFeed({
name: 'Documents S3 Bucket',
type: FeedServiceTypes.AmazonS3,
amazonS3: {
accountName: 'my-company',
accessKey: process.env.AWS_ACCESS_KEY,
accessSecret: process.env.AWS_SECRET_KEY,
bucketName: 'company-documents',
prefix: 'public/', // Optional: sync specific folder
region: 'us-east-1'
}
});
Part 5: Project Management Connectors
Jira
Use case: Search issues, track project status, entity extraction from tickets.
const jiraFeed = await graphlit.createFeed({
name: 'Engineering Jira',
type: FeedServiceTypes.Jira,
jira: {
token: process.env.JIRA_OAUTH_TOKEN,
accountId: 'jira-account-id',
project: 'PROJ', // Project key
readLimit: 500
}
});
What gets synced:
- Issue title, description, comments
- Status, assignee, reporter
- Attachments
- Custom fields
Linear
const linearFeed = await graphlit.createFeed({
name: 'Product Linear',
type: FeedServiceTypes.Linear,
linear: {
token: process.env.LINEAR_API_KEY,
teamId: 'team-id',
readLimit: 500
}
});
Notion
const notionFeed = await graphlit.createFeed({
name: 'Company Wiki',
type: FeedServiceTypes.Notion,
notion: {
token: process.env.NOTION_INTEGRATION_TOKEN,
databaseId: 'database-id', // Optional
readLimit: 1000
}
});
What gets synced:
- Pages and sub-pages
- Databases and records
- Embedded content
- Inline comments
GitHub Issues & Pull Requests
// Issues
const issuesFeed = await graphlit.createFeed({
name: 'Repo Issues',
type: FeedServiceTypes.GitHubIssues,
githubIssues: {
token: process.env.GITHUB_PAT,
repositoryOwner: 'my-company',
repositoryName: 'main-repo',
readLimit: 500,
includeClosedIssues: true
}
});
// Pull Requests
const prFeed = await graphlit.createFeed({
name: 'Repo PRs',
type: FeedServiceTypes.GitHubPullRequests,
githubPullRequests: {
token: process.env.GITHUB_PAT,
repositoryOwner: 'my-company',
repositoryName: 'main-repo',
readLimit: 100
}
});
Part 6: Social Media & Web Connectors
Reddit
const redditFeed = await graphlit.createFeed({
name: 'Tech Subreddit',
type: FeedServiceTypes.Reddit,
reddit: {
token: process.env.REDDIT_OAUTH_TOKEN,
subreddit: 'MachineLearning',
readLimit: 100,
sortBy: 'hot' // 'hot', 'new', 'top'
}
});
RSS Feeds
const rssFeed = await graphlit.createFeed({
name: 'Tech News RSS',
type: FeedServiceTypes.Rss,
rss: {
uri: 'https://techcrunch.com/feed/',
readLimit: 50
}
});
Web Crawling
Use case: Scrape documentation sites, competitor analysis, content aggregation.
const webCrawl = await graphlit.createFeed({
name: 'Documentation Crawler',
type: FeedServiceTypes.Web,
web: {
uri: 'https://docs.example.com',
readLimit: 500,
allowedDomains: ['docs.example.com'], // Stay on domain
excludedPaths: ['/api/', '/archive/'] // Skip sections
}
});
What gets scraped:
- Page HTML (converted to markdown)
- Links (follows to crawl more pages)
- Images (optional)
- Metadata (title, description)
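To make the effect of `allowedDomains` and `excludedPaths` concrete, here is a sketch of how such filters could be evaluated for each discovered link. It illustrates one plausible reading (exact hostname match, path-prefix exclusion) and is not Graphlit's actual crawler logic; `shouldCrawl` and `CrawlFilter` are hypothetical names.

```typescript
// Illustrative only: one plausible reading of allowedDomains / excludedPaths.
interface CrawlFilter {
  allowedDomains: string[];
  excludedPaths: string[];
}

function shouldCrawl(link: string, crawlFilter: CrawlFilter): boolean {
  let url: URL;
  try {
    url = new URL(link);
  } catch {
    return false; // not an absolute URL
  }
  const domainAllowed = crawlFilter.allowedDomains.includes(url.hostname);
  const pathExcluded = crawlFilter.excludedPaths.some((prefix) =>
    url.pathname.startsWith(prefix),
  );
  return domainAllowed && !pathExcluded;
}

const docsFilter: CrawlFilter = {
  allowedDomains: ['docs.example.com'],
  excludedPaths: ['/api/', '/archive/'],
};

console.log(shouldCrawl('https://docs.example.com/guides/feeds', docsFilter)); // true
console.log(shouldCrawl('https://docs.example.com/api/reference', docsFilter)); // false
console.log(shouldCrawl('https://blog.example.com/post', docsFilter)); // false
```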
YouTube
const youtubeFeed = await graphlit.createFeed({
name: 'Channel Videos',
type: FeedServiceTypes.YouTube,
youtube: {
token: process.env.YOUTUBE_API_KEY,
channelId: 'channel-id',
readLimit: 50
}
});
What gets synced:
- Video transcripts (auto-generated or manual)
- Titles, descriptions
- Thumbnails
- Comments (optional)
Part 7: Feed Management
Query Feeds
// Get all feeds
const feeds = await graphlit.queryFeeds();
feeds.feeds.results.forEach(feed => {
console.log(`${feed.name} (${feed.type})`);
console.log(` State: ${feed.state}`);
console.log(` Last sync: ${feed.lastSyncDateTime}`);
});
Update Feed
// Change feed configuration
await graphlit.updateFeed(feedId, {
name: 'Updated Name',
slack: {
readLimit: 1000 // Increase sync limit
}
});
Disable/Enable Feed
// Pause syncing
await graphlit.disableFeed(feedId);
// Resume syncing
await graphlit.enableFeed(feedId);
Delete Feed
// Option A: delete feed (and optionally its content)
await graphlit.deleteFeed(feedId);
// Option B (use instead of A): delete feed but keep synced content
await graphlit.deleteFeed(feedId, false);
Trigger Manual Sync
// Force immediate sync (useful for testing)
await graphlit.triggerFeedSync(feedId);
// Wait for sync to complete
let isDone = false;
while (!isDone) {
const status = await graphlit.isFeedDone(feedId);
isDone = status.isFeedDone.result;
await new Promise(r => setTimeout(r, 5000));
}
Part 8: Advanced Patterns
Pattern 1: Feed with Workflow
Apply processing to synced content:
// Create workflow first
const workflow = await graphlit.createWorkflow({
name: "Extract Entities",
extraction: { /* ... */ }
});
// Create feed with workflow
const feed = await graphlit.createFeed({
name: 'Slack with Entities',
type: FeedServiceTypes.Slack,
slack: { /* ... */ },
workflow: { id: workflow.createWorkflow.id }
});
// All synced messages will have entities extracted
Pattern 2: Feed with Collections
Auto-organize synced content:
// Create collection
const collection = await graphlit.createCollection('Slack Messages');
// Create feed that adds to collection
const feed = await graphlit.createFeed({
name: 'Slack Feed',
type: FeedServiceTypes.Slack,
slack: { /* ... */ },
collections: [{ id: collection.createCollection.id }]
});
Pattern 3: Multi-Feed Strategy
Sync from multiple sources into unified knowledge base:
// Feed 1: Slack
const slackFeed = await graphlit.createFeed({
name: 'Slack',
type: FeedServiceTypes.Slack,
slack: { /* ... */ }
});
// Feed 2: Gmail
const gmailFeed = await graphlit.createFeed({
name: 'Gmail',
type: FeedServiceTypes.Gmail,
gmail: { /* ... */ }
});
// Feed 3: Google Drive
const driveFeed = await graphlit.createFeed({
name: 'Drive',
type: FeedServiceTypes.GoogleDrive,
googleDrive: { /* ... */ }
});
// Now search across all sources
const results = await graphlit.queryContents({
search: "project update"
});
// Returns results from Slack, Gmail, AND Drive
Pattern 4: Scheduled Feeds
Control sync frequency:
const feed = await graphlit.createFeed({
name: 'Daily News Feed',
type: FeedServiceTypes.Rss,
rss: {
uri: 'https://news.com/feed',
readLimit: 50
},
schedulePolicy: {
recurrenceType: 'DAILY',
interval: 1 // Every 1 day
}
});
Part 9: Production Patterns
Pattern 1: OAuth Token Refresh
OAuth tokens expire—handle refresh:
// Store refresh token when user authorizes
const oauthData = {
accessToken: '...',
refreshToken: '...',
expiresAt: Date.now() + 3600000
};
// Before creating feed, check if token is expired
async function getValidToken() {
if (Date.now() > oauthData.expiresAt) {
// Refresh token
const newTokens = await refreshOAuthToken(oauthData.refreshToken);
oauthData.accessToken = newTokens.accessToken;
oauthData.expiresAt = Date.now() + 3600000;
}
return oauthData.accessToken;
}
// Use refreshed token
const token = await getValidToken();
const feed = await graphlit.createFeed({
type: FeedServiceTypes.Slack,
slack: { token, /* ... */ }
});
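The refresh logic above hardcodes a one-hour lifetime and leaves `refreshOAuthToken` undefined. The sketch below is one way to structure it, assuming your provider returns a lifetime in seconds and may rotate the refresh token. `createTokenCache`, `TokenSet`, and `RefreshFn` are hypothetical names, and the provider call is injected rather than implemented.

```typescript
// Hypothetical types and names; the provider-specific refresh call is injected.
interface TokenSet {
  accessToken: string;
  refreshToken: string;
  expiresAt: number; // epoch milliseconds
}

type RefreshFn = (refreshToken: string) => Promise<{
  accessToken: string;
  refreshToken?: string; // some providers rotate refresh tokens
  expiresInSeconds: number;
}>;

function createTokenCache(
  initial: TokenSet,
  refresh: RefreshFn,
  skewMs = 60_000, // refresh slightly before the stated expiry
  now: () => number = Date.now,
): () => Promise<string> {
  let tokens: TokenSet = { ...initial };
  return async function getValidToken(): Promise<string> {
    if (now() >= tokens.expiresAt - skewMs) {
      const next = await refresh(tokens.refreshToken);
      tokens = {
        accessToken: next.accessToken,
        refreshToken: next.refreshToken ?? tokens.refreshToken,
        expiresAt: now() + next.expiresInSeconds * 1000,
      };
    }
    return tokens.accessToken;
  };
}

// Usage with a stand-in refresher; the stored token is already expired.
const getFreshToken = createTokenCache(
  { accessToken: 'old', refreshToken: 'r1', expiresAt: 0 },
  async () => ({ accessToken: 'new', expiresInSeconds: 3600 }),
);
getFreshToken().then((token) => console.log(token)); // new
```

This sketch does not deduplicate concurrent refreshes. If several callers can hit an expired token at once, share the in-flight promise between them.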
Pattern 2: Feed Health Monitoring
Monitor feed status:
// Check all feeds
const feeds = await graphlit.queryFeeds();
feeds.feeds.results.forEach(feed => {
if (feed.state === 'FAILED') {
console.error(`Feed ${feed.name} failed`);
// Alert ops team
}
if (feed.lastSyncDateTime) {
const hoursSinceSync = (Date.now() - new Date(feed.lastSyncDateTime).getTime()) / 3600000;
if (hoursSinceSync > 24) {
console.warn(`Feed ${feed.name} hasn't synced in ${hoursSinceSync.toFixed(1)}h`);
}
}
});
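The check above skips feeds that have never synced, and it mixes classification with logging. A sketch of a pure classifier that is easier to unit test follows. The `state` and `lastSyncDateTime` field names and the 'FAILED' and 'ENABLED' values are taken from this guide as-is. `classifyFeedHealth` and the health labels are hypothetical.

```typescript
// Field names mirror the snippet above; the health labels are this sketch's own.
interface FeedSummary {
  name: string;
  state: string;
  lastSyncDateTime?: string | null;
}

type FeedHealth = 'failed' | 'never-synced' | 'stale' | 'healthy';

function classifyFeedHealth(
  feed: FeedSummary,
  nowMs: number,
  maxAgeHours = 24,
): FeedHealth {
  if (feed.state === 'FAILED') return 'failed';
  if (!feed.lastSyncDateTime) return 'never-synced';
  const hoursSinceSync =
    (nowMs - new Date(feed.lastSyncDateTime).getTime()) / 3_600_000;
  return hoursSinceSync > maxAgeHours ? 'stale' : 'healthy';
}

const referenceNowMs = Date.parse('2024-01-02T12:00:00Z');
const sampleFeeds: FeedSummary[] = [
  { name: 'Slack', state: 'ENABLED', lastSyncDateTime: '2024-01-02T11:00:00Z' },
  { name: 'Gmail', state: 'ENABLED', lastSyncDateTime: '2023-12-30T00:00:00Z' },
  { name: 'Drive', state: 'FAILED', lastSyncDateTime: null },
];
for (const feed of sampleFeeds) {
  console.log(`${feed.name}: ${classifyFeedHealth(feed, referenceNowMs)}`);
}
// Slack: healthy
// Gmail: stale
// Drive: failed
```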
Pattern 3: Rate Limiting
Avoid overwhelming external APIs:
// Create feeds with delays
const urls = ['url1', 'url2', 'url3'];
for (const url of urls) {
const feed = await graphlit.createFeed({
type: FeedServiceTypes.Rss,
rss: { uri: url }
});
// Wait 5 seconds between feed creations
await new Promise(r => setTimeout(r, 5000));
}
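The loop above also waits after the last feed is created. A small generic helper can apply the pause only between items and collect the results. This is a sketch, and `runSequentially` is a hypothetical name, not an SDK function.

```typescript
// Hypothetical helper: runs tasks one at a time, pausing between items but not after the last.
async function runSequentially<T, R>(
  items: T[],
  delayMs: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  let isFirst = true;
  for (const item of items) {
    if (!isFirst) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
    isFirst = false;
    results.push(await task(item));
  }
  return results;
}

// Usage with a stand-in task; in practice the task would call createFeed for each URL.
runSequentially(['url1', 'url2', 'url3'], 10, async (url) => `created:${url}`).then(
  (created) => console.log(created.join(', ')),
);
```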
Common Issues & Solutions
Issue: OAuth Token Invalid
Problem: "Invalid token" error when creating feed.
Solution: Refresh OAuth token or re-authorize:
try {
const feed = await graphlit.createFeed(config);
} catch (error: any) {
if (String(error?.message ?? '').toLowerCase().includes('invalid token')) {
// Redirect user to re-authorize
window.location.href = getOAuthUrl();
}
}
Issue: Feed Not Syncing
Problem: Feed created but no content appears.
Solutions:
- Check feed state:
const feed = await graphlit.getFeed(feedId);
console.log('State:', feed.feed.state);
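- Wait for initial sync (waitForFeedCompletion is a helper you define yourself, for example by wrapping the isFeedDone polling loop from the Slack example):
await waitForFeedCompletion(feedId);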
- Trigger manual sync:
await graphlit.triggerFeedSync(feedId);
Issue: Too Much Content
Problem: Feed syncs thousands of items, overwhelming system.
Solution: Use readLimit:
const feed = await graphlit.createFeed({
type: FeedServiceTypes.Slack,
slack: {
// ...other Slack options (token, channels)
readLimit: 100 // Only last 100 messages
}
});
What's Next?
You now have a working overview of data connectors. Next steps:
- Set up OAuth apps for connectors you need
- Create feeds for key data sources
- Apply workflows to customize processing
- Monitor feed health in production
Related guides:
- Content Ingestion - Manual ingestion vs feeds
- Workflows and Processing - Process feed content
- Building Knowledge Graphs - Extract entities from feeds
- Production Architecture - Monitor feed health
Happy connecting! 🔌