Cloud storage holds your organization's documents, codebases, and shared files. Graphlit syncs content from Google Drive, Dropbox, Amazon S3, GitHub, and other storage providers (8+ in total), making all of it searchable.
Covered Storage Providers
- Google Drive (personal & Team Drives)
- Microsoft (OneDrive, SharePoint)
- Dropbox
- Box
- Amazon S3
- Azure Blob Storage
- GitHub repositories
- Google Cloud Storage
Google Drive
OAuth Setup
// Google Drive OAuth
const DRIVE_SCOPES = ['https://www.googleapis.com/auth/drive.readonly'];

// After OAuth flow
const driveFeed = await graphlit.createFeed({
  name: 'Company Drive',
  type: FeedServiceTypes.GoogleDrive,
  googleDrive: {
    token: access_token,
    folderId: 'folder-id', // Optional: sync specific folder
    readLimit: 1000,
    includeSharedDrives: true // Include Team Drives
  }
});
What syncs:
- Google Docs → Converted to markdown
- Google Sheets → Tables extracted
- PDFs, images, videos
- Files in subfolders
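The list above can be read as a lookup from Drive file type to expected handling. A minimal sketch for use in your own tooling, not Graphlit's internal logic: the two MIME types are the ones Google Drive assigns to Docs and Sheets, while `Handling`, `WORKSPACE_HANDLING`, and `expectedHandling` are hypothetical names introduced here.

```typescript
// Hypothetical labels restating the "What syncs" list; not SDK types.
type Handling = 'markdown' | 'tables' | 'file';

// Google Workspace MIME types as reported by the Drive API.
const WORKSPACE_HANDLING: Record<string, Handling> = {
  'application/vnd.google-apps.document': 'markdown',  // Google Docs
  'application/vnd.google-apps.spreadsheet': 'tables'  // Google Sheets
};

// Anything that is not a Doc or Sheet (PDFs, images, videos) is treated
// as a regular file in this sketch.
function expectedHandling(mimeType: string): Handling {
  return WORKSPACE_HANDLING[mimeType] ?? 'file';
}
```

For example, `expectedHandling('application/pdf')` falls through to `'file'`.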
OneDrive & SharePoint
OneDrive (Personal)
const oneDriveFeed = await graphlit.createFeed({
  name: 'My OneDrive',
  type: FeedServiceTypes.OneDrive,
  oneDrive: {
    token: microsoft_token,
    folderId: 'folder-id', // Optional
    readLimit: 500
  }
});
SharePoint (Team Sites)
const sharePointFeed = await graphlit.createFeed({
  name: 'Company SharePoint',
  type: FeedServiceTypes.SharePoint,
  sharePoint: {
    token: microsoft_token,
    siteId: 'site-id',
    driveId: 'drive-id',
    readLimit: 1000
  }
});
Dropbox
const dropboxFeed = await graphlit.createFeed({
  name: 'Dropbox Files',
  type: FeedServiceTypes.Dropbox,
  dropbox: {
    token: dropbox_token,
    path: '/Documents', // Optional: specific folder
    readLimit: 500
  }
});
Amazon S3
const s3Feed = await graphlit.createFeed({
  name: 'S3 Bucket',
  type: FeedServiceTypes.AmazonS3,
  amazonS3: {
    accountName: 'my-company',
    // Credentials come from the environment; never hard-code them
    accessKey: process.env.AWS_ACCESS_KEY,
    accessSecret: process.env.AWS_SECRET_KEY,
    bucketName: 'documents',
    prefix: 'pdfs/', // Optional: sync only keys under this prefix
    region: 'us-east-1'
  }
});
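The S3 example reads its credentials from `process.env`, so an unset variable silently becomes `undefined`. A minimal sketch of a guard that fails fast instead, assuming the same variable names as above; `requireEnv` and `loadS3Credentials` are hypothetical helpers, not part of the Graphlit SDK.

```typescript
// Hypothetical helpers for illustration; not part of the Graphlit SDK.
type Env = Record<string, string | undefined>;

// Returns the variable's value, or throws if it is unset or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Collects the two credentials used by the S3 feed.
function loadS3Credentials(env: Env): { accessKey: string; accessSecret: string } {
  return {
    accessKey: requireEnv(env, 'AWS_ACCESS_KEY'),
    accessSecret: requireEnv(env, 'AWS_SECRET_KEY')
  };
}
```

In a Node.js process this would be called as `loadS3Credentials(process.env)` before creating the feed. The environment is passed in as a parameter so the check can be exercised without touching real credentials.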
GitHub Repositories
const githubFeed = await graphlit.createFeed({
  name: 'Main Repo',
  type: FeedServiceTypes.GitHub,
  github: {
    token: github_pat, // Personal Access Token
    repositoryOwner: 'my-company',
    repositoryName: 'main-repo',
    includeBranches: ['main', 'develop']
  }
});
What syncs:
- Source code files
- README.md
- Documentation
- Commit messages (optional)
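The GitHub feed takes the owner and repository name as separate fields, while repositories are usually written as `owner/repo` or as a URL. A small sketch of a converter, assuming those two input forms; `parseRepoRef` is a hypothetical helper, not an SDK function.

```typescript
// Hypothetical helper; field names mirror the GitHub feed example.
interface RepoRef {
  repositoryOwner: string;
  repositoryName: string;
}

// Accepts "owner/repo" or "https://github.com/owner/repo(.git)".
function parseRepoRef(input: string): RepoRef {
  const cleaned = input
    .trim()
    .replace(/^https?:\/\/github\.com\//i, '') // drop the URL prefix, if any
    .replace(/\/+$/, '')                       // drop trailing slashes
    .replace(/\.git$/i, '');                   // drop a clone-URL suffix
  const parts = cleaned.split('/');
  const [owner, name] = parts;
  if (parts.length !== 2 || !owner || !name) {
    throw new Error(`Expected "owner/repo", got: ${input}`);
  }
  return { repositoryOwner: owner, repositoryName: name };
}
```

The result can be spread into the `github` block: `{ token: github_pat, ...parseRepoRef('my-company/main-repo') }`.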
Production Patterns
Multi-Storage Strategy
// Sync from all storage providers.
// driveConfig, dropboxConfig, and s3Config hold the provider settings
// shown in the sections above.
const feeds = [
  { name: 'Google Drive', type: FeedServiceTypes.GoogleDrive, config: { googleDrive: driveConfig } },
  { name: 'Dropbox', type: FeedServiceTypes.Dropbox, config: { dropbox: dropboxConfig } },
  { name: 'S3', type: FeedServiceTypes.AmazonS3, config: { amazonS3: s3Config } }
];

// Create one feed per provider
for (const feed of feeds) {
  await graphlit.createFeed({
    name: feed.name,
    type: feed.type,
    ...feed.config
  });
}

// Now search across ALL storage
const results = await graphlit.queryContents({
  search: 'quarterly report'
});
// Returns files from Drive, Dropbox, and S3
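A plain loop over providers stops at the first rejected call, for example when one token has expired. A sketch of a variant that records failures and keeps going, assuming the feed-creation call is passed in as a function; `FeedSpec`, `CreateFeed`, `SyncReport`, and `createFeedsSafely` are hypothetical names, and the real SDK types may differ.

```typescript
// Hypothetical shapes for illustration; the real SDK types may differ.
interface FeedSpec {
  name: string;
  type: string;
  config: Record<string, unknown>;
}

// Stand-in for graphlit.createFeed, injected so the helper is testable.
type CreateFeed = (input: Record<string, unknown>) => Promise<unknown>;

interface SyncReport {
  created: string[];
  failed: { name: string; reason: string }[];
}

// Creates feeds one at a time, recording failures instead of aborting.
async function createFeedsSafely(
  specs: FeedSpec[],
  createFeed: CreateFeed
): Promise<SyncReport> {
  const report: SyncReport = { created: [], failed: [] };
  for (const spec of specs) {
    try {
      await createFeed({ name: spec.name, type: spec.type, ...spec.config });
      report.created.push(spec.name);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      report.failed.push({ name: spec.name, reason });
    }
  }
  return report;
}
```

With the Graphlit client this might be called as `createFeedsSafely(feeds, (input) => graphlit.createFeed(input))`, under the assumption that `createFeed` accepts that input shape. The returned `failed` list names each provider that needs attention while the others sync normally.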
Selective Sync
// Sync a single folder; restricting it to PDFs happens in the workflow
const selectiveFeed = await graphlit.createFeed({
  name: 'Marketing PDFs',
  type: FeedServiceTypes.GoogleDrive,
  googleDrive: {
    token: access_token,
    folderId: 'marketing-folder-id'
    // Note: File type filtering done in workflow
  }
});
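Because the feed itself does not filter by file type, results can also be narrowed on the client side as a stopgap. A minimal sketch, assuming result objects expose a `name` and an optional `mimeType`; the `FileLike` shape and the `isPdf` helper are hypothetical and may not match the fields the SDK returns.

```typescript
// Hypothetical result shape; adjust to the fields your query returns.
interface FileLike {
  name: string;
  mimeType?: string;
}

// Prefers the MIME type when present, otherwise falls back to the extension.
function isPdf(file: FileLike): boolean {
  if (file.mimeType) {
    return file.mimeType.toLowerCase() === 'application/pdf';
  }
  return /\.pdf$/i.test(file.name);
}

// Keeps only the PDFs from a list of files.
function onlyPdfs(files: FileLike[]): FileLike[] {
  return files.filter(isPdf);
}
```

This only trims what is displayed; non-PDF files in the folder are still synced unless the workflow excludes them.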
Related Guides
- Data Connectors - All connector types
- Document Processing - PDF extraction
- Content Ingestion - Ingestion basics