Cloud storage holds your organization's documents, codebases, and shared files. Graphlit syncs more than eight storage providers, including Google Drive, Dropbox, S3, and GitHub, and makes their contents searchable in one place.
Covered Storage Providers
- Google Drive (personal & Team Drives)
- Microsoft (OneDrive, SharePoint)
- Dropbox
- Box
- Amazon S3
- Azure Blob Storage
- GitHub repositories
- Google Cloud Storage
Google Drive
OAuth Setup
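import { Graphlit } from 'graphlit-client';
import { FeedTypes, FeedServiceTypes } from 'graphlit-client/dist/generated/graphlit-types';

// Assumes Graphlit credentials are available in the environment
const graphlit = new Graphlit();

// Read-only scope to request during the Google Drive OAuth flow
const DRIVE_SCOPES = ['https://www.googleapis.com/auth/drive.readonly'];

// After the OAuth flow, pass the refresh token it returned (not a short-lived access token)
const driveFeed = await graphlit.createFeed({
  name: 'Company Drive',
  type: FeedTypes.Site,
  site: {
    type: FeedServiceTypes.GoogleDrive,
    googleDrive: {
      refreshToken: google_refresh_token,
      folderId: 'folder-id' // Optional: sync specific folder
    },
    isRecursive: true, // Include subfolders
    readLimit: 1000
  }
});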
What syncs:
- Google Docs → Converted to markdown
- Google Sheets → Tables extracted
- PDFs, images, videos
- Files in subfolders
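The feed above expects a refresh token, which Google only returns when the consent request asks for offline access. As a minimal sketch of that first step, the helper below builds the consent URL for the read-only Drive scope. The function name `buildDriveConsentUrl` and its parameters are hypothetical and not part of the Graphlit SDK; the endpoint and query parameters are Google's standard OAuth 2.0 ones.

```typescript
// Hypothetical helper (not part of graphlit-client): builds the Google OAuth 2.0
// consent URL that requests read-only Drive access plus a refresh token.
const GOOGLE_AUTH_ENDPOINT = 'https://accounts.google.com/o/oauth2/v2/auth';
const DRIVE_READONLY_SCOPE = 'https://www.googleapis.com/auth/drive.readonly';

function buildDriveConsentUrl(clientId: string, redirectUri: string): string {
  const params: { [key: string]: string } = {
    client_id: clientId,
    redirect_uri: redirectUri,
    response_type: 'code',       // authorization-code flow
    scope: DRIVE_READONLY_SCOPE,
    access_type: 'offline',      // ask Google to issue a refresh token
    prompt: 'consent'            // show the consent screen so a refresh token is returned
  };

  // Percent-encode each key and value and join them into a query string
  const query = Object.keys(params)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`)
    .join('&');

  return `${GOOGLE_AUTH_ENDPOINT}?${query}`;
}
```

Redirect the user to the returned URL, exchange the resulting authorization code for tokens on your server, and store the refresh token for use in `googleDrive.refreshToken`.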
OneDrive & SharePoint
OneDrive (Personal)
const oneDriveFeed = await graphlit.createFeed({
  name: 'My OneDrive',
  type: FeedTypes.Site,
  site: {
    type: FeedServiceTypes.OneDrive,
    oneDrive: {
      refreshToken: microsoft_token,
      folderId: 'folder-id' // Optional
    },
    isRecursive: true,
    readLimit: 500
  }
});
SharePoint (Team Sites)
const sharePointFeed = await graphlit.createFeed({
  name: 'Company SharePoint',
  type: FeedTypes.Site,
  site: {
    type: FeedServiceTypes.SharePoint,
    sharePoint: {
      refreshToken: microsoft_token,
      siteId: 'site-id',
      driveId: 'drive-id'
    },
    isRecursive: true,
    readLimit: 1000
  }
});
Dropbox
const dropboxFeed = await graphlit.createFeed({
  name: 'Dropbox Files',
  type: FeedTypes.Site,
  site: {
    type: FeedServiceTypes.Dropbox,
    dropbox: {
      refreshToken: dropbox_token,
      folderPath: '/Documents' // Optional: specific folder
    },
    isRecursive: true,
    readLimit: 500
  }
});
Amazon S3
const s3Feed = await graphlit.createFeed({
  name: 'S3 Bucket',
  type: FeedTypes.Site,
  site: {
    type: FeedServiceTypes.S3Blob,
    s3: {
      bucketName: 'documents',
      region: 'us-east-1',
      accessKey: process.env.AWS_ACCESS_KEY,
      secretAccessKey: process.env.AWS_SECRET_KEY,
      prefix: 'pdfs/' // Optional: sync only keys under this prefix
    },
    isRecursive: true,
    readLimit: 1000
  }
});
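Values read from `process.env` are typed `string | undefined`, so a missing variable would only surface when the feed fails to authenticate. The sketch below validates the credentials up front. `requireEnv`, `buildS3Config`, and the `S3FeedConfig` interface are hypothetical names introduced here; the field names mirror the `s3` block above, and the environment is passed in explicitly so the helper is easy to test.

```typescript
// Hypothetical shape mirroring the `s3` block of the feed above
interface S3FeedConfig {
  bucketName: string;
  region: string;
  accessKey: string;
  secretAccessKey: string;
  prefix?: string;
}

type Env = { [name: string]: string | undefined };

// Returns the variable's value, or throws if it is missing or blank
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Builds the S3 settings, failing fast when credentials are not configured
function buildS3Config(env: Env, bucketName: string, region: string, prefix?: string): S3FeedConfig {
  const config: S3FeedConfig = {
    bucketName,
    region,
    accessKey: requireEnv(env, 'AWS_ACCESS_KEY'),
    secretAccessKey: requireEnv(env, 'AWS_SECRET_KEY')
  };
  if (prefix !== undefined) {
    config.prefix = prefix;
  }
  return config;
}
```

In application code this would be called as `buildS3Config(process.env, 'documents', 'us-east-1', 'pdfs/')` and the result passed as the `s3` value.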
GitHub Repositories
const githubFeed = await graphlit.createFeed({
  name: 'Main Repo',
  type: FeedTypes.Site,
  site: {
    type: FeedServiceTypes.GitHub,
    github: {
      personalAccessToken: github_pat, // Personal Access Token
      repositoryOwner: 'my-company',
      repositoryName: 'main-repo'
    },
    isRecursive: true,
    readLimit: 1000
  }
});
What syncs:
- Source code files
- README.md
- Documentation
- Commit messages (optional)
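The GitHub feed takes the owner and repository name as separate fields, while repositories are usually shared as URLs. As a small convenience sketch, the hypothetical helper below (not part of the Graphlit SDK) splits a `github.com` URL or an `owner/name` shorthand into the two values.

```typescript
// Hypothetical helper: field names match the `github` block of the feed above
interface RepoRef {
  repositoryOwner: string;
  repositoryName: string;
}

// Accepts "https://github.com/owner/name", an optional ".git" suffix or
// trailing slash, or the "owner/name" shorthand; throws on anything else.
function parseGitHubRepo(input: string): RepoRef {
  const pattern = /^(?:https?:\/\/github\.com\/)?([^\/\s]+)\/([^\/\s]+?)(?:\.git)?\/?$/;
  const match = pattern.exec(input.trim());
  if (!match) {
    throw new Error(`Not a GitHub repository reference: ${input}`);
  }
  return { repositoryOwner: match[1], repositoryName: match[2] };
}
```

The result can be spread into the `github` block next to `personalAccessToken`.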
Production Patterns
Multi-Storage Strategy
// Create one feed per storage provider
const feeds = [
  {
    name: 'Google Drive',
    siteType: FeedServiceTypes.GoogleDrive,
    config: { googleDrive: driveConfig }
  },
  {
    name: 'Dropbox',
    siteType: FeedServiceTypes.Dropbox,
    config: { dropbox: dropboxConfig }
  },
  {
    name: 'S3',
    siteType: FeedServiceTypes.S3Blob,
    config: { s3: s3Config }
  }
];

for (const feed of feeds) {
  await graphlit.createFeed({
    name: feed.name,
    type: FeedTypes.Site,
    site: {
      type: feed.siteType,
      isRecursive: true,
      readLimit: 1000,
      ...feed.config // Provider-specific settings
    }
  });
}

// Once the feeds have synced, search across ALL storage
const results = await graphlit.queryContents({
  filter: {
    search: 'quarterly report'
  }
});
// Returns files from Drive, Dropbox, and S3
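In the loop above, one failing provider (for example, an expired token) would throw and stop the remaining feeds from being created. A minimal sketch of isolating failures per feed follows. `createFeedsSafely` and its types are hypothetical, and the feed-creation call is injected as a callback so the sketch makes no assumptions about the SDK beyond what the loop already shows.

```typescript
// Hypothetical types: any feed description with a name will do
interface FeedSpec {
  name: string;
}

interface FeedOutcome {
  name: string;
  ok: boolean;
  error?: string;
}

// Creates feeds one at a time and records each result instead of
// aborting the whole batch on the first failure.
async function createFeedsSafely<T extends FeedSpec>(
  specs: T[],
  create: (spec: T) => Promise<unknown>
): Promise<FeedOutcome[]> {
  const outcomes: FeedOutcome[] = [];
  for (const spec of specs) {
    try {
      await create(spec);
      outcomes.push({ name: spec.name, ok: true });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      outcomes.push({ name: spec.name, ok: false, error: message });
    }
  }
  return outcomes;
}
```

With the `feeds` array above, the callback would wrap the same `graphlit.createFeed({...})` call used in the loop, and the returned outcomes can be logged or retried for the entries where `ok` is `false`.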
Selective Sync
// Sync a single folder; restricting the sync to PDFs is handled by the workflow
const selectiveFeed = await graphlit.createFeed({
  name: 'Marketing PDFs',
  type: FeedTypes.Site,
  site: {
    type: FeedServiceTypes.GoogleDrive,
    googleDrive: {
      refreshToken: google_refresh_token,
      folderId: 'marketing-folder-id'
    },
    isRecursive: true
    // Note: file type filtering is configured in the workflow, not on the feed
  }
});
Related Guides
- Data Connectors - All connector types
- Document Processing - PDF extraction
- Content Ingestion - Ingestion basics