Specialized · 14 min read

Email Intelligence: Gmail & Outlook Entity Extraction

Extract contacts, companies, and insights from email. Complete guide to Gmail and Outlook integration, entity extraction, and email search patterns.

Email contains critical business intelligence—customer conversations, deal discussions, project updates. Graphlit transforms email archives into searchable knowledge bases with automatic entity extraction.

What You'll Learn

  • Gmail OAuth setup and sync
  • Outlook integration
  • Entity extraction (contacts, companies)
  • Email search patterns
  • Building email-powered RAG
  • Production email processing

Part 1: Gmail Integration

OAuth Setup

// Gmail requires OAuth
const GMAIL_SCOPES = [
  'https://www.googleapis.com/auth/gmail.readonly'
];

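// OAuth URL (query values must be URL-encoded)
// access_type=offline and prompt=consent ask Google to return a refresh token
const authUrl = 'https://accounts.google.com/o/oauth2/v2/auth?' + new URLSearchParams({
  client_id: GOOGLE_CLIENT_ID,
  redirect_uri: REDIRECT_URI,
  response_type: 'code',
  scope: GMAIL_SCOPES.join(' '),
  access_type: 'offline',
  prompt: 'consent'
}).toString();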

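// After user authorizes, exchange code for token
// The token endpoint expects form-encoded parameters (OAuth 2.0, RFC 6749)
const tokenResponse = await fetch('https://oauth2.googleapis.com/token', {
  method: 'POST',
  headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
  body: new URLSearchParams({
    code: authorizationCode,
    client_id: GOOGLE_CLIENT_ID,
    client_secret: GOOGLE_CLIENT_SECRET,
    redirect_uri: REDIRECT_URI,
    grant_type: 'authorization_code'
  })
});

if (!tokenResponse.ok) {
  throw new Error(`Token exchange failed: ${tokenResponse.status}`);
}

// refresh_token is only returned when offline access was requested
const { access_token, refresh_token } = await tokenResponse.json();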

Create Gmail Feed

import { Graphlit } from 'graphlit-client';
import { FeedServiceTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

const gmailFeed = await graphlit.createFeed({
  name: 'My Gmail',
  type: FeedServiceTypes.Gmail,
  gmail: {
    token: access_token,
    readLimit: 100,  // Last 100 emails
    includeAttachments: true
  }
});

console.log('Gmail feed created:', gmailFeed.createFeed.id);

With Entity Extraction

import { 
  FilePreparationServiceTypes,
  EntityExtractionServiceTypes,
  ObservableTypes
} from 'graphlit-client/dist/generated/graphql-types';

// Workflow for email entities
const emailWorkflow = await graphlit.createWorkflow({
  name: "Email Entities",
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Email
      }
    }]
  },
  extraction: {
    jobs: [{
      connector: {
        type: EntityExtractionServiceTypes.ModelText,
        extractedTypes: [
          ObservableTypes.Person,        // Senders, recipients, mentioned people
          ObservableTypes.Organization,  // Companies discussed
          ObservableTypes.Place,         // Locations, offices
          ObservableTypes.Event          // Meetings, deadlines
        ]
      }
    }]
  }
});

// Feed with entity extraction
const gmailWithEntities = await graphlit.createFeed({
  name: 'Gmail with Entities',
  type: FeedServiceTypes.Gmail,
  gmail: {
    token: access_token,
    readLimit: 500
  },
  workflow: { id: emailWorkflow.createWorkflow.id }
});

Part 2: Outlook Integration

OAuth Setup

// Outlook OAuth scopes
const OUTLOOK_SCOPES = [
  'https://graph.microsoft.com/Mail.Read',
  'https://graph.microsoft.com/User.Read'
];

// Microsoft OAuth URL (query values must be URL-encoded)
const msAuthUrl = 'https://login.microsoftonline.com/common/oauth2/v2.0/authorize?' + new URLSearchParams({
  client_id: MS_CLIENT_ID,
  redirect_uri: REDIRECT_URI,
  response_type: 'code',
  scope: OUTLOOK_SCOPES.join(' ')
}).toString();

// Exchange code for token
const msTokenResponse = await fetch('https://login.microsoftonline.com/common/oauth2/v2.0/token', {
  method: 'POST',
  headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
  body: new URLSearchParams({
    code: authorizationCode,
    client_id: MS_CLIENT_ID,
    client_secret: MS_CLIENT_SECRET,
    redirect_uri: REDIRECT_URI,
    grant_type: 'authorization_code'
  })
});

const { access_token } = await msTokenResponse.json();

Create Outlook Feed

const outlookFeed = await graphlit.createFeed({
  name: 'Outlook Email',
  type: FeedServiceTypes.OutlookEmail,
  outlookEmail: {
    token: access_token,
    readLimit: 100,
    includeAttachments: true
  }
});

Part 3: Email Search

Basic Email Search

import { ContentTypes } from 'graphlit-client/dist/generated/graphql-types';

// Search emails
const results = await graphlit.queryContents({
  search: 'project phoenix status',
  filter: {
    types: [ContentTypes.Email]
  }
});

results.contents.results.forEach(email => {
  console.log(`From: ${email.email?.from[0]?.email}`);
  console.log(`Subject: ${email.email?.subject}`);
  console.log(`Date: ${email.email?.sentDateTime}`);
  console.log(`Preview: ${email.email?.body?.substring(0, 100)}...\n`);
});

Search by Sender

// Find all emails from alice@example.com
const aliceEmails = await graphlit.queryContents({
  search: '',  // Empty = all emails
  filter: {
    types: [ContentTypes.Email],
    // Note: Sender filtering requires metadata query
  }
});

// Filter in application
const fromAlice = aliceEmails.contents.results.filter(email => 
  email.email?.from.some(sender => sender.email === 'alice@example.com')
);
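
Exact string comparison misses addresses that differ only in letter case or surrounding whitespace. The sketch below shows one way to make the client-side filter more forgiving. `EmailLike`, `SenderLike`, `normalizeAddress`, and `filterBySender` are hypothetical names introduced here, and the interfaces are simplified stand-ins rather than Graphlit's generated types.

```typescript
// Simplified stand-ins for the email fields used above (assumed shapes, not Graphlit's types).
interface SenderLike { email?: string | null }
interface EmailLike { id: string; email?: { from?: Array<SenderLike | null> | null } | null }

// Lowercase and trim so "Alice@Example.com " matches "alice@example.com".
function normalizeAddress(address: string): string {
  return address.trim().toLowerCase();
}

// Keep only items where at least one sender matches the target address.
function filterBySender<T extends EmailLike>(items: T[], sender: string): T[] {
  const target = normalizeAddress(sender);
  return items.filter(item =>
    (item.email?.from ?? []).some(from => {
      const address = from?.email;
      return typeof address === 'string' && normalizeAddress(address) === target;
    })
  );
}

// Usage with in-memory sample data
const sampleEmails: EmailLike[] = [
  { id: '1', email: { from: [{ email: 'Alice@Example.com' }] } },
  { id: '2', email: { from: [{ email: 'bob@example.com' }] } },
  { id: '3', email: { from: [{ email: ' alice@example.com ' }] } },
  { id: '4', email: null }
];

const fromAliceIds = filterBySender(sampleEmails, 'alice@example.com').map(e => e.id);
console.log(fromAliceIds); // [ '1', '3' ]
```

Items with no sender information (like id `4`) are skipped rather than raising an error.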

Search with Date Range

// Emails from the last 90 days
const recentEmails = await graphlit.queryContents({
  search: 'budget proposal',
  filter: {
    types: [ContentTypes.Email],
    creationDateRange: {
      from: new Date(Date.now() - 90 * 24 * 60 * 60 * 1000).toISOString()
    }
  }
});
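
The filter above covers a rolling 90-day window. If a calendar quarter is what you need, the bounds can be computed explicitly. This is a minimal sketch: `previousQuarterRange` and `DateRange` are hypothetical names, the calculation is done in UTC, and passing a `to` bound to `creationDateRange` is an assumption, since the example above only uses `from`.

```typescript
interface DateRange { from: string; to: string }

// Returns the previous full calendar quarter relative to `now`, in UTC.
// `to` is the first instant of the current quarter (an exclusive upper bound).
function previousQuarterRange(now: Date): DateRange {
  const year = now.getUTCFullYear();
  const currentQuarter = Math.floor(now.getUTCMonth() / 3); // 0..3
  const startOfCurrent = new Date(Date.UTC(year, currentQuarter * 3, 1));
  // Date.UTC normalizes negative months, so Q1 rolls back into the previous year.
  const startOfPrevious = new Date(Date.UTC(year, currentQuarter * 3 - 3, 1));
  return { from: startOfPrevious.toISOString(), to: startOfCurrent.toISOString() };
}

const quarterRange = previousQuarterRange(new Date('2024-02-15T12:00:00Z'));
console.log(quarterRange.from); // 2023-10-01T00:00:00.000Z
console.log(quarterRange.to);   // 2024-01-01T00:00:00.000Z
```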

Part 4: Entity Extraction from Email

Extracted Entities

What gets extracted automatically:

  • People: Senders, recipients, mentioned names
  • Organizations: Company names in signatures, body
  • Places: Office locations, cities, addresses
  • Events: Meeting invites, deadline mentions

Query Entities from Email

// Get email content with entities
const email = await graphlit.getContent(emailContentId);

const entities = {
  people: email.content.observations?.filter(obs => obs.type === ObservableTypes.Person) || [],
  organizations: email.content.observations?.filter(obs => obs.type === ObservableTypes.Organization) || [],
  places: email.content.observations?.filter(obs => obs.type === ObservableTypes.Place) || []
};

console.log('People mentioned:', entities.people.map(p => p.observable.name));
console.log('Companies mentioned:', entities.organizations.map(o => o.observable.name));
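
Once observations are available per email, a natural next step is to tally which entities appear most often across a set of messages. The following sketch assumes observations shaped like the ones read above (`type` plus `observable.name`). `ObservationLike` and `countMentions` are hypothetical names, and the type strings are placeholders for the `ObservableTypes` values.

```typescript
interface ObservationLike { type: string; observable: { name: string } }

// Count how many emails mention each entity of the given type, most frequent first.
function countMentions(emails: ObservationLike[][], type: string): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const observations of emails) {
    // Use a Set so an entity is counted at most once per email.
    const names = new Set<string>(
      observations.filter(obs => obs.type === type).map(obs => obs.observable.name)
    );
    names.forEach(name => counts.set(name, (counts.get(name) ?? 0) + 1));
  }
  // Sort by count descending, then by name for a stable order.
  return Array.from(counts.entries()).sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0])
  );
}

// Usage with in-memory sample data (two emails)
const mentionSample: ObservationLike[][] = [
  [
    { type: 'PERSON', observable: { name: 'John Smith' } },
    { type: 'ORGANIZATION', observable: { name: 'Acme Corp' } }
  ],
  [
    { type: 'PERSON', observable: { name: 'John Smith' } },
    { type: 'PERSON', observable: { name: 'John Smith' } },
    { type: 'PERSON', observable: { name: 'Ana Lee' } }
  ]
];

const topPeople = countMentions(mentionSample, 'PERSON');
console.log(topPeople); // [ [ 'John Smith', 2 ], [ 'Ana Lee', 1 ] ]
```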

Build Contact Database

// Extract all contacts from emails
const emails = await graphlit.queryContents({
  filter: { types: [ContentTypes.Email] }
});

const contacts = new Map<string, { name: string; email: string; companies: Set<string> }>();

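for (const email of emails.contents.results) {
  const details = await graphlit.getContent(email.id);
  const observations = details.content.observations ?? [];
  const senderEmail = details.content.email?.from[0]?.email || '';

  // Organizations mentioned in this email
  const companies = observations
    .filter(obs => obs.type === ObservableTypes.Organization)
    .map(obs => obs.observable.name);

  // Extract people
  for (const obs of observations) {
    if (obs.type !== ObservableTypes.Person) continue;
    const name = obs.observable.name;

    if (!contacts.has(name)) {
      // Caveat: senderEmail is only correct when this person is the sender
      contacts.set(name, { name, email: senderEmail, companies: new Set() });
    }

    // Link to companies by co-occurrence in the same email (a rough heuristic)
    companies.forEach(company => contacts.get(name)!.companies.add(company));
  }
}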

console.log('Total contacts:', contacts.size);
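
Keying contacts by the raw extracted name treats `John Smith` and `john  smith` as different people. Here is a hedged sketch of one way to merge such variants. `ContactRecord`, `contactKey`, and `mergeContact` are hypothetical helpers, and the normalization (trim, collapse whitespace, lowercase) is an assumption that may be too aggressive or too weak for real data.

```typescript
interface ContactRecord { name: string; email: string; companies: Set<string> }

// Normalize a display name into a lookup key.
function contactKey(name: string): string {
  return name.trim().replace(/\s+/g, ' ').toLowerCase();
}

// Insert a contact or merge it into an existing record with the same key.
function mergeContact(
  contacts: Map<string, ContactRecord>,
  incoming: { name: string; email?: string; companies?: string[] }
): void {
  const key = contactKey(incoming.name);
  const record: ContactRecord = contacts.get(key) ?? {
    name: incoming.name.trim(),
    email: '',
    companies: new Set<string>()
  };
  contacts.set(key, record);

  // Fill in the address only if none is known yet.
  if (!record.email && incoming.email) {
    record.email = incoming.email;
  }
  for (const company of incoming.companies ?? []) {
    record.companies.add(company);
  }
}

// Usage: two spellings of the same name collapse into one record
const merged = new Map<string, ContactRecord>();
mergeContact(merged, { name: 'John Smith', companies: ['Acme Corp'] });
mergeContact(merged, {
  name: '  john   smith ',
  email: 'john@acme.example',
  companies: ['Acme Corp', 'Example Inc']
});
console.log(merged.size); // 1
```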

Part 5: Email RAG

Chat with Email History

// Create conversation
const conversation = await graphlit.createConversation({ name: 'Email Assistant' });

// Ask about emails
const response = await graphlit.promptConversation(
  'What did we discuss with Acme Corp last month?',
  conversation.createConversation.id
);

console.log(response.promptConversation?.message?.message);

// Citations show which emails were used
response.promptConversation?.message?.citations?.forEach(citation => {
  console.log(`From: ${citation.content?.email?.from[0]?.email}`);
  console.log(`Subject: ${citation.content?.email?.subject}`);
});

Filter by Contact

// Find contact entity
const acmeContact = await graphlit.queryObservables({
  filter: {
    searchText: "John Smith",
    types: [ObservableTypes.Person]
  }
});

const contactId = acmeContact.observables?.results?.[0]?.observable.id;

// Chat filtered to John's emails
const response = await graphlit.promptConversation(
  'What has John discussed about pricing?',
  conversationId,
  undefined,
  undefined,
  {
    observations: {
      observables: [{ id: contactId }]
    }
  }
);

Part 6: Production Patterns

Pattern 1: Email Archiving

// Sync entire inbox for archival
const archiveFeed = await graphlit.createFeed({
  name: 'Email Archive',
  type: FeedServiceTypes.Gmail,
  gmail: {
    token: access_token,
    readLimit: 10000  // Large limit for full archive
  }
});

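// Monitor progress; stop polling once the count has been stable for a while
let count = 0;
let idlePolls = 0;
const MAX_IDLE_POLLS = 6; // about one minute without new emails

const interval = setInterval(async () => {
  const contents = await graphlit.queryContents({
    filter: { types: [ContentTypes.Email] }
  });
  
  // Note: this counts only the results returned by this query
  const newCount = contents.contents.results.length;
  if (newCount > count) {
    console.log(`Archived: ${newCount} emails`);
    count = newCount;
    idlePolls = 0;
  } else if (++idlePolls >= MAX_IDLE_POLLS) {
    clearInterval(interval);
    console.log(`Sync appears complete: ${count} emails`);
  }
}, 10000);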

Pattern 2: Smart Email Categorization

// Extract entities and categorize
async function categorizeEmail(contentId: string) {
  const email = await graphlit.getContent(contentId);
  
  const hasCustomer = email.content.observations?.some(
    obs => obs.type === ObservableTypes.Organization && 
           ['Acme Corp', 'Example Inc'].includes(obs.observable.name)
  );
  
  // Note: Product must be included in the workflow's extractedTypes for this to match
  const hasProduct = email.content.observations?.some(
    obs => obs.type === ObservableTypes.Product
  );
  
  if (hasCustomer && hasProduct) {
    return 'customer-product-discussion';
  } else if (hasCustomer) {
    return 'customer-communication';
  }
  
  return 'general';
}
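
Separating the decision logic from the API call makes the rules easy to unit-test. The sketch below mirrors the rules above as a pure function. `EntityObservation`, `EmailCategory`, and `categorizeObservations` are hypothetical names, the type strings stand in for `ObservableTypes.Organization` and `ObservableTypes.Product`, and case-insensitive customer matching is an added assumption.

```typescript
interface EntityObservation { type: string; observable: { name: string } }
type EmailCategory = 'customer-product-discussion' | 'customer-communication' | 'general';

// Pure categorization: no network calls, so it can be tested with plain data.
function categorizeObservations(
  observations: EntityObservation[],
  customers: string[]
): EmailCategory {
  const known = customers.map(c => c.toLowerCase());
  const hasCustomer = observations.some(
    obs =>
      obs.type === 'ORGANIZATION' &&
      known.indexOf(obs.observable.name.toLowerCase()) !== -1
  );
  const hasProduct = observations.some(obs => obs.type === 'PRODUCT');

  if (hasCustomer && hasProduct) return 'customer-product-discussion';
  if (hasCustomer) return 'customer-communication';
  return 'general';
}

// Usage with in-memory sample data
const customerList = ['Acme Corp', 'Example Inc'];
const categoryA = categorizeObservations(
  [
    { type: 'ORGANIZATION', observable: { name: 'ACME CORP' } },
    { type: 'PRODUCT', observable: { name: 'Widget Pro' } }
  ],
  customerList
);
const categoryB = categorizeObservations(
  [{ type: 'ORGANIZATION', observable: { name: 'Example Inc' } }],
  customerList
);
const categoryC = categorizeObservations(
  [{ type: 'PRODUCT', observable: { name: 'Widget Pro' } }],
  customerList
);
console.log(categoryA); // customer-product-discussion
console.log(categoryB); // customer-communication
console.log(categoryC); // general
```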

Pattern 3: Email Summarization

import { EnrichmentServiceTypes } from 'graphlit-client/dist/generated/graphql-types';

// Workflow with summarization
const summaryWorkflow = await graphlit.createWorkflow({
  name: "Email Summary",
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Email
      }
    }]
  },
  enrichment: {
    jobs: [{
      connector: {
        type: EnrichmentServiceTypes.ModelSummarization,
        prompt: "Summarize this email in 2-3 sentences, including: sender, key points, and any action items."
      }
    }]
  }
});

// Feed with auto-summarization
const summaryFeed = await graphlit.createFeed({
  name: 'Gmail with Summaries',
  type: FeedServiceTypes.Gmail,
  gmail: {
    token: access_token,
    readLimit: 100
  },
  workflow: { id: summaryWorkflow.createWorkflow.id }
});

Common Issues & Solutions

Issue: Attachments Not Syncing

Problem: Email synced but PDF attachments missing.

Solution: Enable attachments:

gmail: {
  includeAttachments: true  // Must be true
}

Issue: Token Expired

Problem: Feed stops syncing after 1 hour.

Solution: Implement token refresh:

async function refreshGmailToken(feedId: string, refreshToken: string) {
  const response = await fetch('https://oauth2.googleapis.com/token', {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({
      refresh_token: refreshToken,
      client_id: GOOGLE_CLIENT_ID,
      client_secret: GOOGLE_CLIENT_SECRET,
      grant_type: 'refresh_token'
    })
  });
  
  if (!response.ok) {
    throw new Error(`Token refresh failed: ${response.status}`);
  }
  
  const { access_token } = await response.json();
  
  // Update feed with new token
  await graphlit.updateFeed(feedId, {
    gmail: {
      token: access_token
    }
  });
}
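
Rather than waiting for a sync to fail, the refresh can be scheduled from the token's lifetime. Google's token response includes an `expires_in` value in seconds, and the sketch below turns that into a refresh decision. `StoredToken` and `needsRefresh` are hypothetical names, and the five-minute safety margin is an arbitrary assumption.

```typescript
interface StoredToken {
  accessToken: string;
  obtainedAtMs: number; // Date.now() when the token was issued
  expiresInSec: number; // expires_in from the token response
}

// True once the token is expired or within `marginSec` of expiring.
function needsRefresh(token: StoredToken, nowMs: number, marginSec: number = 300): boolean {
  const expiresAtMs = token.obtainedAtMs + token.expiresInSec * 1000;
  return nowMs >= expiresAtMs - marginSec * 1000;
}

// Usage: a one-hour token issued at time zero
const storedToken: StoredToken = {
  accessToken: 'example-token',
  obtainedAtMs: 0,
  expiresInSec: 3600
};

console.log(needsRefresh(storedToken, 50 * 60 * 1000)); // false (50 minutes in)
console.log(needsRefresh(storedToken, 56 * 60 * 1000)); // true (inside the 5-minute margin)
```

A scheduler could call `needsRefresh` before each sync and request a new access token only when it returns `true`.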

What's Next?

You now have email intelligence. Next steps:

  1. Build contact database from email entities
  2. Create email summaries for quick review
  3. Set up email RAG bot for querying history

Happy emailing! 📧

Ready to Build with Graphlit?

Start building AI-powered applications with our API-first platform. Free tier includes 100 credits/month — no credit card required.
