LLMs for Podcasters: Automating Content Workflows with Graphlit, MapScaping, Deepgram and OpenAI GPT-4 Turbo

Kirk Marple

December 3, 2023

Introduction

Before starting Graphlit, I was an avid podcast listener but always felt like there was untapped value in the knowledge presented in each episode.

There is one level of understanding that can be gained from listening to the podcast, but I wanted to dive deeper into topics discussed, learn more about the background of the speakers, or find similar content to explore.

In 2022, I had the pleasure of being a guest on Daniel O'Donohue's MapScaping podcast.

We discussed how Unstructured Data is Dark Data, and how a platform like Graphlit can be used to unlock the knowledge contained in all forms of unstructured data.

In this tutorial, we will explore the benefits of using Graphlit for automating content workflows for podcasters like Daniel.



Walkthrough

You have likely seen podcast websites with transcriptions, timestamped chapters, or summaries.

Let's see how we can use Graphlit to automate the generation of these forms of summarized content, as well as provide a conversational chatbot for your listeners.

Completing this tutorial requires a Graphlit account. If you don't have one already, you can sign up here for our free tier. The GraphQL API requires JWT authentication and the creation of a Graphlit project.
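Every mutation and query in this tutorial is sent as a GraphQL POST request that carries your JWT. As a minimal sketch, here is one way to do that from Python using only the standard library. The endpoint URL is a placeholder, and the `Bearer` authorization scheme is an assumption; check your Graphlit project settings for the actual values.

```python
import json
import urllib.request


def build_graphql_request(endpoint, token, query, variables):
    """Build (but do not send) an authenticated GraphQL POST request.

    `endpoint` and `token` are placeholders you supply; sending the JWT as
    a Bearer token is an assumption, not something this tutorial confirms.
    """
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_graphql(request):
    """Send the request and return the `data` object, raising on GraphQL errors."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    if payload.get("errors"):
        raise RuntimeError(payload["errors"])
    return payload["data"]
```

Building the request separately from sending it makes it easy to inspect the headers and body before anything goes over the network.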


Using OpenAI GPT-4 Turbo

In this tutorial, we are going to use the new OpenAI GPT-4 Turbo model.

We first create a specification, which describes which LLM to use, and the configuration of the model, such as temperature and probability.


// Mutation:
mutation CreateSpecification($specification: SpecificationInput!) {
  createSpecification(specification: $specification) {
    id
  }
}

// Variables:
{
  "specification": {
    "type": "COMPLETION",
    "serviceType": "AZURE_OPEN_AI",
    "azureOpenAI": {
      "model": "GPT4_TURBO_128K_PREVIEW",
      "temperature": 0.1,
      "probability": 0.2
    },
    "name": "GPT-4 Turbo"
  }
}


Create Content Workflow

Next, we can create a workflow object, which describes how the podcast will be processed, as it is ingested into Graphlit.

Here we ask for 5 summary paragraphs, 5 potential headlines, 5 social media posts, timestamped chapters, and 5 followup questions for each episode in the podcast feed. Each of these summarized items is created in parallel, and you can specify any number of items you want to create.

All of these will be created automatically using the GPT-4 Turbo model which we assigned in the specification.


// Mutation:
mutation CreateWorkflow($workflow: WorkflowInput!) {
  createWorkflow(workflow: $workflow) {
    id
  }
}

// Variables:
{
  "workflow": {
    "preparation": {
      "specification": {
        "id": "026acd57-fb42-4856-96ae-9210f6f87d9e"
      },
      "summarizations": [
        {
          "type": "SUMMARY",
          "items": 5
        },
        {
          "type": "HEADLINES",
          "items": 5
        },
        {
          "type": "POSTS",
          "items": 5
        },
        {
          "type": "CHAPTERS"
        },
        {
          "type": "QUESTIONS",
          "items": 5
        }
      ]
    },
    "name": "Podcast Summarization"
  }
}
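The workflow variables reference the `id` returned by the `createSpecification` mutation. If you are scripting these steps, a small helper can assemble the variables for you. This is only a sketch: `build_workflow_variables` is a hypothetical name, and it simply reproduces the JSON shape shown above.

```python
def build_workflow_variables(specification_id, summarizations, name):
    """Assemble the variables for the CreateWorkflow mutation.

    `summarizations` is a list of (type, items) pairs; pass None for
    `items` to omit the count, as the CHAPTERS entry above does.
    """
    entries = []
    for kind, items in summarizations:
        entry = {"type": kind}
        if items is not None:
            entry["items"] = items
        entries.append(entry)
    return {
        "workflow": {
            "preparation": {
                "specification": {"id": specification_id},
                "summarizations": entries,
            },
            "name": name,
        }
    }


variables = build_workflow_variables(
    "026acd57-fb42-4856-96ae-9210f6f87d9e",
    [("SUMMARY", 5), ("HEADLINES", 5), ("POSTS", 5), ("CHAPTERS", None), ("QUESTIONS", 5)],
    "Podcast Summarization",
)
```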


Ingest MapScaping Podcast Feed

To bring content into Graphlit, one option is to use a feed. Here we provide the RSS URL for the MapScaping podcast, and Graphlit automatically locates the episodes and ingests them.

Each episode will be processed with the specified workflow we created above.

All of the workflow processing happens asynchronously after the createFeed mutation returns. You can use the feed query to check on the feed's state, and see how many episodes it has read.


// Mutation:
mutation CreateFeed($feed: FeedInput!) {
  createFeed(feed: $feed) {
    id
  }
}

// Variables:
{
  "feed": {
    "type": "RSS",
    "rss": {
      "uri": "https://mapscaping.com/feed/podcast/"
    },
    "workflow": {
      "id": "e2fb0855-6e41-480f-b53e-1d3407b50891"
    },
    "name": "MapScaping Podcast"
  }
}
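Because the processing is asynchronous, a script typically has to poll until the feed has finished. Here is a generic polling sketch. It does not call Graphlit itself: `fetch_state` stands in for whatever function you write around the feed query, and the state names in the usage example are hypothetical, since the feed's actual state values are not shown in this tutorial.

```python
import time


def wait_for(fetch_state, is_done, interval_seconds=5.0, timeout_seconds=600.0,
             sleep=time.sleep, clock=time.monotonic):
    """Call `fetch_state` repeatedly until `is_done(state)` is true.

    Raises TimeoutError if the deadline passes first. `sleep` and `clock`
    are injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        state = fetch_state()
        if is_done(state):
            return state
        if clock() >= deadline:
            raise TimeoutError(f"still in state {state!r} after {timeout_seconds} seconds")
        sleep(interval_seconds)


# Usage with a stand-in for the feed query (state names are hypothetical).
fake_states = iter(["PENDING", "RUNNING", "FINISHED"])
final_state = wait_for(lambda: next(fake_states),
                       lambda state: state == "FINISHED",
                       sleep=lambda seconds: None)
```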


After automatically transcribing the audio from the podcast episodes using the Deepgram Nova 2 model, we use the OpenAI GPT-4 Turbo model to generate a comprehensive, long-form summary.

Notice that it generates 5 summary paragraphs, as we asked for in the workflow.



Once the podcast episodes have completed their content workflows, we can query the summarized results from the Graphlit knowledge graph. Here we are filtering by the feed we created, so we will only get those content results.


// Query:
query QueryContents($filter: ContentFilter!) {
  contents(filter: $filter) {
    results {
      id
      name
      uri
      summary
      headlines
      posts
      questions
      chapters
    }
  }
}

// Variables:
{
  "filter": {
    "feeds": [
      {
        "id": "c1e034ab-1981-46f5-b89e-353a77c6ee5c"
      }
    ],
    "offset": 0,
    "limit": 100
  }
}
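The `offset` and `limit` fields in the filter let you page through a feed with more than 100 episodes. A minimal paging sketch follows. `fetch_page` is a hypothetical callable that you would implement by running the query above and returning `contents.results`, and treating a short page as the last page is an assumption.

```python
def contents_filter(feed_id, offset, limit):
    """Build the query variables shown above for one page of results."""
    return {"filter": {"feeds": [{"id": feed_id}], "offset": offset, "limit": limit}}


def iter_all_contents(fetch_page, limit=100):
    """Yield every result by advancing `offset` until a short page comes back."""
    offset = 0
    while True:
        results = fetch_page(offset, limit)
        yield from results
        if len(results) < limit:
            return
        offset += limit
```

In practice, `fetch_page` would call the GraphQL API with `contents_filter(feed_id, offset, limit)` as its variables.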


Podcast Summaries

Let's look at an example of all the generated summaries from one of the MapScaping episodes, Fire Mapping, Maritime Search And Wide Angle Imaging.

The podcast chapters have been generated automatically to cover the major topics, and conform to the Spotify and YouTube chapter format.

Graphlit will generate chapters both from audio sources, and from video sources which contain audio.
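To illustrate what that chapter format looks like, here is a small sketch that renders a list of chapters as timestamp-plus-title lines, starting at 00:00, which is the convention YouTube descriptions use as far as we are aware. The input shape (start time in seconds, title) and the sample titles are assumptions for illustration; they are not taken from Graphlit's actual `chapters` output.

```python
def format_timestamp(total_seconds):
    """Render whole seconds as MM:SS, or H:MM:SS once past an hour."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes:02d}:{seconds:02d}"


def format_chapters(chapters):
    """Render (start_seconds, title) pairs as one chapter line each, in time order."""
    return "\n".join(f"{format_timestamp(start)} {title}"
                     for start, title in sorted(chapters))


# Hypothetical chapter titles, for illustration only.
print(format_chapters([(0, "Introduction"), (75, "Guest background"), (3725, "Wrap-up")]))
```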





The social media posts which are generated by Graphlit can be easily copy-pasted into your preferred client.

For example, here is how the first one would look in X (formerly known as Twitter):


Making your Podcast Interactive for Listeners

In addition to creating a podcast summary and social media posts, we also asked Graphlit to auto-suggest followup questions which may be of interest to your listeners.

Here are some examples:


How can we use these questions?

Graphlit provides support for conversations over your podcast content. These conversations use the Retrieval Augmented Generation (RAG) pattern, which first leverages vector embeddings and semantic search to find relevant content. Then the text from the search 'hits' is formatted intelligently into the LLM prompt to ground its results and provide a higher quality response.

We can easily create a conversation (i.e. chatbot) where your listeners can ask questions and get more information about the episodes in your podcast feed.
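To make the RAG pattern concrete, here is a toy sketch of its two steps: ranking text chunks by cosine similarity to the question, then formatting the best matches into a prompt. This is not Graphlit's implementation. The vectors are hand-made stand-ins for real embeddings, and the prompt wording is an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector, chunks, top_k=2):
    """Return the text of the `top_k` chunks most similar to the query.

    `chunks` is a list of (text, vector) pairs.
    """
    ranked = sorted(chunks,
                    key=lambda chunk: cosine_similarity(query_vector, chunk[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question, passages):
    """Number the retrieved passages so the model can cite them by index."""
    sources = "\n".join(f"[{i}] {passage}" for i, passage in enumerate(passages))
    return ("Answer using only the sources below and cite them by index.\n\n"
            f"Sources:\n{sources}\n\nQuestion: {question}")


# Toy two-dimensional "embeddings"; real ones have hundreds of dimensions.
chunks = [
    ("Wide area imaging outlines fire perimeters.", [0.9, 0.1]),
    ("Sponsorship options for the podcast.", [0.1, 0.9]),
    ("Imaging systems support maritime search.", [0.5, 0.5]),
]
passages = retrieve([1.0, 0.0], chunks, top_k=2)
prompt = build_prompt("How is wide angle imaging used in fire mapping?", passages)
```

Numbering the sources in the prompt is what lets a response carry index markers such as `[0]` that can be mapped back to the cited content.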

You can use any LLM for your chatbot, and it can be different from the model used for summarizations.

Here we specify the Azure OpenAI GPT-3.5 16K model, and ask Graphlit to return citations for the referenced content in the conversation responses.

We are also constraining the conversation just to the feed which we created above.


// Mutation:
mutation CreateSpecification($specification: SpecificationInput!) {
  createSpecification(specification: $specification) {
    id
  }
}

// Variables:
{
  "specification": {
    "type": "COMPLETION",
    "serviceType": "AZURE_OPEN_AI",
    "azureOpenAI": {
      "model": "GPT35_TURBO_16K",
      "temperature": 0.1,
      "probability": 0.2
    },
    "strategy": {
      "embedCitations": true
    },
    "name": "GPT-3.5 16K with citations"
  }
}
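
With the specification created, we can create the conversation, which references the specification and filters on the feed.

// Mutation: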
mutation CreateConversation($conversation: ConversationInput!) {
  createConversation(conversation: $conversation) {
    id
  }
}

// Variables:
{
  "conversation": {
    "type": "CONTENT",
    "filter": {
      "feeds": [
        {
          "id": "c1e034ab-1981-46f5-b89e-353a77c6ee5c"
        }
      ]
    },
    "specification": {
      "id": "5b825a16-b875-4112-b155-12871eaa027e"
    },
    "name": "Podcast Chatbot"
  }
}


We can now simulate a question from a listener, by prompting the conversation.

Let's ask the first suggested question:
What is wide angle imaging and how is it used in fire mapping and maritime search?

The LLM responded with:


We called the promptConversation mutation to send the user prompt, as if a listener typed this into a chatbot on your website.

It returns the message, which is the completed response from the LLM, and the citations it used when generating the response.


// Mutation:
mutation PromptConversation($prompt: String!, $id: ID) {
  promptConversation(prompt: $prompt, id: $id) {
    conversation {
      id
    }
    message {
      role
      author
      message
      citations {
        content {
          id
        }
        index
        startTime
        endTime
        pageNumber
      }
      tokens
      completionTime
      timestamp
    }
    messageCount
  }
}

// Variables:
{
  "prompt": "What is wide angle imaging and how is it used in fire mapping and maritime search?",
  "id": "de12f47f-6b16-432e-be0b-9e73fd765529"
}

// Response:
{
  "conversation": {
    "id": "de12f47f-6b16-432e-be0b-9e73fd765529"
  },
  "message": {
    "role": "ASSISTANT",
    "message": "Wide angle imaging is a technology used in fire mapping and maritime search. It involves capturing wide area mosaics of images and vector products to provide situational awareness and support response crews. In fire mapping, wide angle imaging helps identify and outline areas affected by fire, aiding in firefighting efforts. In maritime search, it assists in identifying and tracking objects of interest, such as boats or individuals in distress. Wide angle imaging combines hardware and software to offer a broad view of the environment and is tailored to specific missions. [0][1][2][3][4][5][6][7]",
    "citations": [
      {
        "content": {
          "id": "4eeea260-517d-4f7c-a0a2-28ca063b5549"
        },
        "index": 0
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 1,
        "startTime": "PT21M",
        "endTime": "PT22M"
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 2,
        "startTime": "PT3M",
        "endTime": "PT4M"
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 3,
        "startTime": "PT5M",
        "endTime": "PT6M"
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 4,
        "startTime": "PT19M",
        "endTime": "PT20M"
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 5,
        "startTime": "PT6M",
        "endTime": "PT7M"
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 6,
        "startTime": "PT43M",
        "endTime": "PT44M"
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 7,
        "startTime": "PT10M",
        "endTime": "PT11M"
      }
    ],
    "tokens": 493,
    "completionTime": "PT6.8314845S",
    "timestamp": "2023-12-04T05:03:18.013Z"
  },
  "messageCount": 2
}


The citations reference the content where Graphlit found relevant information. These can be linked back to the original source URL, and formatted on your podcast chatbot web page for later exploration by your listeners.
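As a sketch of how you might prepare these citations for display, the snippet below converts the ISO 8601 durations in `startTime` (such as `PT21M`) into seconds and groups the citations by content ID. It assumes only the response shape shown above and handles just the hours, minutes, and seconds components of a duration.

```python
import re
from collections import defaultdict

# Matches durations like "PT21M", "PT0S", "PT1H2M3S" or "PT6.8314845S".
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def duration_to_seconds(value):
    """Convert an ISO 8601 time duration (H/M/S parts only) to seconds."""
    match = _DURATION.match(value)
    if not match or value == "PT":
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def group_citations(citations):
    """Map each content ID to its citation indexes and start offsets in seconds."""
    grouped = defaultdict(list)
    for citation in citations:
        start = citation.get("startTime")
        grouped[citation["content"]["id"]].append({
            "index": citation["index"],
            "start_seconds": duration_to_seconds(start) if start else None,
        })
    return dict(grouped)


# The first two citations from the response above.
sample = [
    {"content": {"id": "4eeea260-517d-4f7c-a0a2-28ca063b5549"}, "index": 0},
    {"content": {"id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"}, "index": 1,
     "startTime": "PT21M", "endTime": "PT22M"},
]
grouped = group_citations(sample)
```

Citations without a `startTime`, like the web page citation at index 0, keep a `start_seconds` of `None`, so your page can link to the document as a whole rather than to a position in the audio.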

If we look closely, the conversation response cited information found in two different content objects:




Let's dive deeper into what Graphlit found. The first content object 4eeea260-517d-4f7c-a0a2-28ca063b5549 is the podcast web page, which was also automatically ingested from the podcast feed.


// Query:
query GetContent($id: ID!) {
  content(id: $id) {
    id
    uri
    type
    fileType
    name
    summary
  }
}

// Variables:
{
  "id": "4eeea260-517d-4f7c-a0a2-28ca063b5549"
}

// Response:
{
  "type": "POST",
  "summary": "This episode of The MapScaping Podcast features a discussion on wide-angle imaging for fire mapping and maritime search. The guest, Alison Harrod, is a mission success manager at Overwatch Imaging. The conversation delves into the challenges of introducing new technologies and gaining trust for their use in critical applications such as fire mapping, which requires a cautious approach.\n\nAlison Harrod and the host explore the potential of wide-angle imaging in the context of fire mapping and maritime search. They consider the implications of adopting artificial intelligence and smart sensors in these fields, highlighting Overwatch Imaging's involvement in these technological advancements.\n\nThe MapScaping Podcast, hosted by Daniel O'Donohue, serves the geospatial community by unraveling complex spatial issues. O'Donohue, with his geospatial expertise, has created a platform that discusses various topics ranging from thermal imagery from space to the use of cube satellites in the stratosphere.\n\nListeners are encouraged to connect with Alison Harrod through her LinkedIn profile and to learn more about Overwatch Imaging through their website. The podcast also suggests other relevant episodes that cover a wide range of geospatial topics, providing a resource for those interested in the field.\n\nThe MapScaping Podcast invites listeners to become supporters, offering them the opportunity to place their brand in front of a geospatially interested audience. The podcast also extends an invitation to tech companies, developers, and professionals to engage with the community through various collaborations, including sponsorships and guest appearances.",
  "fileType": "DOCUMENT",
  "uri": "https://mapscaping.com/podcast/fire-mapping-maritime-search-and-wide-angle-imaging/",
  "id": "4eeea260-517d-4f7c-a0a2-28ca063b5549",
  "name": "Fire Mapping, Maritime Search And Wide Angle Imaging"
}


And the second content object d2ffb813-caa2-4649-beee-ee2c01c060ca is the audio file from the podcast episode.

So, Graphlit uses more than just the audio source from podcasts, and automatically pulls in relevant web content sources as well.


// Query:
query GetContent($id: ID!) {
  content(id: $id) {
    id
    uri
    type
    fileType
    name
    summary
  }
}

// Variables:
{
  "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
}

// Response:
{
  "type": "FILE",
  "summary": "Daniel hosts Alison Harrott from Overwatch Imaging on the Mapscaping podcast to discuss wide angle imaging for fire mapping and maritime search. Overwatch Imaging, a startup, contributes to changing operational culture by promoting trust in new methodologies. The company's focus is on aiding high-risk environments with technology that simplifies and enhances the effectiveness of tasks such as fire response and emergency services.\n\nAlison, with an engineering background, joined Overwatch Imaging seeking a role with tangible impact on significant problems. As mission success manager, she ensures that the company's products meet customer expectations and adapt to their evolving needs. Overwatch Imaging integrates top market products with their software to create tailored solutions for diverse missions like fire mapping, maritime search, and tactical intelligence, all under the umbrella of wide area imaging.\n\nThe wide area imaging technology developed by Overwatch Imaging differs from traditional high zoom video gimbals used in military applications. It provides a broader view with higher resolution, enabling better situational awareness. The company's systems produce rapid orthomosaics and vector products in near real-time, thanks to on-edge processing. This capability allows for quick delivery of critical information to operators and decision-makers.\n\nOverwatch Imaging's products serve a wide range of users, from frontline responders to management teams. The data produced is crucial for immediate tactical decisions and long-term strategic planning, such as resource allocation and post-event analysis. However, the adoption of new technology in high-stakes environments is slow due to established practices and the need for trust in the reliability of new systems.\n\nAlison emphasizes the importance of understanding the operational environment and the challenges in introducing new technology. Despite the potential of Overwatch Imaging's solutions, the company faces hurdles in training and integrating their products into existing workflows. The goal is to ensure that their technology not only advances capabilities but also aligns with the safety and operational needs of the users.",
  "fileType": "AUDIO",
  "uri": "https://s356.podbean.com/pb/1cabd94c6c437b472f0e00cb0acfda5a/656c4dd6/data1/fs26/4204498/uploads/Over_Watch_Imaging_podcast_final8s2fn.mp3?pbss=c2c362ce-e013-542c-a851-b9087d069b73",
  "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca",
  "name": "Over_Watch_Imaging_podcast_final8s2fn"
}


Let's also ask an additional followup question:
Where can more information about Overwatch Imaging and its services be found?

You can see how it provides a much richer response, which includes the Overwatch Imaging website information that was found on the podcast web page - not within the podcast audio transcription itself.



// Mutation:
mutation PromptConversation($prompt: String!, $id: ID) {
  promptConversation(prompt: $prompt, id: $id) {
    conversation {
      id
    }
    message {
      role
      author
      message
      citations {
        content {
          id
        }
        index
        startTime
        endTime
        pageNumber
      }
      tokens
      completionTime
      timestamp
    }
    messageCount
  }
}

// Variables:
{
  "prompt": "Where can more information about Overwatch Imaging and its services be found?",
  "id": "de12f47f-6b16-432e-be0b-9e73fd765529"
}

// Response:
{
  "conversation": {
    "id": "de12f47f-6b16-432e-be0b-9e73fd765529"
  },
  "message": {
    "role": "ASSISTANT",
    "message": "Overwatch Imaging can be found on LinkedIn and their website overwatchimaging.com. [0][1]\n\nOverwatch Imaging is a startup focused on supporting folks in high intensity, high risk environments such as fire response, emergency response, and tactical response. [1]\n\nWide angle imaging is a technology used in fire mapping and maritime search. [0][2]\n\nOverwatch Imaging's mission success manager, Alison Harrott, provides an overview of wide angle imaging and its applications in fire mapping and maritime search. [2][3]\n\nOverwatch Imaging's website contains links to Alison Harrott's LinkedIn profile and other relevant episodes of the Mapscaping podcast. [0][4]\n\nOverwatch Imaging is a software company that develops wide area mapping technology for various applications. [5]\n\nOverwatch Imaging's products are created for a wide range of users, including those on the ground fighting fires, firefighters, management personnel, and other sensors in the sky. [5][6]\n\nOverwatch Imaging offers solutions for fire mapping, maritime search, and other related fields. [5]\n\nOverwatch Imaging provides wide angle imaging technology to improve situational awareness and automated detection in fire mapping and maritime search. [5]\n\nMore information about Overwatch Imaging and its services can be found on their website overwatchimaging.com. [0]",
    "citations": [
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 0,
        "startTime": "PT45M",
        "endTime": "PT46M"
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 1,
        "startTime": "PT2M",
        "endTime": "PT3M"
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 2,
        "startTime": "PT0S",
        "endTime": "PT1M"
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 3,
        "startTime": "PT1M",
        "endTime": "PT2M"
      },
      {
        "content": {
          "id": "4eeea260-517d-4f7c-a0a2-28ca063b5549"
        },
        "index": 4,
        "pageNumber": 1
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 5,
        "startTime": "PT4M",
        "endTime": "PT5M"
      },
      {
        "content": {
          "id": "d2ffb813-caa2-4649-beee-ee2c01c060ca"
        },
        "index": 6,
        "startTime": "PT10M",
        "endTime": "PT11M"
      },
      {
        "content": {
          "id": "10f0ffa4-d905-4180-be57-8a6b42d01272"
        },
        "index": 7,
        "startTime": "PT33M",
        "endTime": "PT34M"
      }
    ],
    "tokens": 691,
    "completionTime": "PT8.8492027S",
    "timestamp": "2023-12-04T05:24:31.477Z"
  },
  "messageCount": 4
}


Graphlit supports chatbot conversations across all media types, not just web pages and audio.

You can ingest additional background information from your website, or related PDFs or documents, which will be incorporated into Graphlit's conversation responses.


Summary

Graphlit provides an automated approach to content workflows for creators of podcasts or YouTube video content. We offer the ability to integrate any publicly hosted LLM, such as OpenAI GPT-4 Turbo, to use for summary generation as well as interactive chatbots over your content.

We look forward to hearing how you make use of Graphlit in your applications.

If you have any questions, please reach out to us here.