GPT-to-Audio: Publish AI-Generated Podcasts with Graphlit, GPT-4 and ElevenLabs
February 3, 2024
Text-to-speech models such as those from ElevenLabs have become incredibly human-sounding, and can even clone your own voice.
Now, the Graphlit Platform can be used to generate audio with ElevenLabs voices from any content - web pages, PDFs, audio transcripts and more.
Say we want to generate a podcast about this week's interesting AI news and academic papers.
🔉 Listen to an example here.
We first need to ingest some content to use for our podcast. We'll start by ingesting an interesting blog post about knowledge graphs, and we also want to crawl its hyperlinks and ingest the arXiv papers referenced in the blog post.
After ingesting all the content we want to use for the podcast, we'll create an LLM specification for the latest GPT-4 Turbo (0125) model, and then will publish summarized versions of our content as an MP3 audio file using an ElevenLabs voice.
Once the publishing has completed, you can download the MP3 to post on social media or upload to a media hosting site.
You can use any content you want with this audio publishing process, create your own LLM publishing prompt, and select any ElevenLabs voice.
To crawl the links automatically, we need to create a workflow object which enables crawling of Web page hyperlinks.
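Based on the steps described, the workflow creation might look like the following GraphQL sketch. The exact field names (enrichment, link, enableCrawling, allowedLinks) are assumptions about the workflow input shape, not verified against the current Graphlit schema; consult the Graphlit API reference for the actual structure.

```graphql
mutation CreateWorkflow {
  createWorkflow(workflow: {
    name: "Podcast Link Crawling Workflow"
    # Hypothetical fields: enable crawling of hyperlinks found in ingested pages,
    # here restricted to PDF links so we pick up the referenced arXiv papers.
    enrichment: {
      link: {
        enableCrawling: true
        allowedLinks: [PDF]
      }
    }
  }) {
    id
    name
  }
}
```

The returned id is what we'll reference when ingesting the web page below.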
To ingest a web page, we will specify the uri and the workflow id we created above, since we want to crawl any hyperlinks we find in this blog post.
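A minimal ingestion sketch might look like this; the mutation name (ingestUri), the workflow reference shape, and the example URL are assumptions for illustration:

```graphql
mutation IngestPage {
  ingestUri(
    # Placeholder URL for the knowledge-graph blog post being ingested
    uri: "https://example.com/blog/knowledge-graphs"
    # Reference the workflow created above so hyperlinks get crawled
    workflow: { id: "WORKFLOW_ID" }
  ) {
    id
    state
  }
}
```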
You can repeat this process for any other content you want to include in your AI-generated podcast, and you aren't limited just to web pages. You can use audio transcripts, PDFs, Word documents, or even Slack or email.
Once we have the content we want to publish, let's use the GPT-4 Turbo (0125) model to generate the script that will be provided to the ElevenLabs text-to-speech API.
Here we specify the OPEN_AI service. Creating a specification is optional, but GPT-4 tends to give better-quality output for the publishing process.
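A specification sketch for this step might look like the following; the OPEN_AI service type comes from the text, but the other field names and the model enum for the 0125 release are assumptions, so check the Graphlit API reference for the exact values:

```graphql
mutation CreateSpecification {
  createSpecification(specification: {
    name: "GPT-4 Turbo Summarization"
    type: COMPLETION
    serviceType: OPEN_AI
    openAI: {
      # Hypothetical enum name for the GPT-4 Turbo (0125) model
      model: GPT4_TURBO_128K_0125
      # Low temperature keeps the summarized script consistent
      temperature: 0.1
    }
  }) {
    id
  }
}
```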
We're now ready to publish our AI-generated podcast.
We first need to tell Graphlit that we want to use ELEVEN_LABS_AUDIO as our publishing type, with the MP3 format. We are using the ENGLISH_V1 model with a British male voice. You can find all the supported voice IDs on the ElevenLabs website. We are also assigning the publishSpecification to the GPT-4 specification we created above.
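Putting those settings together, the publish call might be sketched as follows. The ELEVEN_LABS_AUDIO type, MP3 format, ENGLISH_V1 model, and publishSpecification come from the text; the surrounding argument names (connector, elevenLabs, publishPrompt) are assumptions about the mutation's input shape:

```graphql
mutation PublishPodcast {
  publishContents(
    connector: {
      type: ELEVEN_LABS_AUDIO
      format: MP3
      elevenLabs: {
        model: ENGLISH_V1
        # A British male voice ID chosen from the ElevenLabs website
        voice: "VOICE_ID"
      }
    }
    # Your LLM publishing prompt goes here
    publishPrompt: "..."
    # The GPT-4 Turbo specification created above
    publishSpecification: { id: "SPECIFICATION_ID" }
  ) {
    id
  }
}
```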
Here is the publish prompt used for this podcast:
Graphlit will summarize each of the contents that we ingested, and then use the publishing prompt provided to generate the final script which is passed to the ElevenLabs text-to-speech API.
This process takes a few minutes to complete, and when the publishContents mutation finishes, your MP3 will be ready for download.
Get Published MP3
Once you've published your audio file, you can easily get a URI to download it.
Query the content via the id returned from publishContents, and the masterUri provides a temporary URI for download. Since the published content is re-ingested into Graphlit, you can look at transcriptUri for the audio transcript automatically generated from the MP3.
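A query for the published content might be sketched like this; masterUri and transcriptUri are named in the text, while the query name and id argument shape are assumptions:

```graphql
query GetPublishedContent {
  # Use the id returned from the publishContents mutation
  content(id: "CONTENT_ID") {
    id
    state
    masterUri      # Temporary URI for downloading the published MP3
    transcriptUri  # Transcript generated when the MP3 is re-ingested
  }
}
```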
By combining the power of LLMs, such as OpenAI GPT-4 Turbo, with the latest text-to-speech models from ElevenLabs, Graphlit helps you automate the creation of AI-generated podcasts, daily audio summaries, or any other audio renditions of your ingested contents.
Please email any questions on this tutorial or the Graphlit Platform to email@example.com.