From the course: Build with AI: Building a Project with the ChatGPT API
Generate audio from a text prompt
- [Narrator] Imagine your app could speak: not just return text, but actually respond with voice. Generating audio from text is a game-changer, whether you're building for accessibility, language learning, voice assistants, or interactive storytelling. Let's look at the audio API.
I've navigated to the Jupyter Notebook. You can find this in the course's GitHub repo. You're already familiar with the first few lines, where we install the necessary libraries. You're also familiar with the next section, where I load the API key from my local environment file and set up the client.
Here in section two, we generate audio from text. In the first line, I set up the path for the output file, and I want the output stored in speech.mp3. Next, I use the client to call the audio API, specifically the create function. I pass in the name of the model, gpt-4o-mini-tts. tts stands for text to speech. I'm selecting onyx as the voice, and I've included other voices that you can test. The input is "Welcome to your AI-powered app. Let's get started!"
In the next section, I play the audio in the notebook by importing Audio and passing it the speech.mp3 file. I've already executed all of the cells in this notebook, so let's play the audio to hear what it sounds like.
- [Voiceover] Welcome to your AI-powered app. Let's get started!
- [Narrator] You can also open the file here and click play.
- [Voiceover] Let's get started!
- [Narrator] Generating audio from text unlocks a new layer of user experience. It makes your applications more human, more accessible, and more engaging. A few common use cases include building an accessible chatbot with voice responses, adding narration to generated content, and creating onboarding or product walkthroughs with dynamic voiceover. In the next video, we'll take a look at the reverse: how to transcribe speech to text using OpenAI's powerful Transcription API.
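The notebook itself is not reproduced in this transcript, so the following is only a minimal sketch of what the cells described above might look like. It assumes the official `openai` Python package and an `OPENAI_API_KEY` environment variable; the helper names `make_client` and `generate_speech` are hypothetical and chosen for illustration, and the notebook may call the API slightly differently (for example, through a streaming response).

```python
from pathlib import Path


def make_client():
    """Create an OpenAI client (assumes `pip install openai` has been run)."""
    # Imported here so the rest of the sketch can be read and reused
    # without the package installed.
    from openai import OpenAI

    # With no arguments, the client reads OPENAI_API_KEY from the environment.
    return OpenAI()


def generate_speech(client, text, out_path="speech.mp3",
                    model="gpt-4o-mini-tts", voice="onyx"):
    """Ask the audio API to speak `text` and save the result to `out_path`.

    `generate_speech` is a hypothetical helper; the model and voice
    defaults are the ones named in the transcript.
    """
    out_path = Path(out_path)
    # The "create function" mentioned in the video: client.audio.speech.create
    response = client.audio.speech.create(
        model=model,
        voice=voice,
        input=text,
    )
    # Write the returned audio bytes to disk.
    response.write_to_file(out_path)
    return out_path


# Example usage (requires a valid API key, so it is left commented out):
#
#   client = make_client()
#   path = generate_speech(
#       client, "Welcome to your AI-powered app. Let's get started!"
#   )
#
# In a Jupyter notebook, the saved file can then be played inline:
#
#   from IPython.display import Audio
#   Audio(str(path))
```

Passing the client in as an argument, rather than creating it inside the function, keeps the helper easy to try with different voices and easy to exercise with a stand-in client when no API key is available.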
Contents
- Authenticate to the OpenAI API (3m 58s)
- Generate text with the Chat Completions API (4m 53s)
- Create an image from a prompt (5m 48s)
- Understand images using vision capabilities (6m 20s)
- Generate audio from a text prompt (2m 42s)
- Convert text and speech with the Transcriptions API (3m 49s)
- Create embeddings using the Embeddings API (4m 31s)
- Challenge: Build a movie script generator (2m 1s)
- Solution: Build a movie script generator (8m 56s)