From the course: Introduction to Conversational AI

The technology stack

- A lot happens behind the scenes to make conversational AI feel smooth and natural. Instead of dumping all of the parts on the table at once, let's build this picture together, step by step. Step one is capturing the input. If it's a voice conversation, everything starts with automatic speech recognition, or ASR. This is the AI's ears, turning sound waves into written text. These days, ASR is faster and more accurate than it used to be, and it can handle dozens of accents and dialects, and even background noise. But if you're typing, we skip ASR entirely, because the input is already text. Step two is understanding the words once they're in text form. At that point, they move into the AI's brain: natural language processing, or NLP. Today, NLP often runs on large foundation models, such as LLMs, or even multimodal models that can process text, audio, images, and video together. This is where the AI starts making sense of your message. Step three is…
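
The first two steps described above can be sketched in a few lines of Python. This is only an illustration of the routing idea (voice goes through ASR, typed text skips it, and both end up in the NLP step), not how any particular product is built. Every name here (`VoiceInput`, `TextInput`, `fake_asr`, `fake_nlp`, `capture_input`, `handle_turn`) is hypothetical, and the ASR and NLP functions are trivial stand-ins for real speech and language models.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class VoiceInput:
    audio: bytes  # raw audio captured from a microphone


@dataclass
class TextInput:
    text: str  # text the user typed


UserInput = Union[VoiceInput, TextInput]


def fake_asr(audio: bytes) -> str:
    """Hypothetical stand-in for an ASR engine.

    Assumption for this sketch: the "audio" is just UTF-8 bytes, so
    "transcribing" it means decoding it. A real system would call a
    speech recognition model here.
    """
    return audio.decode("utf-8")


def fake_nlp(text: str) -> dict:
    """Hypothetical stand-in for the NLP step.

    A single keyword rule replaces the LLM or multimodal model that a
    real system would use to interpret the message.
    """
    intent = "check_weather" if "weather" in text.lower() else "unknown"
    return {"text": text, "intent": intent}


def capture_input(user_input: UserInput,
                  asr: Callable[[bytes], str] = fake_asr) -> str:
    """Step one: voice goes through ASR; typed text skips it."""
    if isinstance(user_input, VoiceInput):
        return asr(user_input.audio)
    return user_input.text  # already text, no ASR needed


def handle_turn(user_input: UserInput,
                asr: Callable[[bytes], str] = fake_asr,
                nlp: Callable[[str], dict] = fake_nlp) -> dict:
    """Steps one and two: capture the input as text, then interpret it."""
    text = capture_input(user_input, asr)
    return nlp(text)


if __name__ == "__main__":
    print(handle_turn(TextInput("What's the weather like today?")))
    print(handle_turn(VoiceInput(b"Play some music")))
```

Passing the ASR and NLP steps in as functions keeps the routing logic separate from whichever models sit behind it, so either stand-in could be swapped for a real engine without touching `handle_turn`.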
