From the course: Oracle Cloud Infrastructure Generative AI Professional

Chat models

(bright music) - [Instructor] Welcome to this lesson on chat models available in the OCI Generative AI service. Before we dive deeper, let us look at tokens first. Large language models understand tokens rather than characters. One token can be part of a word, an entire word, or even a punctuation symbol. A common word such as "apple" is a single token. A word such as "friendship" is made up of two tokens, "friend" and "ship". The number of tokens per word depends on the complexity of the text. So for simple text, you can assume one token per word on average. For complex text, meaning text with less common words, you can assume two to three tokens per word on average. So for example, suppose you have a sentence like this, "Many words map to one token, but some don't: indivisible," and you run it through a tokenizer for a large language model. This is an example of what a tokenizer would do: it would break this particular sentence into multiple tokens. If you count, the total number of tokens is 15, whereas…
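
The rule of thumb from the lesson (about one token per word for simple text, two to three for complex text) can be sketched as a rough estimator. This is only an illustration of the heuristic, not how a real tokenizer works and not part of the OCI Generative AI service; the names `count_words` and `estimate_tokens` are hypothetical, and the word-splitting rule is an assumption made for this example.

```python
import re

# Averages taken from the lesson's rule of thumb; these are assumptions,
# not measurements from any specific model's tokenizer.
TOKENS_PER_WORD_SIMPLE = 1.0
TOKENS_PER_WORD_COMPLEX = (2.0, 3.0)


def count_words(text: str) -> int:
    """Count words, treating runs of letters, digits, and apostrophes as one word."""
    return len(re.findall(r"[A-Za-z0-9']+", text))


def estimate_tokens(text: str, complex_text: bool = False) -> tuple[int, int]:
    """Return a (low, high) token estimate based on the words-per-token heuristic.

    Punctuation is ignored here, although a real tokenizer may emit
    separate tokens for punctuation symbols.
    """
    words = count_words(text)
    if complex_text:
        low, high = TOKENS_PER_WORD_COMPLEX
    else:
        low = high = TOKENS_PER_WORD_SIMPLE
    return round(words * low), round(words * high)


sentence = "Many words map to one token, but some don't: indivisible."
print(count_words(sentence))                         # 10 words
print(estimate_tokens(sentence))                     # (10, 10)
print(estimate_tokens(sentence, complex_text=True))  # (20, 30)
```

The example sentence has 10 words, while the lesson reports 15 tokens from an actual tokenizer. The gap comes from punctuation and from words that split into several pieces, which is why the heuristic is only useful for rough sizing.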
