From the course: Python for AI Projects: From Data Exploration to Impact

Model metrics

- [Tutor] To evaluate any model, whether it's a traditional classifier or a powerful LLM, we need labeled reference data. That means we already have examples with a known good output, usually written or chosen by a human. In traditional NLP, a labeled example might be the sentence "The food was terrible and disgusting," paired with the label "negative sentiment." In LLM evaluation, labeled reference data often includes entire text outputs. These could be summaries, translations, or answers, each paired with the input it responds to. The input might be a complete customer review that talks about the dishes, how close the restaurant is to key public transport, and how the waiter was not very attentive. The reference summary could be as simple as "Convenient location, but service needs improvement." These references act as the gold standard, and we compare the model's output against them using specific metrics. When evaluating models, it's important to remember…
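As a rough illustration of comparing model output against labeled references, here is a minimal Python sketch. The transcript does not name specific metrics at this point, so the choices here are assumptions: plain accuracy for the sentiment labels, and a simple word-overlap F1 as a stand-in for summary metrics. The second review, the `toy_classifier` function, and the candidate summary are all hypothetical and exist only to make the example runnable.

```python
from collections import Counter

# Labeled reference data for a classifier: (input text, human-assigned label).
# The first example comes from the transcript; the second is hypothetical.
sentiment_data = [
    ("The food was terrible and disgusting.", "negative"),
    ("Lovely staff and a great menu.", "positive"),
]


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,!?").lower() for word in text.split()]


def toy_classifier(text):
    """Hypothetical stand-in model: negative if any cue word appears."""
    negative_cues = {"terrible", "disgusting", "awful"}
    return "negative" if set(tokenize(text)) & negative_cues else "positive"


def accuracy(examples, predict):
    """Fraction of examples where the prediction equals the reference label."""
    correct = sum(predict(text) == label for text, label in examples)
    return correct / len(examples)


def unigram_f1(candidate, reference):
    """Word-overlap F1 between a model output and a reference text.

    This is an assumed, simplified metric, not one named in the course.
    """
    cand_counts = Counter(tokenize(candidate))
    ref_counts = Counter(tokenize(reference))
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Reference summary from the transcript; the candidate is a hypothetical model output.
reference_summary = "Convenient location, but service needs improvement."
candidate_summary = "Good location, but the service needs improvement."

classifier_accuracy = accuracy(sentiment_data, toy_classifier)
summary_score = unigram_f1(candidate_summary, reference_summary)

print(f"Classifier accuracy: {classifier_accuracy:.2f}")
print(f"Summary overlap F1:  {summary_score:.2f}")
```

In both cases the human-provided label or summary is the fixed point of comparison; only the scoring function changes with the kind of output being evaluated.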
