From the course: Predictive Analytics with Categorical Data: Advanced Regression Methods for Real-World Applications
The problem with the linear probability model
- [Instructor] Imagine you're a data analyst at a healthcare company, and your task is to predict whether patients will adhere to their medication schedules based on a set of factors, such as age, gender, and health condition. You decide to use linear regression to predict this binary outcome: Will the patient adhere, yes or no? At first glance, everything seems fine, but numerous errors lurk beneath the surface. Investigating your results further reveals glaring issues with your approach. Without that investigation, you might have naively accepted incorrect results. In this lesson, I'll explore why a linear regression approach often falls short for binary data and why more advanced models, like logistic regression, are typically preferred. When you estimate a regression model that has a binary dependent variable using a linear estimation technique, such as ordinary least squares, the resulting model is called a linear probability model. In such a model…
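To make the definition concrete, here is a minimal sketch of a linear probability model: ordinary least squares fitted to a 0/1 outcome. The data are hypothetical toy values, not from the course's exercise files, and the names `x`, `y`, and `fitted` are chosen only for illustration. It shows one well-known consequence of fitting a straight line to a binary outcome: the fitted "probabilities" are not constrained to the interval from 0 to 1.

```python
import numpy as np

# Hypothetical toy data (an assumption for illustration only):
# x is a single numeric predictor, y is the binary outcome
# (1 = patient adhered, 0 = patient did not adhere).
x = np.arange(10, dtype=float)
y = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1], dtype=float)

# Design matrix with an intercept column followed by the predictor.
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares on a 0/1 outcome: this is the linear probability model.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# Fitted values are read as predicted probabilities of adherence.
fitted = X @ coef

# Flag fitted values that cannot be valid probabilities.
out_of_range = (fitted < 0) | (fitted > 1)

print("intercept:", round(intercept, 4), "slope:", round(slope, 4))
print("smallest fitted value:", round(fitted.min(), 4))
print("largest fitted value:", round(fitted.max(), 4))
print("fitted values outside [0, 1]:", int(out_of_range.sum()))
```

With these toy values, the line dips below 0 at the low end of `x` and rises above 1 at the high end, which is the kind of result that is easy to miss if you only look at the coefficient table.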