Newest 'regression' Questions

2 votes

1 answer

69 views

Why does the coefficient of a regressor increase with sample size when using poly() in lm() and glm()?

I’m trying to use the R poly() function with degree 1 to force glm to interpret a factor linearly. I’m puzzled by the fact that the size of the sample seems to increase the coefficient of the ...

Guillaume

43

asked 2 days ago

1 vote

0 answers

47 views

Prediction intervals for random-design simple linear regression

I am going through the creation of a prediction interval for a value drawn from the conditional distribution of $Y$ given $X=x$ under simple linear regression as shown in the image above. The ...

froot

83

asked Nov 20 at 17:22

0 votes

0 answers

26 views

True slope parameter for quantile regression with heterogeneous error

I am trying to perform a Monte-Carlo simulation on quantile regression using R. Currently I am getting stuck simulating the data from the model below. ...

UNI39

11

asked Nov 20 at 1:01

4 votes

2 answers

109 views

+100

Narrow vs Broad-based U-shape comparisons

I’m modeling mortality using a multivariate logistic regression model with a nonlinear effect of X1 and I’m examining whether this relationship changes across ...

Konstantinos Gkirgkiris

451

asked Nov 18 at 20:06

0 votes

1 answer

47 views

Using ordinal logistic regression to extract insights with imbalanced data

I am attempting to understand how each independent variable affects the probability of each dependent variable, which are ordinal (0, 1 and 2). Therefore, I am attempting to use ordinal logistic ...

92carmnad

11

asked Nov 17 at 17:05

4 votes

4 answers

318 views

+50

Borderline interaction p value

I’m working on a logistic regression model where I want to examine whether the effect of one continuous predictor (X1) on a binary outcome depends on another ...

Konstantinos Gkirgkiris

451

asked Nov 15 at 19:59

1 vote

1 answer

35 views

Interpretation of LMM output with scaled predictors

I'm running a linear mixed model, in which I have included a few categorical variables - time, sex - with two levels, as well as three continuous nutrition variables as fixed effects and their ...

Ekiboi

41

asked Nov 14 at 17:00

0 votes

0 answers

30 views

Partialing out a time-trend I'm unable to evaluate

I am investigating the influence of policy X on grade outcomes. Earlier research was able to utilise a partial implementation of policy X in the population of interest to establish a natural ...

SharpShimmer

1

asked Nov 14 at 0:16

1 vote

0 answers

60 views

How to analyze the influence of a variable on an outcome in clinical pre-post data (if the pre measurement as predictor isn't enough)?

I’m trying to get a better grasp of how to handle an issue in pre–post observational data. Let’s say I have data from a rehab center with measures at admission and discharge (only these two ...

querent

11

asked Nov 13 at 12:09

0 votes

0 answers

60 views

What regression model do I choose for my DV?

My data is a ratio of: perceived time elapsed/actual time elapsed. Now this ranges from 0 to +infinity. It is a continuous positive number. My experiment is mixed model (with within and between subject ...

the_bluestreak

11

asked Nov 13 at 8:44

5 votes

1 answer

216 views

Number of knots in splines (internal vs total)

I’m trying to understand how natural cubic splines (splines::ns) and restricted cubic splines (rms::rcs) handle knots — ...

Konstantinos Gkirgkiris

451

asked Nov 12 at 15:27

2 votes

0 answers

41 views

How to choose features for a Gamma regression, vs. Linear Regression

I'm new to using GLMs which are not Linear Regression, and am working on a project where I am using Gamma regression with a log-link. I'm having problems with the feature engineering step. With linear ...

michael james

21

asked Nov 10 at 20:08

3 votes

2 answers

159 views

Understanding and interpreting Cox Regression when using ordered factors

I am trying to understand ordered factors (polynomial terms) and their interpretation in Cox Proportional Hazards regression model. I know when using lm() to fit ...

SIO

133

asked Nov 10 at 0:04

0 votes

0 answers

38 views

Maximum likelihood estimation for linear regression [duplicate]

When conducting maximum likelihood estimation for simple linear regression whilst considering the regressors as random, the joint distribution of $f_{X,Y}(x,y;\theta) = f_{Y|X}(y|x;\theta) * f_{X}(x;\...

froot

83

asked Nov 7 at 19:53

1 vote

0 answers

49 views

What is the best statistical approach to forecast cash flow from run-off debt vintages with a growing balance?

community. I'm facing a modeling problem for cash flow forecasting and would like to know what the most robust mathematical/statistical approach is to solve it. The Problem: Debt Recovery Forecasting ...

sn3fru

215

asked Nov 4 at 21:54

1 vote

1 answer

164 views

Does strict exogeneity imply uncorrelation among error terms?

Does the strict exogeneity assumption of OLS $ \mathbb{E} [\epsilon \mid X ] = 0 $ imply that the error terms of different observations are uncorrelated with one another, that is $ \text{Cov}( \...

robertspierre

3,403

asked Nov 3 at 6:24

2 votes

2 answers

99 views

lm() and glm() equivalence for log-transformed response variable [duplicate]

I can't seem to wrap my head around this: What is the glm() equivalent for lm(log(y) ~ x1 + x2, data=data)? Is it? a. ...

Mubita

121

asked Nov 2 at 9:34

1 vote

1 answer

93 views

Multicollinearity in logistic regression

I would like to check for multicollinearity of the independent variables in a binary logistic regression. Some independent variables are binary (coded 0, 1), others are polytomous (converted to dummy) ...

José Luis

21

asked Oct 30 at 8:15

0 votes

0 answers

19 views

Non-linear regression for modeling accuracy of ML models

Suppose I have a slow model with accuracy of between 75 and 80 %. I want to approximate this model with faster models. Fast models require $e$ effort and the more effort the better. I want to estimate ...

Gaslight Deceive Subvert

517

asked Oct 27 at 15:49

2 votes

1 answer

240 views

Calculating standard errors in least squares and the normality assumption

The question titled “How are the standard errors of coefficients calculated in a regression?” is asking how the standard errors of regression coefficient estimates are computed (for example, the ...

Laut567

83

asked Oct 25 at 8:11

0 votes

1 answer

28 views

In linear regression, what changes when you use robust standard errors to overcome non-constant variance?

In my first course on linear regression, I learned the 4 basic assumptions that every textbook teaches: linearity, independence, homoscedasticity, and normality. However, I recently learned about ...

Iterator516

411

asked Oct 24 at 3:40

5 votes

1 answer

122 views

Is there a "better" approach when it comes to model evaluation on multiple test datasets?

I have two models trained and validated on the same training/validation data. Now I need to evaluate them on multiple independent test datasets (e.g., 10 different datasets of the same measure). Which ...

user26416177

131

asked Oct 23 at 16:28

2 votes

1 answer

110 views

In a regression, does AIC tell anything that the mean squared error does not, except for the penalty for more variables?

The equation for AIC is $$\mathrm{AIC} = n\ln(\mathrm{MSE})+2k$$ where $n$ = number of observations, $\mathrm{MSE}$ = mean squared error, and $k$ = number of parameter estimates. The way I ...

doubtful_noob

167

asked Oct 21 at 17:44

1 vote

0 answers

110 views

Bias in standard error of regression slope with not-independent data and effective sample size

Consider a sample of $N/2$ pairs of individuals. Each pair belongs to a group $j$. For each individual $i$ from the $N$ sample, I measure two variables ($y_{i}$ and $x_{i}$) and the average per group $...

CafféSospeso

267

asked Oct 21 at 13:24

0 votes

0 answers

41 views

Different optimal elbow points for different values of a second continuous variable in a regression model

I am analyzing the relationship between age, education, and the probability of having a high income (>50K) using data from the UCI Adult dataset. I've fit a logistic regression model with a natural ...

Konstantinos Gkirgkiris

451

asked Oct 15 at 17:41
