Statistically Significant Linear Regression Coefficients Solely Driven by Outliers in Finite-Sample Inference

Felix Reichel

Abstract

We investigate the impact of outliers on the statistical significance of coefficients in linear regression. Using numerical simulation in R, we demonstrate that a single outlier can make an otherwise insignificant coefficient appear statistically significant. We compare ordinary least squares with robust Huber regression, which reduces the influence of outliers. We then approximate the influence of a single outlier on the estimated regression coefficients and discuss common diagnostic statistics for detecting influential observations (e.g., studentized residuals). Furthermore, we relate this issue to the optional normality assumption in simple linear regression[1], which is required for exact finite-sample inference but is asymptotically justified for large \(n\) by the Central Limit Theorem (CLT). We also address the general danger of relying solely on p-values without performing adequate regression diagnostics. Finally, we provide a brief overview of regression methods and discuss how they relate to the assumptions of the Gauss-Markov theorem.
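The central claim of the abstract can be illustrated with a minimal sketch. This is not the paper's R simulation: it is a standard-library Python example on hypothetical, deterministic data, built so that the clean sample has a slope of exactly zero. The function name `slope_t_stat`, the data values, and the outlier at (40, 25) are assumptions chosen for illustration. The critical values are the standard two-sided 5% quantiles of the t distribution for 18 and 19 degrees of freedom.

```python
import math


def slope_t_stat(x, y):
    """Return (slope, t statistic) for the slope in a simple OLS fit of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and the usual standard error of the slope.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se


# Hypothetical clean sample: the +1, -1, -1, +1 pattern is orthogonal to x,
# so the fitted slope is exactly zero.
x_clean = [float(i) for i in range(20)]
y_clean = [1.0, -1.0, -1.0, 1.0] * 5

# Same sample plus one high-leverage outlier (an assumed, illustrative point).
x_out = x_clean + [40.0]
y_out = y_clean + [25.0]

# Two-sided 5% critical values of the t distribution, keyed by degrees of freedom.
T_CRIT = {18: 2.101, 19: 2.093}

slope_clean, t_clean = slope_t_stat(x_clean, y_clean)
slope_out, t_out = slope_t_stat(x_out, y_out)

significant_clean = abs(t_clean) > T_CRIT[len(x_clean) - 2]
significant_out = abs(t_out) > T_CRIT[len(x_out) - 2]

print(f"clean:        slope={slope_clean:.3f}, t={t_clean:.2f}, significant={significant_clean}")
print(f"with outlier: slope={slope_out:.3f}, t={t_out:.2f}, significant={significant_out}")
```

In this constructed case the slope test is not significant on the clean sample and becomes significant once the single outlier is added, which is the kind of result that regression diagnostics or a robust fit are meant to flag.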

Version published to 10.32388/kszxtm
May 19, 2025

Related articles

Using heteroskedasticity-consistent standard errors and the bootstrap for linear regression analysis in SPSS: A tutorial

This article has 3 authors:
1. Hanna Rajh-Weber
2. Stefan E. Huber
3. Martin Arendasy
This article has no evaluations. Latest version: Jun 3, 2025

Reinforcing Moving Linear Model Approach: Theoretical Assessment of Parameter Estimation and Outlier Detection

This article has 1 author:
1. Koki Kyo
This article has no evaluations. Latest version: Jun 20, 2025

Bayesian variable selection in high-dimensional ordinal quantile regression models

This article has 3 authors:
1. Mai Dao
2. Md Sakhawat Hossain
3. Zhuanzhuan Ma
This article has no evaluations. Latest version: Jun 3, 2025
