Linear regression modeling with high-dimensional or multicollinear data sets is prone to overfitting. As a general rule, the more parameters a model uses to predict its target, the higher the risk of overfitting. More parameters can either take the form of more features in your data set, or of more internal model flexibility if you are working with nonlinear methods.
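To make this concrete, here is a minimal sketch of the problem. It is not taken from the paper: the simulated data, sample sizes and seed are all illustrative assumptions. Ordinary least squares is fit to a small sample with many highly correlated features, and the in-sample fit is typically far better than the out-of-sample fit:

```python
# Illustrative sketch (not from the paper): OLS overfitting a small,
# multicollinear data set. All sizes and seeds are arbitrary choices.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p = 120, 40
z = rng.normal(size=(n, 1))                   # common latent factor
X = z + 0.2 * rng.normal(size=(n, p))         # 40 highly correlated features
y = 2 * z[:, 0] + rng.normal(size=n)          # target driven by the factor

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
ols = LinearRegression().fit(X_tr, y_tr)
print("train R^2:", round(ols.score(X_tr, y_tr), 2))   # in-sample fit
print("test  R^2:", round(ols.score(X_te, y_te), 2))   # typically much lower
```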
The number of parameters that a model effectively uses is referred to as the model's effective degrees of freedom.
There are many different approaches to reducing the model's degrees of freedom (this is arguably a large part of what the disciplines of machine learning and Bayesian analysis are all about). In the linear regression context these include subset selection, shrinkage methods such as the ridge and lasso, and dimension reduction techniques such as principal components regression.
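Shrinkage is the branch that is easiest to demonstrate in a few lines. The sketch below is again my own illustration, and the penalty strength alpha is an arbitrary, untuned value. The ridge pulls all coefficients towards zero, which reduces the effective degrees of freedom and usually improves the out-of-sample fit on collinear data:

```python
# Illustrative sketch: ridge shrinkage towards zero on collinear data.
# The penalty alpha = 10.0 is an assumed, untuned value.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n, p = 120, 40
z = rng.normal(size=(n, 1))
X = z + 0.2 * rng.normal(size=(n, p))          # multicollinear design
y = 2 * z[:, 0] + rng.normal(size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=1)
ols = LinearRegression().fit(X_tr, y_tr)
ridge = Ridge(alpha=10.0).fit(X_tr, y_tr)

print("OLS   test R^2:", round(ols.score(X_te, y_te), 2))
print("Ridge test R^2:", round(ridge.score(X_te, y_te), 2))
print("coefficient norms (OLS vs ridge):",
      round(float(np.linalg.norm(ols.coef_)), 2),
      round(float(np.linalg.norm(ridge.coef_)), 2))
```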
In this recently published paper, I introduce the Hierarchical Feature Regression (HFR) which takes a somewhat different approach (Pfitzinger 2024). Instead of shrinking parameters towards zero, it shrinks them towards group target values. The idea is simple: if the effect of two features on the target is similar, their parameters should be similar. This type of grouping is highly intuitive and can lead to robust out-of-sample predictions.
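To convey the grouping intuition only, and explicitly not the HFR algorithm itself, the toy sketch below penalizes the deviation of each coefficient from its group mean. The group memberships are assumed known in advance, whereas the HFR estimates the grouping structure from the data; the function name group_shrinkage_regression and all numbers are purely illustrative:

```python
# Toy illustration of shrinking coefficients towards group targets.
# This is NOT the HFR estimator: the groups are assumed known here.
import numpy as np

def group_shrinkage_regression(X, y, groups, lam):
    """Solve min_b ||y - X b||^2 + lam * ||b - G b||^2,
    where G averages coefficients within each group."""
    p = X.shape[1]
    G = np.zeros((p, p))
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        G[np.ix_(idx, idx)] = 1.0 / len(idx)   # within-group averaging operator
    P = np.eye(p) - G                           # deviation from the group mean
    return np.linalg.solve(X.T @ X + lam * P, X.T @ y)

rng = np.random.default_rng(2)
n, p = 200, 6
X = rng.normal(size=(n, p))
beta = np.array([1.0, 1.1, 0.9, -0.5, -0.6, -0.4])   # two blocks of similar effects
y = X @ beta + rng.normal(size=n)

groups = np.array([0, 0, 0, 1, 1, 1])
print(np.round(group_shrinkage_regression(X, y, groups, lam=0.0), 2))    # plain OLS
print(np.round(group_shrinkage_regression(X, y, groups, lam=200.0), 2))  # shrunk towards group means
```

Increasing lam pulls each block of coefficients towards a common value rather than towards zero, which mirrors the kind of shrinkage target described above; the difference is that the HFR does not require the groups to be specified up front.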