A comparative study between shrinkage methods (ridge-lasso) using simulation
DOI: https://doi.org/10.21533/pen.v11.i2.98

Abstract
The general linear model is widely used in many scientific fields, especially the biological sciences. The Ordinary Least Squares (OLS) estimators of the coefficients of the general linear model possess the desirable properties summarized by the acronym BLUE (Best Linear Unbiased Estimator), provided that the basic assumptions underlying the model are satisfied. When one of these assumptions fails, the resulting estimators may have low bias but high variance, which leads to poor performance in both prediction and interpretation of the model. The assumption that no multiple linear relationships exist among the explanatory variables is one of the leading assumptions on which the model is built; when multicollinearity is present, the model yields misleading results and wide confidence limits for the estimators associated with the correlated variables. Shrinkage methods are among the most effective and preferred ways to deal with multicollinearity: they address the problem by reducing the variance of the estimators in the model. Ridge and Lasso regression are the best known and most common of these shrinkage methods.

The simulation was carried out for different sample sizes (n = 40, 120, 200) and numbers of explanatory variables (p = 30, 60) in the first and second experiments, at low, medium, and high correlation levels (0.2, 0.5, 0.8). For both p = 30 and p = 60, the Lasso method attained a smaller mean squared error (MSE) than the Ridge method, confirming its efficiency. The optimal penalty parameter (λ) was chosen by cross-validation, minimizing the MSE of prediction. In the cross-validation plots, where the top axis indicates the number of variables in the model, the MSE rises rapidly for both Ridge and Lasso; and as the correlation between variables and the sample size increase, the MSE grows faster for Ridge than for Lasso. Ridge gives greater efficiency when the sample size exceeds the number of variables (p < n), but it cannot shrink coefficients to exactly zero. As the ridge penalty increases, the flexibility of the ridge coefficients decreases, so the variance falls while the bias rises; the MSE first remains relatively constant and then increases rapidly.
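As context for the comparison above: Ridge penalizes the sum of squared coefficients (an L2 penalty), so it shrinks all coefficients toward zero but never exactly to zero, while Lasso penalizes the sum of absolute coefficients (an L1 penalty) and can set some coefficients exactly to zero. The sketch below is a minimal illustration of one cell of such a simulation, not the authors' code: it generates equicorrelated predictors, selects λ by cross-validation for both methods, and compares test-set MSE. The settings (n, p, the correlation ρ, the sparse true coefficients, and the use of scikit-learn) are all illustrative assumptions.

```python
# Minimal sketch (illustrative, not the authors' code): compare Ridge and Lasso
# on simulated correlated data, with the penalty chosen by cross-validation.
import numpy as np
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p, rho = 120, 30, 0.8             # one (n, p, correlation) cell of the design (assumed)

# Equicorrelated predictors: variance 1 on the diagonal, correlation rho elsewhere.
cov = np.full((p, p), rho) + (1 - rho) * np.eye(p)
X = rng.multivariate_normal(np.zeros(p), cov, size=n)

beta = np.zeros(p)
beta[:5] = [3.0, 2.0, 1.5, 1.0, 0.5]  # sparse true signal (an assumption for illustration)
y = X @ beta + rng.normal(scale=1.0, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Candidate penalty values; scikit-learn calls the penalty weight "alpha" (the λ above).
alphas = np.logspace(-3, 2, 100)
ridge = RidgeCV(alphas=alphas).fit(X_tr, y_tr)
lasso = LassoCV(alphas=alphas, cv=5, max_iter=50_000).fit(X_tr, y_tr)

for name, model in [("Ridge", ridge), ("Lasso", lasso)]:
    mse = mean_squared_error(y_te, model.predict(X_te))
    nonzero = int(np.sum(np.abs(model.coef_) > 1e-8))
    print(f"{name}: test MSE = {mse:.3f}, nonzero coefficients = {nonzero}")
```

With a sparse true signal and strong correlation, this setup typically reproduces the pattern reported in the abstract: Lasso attains a lower test MSE and a genuinely sparse model, while Ridge keeps all p coefficients nonzero.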
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.