A comparative study between shrinkage methods (ridge-lasso) using simulation

Zainab Fadhil Ghareeb, Suhad Ali Shaheed AL-Temimi

Abstract


The general linear model is widely used in many scientific fields, especially the biological sciences. Provided the basic assumptions of the model are met, the Ordinary Least Squares (OLS) estimators of its coefficients have the desirable properties summarized by the acronym BLUE (Best Linear Unbiased Estimator). When one of these assumptions fails, the estimators can have low bias but high variance, which leads to poor performance in both prediction and interpretation of the model. One of the most important assumptions is that there is no multicollinearity, that is, no linear relationships among the explanatory variables. When this assumption is violated, the results are misleading and the confidence limits of the estimators associated with the affected variables are wide. Shrinkage methods are among the most effective and preferred ways to deal with the multicollinearity problem; they address it by reducing the variance of the estimators in the model. Ridge and Lasso are the most common of these shrinkage methods. The simulation was carried out for different sample sizes (n = 40, 120, 200), an arbitrarily chosen number of variables (p = 30 in the first experiment and p = 60 in the second), and low, medium, and high correlation coefficients (0.2, 0.5, 0.8). For both p = 30 and p = 60, the Lasso method has a smaller mean squared error (MSE) than the Ridge method, and it proved its efficiency by attaining the lowest MSE. The optimal penalty parameter (λ) was chosen by cross-validation, by minimizing the MSE of prediction.
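For reference, the two shrinkage estimators and the cross-validation rule for λ can be written in their standard textbook form. The notation here is an assumption and is not taken from the paper: y is the response vector, X the n × p matrix of explanatory variables, F_k the k-th of K validation folds, and \hat{\beta}_{\lambda}^{(-k)} the estimate fitted without fold k.

```latex
\hat{\beta}^{\text{ridge}}_{\lambda}
  = \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2
  = (X^{\top}X + \lambda I_p)^{-1} X^{\top} y,
\qquad
\hat{\beta}^{\text{lasso}}_{\lambda}
  = \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1,

\hat{\lambda}
  = \arg\min_{\lambda}\; \frac{1}{K} \sum_{k=1}^{K} \frac{1}{\lvert F_k \rvert}
    \sum_{i \in F_k} \bigl( y_i - x_i^{\top} \hat{\beta}_{\lambda}^{(-k)} \bigr)^2 .
```

The squared (ℓ2) penalty shrinks all coefficients toward zero without removing any, whereas the absolute-value (ℓ1) penalty can set some coefficients exactly to zero.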
The MSE increases rapidly for both Ridge and Lasso (the top axis indicates the number of model variables), and as the correlation between the variables and the sample size increase, the MSE values increase more for the Ridge method than for the Lasso method. The Ridge method is more efficient when the sample size is larger than the number of variables (p
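A simulation of the kind described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the equicorrelated design, the sparse true coefficient vector, the noise level, the λ grid, the five contiguous folds, and all function names are hypothetical choices made for the example. Ridge is fitted in closed form and Lasso by cyclic coordinate descent, both without an intercept.

```python
import numpy as np

def simulate(n, p, rho, rng, n_nonzero=5, noise_sd=1.0):
    """Equicorrelated predictors (pairwise correlation rho) and a linear response."""
    shared = rng.standard_normal((n, 1))
    X = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:n_nonzero] = 1.0  # assumed true coefficients, for illustration only
    y = X @ beta + noise_sd * rng.standard_normal(n)
    return X, y, beta

def ridge_fit(X, y, lam):
    """Minimize ||y - Xb||^2 + lam * ||b||_2^2 via the closed form."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def soft_threshold(z, t):
    """Shrink z toward zero by t, setting it to zero when |z| <= t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_fit(X, y, lam, n_iter=100):
    """Minimize ||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    p = X.shape[1]
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    r = y.astype(float).copy()  # running residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]  # drop coordinate j from the fit
            b[j] = soft_threshold(X[:, j] @ r, lam / 2.0) / col_sq[j]
            r -= X[:, j] * b[j]  # put the updated coordinate back
    return b

def cv_mse(fit, X, y, lam, k=5):
    """Mean squared prediction error averaged over k contiguous folds."""
    idx = np.arange(len(y))
    errors = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        b = fit(X[train], y[train], lam)
        errors.append(np.mean((y[test] - X[test] @ b) ** 2))
    return float(np.mean(errors))

def best_lambda(fit, X, y, grid):
    """Return the grid value with the smallest cross-validated MSE, and that MSE."""
    scores = [cv_mse(fit, X, y, lam) for lam in grid]
    i = int(np.argmin(scores))
    return grid[i], scores[i]

rng = np.random.default_rng(0)
X, y, beta = simulate(n=40, p=30, rho=0.8, rng=rng)
grid = [0.01, 0.1, 1.0, 10.0, 100.0]
for name, fit in [("ridge", ridge_fit), ("lasso", lasso_fit)]:
    lam, score = best_lambda(fit, X, y, grid)
    print(name, "lambda:", lam, "CV MSE:", round(score, 3))
```

Repeating the last block over n in (40, 120, 200), p in (30, 60), and rho in (0.2, 0.5, 0.8), and averaging over many replications, would give a comparison table in the spirit of the study. The numbers depend on the assumed coefficient vector and noise level, so they should not be read as a reproduction of the reported results.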



DOI: http://dx.doi.org/10.21533/pen.v11i2.3472



Copyright (c) 2023 Zainab Fadhil Ghareeb, Suhad Ali Shaheed AL-Temimi

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN: 2303-4521

Digital Object Identifier DOI: 10.21533/pen
