Tuning parameter selection for the bridge penalty based on particle swarm optimization

The bridge penalty is widely used for selecting and shrinking predictors in regression models, but its effectiveness is sensitive to the values chosen for its shrinkage and tuning parameters. In this paper, the shrinkage and tuning parameters of the bridge penalty are chosen concurrently, and a continuous optimization method, particle swarm optimization, is proposed as the means to do so. The proposed method facilitates regression modeling with good prediction performance. Results from several simulation settings and a real data application indicate that the proposed method is effective in comparison with other well-known methods.


Introduction
Technological progress has generated and amassed a large number of variables in many real-world scientific, economic, and technological applications. Having too many variables can cause a linear regression model to overfit, and multicollinearity among them leads to large prediction errors in the estimated parameters. Moreover, even when a large number of variables is available for regression modeling, many of them may not be relevant to the response variable, and their inclusion can drastically reduce prediction accuracy. Variable selection is therefore crucial for accurate regression modeling: its purpose is to reduce the number of variables in the model in order to increase predictive power and simplify interpretation. In these situations, the computational cost of conventional subset selection techniques, such as backward elimination, forward selection, and stepwise selection, becomes prohibitive.

Penalized approaches provide a powerful framework for performing variable selection and model estimation simultaneously. These strategies add a penalty term to the loss function of the regression model, which allows the user to fine-tune the balance between the bias and the variance of the selected model. The bridge penalty [1], LASSO [2], SCAD [3], the elastic net [4], and the adaptive LASSO [5] are only a few of the penalties that have been proposed and developed. Specifically, Frank and Friedman (1993) suggested the bridge penalty, which is added to the loss function of the regression model. The L2-norm (ridge) penalty and the L1-norm (LASSO) penalty are both special cases of the bridge penalty, obtained with γ = 2 and γ = 1, respectively. To maximize the effectiveness of the bridge penalty, it is crucial to choose the tuning parameter appropriately. The tuning parameter selection problem for the bridge penalty can be addressed with the data-driven cross-validation (CV) approach; unfortunately, CV is notorious for its high computational cost and variability [2,6,7].

In this paper, a continuous nature-inspired approach, particle swarm optimization, is proposed for determining the tuning parameters of the bridge penalty. The proposed method efficiently locates the most important variables in the regression model while achieving high prediction accuracy. Its advantages are demonstrated on a variety of synthetic data sets and a real-world data set.

The remainder of this paper is organized as follows. Sections 2 and 3 detail the bridge penalty regression model and its features. Section 4 covers the details of the proposed method. Section 5 illustrates the proposed method through simulation studies and a real data application. Section 6 presents concluding remarks.

Regression model with bridge penalty
Consider a data set $\{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$, where $y_i$ stands for the response variable and $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^{T}$ is the vector of predictors. The linear regression model is

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}. \qquad (1)$$

The parameter estimates of Eq. (1) can be shrunk toward zero by means of the bridge penalty, first proposed by Frank and Friedman in 1993 [1]. The bridge penalized regression coefficients can be written as

$$\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \left\{ \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|_2^{2} + \lambda \sum_{j=1}^{p} |\beta_j|^{\gamma} \right\}, \qquad (2)$$

where λ > 0 is the shrinkage parameter and γ > 0 is the tuning parameter. The bridge estimator of Eq. (2) with 0 < γ < 1 can correctly select the predictors with non-zero regression coefficients and, under suitable conditions, enjoys the oracle properties [8,9,10,11]. The estimator is computed iteratively as follows:

(1) Set values for λ and γ.
(2) Set the initial coefficient vector β^(0).
(3) Update the coefficient vector according to Eq. (5).
(4) Iterate Eq. (5) until the convergence condition $\|\boldsymbol{\beta}^{(t+1)} - \boldsymbol{\beta}^{(t)}\| \le \epsilon$ is satisfied, where ε represents a small positive value, set to $10^{-5}$ in this paper.
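To make the estimation procedure concrete, the following is a minimal Python sketch of a bridge fit, assuming a local quadratic approximation (iteratively reweighted ridge) update in place of Eq. (5), whose exact form is not reproduced here; the convergence tolerance follows the value above.

```python
import numpy as np

def bridge_fit(X, y, lam, gamma, eps=1e-5, max_iter=200, delta=1e-8):
    """Bridge-penalized least squares via a local quadratic approximation.

    Minimizes ||y - X b||^2 + lam * sum(|b_j|^gamma).  The update below is an
    assumed iteratively reweighted ridge step, not the paper's Eq. (5).
    """
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # step (2): initial vector
    for _ in range(max_iter):
        # weights |b_j|^(gamma-2) of the quadratic majorizer (delta avoids division by zero)
        w = gamma * (np.abs(beta) + delta) ** (gamma - 2.0)
        D = np.diag(lam * w / 2.0)
        beta_new = np.linalg.solve(X.T @ X + D, X.T @ y)  # step (3): reweighted ridge update
        if np.linalg.norm(beta_new - beta) < eps:         # step (4): convergence check
            return beta_new
        beta = beta_new
    return beta
```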

Selection criteria of λ and γ
Accurate selection of λ and γ is critical, because it has a strong impact on the efficiency of the bridge estimator. The most popular approaches in the literature for estimating λ and γ are cross-validation (CV), generalized cross-validation (GCV), and information criteria such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). These criteria are defined in Eqs. (7)-(12) following [7], where $A$ denotes the set of selected predictors, $|A|$ represents the cardinality of $A$, and $J$ is a $(|A|+1) \times (|A|+1)$ matrix defined as in [7].
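The sketch below illustrates how such criteria can be computed for a fitted bridge model; the GCV and BIC forms and the degrees-of-freedom approximation used here are generic assumptions, not the exact Eqs. (7)-(12) of [7].

```python
import numpy as np

def gcv_bic_scores(X, y, beta, lam, gamma, delta=1e-8):
    """Generic GCV and BIC scores for a fitted bridge model (assumed forms,
    not the exact criteria of Eqs. (7)-(12))."""
    n = len(y)
    resid = y - X @ beta
    rss = float(resid @ resid)
    active = np.abs(beta) > 1e-6                     # selected predictors (set A)
    if active.any():
        Xa = X[:, active]
        w = gamma * (np.abs(beta[active]) + delta) ** (gamma - 2.0)
        # effective degrees of freedom: trace of the reweighted ridge hat matrix
        H = Xa @ np.linalg.solve(Xa.T @ Xa + np.diag(lam * w / 2.0), Xa.T)
        df = float(np.trace(H))
    else:
        df = 0.0
    gcv = (rss / n) / (1.0 - df / n) ** 2
    bic = n * np.log(rss / n) + np.log(n) * df
    return gcv, bic
```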

The proposed method
Metaheuristic algorithms, of which evolutionary algorithms are an example, have gained popularity in recent years owing to their effectiveness in addressing difficult optimization problems [12]. Particle swarm optimization (PSO) is one of these algorithms; it is powerful yet simple to implement [13]. Eberhart and Kennedy proposed the PSO algorithm in 1995 [14]. Animal social behaviors, such as the schooling of fish and the flocking of birds, were the primary source of inspiration for PSO.
In PSO, the swarm is made up of many distinct particles, each of which is treated as an independent entity, and the solution space of the problem is represented as a search space. Each particle has a velocity, a position, and a fitness value that is evaluated by a fitness function while the swarm moves through a search space with $d$ dimensions. Particles move according to their individual velocities. At each iteration of the algorithm, the particle motion is determined as follows:

$$\mathbf{v}_i^{t+1} = w\,\mathbf{v}_i^{t} + k_1 r_1 \left(Pbest_i^{t} - \mathbf{x}_i^{t}\right) + k_2 r_2 \left(Gbest^{t} - \mathbf{x}_i^{t}\right), \qquad (13)$$

$$\mathbf{x}_i^{t+1} = \mathbf{x}_i^{t} + \mathbf{v}_i^{t+1}, \qquad (14)$$

where $\mathbf{x}_i^{t}$ and $\mathbf{v}_i^{t}$ stand for the position and the velocity of particle $i$ at iteration $t$, $Pbest_i^{t}$ is the best position found so far by particle $i$, and $Gbest^{t}$ is the best position found by the whole swarm. In addition, $w$ is the inertia weight, and $k_1$ and $k_2$ are the acceleration coefficients, while $r_1$ and $r_2$ are values drawn at random from a uniform distribution between 0 and 1. The fitness of each particle is computed with the objective function of the minimization (or maximization) problem, and the best positions of each particle and of the swarm as a whole are updated at each iteration: for a minimization problem, $Pbest_i^{t+1}$ is set to the new position when it improves the fitness and kept unchanged otherwise, while $Gbest^{t+1}$ is the best among all personal best positions. The pseudo code of the PSO is shown in Figure 1.
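As an illustration, a generic PSO minimizer implementing the updates in Eqs. (13) and (14) can be sketched as follows; the inertia weight, acceleration coefficients, and bounds handling used here are illustrative choices, not necessarily those of the paper.

```python
import numpy as np

def pso_minimize(fitness, lower, upper, n_particles=30, t_max=100,
                 w=0.7, k1=2.0, k2=2.0, seed=0):
    """Generic PSO for minimization, following the velocity/position updates
    of Eqs. (13)-(14); constants are illustrative."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    d = lower.size
    x = rng.uniform(lower, upper, size=(n_particles, d))   # initial positions
    v = rng.uniform(0.0, 4.0, size=(n_particles, d))       # initial velocities
    pbest = x.copy()
    pbest_f = np.array([fitness(p) for p in x])
    gbest = pbest[np.argmin(pbest_f)].copy()
    for _ in range(t_max):
        r1 = rng.random((n_particles, d))
        r2 = rng.random((n_particles, d))
        v = w * v + k1 * r1 * (pbest - x) + k2 * r2 * (gbest - x)   # Eq. (13)
        x = np.clip(x + v, lower, upper)                            # Eq. (14)
        f = np.array([fitness(p) for p in x])
        improved = f < pbest_f                      # update personal bests
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        gbest = pbest[np.argmin(pbest_f)].copy()    # update global best
    return gbest, float(pbest_f.min())
```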
For the bridge penalty, we have two parameters, λ and γ. Each of these parameters is treated as a coordinate of the position in PSO, so each particle in the swarm searches over a two-dimensional position. Consequently, our proposed algorithm is as follows (a sketch of the complete procedure is given after the steps):

Step 1: The number of particles, $b$, is set to 30, the maximum number of iterations is fixed at $t_{\max}$, and the inertia weight $w$ is updated at each iteration.

Step 2: All of the particle positions are chosen at random. The position for λ is generated at random between 0 and 1000, and the position for γ is generated from a uniform distribution between 0.1 and 4. The particle positions are represented graphically in Figure 2; the first and second coordinates of a particle stand for the λ and γ values, respectively.

Step 3: The initial velocity of each particle is generated at random from a uniform distribution on the interval [0, 4].

Step 4: The fitness of each particle is evaluated with the fitness function.

Step 5: Using Eqs. (13) and (14), the particle velocities and positions are updated.

Step 6: Steps 4 and 5 are repeated until $t_{\max}$ is reached.
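The following sketch wires the steps together, reusing the `bridge_fit` and `pso_minimize` sketches above; since the paper's exact fitness function (Step 4) is not reproduced here, K-fold cross-validated prediction error is used as an assumed stand-in.

```python
import numpy as np

def make_bridge_fitness(X, y, n_folds=5, seed=1):
    """Fitness of a particle position (lam, gamma): K-fold prediction error of
    the bridge fit.  An assumed stand-in for the paper's Step 4 fitness."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(y)) % n_folds
    def fitness(pos):
        lam, gamma = pos            # first/second coordinates: lambda and gamma
        errs = []
        for k in range(n_folds):
            tr, te = folds != k, folds == k
            beta = bridge_fit(X[tr], y[tr], lam, gamma)
            errs.append(np.mean((y[te] - X[te] @ beta) ** 2))
        return float(np.mean(errs))
    return fitness

# Example usage: 30 particles searching lambda in [0, 1000] and gamma in [0.1, 4].
# best_pos, best_err = pso_minimize(make_bridge_fitness(X, y),
#                                   lower=[0.0, 0.1], upper=[1000.0, 4.0],
#                                   n_particles=30)
```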

Results and discussion
Our proposed approach, PSO-Bridge, is evaluated here to assess how well it performs. In particular, we compare the efficiency of PSO-Bridge against CV, GCV, AIC, BIC, CAIC, and GBIC, as specified in Eqs. (7)-(12).

Simulation results
In this section, we follow the same simulation settings as Kawano (2014). Five simulation settings are considered.
The predictor matrix X is generated from a multivariate normal distribution N(0, 1) for settings 1, 2, 3, and 4. The response variable was generated from the true regression model in Eq. (1).
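For illustration, data in the spirit of settings 1-4 can be generated as follows; the sample size, true coefficient vector, and noise level below are placeholders and are not the values used by Kawano (2014).

```python
import numpy as np

# Illustrative data generation: predictors drawn from N(0, 1) and the response
# from the linear model of Eq. (1).  All specific values are placeholders.
rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.standard_normal((n, p))                  # predictors ~ N(0, 1)
beta_true = np.array([3, 1.5, 0, 0, 2, 0, 0, 0, 0, 0], dtype=float)
y = X @ beta_true + rng.standard_normal(n)       # response from Eq. (1)
```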

Real application results
We used the pollution data set found in the R library SMPracticals. Many statisticians who work with variable selection in regression modeling rely on this data set [7,10,15,16]. The response variable, the total age-adjusted mortality rate, is measured on 60 observations corresponding to Standard Metropolitan Statistical Areas over the period 1959 to 1961. In addition, fifteen quantitative predictors are included in this data set.
To estimate λ and γ for the regression model constructed with the bridge penalty, 40 observations (training data set) were chosen at random and the remaining 20 observations (testing data set) were used to compute the prediction error (PE). This split was repeated 10 times. Compared with GBIC as a method of estimating λ and γ, the PE of PSO-Bridge was about 3.81% lower. Figure 3 presents the boxplots for the seven methods. From Figure 3, it can be observed that PSO-Bridge is superior to the other six methods in terms of the stability of the PE, as it has the smallest standard deviation.
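The repeated-split evaluation can be sketched as follows, reusing the tuning sketches above; loading the pollution data (for example, exported from the R package SMPracticals to a CSV file) is assumed and not shown, and the tuning of each split uses the assumed cross-validated fitness rather than the paper's exact fitness function.

```python
import numpy as np

def repeated_split_pe(X, y, n_train=40, n_repeats=10, seed=2):
    """Mean and standard deviation of the prediction error over repeated random
    40/20 train/test splits, following the protocol described in the text.
    Assumes X, y hold the pollution data."""
    rng = np.random.default_rng(seed)
    pes = []
    for _ in range(n_repeats):
        idx = rng.permutation(len(y))
        tr, te = idx[:n_train], idx[n_train:]
        fit = make_bridge_fitness(X[tr], y[tr])
        best_pos, _ = pso_minimize(fit, lower=[0.0, 0.1], upper=[1000.0, 4.0])
        lam, gamma = best_pos
        beta = bridge_fit(X[tr], y[tr], lam, gamma)
        pes.append(np.mean((y[te] - X[te] @ beta) ** 2))
    return float(np.mean(pes)), float(np.std(pes))
```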

Conclusion
This work examines the problem of choosing the bridge penalty parameters for linear regression models. A particle swarm optimization algorithm was proposed for selecting the parameters of the bridge penalty. Tests on synthetic data and a real-world application showed that PSO-Bridge outperformed the competing methods in terms of mean squared error (MSE), its standard deviation (Se), and the standard deviation of the prediction error (SP).

Declaration of competing interest
The authors declare that they have no known financial or non-financial competing interests related to the material discussed in this paper.

Funding information
No funding was received from any financial organization to conduct this research.