Regression trees in genomic selection for carcass traits in pigs
Genome-wide selection (GWS) uses molecular markers to predict the genetic merit of animals and plants. Because prediction usually relies on a high density of markers, statistical methods must cope with high dimensionality, multicollinearity and computational cost. We examined a set of machine learning methods, specifically tree-based regression methods (Regression Tree, Bagging, Random Forest and Boosting), and evaluated them in terms of predictive ability and bias. We also compared these methods with the Bayesian Least Absolute Shrinkage and Selection Operator (BLASSO), which is routinely used in GWS. For this purpose, we used data on 10 carcass traits in Piau x Commercial pigs. The tree-based regression methods outperformed BLASSO in the computational time required to predict genetic values, particularly Random Forest and Bagging. Their predictive abilities were competitive with that of BLASSO. In terms of bias, BLASSO underestimated the predictions, whereas the tree-based regression methods overestimated them. Among all methods, Random Forest obtained the bias value closest to the ideal for most traits, which indicates the superiority of this method.
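
The two evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. The abstract does not define them, so the sketch assumes two definitions that are common in GWS studies: predictive ability as the Pearson correlation between observed and predicted values, and bias as the slope of the regression of observed on predicted values, with an ideal value of 1. The function names and the toy numbers are hypothetical and are not taken from the study.

```python
from math import sqrt


def _mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def predictive_ability(observed, predicted):
    """Pearson correlation between observed and predicted values.

    Assumed definition; the abstract does not state how predictive
    ability was computed.
    """
    mo, mp = _mean(observed), _mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


def bias_slope(observed, predicted):
    """Slope of the regression of observed on predicted values.

    Assumed definition of bias; under this assumption a slope of 1
    is the ideal value.
    """
    mo, mp = _mean(observed), _mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / var_p


# Hypothetical toy data: predictions are perfectly correlated with the
# observations but only half as large, so the slope departs from 1.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [1.0, 2.0, 3.0, 4.0]

print("predictive ability:", predictive_ability(observed, predicted))
print("bias (slope):", bias_slope(observed, predicted))
```

In this toy case the predictions rank the animals perfectly, yet the slope differs from 1. A method can therefore have high predictive ability and still be biased, which is why the two criteria are reported separately.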