Bayesian models applied to genomic selection for categorical traits
We compared two statistical methodologies for genetic and genomic analyses of categorical traits. The first is the Bayesian Linear Mixed Model (BLMM), which addresses the statistical problems of genomic prediction. The second, the Bayesian Generalized Linear Mixed Model (BGLMM), is similar but is used when the distribution of the response variable is not Gaussian, as in the case of disease resistance phenotype categories. The models were compared in terms of predictive ability, bias, computational time, and cross-validation error rate (CVER). Additionally, we propose an alternative classification method for the BLMM, which allowed us to obtain the CVER for this model. Genetic parameters were estimated using the BLASSO (Bayesian Least Absolute Shrinkage and Selection Operator) and Bayesian G-BLUP (Genomic Best Linear Unbiased Prediction) methods, each applied to both the BLMM and the BGLMM. The models were applied in two scenarios for the phenotype of resistance to rust disease caused by the pathogen Puccinia psidii: reaction types (two classes) and infection levels (four classes), recorded for 559 trees of Eucalyptus urophylla with 24,806 SNP markers. Modeling this trait through SNPs allows the next generation of plants to be selected early, reducing time and costs. Both models showed the same predictive ability, and the bias was closer to the ideal value for the BLMM (G-BLUP). The BGLMM had the lower CVER (0.29 versus 0.32 for two categories and 0.47 versus 0.51 for four categories), whereas the BLMM had a computational time three times shorter. Although the BLMM is not the most appropriate model for handling categorical data, it gave results similar to those of the BGLMM. We therefore consider it an appropriate alternative for categorical data modeling.
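To make the CVER comparison concrete, the following is a minimal sketch of how a cross-validation error rate could be computed for a model that returns continuous predictions, such as the BLMM. It rests on assumptions that the abstract does not state: that CVER is the misclassification proportion averaged over validation folds, and that a continuous prediction is assigned to the nearest class code. The names `classify_nearest`, `error_rate`, `cver`, and `example_folds` are hypothetical, the nearest-class rule is a stand-in rather than the classification method proposed in the paper, and the numbers are illustrative, not data from the study.

```python
from typing import List, Sequence, Tuple


def classify_nearest(pred: Sequence[float], classes: Sequence[int]) -> List[int]:
    """Assign each continuous prediction to the closest class code.

    Assumed rule for illustration only. On an exact tie, the class that
    appears first in `classes` is chosen.
    """
    return [min(classes, key=lambda c: abs(p - c)) for p in pred]


def error_rate(observed: Sequence[int], assigned: Sequence[int]) -> float:
    """Proportion of individuals whose assigned class differs from the observed one."""
    if len(observed) != len(assigned):
        raise ValueError("observed and assigned must have the same length")
    wrong = sum(o != a for o, a in zip(observed, assigned))
    return wrong / len(observed)


def cver(folds: Sequence[Tuple[Sequence[int], Sequence[float]]],
         classes: Sequence[int]) -> float:
    """Average misclassification rate over validation folds.

    Each fold is a pair (observed classes, continuous predictions) for the
    individuals held out in that fold.
    """
    rates = [error_rate(obs, classify_nearest(pred, classes)) for obs, pred in folds]
    return sum(rates) / len(rates)


# Illustrative two-class example with made-up values (not from the study).
example_folds = [
    ([0, 1, 1, 0], [0.2, 0.8, 0.4, 0.1]),
    ([1, 0, 1, 1], [0.9, 0.6, 0.7, 0.3]),
]
print(cver(example_folds, classes=[0, 1]))
```

The same functions apply to the four-class scenario by passing `classes=[1, 2, 3, 4]`, again assuming the infection levels are coded as consecutive integers.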