Support vector machines applied to the genetic classification problem of hybrid populations with high degrees of similarity
Selecting appropriate genitors in breeding programs increases gains by exploiting the variability found among divergent groups; quantifying the existing variability beforehand saves time and resources. Many methods are available for quantifying and evaluating diversity in population studies, notably those based on multivariate statistical analyses, such as linear discriminant analysis (LDA) and cluster analysis. Here we propose and evaluate the use of support vector machines (SVM) and artificial neural networks (ANN) to solve the problem of genetic classification of hybrid populations with high degrees of similarity. The SVM results, in terms of the apparent error rate (APER), were compared with those obtained using ANN analysis and LDA. In general, the lowest APER values were associated with scenarios with low degrees of genetic similarity between populations. Specifically, the best SVM results (APER ranging from 14.44 to 67.41%) were observed when the exponential radial basis kernel function was used. The APERs obtained by the ANN were even lower than those of the linear discriminant function.
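To make the two quantities named above concrete, the following minimal sketch computes an APER and evaluates an exponential radial basis kernel. It is illustrative only and is not the authors' implementation: the function names, the population labels (P1, P2, F1), and the kernel parameterization exp(-||x - y|| / (2σ²)), which uses the unsquared Euclidean distance, are assumptions rather than details given in the text.

```python
import math


def apparent_error_rate(true_labels, predicted_labels):
    """Fraction of observations whose predicted population differs from the true one.

    The APER is computed on the same observations used to fit the classifier,
    so it tends to be an optimistic estimate of the true error rate.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("at least one observation is required")
    errors = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return errors / len(true_labels)


def exponential_rbf_kernel(x, y, sigma=1.0):
    """Exponential radial basis kernel (assumed form): exp(-||x - y|| / (2 * sigma^2))."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-distance / (2.0 * sigma ** 2))


# Hypothetical labels for five individuals; one of them is misclassified.
true_pop = ["P1", "P1", "P2", "P2", "F1"]
pred_pop = ["P1", "P2", "P2", "P2", "F1"]
aper_percent = 100.0 * apparent_error_rate(true_pop, pred_pop)
```

Here one of five individuals is assigned to the wrong population, so the APER is 20%. The kernel equals 1 for identical points and decays toward 0 as the distance between two individuals grows.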