Research Article

Development, characterization, and annotation of potential simple sequence repeats by transcriptome sequencing in pears (Pyrus pyrifolia Nakai)

Published: September 23, 2016
Genet. Mol. Res. 15(3): gmr8683 DOI: 10.4238/gmr.15038683

Abstract

Simple sequence repeats (SSRs), one of the most powerful molecular markers, can be used for DNA fingerprinting, variety identification, genetic mapping, and marker-assisted selection. Using 75,764 unigenes (55,676,271 bp) obtained by deep transcriptome sequencing of pear (Pyrus pyrifolia Nakai), a total of 10,622 novel SSRs were identified in 9154 unigenes, corresponding to an SSR frequency of 14.02% relative to all unigenes. The SSRs averaged about 16 bp in length and occurred on average once every 5.24 kb. Dinucleotide repeat motifs were the main type, with a frequency of 55.87%, followed by trinucleotides (24.45%). A total of 159 distinct repeat motifs were found in the pear transcriptome. AG/CT was the most frequent motif, accounting for 49.64%. All 9154 SSR-containing unigenes were functionally annotated using Nr (NCBI non-redundant protein database), Nt (NCBI non-redundant nucleotide database), and the Swiss-Prot database, and were classified further by Gene Ontology and Clusters of Orthologous Groups. In addition, a total of 4300 primer pairs were designed from the SSR loci obtained. Of these, 40 primer pairs were randomly selected for PCR amplification and polyacrylamide gel electrophoresis (PAGE) analysis. Among the 40 primer pairs, 31 yielded products that were successfully separated by PAGE. These findings confirm that mining SSRs using next-generation sequencing technologies is a fast, effective, and reliable approach.
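
The summary statistics quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not part of the original study; the variable names are illustrative, and the only inputs are the counts stated in the abstract. It suggests that the reported 14.02% matches the number of SSRs divided by the number of unigenes, whereas the share of unigenes carrying at least one SSR (9154 of 75,764) works out to roughly 12.08%.

```python
# Counts taken from the abstract; variable names are illustrative only.
total_unigenes = 75_764
total_bp = 55_676_271
total_ssrs = 10_622
ssr_unigenes = 9_154
pairs_tested = 40
pairs_resolved = 31


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# SSRs per 100 unigenes (matches the 14.02% quoted in the abstract).
ssr_frequency = pct(total_ssrs, total_unigenes)

# Share of unigenes that contain at least one SSR.
share_unigenes_with_ssr = pct(ssr_unigenes, total_unigenes)

# Average spacing: transcriptome length divided by SSR count, in kb.
kb_per_ssr = round(total_bp / total_ssrs / 1000, 2)

# Fraction of tested primer pairs separated by PAGE.
page_success = pct(pairs_resolved, pairs_tested)

print(f"SSR frequency:            {ssr_frequency}%")
print(f"Unigenes carrying an SSR: {share_unigenes_with_ssr}%")
print(f"One SSR per:              {kb_per_ssr} kb")
print(f"PAGE success rate:        {page_success}%")
```

Under these inputs, the spacing works out to about 5.24 kb per SSR and the PAGE success rate to 77.5% (31 of 40 primer pairs).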
