Development and characterization of novel EST-SSR markers and their application for genetic diversity analysis of Jerusalem artichoke (Helianthus tuberosus L.)
Jerusalem artichoke (Helianthus tuberosus L.) is a perennial tuberous plant and a traditional inulin-rich crop in Thailand. It has become the most important source of inulin and has great potential for use in the chemical and food industries. In this study, expressed sequence tag (EST)-based simple sequence repeat (SSR) markers were developed from 40,362 Jerusalem artichoke ESTs retrieved from the NCBI database. Among 23,691 non-redundant ESTs, 1949 SSRs with motifs of 2 to 6 nucleotides were discovered in 1676 assembled sequences. Seventy-nine primer pairs were designed from EST sequences harboring SSR motifs. Forty-three primer pairs were polymorphic across the six studied populations, while the remaining 36 were either monomorphic or failed to amplify. These 43 SSR loci exhibited a high level of genetic diversity among populations, with 2 to 7 alleles per locus and an average of 3.95 alleles per locus. Expected heterozygosity ranged from 0.096 to 0.774, with an average of 0.536; polymorphism information content ranged from 0.096 to 0.854, with an average of 0.568. Principal coordinate analysis and neighbor-joining analysis revealed that the six populations could be divided into six clusters. Our results indicate that these newly characterized EST-SSR markers may be useful in the exploration of genetic diversity and range expansion of the Jerusalem artichoke, and in cross-species application for the genus Helianthus.
Jerusalem artichoke (Helianthus tuberosus L.) is a perennial member of the family Asteraceae; it is native to eastern North America and was introduced to Thailand decades ago. Its tubers are rich in inulin, making it a healthy choice for individuals with diabetes (Kays and Nottingham, 2008; Alla et al., 2014). In general, Jerusalem artichoke has 2n (6x) = 102 chromosomes and is related to Helianthus annuus, which is commonly known as sunflower. Jerusalem artichoke has a long history of cultivation as a food supplement all over the world, which is attributed to its adaptability to varied climates, making it easy for local people to plant (Bock et al., 2014). Although Jerusalem artichoke has a very long planting history, international germplasm collections still focus on commercial breeding with the aim of improving both yield and tuber form (Kiru and Nasenko, 2010). Furthermore, breeding programs for Jerusalem artichoke still rely heavily on existing genetic resources, so it is essential to accurately identify genotypes and to delineate the genetic relationships between available accessions in germplasm collections. These resources can then be utilized effectively to preserve and develop the species and to enhance its applications (Debnath, 2014). Although a number of international germplasm collections of Jerusalem artichoke have been established, containing several hundred genotypes, including hybrids and landraces, a standard reference germplasm is still lacking (Kays and Nottingham, 2008). Levels and patterns of genetic diversity and the range expansion of Jerusalem artichoke remain largely unknown. DNA markers were first developed in the 1980s to evaluate DNA-level variation between accessions within a germplasm collection or population and between populations (Park et al., 2009; Mondini et al., 2009).
The markers most commonly used are simple sequence repeat (SSR) or microsatellite markers. These are based on the polymerase chain reaction (PCR) (Mullis et al., 1986) and represent the second generation of molecular markers. Their particular strength lies in the fact that they are spread throughout the genome, do not require large amounts of DNA for analysis, are reliable and generate multiple markers, and their use does not require any prior genome information. SSRs, also called short tandem repeats (STRs), are regions of DNA made up of tandemly repeated short motifs, typically one to six base pairs long (Merritt et al., 2015). A number of public expressed sequence tag (EST) databases now exist. This is a helpful development in the identification of functional markers in suitable candidate genes (Poczai et al., 2013) that can support gene expression analysis and assist in the detection of genetic diversity (Andersen and Lübberstedt, 2003). ESTs are short transcribed sequences that have permitted the development of SSR markers in several species of plants (Gadaleta et al., 2011; Kumari et al., 2013; Şelale et al., 2013; Zhang et al., 2014; Ju et al., 2015). This has resulted in the subsequent identification of genetic diversity through the use of EST-SSR markers (Mujaju et al., 2013; Ramu et al., 2013; Malfa et al., 2014).
Although Jerusalem artichoke is an important crop with both economic and cultural significance, very few informative molecular markers have been isolated from its genome. This is unusual in comparison with other crops of similar economic importance. A limited number of marker types have been used to examine genetic diversity in the Jerusalem artichoke. These include random amplified polymorphic DNA (RAPD) (Wangsomnuk et al., 2011a,b), sequence-related amplified polymorphism (SRAP), and inter simple sequence repeat (ISSR) markers (Wangsomnuk et al., 2011a). Additionally, Kou et al. (2014) analyzed accessions available in China using amplified fragment length polymorphism (AFLP) markers. However, these markers cannot distinguish homozygous from heterozygous genotypes. Compared to other DNA markers like RAPD, ISSR, SRAP, and AFLP, SSR markers have the advantages of co-dominance, reproducibility, hyper-variability, and high coverage of the genome. The development of reliable co-dominant and multi-allelic markers is thus particularly important if cultivar or parental identifications are to be made. They are also important for breeding programs that rely on marker-assisted selection and for studies of genetic diversity, conservation genetics, or population structure. Therefore, this study aimed to develop EST-SSR markers from public EST sequences of Jerusalem artichoke, available in the National Center for Biotechnology Information (NCBI) database, and to monitor their performance in the assessment of polymorphism of 25 accessions from five different sources provided by the Plant Gene Resources of Canada, and 35 open-pollinated lines from Thailand.
MATERIAL AND METHODS
A total of 60 Jerusalem artichoke genotypes were obtained from six different sources (Table 1). Sets of five accessions were obtained from Canada, the United States, Russia, Germany, and France, along with 35 open-pollinated lines from Thailand. The 35 selected open-pollinated accessions of Jerusalem artichoke used in this study were derived from in vitro culture based on the protocol described by Wangsomnuk et al. (2015) to enrich clonal diversity. Young leaves were collected from all chosen genotypes of Jerusalem artichoke and dried in silica gel until use.
List of 60 accessions of Jerusalem artichoke used in this study and their origin/source and average dissimilarity (AD).
Genomic DNA was extracted from the dried leaves of different individuals of selected Jerusalem artichoke genotypes. A sample of 100 mg dried leaf tissue was ground using a pestle and mortar in liquid nitrogen. Next, the powder was suspended in 700 µL extraction buffer comprising 100 mM Tris-HCl, pH 7.5, 0.35 M mannitol, 50 mM EDTA, pH 8.0, and 0.3% β-mercaptoethanol. The mixture was gently vortexed and then incubated at 65°C for 1 h with occasional gentle shaking, followed by chloroform clean-up. DNA was precipitated with isopropanol, and the DNA pellet was washed in 70% ethanol, air dried, and re-suspended in 100 µL TE buffer (10 mM Tris-HCl, pH 8.0, and 1 mM EDTA, pH 8.0). The DNA was quantified by gel electrophoresis and NanoDrop™ (Thermo Scientific), and stored at -20°C until use.
EST-SSR and PCR amplification
A total of 40,362 ESTs of Jerusalem artichoke were retrieved from the NCBI nucleotide database. They were subsequently assembled into 6563 contigs and 17,128 singletons using CAP3 (Huang and Madan, 1999). SSR motifs of two to six nucleotides were identified in the unigenes using the SSRIT software (Temnykh et al., 2001), with minimum repeat numbers of six, five, four, four, and four for di-, tri-, tetra-, penta-, and hexanucleotide motifs, respectively. When an EST contained appropriate flanking sequences around the SSR, it was selected as a candidate and used to design primers flanking the repeat. For this purpose, 79 sequences were identified and used to design primer pairs in the Primer3 Plus software (
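The SSR search described above can be sketched with regular expressions. This is a minimal illustration and not the SSRIT implementation: the function name `find_ssrs`, the example sequence, and the pairing of each threshold with a motif length are assumptions made here, and only perfect (uninterrupted) repeats are reported.

```python
import re

# Minimum repeat counts per motif length (di- to hexanucleotide), taken from
# the thresholds quoted in the text; the pairing with motif length is assumed.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(sequence):
    """Return (motif, repeat_count, start) tuples for perfect SSRs in sequence."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # One motif of motif_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are a shorter unit repeated (e.g. "AAA", "ATAT").
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((motif, len(match.group(0)) // motif_len, match.start()))
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical sequence carrying an (ACAT)5 and a (TCA)5 repeat.
example = "GGCTA" + "ACAT" * 5 + "CCGTTG" + "TCA" * 5 + "GATC"
ssrs = find_ssrs(example)
print(ssrs)  # [('ACAT', 5, 5), ('TCA', 5, 31)]
```

Start positions are 0-based. A production search would also need to handle ambiguous bases and compound or interrupted repeats, which this sketch ignores.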
Genomic DNA of eight Jerusalem artichoke accessions (named CN52867, HEL53, HEL65, HEL250, JA6, JA37, JA102, and AMES2722) was used to test the efficiency of the newly designed EST-SSR primers. PCR amplification was performed on an Agilent Technologies SureCycler 8800 (Germany) and carried out in a 10-µL reaction mixture with 30 ng DNA, 0.2 mM dNTPs, 0.2 mM each primer (Bio Basic Inc.), 0.4 U Taq DNA polymerase, 1X Buffer A [160 mM (NH4)2SO4, 500 mM Tris-HCl, pH 9.1, 17.5 mM MgCl2, and 0.1% Triton X-100; Vivantis], and 1.5 mM MgCl2. PCR involved one cycle of 3 min at 96°C, followed by 37 cycles of 30 s at 93°C, 30 s at the locus-specific annealing temperature, and 1 min at 72°C, and a final extension of 5 min at 72°C. Primers were initially screened on eight individuals of Jerusalem artichoke, and the fragments obtained were visualized on 2% agarose. EST-SSRs that amplified successfully and gave clear fragments were validated via sequencing and were further used to detect polymorphisms in all 60 individuals from the six populations. Amplicons were separated on 10% denaturing polyacrylamide gels with a 100-bp DNA Ladder Plus (Vivantis) as a size standard and visualized by silver staining following a modified protocol of Bassam et al. (1991).
The amplified bands, which were each considered to be an allele, were examined using UVITEC (Topac Inc. Instrumentation, USA) to determine allele size. At each band position within a gel, the designated EST-SSR alleles were scored manually as present or absent, and coded as 1 or 0, respectively. The efficiency of the EST-SSR markers was evaluated by their ability to distinguish among Jerusalem artichoke genotypes, using the simple matching coefficient (S) (Sokal and Michener, 1958) for each pair of genotypes. The dissimilarity (D) was denoted as 1 - S, so that a measure of the mean dissimilarity among the genotypes could be obtained by taking an average of all the n - 1 EST-SSR dissimilarities (AD) associated with each genotype (Wangsomnuk et al., 2011b). Allelic diversity for each locus was quantified using the concept of polymorphism information content (PIC) described by Botstein et al. (1980), whose equation held that PIC = 1 -
RESULTS AND DISCUSSION
Jerusalem artichoke is used in several chemical and food industries. Its tuber contains inulin, and there is wide variation among its genotypes (Johansson et al., 2015). Evaluation of the genetic diversity of Jerusalem artichoke germplasm facilitates conservation and informs the selection of parental clones, which is essential in breeding programs that aim to improve the yield and the nutritional and commercial value of cultivars for farmers and consumers (Moose and Mumm, 2008).
A total of 40,362 ESTs from Jerusalem artichoke were retrieved from the NCBI database (
Expressed sequence tags (ESTs) and simple sequence repeats (SSRs) identified from Jerusalem artichoke.
|Total number of sequences examined||23,691|
|Total number of sequences containing SSRs||1,676|
|Total number of SSRs discovered||1,949|
|Number of sequences containing one SSR||1,458|
|Number of sequences containing two SSRs||177|
|Number of sequences containing three SSRs||31|
|Number of sequences containing four SSRs||7|
|Number of sequences containing five SSRs||2|
|Number of sequences containing six SSRs||1|
|Number of primers designed||79|
|Number of informative primers||43|
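The counts in this table are internally consistent, which can be checked with a short calculation. The sketch below only re-adds numbers already given in the table and in the Methods; the variable names are chosen here for illustration.

```python
# Number of SSRs per sequence -> number of such sequences, copied from the table.
sequences_by_ssr_count = {1: 1458, 2: 177, 3: 31, 4: 7, 5: 2, 6: 1}

# Sequences containing at least one SSR (table value: 1,676).
sequences_with_ssrs = sum(sequences_by_ssr_count.values())

# Total SSRs, weighting each class by its SSR count (table value: 1,949).
total_ssrs = sum(k * n for k, n in sequences_by_ssr_count.items())

# Unigenes examined = contigs + singletons from the CAP3 assembly (23,691).
unigenes = 6563 + 17128

# Share of unigenes carrying at least one SSR, in percent (about 7.07).
share_with_ssrs = round(100 * sequences_with_ssrs / unigenes, 2)

print(sequences_with_ssrs, total_ssrs, unigenes, share_with_ssrs)
```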
Types and frequencies of EST-SSRs in Jerusalem artichoke.
nd = not detected.
Primer development and validation
Seventy-nine ESTs containing SSRs were selected, and primer pairs were designed and first tested on eight Jerusalem artichoke genotypes. These primers could expand the limited set of genetic markers available for this species, for which only 357 RAPD, 92 ISSR, and 194 SRAP markers have been reported previously (Wangsomnuk et al., 2011a,b). Next, all designed primers were used on all individuals to analyze polymorphism levels. Twenty-eight of the 79 newly designed primer pairs (35.44%) failed to amplify or amplified only a few genotypes. Eight loci (10.13%) showed monomorphic bands across all samples investigated. Those markers were excluded from further studies. Forty-three loci (54.43%) were informative for 60 genotypes and were further analyzed for genetic diversity. Examples of polymorphic loci are shown in Figure 1 and Table 4. The proportion of informative markers found in the present study was higher than that found for EST-SSR markers developed from olive EST sequences by Adawy et al. (2015). In that study, 10 of 25 randomly selected primers showed polymorphism across nine genotypes. Chen et al. (2015) reported that, of 296 EST-SSR markers that produced amplicons in adzuki bean, 38 were polymorphic, a lower proportion than that obtained here.
Polymorphism at the LP10 and LP65 loci of 60 accessions. See Table 1 for detailed information on the Jerusalem artichoke accessions. Lane M = 100-bp ladder plus (Vivantis).
Characteristics of 43 polymorphic SSR markers developed for genetic diversity analyses.
|Primer||GenBank accession No.||Forward primer (5'-3')||Reverse primer (5'-3')||Size (bp)||Ta (°C)|
Ta = annealing temperature.
Overall, 170 alleles were found among 43 loci characterized in 60 accessions. The number of alleles (NA), number of effective alleles (NE), observed heterozygosity (HO), and expected heterozygosity (HE) of the 43 polymorphic EST-SSR loci are presented in Table 5. NA per locus ranged from two to seven with an average of 3.95. The most frequent NA values per marker were three (32.56%) and four (34.88%). This is consistent with the highly conserved nature of the sequences flanking the SSR region, and the allele number is higher than that previously reported for some plant species such as Phaseolus vulgaris (Garcia et al., 2011). HO and HE ranged from 0.0 to 0.983 (mean 0.458) and 0.096 to 0.774 (mean 0.536), respectively. The distribution of HE values revealed high heterozygosity within Jerusalem artichoke populations, with 67% of the markers falling within the range 0.5 to 0.8 (Figure 2). This might be attributable to the auto-tetraploidy and cross-pollination observed for this species (Zhou et al., 2014).
Informativeness of SSR loci according to the amplification from 60 Jerusalem artichoke accessions.
Distribution of estimates of genetic heterozygosity.
The number of effective alleles (NE) per polymorphic locus varied from 1.106 to 4.425 with an average of 2.505. Locus LP5 possessed the highest effective number of alleles (4.425) and the highest expected heterozygosity (0.774), and harbored the repeat motif (ACAT)5. Locus LP8 possessed the lowest number of effective alleles (1.106) and the lowest expected heterozygosity (0.096), with the repeat motif (TCA)5 (Table 5).
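The reported extremes are consistent with the usual relationships HE = 1 - Σp² and NE = 1/Σp², so that NE = 1/(1 - HE). The sketch below illustrates these definitions on hypothetical data; the function `locus_stats` and the toy genotypes are assumptions made for illustration, and scoring two alleles per individual is a simplification for a polyploid species.

```python
from collections import Counter


def locus_stats(genotypes):
    """Summarise one co-dominant locus from a list of (allele, allele) genotypes."""
    alleles = Counter(a for g in genotypes for a in g)
    n_copies = sum(alleles.values())
    freqs = {a: c / n_copies for a, c in alleles.items()}
    homozygosity = sum(p * p for p in freqs.values())
    return {
        "NA": len(freqs),          # observed number of alleles
        "NE": 1 / homozygosity,    # effective number of alleles
        "HO": sum(a != b for a, b in genotypes) / len(genotypes),
        "HE": 1 - homozygosity,    # expected heterozygosity
    }


# Hypothetical allele sizes (bp) for six individuals, two alleles each.
toy = [(150, 150), (150, 154), (154, 158), (150, 158), (154, 154), (150, 154)]
stats = locus_stats(toy)

# Cross-check of the reported extremes using NE = 1 / (1 - HE).
ne_lp5 = 1 / (1 - 0.774)  # reported NE for LP5: 4.425
ne_lp8 = 1 / (1 - 0.096)  # reported NE for LP8: 1.106
print(stats, round(ne_lp5, 3), round(ne_lp8, 3))
```

For the toy locus, the allele frequencies are 5/12, 5/12, and 2/12, giving HE = 0.625 and NE of about 2.67.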
Informativeness of markers was measured by the PIC. Markers with many alleles or those that are highly polymorphic tend to be highly informative. The degree of polymorphism can be classified into three levels: high (PIC > 0.5), medium (0.5 > PIC > 0.25), and low (PIC < 0.25) (Hildebrand et al., 1992). PIC analysis revealed that 43 loci have values ranging from 0.096 (LP8) to 0.854 (LP27) (Table 5 and Figure 3), with an average value of 0.568. The largest group of loci (27.91%) ranged from 0.611 to 0.689, followed by the group with PIC values ranging from 0.714 to 0.788 (20.93%). Nearly three-quarters of all loci possess PIC values higher than 0.5, meaning that the majority of loci studied here possess high levels of polymorphism. Only one-tenth of the loci possessed low polymorphism (PIC < 0.25) (Figure 3). The average PIC values reported here are higher than the allelic variation at 32 loci detected in cowpea (Gupta and Gopalakrishna, 2010).
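As an illustration of the PIC calculation and the three-level classification, the sketch below uses the commonly cited form of the Botstein et al. (1980) expression, PIC = 1 - Σp_i² - ΣΣ 2p_i²p_j². Whether the authors used exactly this form, and on which allele-frequency basis, is not stated in the text available here, so the values need not reproduce Table 5. The function names and the handling of values falling exactly on 0.5 or 0.25 are assumptions.

```python
def pic(freqs):
    """PIC in the Botstein et al. (1980) form from allele frequencies summing to 1."""
    sum_sq = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1 - sum_sq - cross


def pic_class(value):
    """Three-level classification quoted in the text (boundary handling assumed)."""
    if value > 0.5:
        return "high"
    if value > 0.25:
        return "medium"
    return "low"


two_equal = pic([0.5, 0.5])      # 0.375 -> "medium"
four_equal = pic([0.25] * 4)     # 0.703125 -> "high"
print(two_equal, pic_class(two_equal), four_equal, pic_class(four_equal))
```

Applied to the reported values, the minimum (0.096, LP8) falls in the low class, while the mean (0.568) and maximum (0.854, LP27) fall in the high class.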
Distribution of polymorphic information content (PIC) values for 170 simple sequence repeat (SSR) markers.
Approximately 82% of genetic variation was detected within individuals of accessions from a given country, with a much smaller amount of variation occurring among individuals (13%) or populations (5%) (Table 6). All the components of differentiation determined by AMOVA were statistically significant at P < 0.001.
Analyses of molecular variance (AMOVA) of Jerusalem artichoke by simple sequence repeat (SSR) loci.
|Source of variation||d.f.||Sum of squares||Variance component||Percentage of variance (%)||P value|
Pairwise differentiation (FST) was calculated between the accession groups from each source. According to Hartl and Clark (1997), FST of 0.00-0.05 indicates low differentiation, 0.05-0.15 indicates moderate differentiation, and FST > 0.15 indicates high levels of differentiation. FST in the present study ranged from 0 to 0.096, which implies low-to-moderate genotypic differentiation across loci among the six countries. No genetic subdivision was detected among the populations from Canada, Germany, and Russia (Table 7). The largest pairwise FST value was observed between the populations from the USA and Russia (FST = 0.096), followed by that between accessions from the USA and Canada (FST = 0.093). These data indicate that accessions from the USA are the most strongly differentiated from those of Russia and Canada. This finding was also supported by the unweighted pair-group method based on arithmetic average (UPGMA) analysis of Nei’s unbiased genetic distance among accessions from different countries (sources) (Figure 4).
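The thresholds above translate directly into a small helper. This is only a sketch: the function name `fst_level` is hypothetical, and because the quoted ranges share the boundary 0.05, assigning exactly 0.05 to the moderate class is an assumption.

```python
def fst_level(fst):
    """Classify a pairwise FST value using the ranges quoted in the text."""
    if fst < 0:
        raise ValueError("FST is expected to be non-negative here")
    if fst < 0.05:
        return "low"
    if fst <= 0.15:
        return "moderate"  # exactly 0.05 is placed here by assumption
    return "high"


# The two largest pairwise values reported in the text, plus the minimum.
reported = {"USA-Russia": 0.096, "USA-Canada": 0.093, "minimum": 0.0}
levels = {pair: fst_level(value) for pair, value in reported.items()}
print(levels)
```

Both of the largest reported values fall in the moderate class, in line with the low-to-moderate differentiation described above.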
Proportional SSR variation among Jerusalem artichoke accessions of different origin/sources estimated from the analysis of molecular variance of 43 SSR loci.
All pairwise group FST values were statistically significant at P < 0.05 except those highlighted in bold italics, which were non-significant.
Unweighted pair-group method based on arithmetic average (UPGMA) dendrogram showing genetic relationships among Jerusalem artichoke origins/sources based on Nei’s unbiased genetic distances.
Analysis of genetic diversity
Genetic diversity parameters for the 43 microsatellite loci of the 60 Jerusalem artichoke accessions were calculated. Polymorphism among genotypes within each country of origin was as follows: Canada (58.24%), the USA (61.77%), Russia (56.47%), Germany (51.18%), France (55.29%), and Thailand (91.18%). The highest number of polymorphic bands was observed for accessions from Thailand, which may be attributable to the larger number of accessions (35) compared with each of the other origins (5). It is important to note that increasing the number of samples from other countries or analyzing the same set of samples using more informative primers developed from other available ESTs of Jerusalem artichoke (Jung et al., 2014) may change the genetic diversity information of each population.
A total of 3739 alleles were detected across populations from different sources, with an average of 62.32 alleles per genotype. The minimum number of alleles was 58, which was observed in four accessions from Thailand, namely, KK101, KK166, KK243, and KK283 (Figure 5). The maximum number of alleles was found in AMES2736 from the USA. Accessions from Russia and France possessed between 60 and 64 alleles, with mean values of 62.20 and 61.80, respectively. Within the accessions from Canada, the number of alleles was between 60 and 63 with an average of 62. In the five accessions from the USA, the number of alleles ranged from 59 to 69 with a mean of 63. Within the accessions from Germany, the number of alleles ranged from 60 to 67, with an average of 62. Within accessions from Thailand, the number of alleles ranged from 58 to 68, with an average of 61.80 alleles per accession.
Number of alleles detected in 60 Jerusalem artichoke accessions based on 43 SSR loci.
Genetic differentiation and cluster analysis
AD of Jerusalem artichoke accessions ranged from 0.257 (KK277) to 0.345 (PI503260) (Table 1) with a mean AD of 0.301. The 10 most distinct accessions with an AD of 0.319 or higher included PI503260, KK250, PI547241, AMES2722, KK203, KK148, JA78, KK126, JA55, and KK279. Of note, five open-pollinated lines produced in Thailand are among these 10 accessions, in addition to four wild accessions from the USA and one accession from France.
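The AD values above derive from the simple matching coefficient applied to the 0/1 band scores. The following sketch shows one way to compute D = 1 - S for each pair and the mean over the other n - 1 accessions; the function names and the three toy profiles are hypothetical and are not the study data.

```python
def simple_matching_dissimilarity(a, b):
    """Return 1 - S, where S is the share of positions scored identically."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 1 - matches / len(a)


def average_dissimilarity(profiles):
    """Mean dissimilarity of each accession to the other n - 1 accessions."""
    names = list(profiles)
    return {
        name: sum(
            simple_matching_dissimilarity(profiles[name], profiles[other])
            for other in names if other != name
        ) / (len(names) - 1)
        for name in names
    }


# Hypothetical 0/1 band profiles (8 allele positions) for three accessions.
toy_profiles = {
    "acc_A": [1, 0, 1, 1, 0, 0, 1, 0],
    "acc_B": [1, 0, 0, 1, 0, 1, 1, 0],
    "acc_C": [0, 1, 1, 1, 0, 0, 1, 1],
}
ad = average_dissimilarity(toy_profiles)
print(ad)  # {'acc_A': 0.3125, 'acc_B': 0.4375, 'acc_C': 0.5}
```

Because the simple matching coefficient counts shared absences (0/0) as matches, accessions with few bands in common can still appear similar; that is a property of the coefficient rather than of this sketch.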
It is worth noting that the largest genetic distance calculated using the simple matching coefficient (0.45) was observed between KK133 (Thailand) and AMES2722 (USA), which can be used as potential parental sources for further breeding programs. The lowest genetic distance (0.11) was found between KK121 and KK112, and also between KK176 and KK212 which are breeding lines from Thailand, suggesting that EST-SSR markers could be used successfully to distinguish between closely related genotypes. The average genetic distance of accessions from Canada, the USA, Russia, Germany, France, and Thailand was 0.28, 0.30, 0.28, 0.25, 0.28, and 0.24, respectively. These results suggest that accessions from the USA possess higher levels of genetic diversity and might serve as a valuable resource. Overall, 54.58% of the genetic distance between any two accessions of six origins was at least 0.30 (Figure 6).
Distribution of pairwise genetic distances based on 170 SSR markers of 60 accessions.
The genetic relationship among 60 genotypes of Jerusalem artichoke is presented based on the neighbor-joining (NJ) analysis (Figure 7). Most of the accessions from six countries are dispersed among several clusters owing to the low resolution of SSR loci. Six clusters were detected. The first cluster contained 16 accessions, including seven (KK137, KK148, KK166, KK205, KK277, KK279, and KK299) from Thailand, one from Germany (JA102), two from Russia (JA59, JA95), one from Canada (JA134), four from France (JA89, JA97, JA98, HEL250), and one from the USA (JA55). The second cluster comprised seven accessions, including one from Canada (JA42), and the rest from Thailand (KK133, KK139, KK182, KK191, and KK264). The third cluster contained two accessions, including JA105 from Russia and KK191 from Thailand. The fourth cluster, which was the largest group, contained 20 accessions, most of which were from Thailand (15 accessions), with two accessions from Germany (HEL53, HEL231), two from Canada (JA6, JA37), and one from Russia (CN52867). The fifth cluster comprised six accessions, including five accessions from Thailand (KK212, KK224, KK250, KK261, KK283) and one from Russia (HEL65). The last cluster contained nine accessions, including four from the USA (PI547241, AMES2722, PI503260, AMES2736), two from Germany (HEL243, HEL248), and one accession each from France (JA78), Canada (JA4), and Thailand (KK157). Jerusalem artichoke is a highly self-incompatible plant, which favors cross-pollination as it generally produces wider variation than vegetative propagation. Without control of pollination, varieties can be developed for characters of interest such as high tuber yield and disease resistance. Thus, it can be inferred that the genetic background of these Jerusalem artichoke accessions does not always correlate with their geographical regions.
Neighbor-joining tree showing the genetic association of 60 Jerusalem artichoke genotypes labeled with their origin/source: open square for USA; filled square for Germany; open triangle for Russia; filled triangle for France; open circle for Canada; filled circle for Thailand.
A principal coordinate analysis (PCoA) was performed based on the genetic distance of the 60 accessions. The first three axes accounted for 29.78% of the total variation (12.54, 9.17, and 8.07%, respectively). The distribution of the relative contribution of each variable in the total variance of the first two axes is well represented by the projection of vectors indicating the maximum variation in the 1st and 2nd axes. The PCoA revealed somewhat different clusters of accessions compared to those obtained by NJ cluster analysis, although moderate agreement was detected between the two approaches.
In the present study, the genetic diversity of 60 Jerusalem artichoke accessions was evaluated based on 43 EST-SSR loci. These markers were highly robust with high PIC values (mean 0.568), and polymorphism among accessions within each country ranged from 50.588% (Germany) to 91.764% (Thailand). These newly developed EST-SSR loci have the potential to be applied to studies on molecular breeding and genetic diversity in this species, and they may be transferable across species to assess genetic variation within the genus Helianthus.