Chloroplast DNA analysis of Tunisian cork oak populations (Quercus suber L.): sequence variations and molecular evolution of the trnL (UAA)-trnF (GAA) region
Sequences of the trnL-trnF spacer and combined trnL-trnF region in chloroplast DNA of cork oak (Quercus suber L.) were analyzed to detect polymorphisms and to elucidate molecular evolution and demographic history. The aligned sequences varied in length and nucleotide composition. The overall ratio of transition/transversion (ti/tv) of 0.724 for the intergenic spacer and 0.258 for the pooled sequences were estimated, and indicated that transversions are more frequent than transitions. The molecular evolution and demographic history of Q. suber were investigated. Neutrality tests (Tajima’s D and Fu and Li) ruled out the null hypothesis of a strictly neutral model, and Fu’s Fs and Ramos-Onsins and Rozas’ R2 confirmed the recent expansion of cork oak trees, validating its persistency in North Africa since the last glaciation during the Quaternary. The observed uni-modal mismatch distribution and the Harpending’s raggedness index confirmed the demographic history model for cork oak. A phylogenetic dendrogram showed that the distribution of Q. suber trees occurs independently of geographical origin, the relief of the population site, and the bioclimatic stages. The molecular history and cytoplasmic diversity suggest that in situ and ex situ conservation strategies can be recommended for preserving landscape value and facing predictable future climatic changes.
The genus Quercus comprises approximately 500 species of trees and shrub, which are distributed throughout much of the Northern hemisphere (Nixon, 1993). Oaks are conspicuous members of the temperate deciduous forests of Europe, North America, and Asia. In addition, Quercus is an important evergreen element of Mediterranean woodlands and subtropical forests. The largely temperate species of the subgenus Quercus are distinguished from the strictly South-East Asian members of the subgenus Cyclobalanopsis by several characters, notably the presence of expanded stigmatic surfaces on the pistillate flowers and small, inconspicuous bracts, which subtend single-staminate flowers (Nixon, 1993). Quercus suber L. is an emblematic Mediterranean evergreen sclerophyllous tree. Cork oak is a slow-growing, extremely long-lived evergreen tree (Zucca, 2012). It is a monoecious wind-pollinated species with a protandrous system to ensure cross-pollination (Zucca, 2012). The reproductive system (mostly sexual) and dissemination (gravity and zoochory) of cork oak are consistent with those of most oak species (Magri et al., 2007). These trees may reach about 20 m in height, with massive branches forming a round crown. The thickness of the soft bark is the source of cork, which is stripped every 10-12 years from the outer layer of the bark along the lower portion of the trunk. The modern distribution of Q. suber is rather discontinuous, and ranges from the Atlantic coasts of North Africa and the Iberian Peninsula to the Southern regions of Italy (especially Sardinia and Sicily), and includes the main West Mediterranean island as well as Morocco, coastal belts of Maghreb (Algeria and Tunisia), Provence (France), and Catalonia (Spain) (Zucca, 2012). Q. suber is widely cultivated within its natural range for cork production (Zucca, 2012). According to Carrion-Vazquez et al. (2000), without human activity, Q. suber would never develop pure stands in the Iberian Peninsula, and would form mixed forests with xerophyllous and deciduous oaks together with Pinus pinaster. In 2006, cork oak landscapes covered an area of nearly 2,277,700 ha, in Portugal, Spain, Italy, France, Morocco, Algeria, and Tunisia (ANF, 2006). Analysis of the world distribution of cork oak forests (ANF, 2006) reveals that Portugal has the largest area (736,700 ha), followed by Spain (506,000 ha), Algeria (414,000 ha), and Morocco (350,000 ha). France, Tunisia, and Italy have the smallest areas (92,000 ha). This reveals that Portugal contains nearly one-third of the world cork oak forest area; therefore, the environmental services and functions related to this species have special importance in this country. Cork oak (Q. suber) is an endemic feature of the western Mediterranean basin. It occupies large areas and impacts sylvatic populations and diverse vegetation structures in the southern (Morocco, Algeria, Tunisia) and the north-western Mediterranean basin [Italy (Sardinia, Sicily), southwestern France, Spain, and Portugal].
In Tunisia, the cork oak accounts for 25% of hardwood species and occupies the northern part of the country, including two natural regions of Kroumirie and Mogods. Q. suber currently covers an area of approximately 90.423 ha or 5.5% of the total area of forest species. The cork oak is currently located mainly in Jendouba, Beja, Bizerte, and Nabeul localities of Northern Tunisia. Renovation of the cork oak forests in the Mediterranean represents a comprehensive approach to the development of space, including fire protection and regeneration of the species. These priorities are supported by the development of forestry policy in the countries concerned. In Tunisia, more than one attempt at cork oak regeneration has been made, including regeneration by seedlings with the development of breeding techniques for oak seedlings in modern forest nurseries. The economic importance of the cork and ecological cork forests in the Mediterranean and globally is widely reported by ecologists. Current yields indicate cork production of approximately 307.500 tons, with Tunisia currently producing only 3% of total world production.
In the Quercus genus, phylogenetic relationships based on morphology occur due to pronounced variation in vegetation, which is sharply contrasted with a stabilized, seemingly constant set of floral characters (Tucker, 1974). Most studies on the genetic variation of Quercus forest trees have been carried out using nuclear markers. The chloroplast genome typically shows low intraspecific variation, although in the last decade, the development of new marker systems has allowed the detection of sufficient variation to assess phylogeographical patterns.
In recent years, cork oak has been the subject of intensive genetic studies, which have utilized different markers (Olalde et al., 2002). Strong differentiation among populations has been detected around the Mediterranean basin by investigating both chloroplast and mitochondrial DNA, which are maternally inherited in oaks, as demonstrated by Dumolin et al. (1995), as well as allozyme variation. Moreover, evidence of cytoplasmic introgression of Q. suber by Quercus ilex genes has been reported (Lumaret et al., 2005). Although nuclear microsatellites have been available for oaks for many years (Kampfer et al., 1998), only two studies have reported multiplexing at no more than five loci (Lepais et al., 2006). Thus, analyzing large oak populations using multiple markers remains expensive and time consuming.
Chloroplast DNA (cpDNA) sequence variation is widely used in systematics and to make phylogenetic inferences at different taxonomic levels. Introns and intergenic spacers show high rates of mutation (Baraket et al., 2010). The trnT-trnL and trnL-trnF intergenic spacers and trnL intron are useful for evolutionary studies at low taxonomic levels, and these regions have been used extensively in phylogenetic studies for the analysis of cytoplasmic polymorphism, and to study the demographic history of several species (Baraket et al., 2010, 2015Mustapha et al., 2015). Several hypotheses have been developed regarding the evolutionary history of cork oak as well as its center of origin, which has been proposed to lie in the eastern countries of the western Mediterranean Basin (Lumaret et al., 2005; Magri et al., 2007). Variation in cpDNA has been studied by PCR-RFLP and microsatellite markers to determine the route and pattern of postglacial recolonization of native oak throughout Europe and the Mediterranean Basin (Lumaret et al., 2005; Magri et al., 2007).
Here, we studied the genetic relationships and molecular evolution among six Tunisian populations of cork oak based on non-coding regions of cpDNA, i.e., the intergenic spacer between trnL (UAA) and trnF (GAA) and the combined region [trnL (UAA) intron and the intergenic spacer trnL-trnF]. Here, we explored cpDNA to analyze population genetic differentiation using non-coding regions and to establish genetic relationships among cork oak trees. This report also describes a molecular evolution scenario and demographic history in order to understand how this species responded to past global and severe climatic changes with particular attention given to recommendations for conservation strategies of this forestry feature.
MATERIAL AND METHODS
Plant material composed of young leaves was collected from Northern Tunisia (Kroumirie, Mogods), Cap Bon (Figure 1). We analyzed a set of 10 individuals from each population as follows: Population 1: Hammam Bourguiba (HB), Population 2: Béni Mtir (BM), Population 3: Bellif (B) (Nefza), Population 4: Jebel Khroufa (DK) (Tabarka), Population 5: Keff El Rand (KR) (El Haouaria), and Population 6: Hammam Jdidi (HJ) (Hammamet) (Table 1 and Figure 1).
Tunisian map showing the geographic origin of Quercus suber L. in North Tunisia.
Geographical, bioclimatic, and relief characteristics of different populations of Cork oak (Quercus suber L.) in Tunisia.
|Code||Site||Area||Geographical distribution||Relief||Altitude (m)|
|BM||Beni M'ttir||Ain Drahem||Western North (Kroumirie)||Mountain||800|
|HB||Hammam Bourguiba||Ain Drahem||Western North (Kroumirie)||Mountain||570|
|KR||Keff El Rand||Haouaria||North East (Cap Bon)||Mountain||642|
|HJ||Hammam Jdidi||Hammamet||North East (Cap Bon)||Mountain||174|
|DK||Djebel Khroufa||Tabarka||Western North (Mogods)||Tell Atlas||160|
|B||Bellif||Tabarka||Western North (Mogods)||Tell Atlas||88|
Total cellular DNA was extracted from frozen leaves according the procedure (Qiacube Qiagen) described by Bernatzky and Tanksley (1986). DNA quality was examined by electrophoresis on 0.8% agarose gel. DNA concentration was estimated using a Gene Quant minispectrophotometry (Qubit) at 260 nm and DNA was stored at -4°C until use in PCR.
Amplification and sequencing
The cpDNA spacer region between the trnL (UAA) - trnF (GAA) and the trnL (UAA) introns was amplified from total genomic DNA as a template using PCR. Amplification of the trnL intron and the adjacent trnL-trnF spacer was achieved using the forward and reverse primers designed by Taberlet et al. (1991). The primer pairs, denoted ‘c’ (5'-CGAAATCGGTAGACGCTACG-3'), ‘d’ (5'-GGGGATAGAGGGACTTGAAC-3') ‘e’ (5'-GGTTCAAGTCCCTCTATCCC-3'), and ‘f’ (5'-ATTTGAACTGGTGACACGAG-3'), were used to amplify the intron of trnL (UAA) 5'-exon, and the trnL (UAA) 3'-exon (i.e., the intron trnL) and the spacer between the trnL (UAA) 3'-exon and the trnF (GAA) exon (i.e., the trnL-trnF spacer), respectively. The two non-coding regions of the cpDNA were amplified by PCR in a DNA thermocycler (Gene Amp® PCR System 9700: PE Applied Biosystem). Conditions for PCR amplification were as follows: 25 mM MgCl2, 2 mM dNTP mix, 1 µM each primer, and 1 U DNA Taq polymerase (Fermentas) with 20 ng DNA in a final reaction volume of 25 µL. Cycling conditions were: 94°C for 4 min as an initial denaturation step before entering 35 cycles each composed of 1 min at 94°C, 1 min at 51°C, and 2 min at 72°C. A final extension step of 10 min at 72°C was usually programmed at the last cycle. Agarose gel electrophoresis (1.5%) and ethidium bromide staining were used to check the PCR products. The sequences obtained were published in GenBank with accession Nos. KR780486 to KR780545 for the intergenic spacer and KP693717 to KP693776 for the intron (
Localities, accession numbers, variations in length, GC and AT contents of the trnL-trnF spacer and combined region, trnL-trnF spacer and trnL intron in the studied Quercus trees.
|Relief||Code||Locality||Accession No.||trnL-trnF spacer||Combined: trnL-trnF spacer and trnL intron|
|%AT||%GC||Length (bp)||%AT||%GC||Length (bp)|
|Tell Atlas||B7-4||Ain Drahem||KR780492||67.9||32.1||448.0||67.5||32.5||968.0|
|Tell Atlas||BM1-4||Ain Drahem||KR780496||68.2||31.8||443.0||67.6||32.4||963.0|
Sequences were aligned using the ClustalW package and analyzed with MEGA program version 6 (Tamura et al., 2011). Aligned sequences were analyzed with the DnaSP software version 5.10.01 (Librado and Rozas, 2009) to estimate polymorphism indices, molecular evolution, and past demographic history. Indices of haplotype diversity (Hd) (Nei and Tajima, 1983) and pairwise estimates of nucleotide diversity (Pi) (Jukes and Cantor, 1969) were used to evaluate genetic diversity among Q. suber populations. The average pairwise nucleotide differences (K), Theta (per site), and the minimum number of recombination events (Rm) were also estimated. Selective neutrality was tested by both Tajima’s D (Tajima, 1989) and Fu and Li’s D* and F* methods (Fu and Li, 1993). Demographic parameters were assessed using the distribution of pairwise sequence differences (Mismatch distribution) of Rogers and Harpending (1992) and site-frequency spectra (distribution of the allelic frequency at a site) of Tajima (1989). Fu’s Fs (Fu, 1997) and Ramos-Onsins and Rozas’ R2 (Ramos-Onsins and Rozas, 2002) tests for mutation/drift equilibrium were also performed in DnaSP version 5.10.01 with 10,000 simulations (Librado and Rozas, 2009). The overall validity of the estimated demographic model was evaluated by tests of raggedness index r (Harpending, 1994). The significance of r was assessed by parametric bootstraps (10,000 replicates), and the significant value was taken as evidence for departure from the estimated demographic model of sudden population expansion. For each sequence, length, and contents, the proportion of AT and GC were estimated. The observed ratio of the mutational event was also calculated.
The alignment was manually checked and pairwise sequence divergence between trees in trnL-trnF intergenic spacer and the combined trnL-F region was calculated according to the maximum composite likelihood (MCL) method (Tamura et al., 2004). The evolutionary history was inferred using the maximum likelihood method based on the Tamura-Nei model (Tamura and Nei, 1993). Initial trees for the heuristic search were obtained automatically by applying neighbor-joining and BioNJ algorithms to a matrix of pairwise distances estimated using the MCL approach, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured by the number of substitutions per site. All positions containing gaps and missing data were eliminated using the MEGA Version 6 software (Tamura et al., 2011). The maximum parsimony (MP) tree was obtained using the Subtree-Pruning-Regrafting algorithm (Nei and Kumar, 2000) with search level 1, in which the initial trees were obtained by the random addition of sequences. To evaluate the relative robustness of the groups found in the relationship analysis, 1000 bootstrap replicates were calculated.
Sequence variation in the trnL-trnF spacer
The length of the trnL-trnF intergenic spacer ranged from 440 bp for ‘BM8-4’ to 452 bp for ‘KR9-4’, ‘DK6-4’, and ‘BM9-4’ cork oak with an average length of 448.7 bp. In addition, the AT content of the amplified sequences varied from 66.7 to 69.6% in the trnL-trnF spacer, and the GC content ranged from 30.4 to 33.3% (Table 2). The trnL-trnF spacer has undergone single mutation events such as indel (insertion/deletion) and substitution events (transversion and transition). The transition/transversion rate ratios were K1 = 1.431 (purines) and K2 = 1.883 (pyrimidines). An overall transition/transversion ratio R of 0.724 occurred in the cpDNA trnL-trnF spacer. This result shows that transversions are more frequent than transitions in the intergenic spacer (Figure 2).
Maximum composite likelihood divergence plotted against the number of transitions (s: crosses) and transversions (v: triangles) for pairwise comparisons of the trnL-trnF spacer (
In addition, the nucleotide frequencies were 38.84, 28.44, 15.72, and 16.99% for A, T, C, and G, respectively. The different substitutions detected are presented in Table 3, which shows that in Tunisian cork oak, A → G and T → C transitions are more frequent than G → A and C → T transitions. The cpDNA data set for the broad investigation of Q. suber individuals consisted of 412 aligned nucleotide positions with 309 conserved sites, 103 variable sites, 67 of which were parsimony informative and 36 of which were singletons.
Composite likelihood estimate of the pattern of nucleotide substitution in the trnL-trnF spacer and the combined sequences trnL-trnF spacer and trnL intron.
Sequence variation in the combined sequence of the trnL intron and trnL-trnF spacer
The length of the combined region ranged from 932 bp for ‘BM9’ to 972 bp for ‘B3’, ‘BM10’, ‘DK6’, ‘HJ5’, and ‘KR1’ trees, among the analyzed cork oaks. In addition, the AT content of the amplified sequences varied from 67.0 to 68.5% in the combined region and the GC content ranged from 31.5 to 33.0% (Table 2). The transition/transversion ratios were K1 = 1.008 (purines) and K2 = 0.194 (pyrimidines). The overall transition/transversion ratio, R, was of 0.258 in the combined region. The rate of the transitional substitutions (A/G: 12.89, T/C: 2.6, C/T: 1.18, and G/A: 6.26) were lower than the rates of transversions (A/T: 12.79, A/C: 12.79, T/A: 13.45, T/G: 13.45, C/A: 6.09, C/G: 6.09, G/T: 6.21, and G/C: 6.21) (Table 3), which confirms the high AT content. This may explain the high proportion of identified transversions (Figure 2). Of the 102 variable sites, 70 were parsimony-informative characters and 32 were singleton variable sites in Quercus sequences (Table 4). The nucleotide frequencies were 33.19, 34.90, 15.80, and 16.12% for A, T, C, and G, respectively.
Comparison of data from the trnL-trnF spacer and the combined region of the trnL-trnF spacer and the trnL intron.
|Total informative||No. of tree||Length||CI||RI||HI||BS ≥ 50%|
|trnL-trnF||412/67 = 6.14||137||171||0.888||0.875||0.112||11|
|Combined||856/70 = 12.22||110||251||0.548||0.751||0.452||14|
Total informative = total of characters (excluding gaps)/number of informative characters; CI = consistency index; RI = retention index; HI = homoplasy index; BS = number of branches that contained bootstrap support close to 50%.
Genetic diversity and differentiation
A high level of variation in cpDNA was detected in cork oak. Sequence alignment permitted the identification of 57 cpDNA haplotypes among 60 examined sequences for the intergenic spacer and 48 for the combined sequences. Haplotype diversity (Hd) values of 0.998 for the trnL-trnF spacer and 0.989 for the combined sequences were obtained in our data set (Table 5). Nucleotide diversity of cork oak was 0.03152 (3.152%) for the trnL-trnF spacer, and 0.01210 (1.210%) for the combined sequences (Table 5). Theta values (per site) and the average number of pairwise differences (K) were 0.05778 and 12.986, respectively, for the trnL-trnF spacer, and 0.02656 and 10.356, respectively, for the combined region (Table 5). The combined region trnL-trnF showed more polymorphic sites than the trnL-trnF spacer. Hence, it was more informative and proved to be very effective in differentiating the cork oak populations. Moreover, the trnL intron has been used successfully in Angiosperms (Nei and Kumar, 2000). The nucleotide diversity (Pi) and segregating sites (S), illustrated in Figure 3a and b, show the occurrence of substitutions along the analyzed sequences. A high level of nucleotide diversity was observed in the trnL-trnF spacer, and in the combined sequences. This illustrates the levels of polymorphism in Tunisian Q. suber and demonstrates the utility of the trnL-trnF spacer and the pooled sequences of the trnL intron and trnL-trnF spacer to reveal divergence at the intra-specific level. A 100-bp sliding window shows the distribution of nucleotide substitution diversity and segregating sites over the cpDNA region assayed.
Polymorphism parameters and Tajima’s neutrality tests for 60 sequences of the trnL-trnF spacer, and the combined region of the Tunisian cork oak accessions used in this study.
|trnL-trnF spacer||Combined: trnL-trnF spacer and trnL intron|
|Number of sequences||60||60|
|Alignment length (bp)||454||980|
|Parsimony informative characters||67||70|
|Singleton variable sites||36||32|
|Total number of mutation||111||106|
|Number of polymorphic sites (S)||103||102|
|Number of haplotypes (H)||57||48|
|Haplotype diversity (Hd) ± SD||0.998 ± 0.003||0.989 ± 0.006|
|Variance of haplotype diversity||0.00001||0.00004|
|Nucleotide diversity (Pi)||0.03152||0.01210|
|Theta (per site)||0.05778||0.02656|
|Average of pairwise differences (K)||12.986||10.356|
|Minimum number of recombination events (Rm)||14||13|
|Tajima’s D||-1.58503 NS (0.10 > P > 0.05)||-1.89622 *(P < 0.05)|
|Fu and Li’s D*||-1.44212 NS (P > 0.10)||-1.22759 NS (P > 0.10)|
|Fu and Li’s F*||-1.79379 NS (P > 0.10)||-1.78053 NS (P > 0.10)|
|Fu’s Fs statistic||-52,249 (P = 0.000)||-33.722 (P = 0.000)|
Base-pair sliding window of the trnL-trnF spacer (a1, b1) and the combined region: trnL-trnF spacer and trnL intron (a2, b2) of cpDNA of Quercus suber L.
MP using the trnL-trnF spacer
Relationships between cork oak accessions were examined taking into account variation in the intergenic trnL-trnF spacer. To clarify the genetic relationships between haplotypes, the dendrogram was constructed using MP. Haplotype parsimony analysis identified 240 parsimonious trees. Our analysis yielded one most parsimonious tree, 240 steps long, with a consistency index (CI) of 0.446, a retention index (RI) of 0.745, and a homoplasy index (HI) of 0.554 (Table 4 and Figure 4). Consequently, all data sets exhibited low homoplasy levels, demonstrating the utility of the trnL-trnF spacer region to elucidate genetic relationships among Q. suber populations (Figure 4).
Strict consensus tree of 140 most parsimonious trees (length = 240 steps, CI = 0. 446; RI = 0. 745) obtained from the trnL-trnF spacer data from Tunisian cork oak. Numbers below nodes represent bootstrap values.
MP using combined sequences
Parsimony analysis identified 167 parsimonious trees. Our analysis yielded one most parsimonious tree, 167 steps long, with CI of 0.548, RI of 0.751, and HI of 0.452 (Table 4 and Figure 5). Consequently, all data sets showed low homoplasy levels, demonstrating the utility of the combined sequences to elucidate genetic relationships among Q. suber populations (Table 4).
Strict consensus tree of 167 most parsimonious trees (length = 167 steps, CI = 0.548, RI = 0.751) obtained from the combined trnL-trnF region data from Tunisian cork oak. Numbers below nodes represent bootstrap values.
Genetic relationships using the trnL-trnF spacer
The pairwise sequence divergence ranged from 0.000 to 1.107, with an average of 0.025 indicating polymorphism among Q. suber populations. The low level of plastid DNA variation exhibited by Q. suber populations is probably related to its slow rate of evolution. The minimum distance (0.000) was observed between ‘Bellif’ (3-6), ‘Béni Mtir’ (2-6; 3-6; 6-6; 7-6; 9-6), ‘Djebel Khroufa’ (2-6; 4-6; 5-6; 7-6; 10-6), ‘Hammam Bourguiba’ (8-6; 9-6; 10-6), ‘Keff El Rand’ (1-6; 6-6), and ‘Hammam Jdidi’ (1-6) accessions, and the maximum distance (1.107) was observed between ‘Bellif’ (4-6) and ‘Keff El Rand’ (8-6; 10-6) population suggesting the important dissimilarity between sequences of the trnL-trnF spacer. To elucidate genetic relationships among Tunisian Q. suber populations, a dendrogram was constructed using the ML method. The dendrogram classified the 60 individuals studied into two main clusters (BS = 77) (Figure 6A). The first group designated I, included ‘DK3-6’, ‘B4-6’, and ‘B5-6’. All the remaining accessions are clustered in the second group (II). These accessions showed a high level of similarity between each other. In fact, it is important to note that Q. suber individuals are distributed independently to the geographical origin, the bioclimatic stage, and the relief of populations.
Maximum-likelihood tree obtained from the trnL-trnF spacer (
Of note, the present study has shown that a sequencing approach is more efficient at examining the phylogenetic relationships in this species, and may be extended to other populations of Tunisian Q. suber. Mediterranean evergreen Quercus species are a group with overlapping habitats, which often leads to their consideration as a homogeneous entity in botanical, biogeographical, or paleo-historical studies. In the Western Mediterranean Basin, Q. suber (cork oak), Q. ilex (holm oak), and Q. coccifera (kermes oak) are the dominant broad-leaved species. The three species are sympatric in many areas, but have some different evolutionary histories. Previous studies have shown differences in their genetic variation patterns at both nuclear and cytoplasmic levels (Lumaret et al., 2005).
Genetic relationships using the combined sequences
Pairwise sequence divergence ranged from 0.000 to 0.056 with an average of 0.012. The ML dendrogram showed two main groups (BS = 80) (Figure 6B). The first group consisted of ‘DK3’, ‘B4’, and ‘B5’ oaks, which differs from the other trees. All the remaining accessions clustered in the second group. However, the dendrogram topology of the trnL-trnF spacer is congruent with data provided by the combined sequences. Moreover, the combined sequences showed the same informative characters with the spacer. In our study, the combined data provided the same supported resolution with the intergenic spacer.
Tajima’s and Fu and Li’s tests
The selective neutrality of the detected mutations was tested using the methods of Tajima (1989) and Fu and Li (1993) to examine the null hypothesis. Selective neutrality tests show that both tests were negative and not significant in the total studied sample, with the exception that the Tajima test was negative and significant for the combined sequences (Table 5). These parameters reject neutrality in the regions studied. The deviation from selective neutrality, as evidenced by Tajima tests, is explained by an excess of rare mutations as identified in the studied sequences. The excess of low frequency polymorphism can be explained for Quercus populations in Tunisia.
Fu’s Fs and R2 statistics
To describe the cause of the deviation from neutrality, the R2 and Fu’s Fs statistic were estimated. For cork oak trees, Fu’s Fs were negative (Fu’s Fs = -52,249 for the trnL-trnF spacer, and Fu’s Fs = -33.722, for the collective sequences). Conversely, R2 showed positive values (R2 = 0.0597 for the trnL-trnF spacer, and R2 = 0.0477, for the pooled sequences). The results reject a model of constant population size, and the observed patterns of variation provide evidence that cork oak trees have been undergoing rapid expansion. Fu’s statistic is more powerful at detecting the deviation from neutrality and thereby attesting population expansion or a genetic hitchhiking in the data, which produces a similar pattern like that of recent population expansion. Our results demonstrate that the genetic effect of hitchhiking was ruled out by the Fu test, since the observed values were negative (Fu’s Fs = -52,249; P = 0.000 for the intergenic spacer, and Fu’s Fs = -33.722; P = 0.000 for the combined sequences, respectively) (Table 5 and Figure 7).
Changes in population size leave particular footprints that may eventually be detected in DNA sequence data (Tajima, 1989; Rogers and Harpending, 1992). The empirical distribution of the pairwise number of nucleotide differences and segregating sites among pairs of individuals shows a skew in the mismatch distribution that appears uni-modal for the spacer, and the aggregated sequences (Figure 7). The observed uni-modal mismatch distribution and the low Harpending’s raggedness index (r = 0.0019 for the trnL-trnF spacer and r = 0.0034 for the pooled sequences) (Figure 7) calculated for cork oak cpDNA haplotypes confirm the model of recent expansion. Usually, average values of Tajima’s D found in most plant species studied so far are likely to represent the results of demographic processes (such as expanding populations, introgression, extinction, and recolonization events) and/or weak selection, which allows deleterious variants to give rise to a low population frequency (Wachowiak et al., 2009).
To further examine whether the genetic hitchhiking acts in the trnL-trnF region, linkage disequilibrium was estimated among sites. The spacer has lower nucleotide diversity and lower widespread linkage disequilibrium compared to the trnL intron. Figure 8 shows the increase in linkage disequilibrium in the trnL intron. We noted a significant reduction in the level of linkage disequilibrium at the spacer-linked site (Figure 8), consistent with sweep selection and the lacking of a hitchhiking genetic effect.
Linkage disequilibrium estimated from the pooled data for the trnL intron and trnL-trnF intergenic spacer of Tunisian Quercus suber L.
In this study, the trnL-trnF spacer and the combined region (trnL intron + trnL-trnF spacer) of cpDNA were used to study the genetic diversity and molecular evolution of cork oak populations, the establishment of relationships, and to elaborate on a scenario for the origin of the Tunisian Q. suber forest. The sequence analysis showed variations in both the lengths and nucleotide compositions for all cork individuals. The average size of the trnL-trnF spacer of the Q. suber tree was 448.7 bp. Similar results have been reported in P. dactylefera (364-397 bp) (Sakka et al., 2013) and in other plants such as the Pinus genus (467-471 bp) (Chen et al., 2002). The nucleotide composition of the trnL-trnF spacer in the Tunisian cork oak (AT = 67.9%; GC = 32.1%) is similar to that observed in the total trnL-trnF region of Angiosperm species (Actaea, Digitalis, Drosera) (Bakker et al., 2000), in Ficus carica (Baraket et al., 2010), in P. dactylefera (Sakka et al., 2013), and in 19 other cpDNA non-coding regions in Poaceae (Chen et al., 2002) and, in Rubiaceae, for the atpB-rbcL (Manen and Natali, 1995). In fact, the GC contents varied from 28 to 32% for the psbA-trnH intergenic spacer of date palm, from 39 to 41% for the rpoB spacer as reported by Al-Qurainy et al. (2011) and from 33 to 34.4% for the trnL-trnF spacer as reported by Sakka et al. (2013). The plastome genome was AT rich (63%) in Q. suber.
The transition/transversion (ti/tv) ratio was estimated, and was similar to values previously reported for the trnL-trnF spacer of cpDNA among Fagus sylvatica, Q. macrolepis, Q. trojana, Q. ilex, Q. suber, Q. pubescens, and Q. robur, horny oak, and Q. ilex (ti/tv = 0.5) (Paffetti et al., 2001), in Ficus carica L. (ti/tv = 0.9) (Baraket et al., 2010), and in other Angiosperms for the trnL-F and the atpB-rbcL intergenic spacers (Manen and Natali, 1995; Bakker et al., 2000), and in cork oak (ti/tv = 7.60) (Cosimo et al., 2009). However, this differs from the values observed in P. dactylefera (ti/tv = 2.19) (Sakka et al., 2013).
Nucleotide frequencies were calculated and a similar composition has been reported in cork oak, in which the base frequencies were A = 18.33%, C = 33.66%, G = 30.71%, and T = 17.30% (Cosimo et al., 2009).
The numbers of conserved and variable sites determined in this study were found to be similar to values observed for the trnL-trnF intergenic region of six additional Quercus species: Q. suber L., Q. trojana Webb, Q. macrolepis Kotschy, Q. ilex L., Q. coccijera L., and Q. calliprinos. Different results have been reported in the Quercus genus, for which the data matrix listed 53 taxa (with four outgroups) and 595 characters (545 constant, 21 variable but parsimony-uninformative and 29 parsimony-informative). Six characters (four parsimony-informative) derived from indel coding were added to the MP analysis (heuristic search and bootstrap) (Cosimo et al., 2009). In the second part, the average size of the trnL-trnF region of the Tunisian Q. suber tree was estimated at 966.9 bp. Similar results have been reported for P. dactylefera (897-981 bp) (Sakka et al., 2013), Ficus carica (989-1022 bp) (Baraket et al., 2010), and Q. rubra (902 bp) (Gielly and Taberlet, 1994). The nucleotide composition in the pooled region of the Tunisian cork oak (AT = 67.6%; GC = 32.4%) was similar to that estimated for the combined sequences in Q. rubra by Gielly and Taberlet (1994) and in F. carica by Baraket et al. (2010). The ti/tv ratio was estimated (0.258) in the combined region. Conversely, in other studies, high levels of ti/tv have been found in P. dactylefera (ti/tv = 1.282) (Sakka et al., 2013), which show that transitions are slightly more common compared to transversions in the combined trnL-trnF spacer and trnL intron. We interpret these values as indicating different rates of transition and transversion. This result is consistent with findings in Angiosperm species, with a ti/tv ratio not exceeding 1 for any of the examined groups (Bakker et al., 2000). The haplotype diversity for the trnL-trnF spacer (Hd = 0.998) and for the combined sequences (Hd = 0.989) was estimated in our data set (Table 5). The trnL-trnF spacer and the trnL-trnF combined region of P. dactylefera showed the highest haplotype diversity of 0.848 and 0.955, respectively (Sakka et al., 2013). Nucleotide diversity of cork oak was 0.03152 (3.152%) for the trnL-trnF spacer, and 0.01210 (1.210%) for the combined sequences (Table 5). However, high nucleotide diversity has been found in other chloroplast non-coding regions of date palm such as psbA-trnH, rpoB genes (6.86 and 2.76%, respectively) (Manen and Natali, 1995), the trnL-trnF spacer (1.22%), and for the trnL-trnF region (1.19%) (Sakka et al., 2013). This is higher than the nucleotide diversity obtained for the accessions of Fagopyrum cymosum in the three non-coding regions of the rbcL-accD and spacer region (0.00260); trnC-rpoB (0.00311) and trnK-matK (0.00304), and similar to the nucleotide diversity obtained for the accessions of P. dactylefera in the trnL-trnF spacer and the trnL-trnF combined region (Pi = 0.01220 and Pi = 0.01190). The Theta (per site) and average number of pairwise differences (K) were 0.05778 and 12.986, respectively, for the trnL-trnF spacer, and 0.02656 and 10.356, respectively, for the combined region (Table 5). However, high values for average pairwise differences (K) have been found in other chloroplast non-coding regions of date palm such as psbA-trnH and rpoB (45.893 and 11.321, respectively) (Al-Qurainy et al., 2011), of P. dactylefera for the trnL-trnF spacer (0.01870 and 3.697, respectively), and for the trnL-trnF region (0.01940 and 8.803, respectively) (Sakka et al., 2013).
Genetic relationships of the trnL-trnF spacer and combined region show that the distribution of Q. suber individuals is independent of the geographical origin, the bioclimatic stage, and the relief of populations. The MP tree, with an RI index of 0.745 for the trnL-trnF spacer and of 0.751 for the pooled sequences, indicated minimal homoplasy within the data set. Furthermore, our results demonstrate that the trnL-trnF non-coding region is the location of numerous substitutions. Similar results were obtained by Cosimo et al. (2009) and show that the MP analysis produced 13,320 most parsimonious trees with 85 steps in length, CI of 0.682, and RI of 0.866.
Both Tajima’s D and Fu’s Fs tests revealed negative values, but only Fu’s Fs was statistically significant from intron and spacer sequences (Fs = -52.249, P = 0.000 for the intron, and Fs = -33.722, P = 0.000 for the spacer). However, only Tajima’s D tests revealed negative and significant values for the pooled sequences (D* = -1.89622; P < 0.05). This could be due to Fu’s Fs test having more power than Tajima’s D test to detect population expansion (Tamura and Nei, 1993). Neutrality tests reflect the degree of active population expansion or positive selection as evidenced by the Fu and Li test values for the trnL-trnF spacer and combined region. In fact, sites at 170 bp for the trnL-trnF spacer and to 220 bp for the trnL intron in the combined data set are indicated as the sites where selection occurs. This result is similar to those found in Q. crispula Blume by Quang et al. (2008), which shows that no significant Tajima’s D and Fu and Li’s D* and F* were observed, and similar to within-populations of Q. crispula (Quang et al., 2008). Furthermore, palynological studies have suggested that oak forests covered a broad area in southern Japan during the last glacial maximum and have just recently reached the Northern areas. In Southern Japan, the frequency of oak pollen rapidly increased 12,000 years ago and then began to decrease from 8500 years ago (Takahara, 1998). As suggested by Magri et al. (2007) and Lumaret et al. (2005), the distribution of cpDNA haplotypes of cork oak demonstrates its postglacial population expansion from potential glacial refuges in Italy, North Africa, and Iberia. Our results confirm this assumption, propose Tunisia as a refuge zone of cork oak, and validate its persistency in North Africa since the last glaciation during the Quaternary. Its long-term stability and persistence are demonstrated, and there appears to be preserved footprints of past expansion.
In summary, findings emerging from the present study reveal considerable intraspecific variation in the trnL intron and trnL-trnF spacer as non-coding regions of plastid DNA of Tunisian Q. suber populations and indicate a recent and rapid expansion of cork oaks. Results supported this scenario and rejected the neutral model of molecular evolution for the trnL-trnF spacer and combined region (trnL-trnF spacer and trnL intron) of Q. suber L. Cytoplasmic diversity seems to be unstructured with any biogeographic repartition, bioclimatic stage, or relief of populations. The results of the present study highlight that cytoplasmic DNA markers are reliable for use to elaborate molecular databases to implement breeding and rational management programs, in order to conserve and improve Tunisian cork oak forests. The results suggest that Q. suber populations possess considerable genetic diversity and should be given the highest priority for conservation. This is the first report concerning the detection of plastome mutational events for Tunisian Cork oaks and the markers identified are useful for studying cytoplasmic genetic diversity and differentiation of this species, and provide basic genetic information that should generate proposals for conservation and management of forest genetic resources. The present results provide managers with precise information that is essential for the evaluation of resources for gene conservation, and for prioritizing future actions. To successfully manage climatic changes and to address problems of deforestation, we recommend in situ and ex situ conservation strategies for all the populations studied herein in order to maintain their evolutionary potential, which has been present since quaternary glaciations, in Tunisia and the Mediterranean basin.