1. Identification and testing of reference genes for Sesame gene expression analysis by quantitative real-time PCR
Sesame (Sesamum indicum L.) is an ancient and important oilseed crop. However, few sesame reference
genes have been selected for quantitative real-time PCR until now. Screening and validating reference genes
is a requisite for gene expression normalization in sesame functional genomics research. In this study, ten
candidate reference genes, i.e., SiACT, SiUBQ6, SiTUB, Si18S rRNA, SiEF1α, SiCYP, SiHistone,
SiDNAJ, SiAPT and SiGAPDH, were chosen and examined systematically in 32 sesame samples. Three qRT-PCR analysis methods, i.e., geNorm, NormFinder and BestKeeper, were evaluated systematically. Results
indicated that all ten candidate reference genes could be used as reference genes in sesame. SiUBQ6 and
SiAPT were the optimal reference genes for sesame plant development; SiTUB was suitable for sesame
vegetative tissue development, SiDNAJ for pathogen treatment, SiHistone for abiotic stress, SiUBQ6 for bud
development and SiACT for seed germination. For hormone treatment, SiHistone, SiCYP, SiDNAJ or
SiUBQ6 could be used as the reference gene; for seed development, SiACT, SiDNAJ, SiTUB or SiAPT
could be used. To illustrate the suitability of these reference genes, we analyzed, for the first time,
the expression variation of three functional sesame genes, SiSS, SiLEA and SiGH, in different organs
using the optimal qRT-PCR system. The stability levels of the optimal and worst reference genes screened
for seed development, anther sterility and plant development were validated in qRT-PCR normalization. Our
results provide a guideline for applying reference genes in sesame gene expression characterization by qRT-PCR.
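As a rough illustration of how a stability ranking of the kind reported above can be computed, the sketch below implements the geNorm-style stability measure M: for each gene, the average standard deviation of its log2 expression ratios against every other candidate across samples, where a lower M means a more stable gene. The gene names and expression values are invented toy data, not results from the study, and the function name `genorm_m` is hypothetical; the geNorm software itself may differ in details such as stepwise exclusion of the least stable gene.

```python
import math
from statistics import mean, stdev


def genorm_m(quantities):
    """Return a geNorm-style stability value M for each gene.

    quantities: dict mapping gene name -> list of relative expression
    quantities (linear scale, one value per sample, same sample order).
    """
    genes = list(quantities)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # log2 ratio of gene j to gene k in every sample
            log_ratios = [
                math.log2(a / b) for a, b in zip(quantities[j], quantities[k])
            ]
            # pairwise variation = standard deviation of those log ratios
            pairwise_sd.append(stdev(log_ratios))
        # M is the mean pairwise variation against all other candidates
        m_values[j] = mean(pairwise_sd)
    return m_values


# Toy data (assumption): geneA and geneB keep a constant ratio, geneC does not.
toy = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],
    "geneC": [1.0, 8.0, 2.0, 16.0],
}
m = genorm_m(toy)
ranking = sorted(m, key=m.get)  # most stable first
```

In this toy set, geneA and geneB receive the same M and geneC ranks last, because only geneC changes its ratio to the others from sample to sample.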
2. Development and validation of genic-SSR markers in sesame by RNA-seq
To mine SSR markers for sesame molecular genetics research, 75 bp and 100 bp paired-end RNA-seq was used to sequence 24 cDNA libraries, and 42,566 uni-transcripts were assembled from more than 260 million filtered reads. The total length of the uni-transcript sequences was 47.99 Mb, and 7,324 SSRs (SSRs ≥15 bp) and 4,440 SSRs (SSRs ≥18 bp) were identified. On average, there was one genic-SSR per 6.55 kb (SSRs ≥15 bp) or 10.81 kb (SSRs ≥18 bp). Among perfect SSRs (≥18 bp), di-nucleotide motifs (48.01%) were the most abundant, followed by hexa- (25.37%), tri- (20.96%), penta- (2.97%), tetra- (2.12%), and mono-nucleotides (0.57%). The top four motif repeats were (AG/CT)n [1,268 (34.51%)], (CA/TG)n [281 (7.65%)], (AT/AT)n [215 (5.85%)], and (GAA/TTC)n [131 (3.57%)]. A total of 2,164 SSR primer pairs were designed from the 4,440 SSR-containing sequences (≥18 bp), and 300 SSR primer pairs were randomly chosen for validation. These SSR markers were amplified and validated in 25 sesame accessions (24 cultivated accessions and one wild species). Of the 300 primer pairs, 276 (92.0%) yielded PCR amplification products in the 24 cultivars, and 32 of these (11.59%) exhibited polymorphisms. These markers expand current SSR marker resources and will greatly benefit studies of genetic diversity, qualitative and quantitative trait mapping, and marker-assisted selection in sesame.
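The densities and rates quoted in this abstract follow from simple division, which the short sketch below reproduces. All input numbers are taken from the abstract; the variable names are illustrative only, and 1 Mb is assumed to equal 1,000 kb.

```python
# Figures quoted in the abstract
total_length_kb = 47.99 * 1000      # 47.99 Mb of uni-transcript sequence
ssr_ge15 = 7324                     # SSRs of at least 15 bp
ssr_ge18 = 4440                     # SSRs of at least 18 bp
primers_tested = 300
primers_amplified = 276
primers_polymorphic = 32

# One SSR per N kb = total sequence length / number of SSRs
kb_per_ssr_ge15 = round(total_length_kb / ssr_ge15, 2)
kb_per_ssr_ge18 = round(total_length_kb / ssr_ge18, 2)

# Amplification rate is relative to all tested primer pairs;
# polymorphism rate is relative to the pairs that amplified.
amplification_pct = round(100 * primers_amplified / primers_tested, 1)
polymorphism_pct = round(100 * primers_polymorphic / primers_amplified, 2)

print(kb_per_ssr_ge15, kb_per_ssr_ge18, amplification_pct, polymorphism_pct)
```

Note that the 11.59% polymorphism rate only matches when the denominator is the 276 amplifying primer pairs rather than the 300 tested.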
3. Development of SNP and InDel markers via de novo transcriptome assembly
To discover single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) from RNA-Seq data, we collected a total of 33.47 Gbp of data from three sesame transcriptome datasets. A reference transcriptome covering 267,508 unigenes was constructed. Among the 37,646 transcripts with complete open reading frames, a total of 7,450 SNPs and 362 InDels were found, with frequencies of one SNP per 6.66 kb and one InDel per 137 kb, respectively. Most of the SNPs were transitions (C–T or A–G). A total of 21 InDel types with lengths ranging from 1 to 38 bp were identified, and short InDels (1–2 bp) were the most abundant, accounting for over 80 %. Furthermore, 4,959 (66.56 %) SNPs were detected in protein-coding regions: 2,899 (58.46 %) were synonymous and 2,060 (41.54 %) were nonsynonymous. All SNPs and InDels detected in this study were bi-allelic. Of 40 randomly selected SNPs and 40 randomly selected InDels, 92.5 % of the SNPs and 95.0 % of the InDels exhibited polymorphism according to PCR-based and Sanger sequencing results. Furthermore, the efficiencies of the newly developed polymorphic SNP and InDel markers were evaluated among 36 commercial sesame cultivars. More than 90.0 % of the markers displayed the expected polymorphic amplifications. The polymorphism information content values ranged from 0.05 to 0.58, with an average of 0.38. Moreover, all genotypes of the 36 commercial cultivars tested were definitively distinguished by 21 SNPs and 16 InDels. These newly identified molecular markers may provide a foundation for cultivar identification, genetic diversity analysis, qualitative and quantitative trait mapping and marker-assisted selection breeding in sesame.
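For readers less familiar with the terminology, the sketch below shows how a single-base substitution is classed as a transition (purine to purine, A–G, or pyrimidine to pyrimidine, C–T) or a transversion (any other change), and reproduces the coding-region percentages from the counts in the abstract. The function name `classify_snp` is hypothetical and this is not the pipeline used in the study.

```python
BASES = set("ACGT")
# Transitions exchange two purines (A, G) or two pyrimidines (C, T)
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def classify_snp(ref, alt):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    if frozenset((ref, alt)) in TRANSITIONS:
        return "transition"
    return "transversion"


# Counts quoted in the abstract
total_snps = 7450
coding_snps = 4959
synonymous = 2899
nonsynonymous = 2060

coding_pct = round(100 * coding_snps / total_snps, 2)
synonymous_pct = round(100 * synonymous / coding_snps, 2)
nonsynonymous_pct = round(100 * nonsynonymous / coding_snps, 2)

print(classify_snp("C", "T"), classify_snp("A", "C"))
print(coding_pct, synonymous_pct, nonsynonymous_pct)
```

The synonymous and nonsynonymous percentages are fractions of the 4,959 coding-region SNPs, not of all 7,450 SNPs.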