Soybean (Glycine max) was domesticated from its wild relative Glycine soja. However, the genetic variations underlying soybean domestication are not well known. Comparative transcriptomics revealed that a small portion of the orthologous genes might have been fast evolving. In contrast, three gene expression clusters were identified as divergent by their expression patterns. Moreover, the most divergent stage in gene expression between wild and cultivated soybeans occurred during seed development around the cotyledon stage (15 d after fertilization, G15). A module in which the co-expressed genes were significantly down-regulated at G15 of wild soybeans was identified.
The divergent clusters and modules included a substantial number of differentially expressed genes (DEGs) between wild and cultivated soybeans related to cell division, storage compound accumulation, hormone response, and seed maturation processes. Chromosomal-linked DEGs, quantitative trait loci (QTLs) controlling seed weight and oil content, and selection sweeps revealed candidate DEGs at G15 in the fruit-related divergence of G. max and G. soja.
Our work establishes a transcriptomic selection mechanism for altering gene expression during soybean domestication, thus shedding light on the molecular networks underlying soybean seed development and breeding strategy.
Genetic tools and resources have been developed for soybean [Glycine max (L.) Merr.], but the progress of forward and reverse genetics has generally been slow. The reasons for this limited progress are mainly the genomic complexity and the difficulty of genetic transformation. A few genes have been found to be involved in soybean domestication and improvement through various strategies, such as mapping of quantitative trait loci (QTLs).
However, advances in sequencing technology have provided a genomic platform (Metzker) to predict the domesticated genes in various crops, such as rice (Huang et al.). This is having a significant impact on molecular breeding programs (Poland and Rife). The soybean genome has 20 pairs of chromosomes. The first draft of the soybean genome (cultivar Williams 82) has been released, together with a prediction of the genome size (Findley et al.).
Using the sequence draft as a reference, re-sequencing analyses of soybean populations and comparative genomics have greatly enhanced our ability to identify the genetic characteristics that distinguish cultivated and wild soybeans, which helps us to understand the genetic basis of cultivar phenotypic specializations M.
The challenge now is to understand the functional consequences of the small fraction of these genetic changes and how they are involved in making cultivated soybeans different from wild soybeans. Much effort has focused on identifying genes that underwent large sequence changes or accelerated rates of nucleotide change in cultivated soybean lines (Chung et al.). This has led to the discovery of some genomic regions affected by artificial selection.
These genomic regions are associated with agricultural traits, for example plant height, pubescence form, twinning trait, maximum internode length, number of nodes, and seed-oil content (Lam et al.). The genomic regions identified in previous studies are valuable for marker-assisted breeding (Lam et al.). These results indicate that changes affecting protein sequences are a source of genetic variation in soybean domestication. Transcriptomics has made it possible to explore another dimension of domestication, namely changes in patterns of gene expression.
Evidence for changes at the transcriptional level during domestication has also been examined in various crops. For example, a study in maize has suggested the widespread alteration of transcriptional networks during domestication (Swanson-Wagner et al.). The same has been found in tomato domestication (Koenig et al.). A transcriptome analysis was performed during early and middle seed maturation stages between wild and cultivated soybean varieties (Lu et al.).
Another transcriptomic comparison of early maturation stages of developing seeds was performed between a cultivated soybean and a landrace with contrasting seed size phenotypes (Du et al.). A cytochrome P450 family gene, GmCYP78A5, was found to be differentially expressed in the two soybeans, and transgenic soybean lines overexpressing this gene exhibit enlarged seed size and increased seed weight (Du et al.).
These transcriptome analyses provided the first sets of expression data on genes controlling the mid to mature stage of seed development, but they are not sufficient to enable a comprehensive understanding of the transcriptomic variation underlying seed developmental evolution, since domestic soybeans are believed to have multiple origins (Xu et al.). To gain further insight into the developmental evolution of soybean fruit and seed, in the present study, we chose cultivated and wild soybeans from a major soybean-producing center, northeast China, which is also a predicted soybean domestication center (Fukuda et al.).
S1 at JXB online. Analyses of the obtained 40 RNA sequencing (RNA-seq) data sets suggested that gene expression alteration was extensive between wild and cultivated soybeans, and analyses of gene expression clusters and DEGs suggested that most divergence in gene expression between wild and cultivated soybeans occurs around the cotyledon stage at the late pod development stage (G15, defined as pods 15 d after fertilization).
Moreover, weighted gene network analysis identified one molecular module negatively associated with fruit and seed development at G15 in wild soybean, and the genes in this module, which are related to cell division, were expressed at low levels in wild soybean fruit relative to cultivated soybeans.
Genome-wide linkage of DEGs, QTLs (seed weight and oil), and selection sweeps on chromosomes suggested that DEGs at G15 between cultivated and wild soybeans were from domestication sweep regions, and that distinct sets of DEGs were associated with the identified QTLs controlling seed weight and oil content, respectively. Our work thus establishes that gene expression changes might have been preferentially targeted for fruit-related trait formation during soybean domestication.
Our results provide new insights into soybean domestication and offer candidate genes to be considered during soybean improvement.
Seeds of wild soybean G. Four accessions from each soybean species (wild and cultivated), as four repeats for each species, were chosen based on their rich diversity in flower and seed coat color, seed size, and seed protein and oil content (Supplementary Dataset S1).
They were grown in an experimental field under natural conditions in May (Institute of Botany, Beijing, China). Clean reads were obtained through three steps. Trimmed RNA-seq reads were mapped to the reference genome using Bowtie v2.
Genes with an FPKM (fragments per kilobase of transcript per million mapped fragments) of zero were treated as non-detected genes. Clean reads were assembled using the de novo assembly software Trinity v2. First, clean reads with a certain length of overlap were combined to generate contigs. Then, the paired-end reads were realigned to the contigs to obtain unigenes, which made it possible to identify different contigs belonging to the same transcript and to determine the distance between these contigs.
Trinity then connected the contigs of one transcript into sequences that could not be extended at either end, and these sequences were defined as unigenes (Wang et al.).
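The FPKM filter described above can be illustrated with a minimal sketch. It assumes the standard FPKM definition (fragments × 10^9 divided by the product of gene length in bp and total mapped fragments); the function names and the toy counts are hypothetical and are not taken from the study's pipeline.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Standard FPKM: fragments per kilobase of transcript per million mapped fragments."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def split_detected(counts, lengths, total_mapped_fragments):
    """Return (detected, undetected): genes with FPKM > 0 and genes with FPKM == 0."""
    detected, undetected = {}, []
    for gene, n_fragments in counts.items():
        value = fpkm(n_fragments, lengths[gene], total_mapped_fragments)
        if value == 0:
            undetected.append(gene)  # zero FPKM: treated as not detected
        else:
            detected[gene] = value
    return detected, undetected


# Hypothetical toy data: two genes in a library of 10 million mapped fragments.
toy_counts = {"geneA": 500, "geneB": 0}
toy_lengths = {"geneA": 2000, "geneB": 1500}
detected, undetected = split_detected(toy_counts, toy_lengths, 10_000_000)
```

In this toy library only geneA passes the filter; geneB has no mapped fragments and is set aside.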
The best alignments were used to decide sequence direction and to predict coding regions of the unigenes. In addition, KEGG was used to annotate the pathways of the unigenes. Protein-coding genes from four wild and four cultivated soybeans were used for gene family identification. Once a unique gene was clearly mapped in Williams 82, it was defined as a conserved single-copy gene family in soybean.
Four-fold degenerate sites from the coding sequence (CDS) alignments were extracted and concatenated for phylogenetic analysis, and the neighbor-joining (NJ) method was incorporated in PhyML v3. Genes were considered to be fast-evolving genes (FEGs) if they had a higher value in the foreground branch than in the background branches.
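As an illustration of the site extraction step, the sketch below pulls four-fold degenerate third positions out of a codon-aligned CDS. It assumes the standard genetic code, in which eight two-base codon prefixes leave the amino acid unchanged whatever the third base is, and it keeps a codon column only when all sequences share the same four-fold prefix. That filtering rule, the function name, and the toy alignment are assumptions made for illustration, not the authors' exact procedure.

```python
# Codon prefixes whose third position is four-fold degenerate in the standard
# genetic code: Leu (CT), Val (GT), Ser (TC), Pro (CC), Thr (AC), Ala (GC),
# Arg (CG), Gly (GG).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_sites(aligned_cds):
    """Concatenate four-fold degenerate third positions from a codon alignment.

    aligned_cds maps a sequence name to an in-frame, equal-length CDS string.
    """
    names = list(aligned_cds)
    length = len(aligned_cds[names[0]])
    if length % 3 or any(len(seq) != length for seq in aligned_cds.values()):
        raise ValueError("sequences must be in frame and of equal length")
    kept = {name: [] for name in names}
    for start in range(0, length, 3):
        codons = [aligned_cds[name][start:start + 3].upper() for name in names]
        prefixes = {codon[:2] for codon in codons}
        third_ok = all(codon[2] in "ACGT" for codon in codons)
        # Keep the column only if every sequence has the same four-fold prefix.
        if len(prefixes) == 1 and prefixes <= FOURFOLD_PREFIXES and third_ok:
            for name, codon in zip(names, codons):
                kept[name].append(codon[2])
    return {name: "".join(bases) for name, bases in kept.items()}


# Hypothetical toy alignment: Ala (GCN), Met (ATG, not degenerate), Pro (CCN).
toy_alignment = {"wild": "GCTATGCCA", "cultivated": "GCCATGCCG"}
sites = fourfold_sites(toy_alignment)
```

The concatenated strings could then be passed to any tree-building program as a neutral-site alignment.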
K-means clustering was used to visualize genes exhibiting a similar expression pattern, and it was performed on log2-transformed FPKM values using MeV v4.
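To make the clustering step concrete, here is a minimal sketch of k-means (Lloyd's algorithm) on log2-transformed FPKM profiles. It is a plain-Python illustration rather than the MeV implementation; the pseudocount of 1 added before the log transform, the function names, and the toy profiles are assumptions.

```python
import math
import random


def log2_fpkm(profile, pseudocount=1.0):
    """Log2-transform one expression profile (pseudocount is an assumption)."""
    return [math.log2(value + pseudocount) for value in profile]


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm: return one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical FPKM profiles across three tissues for four genes.
toy_fpkm = [
    [1.0, 1.0, 1.0],        # flat, low expression
    [1.1, 1.0, 0.9],        # flat, low expression
    [100.0, 200.0, 400.0],  # rising, high expression
    [110.0, 190.0, 420.0],  # rising, high expression
]
labels = kmeans([log2_fpkm(p) for p in toy_fpkm], k=2)
```

Genes that end up with the same label share an expression pattern across the tissue series, which is how clusters such as C1 to C12 are read.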
The modules were obtained using the automatic network construction function blockwiseModules with default settings, except for minor modifications: the power was 14, TOM-Type was adjacency, minModuleSize was 50, and mergeCutHeight was 0. The eigengene value was calculated for each module and used to test the association with each tissue type. The total connectivity, intramodular connectivity, and kME P-value were calculated.
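The eigengene calculation can be sketched as follows. A module eigengene is commonly defined as the first principal component of the module's standardized expression matrix; the pure-Python version below finds it by power iteration on the sample-by-sample Gram matrix and orients it along the average expression profile. It is a minimal sketch under those assumptions, not the WGCNA code, and the function names and toy matrix are hypothetical.

```python
import math
import random


def standardize(row):
    """Z-score one gene's expression values across samples."""
    mean = sum(row) / len(row)
    sd = math.sqrt(sum((v - mean) ** 2 for v in row) / (len(row) - 1))
    if sd == 0:
        raise ValueError("constant expression profile cannot be standardized")
    return [(v - mean) / sd for v in row]


def module_eigengene(expr, n_iter=200, seed=0):
    """First principal component (one value per sample) of a genes x samples matrix."""
    z = [standardize(row) for row in expr]
    n = len(z[0])
    # Sample-by-sample Gram matrix of the standardized data.
    gram = [[sum(g[i] * g[j] for g in z) for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(n_iter):
        vec = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in vec))
        if norm == 0:
            raise ValueError("power iteration collapsed to the zero vector")
        vec = [v / norm for v in vec]
    # Orient the eigengene so it correlates positively with average expression.
    average = [sum(g[j] for g in z) / len(z) for j in range(n)]
    if sum(a * b for a, b in zip(vec, average)) < 0:
        vec = [-v for v in vec]
    return vec


# Hypothetical module: three genes rising across four samples.
toy_module = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [10.0, 20.0, 30.0, 40.0],
]
eigengene = module_eigengene(toy_module)
```

The resulting per-sample values can then be correlated with a tissue indicator (for example, 1 for G15 samples and 0 otherwise) to test the association of a module with a tissue type.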
Gene distributions on chromosomes were visualized using MapChart2 (Voorrips).
To observe general variation patterns at the transcriptomic level between cultivated and wild soybeans, four accessions of each species that showed substantial phenotypic variation were chosen from the northeast region of China as representatives and were also used as four independent repeats.
They displayed a rich diversity in flower color, seed coat color, seed size, oil content, protein content, and isoflavone content (Supplementary Dataset S1), suggesting full representation of wild and cultivated soybeans in this geographic region. For comprehensive evaluation, apical buds (JJ), flower buds (WH), flowers (H), and the developing pods at 5 d (heart stage) and 15 d (cotyledon stage) after fertilization (referred to as G5 and G15, respectively) were collected from each accession of the two soybean species and subjected to RNA-seq analysis.
Each sample was named with the abbreviated accession name plus the tissue name in certain cases, and 40 RNA-seq libraries were constructed and sequenced in total (Supplementary Dataset S2). After adaptor trimming, an average of 59 million clean reads per library was obtained (Supplementary Dataset S2).
Therefore, the genome of Williams 82 a2. The clean RNA-seq reads were mapped onto the genome reference, generating an average of Referenced by the Williams 82 genome, and genes were detected, respectively, in all investigated tissues of cultivated and wild soybeans, and were simultaneously expressed in all included tissues (Supplementary Dataset S3). Glycine soja had genes that were not expressed in at least one, while G.
Altogether, we detected soybean genes in the involved soybean tissues. We first evaluated the sequence diversity of soybeans using the obtained transcriptomic data. By alignment to the Williams 82 genomic sequence, SNP sites in coding and non-coding regions, including untranslated regions (UTRs) and introns, of each gene in each tissue pair were analyzed between wild and cultivated accessions (Supplementary Dataset S4).
More SNPs seemed to exist in each region of transcripts from wild soybeans than from cultivated soybeans, but the differences were not significant in most tissue comparisons (Supplementary Fig. S2A, B; Supplementary Dataset S4), suggesting an overall higher level of transcript diversity in wild soybeans than in cultivated ones. Phylogenetic analysis based on the transcriptomic data revealed that the selected soybean cultivars formed a monophyletic group, while obvious separation was observed in the wild accessions (Supplementary Fig.).
For rigorous analysis, we identified the strictly orthologous unigenes (single-copy genes) between the wild and cultivated soybeans. Altogether, orthologous unigene pairs were characterized (Supplementary Dataset S5).
Phylogenetic analysis using these unigenes revealed a quite similar topology (Fig. S2C), indicating a solid phylogeny for these soybeans.
Sequence diversity of wild and cultivated soybeans. (A) Unrooted NJ tree of wild and cultivated soybeans using the identified orthologous genes. The scale bar represents the expected number of substitutions per site.
Based on the above-constructed phylogenetic tree Fig.
Branch site analyses further revealed 8 and 14 FEGs, respectively, from the G. These results indicated that only a small portion of the orthologous genes might have been fast evolving. To inspect the overall gene expression pattern between wild and cultivated soybeans, we used all transcriptomic data to perform K-means clustering analysis.
This analysis allows measurement of the dynamic expression of genes during a time series comprising different tissues, including apical buds, floral buds, flowers, G5, and G15. The detected soybean genes were grouped into 12 co-expressed gene clusters designated as C1–C12, with the smallest gene number in C12 and the largest in C1 (Fig.). The genes in each cluster in principle show a similar expression pattern, while genes in different clusters feature distinct expression patterns.
However, considering the gene expression levels in each cluster, significant differences in certain tissues were detected between wild and cultivated soybeans. Moreover, the C4 cluster was unique in showing extremely low gene expression in soybeans, and the differences seemed to be more significant Fig.
Gene expression pattern between wild and cultivated soybeans. (A) Gene clusters identified in 40 samples. Twelve gene clusters (C1–C12) were identified using k-means clustering.
The blue curve represents the average expression of genes in each tissue of four accessions for both G. The inverted triangles indicate down-regulation, and the upright triangles indicate up-regulation.
To analyze further the functional significance of these clusters, over-represented GO terms of the 12 clusters were identified, and each cluster apparently had distinctly enriched GO terms Fig.
At least 32 significant terms were detected in each of the remaining 12 clusters (Supplementary Dataset S7). We particularly focused on the three divergent clusters C1, C2, and C4.