* The following is derived from the White Paper

Introduction

The legume crop soybean (Glycine max [L.] Merr.) is the leading oilseed crop produced and consumed in the world today and accounts for 29% of the world’s agricultural output.

The domestication of G. max from its wild progenitor (Glycine soja Sieb. and Zucc.) occurred in China approximately 5,000 years ago (Carter et al., 2004), and the crop spread to other parts of Asia around 2,000 years ago (Kihara, 1969). It was likely introduced into the Americas during the 18th century.

Local adaptation throughout this global distribution resulted in a wide range of unique landraces. More than 170,000 soybean accessions are maintained by more than 160 institutions in nearly 70 countries (International Plant Genetic Resources Institute, 2001). However, only 80 accessions account for 99% of the collective parentage of North American soybean cultivars; of these, seventeen elite parents account for 86% of the collective parentage (Li and Nelson, 2001; Carter et al., 2004). Researchers have presumed that these genetic bottlenecks have reduced the genetic diversity of modern soybean.

The Glycine max genome is 1.1 Gb in size, arranged in 20 chromosomes. The genome contains more than 46,000 protein-coding loci (Schmutz et al., 2010). Many of these gene loci have yet to be genetically and physically mapped. Molecular markers can facilitate exploring the soy genome, mapping all gene loci, performing functional genetic research, and translating the information into molecular breeding.

Molecular markers have evolved over the past 80 years through sampling and comparing genomes. Over time, technology for interrogating genetic variation has progressed, and many DNA molecular marker systems were developed. The resolution of the genomic picture that the markers can depict has improved accordingly. Single Nucleotide Polymorphisms (SNPs) have emerged as the ultimate molecular marker. SNPs are single nucleotide changes that are heritable, codominant, and distributed with relatively high frequency throughout eukaryotic genomes. A SNP can be the causative mutation that directly affects a phenotype, or it can be associated with a causative mutation.

For breeders, molecular markers permit accurate and early selection of individuals of interest. They reduce the number of selection cycles required, shorten the time to market for new lines, and lower the overall cost of breeding.

The desire to create high-density marker chip arrays that can interrogate a large number of SNPs per DNA sample has dominated the evolution of SNP genotyping, and research has led to many high-resolution SNP arrays. Several high-density soy SNP arrays were developed and commercialized: the SoySNP50K assay (Song et al., 2013), BARCSoySNP6K, and BARCSoySNP3K (Song et al., 2020). Researchers have made less of an effort to develop informative, high-throughput, and cost-effective mid-density genotyping solutions for applied molecular breeding programs and seed production quality assurance (QA). The advent of Next Generation Sequencing technology and genotyping by targeted sequencing provides an attractive method for mid-density SNP genotyping.

The Soy 1k SNP Panel

The Community soy 1K SNP panel at AgriPlex Genomics is made up of 1290 SNPs and consists of two parts: Genomic Screen and Trait Markers.

The Genomic Screen consists of 1213 markers. Originally, the SNPs were part of the SoySNP50K array (Song et al., 2013). Initial reduction of that array resulted in the BARCSoySNP6K (Song et al., 2020), which was reduced further to the BARCSoySNP3K. The present selection from the 3K SNP array was made based on the following criteria:

  • Reducing the number of SNPs that fall within the same large linkage blocks in the North American elite population.

  • Even spacing between SNPs in the segments of the genome outside of the major haplotype blocks.

  • Location of SNPs in euchromatic vs. heterochromatic regions of the genome.

  • Minor allele frequency (MAF). The average MAF of the SNPs among 562 elites was 0.36, and only SNPs with MAF > 0.10 were retained. The average MAF in the Southern and Northern elites was 0.29 and 0.33, respectively.
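The MAF criterion can be illustrated with a minimal Python sketch. It assumes biallelic SNPs and diploid genotype calls written as two-letter strings (for example "AG"); the function names, the input layout, and the example marker names are hypothetical and are not taken from the panel's actual selection pipeline. Only the 0.10 threshold comes from the text above.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the MAF of one biallelic SNP.

    `genotypes` is a list of diploid calls such as "AA", "AG", "GG".
    Missing calls (None or any call containing "N") are skipped.
    """
    counts = Counter()
    for call in genotypes:
        if call is None or "N" in call:
            continue
        counts.update(call)  # count each of the two alleles in the call
    if len(counts) > 2:
        raise ValueError("more than two alleles observed; SNP is not biallelic")
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # no data, or monomorphic in this sample
    return min(counts.values()) / total


def filter_by_maf(snp_calls, threshold=0.10):
    """Keep SNPs whose MAF is strictly greater than `threshold`."""
    kept = {}
    for snp, calls in snp_calls.items():
        maf = minor_allele_frequency(calls)
        if maf > threshold:
            kept[snp] = maf
    return kept


# Made-up example data: two SNPs scored on a handful of lines.
example_calls = {
    "snp_a": ["AA", "AG", "GG", "AA"],    # 3 G alleles out of 8
    "snp_b": ["CC"] * 19 + ["CT"],        # 1 T allele out of 40
}
print(filter_by_maf(example_calls))
```

In this made-up example, snp_a has a MAF of 0.375 and passes the filter, while snp_b has a MAF of 0.025 and is dropped.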

The number of SNPs per chromosome ranges from 32 to 92, with an average of 61. The average distance between adjacent SNPs is 788 KB (Table 1).


Table 1: Genomic Screen: SNP counts and average spacing (KB) along the soy chromosomes
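Per-chromosome summaries of this kind (SNP count and mean distance between adjacent SNPs) can be computed from a list of marker positions. The sketch below is only an illustration: the function name, the (chromosome, position) input layout, the chromosome labels, and the example positions are all assumptions, not the panel's actual coordinates.

```python
from collections import defaultdict


def spacing_summary(snps):
    """Summarise marker density per chromosome.

    `snps` is an iterable of (chromosome, position_in_bp) pairs.
    Returns {chromosome: (snp_count, mean_gap_kb)}; mean_gap_kb is None
    when a chromosome carries fewer than two SNPs.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in snps:
        by_chrom[chrom].append(pos)

    summary = {}
    for chrom, positions in by_chrom.items():
        positions.sort()
        # Distances between each pair of neighbouring SNPs, in base pairs.
        gaps = [b - a for a, b in zip(positions, positions[1:])]
        mean_gap_kb = sum(gaps) / len(gaps) / 1000 if gaps else None
        summary[chrom] = (len(positions), mean_gap_kb)
    return summary


# Made-up positions on two hypothetically labelled chromosomes.
example_snps = [
    ("Gm01", 100_000),
    ("Gm01", 1_700_000),
    ("Gm01", 900_000),
    ("Gm02", 50_000),
]
print(spacing_summary(example_snps))
```

With these made-up positions, the first chromosome has three SNPs with a mean gap of 800 KB, and the second has a single SNP and therefore no gap to report.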

The Trait Markers portion of the panel is made up of 88 markers that are either located within genes controlling traits of interest or associated with them. Currently, the panel includes markers for diverse phenotypic characteristics (Table 2). These characteristics include:

  • Growth habit and morphological features

  • Biochemical qualities such as seed composition

  • Environmental tolerances

  • Disease resistance and life cycle attributes

The soy research and breeding community has contributed to this collection of trait markers. The effort to build this collection should be ongoing, enriching the panel with additional trait markers as new discoveries are made.


Table 2: Selected traits, genes, and associated number of markers in the panel

The 1k Soy Community SNP Panel serves as a useful tool for conducting genomic selection, genomic prediction, and germplasm identification.