*The following is derived from the White Paper

Introduction

Maize (Zea mays L.) is one of the key multi-purpose crops worldwide and is used for food, feed, fuel, and a variety of industrial products. In the United States, most of the grain harvested is not used directly as food. The USDA estimated that ∼3.5% of the grain harvested was used for cereals and other food products (USDA–National Agricultural Statistics Service, 2019). This figure reflects a threefold increase in demand for and consumption of maize-based food products since the 1970s, primarily because of the popularity of both Hispanic and gluten-free foodstuffs. In Africa and Latin America, maize is a staple crop for food security and nutrition for both humans and animals. To meet increasing market demands in the United States and globally, the quality of the seed along the supply chain should be guaranteed; assessment of the genetic purity of parental inbred lines and the resulting hybrids is an essential measure in maize hybrid breeding and, consequently, in producing high-quality grain. Quality seed of hybrid maize is marketed when genetically pure breeder seed, basic seed, and certified seed are produced (Poets et al. 2020).

Quality Assurance (QA) is the process used to measure and assure the quality of a product, and Quality Control (QC) is the process of ensuring that products meet consumer expectations. Seed QA and QC in corn were traditionally conducted using either Grow Out Tests (GOT), a morphology-based approach that relies on a set of morphometric descriptors, or biochemical marker analyses based on the protein profiles (isozymes) of different genotypes. In recent years, seed QA/QC has shifted to the use of molecular markers.


The molecular marker approach detects variation between genotypes directly at the DNA level. Unlike GOT and biochemical marker methods, which show low polymorphism and are strongly influenced by the environment, molecular markers are polymorphic, independent of the environment, reproducible, and present at all developmental stages; they probe known positions in the genome and may be linked to traits of interest. Single Nucleotide Polymorphisms (SNPs), which are substitutions of a single nucleotide at a specific position in the genome, are currently considered to be the optimal marker type for most genotyping applications.


Large numbers of SNPs are present in all eukaryotic genomes. They are dispersed throughout the genome and are inherited as codominant Mendelian traits. Advances in Next Generation DNA Sequencing (NGS) methodologies have further increased the usefulness of SNPs as molecular markers, because SNPs lend themselves to a high level of multiplexing when they are probed using NGS. These features translate into relatively low cost and high speed of detection (Josia et al. 2021), two critical qualities for QA/QC applications.

The Panel

The maize QA/QC SNP panel was developed at the International Maize and Wheat Improvement Center (CIMMYT). For a complete description of the development of the panel, see Chen et al. (2016). The 131-SNP panel presented here is an extension of the one described in that publication.


The objective of a QA/QC SNP panel is to enable the generation of a unique SNP profile for every corn line, making it possible to tell any two lines apart and to detect off-types within a line. Prospective marker sets were evaluated against 561 CIMMYT Maize Inbred Lines (CMLs). Different marker groupings were tested for their usefulness in differentiating between the CMLs and for their ability to evaluate homogeneity within the elite lines.
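The idea of telling any two lines apart by their SNP profiles can be illustrated with a minimal sketch. It is not the evaluation procedure used at CIMMYT. The line names, the genotype encoding (two-letter calls, with None for a missing call), and the helper names count_differences and indistinguishable_pairs are all assumptions made for illustration.

```python
from itertools import combinations


def count_differences(profile_a, profile_b):
    """Count markers at which two SNP profiles carry different calls.

    Markers with a missing call (None) in either profile are skipped.
    """
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


def indistinguishable_pairs(profiles, min_differences=1):
    """Return pairs of line names whose profiles differ at fewer than
    `min_differences` markers, i.e. pairs this marker set cannot separate."""
    return [
        (name_a, name_b)
        for (name_a, prof_a), (name_b, prof_b) in combinations(
            sorted(profiles.items()), 2
        )
        if count_differences(prof_a, prof_b) < min_differences
    ]


# Toy data: hypothetical line names and genotype calls at four markers.
profiles = {
    "line_1": ["AA", "GG", "CC", "TT"],
    "line_2": ["AA", "GG", "CC", "TT"],  # same calls as line_1
    "line_3": ["AA", "CC", "CC", None],  # differs from line_1 at marker 2
}

unresolved = indistinguishable_pairs(profiles)
print(unresolved)
```

A marker set that resolves the whole collection would return an empty list. Raising min_differences asks for a safety margin of several discriminating markers per pair instead of just one.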


For the selection of the most informative marker sets, the following marker performance parameters were used:

 

  • Data completeness: Markers with more than 20% missing data were filtered out.

  • Minor Allele Frequency (MAF): Higher MAF improves the efficiency of distinguishing CMLs from one another. MAF was kept at 0.25 or higher.

  • Marker distribution: Selecting markers uniformly across chromosomes separated the CMLs better than random selection. Thus, all chromosome arms are probed, with the number of SNPs per chromosome ranging from 7 to 24 and an average interval between two adjacent markers of about 17.5 million base pairs (Table 1).


Table 1: Number of SNP markers on each chromosome and their spacing given as the average distance between two adjacent markers (BP, base pairs)
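The data-completeness and MAF criteria listed above can be sketched as a simple filter. This is a hedged illustration and not the pipeline used to build the panel. It assumes biallelic SNPs, genotype calls encoded as two-letter strings (one letter per allele) with None for missing data, and hypothetical marker names. Only the two thresholds come from the text.

```python
from collections import Counter

MAX_MISSING = 0.20  # markers with more than 20% missing data are filtered out
MIN_MAF = 0.25      # minor allele frequency is kept at 0.25 or higher


def missing_rate(calls):
    """Fraction of lines with no genotype call at this marker."""
    return sum(call is None for call in calls) / len(calls)


def minor_allele_frequency(calls):
    """Frequency of the rarer allele among all called alleles.

    Returns 0.0 for a monomorphic or completely uncalled marker.
    """
    allele_counts = Counter(
        allele for call in calls if call is not None for allele in call
    )
    total = sum(allele_counts.values())
    if total == 0 or len(allele_counts) < 2:
        return 0.0
    return min(allele_counts.values()) / total


def passes_filters(calls):
    """Apply both the completeness and the MAF criteria to one marker."""
    return (
        missing_rate(calls) <= MAX_MISSING
        and minor_allele_frequency(calls) >= MIN_MAF
    )


# Toy data: calls for five hypothetical lines at four hypothetical markers.
markers = {
    "snp_a": ["AA", "AA", "GG", "GG", "AA"],  # complete, well balanced
    "snp_b": ["CC", "CC", "CC", "CC", "TT"],  # minor allele too rare
    "snp_c": ["AA", None, None, "GG", "GG"],  # too much missing data
    "snp_d": ["AA", "GG", "AG", None, "GG"],  # exactly 20% missing
}

kept = [name for name, calls in markers.items() if passes_filters(calls)]
print(kept)
```

The third criterion, even distribution across chromosome arms, needs physical marker positions and is not covered by this sketch.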


Within a finite and well-defined collection of lines, a limited number of SNPs may suffice to establish unique and diagnostic allele profiles that distinguish each line from all others. The number of markers needed to generate such distinctive DNA fingerprints varies with the level of relatedness of the lines: the more closely related the lines are by common descent, the more SNPs are needed for the required resolution. The number of markers needed also varies with the stage of the breeding program or of production (for example, identifying inbreds vs. hybrids). When the germplasm that the SNP panel must resolve is not defined in advance, the total number of markers required is larger, and an even distribution of the markers throughout the genome becomes more important. PlexSeq (see following) imposes no real cost or turnaround-time penalty for increasing the number of SNPs within the range used for QA/QC, so its operational efficiency and cost-effectiveness make it possible to offer a generic, global solution for QA/QC needs by increasing the total number of SNPs. Employing 131 SNPs should enable the identification of any germplasm entry and the measurement of heterogeneity within it.
 

The Number of Individuals in a QC Sample

The number of individuals sampled for routine QC genotyping is crucial, as it affects the cost, time, and accuracy of detecting off-types within the entity. Chen et al. (2016) evaluated the influence of sample size on the probability of detecting off-types, and their findings are summarized in Fig. 1. The graph captures the capacity to detect contamination in populations with defined percentages of off-types at different sample sizes. Considering between 0 and 3 off-types detected in the sample, detecting one off-type individual in a sample of 192 individuals corresponds to 2% contamination in the source lot/line. If two individuals are detected as off-types, the lot has an upper limit of 2.5% contamination with 95% confidence. Therefore, if the target genetic purity of an inbred parental line is upward of 98%, the sample should contain at least 192 individuals.


Figure 1: The capacity to detect contamination in populations with defined percentages of off-types at different sample sizes.
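The relationship between sample size and the chance of catching contamination can be illustrated with a simple binomial model. This is a sketch under stated assumptions and not necessarily the calculation behind Fig. 1 or the confidence limits reported by Chen et al. (2016). It assumes individuals are sampled independently from a large lot with a fixed off-type fraction and that every off-type sampled is recognized as such. The function name detection_probability is hypothetical.

```python
def detection_probability(contamination, sample_size):
    """Probability that a sample contains at least one off-type.

    Assumes each sampled individual is independently an off-type with
    probability `contamination` (a fraction between 0 and 1).
    """
    return 1.0 - (1.0 - contamination) ** sample_size


# Compare a few sample sizes and contamination levels (illustrative values).
for contamination in (0.01, 0.02, 0.05):
    for sample_size in (48, 96, 192, 384):
        p = detection_probability(contamination, sample_size)
        print(
            f"off-types {contamination:.0%}, n={sample_size}: "
            f"P(detect at least one) = {p:.3f}"
        )
```

Under this model the detection probability rises quickly with sample size and approaches 1 for heavily contaminated lots, which matches the qualitative behavior the figure caption describes.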


Conclusions

Assessment of genetic purity of parental inbred lines and resulting hybrids during breeding operations and along the supply chain is an essential measure to assure high-quality grain.


The best methodology for seed QA/QC is the use of SNPs as molecular markers. Assessing genetic purity using the PlexSeq methodology in conjunction with the maize QC SNP panel provides a generic tool fit for quality testing across the genetic breadth of maize.


The flexibility of the platform also enables continual revision and upgrading of the marker system, ensuring that it keeps pace with the evolution of the germplasm.


The Maize QC SNP panel is available from AgriPlex Genomics both as a service and as a kit for use by in-house genotyping laboratories.