Quantification of DNA sequence tags from engineered constructs such as plasmids, transposons, or other transgenes underlies many functional genomics measurements. Typically, such measurements rely on PCR followed by next-generation sequencing. However, PCR amplification can introduce significant quantitative error. University of Minnesota researchers describe REcount, a novel PCR-free direct counting method. Comparing measurements of defined plasmid pools to droplet digital PCR data demonstrates that REcount is highly accurate and reproducible. They use REcount to provide new insights into clustering biases due to molecule length across different Illumina sequencers and illustrate the impacts on interpretation of next-generation sequencing data and the economics of data generation.
Illumina size standards allow measurement of sequencer-specific size biases
a Design of REcount-based Illumina size standard constructs. Each standard construct contains a normalization barcode, as well as a barcode associated with a variable size standard that can be liberated by MlyI digestion and directly sequenced. b Raw abundance data for all 30 size standards and normalization barcodes from a MiSeq run. c Run-to-run variability of multiple MiSeq runs (n = 6 flow cells). d Size bias profiles of the iSeq (n = 1 flow cell), MiSeq (n = 6 flow cells), NextSeq (n = 4 flow cells), and NovaSeq (n = 4 flow cells, 4 lanes) sequencers. Note: Size bias data for other Illumina instruments is shown in Additional file 1: Figure S5. e Size bias profiles of the same library either clustered on the MiSeq immediately after denaturation or clustered after freezing and thawing the denatured library. Error bars are ± s.e.m.
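As a rough illustration of how a size bias profile like those in panels c and d might be computed, the Python sketch below divides each size standard's barcode count by the count of the normalization barcode from the same construct, then summarizes the ratios across flow cells as mean ± s.e.m. The function names, the fragment sizes, and all counts are hypothetical, and the exact normalization used by the authors is not stated in this caption, so treat this as an assumption rather than the published procedure.

```python
from math import sqrt
from statistics import mean, stdev


def size_bias(standard_counts, normalization_counts):
    """Per-construct ratio of size-standard reads to normalization-barcode reads.

    Both arguments map fragment size (bp) to a read count. Because both
    barcodes come from the same construct, a ratio away from 1 is read here
    as a size-dependent bias (an assumption of this sketch).
    """
    return {
        size: standard_counts[size] / normalization_counts[size]
        for size in standard_counts
    }


def mean_and_sem(values):
    """Return (mean, s.e.m.); s.e.m. is None when there is only one value."""
    m = mean(values)
    if len(values) < 2:
        return m, None
    return m, stdev(values) / sqrt(len(values))


# Hypothetical counts from two flow cells: (size-standard, normalization).
runs = [
    ({150: 1200, 500: 1000, 1000: 400}, {150: 1000, 500: 1000, 1000: 1000}),
    ({150: 1100, 500: 900, 1000: 500}, {150: 1000, 500: 1000, 1000: 1000}),
]

profiles = [size_bias(std, norm) for std, norm in runs]
summary = {
    size: mean_and_sem([profile[size] for profile in profiles])
    for size in profiles[0]
}

for size, (m, sem) in sorted(summary.items()):
    print(size, round(m, 3), None if sem is None else round(sem, 3))
```

With a single flow cell, as for the iSeq in panel d, the sketch reports no s.e.m. because the sample standard deviation is undefined for one value.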