Low-pass whole-genome sequencing (WGS) has transformed genomic analysis by providing cost-effective insights into various biological processes. It is now a competitive alternative to genotyping arrays and offers cost savings and higher throughput compared to Sanger sequencing.1 However, low-pass WGS is often associated with a high error rate that can offset its advantages.2
In this blog, we will outline common challenges encountered during low-pass WGS, including index hopping and lack of library complexity. Additionally, we will share some of the innovations that increase success rates and tips to monitor and maintain consistency in your workflow.
Common Challenges with WGS
Challenges that may increase error rates during low-pass WGS include index hopping, incorrect read length, and lack of library complexity.
Index hopping is a rare phenomenon in multiplexed sequencing, typically occurring in less than 1% of total reads.3,4 Though rare, when it does occur, index hopping can disrupt the correct assignment of DNA sequence to sample index, thus compromising the accuracy of data.
Read length refers to the number of base pairs sequenced from a DNA fragment.5 It is distinct from sequencing depth, which is the average number of times each base in the genome is covered by reads. Sequencing read lengths correspond directly to the reagents used on a given NGS instrument. For example, more chemistry cycles generate longer reads and, because long reads allow for more sequence overlap, they are useful for de novo assembly. Choosing the appropriate read length and sequencing depth is critical to achieving successful outcomes in low-pass applications.
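Read length and sequencing depth are linked by a simple expected-coverage estimate: mean depth is roughly the number of reads multiplied by the read length, divided by the genome size. The sketch below illustrates that arithmetic only. The function names and the rounded 3 Gb genome size are assumptions chosen for the example, and the estimate ignores unmapped, duplicate, and filtered reads.

```python
import math


def mean_coverage(num_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Expected mean depth: total sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return num_reads * read_length_bp / genome_size_bp


def reads_needed(target_depth: float, read_length_bp: int, genome_size_bp: int) -> int:
    """Number of reads required to reach a target mean depth, rounded up."""
    if read_length_bp <= 0:
        raise ValueError("read_length_bp must be positive")
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


# Hypothetical example: 10 million 150 bp reads on a genome rounded to 3 Gb.
GENOME_SIZE_BP = 3_000_000_000  # assumption: rounded, human-sized genome
print(mean_coverage(10_000_000, 150, GENOME_SIZE_BP))  # 0.5
print(reads_needed(0.5, 150, GENOME_SIZE_BP))          # 10000000
```

Under these assumptions, the same 0.5x depth could be reached with fewer, longer reads or with more, shorter reads, which is why the two parameters are chosen together.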
Library complexity represents the diversity and number of unique DNA fragments in a sequencing library. The proportion of duplicate reads is a useful indicator of how complex the library is: the fewer duplicate reads present, the more complex the library.
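As a rough illustration of how duplicate reads relate to complexity, the sketch below counts how many reads share the same fragment key and reports the duplicate fraction. Using alignment coordinates as the key is a simplifying assumption made here. Dedicated duplicate-marking tools apply more careful rules, and the function name and example reads are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable


def duplicate_fraction(fragment_keys: Iterable[Hashable]) -> float:
    """Fraction of reads that are extra copies of an already-seen fragment.

    Each key identifies a fragment, e.g. (chromosome, start, end).
    """
    counts = Counter(fragment_keys)
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("no reads supplied")
    unique_fragments = len(counts)
    return (total_reads - unique_fragments) / total_reads


# Hypothetical aligned reads: the first two map to the same fragment.
reads = [
    ("chr1", 100, 250),
    ("chr1", 100, 250),
    ("chr1", 400, 560),
    ("chr2", 10, 170),
]
print(duplicate_fraction(reads))  # 0.25
```

A lower value means more of the reads come from distinct fragments, which is consistent with a more complex library.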
In an ideal scenario, sequencing libraries are as complex as possible, meaning they reflect the true nature of the starting material.6 Complex libraries give researchers greater confidence that any variants that they detect are, in fact, real.
How to Overcome These Concerns
Unique dual index (UDI) barcoding is a method that mitigates the impact of index hopping. With UDIs, both the i5 and i7 index reads must match an expected combination, and any misassigned reads will be flagged as undetermined reads and can be excluded from analysis.7
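The matching rule described above can be sketched in a few lines: a read is assigned to a sample only when its i7 and i5 indexes form one of the expected pairs. The index sequences and sample names below are made up for illustration, and the exact-match lookup is a simplification, since production demultiplexers typically tolerate a configurable number of mismatches.

```python
from collections import Counter

# Hypothetical UDI sample sheet: each (i7, i5) pair is used by exactly one sample.
EXPECTED_PAIRS = {
    ("ATCACGTT", "AGGCTATA"): "sample_A",
    ("CGATGTAA", "GCCTCTAT"): "sample_B",
}


def assign_sample(i7: str, i5: str, expected=EXPECTED_PAIRS) -> str:
    """Return the sample for an expected index pair, else 'undetermined'."""
    return expected.get((i7, i5), "undetermined")


def demultiplex(index_reads, expected=EXPECTED_PAIRS) -> Counter:
    """Count reads per sample, pooling unexpected pairs as 'undetermined'."""
    return Counter(assign_sample(i7, i5, expected) for i7, i5 in index_reads)


index_reads = [
    ("ATCACGTT", "AGGCTATA"),  # expected pair for sample_A
    ("CGATGTAA", "GCCTCTAT"),  # expected pair for sample_B
    ("ATCACGTT", "GCCTCTAT"),  # hopped: both indexes are valid, the pair is not
]
print(demultiplex(index_reads))
```

The third read carries two valid indexes, so a scheme that reuses indexes across samples in different combinations could assign it to the wrong sample. With unique pairs it falls into the undetermined bin instead.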
The purePlex™ Library Prep Kit offers 384 UDIs (4 sets of 96) and permits pooling of samples immediately following the tagging steps. Because purification and amplification of fragments occur after pooling, QC and labor costs are lower than with other UDI workflows.
You can learn more about the benefits of UDIs and their effect on index hopping with our previous blog “When Unique Dual Indexes Matter”.8
Regarding read length and depth, many NGS applications need each base to be covered multiple times in order to increase confidence in the result.9 For WGS, longer reads are essential for capturing insertions and deletions or for sequencing highly repetitive regions, such as those that contain transposons.10
Both purePlex and the plexWell™ LP 384 kit offer substantial normalization over a wide input range (3-30 ng for plexWell and 5-50 ng for purePlex). To learn more about read length, check out our previous blog “Short-Read Sequencing vs. Long-Read Sequencing: Which Technology is Right for Your Research?”.
Finally, to ensure the appropriate library complexity, researchers can implement titration techniques, such as dsDNA quantification by spectrofluorometry and real-time PCR. These methods help to optimize the number of molecules per reaction.11
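To show how a dsDNA mass measurement translates into a number of molecules per reaction, the sketch below applies the standard mass-to-moles conversion. It assumes an average molar mass of about 660 g/mol per base pair, which is a common approximation, and a single mean fragment length. The function name and the example values are illustrative only.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
GRAMS_PER_MOLE_PER_BP = 660.0   # assumption: average molar mass of one dsDNA base pair


def dsdna_molecules(mass_ng: float, mean_fragment_length_bp: float) -> float:
    """Approximate number of dsDNA molecules in a given mass."""
    if mean_fragment_length_bp <= 0:
        raise ValueError("mean_fragment_length_bp must be positive")
    grams = mass_ng * 1e-9
    moles = grams / (mean_fragment_length_bp * GRAMS_PER_MOLE_PER_BP)
    return moles * AVOGADRO


# Hypothetical input: 10 ng of DNA with a mean fragment length of 500 bp.
print(f"{dsdna_molecules(10, 500):.2e} molecules")
```

For the hypothetical input above, the estimate is on the order of 10^10 molecules. Halving the mass or doubling the fragment length halves the estimate.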
Consistency and Collaboration
Implementing quality control (QC) metrics at each step of the workflow is essential to monitoring and maintaining consistent performance throughout WGS experiments.
Additionally, researchers can generate technical replicates during the library preparation and sequencing steps.12 By sequencing the same sample multiple times in this way, researchers can identify technical variability and ensure consistency throughout their workflow.
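One simple way to quantify agreement between two technical replicates is genotype concordance: the fraction of sites called in both replicates that received the same genotype. The sketch below is a minimal illustration of that idea. The site identifiers, genotype strings, and function name are hypothetical, and real comparisons usually also account for call quality and missing data.

```python
def genotype_concordance(replicate_a: dict, replicate_b: dict) -> float:
    """Fraction of shared sites where both replicates report the same genotype."""
    shared_sites = replicate_a.keys() & replicate_b.keys()
    if not shared_sites:
        raise ValueError("replicates share no called sites")
    matches = sum(1 for site in shared_sites if replicate_a[site] == replicate_b[site])
    return matches / len(shared_sites)


# Hypothetical genotype calls from two sequencing runs of the same sample.
replicate_1 = {"site1": "0/1", "site2": "1/1", "site3": "0/0", "site4": "0/1"}
replicate_2 = {"site1": "0/1", "site2": "1/1", "site3": "0/1", "site5": "0/0"}

# Three sites are shared (site1, site2, site3) and two of them agree.
print(round(genotype_concordance(replicate_1, replicate_2), 3))  # 0.667
```

Tracking a metric like this across runs gives a concrete signal when technical variability starts to drift.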
Consistent evaluation and optimization of experimental procedures also contribute to improved efficiency and success rates for low-pass sequencing applications. This is especially true in areas such as pharmacogenetic research, where low-pass sequencing is an increasingly competitive alternative to genotyping arrays.1
Data documentation and transparency are crucial to reproducibility in low-pass applications. This includes:
- Sharing sequencing data, protocols, and analysis pipelines
- Outlining experimental details like sample characteristics, library preparation protocols, and bioinformatics analysis workflows
Transparency within the scientific community promotes progress and enables others to duplicate studies and compare results.
Cross-validation and independent validation also help ensure the robustness of low-pass applications. Researchers can cross-check their findings through independent replication or alternative experimental methods, which strengthens the reliability of results. Collaboration with other research groups also facilitates validation and fosters scientific advancement across the industry.
Enhancing success rates and efficiency in low-pass applications requires addressing technical challenges such as index hopping, optimizing library complexity, and selecting the appropriate read length. Collaboration among scientists is another crucial component for increasing success.
By implementing the tools discussed in this blog, such as purePlex and plexWell LP 384, researchers can improve the reliability and accuracy of their low-pass applications, leading to more insightful genomic discoveries across the scientific community.
References
1. BMC Genomics | Comparing low-pass sequencing and genotyping
2. Methods to improve the accuracy of next-generation sequencing
3. 10x Genomics | Index hopping
4. seqWell | Getting a Read on NGS Barcodes
5. Illumina | Sequencing read length
6. Agilent | Fundamentals of NGS Library Preparation
7. Illumina | Unique dual indexes
8. seqWell | When Unique Dual Indexes Matter
9. iRepertoire | NGS considerations
10. seqWell | Short-Read Sequencing vs. Long-Read Sequencing: Which Technology is Right for Your Research?
11. Comparison of DNA Quantification Methods for NGS
12. The Role of Replicates for Error Mitigation in NGS