Next-generation sequencing in laboratory medicine

Advancements in sequencing techniques and the emergence of next-generation sequencing (NGS) technology have transformed laboratory medicine and clinical diagnosis. DNA sequencing results help healthcare professionals with disease diagnosis, prognosis, therapeutic decisions, and patient follow-up.1

Sanger sequencing

The first DNA sequencing methods were developed in 1977 in the laboratories of Frederick Sanger and Walter Gilbert, and Sanger’s group sequenced the genome of bacteriophage ϕX174. Sanger’s chain-termination method became known as Sanger sequencing. It involves electrophoresis and is based on the random incorporation of chain-terminating dideoxynucleotides (ddNTPs) by DNA polymerase during in vitro DNA replication. By sequencing overlapping fragments, it can be used to read the nucleotide sequence of entire genes (1,000 to 30,000 bases long). After the polymerase chain reaction (PCR) technique for DNA amplification emerged in 1983, it was subsequently adopted in DNA sequencing.2 Automated Sanger sequencing was used to sequence the human genome as part of the Human Genome Project (HGP), which concluded in 2003.3
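The chain-termination principle can be illustrated with a small simulation. This is a toy sketch and not a model of real chemistry: the template sequence, the termination probability, and the function names are all illustrative assumptions, and strand orientation is ignored. Each simulated extension product stops at a randomly incorporated terminator, and sorting the products by length (the role electrophoresis plays) recovers the sequence of the newly synthesized strand.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def chain_terminated_fragments(template, n_molecules=2000, p_terminate=0.2, seed=0):
    """Simulate extension products, each ending at a labelled terminator base.

    Returns a set of (length, terminal_base) pairs, one per distinct fragment.
    """
    rng = random.Random(seed)
    # The new strand is complementary to the template
    # (orientation is ignored to keep the toy model simple).
    new_strand = [COMPLEMENT[base] for base in template]
    fragments = set()
    for _ in range(n_molecules):
        for position, base in enumerate(new_strand, start=1):
            # At each position a terminator (ddNTP) may be incorporated
            # instead of a normal nucleotide, which halts extension.
            if rng.random() < p_terminate:
                fragments.add((position, base))
                break
        # Molecules that never terminate are full-length and carry no label.
    return fragments


def read_sequence(fragments):
    """Sort fragments by length and read off the terminal base of each."""
    return "".join(base for _, base in sorted(fragments))


template = "TACGGATCCA"  # hypothetical 10-base template
fragments = chain_terminated_fragments(template)
print(read_sequence(fragments))  # complement of the template
```

With enough simulated molecules, every position is represented by at least one terminated fragment, which is why the ordered terminal bases spell out the whole strand.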

Though Sanger sequencing is considered the gold standard in clinical cytogenetics,2 it has the following limitations:3,4

  • Can sequence only short stretches of DNA, up to about 800 to 1,000 base pairs.
  • Sequence quality is often poor in the first 15 to 40 bases because that is where the primer binds.
  • Cannot resolve single-base-pair differences in segments longer than 900 bases.
  • Expensive.

The Human Genome Project showed that the human genome is made up of about 3.1 billion base pairs5 and around 19,000 to 20,000 protein-coding genes.6,7 Applying this knowledge to diagnosis and treatment required a faster and less expensive sequencing technology, which led to the development of advanced sequencing technologies collectively called “next-generation sequencing” (NGS). Figure 1 captures the major milestones in sequencing technologies.2

Next-generation sequencing (NGS)

Next-generation sequencing technologies are high-throughput DNA sequencing technologies that are capable of sequencing large numbers of different DNA sequences in a single reaction (i.e., in parallel) and hence are often called massively parallel sequencing. All NGS technologies monitor the sequential addition of nucleotides to immobilized and spatially arrayed DNA templates but differ substantially in how these templates are generated and how they are interrogated to reveal their sequences.3

Based on read length, NGS can be classified as short-read or long-read sequencing. Short-read sequencing is currently the most commonly used form of NGS and has a wide range of diagnostic applications. In this type of sequencing, the genome is broken into small fragments of 50 to 300 bases before being sequenced.8 The technologies carry out sequencing by hybridization, by synthesis using DNA polymerase, or by ligation using a ligase enzyme, and extend numerous DNA strands in parallel. Nucleotides can either be provided one at a time, or they can be modified with identifying tags. Short-read sequencing technologies can be further categorized as either single-molecule-based, in which a single molecule is sequenced, or ensemble-based, in which multiple identical copies of a DNA molecule, usually amplified together on isolated beads, are sequenced. Furthermore, these methods can be real-time or synchronously controlled.
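To give a sense of scale, the sketch below estimates how many short reads are needed to cover a genome of 3.1 billion base pairs. It uses the standard relation mean depth = (number of reads × read length) / genome size. The 30x target depth and the function names are illustrative assumptions and do not come from the sources cited here.

```python
def reads_needed(genome_size, read_length, target_depth):
    """Smallest read count N with N * read_length / genome_size >= target_depth."""
    total_bases = target_depth * genome_size
    return (total_bases + read_length - 1) // read_length  # integer ceiling


def mean_depth(n_reads, read_length, genome_size):
    """Average number of times each base is covered."""
    return n_reads * read_length / genome_size


GENOME_SIZE = 3_100_000_000  # base pairs, as stated in the text
TARGET_DEPTH = 30            # assumed value, for illustration only

for read_length in (50, 150, 300):
    n = reads_needed(GENOME_SIZE, read_length, TARGET_DEPTH)
    print(f"{read_length}-base reads: {n:,} reads for {TARGET_DEPTH}x mean depth")
```

Under these assumptions, 150-base reads would require 620 million reads, which shows why sequencing many fragments in parallel matters.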

Long-read sequencers can read long strands of DNA or RNA (between 5,000 and 30,000 base pairs) in one go, without breaking them up into smaller fragments. They have played an important role in producing the most complete human reference genome sequence by enabling sequencing of regions that short-read sequencing struggled with. Even though they offer improved accuracy for the detection of specific types of genetic variants that are difficult to detect using current short-read sequencing methods, their use in clinical practice is restricted to a few specific situations, though it may expand in the future.9 Next-generation sequencing technologies have progressively evolved. Table 1 provides examples of a few NGS platforms that are currently available.

NGS approaches

Depending on the clinical applications, different approaches can be used for NGS.15

Whole genome sequencing (WGS): A comprehensive method to analyze the entire genome. It is useful for identifying inherited and rare disorders, screening for prenatal aneuploidy, characterizing the mutations that drive cancer progression, investigating the epidemiology of disease outbreaks, and determining antimicrobial resistance.16

Short-read WGS has the following major steps:16

1. Sample preparation: DNA is isolated by conventional methods from EDTA- or citrate-stabilized whole blood, or from surgically removed or biopsied tissue. To enable detection of copy number variation (CNV), high-molecular-weight DNA is preferred. Though a PCR amplification step was required in the past, newer technologies no longer need it.

2. Library preparation: The library is generated by fragmenting the high-molecular-weight DNA and then ligating adapters that will bind to the linker DNA on the chip surface. Barcodes that enable samples from different patients to be pooled on the same chip may also be attached.

3. Cluster generation: The libraries are subsequently loaded onto a flow cell and placed on the sequencer, after which the individual DNA fragments are clonally amplified by a polymerase, generating small single-stranded clusters of the particular fragments.

4. Sequencing: The sequencing uses the principle of Sanger sequencing, where elongation is initiated by the addition of a sequence primer and polymerase and the nucleotide sequence is determined by the incorporation of complementary fluorescent-tagged nucleotide terminators. The fluorescent signal from the incorporated terminators is detected by scanning the chip and the individual clusters with a high-resolution confocal fluorescence laser detector after every round of nucleotide incorporation.

5. Data compilation using a high-performance computer: The sequencer produces raw sequence data, which are compiled in a FASTQ file and transferred to the high-performance computer (HPC). The reads are then aligned to a reference genome, producing a file in Sequence Alignment/Map (SAM) format or its compressed binary version (BAM). From there, variant calling identifies differences between the sample genome and the reference genome. That output is stored in a Variant Call Format (VCF) file.

6. Data annotation and interpretation with bioinformatics: The VCF file is finally uploaded to the interpreters in the genomic laboratory for filtration, annotation, and interpretation.

The general workflow of WGS is outlined in Figure 3.
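The data compilation step (reads in FASTQ, alignment, variant calling, VCF-style output) can be sketched in miniature. The code below is a toy illustration under strong assumptions: the reference, the reads, and the name `toy_ref` are made up, alignment is an ungapped search for the offset with the fewest mismatches, and a variant is called by simple majority vote. Clinical pipelines rely on dedicated aligners and variant callers and on the full SAM/BAM and VCF specifications.

```python
from collections import Counter, defaultdict


def parse_fastq(text):
    """Yield (read_id, sequence, quality) from FASTQ text (4 lines per record)."""
    lines = text.strip().splitlines()
    for i in range(0, len(lines), 4):
        yield lines[i][1:], lines[i + 1], lines[i + 3]


def align(read, reference):
    """Return the 0-based offset with the fewest mismatches (no gaps)."""
    def mismatches(offset):
        return sum(a != b for a, b in zip(read, reference[offset:]))
    return min(range(len(reference) - len(read) + 1), key=mismatches)


def call_variants(reads, reference, chrom="toy_ref", min_depth=2):
    """Pile up aligned bases; report positions whose majority base differs."""
    pileup = defaultdict(Counter)
    for read in reads:
        offset = align(read, reference)
        for i, base in enumerate(read):
            pileup[offset + i][base] += 1
    records = []
    for pos in sorted(pileup):
        base, count = pileup[pos].most_common(1)[0]
        if count >= min_depth and base != reference[pos]:
            # VCF-style fields: CHROM, POS (1-based), ID, REF, ALT
            records.append((chrom, pos + 1, ".", reference[pos], base))
    return records


REFERENCE = "ACGTTGCAAGGCTTACGATC"  # hypothetical 20-base reference
FASTQ = """@read1
ACGTTGCAAG
+
IIIIIIIIII
@read2
TTGCATGGCT
+
IIIIIIIIII
@read3
GCATGGCTTA
+
IIIIIIIIII
@read4
GGCTTACGAT
+
IIIIIIIIII
"""

reads = [seq for _, seq, _ in parse_fastq(FASTQ)]
for record in call_variants(reads, REFERENCE):
    print("\t".join(str(field) for field in record))
```

In this example two of the three reads covering position 9 carry a T where the reference has an A, so a single A-to-T record is reported.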

Whole exome sequencing (WES): Whole exome sequencing (WES), or exome sequencing, is a technique for sequencing all of the protein-coding regions of genes in a genome. Broadly speaking, it has two main steps, as shown in Figure 4: 1) selecting only the subset of DNA that encodes proteins, known as exons, and 2) sequencing the exonic DNA. Humans have about 180,000 exons, constituting about 1% of the human genome.17

WES is increasingly being utilized in the initial stages of diagnostic evaluation, especially for disorders that are genetically heterogeneous, such as complex neurologic diagnoses and multiple congenital anomalies. It has been used as a method of gene discovery in large series of patients with autism, epilepsy, brain malformations, congenital heart disease, and neurodevelopmental disabilities.18

The Helix® Genetic Health Risk App for Late-onset Alzheimer’s disease is an FDA-approved test that uses exome sequencing.19

Targeted next-generation sequencing (tNGS): Targeted next-generation sequencing (tNGS) focuses on specific regions of interest in the genome, such as specific genes, coding regions, or even chromosomal segments, at deeper coverage. Before a target sequence (TS) panel is sequenced, the genomic regions of interest must be enriched relative to the genomic background. This step is crucial and ensures that the NGS process sequences the genomic targets efficiently and accurately. The common enrichment approaches are amplicon generation by polymerase chain reaction (PCR) and hybrid capture-based techniques.20
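Amplicon-based enrichment can be pictured as an in silico PCR: a pair of primers brackets the region of interest, and only that region is carried forward. The sketch below is a simplified illustration. The template and primer sequences are hypothetical, and exact string matching stands in for primer annealing, so mismatches and melting temperature are ignored.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon(template, forward_primer, reverse_primer):
    """Return the region bounded by both primers (inclusive), or None."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we search for its reverse complement downstream of the
    # forward primer.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]


TEMPLATE = "TTGACCGTAGGCTAACGTTAGCCATGGATCCGTA"  # hypothetical genomic segment
FORWARD = "CCGTAGG"                              # hypothetical primers
REVERSE = "GATCCATG"

print(amplicon(TEMPLATE, FORWARD, REVERSE))
```

Hybrid capture reaches the same goal differently, by pulling down fragments that hybridize to probes. It is not modeled here.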

tNGS is well suited to the diagnosis of microbial infections (particularly those caused by drug-resistant organisms) and to the diagnosis, prognosis, and therapy monitoring of cancer patients. The WHO recently suggested using tNGS for diagnosis of drug-resistant tuberculosis.21 The following are a few FDA-approved tests that use tNGS:

  • Tempus’s xT CDx, a 648-gene NGS assay for solid tumor profiling and a companion diagnostic for patients with colorectal cancer (CRC)22
  • Thermo Fisher Scientific’s Oncomine Dx Target Test, which detects genetic variants in tumor tissue from patients with non-small cell lung cancer (NSCLC) and cholangiocarcinoma23
  • Memorial Sloan Kettering Cancer Center’s MSK-IMPACT, which targets 505 genes and profiles mutations in both rare and common cancers24
  • Foundation Medicine’s FoundationOne®CDx is a tissue-based broad companion diagnostic (CDx) that is clinically and analytically validated for all solid tumors25

Metagenomic next-generation sequencing (mNGS): Metagenomic next-generation sequencing (mNGS) is a shotgun sequencing approach in which all of the nucleic acid (DNA and RNA) in a clinical sample is sequenced at a very high depth of 10 to 20 million sequences per sample. mNGS can be performed on any type of clinical sample, including cerebrospinal fluid, plasma, respiratory secretions, urine, stool, or tissue. A single mNGS test can detect sequence reads corresponding to all pathogens: viruses, bacteria, fungi, and parasites. It can thus be used to identify the potential cause of a patient’s infection. To date, no mNGS test has been FDA-cleared or approved, although a few CLIA-certified laboratories offer such tests on clinical samples.26,27
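The idea of assigning every read in a sample to its most likely source can be sketched with a small k-mer lookup. This is only a toy illustration: the reference sequences, the organism names (`human_host`, `virus_X`, `bacterium_Y`), and the choice of k = 8 are assumptions made for the example, and clinical mNGS pipelines use far larger databases and more careful statistics.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def classify(read, index, k):
    """Assign a read to the reference sharing the most k-mers with it."""
    votes = Counter()
    for kmer in kmers(read, k):
        for name in index.get(kmer, ()):
            votes[name] += 1
    if not votes:
        return "unclassified"
    (best, best_n), *rest = votes.most_common(2)
    if rest and rest[0][1] == best_n:
        return "ambiguous"
    return best


K = 8
REFERENCES = {  # hypothetical toy sequences
    "human_host": "ATGGCGTACGTTAGCCTAGGATCCGATTACA",
    "virus_X": "GCATCAGGAACTTGCGAAGTCCATGCAATCG",
    "bacterium_Y": "TTGACAGCTAGCTCAGTCCTAGGTATAATGC",
}
READS = [
    "GCGTACGTTAGCCTAG",
    "CAGGAACTTGCGAAGT",
    "TGCGAAGTCCATGCAA",
    "GCTAGCTCAGTCCTAG",
    "TTTTTTTTTTTTTTTT",
]

index = build_index(REFERENCES, K)
counts = Counter(classify(read, index, K) for read in READS)
for name, n in counts.most_common():
    print(f"{name}\t{n}")
```

Reads that match nothing in the index are reported as unclassified, which mirrors the fact that a result depends on what the reference database contains.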

Conclusion

Next-generation sequencing has transformed laboratory medicine: it has enhanced our understanding of disease mechanisms, enabled identification of specific biomarkers, and allowed medical interventions to be tailored to an individual’s genetics, i.e., precision medicine. NGS has also contributed to the development of non-invasive liquid biopsies for detecting cancer and monitoring disease progression and treatment response. NGS is also being used for RNA sequencing, methylation sequencing to study the methylation pattern of DNA, sequencing of microorganisms, and sequencing of cell-free DNA (cfDNA) and cell-free RNA (cfRNA). Clinical laboratories have adopted NGS as a gold standard for the diagnosis of hereditary disorders because of its analytic accuracy, high throughput, and potential for cost-effectiveness.28

In spite of these benefits, NGS has several challenges. It is complex and requires skilled professionals to perform the tests, analyze and interpret data, and standardize protocols. Though NGS is evolving at a rapid pace, it is still primarily used for research purposes or in CLIA-certified laboratories as laboratory-developed tests. With advancements in NGS technologies and artificial intelligence, NGS may in the future become an easy-to-use point-of-care test and part of the standard of care.

REFERENCES

  1. Qin D. Next-generation sequencing and its clinical application. Cancer Biol Med. 2019;16(1):4-10. doi:10.20892/j.issn.2095-3941.2018.0055. 
  2. Morganti S, Tarantino P, Ferraro E, et al. Role of next-generation sequencing technologies in Personalized Medicine. In: P5 eHealth: An Agenda for the Health Technologies of the Future. Springer International Publishing; 2020:125-154.
  3. Rizzo JM, Buck MJ. Key principles and clinical applications of “next-generation” DNA sequencing. Cancer Prev Res (Phila). 2012;5(7):887-900. doi:10.1158/1940-6207.capr-11-0432.
  4. Crossley BM, Bai J, Glaser A, et al. Guidelines for Sanger sequencing and molecular assay monitoring. J Vet Diagn Invest. 2020;32(6):767-775. doi:10.1177/1040638720905833. 
  5. Homo_sapiens - Ensembl genome browser 111. Ensembl.org. Accessed March 26, 2024. https://useast.ensembl.org/Homo_sapiens/Info/Annotation.  
  6. Omenn GS, Lane L, Overall CM, et al. Research on the Human Proteome Reaches a Major Milestone: >90% of Predicted Human Proteins Now Credibly Detected, According to the HUPO Human Proteome Project. J Proteome Res. 2020;19(12):4735-4746. doi:10.1021/acs.jproteome.0c00485.
  7. Amaral P, Carbonell-Sala S, De La Vega FM, et al. The status of the human gene catalogue. Nature. 2023;622(7981):41-47. doi:10.1038/s41586-023-06490-x. 
  8. Short-read sequencing. GeNotes. Accessed March 26, 2024. https://www.genomicseducation.hee.nhs.uk/genotes/knowledge-hub/short-read-sequencing/
  9. Long-read sequencing. GeNotes. Accessed March 26, 2024. https://www.genomicseducation.hee.nhs.uk/genotes/knowledge-hub/long-read-sequencing/
  10. Sequencing platforms. Illumina.com. Accessed March 26, 2024. https://www.illumina.com/systems/sequencing-platforms.htm
  11. Ion GeneStudio system models - US. Accessed March 26, 2024. https://www.thermofisher.com/us/en/home/life-science/sequencing/next-generation-sequencing/instruments/ion-genestudio-system/models.html
  12. Sequencing systems. PacBio. Published January 24, 2022. Accessed March 26, 2024. https://www.pacb.com/sequencing-systems/.   
  13. Oxford Nanopore flow cells and sequencing devices. Oxford Nanopore Technologies. Accessed March 26, 2024. https://nanoporetech.com/products/sequence.   
  14. Petric RC, Pop LA, Jurj A, et al. Next generation sequencing applications for breast cancer research. Clujul Med. 2015;88(3):278-87. doi:10.15386/cjmed-486.
  15. Rodino KG, Simner PJ. Status check: next-generation sequencing for infectious-disease diagnostics. J Clin Invest. 2024;134(4):e178003. doi:10.1172/JCI178003.
  16. Bagger FO, Borgwardt L, Jespersen AS, et al. Whole genome sequencing in clinical practice. BMC Med Genomics. 2024;17(1):39. doi:10.1186/s12920-024-01795-w.
  17. Wikipedia contributors. Exome sequencing. Wikipedia, The Free Encyclopedia. Published March 18, 2024. https://en.wikipedia.org/w/index.php?title=Exome_sequencing&oldid=1214332692.  
  18. Retterer K, Juusola J, Cho MT, et al. Clinical application of whole-exome sequencing across clinical indications. Genet Med. 2016;18(7):696-704. doi:10.1038/gim.2015.148. 
  19. Helix FDA Authorization. Helix.com. Accessed March 26, 2024. https://www.helix.com/helix-fda-authorization
  20. Pei XM, Yeung MHY, Wong ANN, et al. Targeted Sequencing Approach and Its Clinical Applications for the Molecular Diagnosis of Human Diseases. Cells. 2023;12(3):493. doi:10.3390/cells12030493.
  21. WHO issues rapid communication on use of targeted next-generation sequencing for diagnosis of drug-resistant tuberculosis. Who.int. Accessed March 26, 2024. https://www.who.int/news/item/25-07-2023-who-issues-rapid-communication-on-use-of-targeted-next-generation-sequencing-for-diagnosis-of-drug-resistant-tuberculosis.
  22. Troy J. Tempus receives U.S. FDA approval for xT CDx, a NGS-based in vitro diagnostic device. Tempus. Published May 1, 2023. Accessed March 26, 2024. https://www.tempus.com/news/tempus-receives-u-s-fda-approval-for-xt-cdx-a-ngs-based-in-vitro-diagnostic-device/
  23. Oncomine dx target test--US - US. Accessed March 26, 2024. https://www.thermofisher.com/us/en/home/clinical/diagnostic-testing/condition-disease-diagnostics/oncology-diagnostics/oncomine-dx-target-test/oncomine-dx-target-test-us-only.html
  24. MSK-IMPACT: A targeted test for mutations in both rare and common cancers. Memorial Sloan Kettering Cancer Center. Accessed March 26, 2024. https://www.mskcc.org/msk-impact.  
  25. FoundationOne®CDx. Foundation Medicine. Accessed March 26, 2024. https://www.foundationmedicine.com/test/foundationone-cdx.  
  26. Miller S, Naccache SN, Samayoa E, et al. Laboratory validation of a clinical metagenomic sequencing assay for pathogen detection in cerebrospinal fluid. Genome Res. 2019;29(5):831-842. doi:10.1101/gr.238170.118.
  27. Technology. UCSF Center for Next-Gen Precision Diagnostics. Published January 11, 2017. Accessed March 26, 2024. https://nextgendiagnostics.ucsf.edu/technology/.  
  28. Hartman P, Beckman K, Silverstein K, et al. Next generation sequencing for clinical diagnostics: Five year experience of an academic laboratory. Mol Genet Metab Rep. 2019;19:100464. doi:10.1016/j.ymgmr.2019.100464.