Accelerating cancer biomarker development using the latest mass spectrometry tools and techniques
Cancer is the second leading cause of mortality and is responsible for an estimated 9.6 million deaths worldwide each year.1 While the prevention of some types of cancer is viewed as a long-term goal for scientific research, an immediate priority for the research community is the improvement of patient outcomes by finding new, more effective treatment approaches and detecting the disease earlier. As such, translational proteomics has become an integral part of cancer research, playing a key role in the discovery and development of protein biomarkers that are set to revolutionize cancer diagnosis, monitoring, and treatment.
The advances in proteomics that have been made over the past two decades stem in large part from remarkable developments in mass spectrometry (MS). Ongoing improvements in the mass accuracy, resolution, and sensitivity of MS instruments are enabling the rapid and reliable detection, identification, and quantitation of proteins in complex mixtures. These techniques are opening the door to improved methods for discovering disease-specific biomarkers with the potential to support early disease detection and even individualized therapies.
Despite improvements in the technical performance of MS instrumentation, a lack of reproducible, scalable workflows has made the task of translating promising candidate biomarkers into real-world clinical diagnostic assays extremely challenging.2 However, in recent years, advanced MS-based proteomics workflows have been developed that are enabling exceptional reproducibility, multiplexing capacity, and quantitative accuracy. These workflows are now being used in large-scale proteomics research projects, including the Cancer Moonshot initiative, with the aim of driving the successful translation of biomarkers from discovery stage research to clinical applications.
Powerful MS techniques for translational proteomics
MS has proven to be an indispensable tool for unraveling the complexities of the proteome. Its ability to characterize and sequence individual proteins with precision means it can be used to determine protein interactions, document protein expression, and pinpoint sites of protein modification. Recent advances in instrumentation have improved the sensitivity, resolution, quantitative accuracy, and acquisition speed of MS techniques, enabling high-throughput, in-depth proteomic analysis.
Fundamental proteomics research has typically focused on analytical sensitivity and depth of coverage. However, as biomarker development progresses toward clinical application, MS requirements have become more quantitative, and factors such as reproducibility, standardization, and scalability have become more important. Biomarker verification studies screen potential biomarkers to ensure that only the highest quality leads from the discovery phase are taken forward. These verification workflows require high-throughput methods that offer high analytical specificity and sensitivity, need minimal sample preparation, and provide confident protein confirmation.
Due to innate biological variability, verified candidate biomarkers need to be validated across a large number of samples by developing targeted quantitation assays. Another area of focus for modern proteomics MS workflows is the efficient, targeted analysis of as many candidates as possible across hundreds, and potentially thousands, of samples.
Biomarker discovery using multiplex labeling techniques
Biomarker discovery is the first step in the translational proteomics pipeline. Discovery proteomics workflows are used to characterize biological samples at the protein level, in order to identify potential protein markers associated with disease in small patient cohorts. As biological samples contain very large numbers of proteins, discovery workflows must facilitate comprehensive profiling to ensure no promising candidate biomarkers are overlooked. To help researchers achieve multidimensional characterization of the proteome, multiplexing tools optimized for use with high resolution tandem MS platforms are enabling more accurate and higher throughput quantitation.
While earlier approaches were limited by the need to analyze complex samples one by one, isobaric labeling strategies such as tandem mass tag (TMT) workflows allow multiple samples to be pooled and analyzed in parallel in large-scale, high-throughput quantitative experiments. Using this technique, the peptides from each sample are chemically labeled with a different variant of an isobaric tag, which binds covalently to peptide N-termini and lysine residues. During MS/MS analysis, each tag variant produces a unique reporter ion. The intensities of the reporter ion peaks are recorded and compared to determine the relative abundances of the peptide in each sample.
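The reporter ion comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the channel labels and intensity values are made up for the example, and real workflows typically add steps such as isotopic impurity correction and normalization that are not shown here.

```python
def relative_abundances(reporter_intensities):
    """Convert the reporter-ion intensities of one peptide into per-channel fractions."""
    total = sum(reporter_intensities.values())
    if total == 0:
        raise ValueError("no reporter-ion signal recorded for this peptide")
    return {
        channel: intensity / total
        for channel, intensity in reporter_intensities.items()
    }


def fold_change(reporter_intensities, channel, reference):
    """Ratio of one channel's reporter intensity to a chosen reference channel."""
    return reporter_intensities[channel] / reporter_intensities[reference]


# Hypothetical reporter-ion intensities for a single peptide in four pooled samples
peptide_reporters = {"126": 1200.0, "127N": 2400.0, "127C": 1200.0, "128N": 1200.0}

fractions = relative_abundances(peptide_reporters)
ratio_vs_126 = fold_change(peptide_reporters, "127N", "126")
```

In this made-up example, the peptide appears twice as abundant in the sample labeled with channel 127N as in the sample labeled with channel 126.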
Multiplex approaches based on TMT workflows play an important role in cancer proteomics by detecting the subtle biological changes that contribute to the disease. For example, a TMT-based quantitative proteomics approach was recently used to identify potential diagnostic biomarkers for gastric cancer.3 The study compared differentially expressed proteins between cancer patients and healthy control subjects, finding several up-regulated and down-regulated proteins active in antigen binding, calcium ion binding, and protein homodimerization. Used in this way, TMT workflows are proving to be highly effective in identifying differentially expressed proteins associated with cancer pathways.
A scalable approach for discovery and verification
Although TMT workflows play a vital role in discovery-stage proteomics experiments, larger targeted verification-stage studies call for more flexible and scalable techniques. Label-free data-dependent acquisition (DDA) methods have proven particularly useful in meeting this need and can be used to directly compare relative abundances of proteins across multiple liquid chromatography (LC)-MS/MS experiments without the use of isotopic tags. These methods are generally used for proteome-wide protein identification studies and are very effective at extending proteome coverage while minimizing redundant peptide precursor selection. The primary advantage of this approach is that the number of sample comparisons is not limited, creating a comprehensive and scalable workflow.
Label-free protein quantitation is based on tandem MS analysis of the most abundant precursor ions. In contrast to stable isotope-labeling approaches, where differentially labeled samples are combined and analyzed together, samples studied using label-free approaches are each measured in a separate run. While this allows any number of samples to be compared, deviations arising from sample preparation or instrument performance add run-to-run variability and reduce precision. Thus, DDA-based experiments require more repeat measurements to achieve statistical significance.4
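One common way to put a number on this run-to-run variability is the coefficient of variation (CV) of a protein's intensity across replicate injections. The sketch below assumes three hypothetical replicate intensities; the values and the choice of CV as the precision metric are illustrative and are not taken from the studies cited here.

```python
from statistics import mean, stdev


def coefficient_of_variation(intensities):
    """Percent CV (sample standard deviation / mean) across replicate measurements."""
    if len(intensities) < 2:
        raise ValueError("at least two replicates are needed to estimate a CV")
    average = mean(intensities)
    if average == 0:
        raise ValueError("mean intensity is zero, so the CV is undefined")
    return 100.0 * stdev(intensities) / average


# Hypothetical label-free intensities for one protein in three replicate runs
replicate_intensities = [100.0, 110.0, 90.0]
cv_percent = coefficient_of_variation(replicate_intensities)
```

A higher CV means that a given abundance difference between groups is harder to distinguish from noise, which is why more replicates are needed.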
The latest DDA workflow optimizations, including improvements in separation, acquisition, and data analysis, are helping to overcome the challenges around method standardization and experimental reproducibility. Improvements in the sensitivity of capillary flow high-performance LC (HPLC) technologies, for example, are enabling better separation of peptides and, ultimately, delivering more precise data. Data quality can be further enhanced using the latest LC columns, which are designed to achieve more consistent chromatographic separation by reducing dead volumes.
Modern high-resolution accurate mass (HRAM) technologies are also leading to improved reproducibility in biomarker verification. The latest generation of Orbitrap mass spectrometers, for example, provides increased acquisition speed and advanced peak determination to expand the number of peptides sampled, thus increasing peptide identification across varying data acquisition modes. Significant improvements in sampling depth, sequencing speed, and protein identifications provide better and more consistent data for enhanced run-to-run reproducibility and confident biomarker verification.
High-resolution verification workflows
While DDA workflows are very useful for scalable biomarker verification, achieving the required analytical sensitivity can sometimes be challenging. Data-independent acquisition (DIA) is an alternative label-free biomarker verification approach that overcomes this challenge. In a DIA analysis, a set of precursor acquisition windows is used to cover a broad mass-to-charge (m/z) range. All peptides within each defined m/z window are fragmented and a product ion spectrum for each detectable peptide is generated, providing multiplexed proteome-wide quantification of even low-level proteins.
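As a rough illustration of how a set of acquisition windows tiles an m/z range, the following sketch generates fixed-width isolation windows. The range of 400 to 1000 m/z and the window widths are assumptions chosen for the example, not settings taken from any particular instrument method.

```python
def dia_windows(mz_start, mz_end, width, overlap=0):
    """Return (lower, upper) isolation windows that tile the range mz_start..mz_end."""
    if width <= overlap:
        raise ValueError("window width must be larger than the overlap")
    windows = []
    lower = mz_start
    step = width - overlap
    while lower < mz_end:
        # Clip the final window so it does not extend past the end of the range
        upper = min(lower + width, mz_end)
        windows.append((lower, upper))
        if upper >= mz_end:
            break
        lower += step
    return windows


# Hypothetical method settings: the same m/z range covered with wide and narrow windows
wide_windows = dia_windows(400, 1000, width=25)
narrow_windows = dia_windows(400, 1000, width=10)
```

Narrower windows mean that fewer precursors are co-isolated in each MS/MS spectrum, at the cost of more windows per acquisition cycle.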
While DIA workflows are well suited for biomarker candidate analysis in human samples, challenges with analytical selectivity and dynamic range have led to the search for method improvements. The co-isolates and co-fragments that are sampled by broad acquisition ranges can produce highly complex MS/MS spectra, making confident analysis more problematic. This can be particularly challenging when working with clinical samples such as plasma due to the natural abundance and diversity of peptides and plasma proteins.
High-resolution DIA (HR-DIA) workflows based on hybrid quadrupole-Orbitrap MS technologies are helping to obtain more confident measurements from large-scale proteomics studies. HR-DIA workflows address challenges with sample complexity by using much narrower acquisition windows, an optimization made possible by the increased mass resolution of modern Orbitrap instruments. By making complex spectra easier to deconvolute, and thereby improving precursor selectivity while keeping the analysis unbiased, HR-DIA workflows increase fidelity and identification range, leading to more reproducible and comprehensive protein profiling.
Validating biomarkers with sensitive and specific protein quantification
Biomarker validation requires workflows with more directed analysis, capable of sensitive and specific protein quantification. While MS approaches for biomarker validation have traditionally relied on selected-reaction monitoring (SRM) techniques performed using triple quadrupole mass spectrometers, variation in the intensities of the product ions generated from the precursor ions can result in sensitivity issues. Parallel-reaction monitoring (PRM) is an alternative approach that uses hybrid quadrupole-Orbitrap technologies to achieve more sensitive protein quantitation by identifying the most intense product ions to analyze. PRM offers higher selectivity and high-throughput protein quantitation, supporting confident peptide quantification.
Once biomarker selection is narrowed to a small number of target peptides, PRM for targeted MS quantification allows the full MS/MS spectra to be acquired for each precursor. This enables higher analyte selectivity to be achieved than with SRM, facilitating better discrimination of target peptides from co-eluting interferences present in complex biological matrices. High-resolution MS can also support PRM analysis, enabling detection of low abundance peptides common in biological samples and outperforming alternative methods in terms of absolute quantification.
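Because PRM records the full MS/MS spectrum for each precursor, the product ions used for quantitation can be chosen after acquisition. The sketch below picks the most intense product ions from one spectrum and sums them. The fragment labels, intensities, and the choice of three ions are hypothetical, and a real analysis would also check each ion for interference before including it.

```python
def top_product_ions(spectrum, n=3):
    """Return the n most intense (ion, intensity) pairs from one MS/MS spectrum."""
    ranked = sorted(spectrum.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


def summed_intensity(spectrum, n=3):
    """Sum the intensities of the n most intense product ions."""
    return sum(intensity for _, intensity in top_product_ions(spectrum, n))


# Hypothetical product-ion intensities for one target peptide
msms_spectrum = {"y3": 500.0, "y4": 1500.0, "y5": 3000.0, "b2": 200.0, "y6": 1000.0}

selected_ions = [ion for ion, _ in top_product_ions(msms_spectrum)]
peptide_signal = summed_intensity(msms_spectrum)
```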
Despite these advantages, standard PRM methods can be limited by inconsistent retention times. Temperature fluctuations, inefficient mobile phase mixing, flow rate instability, or column contamination can all influence analyte retention times and ultimately affect method reliability. Dynamic retention time PRM (dRT-PRM) is an improvement to PRM workflows that can help to correct for these issues by monitoring and adjusting retention time windows in real time using internal standards. In addition, dRT-PRM offers further benefits in terms of the quality and precision of peptide measurements to improve analytical reproducibility.
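The idea of correcting scheduled retention time windows with internal standards can be illustrated with a deliberately simplified sketch. It assumes a single constant offset, estimated as the median shift of the standards, and applies that offset to every target window. The standard names, peptide names, and times are hypothetical, and this is not a description of how any vendor's software performs the adjustment.

```python
from statistics import median


def retention_time_shift(expected, observed):
    """Median observed-minus-expected retention time (minutes) over shared standards."""
    shared = expected.keys() & observed.keys()
    if not shared:
        raise ValueError("none of the internal standards were observed")
    return median(observed[name] - expected[name] for name in shared)


def shift_windows(windows, shift):
    """Move every scheduled (start, end) window by the same offset."""
    return {
        peptide: (start + shift, end + shift)
        for peptide, (start, end) in windows.items()
    }


# Hypothetical internal standards: expected vs. observed retention times in minutes
expected_rt = {"std_A": 10.0, "std_B": 20.0, "std_C": 30.0}
observed_rt = {"std_A": 10.5, "std_B": 20.5, "std_C": 30.5}

# Hypothetical scheduled acquisition window for one target peptide
scheduled = {"target_1": (14.0, 16.0)}

shift = retention_time_shift(expected_rt, observed_rt)
adjusted = shift_windows(scheduled, shift)
```

A real-time implementation would update the offset as each standard elutes during the run rather than computing it once afterwards.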
Accelerating biomarker development with the Cancer Moonshot
Many of the greatest challenges associated with translational proteomics workflows relate to lab-to-lab reproducibility, method standardization, and scalability. These issues are well-recognized, and large-scale collaborative programs such as the Cancer Moonshot initiative are leading the way when it comes to overcoming them using the most advanced MS methods.
An international multi-site study, conducted as part of the Cancer Moonshot initiative, recently applied many of the next-generation MS proteomics workflows highlighted earlier to determine whether they could accelerate biomarker development. Using TMT multiplexing in discovery workflows for the large-scale analysis of human, yeast, and E. coli proteomes, each research group involved in the study analyzed a set of known proteins and compared their results to determine inter-lab and day-to-day measurement reproducibility. Label-free DDA-plus and HR-DIA workflows with precursor-level quantitation were subsequently used for verification and validation studies, delivering comprehensive and accurate analysis that ensured remarkable reproducibility over large patient cohorts.
These MS-based workflows were used to enhance measurement reproducibility across sites, ultimately providing greater confidence in the experimental data generated and accelerating the translation of biomarkers for detecting, monitoring, and treating cancer. Eleven international labs, including six Cancer Moonshot initiative labs, tested the protocols across biomarker identification, verification, and validation stages. In total, over 80 percent of the individual proteins analyzed were identified and quantified in common across different days at the same site, while 80 percent of protein groups were quantified in common across different days and across different labs.
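A figure such as "quantified in common" can be computed in more than one way, so the following sketch states its assumption explicitly: it reports the proteins found in every run as a fraction of the proteins found in any run. The protein identifiers are placeholders, and the study itself may have used a different denominator.

```python
def shared_fraction(runs):
    """Fraction of all observed proteins that were quantified in every run."""
    protein_sets = [set(run) for run in runs]
    if not protein_sets:
        raise ValueError("at least one run is required")
    observed_anywhere = set.union(*protein_sets)
    if not observed_anywhere:
        return 0.0
    observed_everywhere = set.intersection(*protein_sets)
    return len(observed_everywhere) / len(observed_anywhere)


# Hypothetical protein identifiers quantified at one site on two different days
day_1 = ["P1", "P2", "P3", "P4", "P5"]
day_2 = ["P1", "P2", "P3", "P4"]

day_to_day_overlap = shared_fraction([day_1, day_2])
```

The same function can be applied across sites by passing one list of quantified proteins per lab.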
The study confirmed the new MS methods could be incorporated into standardized and high-throughput proteomics workflows to offer exceptional measurement robustness and enhanced scalability from discovery to validation. By implementing the latest MS-based approaches across the biomarker pipeline, the study demonstrated how these robust methods could help to accelerate the development of potential biomarkers into the clinic.
Conclusion
To realize the full potential of protein biomarkers, the existing challenges around reproducibility and scalability encountered with traditional proteomics workflows must be overcome. The latest MS-based proteomics workflows are addressing these bottlenecks in the translational pipeline, helping to drive the development of protein biomarkers for cancer diagnosis, monitoring, and precision treatment.
REFERENCES
1. World Health Organization. Cancer Key Facts. https://www.who.int/news-room/fact-sheets/detail/cancer
2. Anderson L. Six decades searching for meaning in the proteome. J. Proteomics. 2014; 107: 24–30.
3. Huang A, Zhang M, Li T, Qin X. Serum Proteomic Analysis by Tandem Mass Tags (TMT) Based Quantitative Proteomics in Gastric Cancer Patients. Clin Lab. 2018; 64(5): 855–866.
4. Britton D, Zen Y, Quaglia A et al. Quantification of Pancreatic Cancer Proteome and Phosphorylome: Indicates Molecular Events Likely Contributing to Cancer and Activity of Drug Targets. PLOS ONE. 9(8). DOI: 10.1371/journal.pone.0090948.