Review Article
Uncorrected Proof. Available online 25 July 2024
Protein Biomarkers in Lung Cancer Screening: Technical Considerations and Feasibility Assessment
Daniel Orive (a,b,c), Mirari Echepare (a,b,c,d), Franco Bernasconi-Bisio (e,f), Miguel Fernández Sanmamed (c,g,h), Antonio Pineda-Lucena (d,e,f), Carlos de la Calle-Arroyo (i), Frank Detterbeck (j), Rayjean J. Hung (k,l), Mattias Johansson (m), Hilary A. Robbins (m), Luis M. Seijo (n,o), Luis M. Montuenga (a,b,c,d)*, Karmele Valencia (a,c,d,f)*
* Corresponding authors: lmontuenga@unav.es (Luis M. Montuenga), kvalencia@unav.es (Karmele Valencia).
a Solid Tumors Program, CIMA-University of Navarra, Pamplona, Spain
b Department of Pathology, Anatomy and Physiology, School of Medicine, University of Navarra, Pamplona, Spain
c Consorcio de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
d Navarra Health Research Institute (IDISNA), Pamplona, Spain
e Molecular Therapeutics Program, CIMA-University of Navarra, Pamplona, Spain
f Department of Biochemistry and Genetics, School of Sciences, University of Navarra, Pamplona, Spain
g Program of Immunology and Immunotherapy, CIMA-University of Navarra, Pamplona, Spain
h Department of Oncology, Clínica Universidad de Navarra, Pamplona, Spain
i Instituto de Ciencia de los Datos e Inteligencia Artificial (DATAI), Universidad de Navarra, Pamplona, Spain
j Division of Thoracic Surgery, Department of Surgery, Yale School of Medicine, New Haven, CT, USA
k Prosserman Centre for Population Health Research, Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Canada
l Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
m International Agency for Research on Cancer, Lyon, France
n CIBER of Respiratory Diseases (CIBERES), Institute of Health Carlos III, Madrid, Spain
o Pulmonary Department, Clínica Universidad de Navarra, Madrid, Spain
Abstract

Lung cancer remains the leading cause of cancer-related deaths worldwide, mainly due to late diagnosis and the presence of metastases. Several countries have adopted nationwide LDCT-based lung cancer screening, which will benefit patients by shifting diagnosis toward earlier stages with more therapeutic options. Biomarkers can help to optimize the screening process and to refine the TNM stratification of lung cancer patients, providing prognostic information and guiding management strategies. Moreover, novel adjuvant strategies will clearly benefit from prior knowledge of the potential aggressiveness and biological traits of a given early-stage surgically resected tumor. This review focuses on proteins as promising biomarkers in the context of lung cancer screening. Despite great efforts, no lung cancer biomarker has yet reached the clinic for use in early detection and early management. Thus, biomarkers for early lung cancer remain an evident unmet need.

A more specific objective of this review is to present an up-to-date technical assessment of the potential use of protein biomarkers in early lung cancer detection and management. We provide an overview regarding the benefits, challenges, pitfalls and constraints in the development process of protein-based biomarkers. Additionally, we examine how a number of emerging protein analytical technologies may contribute to the optimization of novel robust biomarkers for screening and effective management of lung cancer.

Keywords:
Lung cancer
Biomarkers
Screening
Protein
Early stage
High-plex technologies
Discovery
Prognosis
Abbreviations:
AI
AIC
AUC
AUROC
CyTOF
DNA
DSP
ELISA
FFPE
ICP-TOF-MS
IHC
INTEGRAL
LDCT
miRNA
mRNA
MS
NGS
NSCLC
PANOPTIC
PEA
qPCR
REMARK
RNA
SdAbs
SELEX
ssDNA
TRIPOD
VHH
Background

Why biomarkers in the context of early lung cancer?

Lung cancer is the leading cause of cancer death worldwide1 and a major public health challenge, with approximately 66% of patients diagnosed in advanced stages when curative options are scarce.2

To reduce mortality rates, prevention through smoking cessation programs and early detection via low-dose computed tomography (LDCT)-based screening are essential strategies. As suggested by several systematic reviews, models and meta-analyses, the shift toward earlier stage at diagnosis, with more therapeutic and curative options and the associated social and economic benefits, will become more evident once population-level lung cancer screening is widely implemented.3

One of the challenges of LDCT-based lung cancer screening is to define and refine robust individual inclusion criteria. More than 40% of currently diagnosed lung cancers do not meet the generally adopted inclusion criteria for LDCT screening, which are based on age and smoking exposure. The search for biomarkers to optimize the selection of high-risk individuals for lung cancer screening is an active research area. Biomarkers may also help to characterize the risk of malignancy of nodules found in the course of LDCT imaging and to recommend further investigation in patients with higher-risk nodules. In addition, they may help to avoid unnecessary tests or invasive interventions in individuals whose nodules are very unlikely to develop into neoplastic disease.4 In the context of prognosis, or when searching for the most effective management of early lung tumors, biomarkers may significantly facilitate decision-making regarding the best personalized preventive or therapeutic strategies and, thus, may aid in patient information and management.5

Protein-based biomarkers

General considerations for biomarker application have been reviewed previously.6 Although there is currently considerable hype regarding whole-genome epigenomics based on circulating cell-free DNA and DNA fragmentomics, the sensitivity required in the screening or early tumor scenario is very high and has yet to be reached technically.7 While recognizing that DNA-based liquid biopsies may provide an excellent tool in the future, we focus this review only on proteins and present the exciting new developments regarding protein-based biomarkers in lung cancer early detection. Proteins are stable molecules that are easy to measure with cost-effective technologies. They are also already extensively used as markers in other settings in clinical oncology, both in tissue and in liquid biopsy. From the mechanistic point of view, proteins are more relevant than circulating nucleic acids, as they are the end products of gene expression and the physiologically active molecules. Finally, very promising protein panels are currently being validated in the context of lung cancer screening.8–11 Moreover, protein-based longitudinal monitoring can be carried out in both early- and late-stage lung cancer patients, which is highly helpful for clinical decision making.12 In the specific case of lung cancer, proteins can be found not only in blood (serum or plasma) but also in bronchial lavage, fine needle aspirates, pleural fluid and even exhaled breath condensate.13,14

In late stages, a number of well-established proteins common to a variety of tumors, such as CYFRA21-1, CEA or CA125, are used in clinical laboratories to support diagnosis or to estimate tumor burden.15 In contrast, in early-stage cancer, despite great efforts, no proteins or any other type of marker have yet reached the clinic. Recently, several studies have proposed panels of targeted circulating proteins as a useful tool in the screening context. Interestingly, some of the late-stage tumor protein markers are also included in these early-stage candidate panels.9,16

The main objective of this review is to present an up-to-date technical assessment of the potential use of protein biomarkers in early lung cancer detection and management. We aim to provide an overview of the benefits, challenges, pitfalls and constraints associated with the development of protein-based biomarkers (summarized in Fig. 1). We also examine how several emerging protein analytical technologies may contribute in the near future to the discovery and validation of novel robust biomarkers for the early detection and effective management of lung cancer.

Fig. 1. Different phases in biomarker development, from bench to bedside: from discovery to approval of a biomarker, specifying the main aims, technical objectives and most common pitfalls associated with each phase/step. Image created with BioRender.com.
Discovery of protein biomarkers

The first, and highly relevant, step is to identify a short list of candidate proteins by analyzing clinical cohorts and performing an initial validation.

Experimental design in the discovery phase

Several guidelines on how to report new potential biomarkers, such as REMARK or TRIPOD, have emphasized the importance of carefully describing the experimental design carried out in the discovery phase. Nevertheless, many published studies do not report sufficient details on these crucial aspects. For example, for prognostic biomarkers, it is imperative to report on inclusion criteria, clinical traits of the cohort studied or whether patients received any treatment. All these aspects have a strong impact on the outcome (progression free survival or overall survival) of the patient.17,18

Although there is general agreement on the importance of an adequate sample size for biomarker discovery when dealing with high-dimensional data, many studies in the discovery phase tend to be underpowered. In the context of early lung cancer, study power is determined by the actual number of outcome events (e.g. positive screening findings, death or recurrence), also known as the effective sample size, rather than by the total number of participants.
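As a rough illustration of why high-dimensional discovery studies are so easily underpowered, the sketch below computes an effective sample size for a binary outcome and the resulting number of events per candidate variable. Taking the effective sample size as the smaller of the event and non-event counts is a common convention and an assumption of this example; the function names and the cohort figures are hypothetical and do not come from any study cited here.

```python
def effective_sample_size(n_events: int, n_non_events: int) -> int:
    """Effective sample size for a binary outcome (assumed convention:
    the rarer of the two outcome classes limits the available information)."""
    if n_events < 0 or n_non_events < 0:
        raise ValueError("counts must be non-negative")
    return min(n_events, n_non_events)


def events_per_variable(n_events: int, n_non_events: int, n_candidates: int) -> float:
    """Outcome events available per candidate predictor."""
    if n_candidates <= 0:
        raise ValueError("n_candidates must be positive")
    return effective_sample_size(n_events, n_non_events) / n_candidates


# Hypothetical screening cohort: 1000 participants, 30 of whom develop lung cancer.
cohort = {"n_events": 30, "n_non_events": 970}

for n_candidates in (5, 100, 1000):
    epv = events_per_variable(n_candidates=n_candidates, **cohort)
    print(f"{n_candidates:>5} candidate proteins -> {epv:.2f} events per variable")
```

Even though the hypothetical cohort has 1000 participants, only 30 events carry information about the outcome, so a panel of hundreds of candidate proteins leaves far less than one event per variable.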

Another frequently neglected aspect is the standardization of the preanalytical characteristics of the studied specimen. To ensure accurate and reproducible marker assays, it is critical to report how the original specimen was processed, including the preanalytical aspects relevant to blood samples, such as the number of freeze–thaw cycles and the time elapsed from phlebotomy to blood processing and to plasma or serum freezing.18

Protein biomarker discovery technologies

Novel technologies currently used for high-throughput protein biomarker discovery are distinguished by their technical complexity and their capacity to analyze many, up to thousands of, different proteins simultaneously. These techniques, used in the discovery phase, are costly. This stands in stark contrast to the cost-effective techniques that will be applied in subsequent phases to analyze a smaller number of selected proteins in the clinical setting. In discovery projects, the aim is to narrow a large, comprehensive initial list of candidate proteins down to the smallest possible subset of potential biomarkers. In recent years, the combination of proteomics and AI19 has made it possible to identify the most robust candidate biomarkers from large numbers of proteins.

Several high-throughput technologies have been more frequently used for this “initial filter” discovery phase, either in tissue samples (multiplexed immunolocalization, such as digital spatial profiling), liquid biopsy (proximity extension assay, aptamer-based technology) or both (mass spectrometry and mass cytometry) (Fig. 2). They will be briefly summarized in the following paragraphs.

  • i) Digital spatial profiling (DSP)

Fig. 2. Summary of the main characteristics of the protein-related discovery technologies discussed in this section. Image created with BioRender.com.

DSP is a nondestructive method suitable for formalin-fixed, paraffin-embedded (FFPE) tissue samples. It performs high-plex spatial profiling of preselected proteins by counting small photocleavable oligonucleotide “barcodes” attached to the primary antibody assigned to each target of interest, allowing proteins to be quantified in different tissue compartments, such as stroma and tumor.20

A recent DSP-based study identified CD44 as a promising novel biomarker. CD44 can indicate the sensitivity of non-small cell lung cancer (NSCLC) to treatments that block the programmed cell death protein 1 (PD-1) axis, which has significant implications for improving the efficacy of immunotherapy strategies for NSCLC patients.21

One major limitation of this technology is that DSP quantifies proteins based on a compartment but not on individual cells. As a consequence, it is not able to distinguish the cellular origin of a specific protein in each compartment. Therefore, mechanistic conclusions should not be drawn hastily when using this method, especially regarding the cell of origin, without more in-depth research and alternative validation techniques.

  • ii) Proximity extension assay (PEA)

The proximity extension assay (PEA) is a multiplex immunoassay for high-throughput detection of protein biomarkers in liquid biopsy. For each protein biomarker, a matched pair of antibodies linked to unique DNA-encoded tags binds to the respective protein target. Binding brings the two tags into proximity so that they hybridize; the hybridized oligonucleotide is then extended by a DNA polymerase, generating a DNA amplicon that can be detected and quantified by qPCR or NGS.22 Cross-reactivity due to nonspecific antibody binding is avoided, since only matched DNA reporter pairs can hybridize to produce an amplicon. This allows highly multiplexed assays that cover a broad dynamic range (∼9 logs), offer readout specificity and sensitivity comparable to or better than ELISA (down to fg/mL) and use small sample volumes (1–5 μL).23

The INTEGRAL study explored 1078 unique circulating protein markers in 1253 LDCT lung cancer screening participants using PEA technology and identified a panel of 36 potentially informative proteins for assessing the risk of malignancy of screen-detected pulmonary nodules, 10 of which are predictive of imminent (<1 year) diagnosis.24 Moreover, a recent study by Davies et al. analyzed 2941 proteins in 496 plasma samples from the Liverpool Lung Project, which searches for biomarkers for early lung cancer diagnosis using a case–control design. The researchers found 240 proteins differentially expressed between healthy individuals and future lung cancer cases 1–3 years before diagnosis. For long-term prediction (1–5 years before diagnosis), 267 proteins were identified, 117 of which overlapped with the 1–3 year analysis. Predictive models were developed using machine learning algorithms (Elastic Net, Random Forest, Support Vector Machine, XGBoost). The area under the curve (AUC) values ranged from 0.76 to 0.90 for the 1–3 year models and from 0.73 to 0.83 for the 1–5 year models, indicating moderate to strong discrimination. This research highlights the potential of plasma protein biomarkers for early lung cancer detection.25 Another project within INTEGRAL also identified 36 markers when measuring up to 1162 proteins among 731 lung cancer cases and 731 matched controls selected from 6 prospective epidemiological cohort studies.26 Together, the two INTEGRAL initiatives selected 21 proteins to include on a single panel intended to optimize the selection of individuals for lung cancer screening and the management of screen-detected pulmonary nodules.11,26

A limitation of PEA technology, as with most liquid biopsy-related techniques, is that despite its high sensitivity and specificity, it cannot be used to investigate the cellular (tumor or host) source of the protein. In addition, although PEA-based options for absolute quantification already exist, the PEA-based multiplexed discovery assays report relative protein quantification units. Exhaustive technical performance studies of this technology have recently been carried out.23

  • iii) Mass spectrometry (MS)

MS allows the identification and quantification of thousands of proteins in the same sample (processed tissue or fluid), although without spatial localization. Recently, high-resolution mass spectrometers have been developed that are able to separate more than 1000 peptides at the same time.27 However, the main limitations of MS include its high dependency on operator expertise to avoid loss of detected proteins; the need for sophisticated equipment and costly laboratory set-up and maintenance; and the long and complex analytical process.28 Despite these limitations, MS techniques have advanced so much in recent years that single-cell proteome characterization is already possible.29

A lung cancer MS-based protein biomarker panel based on the detection of LG3BP and C163A plasma protein levels together with clinical and imaging factors was proposed by Silvestri et al. in the PulmonAry NOdule Plasma proTeomIc Classifier (PANOPTIC) trial, which consisted of 685 patients with 8–30-mm lung nodules. The trial results reported an increase in accuracy for distinguishing benign from malignant lung nodules. Using a 1-year follow-up to determine benignity, the sensitivity was 97%, with a specificity of 44% and a negative predictive value of 98%.30 The 2-year follow-up results of the PANOPTIC trial were published later with similar results.31

  • iv) Mass cytometry (CyTOF)

CyTOF is a targeted proteomic analytical technology that allows the antibody-based detection and quantification of up to 130 protein markers with single-cell resolution in heterogeneous biological samples.32,33 An adaptation of CyTOF to tissue sections, called imaging mass cytometry (IMC), can analyze up to 40 proteins simultaneously.34 In both versions of CyTOF, cells are stained with antibodies following protocols similar to those of flow cytometry (FC) or multiplex immunolocalization, except that nonradioactive metal isotopes are employed as reporter groups rather than fluorophores. As a readout, mass cytometry uses the atomic mass of these metal tags, detected in an inductively coupled plasma time-of-flight mass spectrometry (ICP–TOF–MS) instrument, which circumvents the growing challenge of fluorescence spectral overlap. CyTOF has advanced cellular subset analysis into a new high-dimensional era for defining, characterizing, and quantifying immune cell subsets in different solid tumor types, including lung cancer.35,36 However, CyTOF presents several technical challenges, including the need for a digestion process that may over- or underrepresent some cell subtypes or protein levels depending on the selected digestion protocol. Some groups have worked to optimize this process.33 Moreover, the costs of CyTOF and IMC are high, as metal-tagged antibodies and antibody conjugation kits are expensive, and the data acquisition rate is an order of magnitude slower than that of flow cytometry. In addition, because heavy metals are common in laboratory reagents, avoiding contamination during sample preparation is critical.37

  • v) Aptamer-based proteomics

Aptamers are oligonucleotides, such as ribonucleic acid (RNA) and single-stranded deoxyribonucleic acid (ssDNA), or peptide molecules designed to bind to predetermined targets with high affinity and specificity due to their specific three-dimensional structure.38 Aptamers with specific binding properties are selected via the so-called SELEX (Systematic Evolution of Ligands by Exponential enrichment) process, which is designed to optimize high affinity, slow off-rate, and high specificity for target proteins or other molecules. For biomarker discovery, aptamers are used in highly multiplexed assays based on a large pool of different aptamers capable of simultaneously measuring thousands of human proteins at concentrations ranging broadly from femto- to micromolar.39

Despite the promising opportunities that aptamers offer in biomarker discovery, more work is needed to fully assess the limitations of this technology, including background noise, limits of detection, specificity, potential cross-reactivity, and reproducibility across orthogonal platforms.

Building a biomarker model

The challenge of large datasets in proteomics discovery

The main objective in the discovery phase is to develop algorithms that reach the desired accuracy level of prediction with the smallest number of predictor variables. In this way, the test will be more cost-effective and will allow faster results, which benefits both clinicians and patients. In addition, performing fewer assays reduces calibration errors, giving rise to an increase in consistency and robustness of the developed tools. A common pitfall of high-dimensional data analysis is model overfitting, which can be due to the combination of inadequate sample size, large number of variables and lack of statistical consideration for multiple testing.40 Even in well-designed discovery studies, the performance of biomarker-based models is always expected to decrease when moved to validation in independent cohorts. Further detailed discussion of statistical analysis in the context of biomarker model building is outside the scope of this review but is available elsewhere.41–43
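One standard safeguard against the multiple-testing problem mentioned above is to control the false discovery rate when screening many candidate proteins at once. The following is a minimal sketch of the Benjamini-Hochberg step-up procedure in plain Python. It is offered only as an illustration, not as the method used in any of the studies cited here, and the protein names and p-values are invented.

```python
def benjamini_hochberg(p_values: dict[str, float], fdr: float = 0.05) -> list[str]:
    """Return the names declared significant by the Benjamini-Hochberg
    step-up procedure at the requested false discovery rate."""
    m = len(p_values)
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= rank / m * fdr:
            cutoff_rank = rank
    # Everything ranked at or below the cutoff is declared significant.
    return [name for name, _ in ranked[:cutoff_rank]]


# Invented p-values for eight hypothetical candidate proteins.
toy_p_values = {
    "protein_A": 0.001,
    "protein_B": 0.008,
    "protein_C": 0.039,
    "protein_D": 0.041,
    "protein_E": 0.042,
    "protein_F": 0.060,
    "protein_G": 0.074,
    "protein_H": 0.205,
}

naive_hits = [name for name, p in toy_p_values.items() if p < 0.05]
bh_hits = benjamini_hochberg(toy_p_values, fdr=0.05)

print("Uncorrected p < 0.05:", naive_hits)
print("Benjamini-Hochberg  :", bh_hits)
```

In this toy example, five proteins pass an uncorrected threshold of 0.05 but only two survive the correction, which is the kind of shrinkage to expect when thousands of proteins are tested in a discovery cohort.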

Metrics for the quality of a prediction model

The evaluation of prediction models includes measuring discrimination and calibration metrics.

Discrimination is a crucial factor in determining the effectiveness of a predictive model, as it measures the model's ability to differentiate between patients who are at risk of a particular outcome and those who are not. The most commonly used metric to evaluate discrimination is the area under the receiver operating characteristic curve (AUROC or AUC) or the C-statistic when using time-to-event analysis.44,45
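To make the meaning of the AUC concrete, the sketch below computes it directly from its probabilistic definition: the chance that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The function name and the scores are hypothetical, and the pairwise loop is written for clarity rather than speed.

```python
def auroc(case_scores: list[float], control_scores: list[float]) -> float:
    """Area under the ROC curve as the proportion of case/control pairs
    in which the case has the higher score (ties count as 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Hypothetical model scores for four cases and five controls.
cases = [0.9, 0.8, 0.55, 0.4]
controls = [0.7, 0.5, 0.3, 0.2, 0.1]

print(f"AUC = {auroc(cases, controls):.2f}")  # 17 of 20 pairs are concordant -> 0.85
```

A value of 0.5 corresponds to a model that ranks cases and controls no better than chance, and 1.0 to perfect separation.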

Calibration assesses the concordance between estimated and observed event probabilities. While discrimination is often prioritized, inadequate calibration is an ‘Achilles heel’ in predictive analytics and may yield misleading predictions: overestimating risk may lead to overtreatment, while underestimating it may lead to undertreatment or no treatment at all. The TRIPOD guidelines recommend reporting on calibration performance, as the success of clinical implementation requires both good discrimination and calibration.46
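Two simple calibration checks can be sketched as follows: the ratio of observed to expected events (a value below 1 suggests that the model overestimates risk overall) and a grouped comparison of mean predicted risk against the observed event rate. The function names, the equal-size grouping and the data are assumptions made for this illustration only.

```python
def observed_expected_ratio(predicted: list[float], outcomes: list[int]) -> float:
    """Observed number of events divided by the number the model expects."""
    expected = sum(predicted)
    if expected == 0:
        raise ValueError("the model predicts zero events")
    return sum(outcomes) / expected


def calibration_table(predicted: list[float], outcomes: list[int],
                      n_groups: int = 2) -> list[tuple[float, float]]:
    """Sort by predicted risk, split into equal-size groups and return
    (mean predicted risk, observed event rate) for each group."""
    if len(predicted) != len(outcomes):
        raise ValueError("predicted and outcomes must have the same length")
    pairs = sorted(zip(predicted, outcomes))
    if not 1 <= n_groups <= len(pairs):
        raise ValueError("n_groups must be between 1 and the number of samples")
    size = len(pairs) // n_groups
    table = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder.
        end = (g + 1) * size if g < n_groups - 1 else len(pairs)
        group = pairs[start:end]
        mean_predicted = sum(p for p, _ in group) / len(group)
        observed_rate = sum(y for _, y in group) / len(group)
        table.append((mean_predicted, observed_rate))
    return table


# Invented predicted risks and outcomes (1 = event) for ten individuals.
predicted = [0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.6, 0.6, 0.8, 0.8]
outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 0, 1]

print(f"O/E ratio = {observed_expected_ratio(predicted, outcomes):.2f}")
for mean_predicted, observed_rate in calibration_table(predicted, outcomes):
    print(f"mean predicted {mean_predicted:.2f} vs observed {observed_rate:.2f}")
```

In this toy data set the model expects four events but only three occur, and the lower-risk group is predicted an 18% risk while experiencing no events, so the model overestimates risk despite ranking individuals reasonably well.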

Prediction model performance, including both discriminative ability and calibration, should always be assessed in an external independent validation set (see below) or, at minimum, in a held-out testing set from the original data source.

Final model assessment

Inclusion of clinical and pathological variables with diagnostic, prognostic or predictive relevance can be important to increase the performance of the combined model. In recent years, several papers have proposed the use of artificial intelligence (AI) to select the most powerful combination of markers in different clinical situations, including in lung cancer.47 A final version of the model must be locked prior to clinical validation and clinical utility studies. Depending on the clinical question addressed, the robustness of the final model can be compared with well-established state-of-the-art tools, such as the TNM staging system for prognosis, the Mayo or PanCan (Brock) models for classification of indeterminate pulmonary nodules, or the PLCO2012 model for lung cancer risk assessment. For these comparisons, the new biomarker-based models can be assessed alone or in combination with the state-of-the-art models using different metrics, such as sensitivity, specificity, the Akaike Information Criterion (AIC), C-index and/or AUC.
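As an illustration of how the AIC trades goodness of fit against model size, the sketch below computes AIC = 2k - 2 ln(L) for two hypothetical binary-outcome models from their predicted probabilities. It assumes that the probabilities come from maximum-likelihood fits with k estimated parameters; the data, the parameter counts and the model labels are invented and deliberately tiny.

```python
import math


def log_likelihood(predicted: list[float], outcomes: list[int]) -> float:
    """Bernoulli log-likelihood of the observed outcomes under the predicted risks."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for p, y in zip(predicted, outcomes)
    )


def aic(predicted: list[float], outcomes: list[int], n_parameters: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_parameters - 2 * log_likelihood(predicted, outcomes)


# Toy outcomes (1 = event) and predicted risks from two hypothetical models.
outcomes = [1, 1, 0, 0]
clinical_only = [0.6, 0.6, 0.4, 0.4]            # assumed to use 2 parameters
clinical_plus_proteins = [0.8, 0.8, 0.2, 0.2]   # assumed to use 4 parameters

aic_clinical = aic(clinical_only, outcomes, n_parameters=2)
aic_combined = aic(clinical_plus_proteins, outcomes, n_parameters=4)

print(f"clinical only          : AIC = {aic_clinical:.2f}")
print(f"clinical plus proteins : AIC = {aic_combined:.2f}")
```

In this toy case the larger model fits the data better but ends up with the higher AIC, because the improvement in likelihood does not offset the penalty for the two extra parameters.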

Designing a protein-based biomarker test

Development and standardization of a test to be used for quantifying biomarkers in the clinical setting

Once an algorithm/model including the top candidates and clinical factors has been robustly established, the optimal situation is to further develop analytical tests based on standard techniques that are already commonly used in the clinical routine. As an example, we identified and validated two immunohistochemistry-based protein prognostic signatures that can stratify patients with early-stage lung squamous cell carcinoma (SCC) and adenocarcinoma (AC) based on their risk of recurrence or mortality.48,49

Biomarker analysis tests

For tissue samples, immunohistochemistry (IHC) is critical for clinical diagnosis. The establishment of an IHC-based test requires robustness, clear cutoff values, unambiguous reading criteria and a highly standardized protocol.50

For liquid biopsy samples, the optical sandwich enzyme-linked immunosorbent assay (ELISA) is the most common test in clinical practice. While optical ELISA provides highly reproducible, sensitive, and specific quantitative data, it has some limitations. The procedure is time-consuming, sometimes requires expensive antibodies, and may require a relatively high sample volume.

Both solid and liquid-based immunoassays can currently be found in multiplexed versions. Although several commercial multiplexed immunofluorescent methods have been developed for tissue samples, most clinical pathology laboratories are currently using chromogenic IHC for a maximum of 2 markers per FFPE slide. Among the different multiplexed liquid-based immunoassays, many commercially available kits are based on differentially labeled capture beads for each target in a multiplex ELISA-like assay.

One limitation of IHC and ELISA in the field of biomarker development is that they depend entirely on a good primary antibody, the development of which may be costly and time-consuming. In this context, although still in its infancy, the flexible manufacturing and increasing use of single domain antibodies (SdAbs) may push toward a future revolution in the field. SdAbs, also known as VHHs or Nanobodies® (Ablynx), are small (15 kDa) single-domain antibodies derived from the variable domains of the heavy chain-only antibodies present in camelids. They have several advantages over conventional antibodies, such as lower manufacturing costs, improved stability, sensitivity and specificity, and advantages for in vivo imaging due to their tissue penetrability and fast renal clearance, making them attractive tools for cancer biomarker diagnosis.51,52

Real-life test: clinical validation, clinical utility validation and some regulatory considerations

Clinical validation

External clinical validation assesses the performance of a prediction model using independent datasets distinct from those employed in its development and preferably from a different institution. This procedure is crucial for rigorously evaluating the validity and generalizability of a predictive model to diverse populations.53

Multi-institutional and multinational collaborations are being built (EDRN, INTEGRAL-ILCCO, SOLACE, I-ELCAP, etc.), which may be the ideal setting for external cross-validation of biomarkers in the context of lung cancer screening.

Finally, a biomarker-based model always requires confirmation of cost-effectiveness before clinical utility is assessed.54,55 As one example, an economic evaluation was recently performed to assess the cost-effectiveness of incorporating biomarker information to improve risk assessment for lung cancer screening. This study concluded that optimizing the selection of ever-smokers for lung cancer screening was cost-effective depending on the scenario. The findings of this preliminary analysis do not provide conclusive evidence for setting a specific price in the context of screening. Nevertheless, they do substantiate the need for further investigation of the suitability, including cost, of biomarkers for optimizing lung cancer screening.56

Clinical utility validation

After developing and validating a signature, the critical step to obtain final clearance from the regulatory authorities and to translate it to the real-life routine clinical setting is to perform clinical utility validation studies. Clinical utility refers to the ability of a test to generate results that guide treatment decisions which either increase the patient's lifespan or improve their quality of life by reducing the negative side effects of treatment. This stage of validation is different from previous phases because it usually requires a prospective clinical trial, randomized or specially designed, rather than retrospective analysis of stored specimens.

Mazzone et al. thoroughly explored the possible designs of prospective clinical trials to test the utility of the use of biomarkers in the context of LDCT-based lung cancer screening.57 At least four different potential designs were proposed. There are also new calls for potential non-randomized trial designs, or novel statistical tools, which may allow for shorter and more efficient ways to test a biomarker in the context of cancer screening.58,59

Ideally, in the clinical utility validation stage, the biomarker-based model will be tested in multiple cohorts that have a similar prevalence of the disease. This is important because biomarkers may perform differently across cohorts with widely differing disease prevalence, and their performance can also be affected by variables such as age or disease stage.60,61
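The dependence of a test's practical performance on disease prevalence can be illustrated with Bayes' rule. The sketch below holds sensitivity and specificity fixed at 97% and 44% (the values quoted above for the PANOPTIC classifier, used here purely as an example) and computes the positive and negative predictive values at several prevalence levels. The function name and the prevalence values are hypothetical and do not describe any real cohort.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must lie between 0 and 1")
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive values are undefined for these inputs")
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Fixed test characteristics, varying (hypothetical) prevalence.
for prevalence in (0.05, 0.20, 0.50):
    ppv, npv = predictive_values(0.97, 0.44, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With identical sensitivity and specificity, the negative predictive value falls and the positive predictive value rises as prevalence increases, which is why results obtained in one cohort do not transfer automatically to a cohort with a different disease prevalence.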

A clear example of the importance of analyzing clinical utility was the recent evaluation by the UK Health Technologies Authority of a commercially available lung cancer early detection blood test, based on autoantibodies, for risk classification of solid pulmonary nodules. Despite its potential, the evaluation determined that there was insufficient evidence to ensure its diagnostic accuracy or clinical or economic value.62

As mentioned, there are some protein-based biomarkers for which clinical utility has been validated in late-stage lung cancer diagnostics (CEA, CYFRA, etc.). There are also protein biomarkers for predictive use in specific treatments, such as PD-L1 expression in the context of immunotherapy.63 Nevertheless, there are still no biomarkers for which clinical utility has been fully validated for the different unmet clinical needs in early lung cancer, whether in screening or in the prognostic or early management context. The most likely explanation for this lack of approved and clinically useful biomarkers for early lung cancer is the “funnel” structure of the discovery and development process for these tools, with challenges at each step, schematically described in Fig. 3.

Fig. 3. Representation of the estimated proportion of candidate protein biomarkers validated in each phase, from the discovery step to clinical approval. Image created with BioRender.com.
Some regulatory considerations

To bring protein biomarkers to clinical practice in the context of lung cancer, multi-faceted efforts are required to optimize communication, ensure feasibility and cost-effectiveness, and promote generalizability and equity, among other goals, while balancing potential harms and benefits.64 The aim will always be to comply with the stringent requirements regarding robustness, standardization, reproducibility, etc. established by the consensus guidelines and the regulatory authorities.

From the regulatory point of view, for a diagnostic or prognostic test to gain real-life approval, the laboratory itself must be certified (for example under the CLIA program in the US) as able to “perform high-complexity testing.” In almost every health system worldwide, the test requires official accreditation as an in vitro diagnostic (IVD) test, or the equivalent under the applicable regional legislation.

For example, in the EU a new test needs to meet the requirements of the recent In Vitro Diagnostic Regulation (IVDR). The main aim of these regulations is to ensure compliance with a number of guidelines on minimum performance quality. In the European regulation, cancer marker tests are included as “class C” devices, meaning that they pose a potentially high risk for the patient (if the test is not good enough) but a low public health risk. Class C tests require an official body to review and approve the product's CE certification before it can be commercialized. Similar regulatory rules apply under the FDA and other agencies around the world.

It is evident that Health Authorities need to guarantee these minimal quality requirements, which in most cases include the demonstration that the performance of the test has been validated at different levels.

Conclusions

The most likely near future

In this final section, we discuss how we envision the evolution of the implementation of protein biomarkers in the context of early lung cancer diagnosis and management. In our view, three traits will most likely characterize the development of novel protein-based tools in the near future: they will be multiplexed, integrative/multiplatform and AI-guided.

  • i) Multiplexed. It has been shown that multi-biomarker panels generally show better specificity in cancer detection than single markers. Thanks to the abovementioned current multiplexed technologies, it is already possible to detect a greater number of circulating proteins simultaneously.

  • ii) Integrative/multiplatform. Biomarkers are thought to be a complementary tool to other very informative variables. The multiplatform combination of clinical, molecular or radiomic data will improve the accuracy of clinical diagnosis and, ultimately, the outcome and life of patients.

  • iii) AI-guided. Bioinformatics and data science are at the forefront of modern biological research, offering a multidisciplinary approach that also enhances our understanding and potential applications of protein biomarkers. Proteomics, bioinformatics and data science intersect through advanced computational tools and methodologies such as artificial intelligence (AI) and machine learning (ML) to store, organize, analyze and interpret vast amounts of biological data that traditional techniques are unable to manage.
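As a rough illustration of the multiplexing point in item i), the sketch below shows how combining two markers changes sensitivity and specificity under two simple decision rules. It assumes the two markers are conditionally independent given disease status, which real markers often are not, and the input values are hypothetical and not taken from any study discussed in this review.

```python
def combine_two_markers(se1, sp1, se2, sp2):
    """Sensitivity and specificity of two-marker rules.

    Assumes the markers are conditionally independent given disease
    status (a simplifying assumption, rarely exact in practice).
    """
    # "AND" rule: call the panel positive only if both markers are positive.
    and_rule = (se1 * se2, 1.0 - (1.0 - sp1) * (1.0 - sp2))
    # "OR" rule: call the panel positive if either marker is positive.
    or_rule = (1.0 - (1.0 - se1) * (1.0 - se2), sp1 * sp2)
    return {"and": and_rule, "or": or_rule}


# Hypothetical single-marker performance, for illustration only.
rules = combine_two_markers(se1=0.70, sp1=0.90, se2=0.60, sp2=0.85)
for name, (se, sp) in rules.items():
    print(f"{name.upper()} rule: sensitivity = {se:.3f}, specificity = {sp:.3f}")
```

Under these assumptions the "AND" rule raises specificity above that of either marker at the cost of sensitivity, while the "OR" rule does the opposite, so the choice of combination rule matters as much as the choice of markers.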

Data science, with its robust machine learning algorithms, uncovers hidden patterns and relationships within biological datasets, crucial for identifying novel biomarkers to diagnose or monitor medical conditions. These innovative data analytical technologies are leading to groundbreaking discoveries in disease diagnosis, prognosis and treatment. The growing importance of these synergic disciplines also highlights the current need for expertise in both biology and computational science for the study of biomarkers in any area of biomedicine.

AI is already being used or tested in diagnostic, staging and prognostic cancer assessment, especially in the imaging and data management fields. Nevertheless, many novel aspects of AI will need to be understood before it is used in most clinical settings, including biomarkers. The intricacies and interpretability of the internal processes of AI, such as the relative statistical strength of each variable, the cutoffs, or the way the final outcome is obtained, are potential “black boxes” that the biomarker developer and the clinician will need to discuss extensively with the AI expert in their team. It is very likely that the more complex the biomarker model is, and the more AI is used to define it, the greater the chance of being misled by aspects that only work in a specific setting, and the harder it will be to ensure truly generalized applicability. Anticipating how to address these issues is also key to the successful development of a biomarker-AI-based algorithm.
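One common way to probe the generalizability concern raised above is to compare a model's discrimination in the cohort used to develop it with its discrimination in an independent cohort. The following is a minimal sketch of that comparison using the area under the ROC curve (AUC), computed as the probability that a randomly chosen case scores higher than a randomly chosen control. All scores and cohort names are hypothetical and serve only to illustrate the calculation.

```python
def auc(case_scores, control_scores):
    """AUC as P(case score > control score); ties count as 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical model scores, for illustration only.
dev_cases = [0.9, 0.8, 0.7, 0.6]      # development cohort, cancers
dev_controls = [0.5, 0.4, 0.3, 0.2]   # development cohort, non-cancers
ext_cases = [0.9, 0.6, 0.4, 0.3]      # external cohort, cancers
ext_controls = [0.7, 0.5, 0.3, 0.2]   # external cohort, non-cancers

dev_auc = auc(dev_cases, dev_controls)
ext_auc = auc(ext_cases, ext_controls)
print(f"Development AUC = {dev_auc:.3f}, external AUC = {ext_auc:.3f}")
```

A marked drop from the development to the external estimate, as in this made-up example, is the kind of signal that would suggest the model has captured features specific to one setting.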

It is important to acknowledge that in this review we were unable to cover every detail of the discovery and development of protein biomarkers, due to the inherent constraints of a narrative review. Despite our efforts, there are additional important aspects regarding biomarker studies that we could not fully address, such as ethnic diversity and sex differences. Moreover, throughout this review, we have described several biomarker studies. However, we were unable to delve deeply into some of their limitations, such as sample size constraints or potential methodological biases.

The issue of biomarker cost-effectiveness also needs to be resolved, as we have mentioned above. A detailed analysis of the cost-effectiveness of EarlyCDT-Lung in the context of lung cancer screening concluded that, considering all the published information, there are still not enough cost-effectiveness data available.62

Another concern regarding the use of protein biomarkers in lung cancer screening involves the impact of false positive and false negative findings. Since a negative finding might lead an individual to eschew evidence-based screening with LDCT, for example, the legal implications of not screening may be significant if a cancer is missed. False positive findings also have important implications for insurance purposes, and may lead to unnecessary anxiety while waiting for years of sequential imaging to rule out cancer.
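The weight of false positive and false negative findings depends strongly on how common cancer is in the tested population. The sketch below applies Bayes' rule to obtain the positive and negative predictive values (PPV and NPV) of a binary test. The sensitivity, specificity and prevalence used are hypothetical values chosen for illustration, not estimates for any specific biomarker.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) of a binary test via Bayes' rule."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    # Expected fraction of the tested population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical inputs, chosen only for illustration.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.90, prevalence=0.01)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these assumed inputs, fewer than one in ten positive results would correspond to a true cancer, even though the NPV is close to 1. This is why a test that looks accurate in a case-control setting can still generate many false positives when applied to a low-prevalence screening population.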

However, protein biomarkers can potentially improve patient outcomes by characterizing patient risk or pulmonary nodules in order to assuage anxiety regarding imaging findings, guide follow up or treatment, and potentially inform personal risk of lung cancer in the context of screening.

As an example, we show here a real-life adenocarcinoma case (pT1bN0M0) from Clínica Universidad de Navarra. This patient presented with a pulmonary nodule in the left lower lobe, located in a challenging paraaortic region and accompanied by perilesional atelectasis (Fig. 4). The EarlyCDT-Lung test was performed to assess nodule malignancy. Of the seven lung cancer-associated autoantibodies tested, p53 and NY-ESO-1 showed elevated levels, indicating a positive result for nodule malignancy (“Moderate Level [M]” according to the test scale).

Fig. 4.

EarlyCDT-positive lung adenocarcinoma case from Clínica Universidad de Navarra.


In conclusion, we have reviewed why protein biomarkers are very promising molecular tools in the different fields around early lung cancer: screening, diagnosis, prognosis and early management. In the world of biomarkers, whether proteins or other molecular analytes, it seems that the long and winding road that we envisioned two decades ago has now turned into a potentially shorter and straighter path. We now clearly see the light at the end of the tunnel and the real feasibility of bringing the first clinically useful biomarkers for early lung cancer to the clinic within a short and reasonable timeframe.

Ethics approval and consent to participate

Not applicable.

Funding

H.A. Robbins and M. Johansson were supported by the US National Cancer Institute (INTEGRAL project, U19CA203654, and LEAP project, R01CA262164). This work was supported by FIMA, CIBERONC (CB16/12/00443), Spanish Ministry of Science and Innovation and Fondo de Investigación Sanitaria Fondo Europeo de Desarrollo Regional (PI22/00451; to L.M. Montuenga) and a Lung Ambition Alliance grant (to L.M. Montuenga).

K. Valencia was supported by an Investigator grant from Asociación Española Contra el Cáncer (AECC). M. Echepare was supported by a Spanish Ministry of Health (ISCIII, Fondo de Investigación Sanitaria) PFIS predoctoral grant and D. Orive by a Spanish Ministry of Universities Ayuda para la Formación de Profesorado Universitario (FPU, FPU20/06292) predoctoral grant.

Authors’ contributions

K.V. and L.M.M. conceived, supervised, coordinated, reviewed and contributed to the writing of the manuscript. D.O., M.E., F.B.-B., M.F.S., A.P.-L., C.C.-A., F.D., R.J.H., M.J. and H.A.R. contributed to the writing and reviewing of the manuscript. K.V. and D.O. prepared figures 1–3. All authors reviewed the final version of the manuscript.

Consent for publication

Not applicable.

Conflict of interests

KV, DO, ME, FB, MFS, AP, CC, FD, RJ: No potential conflicts to declare. LMS reports consulting for Median Technologies, Serum Detect, Sabartech, Roche and AstraZeneca, has received professional fees for speaking engagements from Roche, AstraZeneca and Menarini, and collaborated in scientific meetings or received institutional research grants from Serum Detect, Lung Ambition Alliance, Menarini and Esteve Pharmaceuticals. LMM: Astra-Zeneca: Speaker's Bureau and Research Grant. Pharmamar: Research Grant. Siemens-Healthineers: Research grant. Lung Ambition Alliance: Research Grant. SerumDetect: Research grant. HAR, MJ: none. Disclaimer: Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organization, the authors alone are responsible for the views expressed in this article, and they do not necessarily represent the decisions, policy or views of the International Agency for Research on Cancer/World Health Organization.

Availability of data and materials

Not applicable.

Acknowledgements

Not applicable.

References
[1]
H. Sung, J. Ferlay, R.L. Siegel, M. Laversanne, I. Soerjomataram, A. Jemal, et al.
Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries.
CA Cancer J Clin, 71 (2021), pp. 209-249
[2]
R.L. Siegel, K.D. Miller, H.E. Fuchs, A. Jemal.
Cancer statistics, 2022.
CA Cancer J Clin, 72 (2022), pp. 7-33
[3]
T. Hunger, E. Wanka-Pail, G. Brix, J. Griebel.
Lung cancer screening with low-dose CT in smokers: a systematic review and meta-analysis.
Diagnostics (Basel), 11 (2021),
[4]
D.R. Baldwin, M.E. Callister, P.A. Crosbie, E.L. O’Dowd, R.C. Rintoul, H.A. Robbins, et al.
Biomarkers in lung cancer screening: the importance of study design.
Eur Respir J, 57 (2021),
[5]
D.J. Kerr, L. Yang.
Personalising cancer medicine with prognostic markers.
EBioMedicine, 72 (2021),
[6]
L.M. Seijo, N. Peled, D. Ajona, M. Boeri, J.K. Field, G. Sozzi, et al.
Biomarkers in lung cancer screening: achievements, promises, and challenges.
J Thorac Oncol, 14 (2019), pp. 343-357
[7]
A.R. Thierry.
Circulating DNA fragmentomics and cancer screening. Vol. 3, Cell genomics.
Cell Press, (2023),
[8]
E. Irajizad, J.F. Fahrmann, T. Marsh, J. Vykoukal, J.B. Dennison, J.P. Long, et al.
J Clin Oncol, 41 (2023), pp. 4360-4368
[9]
J.F. Fahrmann, T. Marsh, E. Irajizad, N. Patel, E. Murage, J. Vykoukal, et al.
Blood-based biomarker panel for personalized lung cancer risk assessment.
J Clin Oncol, 40 (2022), pp. 876
[10]
E.J. Ostrin, L.E. Bantis, D.O. Wilson, N. Patel, R. Wang, D. Kundnani, et al.
Contribution of a blood-based protein biomarker panel to the classification of indeterminate pulmonary nodules.
J Thorac Oncol, 16 (2021), pp. 228-236
[11]
H.A. Robbins, K. Alcala, E.K. Moez, F. Guida, S. Thomas, H. Zahed, et al.
Design and methodological considerations for biomarker discovery and validation in the Integrative Analysis of Lung Cancer Etiology and Risk (INTEGRAL) Program.
Ann Epidemiol, 77 (2023), pp. 1-12
[12]
C. Alix-Panabières, K. Pantel.
Liquid biopsy: from discovery to clinical application.
Cancer Discov, 11 (2021), pp. 858-873
[13]
A. Campanella, S. De Summa, S. Tommasi.
Exhaled breath condensate biomarkers for lung cancer.
J Breath Res, 13 (2019), pp. 044002
[14]
N. Zakharova, A. Kozyr, A.M. Ryabokon, M. Indeykina, P. Strelnikova, A. Bugrova, et al.
Mass spectrometry based proteome profiling of the exhaled breath condensate for lung cancer biomarkers search.
Expert Rev Proteomics, 18 (2021), pp. 637-642
[15]
A. Yoshimura, J. Uchino, K. Hasegawa, T. Tsuji, S. Shiotsu, T. Yuba, et al.
Carcinoembryonic antigen and CYFRA 21-1 responses as prognostic factors in advanced non-small cell lung cancer.
Transl Lung Cancer Res, 8 (2019), pp. 227-234
[16]
X. Feng, W.Y.Y. Wu, J.U. Onwuka, Z. Haider, K. Alcala, K. Smith-Byrne, et al.
Lung cancer risk discrimination of prediagnostic proteomics measurements compared with existing prediction tools.
J Natl Cancer Inst, (2023),
[17]
K.G.M. Moons, D.G. Altman, J.B. Reitsma, J.P.A. Ioannidis, P. Macaskill, E.W. Steyerberg, et al.
Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration.
Ann Intern Med, 162 (2015), pp. W1-W73
[18]
W. Sauerbrei, S.E. Taube, L.M. McShane, M.M. Cavenagh, D.G. Altman.
Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK): an abridged explanation and elaboration.
J Natl Cancer Inst, 110 (2018), pp. 803-811
[19]
X. Zhang, I. Jonassen, A. Goksøyr.
Machine learning approaches for biomarker discovery using gene expression data.
Bioinformatics, 20 (2021), pp. 53-64
[20]
C.R. Merritt, G.T. Ong, S.E. Church, K. Barker, P. Danaher, G. Geiss, et al.
Multiplex digital spatial profiling of proteins and RNA in fixed tissue.
Nat Biotechnol, 38 (2020), pp. 586-599
[21]
M.K. Moutafi, M. Molero, S. Martinez Morilla, J. Baena, I.A. Vathiotis, N. Gavrielatou, et al.
Spatially resolved proteomic profiling identifies tumor cell CD44 as a biomarker associated with sensitivity to PD-1 axis blockade in advanced non-small-cell lung cancer.
J Immunother Cancer, 10 (2022),
[22]
L. Wik, N. Nordberg, J. Broberg, J. Björkesten, E. Assarsson, S. Henriksson, et al.
Proximity extension assay in combination with next-generation sequencing for high-throughput proteome-wide analysis.
Mol Cell Proteomics, 26 (2021), pp. 20
[23]
B.C. Carlyle, R.R. Kitchen, Z. Mattingly, A.M. Celia, B.A. Trombetta, S. Das, et al.
Technical performance evaluation of Olink proximity extension assay for blood-based biomarker discovery in longitudinal studies of Alzheimer's disease.
Front Neurol, 13 (2022), pp. 889647
[24]
E. Khodayari Moez, M.T. Warkentin, Y. Brhane, S. Lam, J.K. Field, G. Liu, et al.
Circulating proteome for pulmonary nodule malignancy.
J Natl Cancer Inst, 115 (2023), pp. 1060-1070
[25]
M.P.A. Davies, T. Sato, H. Ashoor, L. Hou, T. Liloglou, R. Yang, et al.
Plasma protein biomarkers for early prediction of lung cancer.
EBioMedicine, 93 (2023),
[26]
Lung Cancer Cohort Consortium (LC3).
The blood proteome of imminent lung cancer diagnosis.
Nat Commun, 14 (2023), pp. 3042
[27]
C.B. Messner, V. Demichev, Z. Wang, J. Hartl, G. Kustatscher, M. Mülleder, et al.
Mass spectrometry-based high-throughput proteomics and its role in biomedical studies and systems biology. Proteomics.
John Wiley and Sons Inc., (2022),
[28]
S.C. Wilschefski, M.R. Baxter.
Inductively coupled plasma mass spectrometry: introduction to analytical aspects.
Clin Biochem Rev, 40 (2019), pp. 115
[29]
H.M. Bennett, W. Stephenson, C.M. Rose, S. Darmanis.
Single-cell proteomics enabled by next-generation sequencing or mass spectrometry. Vol. 20, Nature methods.
Nature Research, (2023), pp. 363-374
[30]
G.A. Silvestri, N.T. Tanner, P. Kearney, A. Vachani, P.P. Massion, A. Porter, et al.
Assessment of plasma proteomics biomarker's ability to distinguish benign from malignant lung nodules: results of the PANOPTIC (Pulmonary Nodule Plasma Proteomic Classifier) trial.
Chest, 154 (2018), pp. 491
[31]
N.T. Tanner, S.C. Springmeyer, A. Porter, J.R. Jett, P. Mazzone, A. Vachani, et al.
Assessment of integrated classifier's ability to distinguish benign from malignant lung nodules: extended analyses and 2-year follow-up results of the PANOPTIC (pulmonary nodule plasma proteomic classifier) trial.
Chest, 159 (2021), pp. 1283-1287
[32]
M.H. Spitzer, G.P. Nolan.
Mass cytometry: single cells many features. Vol. 165, Cell.
Cell Press, (2016), pp. 780-791
[33]
T. Badri, I. Eguren-Santamaria, E. Fernandez-Pierola, M.F. Sanmamed.
Mass cytometry to characterize the immune lung cancer microenvironment.
Methods Cell Biol, 174 (2023), pp. 31-41
[34]
Y. Glasson, L.A. Chépeaux, A.S. Dumé, V. Lafont, J. Faget, N. Bonnefoy, et al.
Single-cell high-dimensional imaging mass cytometry: one step beyond in oncology.
Semin Immunopathol, 45 (2023), pp. 17-28
[35]
Y. Lavin, S. Kobayashi, A. Leader, E.a.D. Amir, N. Elefant, C. Bigenwald, et al.
Innate immune landscape in early lung adenocarcinoma by paired single-cell analyses.
[36]
M.F. Sanmamed, X. Nie, S.S. Desai, F. Villaroel-Espindola, T. Badri, D. Zhao, et al.
A burned-out CD8+ T-cell subset expands in the tumor microenvironment and curbs cancer immunotherapy.
Cancer Discov, 11 (2021), pp. 1700-1715
[37]
L.P. Arnett, R. Rana, W.W.Y. Chung, X. Li, M. Abtahi, D. Majonis, et al.
Reagents for mass cytometry. Vol. 123, Chemical reviews.
American Chemical Society, (2023), pp. 1166-1205
[38]
K.M. Song, S. Lee, C. Ban.
Aptamers and their biological applications.
Sensors, 12 (2012), pp. 612-631
[39]
J. Candia, G.N. Daya, T. Tanaka, L. Ferrucci, K.A. Walker.
Assessment of variability in the plasma 7k SomaScan proteomics assay.
Sci Rep, 12 (2022),
[40]
G.S. Collins, J.B. Reitsma, D.G. Altman, K.G.M. Moons.
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement.
[41]
M.A.E. Binuya, E.G. Engelhardt, W. Schats, M.K. Schmidt, E.W. Steyerberg.
Methodological guidance for the evaluation and updating of clinical prediction models: a systematic review.
BMC Med Res Methodol, 22 (2022),
[42]
A.A.H. de Hond, A.M. Leeuwenberg, L. Hooft, I.M.J. Kant, S.W.J. Nijman, H.J.A. van Os, et al.
Guidelines and quality criteria for artificial intelligence-based prediction models in healthcare: a scoping review.
NPJ Digit Med, 5 (2022), pp. 2
[43]
S.C. Lu, C.L. Swisher, C. Chung, D. Jaffray, C. Sidey-Gibbons.
On the importance of interpretable machine learning predictions to inform clinical decision making in oncology.
Front Oncol, 13 (2023),
[44]
L.E. Cowley, D.M. Farewell, S. Maguire, A.M. Kemp.
Methodological standards for the development and evaluation of clinical prediction rules: a review of the literature.
Diagn Progn Res, 3 (2019), pp. 1-23
[45]
Y. Huang, W. Li, F. Macheret, R.A. Gabriel, L. Ohno-Machado.
A tutorial on calibration measurements and calibration models for clinical prediction models.
J Am Med Inform Assoc, 27 (2020), pp. 621-633
[46]
D. Dhamnetiya, R.P. Jha, S. Shalini, K. Bhattacharyya.
How to analyze the diagnostic performance of a new test? Explained with illustrations.
J Lab Physicians, 14 (2022), pp. 90
[47]
C. Ladbury, A. Amini, A. Govindarajan, I. Mambetsariev, D.J. Raz, E. Massarelli, et al.
Integration of artificial intelligence in lung cancer: rise of the machine. Vol. 4, Cell reports medicine.
Cell Press, (2023),
[48]
E. Martínez-Terroba, C. Behrens, F.J. de Miguel, J. Agorreta, E. Monsó, L. Millares, et al.
A novel protein-based prognostic signature improves risk stratification to guide clinical management in early-stage lung adenocarcinoma patients.
J Pathol, 245 (2018), pp. 421-432
[49]
E. Martínez-Terroba, C. Behrens, J. Agorreta, E. Monsó, L. Millares, E. Felip, et al.
5 protein-based signature for resectable lung squamous cell carcinoma improves the prognostic performance of the TNM staging.
[50]
V. Yaghoobi, S. Martinez-Morilla, Y. Liu, L. Charette, D.L. Rimm, M. Harigopal.
Advances in quantitative immunohistochemistry and their contribution to breast cancer.
Expert Rev Mol Diagn, 20 (2020), pp. 509-522
[51]
M.M. Harmsen, H.J. De Haard.
Properties, production, and applications of camelid single-domain antibody fragments.
Appl Microbiol Biotechnol, 77 (2007), pp. 13-22
[52]
G. Gonzalez-Sapienza, M.A. Rossotti, S. Tabares-da Rosa.
Single-domain antibodies as versatile affinity reagents for analytical and diagnostic applications. Vol. 8, Frontiers in Immunology.
Frontiers Media S.A., (2017),
[53]
C.L. Ramspek, K.J. Jager, F.W. Dekker, C. Zoccali, M. Van Diepen.
External validation of prognostic models: what, why, how, when and where?
Clin Kidney J, 14 (2021), pp. 49
[54]
R.J. Hung, E. Khodayari Moez, S.J. Kim, S. Budhathoki, J.D. Brooks.
Considerations of biomarker application for cancer continuum in the era of precision medicine.
Curr Epidemiol Rep, 9 (2022), pp. 200-211
[55]
R.J. Hung.
Biomarker-based lung cancer screening eligibility: implementation considerations.
Cancer Epidemiol Biomarkers Prev, 31 (2022), pp. 698-701
[56]
T.L. Larose, F. Meheus, P. Brennan, M. Johansson, H.A. Robbins.
Assessment of biomarker testing for lung cancer screening eligibility.
JAMA Netw Open, 3 (2020),
[57]
P.J. Mazzone, C.R. Sears, D.A. Arenberg, M. Gaga, M.K. Gould, P.P. Massion, et al.
Evaluating molecular biomarkers for the early detection of lung cancer: when is a biomarker ready for clinical use? An official American Thoracic Society Policy Statement.
Am J Respir Crit Care Med, 196 (2017), pp. e15-e29
[58]
R. Etzioni, R. Gulati, C. Patriotis, C. Rutter, Y. Zheng, S. Srivastava, et al.
Revisiting the standard blueprint for biomarker development to address emerging cancer early detection technologies.
J Natl Cancer Inst, (2023),
[59]
Z. Feng, M.S. Pepe.
Adding rigor to biomarker evaluations-EDRN experience.
Cancer Epidemiol Biomarkers Prev, 29 (2020), pp. 2575-2582
[60]
M.S. Pepe, R. Etzioni, Z. Feng, J.D. Potter, M.L. Thompson, M. Thornquist, et al.
Phases of biomarker development for early detection of cancer.
J Natl Cancer Inst, 93 (2001), pp. 1054-1061
[61]
R. Simon.
Clinical trial designs for evaluating the medical utility of prognostic and predictive biomarkers in oncology.
Per Med, 7 (2010), pp. 33
[62]
A. Duarte, M. Corbett, H. Melton, M. Harden, S. Palmer, M. Soares, et al.
EarlyCDT Lung blood test for risk classification of solid pulmonary nodules: systematic review and economic evaluation.
Health Technol Assess, 26 (2022),
[63]
H. Yu, T.A. Boyle, C. Zhou, D.L. Rimm, F.R. Hirsch.
PD-L1 expression in lung cancer. Vol. 11, Journal of Thoracic Oncology.
Lippincott Williams and Wilkins, (2016), pp. 964-975
[64]
M.N. Kammer, P.P. Massion.
Noninvasive biomarkers for lung cancer diagnosis, where do we stand?
J Thorac Dis, 12 (2020), pp. 3317-3330
Copyright © 2024. The Authors
Archivos de Bronconeumología