Researchers leverage whole transcriptome RNA sequencing and machine learning to improve lung cancer risk stratification

“Bronchoscopy is commonly used to evaluate potentially cancerous lung nodules, but it often delivers inconclusive results. This frequently leads to additional diagnostic procedures, including invasive lung surgeries,” said Giulia C. Kennedy, Ph.D., chief scientific officer and chief medical officer for Veracyte. “We used advanced RNA sequencing and novel machine learning technology to overcome numerous, potential confounding factors and develop a robust genomic test that improves diagnostic accuracy in lung cancer. Providing physicians with this level of actionable genomic information can reduce the number of potentially dangerous lung surgeries and help guide next intervention steps for patients.”

The Percepta Genomic Sequencing Classifier (GSC) is based on novel “field of injury” science, which identifies genomic changes that correlate with lung cancer risk in current or former smokers. The test uses cells collected by brushing the patient’s main lung airway during a standard bronchoscopy, so the lesion itself does not need to be sampled. Veracyte scientists developed the test by applying whole-transcriptome RNA sequencing and machine learning to more than 1,600 patient samples from three different cohorts. The samples came from current and former smokers who underwent bronchoscopy for suspected lung cancer. The Percepta GSC has been commercially available since June 2019.

The BMC Medical Genomics paper describes how Veracyte scientists developed the Percepta GSC while mitigating multiple technical and analytical factors that could affect the classifier’s performance, including demographic differences between patient cohorts, smoking status, inhaled medication use and the timing of sample collection. To account for these individual genomic and clinical features, Veracyte scientists integrated multiple classifiers into an ensemble model, together with a novel genomic index designed to capture smoking history and the differences between current and former smokers.
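The general idea of integrating multiple classifiers can be illustrated with a minimal sketch. It is not the Percepta GSC itself: the feature values, weights, the `smoking_index` input and the simple averaging rule are all hypothetical, chosen only to show how several logistic components might be combined into one risk score.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_score(features, weights, bias):
    """Probability produced by one logistic regression component."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def ensemble_score(features, components):
    """Average the probabilities of several component classifiers.

    `components` is a list of (weights, bias) pairs. Plain averaging is an
    assumption made for this sketch, not the published combination rule.
    """
    scores = [logistic_score(features, w, b) for w, b in components]
    return sum(scores) / len(scores)


# Hypothetical inputs: two expression-derived features plus a smoking index.
smoking_index = 0.8
features = [0.4, -1.2, smoking_index]

# Hypothetical component classifiers (weights, bias).
components = [
    ([0.5, -0.3, 1.0], 0.0),
    ([0.2, -0.6, 0.7], -0.1),
]

risk = ensemble_score(features, components)
print(f"ensemble risk score: {risk:.3f}")
```

Averaging is only one way to combine components. Its appeal is that no single component dominates the score, which is one way an ensemble can stay stable across cohorts with different characteristics.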

Fig. 2. Genomic sequencing classifier structure. (a) Overall structure of the ensemble model. (b) Detailed structure of the hierarchical logistic regression component.

Prospective clinical validation research shows that this ensemble classification approach stabilizes the Percepta GSC’s performance across patients with different clinical and genomic characteristics from multiple clinical cohorts, as well as across samples from multiple RNA sequencing batches. The test reclassified a subset of patients with “intermediate” or “low” pre-test risk of cancer as “low” or “very low” risk with a high negative predictive value, potentially avoiding the need for additional invasive diagnostic procedures for these patients. The test also reclassified a subset of “intermediate” and “high” pre-test risk patients as “high” or “very high” risk with a high positive predictive value, which may shorten the time to diagnosis and treatment decisions.
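For readers less familiar with the two metrics, negative predictive value (NPV) is the share of “low risk” calls that turn out to be benign, and positive predictive value (PPV) is the share of “high risk” calls that turn out to be cancer. The short sketch below computes both from confusion-matrix counts. The counts are invented for illustration and are not results from the study.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts.

    tp: called high risk, cancer confirmed
    fp: called high risk, benign
    tn: called low risk, benign
    fn: called low risk, cancer confirmed
    """
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical counts, for illustration only.
ppv, npv = predictive_values(tp=45, fp=5, tn=90, fn=10)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Both values depend on how common cancer is in the tested population, which is why the article reports them for specific pre-test risk groups.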

“By utilizing cutting-edge genomic technologies and insisting on scientific rigor from development through validation, we’ve generated a diagnostic classifier that overcomes the many challenges of lung cancer diagnosis without the need for invasive surgery,” said Bonnie Anderson, Veracyte’s chairman and chief executive officer.

Lung cancer is the leading cause of cancer deaths and is expected to kill approximately 136,000 Americans in 2020 – more than the next three leading cancers combined. Lung nodules are typically the first sign of lung cancer. However, determining which nodules are cancerous and which are benign is often challenging, leading to unnecessary invasive procedures or delayed treatment.

Source – BusinessWire

Choi Y, Qu J, Wu S, Hao Y, Zhang J, Ning J, Yang X, Lofaro L, Pankratz DG, Babiarz J, Walsh PS, Billatos E, Lenburg ME, Kennedy GC, McAuliffe J, Huang J. (2020) Improving lung cancer risk stratification leveraging whole transcriptome RNA sequencing and machine learning across multiple cohorts. BMC Med Genomics 13(Suppl 10):151.
