The low abundance of circulating tumour DNA (ctDNA) in plasma samples makes the analysis of ctDNA biomarkers for the detection or monitoring of early-stage cancers challenging. Here we show that deep methylation sequencing aided by a machine-learning classifier of methylation patterns enables the detection of tumour-derived signals at dilution factors as low as 1 in 10,000. For a total of 308 patients with surgery-resectable lung cancer and 261 age- and sex-matched non-cancer control individuals recruited from two hospitals, the assay detected 52–81% of the patients at disease stages IA to III with a specificity of 96% (95% confidence interval (CI) 93–98%). In a subgroup of 115 individuals, the assay identified, at 100% specificity (95% CI 91–100%), nearly twice as many patients with cancer as those identified by ultradeep mutation sequencing analysis. The low amounts of ctDNA permitted by machine-learning-aided deep methylation sequencing could provide advantages in cancer screening and the assessment of treatment efficacy.
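As a rough illustration of how a specificity estimate and its 95% confidence interval relate to the size of the control group, the sketch below computes a Wilson score interval. It is not the authors' analysis: the interval method used in the study is not stated in this excerpt, so the choice of the Wilson interval is an assumption, and the count of 251 correctly classified controls is a hypothetical value chosen only because 251 of 261 is close to the reported 96%.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")
    p_hat = successes / total
    denom = 1 + z**2 / total
    # Centre of the interval is pulled slightly towards 0.5.
    centre = (p_hat + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / total + z**2 / (4 * total**2)
    ) / denom
    return centre - half_width, centre + half_width


# Hypothetical count (assumption): 251 of the 261 controls called negative.
true_negatives = 251
controls = 261

low, high = wilson_interval(true_negatives, controls)
print(f"specificity = {true_negatives / controls:.1%}, "
      f"95% CI {low:.1%} to {high:.1%}")
```

With these assumed counts the interval falls close to the reported range, which shows how a control group of this size bounds the precision of the specificity estimate.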
The main data supporting the results in this study are available within the paper and its Supplementary Information. The microarray data used for identification of differentially methylated sites can be downloaded from the TCGA database at https://gdac.broadinstitute.org/runs/analyses__2016_01_28/data and from the GEO database under the accession code GSE40279. Illumina EPIC TruSeq Methyl data are available at https://basespace.illumina.com/projects/31997005. The raw sequencing data (.fastq files) generated are available from the NCBI Sequence Read Archive (SRA) repository, under the accession code PRJNA534206. The analysed datasets generated during the study are too large to be publicly shared, but they are available for research purposes from the corresponding authors on reasonable request. Any data and materials that can be shared will be released subject to a data-transfer agreement.
We thank K. Kemphues (Cornell University) for critical review of the manuscript. We thank Z. Liang, H. Wu, Z. Jin, F. Tan, S. Chuai, W. Deng, X. Mao, Y. Ma, L. Yang, J. Ye and F. Duan for their assistance with this study. This work was supported, in part, by Beijing Natural Science Foundation (grant number 7182132), the Major Projects of the Beijing Municipal Science and Technology Commission (grant number Z 7013), the Capital Special Project for Featured Clinical Application (grant number Z 5157), the Peking Union Medical College Hospital Youth Fund (grant numbers PUMCH-2016-2.25, HI626500), and the Peking Union Medical College Special Youth Teacher Project (grant numbers 2014zlgc0717; 2014zlgc0135).
N.L. designed, supervised the clinical study, and provided funding. B.L. designed, supervised the technical study, and wrote the paper. Z.J., P.W., Y. Wang, Y. Wu, Z.C., L.C., Z.B., Hongsheng Liu, L.L., C.H., Y.Q. and Y.C. conducted the clinical study including participant recruitment, sample preparation, clinical information collection and interpretation. C. Wang, T.Z., F.Q., J.S., J. Xu, F.X., H.C., S.F., X.Y., H.H.-Z., J. Xiang and Hao Liu performed the technical development including experiment conduction, computational framework construction and data analysis. C. Wu and X.G. optimized the machine-learning algorithm. H.Z. and S.L. designed the clinical study and provided funding. Z.Z. conceived the idea and oversaw the overall direction. All authors discussed the results and contributed to the final manuscript.
T.Z., B.L. and Z.Z. are inventors on a pending patent application held by Burning Rock Biotech related to target deep methylation sequencing (WO2019192489A1, filed in the United States, Canada, Europe, Japan, Singapore, Australia and Brazil). C. Wang, B.L. and Z.Z. are on a patent application to be submitted by Burning Rock Biotech that covers other aspects of ELSA-seq described in this article. X.G. is a consultant for Burning Rock Biotech.
Liang, N., Li, B., Jia, Z. et al. Ultrasensitive detection of circulating tumour DNA via deep methylation sequencing aided by machine learning. Nat Biomed Eng 5, 586–599 (2021). https://doi.org/10.1038/s41551-021-00746-5