Tumor Cell Detection in Single-Cell DNA Sequencing Data
School Name
Ridge View High School
Grade Level
12th Grade
Presentation Topic
Computer Science
Presentation Type
Mentored
Abstract
Single-cell DNA sequencing (scDNA-seq) helps researchers study how cancer evolves. It is used to examine individual cells, characterize intra-tumor heterogeneity, and reconstruct the evolutionary history of a tumor. Coverage is the number of reads at a given position in the genome. The depth of high-coverage scDNA-seq allows point mutations to be analyzed, whereas such inferences are difficult to make from ultra-low coverage scDNA-seq. However, because its coverage is uniform, ultra-low coverage scDNA-seq is ideal for copy number calling [6]. This study aims to develop a computational method that uses features computed from ultra-low coverage scDNA-seq to detect tumor cells and to assist future efforts to identify technical errors. Data were pre-processed using Principal Component Analysis (PCA). A machine learning algorithm was then implemented to detect tumor cells in the resulting latent, dimensionally reduced space for two breast cancer patients (S0 and S1) sequenced using 10x Genomics. Tumor cell detection accuracy was 98% on the training set (patient S0) and 99% on the testing set (patient S1). These results demonstrate that the features are useful for accurately detecting tumor cells in ultra-low coverage scDNA-seq data. Spatial heterogeneity of tumor clones was observed, revealing correlations with cell type and sections. Doublet analysis showed that doublets were concentrated between clusters, providing evidence that this feature may be useful for future detection of technical errors. Future studies will focus on improving the computational method for doublet detection and on optimizing the tumor cell detection algorithm.
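To make the notions of coverage and copy number calling concrete, the following is a minimal illustrative sketch in Python, not the method used in this study. It counts reads in fixed-width genomic bins, computes mean coverage as total sequenced bases divided by genome length, and rescales bin counts so that the median bin corresponds to two copies. The function names, the bin size, the median-based normalization, and the assumed ploidy of 2 are all assumptions introduced here for illustration; the abstract does not specify how its features were computed.

```python
from statistics import median


def bin_read_counts(read_starts, genome_length, bin_size):
    """Count reads per fixed-width bin (hypothetical helper).

    read_starts holds 0-based start positions; reads that start outside
    the genome are ignored.
    """
    n_bins = -(-genome_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for start in read_starts:
        if 0 <= start < genome_length:
            counts[start // bin_size] += 1
    return counts


def mean_coverage(n_reads, read_length, genome_length):
    """Average number of reads covering a position in the genome."""
    return n_reads * read_length / genome_length


def relative_copy_number(counts, ploidy=2):
    """Scale bin counts so the median bin maps to `ploidy` copies.

    Assumption: most bins are at the baseline copy number, so the median
    count is a reasonable stand-in for that baseline.
    """
    baseline = median(counts)
    if baseline == 0:
        raise ValueError("median bin count is zero; use larger bins")
    return [ploidy * c / baseline for c in counts]


if __name__ == "__main__":
    counts = bin_read_counts([0, 5, 10, 15, 99], genome_length=100, bin_size=10)
    print(counts)
    print(mean_coverage(n_reads=1000, read_length=100, genome_length=1_000_000))
    print(relative_copy_number([10, 10, 20, 10]))
```

In this sketch a bin holding twice the median number of reads is reported as four copies. At ultra-low coverage, bins must be wide enough that the median count is not zero, which is why the last function raises an error in that case.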
Recommended Citation
Ariyo, Toluwanimi, "Tumor Cell Detection in Single-Cell DNA Sequencing Data" (2022). South Carolina Junior Academy of Science. 159.
https://scholarexchange.furman.edu/scjas/2022/all/159
Location
HSS 209
Start Date
4-2-2022 10:00 AM
Presentation Format
Oral and Written
Group Project
No