Establishment and validation of a prognostic prediction model for colorectal cancer based on single-cell RNA sequencing

Abstract


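Abstract: Objective·To construct a prognostic prediction model for colorectal cancer based on single-cell RNA sequencing (scRNA-seq). Methods·scRNA-seq datasets of colorectal cancer samples were obtained from the Gene Expression Omnibus (GEO) database, and differentially expressed genes associated with colorectal cancer metastasis were screened as candidate genes for the prediction model. Least absolute shrinkage and selection operator (LASSO) regression, logistic regression and Kaplan-Meier survival analysis were then used to further screen and validate a prognosis-related gene set in The Cancer Genome Atlas (TCGA) database and to build the prognostic prediction model. Decision curve analysis and the receiver operating characteristic (ROC) curve were used to evaluate the value of the model in clinical application. Results·Thirty differentially expressed genes were identified from the scRNA-seq data obtained from GEO. LASSO regression in the TCGA database then yielded 9 key genes, which were used to score key-gene expression for each patient. Scores of patients with and without recurrence were compared in the training set and in the validation set, and the differences were statistically significant in both (P<0.05). Logistic regression analysis was used to incorporate two independent clinical variables, primary tumor stage (T stage) and presence of distant metastasis (M stage), into an integrated score-clinical variable model. When the predictive value of this integrated model was assessed, the areas under the ROC curve in the training set and the validation set were 0.775 and 0.705, respectively. Conclusion·Based on scRNA-seq results, a relatively stable prognostic prediction model for colorectal cancer was constructed, which can serve as a reference for the clinical assessment of patient prognosis.
Keywords: single-cell RNA sequencing, colorectal cancer, prognosis, bioinformatics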
Abstract:
Objective·To establish a model for predicting prognosis in patients with colorectal cancer (CRC) based on single-cell RNA sequencing (scRNA-seq).
Methods·scRNA-seq data from patients with CRC were obtained from the Gene Expression Omnibus (GEO) database and used to screen candidate genes associated with CRC metastasis. Least absolute shrinkage and selection operator (LASSO) regression, logistic regression and Kaplan-Meier analysis were then used to select and validate prognosis-related hub genes in The Cancer Genome Atlas (TCGA) database and to develop the prognostic prediction model for CRC. Decision curve analysis and the receiver operating characteristic (ROC) curve were used to assess the clinical utility of the prediction model.
Results·Thirty candidate genes were identified from the scRNA-seq data downloaded from the GEO database, and 9 hub genes were then selected by LASSO regression in the TCGA database. Hub-gene expression was scored for each patient. The scores differed significantly between patients with and without recurrence in both the training set and the validation set (P<0.05). In addition, logistic regression analysis was used to incorporate two independent clinical variables, primary tumor stage (T stage) and distant metastasis status (M stage), into the integrated score-clinical variable model. The areas under the ROC curve of the integrated model in the training set and the validation set were 0.775 and 0.705, respectively.
Conclusion·A relatively stable model for predicting prognosis in CRC was constructed based on scRNA-seq results, which may help guide treatment decisions and prognostic assessment.

Key words: single-cell RNA sequencing (scRNA-seq), colorectal cancer, prognosis, bioinformatics
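
The abstract evaluates the model with the area under the ROC curve and with decision curve analysis, but does not say which software was used. As a minimal sketch only, the two quantities can be computed as follows in plain Python. The function names and the toy recurrence labels and predicted risks are hypothetical and are not taken from the study. The net-benefit formula is the standard decision-curve definition, TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive case scores higher than a randomly chosen
    negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels: Sequence[int], probs: Sequence[float],
                threshold: float) -> float:
    """Net benefit of treating every patient whose predicted risk is at
    least `threshold`: TP/n - (FP/n) * threshold / (1 - threshold)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = recurrence, 0 = no recurrence.
recurrence = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical predicted risks from an integrated score-clinical model.
risk = [0.9, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1, 0.8]

print("AUC:", roc_auc(recurrence, risk))
print("Net benefit at pt=0.5:", net_benefit(recurrence, risk, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies. The model is considered clinically useful at thresholds where its net benefit exceeds both.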

