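Abstract: Objective·To identify, at the transcriptome level, potential key genes and gene expression features that influence metastasis of microsatellite instability-high (MSI-H) colorectal cancer, and to build a gene-based metastasis prediction model. Methods·Transcriptome data of patients with MSI-H colorectal cancer were collected from The Cancer Genome Atlas database, and the patients were divided by metastasis status into a metastatic group (21 cases) and a non-metastatic group (42 cases). Differentially expressed genes (DEGs) between the 2 groups were analyzed, and Gene Ontology (GO) and Gene Set Enrichment Analysis (GSEA) were used for annotation, clustering, and signaling pathway enrichment of the DEGs. Hub genes were screened with STRING and Cytoscape. Selected DEGs were used to draw a nomogram, which was cross-validated by the Bootstrap method, and the effect of each gene in the nomogram on progression-free survival (PFS) in MSI-H colorectal cancer was analyzed. Results·A total of 245 DEGs were identified between the metastatic and non-metastatic groups; relative to the non-metastatic group, 204 genes were up-regulated and 41 were down-regulated in the metastatic group. GO analysis showed that, for biological process and molecular function, the DEGs were mainly enriched in ion transmembrane transport, chloride transmembrane transport, and chloride channel activity; for cellular component, they were enriched in the extracellular region and extracellular space. GSEA showed that the up-regulated genes were enriched in the neuroactive ligand-receptor interaction and metabolic signaling pathways. The top 10 hub genes in the protein-protein interaction network of the up-regulated genes were screened with Cytoscape. The metastasis prediction model built from the 10 DEGs with the smallest adjusted P values and high relevance to tumor development and progression showed some predictive performance, with an area under the curve (AUC) of 0.975 in the training set and 0.920 in the validation set. Within the model, the expression levels of AC078993.1 and IGLJ2 (immunoglobulin lambda joining 2) were significantly negatively correlated with PFS in MSI-H colorectal cancer (P=0.011 and P=0.005, respectively). Conclusion·In MSI-H colorectal cancer, changes in ion channels and in the extracellular environment may have an important influence on tumor metastasis, and neuroactive ligand-receptor interaction and metabolic signaling pathways may be important pathways for metastasis. A gene-based metastasis prediction model for MSI-H colorectal cancer was preliminarily constructed and may provide a reference for subsequent clinical research.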
Key words: colorectal cancer, microsatellite instability-high, metastasis, differentially expressed gene, bioinformatics, nomogram
Abstract: Objective·To explore, at the transcriptome level, the potential key genes and gene expression characteristics of microsatellite instability-high (MSI-H) colorectal cancer (CRC) with metastasis, and to establish a gene-based metastasis prediction model.
Methods·Transcriptome data of MSI-H CRC patients were obtained from The Cancer Genome Atlas database. The patients were divided into a metastatic group (21 patients) and a non-metastatic group (42 patients). Differentially expressed genes (DEGs) between the two groups were identified, and Gene Ontology (GO) and Gene Set Enrichment Analysis (GSEA) were used to annotate and cluster the DEGs and to identify enriched signaling pathways. STRING and Cytoscape were used to select the hub genes. A nomogram was drawn from selected DEGs, and the model was cross-validated by the Bootstrap method. Survival analysis was performed to explore the influence of each gene in the nomogram on progression-free survival (PFS) in MSI-H CRC.
Results·A total of 245 DEGs were identified between the metastatic and non-metastatic groups; relative to the non-metastatic group, 204 genes were up-regulated and 41 genes were down-regulated in the metastatic group. GO analysis showed that, in terms of biological process and molecular function, the DEGs were mainly enriched in ion transmembrane transport, chloride transmembrane transport and chloride channel activity. In terms of cellular component, the DEGs were mainly enriched in the extracellular region and extracellular space. GSEA showed that the up-regulated genes were enriched in the neuroactive ligand-receptor interaction and metabolic pathways. The top 10 hub genes in the protein-protein interaction network of the up-regulated genes were screened with Cytoscape. The metastasis prediction model, built from the 10 DEGs with the lowest adjusted P values and high relevance to tumor development and progression, showed some predictive performance [area under the curve (AUC)=0.975 in the training set, AUC=0.920 in the validation set]. The expression levels of AC078993.1 and IGLJ2 (immunoglobulin lambda joining 2) were significantly negatively correlated with PFS in MSI-H CRC (P=0.011 and P=0.005, respectively).
Conclusion·Changes in ion channels and in the extracellular environment may have an important impact on metastasis of MSI-H CRC. Neuroactive ligand-receptor interaction and metabolic pathways may be two important signaling pathways for metastasis of MSI-H CRC. A gene-based metastasis prediction model for MSI-H CRC was preliminarily established and may provide a reference for subsequent clinical research.
Key words: colorectal cancer (CRC), microsatellite instability-high (MSI-H), metastasis, differentially expressed gene (DEG), bioinformatics, nomogram
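The abstract reports a training AUC and a Bootstrap-validated AUC but does not specify the model form or the resampling scheme. As an illustration only, the sketch below shows one common way such a validation can be done: a Harrell-style optimism correction of the AUC. Everything in it is an assumption rather than the authors' method. The data are synthetic (21 "metastatic" and 42 "non-metastatic" samples with 10 features), and `fit_centroid_scorer` is a hypothetical toy scorer standing in for the regression model behind the nomogram.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_centroid_scorer(X, y):
    """Hypothetical stand-in model: weight each feature by the difference of class means."""
    pos = [x for x, lab in zip(X, y) if lab == 1]
    neg = [x for x, lab in zip(X, y) if lab == 0]
    w = [
        sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        for j in range(len(X[0]))
    ]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


def bootstrap_corrected_auc(X, y, n_boot=200, seed=0):
    """Return (apparent AUC, optimism-corrected AUC) using bootstrap resampling."""
    rng = random.Random(seed)
    n = len(y)
    model = fit_centroid_scorer(X, y)
    apparent = auc([model(x) for x in X], y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:  # skip resamples that lost one class
            continue
        Xb = [X[i] for i in idx]
        mb = fit_centroid_scorer(Xb, yb)
        auc_boot = auc([mb(x) for x in Xb], yb)  # performance on the resample
        auc_orig = auc([mb(x) for x in X], y)    # same model on the original data
        optimism.append(auc_boot - auc_orig)
    return apparent, apparent - sum(optimism) / len(optimism)


def make_synthetic(n_pos=21, n_neg=42, n_genes=10, shift=0.8, seed=1):
    """Synthetic expression-like data; group sizes mirror the abstract, values do not."""
    rng = random.Random(seed)
    X, y = [], []
    for label, count in ((1, n_pos), (0, n_neg)):
        for _ in range(count):
            X.append([rng.gauss(shift * label, 1.0) for _ in range(n_genes)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    apparent, corrected = bootstrap_corrected_auc(X, y)
    print(f"apparent AUC={apparent:.3f}, optimism-corrected AUC={corrected:.3f}")
```

With only 63 samples and 10 predictors, the apparent AUC of a fitted model tends to be optimistic, which is why a resampling-based estimate is usually reported alongside the training figure.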