Journal of Biology (生物学杂志)

• Research Report •

Mining breast cancer prognosis data from the TCGA cancer database

  

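  1. Institute of Biophysics, School of Life Sciences, Lanzhou University, Lanzhou 730000
  • Publication date: 2018-08-18 Release date: 2018-08-18
  • Corresponding author: Li Shuolei (李硕磊), PhD candidate; research interest: regulation of the CRP acute-phase gene; E-mail: daniel900102@163.com
  • About the author: Mian Khizar Hayat, PhD candidate; research interest: regulation of the CRP acute-phase gene; E-mail: hayat13@lzu.edu.cn
  • Funding:
    Fundamental Research Funds for the Central Universities, Lanzhou University (862601)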

Prognosis-associated genes mined from TCGA breast cancer data

  1. Institute of Biophysics, School of Life Science, Lanzhou University, Lanzhou 730000, China
  • Online:2018-08-18 Published:2018-08-18

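Abstract (Chinese version, translated): In recent years, the incidence of breast cancer has gradually risen, with a trend toward younger patients. We used gene information already available in the TCGA database to screen for and identify genes associated with breast cancer prognosis. To exclude differences caused by sampling cancer tissue and normal tissue at different times, we selected 113 paired samples in which the breast cancer region and its corresponding adjacent normal tissue were assayed together, retrieved their transcriptome data from TCGA, and performed differential expression analysis with DESeq, which yielded 1428 differentially expressed genes. Gene Ontology (GO), KEGG pathway, Disease Ontology (DO) and enrichment analyses of these genes yielded 68 key differentially expressed genes related to breast cancer. Overall survival analysis of these breast cancer-related genes, using the expression data of all cancer cases in the database (1097 cases in total), identified 8 genes associated with breast cancer prognosis. In breast cancer patients, high expression of PGLYRP2, SEMA3G, PROL1 and SLC7A3 was accompanied by a good prognosis, whereas high expression of SKA1, BIRC5, RRM2 and AURKA was accompanied by a poor prognosis. These 8 genes may be important prognosis-related genes in breast cancer. They offer new directions and ideas for prognosis-oriented treatment of breast cancer patients, and prognosis might be controlled as far as possible by regulating their expression levels.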

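Keywords (Chinese version, translated): The Cancer Genome Atlas (TCGA) database, breast cancer, differentially expressed genes, prognosis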

Abstract: In recent years, the incidence of breast cancer has increased year by year, and patients are trending younger. This study used The Cancer Genome Atlas (TCGA) database to identify genes associated with breast cancer prognosis. To exclude differences arising from sampling cancer and normal tissues at different times, we selected 113 pairs of breast cancer tissue and adjacent normal tissue that were sampled simultaneously. We obtained their transcriptome data from TCGA (RNAseqV2, raw counts) and analyzed them with DESeq, which screened out 1428 differentially expressed genes (padj < 0.05 and |log2FoldChange| > 1). We then analyzed these differentially expressed genes by Gene Ontology, Disease Ontology, KEGG and enrichment analysis, and screened out 68 differentially expressed genes related to breast cancer. Finally, we performed overall survival analysis using the gene expression data of 1097 cancer patients and found 8 genes associated with prognosis. High expression of PGLYRP2, SEMA3G, PROL1 and SLC7A3 was associated with a good prognosis, whereas high expression of SKA1, BIRC5, RRM2 and AURKA was associated with a poor prognosis. These 8 genes may form a group of genes related to breast cancer prognosis, which may suggest new directions for breast cancer treatment and raise the possibility of improving prognosis by regulating gene expression.

Key words: TCGA database, breast cancer, differentially expressed genes, prognosis
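
The screening threshold stated in the abstract (padj < 0.05 and |log2FoldChange| > 1) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the record layout and the toy values below are all hypothetical, and treating a missing adjusted p-value as "not significant" is an assumption made here for illustration.

```python
import math

def is_differentially_expressed(padj, log2_fold_change,
                                padj_cutoff=0.05, lfc_cutoff=1.0):
    """Apply the abstract's threshold: padj < 0.05 and |log2FC| > 1."""
    # Assumption: a missing (None/NaN) adjusted p-value counts as not significant.
    if padj is None or math.isnan(padj):
        return False
    return padj < padj_cutoff and abs(log2_fold_change) > lfc_cutoff

def filter_de_genes(results):
    """Return the names of genes that pass the threshold, in input order."""
    return [
        row["gene"]
        for row in results
        if is_differentially_expressed(row["padj"], row["log2FoldChange"])
    ]

# Toy records with made-up gene names and values (not data from the study).
toy_results = [
    {"gene": "GENE_A", "log2FoldChange": 2.3, "padj": 0.001},         # passes
    {"gene": "GENE_B", "log2FoldChange": -1.7, "padj": 0.02},         # passes (down-regulated)
    {"gene": "GENE_C", "log2FoldChange": 0.4, "padj": 0.0001},        # fold change too small
    {"gene": "GENE_D", "log2FoldChange": 3.0, "padj": 0.2},           # not significant
    {"gene": "GENE_E", "log2FoldChange": 1.5, "padj": float("nan")},  # missing padj
]

selected = filter_de_genes(toy_results)
print(selected)  # ['GENE_A', 'GENE_B']
```

Taking the absolute value of the log2 fold change keeps both up-regulated and down-regulated genes, which matches the ABS(log2foldchange) wording in the abstract.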