Journal of Biology

• Technical Methods

A hybrid deep neural network for annotating the pathogenicity of genetic variants

  1. College of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
  • Published: 2019-08-18 Online: 2019-08-18
  • About the author: Yang Shuxin (杨书新), Ph.D., associate professor; research interests: graph data management and bioinformatics. E-mail: 670774377@qq.com
  • Funding:
    National Natural Science Foundation of China, Regional Program (41362015); Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ170518)

Abstract: The function of most non-coding regions of the genome remains unclear, yet many genetic variants lie in these regions, so identifying disease-associated variants is still a challenge. CADD, an existing algorithm based on a support vector machine, can annotate variants in both coding and non-coding regions, but it fails to capture non-linear relationships among features. To address this problem, this paper designs a hybrid model that combines a convolutional neural network with a fully connected neural network and captures these non-linear relationships well. On the test set, the method achieves the highest accuracy, 66.44%.

Key words: deep learning, genetic variants, pathogenicity, annotation
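
The abstract does not specify the architecture, so the following is only a minimal sketch of what a hybrid of a convolutional branch and a fully connected branch over one variant's annotation features might look like. Everything concrete here is an assumption made for illustration: the feature count, the kernel and layer sizes, the merge by concatenation, the sigmoid output, and the random (untrained) weights. It is not the authors' model.

```python
import math
import random


def relu(values):
    """Element-wise ReLU, the non-linearity used in both branches."""
    return [max(0.0, v) for v in values]


def conv1d(x, kernel, bias):
    """Valid 1D convolution (cross-correlation) of a feature vector with one kernel."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(x) - k + 1)]


def dense(x, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_features, n_kernels=2, kernel_size=3, n_hidden=4, seed=0):
    """Random weights; all sizes are hypothetical, chosen only for the demo."""
    rng = random.Random(seed)

    def rand(n):
        return [rng.uniform(-0.5, 0.5) for _ in range(n)]

    conv_len = n_features - kernel_size + 1
    merged_len = n_kernels * conv_len + n_hidden
    return {
        "kernels": [rand(kernel_size) for _ in range(n_kernels)],
        "kernel_bias": rand(n_kernels),
        "fc_w": [rand(n_features) for _ in range(n_hidden)],
        "fc_b": rand(n_hidden),
        "out_w": [rand(merged_len)],
        "out_b": rand(1),
    }


def predict(features, p):
    """Forward pass: conv branch and dense branch, concatenated, then one sigmoid unit."""
    conv_out = []
    for kernel, bias in zip(p["kernels"], p["kernel_bias"]):
        conv_out.extend(relu(conv1d(features, kernel, bias)))
    fc_out = relu(dense(features, p["fc_w"], p["fc_b"]))
    merged = conv_out + fc_out  # assumed merge strategy: concatenation
    logit = dense(merged, p["out_w"], p["out_b"])[0]
    return sigmoid(logit)


# Hypothetical 8-value annotation vector for a single variant.
params = init_params(n_features=8)
score = predict([0.1, 0.9, 0.0, 0.3, 0.7, 0.2, 0.5, 0.4], params)
print(f"pathogenicity score (untrained): {score:.3f}")
```

The ReLU activations are what let such a model represent non-linear interactions among features. In practice the weights would be learned from labeled variants with a deep learning framework rather than set at random.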

CLC number: