Accurate in silico identification of compound–protein interactions (CPIs) can deepen our understanding of the mechanisms underlying drug action and thus substantially facilitate drug discovery and development. Conventional similarity- or docking-based computational methods for predicting CPIs rarely exploit latent features from the large-scale unlabeled compound and protein data now available, and they are often limited to relatively small-scale datasets. In the present study, we propose DeepCPI, a novel, general, and scalable computational framework that combines effective feature embedding (a representation learning technique) with powerful deep learning methods to accurately predict CPIs at a large scale. DeepCPI automatically learns implicit yet expressive low-dimensional features of compounds and proteins from a massive amount of unlabeled data. Evaluations on measured CPIs from large-scale databases, such as ChEMBL and BindingDB, as well as on known drug–target interactions from DrugBank, demonstrated the superior predictive performance of DeepCPI. Furthermore, several interactions predicted by DeepCPI between small-molecule compounds and three G protein-coupled receptor targets (glucagon-like peptide-1 receptor, glucagon receptor, and vasoactive intestinal peptide receptor) were experimentally validated. The present study suggests that DeepCPI is a useful and powerful tool for drug discovery and repositioning. The source code of DeepCPI can be downloaded from https://github.com/FangpingWan/DeepCPI.
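Accurately identifying compound–protein interactions (CPIs) by computer can deepen our understanding of the mechanisms of drug action and thereby facilitate drug discovery and development. Traditional similarity- or docking-based CPI prediction methods usually draw on small-scale labeled datasets and rarely exploit the latent feature information in large amounts of unlabeled compound and protein data. In this study, we propose DeepCPI, a novel CPI prediction model. Using feature embedding (a representation learning method), DeepCPI automatically learns feature representations of compounds and proteins from large-scale unlabeled data and then combines them with deep learning to predict CPIs accurately and at a large scale. Computational experiments and evaluations on large-scale CPI datasets (ChEMBL, BindingDB, and DrugBank) show that DeepCPI has superior predictive performance. In addition, wet-lab experiments further validated several interactions predicted by DeepCPI between small molecules and three GPCR proteins (GLP-1R, GCGR, and VIPR). Taken together, these results indicate that DeepCPI is an effective tool for drug discovery and repositioning.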
Full-text PDF download:
http://gpb.big.ac.cn/articles/download/731
DeepCPI: A Deep Learning-based Framework for Large-scale in silico Drug Screening
Posted by the site editor, Free考研考试, 2022-01-03