The rapid development of high-throughput sequencing technologies has dramatically reduced the cost and time required for de novo genome sequencing and genome resequencing projects, and new genome sequences are released every week. The resulting plethora of updated genome assemblies creates a need for version-specific annotation files and other compatible public datasets for downstream analysis. To handle these tasks efficiently, we developed the reference-based genome assembly and annotation tool (RGAAT), a flexible toolkit for resequencing-based consensus building and annotation update. On the four DNA-seq datasets tested in this study, RGAAT detects sequence variants with precision, specificity, and sensitivity comparable to GATK, and with higher precision and specificity than Freebayes and SAMtools. RGAAT can also identify sequence variants from cross-cultivar or cross-version genomic alignments. Unlike GATK and SAMtools/BCFtools, RGAAT builds the consensus sequence by taking the true allele frequency into account. Finally, RGAAT uses the sequence variants to generate a coordinate conversion file between the reference and query genomes, and supports annotation file transfer. Compared with the rapid annotation transfer tool (RATT), RGAAT performs better at annotation transfer between different genome assemblies, strains, and species. In addition, RGAAT can be used for genome modification, genome comparison, and coordinate conversion. RGAAT is freely available at https://sourceforge.net/projects/rgaat/ and https://github.com/wushyer/RGAAT_v2.
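To make the idea of allele-frequency-aware consensus building concrete, here is a minimal Python sketch. It is not RGAAT's actual implementation: the function names, the `pileup` input layout, and the `min_depth` and `min_freq` thresholds are all assumptions chosen for illustration.

```python
from collections import Counter


def consensus_base(ref_base, observed_bases, min_depth=3, min_freq=0.5):
    """Pick a consensus base from the read bases covering one position.

    min_depth and min_freq are hypothetical thresholds, not RGAAT defaults.
    """
    depth = len(observed_bases)
    if depth < min_depth:
        return ref_base  # too little evidence: keep the reference base
    allele, count = Counter(observed_bases).most_common(1)[0]
    # Replace the reference only when the top allele is frequent enough.
    return allele if count / depth > min_freq else ref_base


def build_consensus(reference, pileup, **kwargs):
    """Build a consensus string; pileup maps 0-based position -> list of bases."""
    return "".join(
        consensus_base(base, pileup.get(pos, []), **kwargs)
        for pos, base in enumerate(reference)
    )
```

In this sketch a position keeps the reference base unless enough reads cover it and a single allele exceeds the frequency threshold, so low-coverage or evenly split sites do not alter the consensus.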
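While processing resequencing data from related rice lines, the researchers found that transcriptome mapping rates and the number of reconstructed transcripts differed greatly among lines, indicating that a single reference genome cannot serve every line. This study therefore developed RGAAT (Reference Based Genome Assembly and Annotation Tool), a tool for consensus genome sequence construction, variant identification, and annotation transfer based on resequencing data. RGAAT takes a genome sequence and annotation files (GTF, GFF, GFF3, and BED) together with a mapping file (SAM/BAM) or a variant file (VCF), and produces an updated genome sequence and transferred annotation files. Unlike GATK and SAMtools/BCFtools, RGAAT builds the consensus sequence by taking the true allele frequency into account. RGAAT can also identify genomic variants. On four resequencing test datasets, the accuracy and specificity of RGAAT's variant detection were comparable to those of GATK and higher than those of Freebayes and SAMtools. RGAAT can also identify variants from genomic alignments between cultivars. From the sequence variant file, RGAAT computes a coordinate conversion file between the input sequence and the reference genome, as well as annotation transfer files. Results on test data indicate that RGAAT transfers annotations better than the existing annotation transfer tool RATT. RGAAT is available on Sourceforge (https://sourceforge.net/projects/rgaat/) and Github (https://github.com/wushyer/RGAAT_v2), where researchers can download and use it free of charge.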
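The coordinate conversion step can be illustrated with a short Python sketch that maps a reference position to the query genome by accumulating the length changes of upstream variants. This is an assumed, simplified model, not RGAAT's own algorithm or file format: the function name `ref_to_query` and the tuple layout of `variants` are hypothetical, and the variants are assumed to be VCF-style, sorted, and non-overlapping.

```python
def ref_to_query(pos, variants):
    """Convert a 1-based reference position to a 1-based query position.

    variants: list of (ref_pos, ref_allele, alt_allele) tuples, sorted by
    ref_pos and non-overlapping (hypothetical input layout).
    Returns None when the position falls inside deleted sequence.
    """
    offset = 0  # net length change contributed by upstream variants
    for ref_pos, ref, alt in variants:
        if ref_pos > pos:
            break  # remaining variants lie downstream of pos
        ref_end = ref_pos + len(ref) - 1
        if pos <= ref_end:
            # pos lies inside this variant's reference allele
            idx = pos - ref_pos
            if idx < len(alt):
                return pos + offset  # base still has a counterpart in the query
            return None  # base was deleted in the query
        offset += len(alt) - len(ref)
    return pos + offset
```

For example, with a 2 bp insertion at position 3 and a 2 bp deletion after position 8, positions between the two variants shift by +2, while positions after the deletion return to their original coordinates. An annotation transfer would apply the same mapping to the start and end of each feature.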
Full-text PDF download:
http://gpb.big.ac.cn/articles/download/670
RGAAT: A Reference-based Genome Assembly and Annotation Tool for New Genomes and Upgrade of Known Ge
Posted by site editor, Free考研考试, 2022-01-03