Nature Communications
Abstract
Due to the large number of repetitive sequences in complex eukaryotic genomes, fragmented assemblies lose value as reference genomes, often due to incomplete sequences and unanchored or mispositioned short contigs on chromosomes. Here we report a genome assembly method, HERA, which includes a concept called a connection graph as well as algorithms for constructing the graph from an overlap graph. HERA resolves repeats at high efficiency with single-molecule sequencing data, and dramatically improves the quality of current genome assemblies. We test HERA with the genomes of rice, maize, human, and Tartary buckwheat. HERA can correctly assemble most of the previously unassembled regions, including tandemly repetitive sequences, and improve the contig N50 sizes of published maize and human assemblies from 1.3 Mb to 61.2 Mb and from 8.3 Mb to 54.4 Mb, respectively. The application of HERA will greatly improve the quality of new or existing assemblies of complex genomes.
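The contig N50 quoted in the abstract is a standard contiguity metric: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation, assuming contig lengths are supplied as a plain list of integers. The function name `contig_n50` and the toy lengths are illustrative assumptions and are not taken from the paper or from the HERA software.

```python
from typing import Iterable


def contig_n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L
    account for at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    if not ordered:
        raise ValueError("at least one contig length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig that brings coverage to half or more.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy illustration (hypothetical lengths, not data from the paper):
# joining fragments into longer contigs raises N50 while the
# total assembly size stays the same.
fragmented = [5, 5, 5, 5]
joined = [10, 10]
print(contig_n50(fragmented), contig_n50(joined))
```

Because N50 depends only on the length distribution, it measures contiguity rather than correctness. A higher N50 is only meaningful if the joins are accurate.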
Paper ID: | DOI:10.1038/s41467-019-13355-3 |
Title: | Assembly of Chromosome-scale Contigs by Efficiently Resolving Repetitive Sequences with Long Reads |
First author: | Huilong Du & Chengzhi Liang |
Publication date: | 2019-11-26 |
Journal: | Nature Communications |
Other notes: | Huilong Du & Chengzhi Liang. Assembly of Chromosome-scale Contigs by Efficiently Resolving Repetitive Sequences with Long Reads. Nature Communications. DOI:10.1038/s41467-019-13355-3 |