
TaxonKit: A practical and efficient NCBI taxonomy toolkit


Wei Shen,
Hong Re
Institute for Viral Hepatitis, Department of Infectious Diseases, Key Laboratory of Molecular Biology for Infectious Diseases, Ministry of Education, The Second Affiliated Hospital of Chongqing Medical University, Chongqing 400010, China
Funds: We thank Yong-Xin Liu (State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China), Qi Zhao (Sun Yat-sen University Cancer Center, Guangzhou, China), Zhi-Luo Deng (Department of Computational Biology, Helmholtz Centre for Infection Research, Braunschweig, Germany), and Cai-Yun Zhu (Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China) for giving advice and comments on the manuscript. We are also grateful to TaxonKit users who have greatly helped to report bugs and suggest new features. We thank Daniel S. Standage (National Biodefense Analysis and Countermeasures Center, Fort Detrick, USA) for writing the Python bindings for TaxonKit. This work was supported by grants from the National Natural Science Foundation of China (32000474) to W.S. and the National Science and Technology Major Project of China (2017ZX10202203-007-001) to H.R.

Received Date: 2021-02-09
Rev Recd Date: 2021-03-15
Accepted Date: 2021-03-27
Publish Date: 2021-04-15




Abstract
The National Center for Biotechnology Information (NCBI) Taxonomy is widely applied in biomedical and ecological studies. Typical demands include querying taxonomy identifiers (TaxIds) by taxonomy names, querying complete taxonomic lineages by TaxIds, and listing descendants of given TaxIds, among others. However, existing tools are either limited in functionality or inefficient in terms of runtime. In this work, we present TaxonKit, a command-line toolkit for comprehensive and efficient manipulation of NCBI Taxonomy data. TaxonKit comprises seven core subcommands that provide functions including TaxId querying, listing, filtering, lineage retrieval and reformatting, lowest common ancestor computation, and TaxId change tracking. Its practical functions, competitive processing performance, scalability across datasets of different sizes, and good accessibility facilitate taxonomy data manipulation. TaxonKit is freely available under the permissive MIT license on GitHub, Brewsci, and Bioconda. The documentation is available at https://bioinf.shenwei.me/taxonkit/.
Keywords: NCBI Taxonomy, TaxonKit, TaxId, Lineage, TaxId changelog
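
The operations named in the abstract (lineage retrieval, descendant listing, and lowest common ancestor computation) can be illustrated on a child-to-parent map of TaxIds. The following is a minimal Python sketch, not TaxonKit's implementation: the function names are hypothetical, and the tree is a heavily abbreviated toy excerpt in which the nodes between the root and Hominidae are omitted.

```python
from collections import defaultdict

# Toy child -> parent map of TaxIds (assumption: abbreviated excerpt;
# the real NCBI tree has many more nodes between the root and Hominidae).
PARENT = {
    1: 1,            # root (its own parent)
    9604: 1,         # Hominidae (intermediate ancestors omitted)
    207598: 9604,    # Homininae
    9605: 207598,    # Homo
    9606: 9605,      # Homo sapiens
    9596: 207598,    # Pan
    9598: 9596,      # Pan troglodytes
}


def lineage(taxid):
    """Return the TaxIds from the root down to `taxid`."""
    path = [taxid]
    while PARENT[path[-1]] != path[-1]:   # stop at the self-parented root
        path.append(PARENT[path[-1]])
    return path[::-1]


def descendants(taxid):
    """Return all TaxIds below `taxid`, sorted numerically."""
    children = defaultdict(list)
    for child, parent in PARENT.items():
        if child != parent:
            children[parent].append(child)
    found, stack = [], [taxid]
    while stack:                          # depth-first walk of the subtree
        for child in children[stack.pop()]:
            found.append(child)
            stack.append(child)
    return sorted(found)


def lca(a, b):
    """Return the deepest TaxId shared by the lineages of `a` and `b`."""
    last_common = None
    for x, y in zip(lineage(a), lineage(b)):
        if x != y:
            break
        last_common = x
    return last_common
```

In this toy tree, `lca(9606, 9598)` walks both lineages from the root and stops at the last shared node, Homininae (207598). A production tool would load the full tree from the NCBI taxonomy dump rather than a hard-coded map.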



Full-text PDF download:

http://www.jgenetgenomics.org/article/exportPdf?id=93d8d894-08e4-4230-99f7-770d1c706bed&language=en