Annotating cell types is a critical step in single-cell RNA sequencing (scRNA-seq) data analysis. Some supervised or semi-supervised classification methods have recently emerged to enable automated cell type identification. However, comprehensive evaluations of these methods are lacking. Moreover, it is not clear whether some classification methods originally designed for analyzing other bulk omics data are adaptable to scRNA-seq analysis. In this study, we evaluated ten cell type annotation methods publicly available as R packages. Eight of them are popular methods developed specifically for single-cell research, including Seurat, scmap, SingleR, CHETAH, SingleCellNet, scID, Garnett, and SCINA. The other two methods were repurposed from deconvoluting DNA methylation data, i.e., linear constrained projection (CP) and robust partial correlations (RPC). We conducted systematic comparisons on a wide variety of public scRNA-seq datasets as well as simulation data. We assessed the accuracy through intra-dataset and inter-dataset predictions; the robustness over practical challenges such as gene filtering, high similarity among cell types, and increased cell type classes; as well as the detection of rare and unknown cell types. Overall, methods such as Seurat, SingleR, CP, RPC, and SingleCellNet performed well, with Seurat being the best at annotating major cell types. Additionally, Seurat, SingleR, CP, and RPC were more robust against downsampling. However, Seurat did have a major drawback at predicting rare cell populations, and it was suboptimal at differentiating cell types highly similar to each other, compared to SingleR and RPC. All the code and data are available from https://github.com/qianhuiSenn/scRNA_cell_deconv_benchmark.
Evaluation of Cell Type Annotation R Packages on Single-cell RNA-seq Data