Question

Error in SingleR, no common genes between 'test' and 'ref'


Imtiaz ▴ 10


I downloaded the Seurat data used in the vignette at "https://satijalab.org/seurat/articles/conversion_vignette.html" and converted it to a SingleCellExperiment:

pbmc.sce <- as.SingleCellExperiment(pbmc)

I am trying to annotate this dataset by calling SingleR() with HumanPrimaryCellAtlasData as the reference:

ref.data <- HumanPrimaryCellAtlasData(ensembl=TRUE)
predictions <- SingleR(test=pbmc.sce, assay.type.test=1,  ref=ref.data, labels=ref.data$label.main)

Error in SingleR(test = pbmc.sce, assay.type.test = 1, ref = ref.data, : no common genes between 'test' and 'ref'

I checked, and the row names are different: for pbmc.sce they are gene symbols, and for ref.data they are Ensembl gene IDs. Does anyone have any suggestions?
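The mismatch can be confirmed by intersecting the two sets of row names. The following is a minimal sketch using toy identifiers; `test_genes` and `ref_genes` are hypothetical stand-ins for `rownames(pbmc.sce)` and `rownames(ref.data)`, not the real data:

```r
# Hypothetical row names standing in for rownames(pbmc.sce) and rownames(ref.data)
test_genes <- c("GENE_A", "GENE_B", "GENE_C")   # symbol-style names
ref_genes  <- c("ENSG_A", "ENSG_B", "ENSG_C")   # Ensembl-style names

# The error message says SingleR found no common genes between 'test' and
# 'ref'; an empty intersection of the row names is the same condition
common <- intersect(test_genes, ref_genes)
print(length(common))
```

On the real objects, the equivalent check would be `length(intersect(rownames(pbmc.sce), rownames(ref.data)))`.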

sessionInfo()

R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
 [1] org.Hs.eg.db_3.13.0 pbmc3k.SeuratData_3.1.4 SeuratData_0.2.1 SeuratDisk_0.0.0.9019
 [5] loomR_0.2.1.9000 hdf5r_1.3.3 R6_2.5.0 cowplot_1.1.1
 [9] EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.17.0 AnnotationFilter_1.17.0 GenomicFeatures_1.45.0
[13] AnnotationDbi_1.55.0 DropletUtils_1.13.1 DropletTestFiles_1.3.0 SingleR_1.7.0
[17] scRNAseq_2.7.1 celldex_1.3.0 uwot_0.1.10 scran_1.20.1
[21] scater_1.21.0 scuttle_1.3.0 SingleCellExperiment_1.15.1 SummarizedExperiment_1.23.0
[25] Biobase_2.53.0 GenomicRanges_1.45.0 GenomeInfoDb_1.29.0 IRanges_2.27.0
[29] S4Vectors_0.31.0 BiocGenerics_0.39.0 MatrixGenerics_1.5.0 matrixStats_0.59.0
[33] patchwork_1.1.1 SeuratObject_4.0.2 Seurat_4.0.3 dplyr_1.0.7
[37] ggplot2_3.3.4 devtools_2.4.2 usethis_2.0.1 Matrix_1.3-3

SingleR • 5.9k views

4.6 years ago • updated 4.5 years ago • Imtiaz ▴ 10


Thank you for your comment. I tried to convert the gene symbols to Ensembl gene IDs with these commands:

rownames(pbmc.sce) <- uniquifyFeatureNames(rowData(pbmc.sce)$ID, rowData(pbmc.sce)$Symbol)

location <- mapIds(EnsDb.Hsapiens.v86, keys=rowData(pbmc.sce)$ID,column="SEQNAME", keytype="GENEID")

I got the following error message:

Error in `rownames<-`(`*tmp*`, value = c(NA_character_, NA_character_, : missing values not allowed in rownames

4.6 years ago • Imtiaz ▴ 10

score 1 · Answer 1 · 2021-06-24

Hi Imtiaz,

The code below should give you the dataset with Ensembl gene IDs instead of HGNC gene symbols. You can then try SingleR again.

library(dplyr)
library(Seurat)
library(patchwork)

pbmc.data <- Read10X(data.dir = 'filtered_gene_bc_matrices/hg19/')
pbmc <- CreateSeuratObject(
  counts = pbmc.data,
  project = 'pbmc3k',
  min.cells = 3,
  min.features = 200)
pbmc.sce <- as.SingleCellExperiment(pbmc)

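library(EnsDb.Hsapiens.v86)
# Map each gene symbol (the current row names) to an Ensembl gene ID
ens <- mapIds(EnsDb.Hsapiens.v86,
  keys = rownames(pbmc.sce),
  column = 'GENEID',
  keytype = 'SYMBOL')
# Sanity check: the mapping comes back in the same order as the row names
all(rownames(pbmc.sce) == names(ens))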

keep <- !is.na(ens)
ens <- ens[keep]
pbmc.sce <- pbmc.sce[keep,]
rownames(pbmc.sce) <- ens

pbmc.sce
class: SingleCellExperiment 
dim: 13132 2700 
metadata(0):
assays(2): counts logcounts
rownames(13132): ENSG00000228463 ENSG00000228327 ... ENSG00000273748
  ENSG00000278384
rowData names(0):
colnames(2700): AAACATACAACCAC-1 AAACATTGAGCTAC-1 ... TTTGCATGAGAGGC-1
  TTTGCATGCCTCAC-1
colData names(4): orig.ident nCount_RNA nFeature_RNA ident
reducedDimNames(0):
altExpNames(0):

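The four lines from `keep <- !is.na(ens)` to `rownames(pbmc.sce) <- ens` ensure that any failed mappings from HGNC symbol --> Ensembl gene ID are removed before the row names are replaced. This matters because row names cannot contain missing values, and a symbol with no match in the annotation database maps to NA.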
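The filtering step can be tried in isolation on a toy example. This is only a sketch in base R: the gene names, the Ensembl-style IDs and the `counts` matrix are hypothetical stand-ins for the output of `mapIds()` and for `pbmc.sce`.

```r
# Toy stand-in for the named vector returned by mapIds(): names are gene
# symbols, values are Ensembl-style IDs, and NA marks a failed mapping.
# (Hypothetical values, for illustration only.)
ens <- c(GENE_A = "ENSG_A", GENE_B = NA, GENE_C = "ENSG_C")

# Toy count matrix whose row names are the same gene symbols
counts <- matrix(1:6, nrow = 3,
                 dimnames = list(names(ens), c("cell1", "cell2")))

keep <- !is.na(ens)                      # TRUE where a symbol was mapped
ens <- ens[keep]                         # drop the failed mappings
counts <- counts[keep, , drop = FALSE]   # drop the same rows from the data
rownames(counts) <- unname(ens)          # row names are now IDs, none NA

print(rownames(counts))
```

Because `ens` and the rows are subset with the same logical vector, each remaining row stays paired with its own ID.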

Kevin