Error in SingleR, no common genes between 'test' and 'ref'
Imtiaz • 0
@cd8968cc
Last seen 5 weeks ago
Finland

I downloaded the Seurat data used in this vignette: "https://satijalab.org/seurat/articles/conversion_vignette.html"

pbmc.sce <- as.SingleCellExperiment(pbmc)

I am trying to annotate the cells by calling SingleR() on this dataset with the HumanPrimaryCellAtlasData reference dataset.

ref.data <- HumanPrimaryCellAtlasData(ensembl=TRUE)
predictions <- SingleR(test=pbmc.sce, assay.type.test=1,  ref=ref.data, labels=ref.data$label.main)

Error in SingleR(test = pbmc.sce, assay.type.test = 1, ref = ref.data, : no common genes between 'test' and 'ref'

I checked, and the row names are different: pbmc.sce uses gene symbols, while ref.data uses Ensembl gene IDs. Does anyone have any suggestions?

 sessionInfo()

R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] org.Hs.eg.db_3.13.0 pbmc3k.SeuratData_3.1.4 SeuratData_0.2.1 SeuratDisk_0.0.0.9019
[5] loomR_0.2.1.9000 hdf5r_1.3.3 R6_2.5.0 cowplot_1.1.1
[9] EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.17.0 AnnotationFilter_1.17.0 GenomicFeatures_1.45.0
[13] AnnotationDbi_1.55.0 DropletUtils_1.13.1 DropletTestFiles_1.3.0 SingleR_1.7.0
[17] scRNAseq_2.7.1 celldex_1.3.0 uwot_0.1.10 scran_1.20.1
[21] scater_1.21.0 scuttle_1.3.0 SingleCellExperiment_1.15.1 SummarizedExperiment_1.23.0
[25] Biobase_2.53.0 GenomicRanges_1.45.0 GenomeInfoDb_1.29.0 IRanges_2.27.0
[29] S4Vectors_0.31.0 BiocGenerics_0.39.0 MatrixGenerics_1.5.0 matrixStats_0.59.0
[33] patchwork_1.1.1 SeuratObject_4.0.2 Seurat_4.0.3 dplyr_1.0.7
[37] ggplot2_3.3.4 devtools_2.4.2 usethis_2.0.1 Matrix_1.3-3

SingleR • 125 views

Thank you for your comment. I tried to convert the gene symbols to Ensembl gene IDs with these commands:

rownames(pbmc.sce) <- uniquifyFeatureNames(rowData(pbmc.sce)$ID, rowData(pbmc.sce)$Symbol)

location <- mapIds(EnsDb.Hsapiens.v86, keys=rowData(pbmc.sce)$ID,column="SEQNAME", keytype="GENEID")  

I got the following error message:

Error in `rownames<-`(`*tmp*`, value = c(NA_character_, NA_character_, : missing values not allowed in rownames 
@kevin
Last seen 3 minutes ago
Republic of Ireland

Hi Imtiaz,

This should give you the dataset with Ensembl gene IDs, not HGNC gene symbols. Then you can try again with SingleR.

library(dplyr)
library(Seurat)
library(patchwork)

pbmc.data <- Read10X(data.dir = 'filtered_gene_bc_matrices/hg19/')
pbmc <- CreateSeuratObject(
  counts = pbmc.data,
  project = 'pbmc3k',
  min.cells = 3,
  min.features = 200)
pbmc.sce <- as.SingleCellExperiment(pbmc)

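library(EnsDb.Hsapiens.v86)

# Map each HGNC symbol to an Ensembl gene ID (NA where no mapping is found)
ens <- mapIds(EnsDb.Hsapiens.v86,
  keys = rownames(pbmc.sce),
  column = 'GENEID',
  keytype = 'SYMBOL')
# Sanity check: the mapping is in the same order as the rows
all(rownames(pbmc.sce) == names(ens))

keep <- !is.na(ens)
ens <- ens[keep]
pbmc.sce <- pbmc.sce[keep,]
rownames(pbmc.sce) <- ens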

pbmc.sce
class: SingleCellExperiment 
dim: 13132 2700 
metadata(0):
assays(2): counts logcounts
rownames(13132): ENSG00000228463 ENSG00000228327 ... ENSG00000273748
  ENSG00000278384
rowData names(0):
colnames(2700): AAACATACAACCAC-1 AAACATTGAGCTAC-1 ... TTTGCATGAGAGGC-1
  TTTGCATGCCTCAC-1
colData names(4): orig.ident nCount_RNA nFeature_RNA ident
reducedDimNames(0):
altExpNames(0):

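The four lines starting at `keep <- !is.na(ens)` drop every gene whose HGNC symbol could not be mapped to an Ensembl gene ID, and then set the row names of the remaining genes to those IDs. This also avoids the "missing values not allowed in rownames" error.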
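If it helps to see the filtering step in isolation, here is a minimal base R sketch on toy data. The gene symbols, the `ENSG_TOY_*` IDs, and the `ref_genes` vector are all made up for illustration; they stand in for the output of `mapIds()` and for `rownames(ref.data)`, and a plain matrix stands in for the SingleCellExperiment.

```r
# Toy stand-in for the mapIds() result: names are gene symbols,
# values are (made-up) Ensembl IDs, NA where the mapping failed.
ens <- c(GENE_A = "ENSG_TOY_1", GENE_B = NA, GENE_C = "ENSG_TOY_3")

# Toy stand-in for the expression data, with rows named by gene symbol.
counts <- matrix(1:6, nrow = 3,
  dimnames = list(names(ens), c("cell1", "cell2")))

# Same logic as in the answer: drop unmapped genes, then rename the rows.
keep <- !is.na(ens)
ens <- ens[keep]
counts <- counts[keep, , drop = FALSE]
rownames(counts) <- unname(ens)

# Hypothetical reference row names, used only to check the overlap.
ref_genes <- c("ENSG_TOY_1", "ENSG_TOY_3", "ENSG_TOY_9")
common <- intersect(rownames(counts), ref_genes)
length(common)
```

On the real objects, the equivalent check would be `length(intersect(rownames(pbmc.sce), rownames(ref.data)))`. If that is greater than zero, SingleR should no longer report "no common genes".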

Kevin
