Hi all,
I am new to R and SingleCellExperiment.
I am currently working through the following tutorial (Chapter 6):
https://osca.bioconductor.org/ but it is not working! Here is what I typed in the RStudio console:
BiocManager::install("scRNAseq")
library(scRNAseq)
sce.416b <- LunSpikeInData(which="416b")
library(AnnotationHub)
ens.mm.v97 <- AnnotationHub()[["AH73905"]]
location <- mapIds(ens.mm.v97, keys=rownames(sce.416b),
keytype="GENEID", column="SEQNAME")
is.mito <- which(location=="MT")
but I get the following error:
> sce.416b <- LunSpikeInData(which="416b")
snapshotDate(): 2019-10-22
see ?scRNAseq and browseVignettes('scRNAseq') for documentation
loading from cache
see ?scRNAseq and browseVignettes('scRNAseq') for documentation
loading from cache
see ?scRNAseq and browseVignettes('scRNAseq') for documentation
loading from cache
>
> library(AnnotationHub)
> ens.mm.v97 <- AnnotationHub()[["AH73905"]]
snapshotDate(): 2019-10-29
loading from cache
require(“ensembldb”)
Error: failed to load resource
name: AH73905
title: Ensembl 97 EnsDb for Mus musculus
reason: require(“ensembldb”) failed: use BiocManager::install() to install package?
In addition: Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
there is no package called ‘ensembldb’
> location <- mapIds(ens.mm.v97, keys=rownames(sce.416b),
+ keytype="GENEID", column="SEQNAME")
Error in mapIds(ens.mm.v97, keys = rownames(sce.416b), keytype = "GENEID", :
could not find function "mapIds"
> is.mito <- which(location=="MT")
Error in which(location == "MT") : object 'location' not found
Can anyone tell me why I am unable to follow this tutorial?
Also, can someone tell me what the heck this code is actually doing:
library(AnnotationHub)
ens.mm.v97 <- AnnotationHub()[["AH73905"]]
location <- mapIds(ens.mm.v97, keys=rownames(sce.416b),
keytype="GENEID", column="SEQNAME")
is.mito <- which(location=="MT")
What is "AH73905", what is "ens.mm.v97", and how exactly is this finding mitochondrial genes in sce.416b?
Any help with this would be really appreciated.
Kind regards, Linford
OK, so I have followed the error messages and realised I needed to install additional packages:
Now it appears to work, but I am getting an error message:
Which I assume is not such a big issue?
However, if someone could explain exactly how this code is identifying mitochondrial transcripts I would be very grateful.
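For anyone hitting the same "failed to load resource" error, here is a minimal sketch of checking which packages are missing before installing them. The helper name missing_packages is hypothetical (it is not part of any package). The package list is an assumption based on the error output above: ensembldb is named there, and mapIds() is a generic from AnnotationDbi.

```r
# Hypothetical helper (not part of any package): return the packages
# that cannot be loaded from the current R library.
missing_packages <- function(pkgs) {
    installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
    pkgs[!installed]
}

# ensembldb is named in the error message; AnnotationDbi provides mapIds().
to.install <- missing_packages(c("ensembldb", "AnnotationDbi"))
if (length(to.install) > 0) {
    message("Missing: ", paste(to.install, collapse = ", "))
    # Uncomment to install the missing packages from Bioconductor:
    # BiocManager::install(to.install)
}
```

After installing, restart the R session and rerun the tutorial code from the top so that ens.mm.v97 and location are actually created.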
I am glad you have solved the first problem.
I think what you are seeing is only a warning message, not an error, and it does not matter. This message is common. It appears because 563 of the IDs cannot be mapped to the reference annotation dataset ens.mm.v97.
The gene annotation looks like this (the printout below is a GRangesList of gene coordinates; location itself holds only the seqnames value for each gene ID):
GRangesList object of length 46604:
$ENSMUSG00000102693
GRanges object with 1 range and 0 metadata columns:
      seqnames          ranges strand
         <Rle>       <IRanges>  <Rle>
  [1]        1 3073253-3074322      +
  seqinfo: 61 sequences from GRCm38 genome
$ENSMUSG00000064842
GRanges object with 1 range and 0 metadata columns:
      seqnames          ranges strand
         <Rle>       <IRanges>  <Rle>
  [1]        1 3102016-3102125      +
  seqinfo: 61 sequences from GRCm38 genome
$ENSMUSG00000051951
GRanges object with 1 range and 0 metadata columns:
      seqnames          ranges strand
         <Rle>       <IRanges>  <Rle>
  [1]        1 3205901-3671498      -
  seqinfo: 61 sequences from GRCm38 genome
...
<46601 more elements>
The seqnames column is the chromosome name. Genes located on the mitochondrial chromosome have the seqnames value "MT", and that is what defines the MT genes.
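To make the logic concrete: according to the error output in the question, AH73905 is the AnnotationHub record titled "Ensembl 97 EnsDb for Mus musculus", and ens.mm.v97 is simply the variable that holds it. mapIds() then looks up the chromosome (SEQNAME) for each gene ID (GENEID), and which() picks out the positions where that chromosome is "MT". The sketch below imitates this with a small made-up lookup table in base R. The gene names and chromosome assignments are invented for illustration and are not taken from the real annotation.

```r
# Toy stand-in for the EnsDb annotation: gene ID -> chromosome (SEQNAME).
# These three mappings are made up for illustration only.
toy.annotation <- c(geneA = "1", geneB = "MT", geneC = "X")

# Row names of a pretend dataset; "geneZ" is absent from the annotation.
toy.rownames <- c("geneA", "geneB", "geneZ", "geneC")

# Similar in spirit to mapIds(..., keytype="GENEID", column="SEQNAME"):
# look up each row name and get NA where there is no match.
location <- unname(toy.annotation[toy.rownames])
names(location) <- toy.rownames

# IDs without an annotation entry (the cause of the mapping warning).
n.unmapped <- sum(is.na(location))

# Row indices of the mitochondrial genes; which() ignores the NA entries.
is.mito <- which(location == "MT")
```

Because which() skips NA comparisons, unmapped IDs are never counted as mitochondrial, which is why the warning about unmapped IDs is harmless for this step.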
library(scRNAseq)
sce.416b <- LunSpikeInData(which="416b")
sce.416b$block <- factor(sce.416b$block)
If the above does not work, you can build the test dataset yourself as follows.
Data required: 1. colData 2. edb 3. rowData 4. sceData 5. spikein
1. colData download: http://s3.amazonaws.com/experimenthub/scRNAseq/lun-spikein/2.0.0/coldata-416b.rds
colData <- readRDS("d:/analysis-learn/mms_mt_analysis/coldata-416b.rds")
2. edb download:
BiocManager::install("EnsDb.Mmusculus.v79")  # note: the EnsDb.Mmusculus.v97 data is not used here
library(EnsDb.Mmusculus.v79)
edb <- EnsDb.Mmusculus.v79
3. rowData download: http://s3.amazonaws.com/experimenthub/scRNAseq/lun-spikein/2.0.0/rowdata-416b.rds
rowData <- readRDS("d:/analysis-learn/mms_mt_analysis/rowdata-416b.rds")
4. sceData download: http://s3.amazonaws.com/experimenthub/scRNAseq/lun-spikein/2.0.0/counts-416b.rds
sceData <- readRDS("d:/analysis-learn/mms_mt_analysis/counts-416b.rds")
5. spikein download:
Here I used the method from the tutorial directly and saved the result as a text file of spike-in data.
spikedata <- read.table("spikedata.txt")
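Build the dataset used for testing:
library(SingleCellExperiment)  # provides SingleCellExperiment(), splitAltExps() and altExp()
sce <- SingleCellExperiment(assays = list(counts = sceData), colData = colData, rowData = rowData)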
spike.type <- rep("endogenous", nrow(sce))
spike.type[grep("ERCC", rownames(sce))] <- "ERCC"
spike.type[grep("SIRV", rownames(sce))] <- "SIRV"
sce <- splitAltExps(sce, spike.type, ref="endogenous")
spike.exp <- altExp(sce, "ERCC")
spikedata <- spikedata[rownames(spike.exp), ]
rowData(spike.exp) <- cbind(rowData(spike.exp), spikedata)
altExp(sce, "ERCC") <- spike.exp
ginfo <- genes(edb, columns=character(0))
mcols(ginfo) <- NULL
m <- match(rownames(sce), names(ginfo))
present <- !is.na(m)
if (!all(present)) {
replacement <- rowRanges(sce)
mcols(replacement) <- NULL
ginfo <- as(ginfo, "GRangesList")
replacement[present] <- ginfo[m[present]]
} else {
replacement <- ginfo[m]
}
mcols(replacement) <- rowData(sce)
rowRanges(sce) <- replacement
sce.416b <- sce
At this point the sce.416b dataset is very similar to the data in the example, and you can continue with the subsequent steps.
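The spike-in labelling step in the code above (the spike.type lines) can be tried on its own with a few toy row names, which may help when checking that grep() catches the right rows. The row names below are examples chosen only to show the naming pattern; they are not read from the real counts matrix.

```r
# Toy row names: two gene-style IDs plus one ERCC-style and one SIRV-style name.
toy.rownames <- c("ENSMUSG00000102693", "ERCC-00002", "SIRV1", "ENSMUSG00000064842")

# Start with every row labelled as endogenous, then relabel the spike-ins
# by pattern-matching on the row names, as in the code above.
spike.type <- rep("endogenous", length(toy.rownames))
spike.type[grep("ERCC", toy.rownames)] <- "ERCC"
spike.type[grep("SIRV", toy.rownames)] <- "SIRV"

# Tabulate how many rows fall into each category.
print(table(spike.type))
```

splitAltExps() then uses a vector like spike.type to move the ERCC and SIRV rows into alternative experiments, leaving the rows labelled "endogenous" in the main object.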
If you have any questions, please send an e-mail to 1498554536@qq.com
Best wishes!