Question

How to resolve these trendVar() warnings and errors

0

Entering edit mode

Abhishek Singh ▴ 20

@abhishek-singh-4725

Last seen 22 months ago

France

Dear community,

I am trying to compute variance for my single cell experiment using scater and other related packages. I am getting warnnings and I am unable to resolve them.

what I am doing is:

> loadSCE <- function(path){
     sce <- scater::read10XResults(path)
     #sce <- normalize(sce) # Data normalization based on scran
     mitochondrialGenes <- as.character(rowData(sce)[startsWith(rowData(sce)$symbol, "MT-"),]$id)
     isSpike(sce, "MT") <- rownames(sce) %in% mitochondrialGenes
     sce <- calculateQCMetrics(sce, 
                               feature_controls = list(
                                   MT =  isSpike(sce, "MT")
                               ))
 }
> # Read-in expression sample scRNA matrix from directory
> paths <- list.dirs(path = "/ap/B/SampleData/TestData", recursive = FALSE)
> for (i in 1:length(paths))
     assign(paste0("sce_",i), loadSCE(paths[i]))
> sce=0
> for (i in 1:length(paths))
     sce[i]<-print(noquote(paste0("sce_",i)))
[1] sce_1
> t_list <- list()
> t_list <- mget(ls(pattern="sce_\\d+"))
> for(i in seq_along(t_list))
 {
     metadata(t_list[[i]])["name"] <- paste0("iMates-",i)
 }
> 
> sce_1
class: SingleCellExperiment 
dim: 33694 5586 
metadata(0):
assays(1): counts
rownames(33694): ENSG00000243485 ENSG00000237613 ... ENSG00000277475 ENSG00000268674
rowData names(11): id symbol ... total_counts log10_total_counts
colnames(5586): AAACCTGAGAAGGTTT-1 AAACCTGAGCGTTCCG-1 ... TTTGTCATCGTCTGCT-1 TTTGTCATCGTTGCCT-1
colData names(30): dataset barcode ... pct_counts_MT is_cell_control
reducedDimNames(0):
spikeNames(1): MT
> sce_1=computeSumFactors((sce_1))
Warning message:
In .computeSumFactors(assay(x, i = assay.type), subset.row = subset.row,  :
  encountered negative size factor estimates
> sce_1
class: SingleCellExperiment 
dim: 33694 5586 
metadata(0):
assays(1): counts
rownames(33694): ENSG00000243485 ENSG00000237613 ... ENSG00000277475 ENSG00000268674
rowData names(11): id symbol ... total_counts log10_total_counts
colnames(5586): AAACCTGAGAAGGTTT-1 AAACCTGAGCGTTCCG-1 ... TTTGTCATCGTCTGCT-1 TTTGTCATCGTTGCCT-1
colData names(30): dataset barcode ... pct_counts_MT is_cell_control
reducedDimNames(0):
spikeNames(1): MT
> sce_1 <- computeSpikeFactors(sce_1, general.use=FALSE)
Warning message:
In .local(x, ...) : zero spike-in counts during spike-in normalization
> sce_1
class: SingleCellExperiment 
dim: 33694 5586 
metadata(0):
assays(1): counts
rownames(33694): ENSG00000243485 ENSG00000237613 ... ENSG00000277475 ENSG00000268674
rowData names(11): id symbol ... total_counts log10_total_counts
colnames(5586): AAACCTGAGAAGGTTT-1 AAACCTGAGCGTTCCG-1 ... TTTGTCATCGTCTGCT-1 TTTGTCATCGTTGCCT-1
colData names(30): dataset barcode ... pct_counts_MT is_cell_control
reducedDimNames(0):
spikeNames(1): MT
> sce_1<-normalise(sce_1)
> sce_1
class: SingleCellExperiment 
dim: 33694 5586 
metadata(1): log.exprs.offset
assays(2): counts logcounts
rownames(33694): ENSG00000243485 ENSG00000237613 ... ENSG00000277475 ENSG00000268674
rowData names(11): id symbol ... total_counts log10_total_counts
colnames(5586): AAACCTGAGAAGGTTT-1 AAACCTGAGCGTTCCG-1 ... TTTGTCATCGTCTGCT-1 TTTGTCATCGTTGCCT-1
colData names(30): dataset barcode ... pct_counts_MT is_cell_control
reducedDimNames(0):
spikeNames(1): MT
> fit<-trendVar(sce_1)
Warning messages:
1: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  pseudoinverse used at 2.3884
2: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  neighborhood radius 0.063295
3: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  reciprocal condition number  0
4: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  There are other near singularities as well. 0.019243
5: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  pseudoinverse used at 0.38698
6: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  neighborhood radius 1.7339
7: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  reciprocal condition number  0
8: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
  There are other near singularities as well. 0.0038173
> fit$mean
ENSG00000198888 ENSG00000198763 ENSG00000198804 ENSG00000198712 ENSG00000228253 ENSG00000198899 ENSG00000198938 
     2.45015465      3.05138106      2.38837043      2.99489882      0.03735962      2.45166525      3.26575853 
ENSG00000198840 ENSG00000212907 ENSG00000198886 ENSG00000198786 ENSG00000198695 ENSG00000198727 
     2.12090336      0.40130252      2.86793385      1.10383346      0.04606978      2.25962306 
> fit$var
ENSG00000198888 ENSG00000198763 ENSG00000198804 ENSG00000198712 ENSG00000228253 ENSG00000198899 ENSG00000198938 
     1.20619175      1.02354288      2.12121326      1.20826964      0.06286058      1.13370346      1.16162907 
ENSG00000198840 ENSG00000212907 ENSG00000198886 ENSG00000198786 ENSG00000198695 ENSG00000198727 
     1.26598710      0.55259326      1.01916726      1.14862181      0.06191704      1.22625394 
> fit$trend
function (x) 
{
    output <- SUBFUN(x) * f.scale
    names(output) <- names(x)
    return(output)
}
<bytecode: 0x146de820>
<environment: 0x210dbb88>

To me all seems fine, may be I am missing something.

These errors turn into errors when I run these Rmarkdown.

The error that I get is :

Quitting from lines 70-78 (tp.Rmd)
Error in if (abs(mean(sf) - centre) > tol) { :
  missing value where TRUE/FALSE needed
Calls: <Anonymous> ... .local -> .check_centered_SF -> areSizeFactorsCentred
Execution halted

Any suggestions/solutions to make this code work will be of great help.

thank you

scater scran singlecellexperiment • 1.6k views

ADD COMMENT • link updated 7.0 years ago by Aaron Lun ★ 29k • written 7.0 years ago by Abhishek Singh ▴ 20

score 1 · Answer 1 · 2018-12-14

1

Entering edit mode

Aaron Lun ★ 29k

@alun

Last seen 8 hours ago

The city by the bay

The immediate problem is that you will have spike-in size factors of zero for some cells - see the warning produced by computeSpikeFactors. This results in NA normalized expression values that should immediately break trendVar, so I don't see how you were able to get any return value at all.
trendVar cannot use endogenous genes without being instructed to do so with use.spikes=FALSE. So, I don't see how you managed to get statistics for Ensembl genes from your trendVar call.
Mitochondrial genes are not spike-ins. If you don't have actual spike-ins, consider using makeTechTrend for UMI data.
scater hasn't had the read10XResults function for a long time, use DropletUtils::read10xCounts instead.
You are probably not using the latest version of scran, so the negative size factors from computeSumFactors will also lead to NA expression values (the latest version will automatically try to remove negative size factors for you).

I suggest updating to R 3.5.* and using the latest version of all packages; the reported behaviour of trendVar doesn't make sense.

ADD COMMENT • link 7.0 years ago Aaron Lun ★ 29k

0

Entering edit mode

Hi Aaron,

Thank you for your reply. May I ask you one additional question,How can I put gene names and not gene IDs while reading and processing the files?

presently, everything is based on gene IDs, however, when I want to see the top 50 genes or so it is better to have gene names/symbols than gene IDs.

Thank you

ADD REPLY • link 7.0 years ago Abhishek Singh ▴ 20

0

Entering edit mode

If you're talking about converting from Ensembl IDs to gene names, see an example here. If you're talking about renaming the rows of the SingleCellExperiment object, use uniquifyFeatureNames from the scater package, as described here.

ADD REPLY • link 7.0 years ago Aaron Lun ★ 29k