scran::trendVar throwing error
2
0
Entering edit mode
@zhalehsafikhani-14566
Last seen 6.4 years ago

Hello,

I want to use the trendVar function in screen packs to get the highly variant genes across my single cell samples.
But for some reason it is not working for me and throwing the following error:

Error in .trend_var(assay(x, i = assay.type), ..., subset.row = subset.row) :
 need at least 4 values for non-linear curve fitting

I think it throws the error at lout <- .Call(cxx_fit_linear_model, QR$qr, QR$qraux, x, subset.row - 1L, FALSE)

Apparently doesn’t keep enough genes for some reason.

This is how I use it though:
 

dim(cells.counts)

[1] 18574  3483

sce <- SingleCellExperiment(list(counts=cells.counts))
sce <- scran::computeSumFactors(sce)
summary(sizeFactors(sce))

Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.08161 0.61558 0.89761 1.00000 1.28977 3.65026

quantile(logcounts(sce))

       0%        25%        50%        75%       100%
0.0000000  0.0000000  0.0000000  0.7415879 13.7971704

fit <- trendVar(sce, parametric=TRUE)

Error in .trend_var(assay(x, i = assay.type), ..., subset.row = subset.row) :
 need at least 4 values for non-linear curve fitting


I put the dimension of the data and the range of the values and sum factors for you in case it might help figuring out the issue.
Can you please help figure out what is going wrong here?

scran • 1.5k views
ADD COMMENT
0
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 13 minutes ago
The city by the bay

Firstly, I assume that you are using the latest versions of the relevant packages (see the posting guide); you did some quality control on the cells; and you ran scater::normalize at some point.

Secondly, a cursory examination of the scran:::.trend_var code will reveal the true origin of the error:

    if (parametric) {
        if (length(kept.vars) <= 3L) {
            stop("need at least 4 values for non-linear curve fitting")
        }

Finally, if you look at ?trendVar, you will see a min.mean argument that specifies the minimum mean log-count for genes to be used for trend fitting. I had set this to 0.1, which I thought was low enough to never cause anyone problems. I am surprised that fewer than four genes in your data set had a rowMeans(logcounts(sce)) above 0.1, these does not seem likely even for sparse droplet data. Nonetheless, you can set min.mean=0 to turn off mean filtering completely.

ADD COMMENT
0
Entering edit mode
@zhalehsafikhani-14566
Last seen 6.4 years ago

Hi Aaron,

Yes I use the latest versions:

[1] scran_1.6.5                scater_1.6.1               SingleCellExperiment_1.0.0

And yes I was doing sce <- scater::normalize(sce) right before calling quantile on logcounts(sce), apparently was dropped in my copy and paste.

I don't think the the threshold for gene expressions is an issue though as I should have enough by 0.1:

quantile(rowMeans(logcounts(sce)))
          0%          25%          50%          75%         100% 
0.0006372668 0.0148420739 0.1515419067 0.6003230574 9.4099977354 
> length(which(rowMeans(logcounts(sce)) > 0.1))
[1] 10298

Thanks for your help.

ADD COMMENT
0
Entering edit mode

Firstly, post replies to existing answers in "add comment", rather than making a new answer.

Secondly; well, I don't have any idea what's happening here. trendVar will automatically remove genes with zero variance and low mean, but it doesn't seem like the mean threshold is the problem. On the other hand, I can't imagine that your genes have zero variance across all cells, unless you're feeding trendVar an artificial matrix of some sort?

In any case, if you want to resolve this problem, you can either send me an anonymized copy of the data for me to look at, or you can try debugging it yourself. Run the following lines of code:

debug(scran:::.trend_var)
out <- scran:::.trend_var(sce, parametric=TRUE)

... step through each line of the function, and see what summary(vars > 1e-08) is after its creation. Also check out what subset.row is, it should be something like 1:10298 after its reassignment.

ADD REPLY
0
Entering edit mode

I shared the data with you on Google drive Aaron. I would appreciate if you take a look to see if you find anything odd with the data which makes it doesnt fit trendVar function criteria.

ADD REPLY
0
Entering edit mode

I realized what it was - you need to set use.spikes=FALSE, otherwise it will try to use the spike-in transcripts to fit the curve. You don't have any because you're working with 10X data, hence the error. This should have been obvious to me, my apologies; see also "Additional notes on gene selection" in ?trendVar.

More generally, note that you can store the count matrix as a sparse matrix, there's no need to coerce to a matrix any more. Also, I would suggest fiddling around with the trend fitting parameters, the following gives me a nice trend:

out <- trendVar(sce, parametric=TRUE, method="spline", df=10, use.spikes=FALSE)
plot(out$mean, out$var)
curve(out$trend(x), col="red", add=TRUE)

The next release will allow even more flexibility in controlling the trend fitting, so you can get even better fits.

ADD REPLY
0
Entering edit mode

Oh great it works now! It should have been obvious to me too! However, it was my first attempt to use scran. Many thanks for your help Aaron!

ADD REPLY

Login before adding your answer.

Traffic: 676 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6