VST failing: Error in estimateDispersionsFit(object, quiet = TRUE, fitType)
Mia ▴ 10
@mia-24145
Last seen 11 months ago

Hi all!

First post here ever! Never thought I'd be confused enough and not find an answer on the internet for my problem, lol. Anyway!

I am using a public dataset of brain cancer samples and want to create a heatmap of sample-to-sample distances, following: http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#heatmap-of-the-sample-to-sample-distances

My code is the following:


de = DESeqDataSetFromMatrix(countData = exprs(recountDataCombat),
colData = pData(recountDataCombat),
design = formula)

de <- estimateSizeFactors(de)
de <- estimateDispersionsGeneEst(de)
dispersions(de) <- mcols(de)$dispGeneEst
glm_all_nb_combat <- nbinomWaldTest(de)
res <- results(glm_all_nb_combat, name=resultsNames(glm_all_nb_combat)[2])
.myMAPlot(res, name=title_figure_2)

# Sample Distances
dds <- glm_all_nb_combat
vsd <- vst(dds, blind=FALSE)

But I get the following error:

Error in estimateDispersionsFit(object, quiet = TRUE, fitType) :
  all gene-wise dispersion estimates are within 2 orders of magnitude
  from the minimum value, and so the standard curve fitting techniques will not work.
  One can instead use the gene-wise estimates as final estimates:
  dds <- estimateDispersionsGeneEst(dds)
  dispersions(dds) <- mcols(dds)$dispGeneEst
  ...then continue with testing using nbinomWaldTest or nbinomLRT


I am confused by this, since I did follow this advice in my code:

de <- estimateSizeFactors(de)
de <- estimateDispersionsGeneEst(de)
dispersions(de) <- mcols(de)$dispGeneEst
glm_all_nb_combat <- nbinomWaldTest(de)


I want to avoid rlog at all costs since I have around 600 samples. But my questions are:

Does anyone know a way around this?

If not, does anyone know where I can find the source code?

Is VST not how I should be normalizing these samples since the dispersion estimates are so low?

Thank you in advance! :D Mia Altieri

vst varianceStabilizingTransformation DESeq2 • 312 views
@mikelove
Last seen 3 hours ago
United States

The message above is saying that the data you are looking at is close to Poisson (it's not so easy to interpret, but basically all the dispersion estimates are close to 1e-8).
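
To make "close to Poisson" concrete, here is a minimal simulated sketch in base R. It is not the poster's data: the sample size, the mean, the dispersion of 0.2, and the helper `mom_dispersion` are all made up for illustration. Under a negative binomial model with Var = mu + alpha * mu^2, counts whose variance is roughly equal to their mean give a dispersion alpha near zero, which would be consistent with gene-wise estimates sitting near a floor such as 1e-8.

```r
# Hypothetical simulation, not the data from the question
set.seed(1)
n_samples <- 2000
mu <- 100

# Poisson counts: variance equals the mean
pois_counts <- rpois(n_samples, lambda = mu)

# Negative binomial counts with dispersion alpha = 0.2 (size = 1 / alpha)
nb_counts <- rnbinom(n_samples, mu = mu, size = 1 / 0.2)

# Method-of-moments dispersion, solving Var = mu + alpha * mu^2 for alpha
# (illustrative helper, not a DESeq2 function)
mom_dispersion <- function(x) {
  m <- mean(x)
  (var(x) - m) / m^2
}

alpha_pois <- mom_dispersion(pois_counts)
alpha_nb <- mom_dispersion(nb_counts)
print(c(poisson = alpha_pois, negative_binomial = alpha_nb))
```

The Poisson estimate should land close to zero while the negative binomial one lands near the simulated 0.2. When every gene looks like the first case, there is no mean-dispersion trend left to fit.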

Not sure why you may have near Poisson data, but in that case, the shifted logarithm is a good approach:

ldat <- normTransform(dds)
plotPCA(ldat)
...
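
For reference, a sketch of how the shifted logarithm could feed the sample-distance heatmap from the question. It assumes `normTransform()` is used with its defaults (log2 of normalized counts plus a pseudocount of 1), and it uses DESeq2's simulated `makeExampleDESeqDataSet()` as a stand-in for the real `dds`. The object names `manual`, `same`, `sampleDists`, and `sampleDistMatrix` are illustrative.

```r
library(DESeq2)

# Simulated stand-in for the real dataset (assumption: 1000 genes, 12 samples)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds <- estimateSizeFactors(dds)

# Shifted logarithm: log2(normalized count + 1); no dispersion trend is fitted
ldat <- normTransform(dds)

# The same transformation written out by hand, for comparison
manual <- log2(counts(dds, normalized = TRUE) + 1)
same <- isTRUE(all.equal(assay(ldat), manual, check.attributes = FALSE))

# Euclidean distances between samples (samples are columns, so transpose)
sampleDists <- dist(t(assay(ldat)))
sampleDistMatrix <- as.matrix(sampleDists)

# Optional heatmap, drawn only if the pheatmap package is installed
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(sampleDistMatrix,
                     clustering_distance_rows = sampleDists,
                     clustering_distance_cols = sampleDists)
}
```

Because this transformation only needs size factors, it should scale to several hundred samples without the dispersion fitting step that `vst()` runs internally.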


Yes! That worked! I see your posts all the time and I think the world of you, thank you so much for helping me!

And yeah, I am not sure why this is happening either. It's odd because it doesn't happen until after I run ComBat, so I want to compare with other batch correction methods.