Hi There,
I am trying to do a differential abundance analysis with microbiome sequencing data suing DESeq2 package, but I keep getting errors after trying different approaches within the package. The data is a 50 by 501 matrix with each row being a sample and the first column being the group indicator and the other 500 columns are sequencing reads for 500 taxa. No missing data in the data set. I can upload the data, but I don't know how to do that. Below is the R script for the analysis. Please help. Thanks a lot!
class(data) [1] "data.frame"
# make data in the format for DESeq2
countsData=t(data[,-1])+1
rownames(countsData)=colnames(data)[-1]
class(countsData)
[1] "matrix"
print(dim(countsData))
[1] 500 50
print(countsData[1:5,1:5]
[,1] [,2] [,3] [,4] [,5]
rawCount1 8 8 10 11 9
rawCount2 8 9 8 7 8
rawCount3 36 34 30 36 33
rawCount4 7 8 6 5 8
rawCount5 10 7 63 13 64
# get predictor data and annotate the predictor data
xData=as.data.frame(data[,1])
colnames(xData)=colnames(data)[1]
xData[,"x"]=as.factor(xData[,"x"])
class(xData)
[1] "data.frame"
print(dim(xData))
[1] 50 1
print(xData[1:5,])
[1] 0 0 1 0 1
Levels: 0 1
# prepare for analysis
dds <- DESeqDataSetFromMatrix(countData=countsData,colData=xData,design=~x)
# run analysis
suppressWarnings(runDeseq<- try(DESeq(dds, quiet = TRUE), silent = TRUE))
if (inherits(runDeseq, "try-error")) {
# If the parametric fit failed, try the local.
suppressWarnings(runDeseq<- try(DESeq(dds, fitType = "local", quiet = TRUE),
silent = TRUE))
if (inherits(runDeseq, "try-error")) {
# If local fails, try the mean
suppressWarnings(runDeseq<- try(DESeq(dds, fitType = "mean", quiet = TRUE),
silent = TRUE))
}
if (inherits(runDeseq, "try-error")) {
# If still bad, quit with error.
stop("DESeq1 fail")
}
}
When I dig into the error message, it says "simpleError in estimateDispersionsFit(object, fitType = fitType, quiet = quiet): all gene-wise dispersion estimates are within 2 orders of magnitude from the minimum value, and so the standard curve fitting techniques will not work. One can instead use the gene-wise estimates as final estimates: dds <- estimateDispersionsGeneEst(dds) dispersions(dds) <- mcols(dds)$dispGeneEst ...then continue with testing using nbinomWaldTest or nbinomLRT".
So I tried
dds <- DESeqDataSetFromMatrix(countData=countsData,colData=xData,design=~x)
dds <- estimateDispersionsGeneEst(dds)
but it gave me another error message: "Error in .local(object, ...) : first calculate size factors, add normalizationFactors, or set normalized=FALSE".
Here is some session info:
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)
packageVersion("DESeq2")
[1] ‘1.24.0’