DESeq2 estimateDispersionsGeneEst error
brijon • 0
@brijon-13253

Hi 

I'm having a lot of trouble applying DESeq2 to my metagenome gene abundance data; any advice on where I may be going wrong would be much appreciated. I'm early in my career, so sorry for my ignorance.

 

dds <- DESeqDataSetFromMatrix(countData = spc.matrix, colData = env, design = ~ Habitat)

I have tried running estimateSizeFactors like so:

dds <- estimateSizeFactors(dds, type = "iterate")

and get the following error:

Error in estimateDispersionsFit(object, fitType = fitType, quiet = quiet) : 
  all gene-wise dispersion estimates are within 2 orders of magnitude
  from the minimum value, and so the standard curve fitting techniques will not work.
  One can instead use the gene-wise estimates as final estimates:
  dds <- estimateDispersionsGeneEst(dds)
  dispersions(dds) <- mcols(dds)$dispGeneEst
  ...then continue with testing using nbinomWaldTest or nbinomLRT

I then tried running the estimateDispersionsGeneEst function as advised:

dds <- estimateDispersionsGeneEst(dds)

and get this error...

Error in .local(object, ...) :

  first calculate size factors, add normalizationFactors, or set normalized=FALSE

I tried to go back and reset the count data with the normalized=FALSE parameter:

countdat <- counts(dds, normalized = FALSE)

counts(dds) <- countdat

but I still get the same error when I reapply the estimateDispersionsGeneEst function.

Many Thanks,

Briony

@mikelove
Try the "poscounts" size factor estimator (type="poscounts" in estimateSizeFactors), available from version 1.16. This is recommended for metagenomics.
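
As a minimal sketch of this suggestion, the block below builds a small simulated count matrix in which every gene has at least one zero, then estimates size factors with type="poscounts". The matrix, the sample names and the Habitat levels ("soil", "water") are made up for illustration; with the data from the question you would pass spc.matrix and env instead. It assumes DESeq2 1.16 or later is installed.

```r
library(DESeq2)

set.seed(1)

# Hypothetical toy data: 200 genes x 6 samples of sparse, overdispersed counts.
toy_counts <- matrix(
  rnbinom(200 * 6, mu = 20, size = 0.5),
  nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("sample", 1:6))
)

# Force at least one zero into every row to mimic sparse metagenomic data.
for (i in seq_len(nrow(toy_counts))) {
  toy_counts[i, sample(6, 1)] <- 0L
}

# Hypothetical sample table with a two-level Habitat factor.
toy_env <- data.frame(
  Habitat = factor(rep(c("soil", "water"), each = 3)),
  row.names = colnames(toy_counts)
)

dds <- DESeqDataSetFromMatrix(countData = toy_counts,
                              colData = toy_env,
                              design = ~ Habitat)

# The default median-of-ratios estimator is expected to stop with an error
# here, because no gene has a usable geometric mean; capture the message
# instead of aborting the script.
default_try <- tryCatch(estimateSizeFactors(dds),
                        error = function(e) conditionMessage(e))

# "poscounts" uses a modified geometric mean over the non-zero counts,
# so rows containing zeros can still contribute.
dds <- estimateSizeFactors(dds, type = "poscounts")
sizeFactors(dds)
```

Once size factors are stored in dds, the "first calculate size factors" error from estimateDispersionsGeneEst should no longer apply. The same estimator can also be requested in the one-step wrapper with DESeq(dds, sfType = "poscounts").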

Thanks very much for your advice, Michael. Would you consider DESeq2 to be appropriate for environmental metagenomic samples, or would you suggest it's more appropriate for clinical metagenome samples with less extreme differences in functional profiles?

Cheers again, 

Briony


I know this is an unsatisfying answer, but the mileage from the NB methods depends on various properties of the dataset, and I don't analyze metagenomic data myself, so I'm hard pressed to come up with rules for when the NB methods would be outperformed by other specific software. I'd look at, e.g. MA plot and the top genes using plotCounts to make sure that the inference makes sense and isn't driven by individual samples too much. Also, you should probably set minReplicatesForReplace=Inf, as the outlier replacement will probably not be appropriate for the count distribution. And I'm fairly sure the "poscounts" normalization will be better than the default, which might end up using very few rows for normalization (or fail if all rows have a 0).
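
The checks suggested above could be strung together roughly as follows. This is only a sketch on DESeq2's built-in simulated dataset (makeExampleDESeqDataSet, which uses a design of ~ condition), not on metagenomic data; for the data in the question, the design and intgroup would be Habitat instead. Picking the gene with the smallest p-value as the "top gene" is an assumption made for illustration.

```r
library(DESeq2)

set.seed(1)

# Simulated example data shipped with DESeq2: 1000 genes, 12 samples,
# two conditions (A and B), with some true differences (betaSD = 1).
dds <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1)

# Run the full pipeline with "poscounts" size factors and with
# outlier replacement switched off, as suggested in the reply.
dds <- DESeq(dds, sfType = "poscounts", minReplicatesForReplace = Inf)

res <- results(dds)

# MA plot: log fold change against mean normalized count.
plotMA(res, ylim = c(-4, 4))

# Inspect per-sample counts for the gene with the smallest p-value,
# to see whether the signal is driven by one or two samples.
top <- which.min(res$pvalue)
plotCounts(dds, gene = top, intgroup = "condition")
```

If the counts for a top gene are dominated by a single sample, that is a hint the result for that gene should be treated with caution.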
