Question

DESeq estimateDispersion options for lower depth miRNA-seq

Aggarwal, Praful ▴ 50

@aggarwal-praful-5189

Hello,

I am trying to use DESeq for miRNA sequencing data. We have 2 replicates per condition (treated and untreated), i.e. a total of 4 samples. However, we have only around 200K-300K reads mapping to known miRNAs. I know this number is probably small, but we would still like to check for differential expression.

The default options seem to be too conservative for our data (maybe due to the low number of reads), so I think using the "fit-only" option might be better. Since we have fewer reads, I am also thinking of using genefilter's shorth to estimate the size factors.

I am trying these different options, but I am wondering which options you would consider best in our case. I am aware of the potential noise in our data, but we still expect to see something significant that is being lost in all this noise.

I hope my question makes sense. I would appreciate any help on this. Kindly let me know if you have any questions.

Thanks,
Praful

Sequencing miRNA DESeq • 967 views

updated 12.2 years ago by Wolfgang Huber • written 12.2 years ago by Aggarwal, Praful

score 0 · Answer 1 · 2012-03-26

Dear Praful,

thanks for your message.

1. You can try with "fit-only", but then please visualise the data for the miRNAs that you identify that way and check whether they are plausible. For example, draw the 6 MA-plots for all pairs of samples and see where the hits fall in them. The big drawback of the "fit-only" option is that "significant" calls might be made based on outlier measurements or other sources of high variability in the data.

2. Using genefilter::shorth as the locfunc argument of estimateSizeFactors: in principle, this can be useful when the per-gene counts are low, but it requires that there are many genes (the shorth is a less efficient estimator than, say, the median). Since you are working with miRNAs, you have few genes and low counts, and I am not sure it will improve much.

I think it is fair to try both and see where you find more concordance between replicates and differences between conditions. You could call 'arrayQualityMetrics' on the variance-stabilised version of the data with both normalisation options, and see which of the cluster dendrograms and PCA plots you prefer.

Best wishes
Wolfgang

Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
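For readers who want to see what swapping the location function changes, here is a minimal sketch in plain Python. It is not DESeq's or genefilter's code. It assumes the usual median-of-ratios idea for size factors (each sample's log counts are compared with the per-gene log geometric mean, and a location estimate of those differences is taken), and it uses a simplified shorth: the mean of the shortest window holding just over half of the values, with ties going to the first window. The function names `shorth` and `size_factors` and the toy counts are illustrative only.

```python
import math
import statistics


def shorth(values):
    """Mean of the shortest interval that holds just over half of the values.

    Simplified sketch: ties between equally short windows go to the first one.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("shorth() needs at least one value")
    k = n // 2 + 1  # number of points in each candidate window
    # Pick the window of k consecutive sorted values with the smallest width.
    best_start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return statistics.mean(xs[best_start:best_start + k])


def size_factors(counts, locfunc=statistics.median):
    """Size factors from a genes-by-samples table of raw counts.

    counts  : list of rows, one row per gene, one column per sample
    locfunc : location estimator applied to the per-sample log ratios
    """
    n_samples = len(counts[0])
    # Genes with a zero count in any sample have no finite log geometric
    # mean, so they are left out of the estimate.
    kept = [row for row in counts if all(c > 0 for c in row)]
    if not kept:
        raise ValueError("no gene has non-zero counts in every sample")
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in kept]
    return [
        math.exp(locfunc([math.log(row[j]) - lgm
                          for row, lgm in zip(kept, log_geo_means)]))
        for j in range(n_samples)
    ]


# Toy table (hypothetical numbers): 5 genes, 4 samples.
toy_counts = [
    [12, 15, 30, 28],
    [3, 2, 5, 7],
    [150, 160, 310, 290],
    [0, 1, 2, 1],
    [40, 38, 85, 90],
]

print("median:", size_factors(toy_counts))
print("shorth:", size_factors(toy_counts, locfunc=shorth))
```

With only a handful of usable genes, the window used by the shorth contains very few points, so the estimate can move a lot from one data set to the next. That is the efficiency concern raised in point 2 above.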