Unusual results with DESeq ?
1
0
Entering edit mode
Osvaldo ▴ 40
@osvaldo-6150
Last seen 4.6 years ago
Spain
hi there !! I am analyzing miRNA-seq data with DESeq, and I am getting no significant results, and I cannot see why is so. My experiment has two conditions and two replicates per condition: I execute it in R as follows: *countTable = read.table ( "/mnt/TB3/ograna/Ozge_Uluckan/miRNA- seq/Analysis/tables_for_DESeq/cOB1_vs_tOB100" , header=TRUE, row.names=1) experiment_design = data.frame( row.names = colnames(countTable), condition=c("OB1","OB1","OB100","OB100"), libType=c("single-end","single-end","single-end","single-end") ) library("DESeq") condition=factor(c("OB1","OB1","OB100","OB100")) cds = newCountDataSet( countTable, condition ) cds = estimateSizeFactors( cds ) normalizedReadCounts = counts( cds, normalized=TRUE ) write.csv( normalizedReadCounts, file="cOB1_vs_tOB100.DESeq_normalizedReadCounts.csv" ) cds = estimateDispersions( cds ) res = nbinomTest( cds, "OB1", "OB100" ) write.csv( res, file="cOB1_vs_tOB100.DESeq_diffExp.csv" )* I am getting the following size factors: > sizeFactors(cds) OB.1OU OB.2OU OB100.1OU OB100.2OU 1 1 1 1 the differential expression table sorted by 'padj' is as follows (I am showing just a small set): number id baseMean baseMeanA baseMeanB foldChange log2FoldChange pval padj *340 mmu-miR-204-5p 31.5 12.5 50.5 4.04 2.014355293 6.466284630369E-005 0.122859408 69 mmu-miR-1247-5p 57.5 32.5 82.5 2.5384615385 1.3439544012 0.0010710548 1 170 mmu-miR-15b-3p 99.25 135.5 63 0.4649446494 -1.1048691179 0.0024355731 1 363 mmu-miR-214-3p 591.25 767 415.5 0.5417209909 -0.8843781007 0.0040302141 1 1122 mmu-miR-664-3p 55.5 76 35 0.4605263158 -1.1186444965 0.0065510324 1 630 mmu-miR-335-3p 9.5 15.5 3.5 0.2258064516 -2.1468413883 0.0093262341 1 997 mmu-miR-574-3p 282 364.5 199.5 0.5473251029 -0.8695300681 0.0101138488 1 176 mmu-miR-17-3p 29 41 17 0.4146341463 -1.2700891634 0.0109905451 1 1898 mmu-miR-99a-5p 1254.5 1568.5 940.5 0.5996174689 -0.7378856803 0.0140500521 1 846 mmu-miR-467a-5p 11 17 5 0.2941176471 -1.7655347464 0.019028208 1 1900 mmu-miR-99b-5p 6246 7702.5 4789.5 0.6218111003 -0.6854517236 0.0204136071 1* There is no even one significant miRNA. Is it that I am missing something or doing something wrong? What does it mean that all the 'padj' values are '1' with the exception of the first one '0.122' ? Is there a way to change the method used to correct the p-values? thanks very much in advance !!! regards. -- Osvaldo Graña Bionformatics Unit, Structural Biology and Biocomputing Programme Spanish National Cancer Research Centre (CNIO) Melchor Fernández Almagro, 3 - 28029 Madrid +34 91 732 8000 (ext 3062) http://bioinfo.cnio.es www.cnio.es) <ograna@cnio.es> [[alternative HTML version deleted]]
miRNA Cancer DESeq miRNA Cancer DESeq • 1.3k views
ADD COMMENT
0
Entering edit mode
Simon Anders ★ 3.8k
@simon-anders-3855
Last seen 4.4 years ago
Zentrum für Molekularbiologie, Universi…
Hi Osvaldo Looking through the excerpt of your result list, it seems that your read counts are very low. Most miRNAs seem to have less than 50 reads, so it might well that you simply have not sequenced deeply enough. Have you enriched your library for short RNA molecules, or is this standard RNA-Seq? For how many miRNA genes do you have counts, and how many are above 50? For now, independent filtering might help: Exclude all the genes with less than, say, 20 or 40 counts (according to baseMean), and run the Benjamini-Hochberg adjustment only on the remaining ones. I think we have a chapter on the vignette about this. Also double-check teh normalization with MA plots. With so small counts, the default location measure (the median) might work subobtimal; maybe try the shorth. (See ?estimateSizeFactors). Also consider switching to DESeq2, which has better inferential power due to an improved dispersion estimation scheme. Simon On 13/09/13 10:15, Osvaldo Gra?a wrote: > hi there !! > I am analyzing miRNA-seq data with DESeq, and I am getting no significant > results, and I cannot see why is so. > > My experiment has two conditions and two replicates per condition: > > I execute it in R as follows: > *countTable = read.table ( > "/mnt/TB3/ograna/Ozge_Uluckan/miRNA- seq/Analysis/tables_for_DESeq/cOB1_vs_tOB100" > , header=TRUE, row.names=1) > experiment_design = data.frame( > row.names = colnames(countTable), > condition=c("OB1","OB1","OB100","OB100"), > libType=c("single-end","single-end","single-end","single-end") > ) > library("DESeq") > condition=factor(c("OB1","OB1","OB100","OB100")) > cds = newCountDataSet( countTable, condition ) > cds = estimateSizeFactors( cds ) > normalizedReadCounts = counts( cds, normalized=TRUE ) > write.csv( normalizedReadCounts, > file="cOB1_vs_tOB100.DESeq_normalizedReadCounts.csv" ) > cds = estimateDispersions( cds ) > res = nbinomTest( cds, "OB1", "OB100" ) > write.csv( res, file="cOB1_vs_tOB100.DESeq_diffExp.csv" )* > > > I am getting the following size factors: >> sizeFactors(cds) > OB.1OU OB.2OU OB100.1OU OB100.2OU > 1 1 1 1 > > > the differential expression table sorted by 'padj' is as follows (I am > showing just a small set): > > number id baseMean baseMeanA baseMeanB foldChange > log2FoldChange pval padj > *340 mmu-miR-204-5p 31.5 12.5 50.5 4.04 2.014355293 > 6.466284630369E-005 0.122859408 > 69 mmu-miR-1247-5p 57.5 32.5 82.5 2.5384615385 > 1.3439544012 0.0010710548 1 > 170 mmu-miR-15b-3p 99.25 135.5 63 0.4649446494 > -1.1048691179 0.0024355731 1 > 363 mmu-miR-214-3p 591.25 767 415.5 0.5417209909 > -0.8843781007 0.0040302141 1 > 1122 mmu-miR-664-3p 55.5 76 35 0.4605263158 > -1.1186444965 0.0065510324 1 > 630 mmu-miR-335-3p 9.5 15.5 3.5 0.2258064516 > -2.1468413883 0.0093262341 1 > 997 mmu-miR-574-3p 282 364.5 199.5 0.5473251029 > -0.8695300681 0.0101138488 1 > 176 mmu-miR-17-3p 29 41 17 0.4146341463 -1.2700891634 > 0.0109905451 1 > 1898 mmu-miR-99a-5p 1254.5 1568.5 940.5 0.5996174689 > -0.7378856803 0.0140500521 1 > 846 mmu-miR-467a-5p 11 17 5 0.2941176471 -1.7655347464 > 0.019028208 1 > 1900 mmu-miR-99b-5p 6246 7702.5 4789.5 0.6218111003 > -0.6854517236 0.0204136071 1* > > > There is no even one significant miRNA. Is it that I am missing something > or doing something wrong? What does it mean that all the 'padj' values are > '1' with the exception of the first one '0.122' ? > Is there a way to change the method used to correct the p-values? > > thanks very much in advance !!! > regards. > > > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD COMMENT
0
Entering edit mode
Thanks very much Simon !!! Indeed, you are right, the experiment depth is very low in all the samples, they are going for a new sequencing round with higher depth. I hope results get better then. Anyway, your recommendations will help for future situations like this. best regards. osvaldo On Fri, Sep 13, 2013 at 11:32 AM, Simon Anders <anders@embl.de> wrote: > Hi Osvaldo > > Looking through the excerpt of your result list, it seems that your read > counts are very low. Most miRNAs seem to have less than 50 reads, so it > might well that you simply have not sequenced deeply enough. > > Have you enriched your library for short RNA molecules, or is this > standard RNA-Seq? For how many miRNA genes do you have counts, and how many > are above 50? > > For now, independent filtering might help: Exclude all the genes with less > than, say, 20 or 40 counts (according to baseMean), and run the > Benjamini-Hochberg adjustment only on the remaining ones. I think we have a > chapter on the vignette about this. > > Also double-check teh normalization with MA plots. With so small counts, > the default location measure (the median) might work subobtimal; maybe try > the shorth. (See ?estimateSizeFactors). > > Also consider switching to DESeq2, which has better inferential power due > to an improved dispersion estimation scheme. > > Simon > > > > On 13/09/13 10:15, Osvaldo Graña wrote: > >> hi there !! >> I am analyzing miRNA-seq data with DESeq, and I am getting no significant >> results, and I cannot see why is so. >> >> My experiment has two conditions and two replicates per condition: >> >> I execute it in R as follows: >> *countTable = read.table ( >> >> "/mnt/TB3/ograna/Ozge_Uluckan/**miRNA-seq/Analysis/tables_for_** >> DESeq/cOB1_vs_tOB100" >> , header=TRUE, row.names=1) >> experiment_design = data.frame( >> row.names = colnames(countTable), >> condition=c("OB1","OB1","**OB100","OB100"), >> libType=c("single-end","**single-end","single-end","**single-end") >> ) >> library("DESeq") >> condition=factor(c("OB1","OB1"**,"OB100","OB100")) >> cds = newCountDataSet( countTable, condition ) >> cds = estimateSizeFactors( cds ) >> normalizedReadCounts = counts( cds, normalized=TRUE ) >> write.csv( normalizedReadCounts, >> file="cOB1_vs_tOB100.DESeq_**normalizedReadCounts.csv" ) >> cds = estimateDispersions( cds ) >> res = nbinomTest( cds, "OB1", "OB100" ) >> write.csv( res, file="cOB1_vs_tOB100.DESeq_**diffExp.csv" )* >> >> >> >> I am getting the following size factors: >> >>> sizeFactors(cds) >>> >> OB.1OU OB.2OU OB100.1OU OB100.2OU >> 1 1 1 1 >> >> >> the differential expression table sorted by 'padj' is as follows (I am >> showing just a small set): >> >> number id baseMean baseMeanA baseMeanB foldChange >> log2FoldChange pval padj >> *340 mmu-miR-204-5p 31.5 12.5 50.5 4.04 2.014355293 >> >> 6.466284630369E-005 0.122859408 >> 69 mmu-miR-1247-5p 57.5 32.5 82.5 2.5384615385 >> 1.3439544012 0.0010710548 1 >> 170 mmu-miR-15b-3p 99.25 135.5 63 0.4649446494 >> -1.1048691179 0.0024355731 1 >> 363 mmu-miR-214-3p 591.25 767 415.5 0.5417209909 >> -0.8843781007 0.0040302141 1 >> 1122 mmu-miR-664-3p 55.5 76 35 0.4605263158 >> -1.1186444965 0.0065510324 1 >> 630 mmu-miR-335-3p 9.5 15.5 3.5 0.2258064516 >> -2.1468413883 0.0093262341 1 >> 997 mmu-miR-574-3p 282 364.5 199.5 0.5473251029 >> -0.8695300681 0.0101138488 1 >> 176 mmu-miR-17-3p 29 41 17 0.4146341463 -1.2700891634 >> 0.0109905451 1 >> 1898 mmu-miR-99a-5p 1254.5 1568.5 940.5 0.5996174689 >> -0.7378856803 0.0140500521 1 >> 846 mmu-miR-467a-5p 11 17 5 0.2941176471 -1.7655347464 >> 0.019028208 1 >> 1900 mmu-miR-99b-5p 6246 7702.5 4789.5 0.6218111003 >> -0.6854517236 0.0204136071 1* >> >> >> >> There is no even one significant miRNA. Is it that I am missing something >> or doing something wrong? What does it mean that all the 'padj' values are >> '1' with the exception of the first one '0.122' ? >> Is there a way to change the method used to correct the p-values? >> >> thanks very much in advance !!! >> regards. >> >> >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.e="" thz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: http://news.gmane.org/gmane.** >> science.biology.informatics.**conductor<http: news.gmane.org="" gmane="" .science.biology.informatics.conductor=""> >> >> > ______________________________**_________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.et="" hz.ch="" mailman="" listinfo="" bioconductor=""> > Search the archives: http://news.gmane.org/gmane.** > science.biology.informatics.**conductor<http: news.gmane.org="" gmane.="" science.biology.informatics.conductor=""> > -- Osvaldo Graña Bionformatics Unit, Structural Biology and Biocomputing Programme Spanish National Cancer Research Centre (CNIO) Melchor Fernández Almagro, 3 - 28029 Madrid +34 91 732 8000 (ext 3062) http://bioinfo.cnio.es www.cnio.es) <ograna@cnio.es> [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 643 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6