Question

get full variance per gene from DESeq

0

Entering edit mode

Wolfgang Huber ★ 13k

@wolfgang-huber-3550

Last seen 10 days ago

EMBL European Molecular Biology Laborat…

Hi Graham an alternative, and maybe easier and slightly more consistent approach might be to use the p-values (and the sign of the foldChange) to transform to z-scores, perhaps like this: with(BNevBTG, qnorm(pval/2) * sign(log2FoldChange) ) PS: "more consistent" because the residuals distribution (of the counts is skewed), while the p-values are uniform under the null. Best wishes Wolfgang On Jul/30/10 11:47 AM, Graham Thomas wrote: > Hi All, > > I am wondering whether there is an easy way of obtaining the full variance > for a given gene and condition from my DESeq analysis? > > When I take a look at my results I end up with a table like so): > > > head ( BNevBTG ) > id baseMean baseMeanA baseMeanB foldChange log2FoldChange > 1 ENSMUSG00000001627 162.62034 119.50785 205.73284 1.7215006 0.78366671 > 2 ENSMUSG00000001630 45.51063 41.94099 49.08027 1.1702220 0.22678230 > 3 ENSMUSG00000001632 1626.19532 1328.32256 1924.06807 1.4484946 0.53455431 > 4 ENSMUSG00000001642 54.09075 54.53378 53.64772 0.9837521 -0.02363326 > 5 ENSMUSG00000001655 0.00000 0.00000 0.00000 NaN NaN > 6 ENSMUSG00000001656 0.00000 0.00000 0.00000 NaN NaN > pval padj resVarA resVarB > 1 9.930922e-05 0.0005666467 0.4043083 2.7960349 > 2 4.628048e-01 0.6573258560 0.1815569 0.5295679 > 3 4.216269e-04 0.0021251136 0.3436054 0.3589968 > 4 9.559678e-01 1.0000000000 1.0424044 0.8391154 > 5 NA NA NA NA > 6 NA NA NA NA > > > I would like to transform my results into z-scores for GSEA purposes. As > far as I understand in order to do this I require my baseMean value > (which I have), my baseMeanA(and B) values, and the full variance (which > I want). > > It may be noted that my practical statistics knowledge is HEAVILY > limited so any helpp at all here is greatly appreiciated! > > > Regards, > Graham > > -- Wolfgang Huber EMBL http://www.embl.de/research/units/genome_biology/huber

DESeq DESeq • 1.1k views

ADD COMMENT • link updated 13.7 years ago by Graham Thomas ▴ 20 • written 13.7 years ago by Wolfgang Huber ★ 13k

score 0 · Answer 1 · 2010-08-05

Hi Wolfgang, Thank you for your reply to my previous message. Due to the low number of biological conditions I am comparing, and the algorithm I would ideally like to use (that proposed in Jiang and Gentleman (2007)) I am inclined to think that I will obtain greater power if I can obtain a single z-score for each gene and each replicate (6 measurements) rather than per replicate (which is 2). As far as I'm aware 3 measurements per group will improve my power in permutation testing during GSEA, if this is not the case then I may test using the z-scores obtained as suggested with the following: with(BNevBTG, qnorm(pval/2) * sign(log2FoldChange) ) Is this correct? Otherwise, I require a method to obtain an appropriate variance estimate for each gene per condition. I can outline the approach I am considering (and the problem I have) with the following argument:- geneA geneB 1) 50 200 2) 55 150 3) 45 175 My thinking is that in order to obtain appropriate z-scores for each measurement geneA[1,] -> geneA[3,] (and, of course, as appropriate for geneB) I may use the following: vf <- rawVarFunc( cds, "geneA)" ) And then add the variance due to the Poisson process (this is what I am unclear about how to obtain). With this I can use baseMeanA to generate z-scores for each measurement in the group. As a sanity check at this point it may not be absurd to do some kind of non-specific filtering based on the residuals I get? Is this a reasonable approach to calculating my z-scores and, if so, how do I get the variance due to the counting process? If not, any comments or suggestions on how to proceed would be greatly appreciated! Apologies for the verbose explanation! Regards, Graham On 02/08/10 11:00, bioconductor-request at stat.math.ethz.ch wrote: > Message: 2 Date: Sun, 01 Aug 2010 22:50:57 +0200 From: Wolfgang Huber > <whuber at="" embl.de=""> To: bioconductor at stat.math.ethz.ch Subject: Re: > [BioC] get full variance per gene from DESeq Message-ID: > <4C55DE31.1010700 at embl.de> Content-Type: text/plain; > charset=ISO-8859-1; format=flowed Hi Graham an alternative, and maybe > easier and slightly more consistent approach might be to use the > p-values (and the sign of the foldChange) to transform to z-scores, > perhaps like this: with(BNevBTG, qnorm(pval/2) * sign(log2FoldChange) > ) PS: "more consistent" because the residuals distribution (of the > counts is skewed), while the p-values are uniform under the null. Best > wishes Wolfgang On Jul/30/10 11:47 AM, Graham Thomas wrote: >> > Hi All, >> > >> > I am wondering whether there is an easy way of obtaining the full variance >> > for a given gene and condition from my DESeq analysis? >> > >> > When I take a look at my results I end up with a table like so): >> > >> > > head ( BNevBTG ) >> > id baseMean baseMeanA baseMeanB foldChange log2FoldChange >> > 1 ENSMUSG00000001627 162.62034 119.50785 205.73284 1.7215006 0.78366671 >> > 2 ENSMUSG00000001630 45.51063 41.94099 49.08027 1.1702220 0.22678230 >> > 3 ENSMUSG00000001632 1626.19532 1328.32256 1924.06807 1.4484946 0.53455431 >> > 4 ENSMUSG00000001642 54.09075 54.53378 53.64772 0.9837521 -0.02363326 >> > 5 ENSMUSG00000001655 0.00000 0.00000 0.00000 NaN NaN >> > 6 ENSMUSG00000001656 0.00000 0.00000 0.00000 NaN NaN >> > pval padj resVarA resVarB >> > 1 9.930922e-05 0.0005666467 0.4043083 2.7960349 >> > 2 4.628048e-01 0.6573258560 0.1815569 0.5295679 >> > 3 4.216269e-04 0.0021251136 0.3436054 0.3589968 >> > 4 9.559678e-01 1.0000000000 1.0424044 0.8391154 >> > 5 NA NA NA NA >> > 6 NA NA NA NA >> > >> > >> > I would like to transform my results into z-scores for GSEA purposes. As >> > far as I understand in order to do this I require my baseMean value >> > (which I have), my baseMeanA(and B) values, and the full variance (which >> > I want). >> > >> > It may be noted that my practical statistics knowledge is HEAVILY >> > limited so any helpp at all here is greatly appreiciated! >> > >> > >> > Regards, >> > Graham >> > >> > >> > -- Wolfgang Huber EMBL > http://www.embl.de/research/units/genome_biology/huber > ------------------------------ -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.