EdgeR questions in analyzing 454 data-about prior.n, TMM, and p_value
Ying Ye ▴ 10
@ying-ye-4304
Last seen 9.7 years ago
Dear edgeR users and developers,

I have a few questions about edgeR, which I have recently been using for 454 pyrosequencing data:

1. prior.n
According to the user's manual, prior.n should not be set too low in the moderated tagwise dispersion approach. But in my dataset there are more than 15 samples in each comparison group, and the degrees of freedom are larger than 30. prior.n <- estimateSmoothing(d) gives 0.0005329. So I am wondering whether I could use 0.0005329, since I have a rather large number of samples in each group, or whether I should set prior.n to 10 as the manual suggests.

2. TMM
I am not sure whether this is also applicable to 454 microbiota data. I suppose I should do TMM normalization as well, since the normalization factors for my samples vary a lot (f ranges from 0.41 to 4.58). Is that right?

3. p_value
In your experience, is it reasonable and reliable to use p_value < 0.05 as the significance criterion, or is only < 0.01 reliable?

I am a new user of this package and hope you can give some suggestions. Many thanks!

Ying Ye
Normalization edgeR • 887 views
Mark Robinson ★ 1.1k
@mark-robinson-2171
Last seen 9.7 years ago
Hi Ying.

Some comments below.

On 2010-10-18, at 10:22 PM, Ying Ye wrote:

> Dear edgeR users and developers,
>
> I have few questions about edgeR when recently I use it for 454
> pyrosequencing data:
>
> 1. prior.n
> According to users' manual, we may not use too low prior.n in
> moderated tagwise dispersion approach. But in my dataset, there are
> more than 15 samples in each comparison group and the freedom is
> larger than 30. prior.n <- estimateSmoothing(d) gives 0.0005329. So I
> am wondering if I could use 0.0005329 since I have rather big number
> of samples in each group. Or I should adjust prior.n into 10 according
> to the manual's suggestion.

Well, it's hard to give a prescription for prior.n that suits all datasets. Since you have so many degrees of freedom, you shouldn't need prior.n as high as 10. You might try something lower, say 1-3.

> 2. TMM
> I am not sure if this is also applicable to 454 microbiota data.
> I suppose I should do TMM normalization as well since the
> normalization factors from my samples have a big variation (f is from
> 0.41 to 4.58). Is that right?

I must admit that I'm not intimately aware of all the nuances of microbiota data, but I will say that the factors you mention above are more extreme, at both ends, than we generally see in RNA-seq data. I'd say it's probably best to look at some "smear" plots (through maPlot(), for example) to assess whether the TMM normalization is appropriately capturing shifts due to composition or the like. As always for exploratory analysis, it would be good to look at multidimensional scaling plots; see plotMDS.dge(). There is no substitute for looking at your data.

> 3. p_value
> According to your experience, is it reasonable and reliable to
> use p_value < 0.05 as significance criteria? or only <0.01 can be
> reliable.

First off, you'll probably want to do some multiple testing correction, which can be done through the topTags() function. As to where to set the threshold on significance, that is a matter of your tolerance for false discoveries. The status quo is 5%, but you may want to be more or less stringent.

Hope that helps.

Mark
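To illustrate the multiple testing point, here is a minimal sketch in base R. It applies a Benjamini-Hochberg adjustment with p.adjust(), which I believe is the same kind of correction topTags() applies by default, though that is worth checking against your installed version. The p-values are made up for illustration and do not come from this thread.

```r
# Hypothetical raw p-values for six tags (illustrative only)
p_raw <- c(0.0004, 0.003, 0.012, 0.04, 0.049, 0.6)

# Benjamini-Hochberg adjustment, which controls the false discovery rate
p_bh <- p.adjust(p_raw, method = "BH")

# Tags called significant at a 5% false discovery rate
significant <- p_bh < 0.05

# Compare the raw and adjusted calls side by side
print(data.frame(p_raw, p_bh, significant))
```

With these made-up values, five tags pass a raw p < 0.05 cut-off but only three remain below 0.05 after adjustment, which is why the threshold is better thought of as a false discovery rate than as a raw p-value.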
@gordon-smyth
Last seen 9 hours ago
WEHI, Melbourne, Australia
Dear Ying Ye,

Just adding to one of Mark's comments; see below.

> Date: Tue, 19 Oct 2010 09:43:10 +1100
> From: Mark Robinson <mrobinson at wehi.edu.au>
> To: Ying Ye <mikecrux at gmail.com>
> Cc: Bioc-sig-sequencing at r-project.org, bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] [Bioc-sig-seq] EdgeR questions in analyzing 454
> data-about prior.n, TMM, and p_value
>
> Hi Ying.
>
> Some comments below.
>
> On 2010-10-18, at 10:22 PM, Ying Ye wrote:
>
>> Dear edgeR users and developers,
>>
>> I have few questions about edgeR when recently I use it for 454
>> pyrosequencing data:
>>
>> 1. prior.n
>> According to users' manual, we may not use too low prior.n in
>> moderated tagwise dispersion approach. But in my dataset, there are
>> more than 15 samples in each comparison group and the freedom is larger
>> than 30. prior.n <- estimateSmoothing(d) gives 0.0005329. So I am
>> wondering if I could use 0.0005329 since I have rather big number of
>> samples in each group. Or I should adjust prior.n into 10 according to
>> the manual's suggestion.
>
> Well, its hard to give a prescription for prior.n for all datasets.
> Since you have so many degrees of freedom, you shouldn't need prior.n as
> high as 10. You might try something lower, say 1-3.

Just to refine this, how many degrees of freedom do you have per tag? Let's define df = number of libraries - number of groups. I would suggest you choose your prior.n so that prior.n * df is around 50, but don't go below prior.n = 1.

We are not recommending estimateSmoothing() at the moment because it gives variable results on next-generation sequencing data. The estimateSmoothing() value for your data is too small to be recommended.

Best wishes
Gordon
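The rule of thumb above can be written out as a small base R helper. This is only a sketch: the function name suggest_prior_n and the example library counts are hypothetical, chosen for illustration, and are not part of edgeR or of the original data.

```r
# Rule of thumb from the reply above: pick prior.n so that
# prior.n * df is around 50, but never let prior.n fall below 1.
# suggest_prior_n is a hypothetical helper, not an edgeR function.
suggest_prior_n <- function(n_libraries, n_groups, target = 50) {
  df <- n_libraries - n_groups   # degrees of freedom per tag
  stopifnot(df > 0)
  max(1, target / df)
}

# Hypothetical design: 2 groups of 16 libraries, so df = 30
suggest_prior_n(32, 2)   # 50 / 30, roughly 1.67

# Hypothetical larger design: 50 / 78 is below 1, so the floor applies
suggest_prior_n(80, 2)
```

For a design with about 30 degrees of freedom, this lands inside the 1-3 range suggested earlier in the thread.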