limma small vs large number of samples
1
0
Entering edit mode
@giovanni-bucci-6524
Last seen 9.6 years ago
Thank you, Jim. One more question. The eBayes step adjusts the variance across genes. Since there is already a good estimate for the variance due to the large number of samples does the eBayes step shrink the variance even further? Thank you, Giovanni Hi Giovanni, On 4/28/2014 8:54 PM, Giovanni Bucci wrote: >* Hello everybody, *>>* I have 32 samples, 4 factors with 2 levels each. Each level has 2 *>* replicates. *>>>* str(gxexprs) *>* num [1:15584, 1:32] 7.94 6.67 9.93 9.62 12.19 ... *>>>* Group *>* [1] R52VQ R52VQ R52VE R52VE R52EQ R52EQ R52EE R52EE R95VQ R95VQ R95VQ R95VE *>* [13] R95VE R95VE R95EQ R95EQ R95EQ R95EE R95EE R95EE R97VQ R97VQ R97VQ R97VE *>* [25] R97VE R97VE R97EQ R97EQ R97EQ R97EE R97EE R97EE *>* 16 Levels: R52VQ R52VE R52EQ R52EE R95VQ R95VE R95EQ R95EE R97VQ ... R97EE *>>>* design <- model.matrix(~0+Group) *>* fit <- lmFit(gxexprs, design) *>>* contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) *>* fit2 <- contrasts.fit(fit, contrast.matrix) *>* fit2 <- eBayes(fit2) *>>* TTable = topTable(fit2) *>>* global_p_val = TTable$P.Val *>>>* gxexprs = gxexprs[, 1:4] *>>>* ## same code as above but the expression matrix has only the first 4 *>* columns which represent the contrast tested above *>>* design <- model.matrix(~0+Group) *>* fit <- lmFit(gxexprs, design) *>>* contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) *>* fit2 <- contrasts.fit(fit, contrast.matrix) *>* fit2 <- eBayes(fit2) *>>* TTable = topTable(fit2) *>>* local_p_val = TTable$P.Val *>>* local_p_val has much greater values than global_p_val even though they *>* represent the same comparison. *>>* What is the explanation for this? * The denominator of your t-statistic is based on the mean square error of the model (which is based on the intra-group variance of all groups). When you have all the other groups in the model, the number of observations used to estimate variance is larger, so you get more degrees of freedom for you test (and the variance estimate is more accurate), so you get smaller p-values in general. Best, Jim >>* Can you point to some diagnostic functions that will show the difference? *>>* Thank you, *>>* Giovanni *>>* [[alternative HTML version deleted]]* [[alternative HTML version deleted]]
• 994 views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 11 hours ago
United States
Hi Giovanni, The eBayes step doesn't adjust the variance across genes. Instead, it estimates a variance hyperparameter based on the variance across all genes, and then 'squeezes' the observed by-gene variances towards that hyperparameter. The degree of shrinkage is dependent on the df for the observed and prior variances, and since you have lots of observations, it will be less than would be expected for a smaller experiment. You can always call the squeezeVar function directly and compare the before and after variance estimates if you want. Best, Jim On 5/5/2014 12:26 PM, Giovanni Bucci wrote: > Thank you, Jim. > > One more question. The eBayes step adjusts the variance across genes. > Since there is already a good estimate for the variance due to the > large number of samples does the eBayes step shrink the variance even > further? > > Thank you, > > Giovanni > > > > Hi Giovanni, > > On 4/28/2014 8:54 PM, Giovanni Bucci wrote: >> * Hello everybody, > *>>* I have 32 samples, 4 factors with 2 levels each. Each level has 2 > *>* replicates. > *>>>* str(gxexprs) > *>* num [1:15584, 1:32] 7.94 6.67 9.93 9.62 12.19 ... > *>>>* Group > *>* [1] R52VQ R52VQ R52VE R52VE R52EQ R52EQ R52EE R52EE R95VQ R95VQ > R95VQ R95VE > *>* [13] R95VE R95VE R95EQ R95EQ R95EQ R95EE R95EE R95EE R97VQ R97VQ R97VQ R97VE > *>* [25] R97VE R97VE R97EQ R97EQ R97EQ R97EE R97EE R97EE > *>* 16 Levels: R52VQ R52VE R52EQ R52EE R95VQ R95VE R95EQ R95EE R97VQ ... R97EE > *>>>* design <- model.matrix(~0+Group) > *>* fit <- lmFit(gxexprs, design) > *>>* contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) > *>* fit2 <- contrasts.fit(fit, contrast.matrix) > *>* fit2 <- eBayes(fit2) > *>>* TTable = topTable(fit2) > *>>* global_p_val = TTable$P.Val > *>>>* gxexprs = gxexprs[, 1:4] > *>>>* ## same code as above but the expression matrix has only the first 4 > *>* columns which represent the contrast tested above > *>>* design <- model.matrix(~0+Group) > *>* fit <- lmFit(gxexprs, design) > *>>* contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) > *>* fit2 <- contrasts.fit(fit, contrast.matrix) > *>* fit2 <- eBayes(fit2) > *>>* TTable = topTable(fit2) > *>>* local_p_val = TTable$P.Val > *>>* local_p_val has much greater values than global_p_val even though they > *>* represent the same comparison. > *>>* What is the explanation for this? > * > The denominator of your t-statistic is based on the mean square error of > the model (which is based on the intra-group variance of all groups). > When you have all the other groups in the model, the number of > observations used to estimate variance is larger, so you get more > degrees of freedom for you test (and the variance estimate is more > accurate), so you get smaller p-values in general. > > Best, > > Jim > > >>> * Can you point to some diagnostic functions that will show the difference? > *>>* Thank you, > *>>* Giovanni > *>>* [[alternative HTML version deleted]]* > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD COMMENT

Login before adding your answer.

Traffic: 1045 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6