limma and nested factors
3
0
Entering edit mode
@arnemulleraventiscom-466
Last seen 10.3 years ago
Hello, I'm following the limma users' guide to analyze some affy data. The design in my analysis us a bit more complex than in the example of the manual. I don't know if I'm doing it right. I've two factors 'batch' and 'time' listed below: > batch [1] NEW NEW NEW OLD OLD OLD OLD PRG PRG PRG NEW NEW NEW OLD OLD OLD OLD PRG PRG [20] PRG Levels: NEW OLD PRG > > time [1] 04h 04h 04h 04h 04h 04h 04h 04h 04h 04h 24h 24h 24h 24h 24h 24h 24h 24h 24h [20] 24h Levels: 04h 24h You see that BATCH/TIME combinations have 3 replicates (except for OLD/04h or 24h which has 4). I'd like to know how many gene change in expression over time but within each batch. Therefore I'm constructing the following model matrix (time nested within batch, i.e. 04h and 24h at each level of batch): design.time <- model.matrix( ~ -1 + time %in% batch) colnames(design.time) <- c("time04h.batchNEW", "time24h.batchNEW", "time04h.batchOLD", "time24h.batchOLD", "time04h.batchPRG", "time24h.batchPRG") > design.time time04h.batchNEW time24h.batchNEW time04h.batchOLD time24h.batchOLD 1 1 0 0 0 2 1 0 0 0 3 1 0 0 0 4 0 0 1 0 5 0 0 1 0 6 0 0 1 0 7 0 0 1 0 8 0 0 0 0 9 0 0 0 0 10 0 0 0 0 11 0 1 0 0 12 0 1 0 0 13 0 1 0 0 14 0 0 0 1 15 0 0 0 1 16 0 0 0 1 17 0 0 0 1 18 0 0 0 0 19 0 0 0 0 20 0 0 0 0 time04h.batchPRG time24h.batchPRG 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 7 0 0 8 1 0 9 1 0 10 1 0 11 0 0 12 0 0 13 0 0 14 0 0 15 0 0 16 0 0 17 0 0 18 0 1 19 0 1 20 0 1 attr(,"assign") [1] 1 1 1 1 1 1 attr(,"contrasts") attr(,"contrasts")$time [1] "contr.treatment" attr(,"contrasts")$batch [1] "contr.treatment" > fit.time <- lmFit(emat, design.time) > contr.mat.time <- makeContrasts(time04h.batchNEW-time24h.batchNEW, time04h.batchOLD-time24h.batchOLD, time04h.batchPRG-time24h.batchPRG, levels=design.time) > contr.mat.time time04h.batchNEW - time24h.batchNEW time04h.batchNEW 1 time24h.batchNEW -1 time04h.batchOLD 0 time24h.batchOLD 0 time04h.batchPRG 0 time24h.batchPRG 0 time04h.batchOLD - time24h.batchOLD time04h.batchNEW 0 time24h.batchNEW 0 time04h.batchOLD 1 time24h.batchOLD -1 time04h.batchPRG 0 time24h.batchPRG 0 time04h.batchPRG - time24h.batchPRG time04h.batchNEW 0 time24h.batchNEW 0 time04h.batchOLD 0 time24h.batchOLD 0 time04h.batchPRG 1 time24h.batchPRG -1 > fit2.time <- contrasts.fit(fit.time, contr.mat.time) fit2.time <- eBayes(fit2.time) > colnames(fit2.time$coefficients) [1] "time04h.batchNEW - time24h.batchNEW" "time04h.batchOLD - time24h.batchOLD" [3] "time04h.batchPRG - time24h.batchPRG > length(which(topTable(fit2.time, number=12488, coef=1, adjust='fdr', sort.by='P')[,3] <= 0.01)) [1] 184 > length(which(topTable(fit2.time, number=12488, coef=2, adjust='fdr', sort.by='P')[,3] <= 0.01)) [1] 349 > length(which(topTable(fit2.time, number=12488, coef=3, adjust='fdr', sort.by='P')[,3] <= 0.01)) [1] 27 What I see is that there are quite dramatic changes over time depending on the batch. Actually the batches represent different labs performing an expression analysis of *untreated* mouse cell cultures (MG-U74Av2 chip with 12488 probe sets) after 04h incubation and 24h incubation ... . The aim is to see if the three labs have actually treated the cells the same ;-) Does the above desing makes sense fot this kind of analysis? kind regards, Arne -- Arne Muller, Ph.D. Toxicogenomics, Aventis Pharma arne dot muller domain=aventis com
probe affy limma probe affy limma • 1.5k views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 40 minutes ago
WEHI, Melbourne, Australia
At 01:53 AM 8/04/2004, Arne.Muller@aventis.com wrote: >Hello, > >I'm following the limma users' guide to analyze some affy data. The design in >my analysis us a bit more complex than in the example of the manual. I don't >know if I'm doing it right. There seems nothing wrong with your limma analysis. Focusing only on the most extreme differentially expressed genes as you do with a toptable will of course exaggerate the differences between the three labs. Also you're bound to get more significant genes for OLD because you have more arrays from that lab. Maybe take a more global look at the coefficients as well as the p-values. Gordon >I've two factors 'batch' and 'time' listed below: > > > batch > [1] NEW NEW NEW OLD OLD OLD OLD PRG PRG PRG NEW NEW NEW OLD OLD OLD OLD PRG >PRG >[20] PRG >Levels: NEW OLD PRG > > > > > time > [1] 04h 04h 04h 04h 04h 04h 04h 04h 04h 04h 24h 24h 24h 24h 24h 24h 24h 24h >24h >[20] 24h >Levels: 04h 24h > >You see that BATCH/TIME combinations have 3 replicates (except for OLD/04h or >24h which has 4). > >I'd like to know how many gene change in expression over time but within each >batch. Therefore I'm constructing the following model matrix (time nested >within batch, i.e. 04h and 24h at each level of batch): > >design.time <- model.matrix( ~ -1 + time %in% batch) >colnames(design.time) <- c("time04h.batchNEW", > "time24h.batchNEW", > "time04h.batchOLD", > "time24h.batchOLD", > "time04h.batchPRG", > "time24h.batchPRG") > > > design.time > time04h.batchNEW time24h.batchNEW time04h.batchOLD time24h.batchOLD >1 1 0 0 0 >2 1 0 0 0 >3 1 0 0 0 >4 0 0 1 0 >5 0 0 1 0 >6 0 0 1 0 >7 0 0 1 0 >8 0 0 0 0 >9 0 0 0 0 >10 0 0 0 0 >11 0 1 0 0 >12 0 1 0 0 >13 0 1 0 0 >14 0 0 0 1 >15 0 0 0 1 >16 0 0 0 1 >17 0 0 0 1 >18 0 0 0 0 >19 0 0 0 0 >20 0 0 0 0 > time04h.batchPRG time24h.batchPRG >1 0 0 >2 0 0 >3 0 0 >4 0 0 >5 0 0 >6 0 0 >7 0 0 >8 1 0 >9 1 0 >10 1 0 >11 0 0 >12 0 0 >13 0 0 >14 0 0 >15 0 0 >16 0 0 >17 0 0 >18 0 1 >19 0 1 >20 0 1 >attr(,"assign") >[1] 1 1 1 1 1 1 >attr(,"contrasts") >attr(,"contrasts")$time >[1] "contr.treatment" > >attr(,"contrasts")$batch >[1] "contr.treatment" > > > fit.time <- lmFit(emat, design.time) > > > contr.mat.time <- makeContrasts(time04h.batchNEW-time24h.batchNEW, > time04h.batchOLD-time24h.batchOLD, > time04h.batchPRG-time24h.batchPRG, > levels=design.time) > > contr.mat.time > time04h.batchNEW - time24h.batchNEW >time04h.batchNEW 1 >time24h.batchNEW -1 >time04h.batchOLD 0 >time24h.batchOLD 0 >time04h.batchPRG 0 >time24h.batchPRG 0 > time04h.batchOLD - time24h.batchOLD >time04h.batchNEW 0 >time24h.batchNEW 0 >time04h.batchOLD 1 >time24h.batchOLD -1 >time04h.batchPRG 0 >time24h.batchPRG 0 > time04h.batchPRG - time24h.batchPRG >time04h.batchNEW 0 >time24h.batchNEW 0 >time04h.batchOLD 0 >time24h.batchOLD 0 >time04h.batchPRG 1 >time24h.batchPRG -1 > > > >fit2.time <- contrasts.fit(fit.time, contr.mat.time) >fit2.time <- eBayes(fit2.time) > > > colnames(fit2.time$coefficients) >[1] "time04h.batchNEW - time24h.batchNEW" "time04h.batchOLD - >time24h.batchOLD" >[3] "time04h.batchPRG - time24h.batchPRG > > > length(which(topTable(fit2.time, number=12488, coef=1, adjust='fdr', >sort.by='P')[,3] <= 0.01)) >[1] 184 > > > length(which(topTable(fit2.time, number=12488, coef=2, adjust='fdr', >sort.by='P')[,3] <= 0.01)) >[1] 349 > > > length(which(topTable(fit2.time, number=12488, coef=3, adjust='fdr', >sort.by='P')[,3] <= 0.01)) >[1] 27 > >What I see is that there are quite dramatic changes over time depending on >the batch. Actually the batches represent different labs performing an >expression analysis of *untreated* mouse cell cultures (MG-U74Av2 chip with >12488 probe sets) after 04h incubation and 24h incubation ... . The aim is to >see if the three labs have actually treated the cells the same ;-) > >Does the above desing makes sense fot this kind of analysis? > > kind regards, > > Arne > >-- >Arne Muller, Ph.D. >Toxicogenomics, Aventis Pharma >arne dot muller domain=aventis com
ADD COMMENT
0
Entering edit mode
@stephen-henderson-71
Last seen 7.7 years ago
how about the rank spearman correlation of all the toptable p-values? >There seems nothing wrong with your limma analysis. Focusing only on the >most extreme differentially expressed genes as you do with a toptable will >of course exaggerate the differences between the three labs. ********************************************************************** This email and any files transmitted with it are confidentia...{{dropped}}
ADD COMMENT
0
Entering edit mode
@arnemulleraventiscom-466
Last seen 10.3 years ago
Hi, thanks for reply. I guess you mean is to use the rank as given by topTable (by p-value) for one contrast for a comparison with another contrast? Since the rank is already given (by topTable) would I've to use a pearson correlation? What I've just tried is a simple pearson (the data look not too far from normal) and spearman correlation separate for up and downregulated genes. > cor.time.spearman down up NEWvsOLD 0.4461673 0.371619276 NEWvsPRG -0.1675682 0.003799389 OLDvsPRG -0.1500851 0.057734972 > cor.time.pearson down up NEWvsOLD 0.723402369 0.49514365 NEWvsPRG -0.035844726 0.01057829 OLDvsPRG 0.005358755 0.04535618 Given are the "estimates". The correlation is based on the coefficients in the eBayes fit from limma. Only those coefficients were included for which either one or the other (e.g. NEW or OLD) provides a coefficient >0 (up) with p<=0.01, or <0 for down. For each correlation between 500 and 1000 genes are included. This is not too far from what I expect, since OLD and NEW have been generated using quite similar experimental protocols. It also tells that these batches don't have that much in common ... thanks for your suggestion, Arne -- Arne Muller, Ph.D. Toxicogenomics, Aventis Pharma arne dot muller domain=aventis com > -----Original Message----- > From: Stephen Henderson [mailto:s.henderson@ucl.ac.uk] > Sent: 08 April 2004 15:06 > To: 'Gordon Smyth '; Muller, Arne PH/FR > Cc: 'bioconductor@stat.math.ethz.ch ' > Subject: RE: [BioC] limma and nested factors > > > how about the rank spearman correlation of all the toptable p-values? > > >There seems nothing wrong with your limma analysis. Focusing > only on the > >most extreme differentially expressed genes as you do with a > toptable will > >of course exaggerate the differences between the three labs. > > > > ********************************************************************** > This email and any files transmitted with it are confidential and > intended solely for the use of the individual or entity to whom they > are addressed. If you have received this email in error please notify > the system manager (it.support@wibr.ucl.ac.uk). All files are > scanned for viruses. > ********************************************************************** > >
ADD COMMENT

Login before adding your answer.

Traffic: 481 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6