Question

Collation of differentially expressed genes in Limma

0

Entering edit mode

Donald Dunbar ▴ 50

@donald-dunbar-886

Last seen 9.6 years ago

Hello all, I'd be grateful some advice on the best way to collate my significantly differentially expressed gene data from Limma. I have three sets of five Affy chips arranged as one factor with three levels (two treatments and a control). I build the design matrix: mydesign <- model.matrix(~ -1+factor(c(1,1,1,2,2,2,3,3,3,1,1,2,2,3,3))) colnames(mydesign) <- c("Control", "T1", "T2") to give: Control T1 T2 1 1 0 0 2 1 0 0 3 1 0 0 4 0 1 0 5 0 1 0 6 0 1 0 7 0 0 1 8 0 0 1 9 0 0 1 10 1 0 0 11 1 0 0 12 0 1 0 13 0 1 0 14 0 0 1 15 0 0 1 then fit the model: fit <- lmFit(eset, mydesign) then I make the contrasts: contrast.matrix <- makeContrasts(T1-Control,T2-Control,T1-T2, levels=mydesign) and fit the contrast to the model: fit2 <- contrasts.fit(fit, contrast.matrix) apply empirical Bayes: fit2 <- eBayes(fit2) now I have three sets of p.values in fit2$p.value, but of course I need to adjust for multiple testing. I can do this in topTable: topTable_T1_Control <- topTable(fit2, coef=1, adjust='BH') topTable_T2_Control <- topTable(fit2, coef=2, adjust='BH') topTable_T1_T2 <- topTable(fit2, coef=3, adjust='BH') but that does it individually and returns genes for each contrast. I'd like the list of genes significant below a particular adjusted p-value threshold then a score for each contrast (-1, 0, 1). I can use mt.rawp2adjp from multtest: multadjust <- mt.rawp2adjp(fit2$p.value, proc=c("BH")) eBpvalues <- multadjust$adjp[order(multadjust$index),] but I think I'm applying this wrongly: I think I need to apply it to each contrast's p.value column separately, but I'm not sure how. The other option I thought to use was decideTests, but that doesn't seem to allow me to use BH as an adjustment option: groups <- decideTests(fit2, method="separate", adjust.method="fdr",p.value=0.05) vennDiagram(groups) So if anyone would give some advice on the best way to collate my significant genes, I'd like to output a matrix with a row for each gene that is significantly differentially expressed in any of the three contrasts, then columns for adjusted p.value, M (ratio), and score (-1, 0, 1) for each gene: p.value T1-C M T1-C scoreT1-C p.valueT2-C etc... gene1 gene2 etc... I'm using: platform powerpc-apple-darwin7.9.0 arch powerpc os darwin7.9.0 system powerpc, darwin7.9.0 status major 2 minor 1.1 year 2005 month 06 day 20 language R Hope that's clear. Thanks in advance. Donald -- Dr Donald Dunbar Head, Bioinformatics Team Centres for Cardiovascular Science and Inflammation Research, University of Edinburgh, Queen?s Medical Research Institute, 47 Little France Crescent Edinburgh EH16 4TJ United Kingdom T: +44 131 242 6700 E: donald.dunbar at ed.ac.uk W: www.bioinf.mvm.ed.ac.uk

affy limma affy limma • 1.3k views

ADD COMMENT • link updated 8.3 years ago by Jenny Drnevich ★ 2.0k • written 18.7 years ago by Donald Dunbar ▴ 50

score 0 · Answer 1 · 2016-01-19

0

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 14 minutes ago

WEHI, Melbourne, Australia

decideTests() does what you want, and it does provide BH as an adjustment option. Note that "fdr" and "BH" are the same thing.

You could also consider using an F-test for the three contrasts, i.e., topTable(fit2,, coef=1:3).

ADD COMMENT • link 8.3 years ago Gordon Smyth 50k

score 0 · Answer 2 · 2016-01-20

Another option is to use write.fit(). You pass it your MArrayLM object (fit2 above), and optionally a TestResults object that has the -1,0,1 designations from decideTests (could use groups above if you want a separate adjustment), and specify the adjustment type ("BH" or "fdr" are the same as Gordon said), and whether you want to apply the method separately for each contrast or put all 3 together in one global adjustment. Two possible downsides to this method: 1) It writes out for all genes, so you'd have to subset to what you want later and 2) it writes it out to a text file, which you'd have to read back into R if you want to do more manipulations.

If you're fine with the separate test adjustment, then you could use topTable this way:

topTable_T1_Control <- topTable(fit2, coef=1, adjust='BH', num = Inf, sort.by = "none") 
topTable_T2_Control <- topTable(fit2, coef=2, adjust='BH', num = Inf, sort.by = "none") 
topTable_T1_T2 <- topTable(fit2, coef=3, adjust='BH', num = Inf, sort.by = "none")

The objects will have the genes in all the same order, so you can easily cbind() them together into one object along with the output from decideTests and then filter down to genes that are significant in just one contrast. You'll need to do some column selection and renaming, and subsetting of the object using logical conditions (e.g., rowSums(groups !=0) > 0).

Jenny