Question: Same number of topTags reported for different contrasts
0
gravatar for MBWatson
4.3 years ago by
MBWatson0
United States
MBWatson0 wrote:

Good afternoon,

I am using edgeR to find differentially expressed genes among populations (SD and BR) and salinity treatments (15, 35, 60ppt). I have three biological replicates for each treatment. My experimental design matches very closely the example in section 3.2.4 of the (excellent) edgeR user guide and I have been using it as an example. The code appears to working correctly but when I write the topTags to a file, each pairwise contrast has the exact same number of tags- this seems unexpected. I am using 'topTags(x, n=Inf).'  If anyone has an feedback on this observation, I would appreciate hearing it. My code is below. Thank you!

source("http://bioconductor.org/biocLite.R")
biocLite("edgeR")
library("edgeR")

x <- read.delim("~path_to_file", header=TRUE, row.names="Symbols") 
z <-x[ ,c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18)]
#make new file called y composed of these columns
keep <- rowSums(cpm(z)) >=18 
z<- z[keep,] 
group <- factor(c("B","B","B","A","A","A","E","E","E","D","D","D","C","C","C","G","G","G")) 
#BR15=B, BR35=A, SD15=D, SD35=C, BR60=E, SD60=G
y <- DGEList(counts=z, group=group) 
y <-calcNormFactors(y) 
design <- model.matrix(~0+group, data=y$samples)
colnames(design) <- levels(y$samples$group)
y <- estimateGLMCommonDisp(y, design, verbose=TRUE)
y <- estimateGLMTrendedDisp(y, design, verbose=TRUE)
y <- estimateGLMTagwiseDisp(y, design)
fit <-glmFit(y, design)

#Contrast 1: Genes differentially expressed between BR15 and BR35

BvsA <- makeContrasts(B-A, levels=design)
lrt <-glmLRT(fit, contrast=BvsA)
DGE_BR15vsBR35 <- topTags(lrt, n=Inf)
write.csv(DGE_BR15vsBR35, file="DGE_BR15vsBR35.txt")

#Contrast 2: Genes differentially expressed between SD15 and SD35

DvsC <- makeContrasts(D-C, levels=design)
lrt <-glmLRT(fit, contrast=DvsC)
DGE_SD15vsSD35 <-topTags(lrt, n=Inf)
write.csv(DGE_SD15vsSD35, file="DGE_SD15vsSD35.txt")

#Contrast 3: Genes differentially expressed between BR35 and BR60

AvsE <- makeContrasts(A-E, levels=design) 
lrt <-glmLRT(fit, contrast=AvsE)
DGE_BR35vsBR60 <-topTags(lrt, n=Inf)
write.csv(DGE_BR35vsBR60, file="DGE_BR35vsBR60.txt")

#Contrast 4: Genes differentially expressed between SD35 and SD60

CvsG <- makeContrasts(C-G, levels=design)
lrt <-glmLRT(fit, contrast=CvsG)
DGE_SD35vsSD60 <-topTags(lrt, n=Inf)
write.csv(DGE_SD35vsSD60, file="DGE_SD35vsSD60.txt")
ADD COMMENTlink modified 4.3 years ago by Gordon Smyth37k • written 4.3 years ago by MBWatson0
Answer: Same number of topTags reported for different contrasts
1
gravatar for Gordon Smyth
4.3 years ago by
Gordon Smyth37k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth37k wrote:

You are using n=Inf, hence the topTags table lists every gene, hence all the topTags tables must be the same length. The length of your topTags tables will be the same as nrow(fit).

If you intended to write a topTags tables of statistically significant genes only, then you needed to set the 'p' argument of topTags. For example, topTags with n=Inf and p=0.05 will give genes significant at FDR of 5%.

ADD COMMENTlink modified 4.3 years ago • written 4.3 years ago by Gordon Smyth37k

Thank you for the response. Your explanation makes perfect sense. 

ADD REPLYlink written 4.3 years ago by MBWatson0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 235 users visited in the last hour