Same number of topTags reported for different contrasts
1
0
Entering edit mode
MBWatson • 0
@mbwatson-7475
Last seen 7.5 years ago
United States

Good afternoon,

I am using edgeR to find differentially expressed genes among populations (SD and BR) and salinity treatments (15, 35, 60ppt). I have three biological replicates for each treatment. My experimental design matches very closely the example in section 3.2.4 of the (excellent) edgeR user guide and I have been using it as an example. The code appears to working correctly but when I write the topTags to a file, each pairwise contrast has the exact same number of tags- this seems unexpected. I am using 'topTags(x, n=Inf).'  If anyone has an feedback on this observation, I would appreciate hearing it. My code is below. Thank you!

source("http://bioconductor.org/biocLite.R")
biocLite("edgeR")
library("edgeR")

x <- read.delim("~path_to_file", header=TRUE, row.names="Symbols") 
z <-x[ ,c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18)]
#make new file called y composed of these columns
keep <- rowSums(cpm(z)) >=18 
z<- z[keep,] 
group <- factor(c("B","B","B","A","A","A","E","E","E","D","D","D","C","C","C","G","G","G")) 
#BR15=B, BR35=A, SD15=D, SD35=C, BR60=E, SD60=G
y <- DGEList(counts=z, group=group) 
y <-calcNormFactors(y) 
design <- model.matrix(~0+group, data=y$samples)
colnames(design) <- levels(y$samples$group)
y <- estimateGLMCommonDisp(y, design, verbose=TRUE)
y <- estimateGLMTrendedDisp(y, design, verbose=TRUE)
y <- estimateGLMTagwiseDisp(y, design)
fit <-glmFit(y, design)

#Contrast 1: Genes differentially expressed between BR15 and BR35

BvsA <- makeContrasts(B-A, levels=design)
lrt <-glmLRT(fit, contrast=BvsA)
DGE_BR15vsBR35 <- topTags(lrt, n=Inf)
write.csv(DGE_BR15vsBR35, file="DGE_BR15vsBR35.txt")

#Contrast 2: Genes differentially expressed between SD15 and SD35

DvsC <- makeContrasts(D-C, levels=design)
lrt <-glmLRT(fit, contrast=DvsC)
DGE_SD15vsSD35 <-topTags(lrt, n=Inf)
write.csv(DGE_SD15vsSD35, file="DGE_SD15vsSD35.txt")

#Contrast 3: Genes differentially expressed between BR35 and BR60

AvsE <- makeContrasts(A-E, levels=design) 
lrt <-glmLRT(fit, contrast=AvsE)
DGE_BR35vsBR60 <-topTags(lrt, n=Inf)
write.csv(DGE_BR35vsBR60, file="DGE_BR35vsBR60.txt")

#Contrast 4: Genes differentially expressed between SD35 and SD60

CvsG <- makeContrasts(C-G, levels=design)
lrt <-glmLRT(fit, contrast=CvsG)
DGE_SD35vsSD60 <-topTags(lrt, n=Inf)
write.csv(DGE_SD35vsSD60, file="DGE_SD35vsSD60.txt")
edger topTags design and contrast matrix • 1.3k views
ADD COMMENT
1
Entering edit mode
@gordon-smyth
Last seen 3 hours ago
WEHI, Melbourne, Australia

You are using n=Inf, hence the topTags table lists every gene, hence all the topTags tables must be the same length. The length of your topTags tables will be the same as nrow(fit).

If you intended to write a topTags tables of statistically significant genes only, then you needed to set the 'p' argument of topTags. For example, topTags with n=Inf and p=0.05 will give genes significant at FDR of 5%.

ADD COMMENT
0
Entering edit mode

Thank you for the response. Your explanation makes perfect sense. 

ADD REPLY

Login before adding your answer.

Traffic: 790 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6