How to perform multiple comparisons with no biological replication using edgeR
1
0
Entering edit mode
hong1ang • 0
@hong1ang-22595
Last seen 4.8 years ago

Hi seniors,

I am using edgeR for Gene expression analysis, my data is time-course transcriptome but no biological replication. I also know that there is meaningless without biological replication, but I have no choice.

Now I want to analyze the comparison between any two groups,but I found glmLRT (fit, coef = 2) is not equal to the result of the glmLRT (fit, contrast = CONTRASTS [, 'A10hvsA0h']). The code is below:

rm(list=ls())
setwd('H:/qde-2.20191220/DEG/')
library("edgeR")

Counts data generated by featureCounts program

countData <- read.table('qde2_featurecounts',header = TRUE,sep = '\t',row.names = 1)
countData <- countData[,c(6,7,8,9,10,11)]
names(countData) <- c('WT_0h','WT_10h','WT_48h','KO_0h','KO_10h','KO_48h')
head(countData )
          WT_0h WT_10h WT_48h KO_0h KO_10h KO_48h
ao0303682  2029   1009    631   972    540    473
ao0309247  4036   1672   1029  2030    825    926
ao0306037     6      1      3     3      2      4
ao0300999    27     53     48    12     34     81
ao0306913     7      5     10     5      3     15
ao0305685     0      1      1     0      2      3

condition <- c(paste('A',c('0h','10h','48h'),sep='_'),paste('Q',c('0h','10h','48h'),sep='_'))
group <- factor(condition,levels=condition)

y <- DGEList(count = countData,group = group)

design <- model.matrix(~0+group)
colnames(design) <- levels(group)
design
  A_0h A_10h A_48h Q_0h Q_10h Q_48h
1    1     0     0    0     0     0
2    0     1     0    0     0     0
3    0     0     1    0     0     0
4    0     0     0    1     0     0
5    0     0     0    0     1     0
6    0     0     0    0     0     1
attr(,"assign")
[1] 1 1 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$group
[1] "contr.treatment"

keep <- rowSums(cpm(y)>1) >=1
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y,method = 'TMM')

I have no biological replication, here set dispersion = 0.01

fit <- glmFit(y,dispersion = 0.01)
CONTRASTS <- makeContrasts( A_10hvsA_0h = A_10h-A_0h,
                            A_48hvsA_0h = A_48h-A_0h,
                            A_48hvsA_10h = A_48h-A_10h,
                            Q_0hvsA_0h = Q_0h-A_0h,
                            Q_10hvsA_10h = (Q_10h-Q_0h)-(A_10h-A_0h),
                            levels = design )

Compare the DEG of A10h vs. A0h using glmLRT(fit,coef = 2)

A_10hvsA_0h_lrt <- glmLRT(fit,coef = 2)
summary(decideTestsDGE(A_10hvsA_0h_lrt))
topTags(A_10hvsA_0h_lrt)

Coefficient:  y$samples$groupA_10h 
              logFC    logCPM       LR        PValue           FDR
ao0304393 10.566149  8.693791 939.1435 3.038757e-206 2.925715e-202
ao0301350  9.161787  8.974330 892.9482 3.348226e-196 1.611836e-192
ao0310604  8.813577  9.548651 881.2468 1.171078e-193 3.758378e-190
ao0307037  8.603780  9.590910 871.5692 1.487480e-191 3.580363e-188
ao0303760  8.288286 13.037790 849.7920 8.068355e-187 1.553642e-183
ao0302195  9.754058  7.210731 791.3131 4.176101e-174 6.701250e-171
ao0307297  8.017683  8.931971 780.9519 7.473511e-172 1.027928e-168
ao0308239  7.702823  8.346463 729.2818 1.283395e-160 1.544566e-157
ao0301855  8.243889  7.446494 728.1615 2.248900e-160 2.405823e-157
ao0303286  7.937759  7.459464 718.3401 3.073340e-158 2.959012e-155

Compare the DEG of A10h vs. A0h using glmLRT(fit,contrast = CONTRASTS[,'A10hvsA0h'])

lrt <- glmLRT(fit,contrast = CONTRASTS[,'A_10hvsA_0h'])
summary(decideTestsDGE(lrt))
topTags(lrt)
Coefficient:  -1*(Intercept) 1*y$samples$groupA_10h 
             logFC   logCPM       LR PValue FDR
ao0304393 30.90727 8.693791 1670.897      0   0
ao0302195 30.47853 7.210731 1604.222      0   0
ao0304617 29.46109 6.001523 1521.929      0   0
ao0301884 28.88430 6.408459 1530.358      0   0
ao0301350 27.47016 8.974330 1581.138      0   0
ao0301855 27.22039 7.446494 1524.158      0   0
ao0303842 26.39358 7.400163 1488.512      0   0
ao0303286 26.37514 7.459464 1498.223      0   0
ao0310604 26.29831 9.548651 1536.513      0   0
ao0307037 25.48104 9.590910 1499.271      0   0

why glmLRT(fit,coef = 2) and glmLRT(fit,contrast = CONTRASTS[,'A10hvsA0h']) produce two different results? the value of logFC is largely different. Actually, I want to use parameter contrast instead of coef.

What should I do? Thank you in advance

edgeR no replication glmLRT(fit coef=2) • 710 views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

You want to compare A10h vs A0h and you've already done that successfully.

You could have got the same results by defining instead

design <- model.matrix(~group)

and then

lrt <- glmLRT(fit, coef=2)

coef=2 doesn't give the same result as the contrast unless you define the design matrix appropriately. There is a section on this in the limma User's Guide, which also applies to edgeR.

ADD COMMENT
0
Entering edit mode

Thank you for your prompt reply!I now know how to make the two results the same.

ADD REPLY

Login before adding your answer.

Traffic: 381 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6