applying voom + limma to a block factor design in RNA-seq experiment
1
1
Entering edit mode
jfertaj ▴ 30
@jfertaj-8566
Last seen 19 months ago
United Kingdom

Hi,

I have a block-design for my study where the metadata looks like this:

   sample_ID location sex polyp_type age  status     files
1   INTP_103 proximal   1          1  31    ctrl INTP_1031
2   INTP_103   distal   1          1  31    ctrl INTP_1032
3   INTP_103   rectum   1          1  31    ctrl INTP_1033
4   INTP_105 proximal   2          3  66 disease INTP_1051
5   INTP_105   distal   2          3  66 disease INTP_1052
6   INTP_105   rectum   2          3  66 disease INTP_1053
7   INTP_106 proximal   1          2  64 disease INTP_1061
8   INTP_106   distal   1          2  64 disease INTP_1062
9   INTP_106   rectum   1          2  64 disease INTP_1063
10  INTP_111 proximal   2          1  49    ctrl INTP_1111

For each sample there are 3 different locations and also there are two different status (ctrl, and disease). I would like to test the difference between locations, between status and interactions 

Here is the code I am using

y <- DGEList(txi_longScaled$counts) # the counts come from a kallisto experiment
y <- calcNormFactors(y)

Treat <- factor(paste(metadata$status, metadata$location, sep="."))
design <- model.matrix(~0+Treat)
colnames(design) <- levels(Treat)
v <- voom(y, design)

#Then we estimate the correlation between measurements made on the same subject 
corfit <- duplicateCorrelation(v, design, block=metadata$sample_ID)
corfit$consensus

v <- voom(y, design, block = metadata$sample_ID, correlation = corfit$consensus)

fit <- lmFit(v, design, block = metadata$sample_ID, correlation = corfit$consensus)

cm <- makeContrasts( 
          CtrlvsCasesProximal = ctrl.proximal - disease.proximal,
          CtrlvsCasesDistal = ctrl.distal - disease.distal,
          CtrlvsCasesRectum = ctrl.rectum - disease.rectum,
          Ctrl.proximalvsCtrl.rectum = ctrl.proximal - ctrl.rectum,
          Ctrl.distalvsCtrl.proximal = ctrl.distal - ctrl.proximal,
          Ctrl.distalvsCtrl.rectum = ctrl.distal - ctrl.rectum,
          Dis.proximalvsDis.distal = disease.proximal - disease.distal,
          Dis.proximalvsDis.rectum = disease.proximal - disease.rectum,
          Dis.distalvsDis.rectum = disease.distal - disease.rectum,
levels=design)

fit2 <- contrasts.fit(fit, cm)
fit2 <- eBayes(fit2)

 CtrlvsCasesProximal <- topTable(fit2, coef="CtrlvsCasesProximal"), number = nrow(y$counts)  #will find those genes that are differentially expressed between the normal and diseased subjects in the location proximal

I would like to correct also for age and sex, my two questions are: a) is my approach correct (I am following this post using duplicateCorrelation with limma+voom for RNA-seq data)? and b) if I want to correct for sex and age is enough to do it when I first declare design? i.e 

design <- model.matrix(~0+Treat+age+as.factor(metadata$status)

Thanks

limma voom • 3.0k views
ADD COMMENT
3
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 20 hours ago
The city by the bay

Your code has syntax errors in the topTable call and the second model.matrix call. I don't see where covs$status comes from; does it represent disease/control status? If so, I don't understand why you're re-adding metadata$status in the second model.matrix call instead of adding metadata$sex. Your CtrlvsCasesDistal contrast is also incorrect, you should subtract disease.distal.

In any case, the correct version of your second design matrix would look more like:

age <- metadata$age
sex <- factor(metadata$sex)
design <- model.matrix(~0+Treat+age+sex)
colnames(design)[seq_len(nlevels(Treat))] <- levels(Treat)

Also see C: limma blocking and including covariates, which discusses why you need to block on constant patient characteristics in the design matrix even after you have blocked on the patient factor itself in duplicateCorrelation.

P.S. Split the limma and voom tags, otherwise the maintainers of either won't get notified.

ADD COMMENT
0
Entering edit mode

Thanks a lot @Aaron, I have corrected the code and also look to the post you mentioned. I was a bit worried because the comparisons Ctrl.distalvsCtrl.proximal,Ctrl.distalvsCtrl.rectum, and Ctrl.proximalvsCtrl.rectum give me a lot of DE genes with really low adjusted pvalue (1x10-e32) and strong B (100.34) so or the differences are true (same patient different locations) or I am doing something wrong.

ADD REPLY

Login before adding your answer.

Traffic: 497 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6