Question: LIMMA : group specific effect of a treatment while controlling for individual effect
0
14 months ago by
SPbeginner0 wrote:

Dear Bioconductor community,

I have timecourse RNA-seq data.
Animals received 4 differents treatments (group = GpA, GpB, GpCD, GpE).
There are 6 animals in each group.
For each animal,blood were collacted before any treatment (D0) and at 3 time points after treatment: D3, D7 and D28.
So, the experiment could be described as follows :

Animal    Time    Group Rep
M657    0    GpA 1
M657    3    GpA 1
M657    7    GpA 1
M657    28    GpA 1
M658    0    GpA 2
M658    3    GpA 2
M658    7    GpA 2
M658    28    GpA 2
.
.
.
.
D627    0    GpB 1
D627    3    GpB 1
D627    7    GpB 1
D627    28    GpB 1
D628    0    GpB 2
D628    3    GpB 2
D628    7    GpB 2
D628    28    GpB 2
.
.
.
.

With rep being a factor value for Animal-Group which distinguishes the individual nested within a group.

I want to determine :
1/ Which genes respond at either the Day3, Day7 or day28 in the different group
for exemple :
GpA : Day3vsDay0 Day7vsDay0, Day28vsDay0
do I have to analyse each Group separatly ? Or I can analyse them all together ?
Because each subject yields a value for each time point, can I take "Animal" effect into my model ? For this, I use block option in duplicateCorrelation().

Here is what I did :

TS <- paste(Design_clean$Group, Design_clean$Time, sep=".")
Design_clean_simple <-data.frame(Group=TS,Replicate=factor(Design_clean$Rep),row.names=row.names(Design_clean)) Blocked_Design_model<-model.matrix(~0+Group,data=Design_clean_simple) dge <- DGEList(counts=Count) dge.norm <- calcNormFactors(dge) v.wts.bl<- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE) corfit.wts.bl <- duplicateCorrelation(v.wts.bl,Blocked_Design_model,block=Design_clean_simple$Replicate)
v.wts.bl.corfit <- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE, correlation=corfit.wts.bl$consensus) fit.wts.bl.corfit <- lmFit(v.wts.bl.corfit, Blocked_Design_model,block=Design_clean_simple$Replicate,correlation=corfit.wts.bl$consensus) To identify DEG at Day3, Day7 and Day28 when compared to baseline (Day0) : contrast.matrix.blocked<-makeContrasts(GpA.D3=GroupGpA.3-GroupGpA.0, GpA.D7=GroupGpA.7-GroupGpA.0, GpA.D28=GroupGpA.28-GroupGpA.0 levels=colnames(Blocked_Design_model)) fit2.wts.bl.corfit <- contrasts.fit(fit.wts.bl.corfit, contrast.matrix.blocked) fit2.wts.bl.corfit <- eBayes(fit2.wts.bl.corfit) 2/ Which genes respond differently over time in the different groups ie : Which genes respond differently over time in the GpA relative to the GpB? So, to answer this question, here is the contrast matrix, as an exemple Which genes respond differently at Day3 in the GpA relative to the GpB : contrast.matrix.blocked<-makeContrasts(GpAvsGpB.D3 = (GroupA.3-GroupA.0)-(GroupB.3-GroupB.0),levels=colnames(Blocked_Design_model)) it2.wts.bl.corfit <- contrasts.fit(fit.wts.bl.corfit, contrast.matrix.blocked) fit2.wts.bl.corfit <- eBayes(fit2.wts.bl.corfit) Am I doing right ? ADD COMMENTlink modified 14 months ago • written 14 months ago by SPbeginner0 None of your tags are limma; the maintainers don't get notified if the post is not tagged with the package name. ADD REPLYlink written 14 months ago by Aaron Lun23k Ok, thanks for your comment, I admit that's was not clear enought; I edit my post to make it more understandable. So with your recommandations : TS <- paste(Design_clean$Group, Design_clean$Time, sep=".") Design_clean_simple <-data.frame(Group=TS,Animal=factor(Design_clean$Animal),row.names=row.names(Design_clean))

head(Design_clean_simple)
Animal Group
M657 GpA.0
M657 GpA.3
M657 GpA.7
M657 GpA.28
M658 GpA.0
M658 GpA.3
M658 GpA.7
M658 GpA.28
.
.
.
.
D627 GpB.0
D627 GpB.3
D627 GpB.7
D627 GpB.28
D628 GpB.0
D628 GpB.3
D628 GpB.7
D628 GpB.28
Blocked_Design_model<-model.matrix(~0+Group,data=Design_clean_simple)
dge <- DGEList(counts=Count)
dge.norm <- calcNormFactors(dge)

v.wts.bl<- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE)
corfit.wts.bl <- duplicateCorrelation(v.wts.bl,Blocked_Design_model,block=Design_clean_simple$Animal) v.wts.bl.corfit <- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE, correlation=corfit.wts.bl$consensus)
fit.wts.bl.corfit <- lmFit(v.wts.bl.corfit, Blocked_Design_model,block=Design_clean_simple$Replicate,correlation=corfit.wts.bl$consensus)

contrast.matrix.blocked<-makeContrasts(GpA.D3=GroupGpA.3-GroupGpA.0, GpA.D7=GroupGpA.7-GroupGpA.0, GpA.D28=GroupGpA.28-GroupGpA.0
levels=colnames(Blocked_Design_model))

fit2.wts.bl.corfit <- contrasts.fit(fit.wts.bl.corfit, contrast.matrix.blocked)
fit2.wts.bl.corfit <- eBayes(fit2.wts.bl.corfit,robust=TRUE)

I have another question :

contrast.matrix.blocked<-makeContrasts(GpAvsGpB.D3 = (GroupA.3-GroupA.0)-(GroupB.3-GroupB.0),levels=colnames(Blocked_Design_model))
it2.wts.bl.corfit <- contrasts.fit(fit.wts.bl.corfit, contrast.matrix.blocked)
fit2.wts.bl.corfit <- eBayes(fit2.wts.bl.corfit)

how to interpret the logFC in this example ?

Am I right if i say : genes with logFC >0 are genes that are up-regulated in GroupA at day3 vs day0  but not affected (or down-regulated) in GroupeB at day3 vs day0 ?

No. Think of the possible options.

GroupA.3-GroupA.0 > 0 > GroupB.3-GroupB.0 => logFC > 0
GroupA.3-GroupA.0 > GroupB.3-GroupB.0 > 0 => logFC > 0
0 > GroupA.3-GroupA.0 > GroupB.3-GroupB.0 => logFC > 0

So it could be anything, as long as the 3 vs 0 log-fold change in group A is greater than that in group B.

Answer: LIMMA : group specific effect of a treatment while controlling for individual ef
1
14 months ago by
Aaron Lun23k
Cambridge, United Kingdom
Aaron Lun23k wrote:

This looks fine to me. A few suggestions:

• Make sure you've filtered out low-abundance before running calcNormFactors, etc.
• I usually like running eBayes with robust=TRUE to protect against outliers.
• I would put Animal in the design matrix (and resolve any linear dependencies) rather than blocking on it with duplicateCorrelation. The latter is slightly anticonservative, and you don't need it as you are not comparing directly between different animals. Your comparisons are either within animals (GpA.D3) or between log-fold changes (GpAvsGpB.D3).
1

Actually I don't think the code is right because OP has computed duplicateCorrelation for 'Rep' instead of for 'Animal'. If this was corrected, I think the use of duplicateCorrelation would be fine here.

'Rep' is the same as 'animal' : it's a factor value for Animal-Group which distinguishes the individual nested within a group.

Each Group will have animal numbered between 1-6. Is it better to let Animal ID in my model ?

Is my code correct like this :

TS <- paste(Design_clean$Group, Design_clean$Time, sep=".")
Design_clean_simple <-data.frame(Group=TS,Animal=factor(Design_clean$Animal),row.names=row.names(Design_clean)) Blocked_Design_model<-model.matrix(~0+Group,data=Design_clean_simple) dge <- DGEList(counts=Count) dge.norm <- calcNormFactors(dge) v.wts.bl<- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE) corfit.wts.bl <- duplicateCorrelation(v.wts.bl,Blocked_Design_model,block=Design_clean_simple$Animal)
v.wts.bl.corfit <- voomWithQualityWeights(dge.norm, Blocked_Design_model, plot=TRUE, correlation=corfit.wts.bl$consensus) fit.wts.bl.corfit <- lmFit(v.wts.bl.corfit, Blocked_Design_model,block=Design_clean_simple$Replicate,correlation=corfit.wts.bl\$consensus)

contrast.matrix.blocked<-makeContrasts(GpA.D3=GroupGpA.3-GroupGpA.0, GpA.D7=GroupGpA.7-GroupGpA.0, GpA.D28=GroupGpA.28-GroupGpA.0
levels=colnames(Blocked_Design_model))

fit2.wts.bl.corfit <- contrasts.fit(fit.wts.bl.corfit, contrast.matrix.blocked)
fit2.wts.bl.corfit <- eBayes(fit2.wts.bl.corfit,robust=TRUE)

'Rep' is not the same as 'Animal'. In the table that you show, 'Animal' takes on 12 values while 'Rep' takes on only 6 values. The vector that you input to limma as the block to duplicateCorrelation must take on 12 values.

It is hard to check your code with certainty, considering that all your code works from a data frame called 'Design_clean' that has different columns from the table that you actually show us. The table you show has a column called 'Treatment', but no 'Group'. The code works from 'Group but not 'Treatment'. Why show us one table but work from another?

And, to be pedantic, the table you show is not a "design matrix". In limma, it is called the "targets frame".

Answer: LIMMA : group specific effect of a treatment while controlling for individual ef
0
14 months ago by
Gordon Smyth37k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth37k wrote:

Your new code (posted 9 March 2018) looks correct, except that I would insert the line

keep <- filterByExpr(dge, Blocked_Design_model)
dge <- dge[keep,,keep.lib.sizes=FALSE]


before calcNormFactors. That was also Aaron's no. 1 suggestion.

The function filterByExpr was added to edgeR recently, on 26 Dec 2017, so you may need to re-install edgeR to get it.

With the approach taken, your experiment is just a one-way layout, so forming any contrasts between the groups or times is very straightforward, as I think you already appreciate.

Of course, as always, it is good practice to examine plots at each stage of your analysis, and check that the consensus correlation is positive, rather than just running all the code through as a block.

Could you specify what version of edgeR the function filterByExpr can be found in? I am working in version 3.20.9 and it isn't found. Reinstalling with biocLite doesn't seem to resolve the issue.

3.20.9 is ancient, we're on 3.34.9 now. Make sure you're running the latest R (3.4.*) and read:

https://www.bioconductor.org/install/#troubleshoot-bioconductor-packages