Question on performing ANOVA-like analysis using limma
1
0
Entering edit mode
grubble • 0
@97fab824
Last seen 7 days ago
Australia

Hello, I'm a bit confused as to how to set-up and run my differential expression RNAseq analysis. I have 20 different subjects, ten of which are male and ten of which are female, and I have seven time points for each subject (baseline, 0h, 4h, 8h, 12h, 16h, 24h). I want to know what genes change differently between men and women at each time point relative to baseline (e.g., are there any differences in differential expression between men and women for the comparison 8h vs baseline etc.). I also want to account for the variability between participants that is independent of sex, but I am confused as to whether this could effect the detection of sex-specific effects even if sex is used in my model.

Currently my code is as as follows:

sextime <- factor(paste(sampledata$time, sampledata$Sex, sep="."))

designsextime <- model.matrix(~0 + sextime, y$samples)
colnames(designsextime) <- gsub("sextime", "", colnames(designsextime))

contr.matrixsextime <- makeContrasts(
  "h0vsbaselinedif" = (h0.M-baseline.M) - (h0.F-baseline.F),
  "h4vsbaselinedif" = (h4.M-baseline.M) - (h4F-baseline.F),
  "h8vsbaselinedif" = (h8.M-baseline.M) - (h8F-baseline.F),
  "h12vsbaselinedif" = (h12.M-baseline.M) - (h12F-baseline.F),
  "h16vsbaselinedif" = (h16.M-baseline.M) - (h16F-baseline.F),
  "h24vsbaselinedif" = (h24.M-baseline.M) - (h24F-baseline.F),
  levels = colnames(designsextime))

v <- voom(y, designsextime, plot=TRUE)

fit <- lmFit(v, design=designsextime)

cor <- duplicateCorrelation(counts, designsextime, block=sampleid)

fit <- lmFit(v, design=designsextime, 
    block=sampleid, correlation=cor$consensus.correlation)

fit <- contrasts.fit(fit, contrasts=contr.matrixsextime)
efit <- eBayes(fit)

summary(decideTests(efit, adjust.method = "BH", adj.P.Val=0.05))

Would this be appropriate, or would it be more wise to remove the duplicateCorrelation and blocking by sampleid incase it affects the detection of sex-specific differences and instead have my code as follows (and therefore not account for interindividual variability)

sextime <- factor(paste(sampledata$time, sampledata$Sex, sep="."))

designsextime <- model.matrix(~0 + sextime, y$samples)
colnames(designsextime) <- gsub("sextime", "", colnames(designsextime))

contr.matrixsextime <- makeContrasts(
  "h0vsbaselinesex" = (h0.M-baseline.M) - (h0.F-baseline.F),
  "h4vsbaselinesex" = (h4.M-baseline.M) - (h4F-baseline.F),
  "h8vsbaselinesex" = (h8.M-baseline.M) - (h8F-baseline.F),
  "h12vsbaselinesex" = (h12.M-baseline.M) - (h12F-baseline.F),
  "h16vsbaselinesex" = (h16.M-baseline.M) - (h16F-baseline.F),
  "h24vsbaselinesex" = (h24.M-baseline.M) - (h24F-baseline.F),
  levels = colnames(designsextime))

v <- voom(y, designsextime, plot=TRUE)

fit <- lmFit(v, design=designsextime)

cor <- duplicateCorrelation(counts, designsextime)

fit <- lmFit(v, design=designsextime)

fit <- contrasts.fit(fit, contrasts=contr.matrixsextime)
efit <- eBayes(fit)

summary(decideTests(efit, adjust.method = "BH", adj.P.Val=0.05))

or alternatively, analyse the differential expression of men and women separately, where i can 'safely' use blocking by sampleid and then compare statistically the log2FCs generated for each gene in each contrast between men and women

limma • 314 views
ADD COMMENT
0
Entering edit mode

What does sampleid represent? If you use duplicateCorrelation(), then the blocking factor should be subject, but it isn't clear from your question whether sampleid is the subject identifier or something else. If sampleid is literally a sample identifier, and is different for every library, then the duplicateCorrelation analysis would not make sense and will give an error.

BTW, some of your code lines don't folllow limma syntax. The decideTests function doesn't have an adj.P.Val argument and you can't run duplicateCorrelation without a blocking factor.

ADD REPLY
0
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

In time course experiments such as this, it is essential to account for baseline variability between subjects, so the second analysis given in your question is wrong.

I don't understand what the your first analysis is doing because I don't know what sampleid represents. I can say though that duplicateCorrelation does not interfere with the detection of sex-specific effects. This analysis would be ok if you are blocking on subject, but is nevertheless not my recommended analysis.

My recommendation for a balanced time course experiment like this is to include subject in the design matrix and to include sex-specific time effects.

ADD COMMENT

Login before adding your answer.

Traffic: 425 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6