Limma-voom - Timepoint by timepoint comparison in timecourse experiment
2
0
Entering edit mode
rina ▴ 30
@rina-16738
Last seen 7 months ago
France

I am analyzing RNA-Seq data from a drug response dataset. There, we have a timecourse experiment with the following experimental design:

3 timepoints: 4h, 24h, 48h

3 drugs (A,B,C) + 1 untreated control

3 replicates per drug-timepoints

I would like to get differentially expressed genes for each timepoint compared to the untreated control. I have followed the limma-voom tutorial, and I get close to 300.000 differentially expressed genes (that go down to 10.000 if I apply an FDR and logFC threshold). I am expecting to find differences compared to the untreated samples, however, I am not sure whether the p-values are inflatted due to fewer changes in the earlier time points, as it has been suggested in another question in BioConductor regarding limma. Is there a difference on how limma performs differential analysis if fewer or more contrasts are specified?

My code can be found below:

# samples: df with sample metadata. Condition column combines both Drug + Timepoint (e.g. A_4h, B_4h, C_4h, ...., untreated_48h)
# gene_counts: df with gene counts

lev <- unique(samples$Condition)
fac <- factor(samples$Condition, levels=lev)
design <- model.matrix(~ 0 + fac)
colnames(design) <- lev

x <- DGEList(counts=count_df)
cpm <- cpm(x)
lcpm <- cpm(x, log=TRUE)

keep.exprs <- rowSums(cpm>1)>=3
x <- x[keep.exprs, keep.lib.sizes=FALSE]
dim(x)

contrast_list <- makeContrasts(
   Diff_A_4h = A_4h - untreated_4h,
   Diff_B_4h = B_4h - untreated_4h,
   Diff_C_4h = C_4h - untreated_4h,
   Diff_A_24h = A_24h - untreated_24h,
   Diff_A_24h = A_24h - untreated_24h,
   Diff_A_24h = A_24h - untreated_24h,
   Diff_A_48h = A_48h - untreated_48h,
   Diff_A_48h = A_48h - untreated_48h,
   Diff_A_48h = A_48h - untreated_48h,
   levels = design)

vfit <- lmFit(v, design)
vfit <- contrasts.fit(vfit, contrasts=contrasts_list)
efit <- eBayes(vfit)

# And to extract the DEGs per contrast

extract_DEGs <- function(x1){
  contr_name <- colnames(efit$coefficients)[[x1]]
  tt <- topTable(efit, x1, number = Inf) %>%
    rownames_to_column("Gene_name") %>%
    mutate(Contrast = contr_name) %>%
    as_tibble() 
  tt
}

Coeffs <- dimnames(efit$coefficients)[["Contrasts"]]
limma • 607 views
ADD COMMENT
1
Entering edit mode
@gordon-smyth
Last seen 38 minutes ago
WEHI, Melbourne, Australia

I am not sure whether the p-values are inflatted due to fewer changes in the earlier time points, as it has been suggested in another question in BioConductor regarding limma.

No, there is no such problem.

Is there a difference on how limma performs differential analysis if fewer or more contrasts are specified?

No, not unless you use non-default options to decideTests().

Regarding your earlier steps

x <- DGEList(counts=count_df)
keep.exprs <- rowSums(cpm>1)>=3
x <- x[keep.exprs, keep.lib.sizes=FALSE]

we recommend

keep.exprs <- filterByExpr(y, design)

and we recommend TMM normalization by

x <- calcNormFactors(x)
ADD COMMENT
0
Entering edit mode

Thank you so much for the quick reply and recommendations regarding data pre-processing!

ADD REPLY

Login before adding your answer.

Traffic: 829 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6