Hi,
I want to get this straight, as this is my first time using limma-voom and duplicateCorrelation.
My metadata has two Tissue types (tumour or normal) across two Diets. Each Individual had one or more biopsies: normal, tumour, or several of each.
conds  Individual  Tissue  Diet
S1     1           Normal  NC
S2     1           Tumour  NC
S3     1           Tumour  NC
S4     1           Tumour  NC
S5     2           Normal  NC
S6     3           Normal  NC
S7     4           Tumour  NC
S8     4           Tumour  NC
S9     5           Tumour  NC
S10    5           Tumour  NC
S11    5           Tumour  NC
S12    5           Normal  NC
S13    6           Tumour  NC
S14    6           Tumour  NC
S15    6           Tumour  NC
S16    6           Tumour  NC
S17    7           Tumour  NC
S18    7           Tumour  NC
S19    7           Tumour  NC
S20    7           Normal  NC
S21    8           Normal  HFD
S22    10          Normal  HFD
S23    11          Tumour  HFD
S24    11          Tumour  HFD
S25    11          Tumour  HFD
S26    12          Tumour  HFD
S27    12          Tumour  HFD
S28    12          Tumour  HFD
S29    12          Tumour  HFD
S30    13          Normal  HFD
S31    21          Tumour  HFD
S32    22          Tumour  HFD
S33    22          Tumour  HFD
S34    22          Tumour  HFD
S35    22          Normal  HFD
I believe I can use duplicateCorrelation with block=Individual to account for correlated gene expression among samples from the same individual. I make a combined DietTissue factor, as we believe gene expression in Tissue will be affected by Diet.
    ## combined Diet.Tissue factor and a no-intercept design, one column per group
    DietTissue <- factor(paste(conds$Diet, conds$Tissue, sep="."))
    DTdesign <- model.matrix(~0+DietTissue, conds)
    colnames(DTdesign) <- levels(DietTissue)

    ## filter lowly expressed genes and normalise
    DTkeep <- filterByExpr(dge, DTdesign)
    DTdge <- dge[DTkeep, , keep.lib.sizes=FALSE]
    DTdge <- calcNormFactors(DTdge)

    ## voom
    DTdcv <- voom(DTdge, DTdesign, plot=FALSE)

    ## DC: https://support.bioconductor.org/p/94280/#94290; https://support.bioconductor.org/p/59700/
    DTcor <- duplicateCorrelation(DTdcv, DTdesign, block=conds$Individual)
    DTdcdcv <- voom(DTdge, DTdesign, correlation=DTcor$consensus.correlation, block=conds$Individual)
    DTdcfit <- lmFit(DTdcdcv, DTdesign, correlation=DTcor$consensus.correlation, block=conds$Individual)

    ## tumour vs. normal within each diet
    DTdcmc <- makeContrasts(HFD.Tumour_Normal = HFD.Tumour - HFD.Normal,
                            NC.Tumour_Normal = NC.Tumour - NC.Normal,
                            levels=DTdesign)
    DTdcfitmc <- contrasts.fit(DTdcfit, DTdcmc)
    DTdcfitmc <- eBayes(DTdcfitmc)
Any comments with respect to this analysis would be greatly appreciated.
Bruce
Hi Aaron,
Thanks for the reply, and sorry for the late response; I got no notification mail.
I find a very small change after a second round of duplicateCorrelation.
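For anyone following along, the two-round scheme could be sketched roughly as below. This is only a minimal sketch on simulated counts: the objects `indiv`, `tissue`, `counts` and the simulated individual effect are made up for illustration and are not the data from this thread, and I am assuming the robust=TRUE tip refers to eBayes.

```r
library(limma)

set.seed(1)
# Hypothetical toy data: 200 genes, 6 individuals with 3 biopsies each
n_genes <- 200
indiv <- factor(rep(1:6, each = 3))
tissue <- factor(rep(c("Normal", "Tumour", "Tumour"), times = 6))

# Simulate a per-gene individual effect so biopsies from one individual correlate
indiv_eff <- matrix(rnorm(n_genes * 6, sd = 0.5), nrow = n_genes)
mu <- 50 * exp(indiv_eff[, as.integer(indiv)])
counts <- matrix(rnbinom(length(mu), mu = mu, size = 10), nrow = n_genes)
rownames(counts) <- paste0("g", seq_len(n_genes))

design <- model.matrix(~0 + tissue)
colnames(design) <- levels(tissue)

# Round 1: voom without blocking, then estimate the consensus correlation
v1 <- voom(counts, design)
cor1 <- duplicateCorrelation(v1, design, block = indiv)

# Round 2: voom again using that correlation, then re-estimate it
v2 <- voom(counts, design, block = indiv,
           correlation = cor1$consensus.correlation)
cor2 <- duplicateCorrelation(v2, design, block = indiv)

# Fit with the second-round estimate; robust = TRUE here is an assumption
fit <- lmFit(v2, design, block = indiv,
             correlation = cor2$consensus.correlation)
fit <- eBayes(fit, robust = TRUE)

# Compare the two estimates; a small difference means round 2 changed little
c(round1 = cor1$consensus.correlation, round2 = cor2$consensus.correlation)
```

If the two numbers printed at the end are close, the second round has made little difference to the weights or the fit.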
I would like to perform MDS post duplicateCorrelation. Is there a way to apply a scaling factor accounting for the correlation to counts/CPM on a per-sample/individual level that you know of?
Thanks for the tip on robust=TRUE.
Bruce.
In general, no. The reason we have to use duplicateCorrelation in the first place is because we don't have enough information in the experimental design to estimate the effect of the blocking factor, which means we can't compute corrected expression values that are free of said effect. At least, not without also removing confounded effects of interest, which is the diet in your case. I don't think there's an obvious visual analogy in the MDS plot for the effect of duplicateCorrelation. The closest I can think of is somehow "squeezing" all samples together in a manner that preserves the differences within each level of the blocking factor.
Long story short, don't worry about it and just make your MDS plot. If the diet effect is strong, it should show up fine regardless of the fact that it's confounded with the individual effect.
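To make the confounding concrete, here is a minimal base-R sketch on made-up metadata (the `toy` data frame is hypothetical and much smaller than the design in this thread). When every individual is seen under one diet only, a design with both Individual and Diet as fixed effects has more columns than its rank, so the two effects cannot be separated and the individual effect cannot be removed without also removing the diet effect.

```r
# Hypothetical toy metadata: each individual appears under a single diet
toy <- data.frame(
  Individual = factor(c(1, 1, 2, 2, 3, 3, 4, 4)),
  Diet = factor(c("NC", "NC", "NC", "NC", "HFD", "HFD", "HFD", "HFD"))
)

# Try to model Individual and Diet together as fixed effects
full <- model.matrix(~ Individual + Diet, toy)

n_coef <- ncol(full)      # coefficients the model asks for
n_rank <- qr(full)$rank   # coefficients that can actually be estimated

# n_rank < n_coef: the Diet column is a linear combination of the others
c(coefficients = n_coef, rank = n_rank)
```

This is why the individual is handled through the correlation structure rather than as a term in the design matrix.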
OK, good to know, and yes, with a strong effect (tumour vs. normal) we find a separation as expected, but the diet effect is minimal. Not sure I won't worry about it though =D Thanks for your advice, and your continued support of these kinds of questions, it is incredibly helpful.