interpretation complex design limma
dfrtyu • 0
@grateshak-10586
United Kingdom

Hi everyone, and Prof. Gordon Smyth,

Please help with how best to interpret two ways of specifying contrasts in limma, shown below. The objective was to pool higher/secondary-level groups, as well as first-level groups of samples, within the design to get DGE.

So, given a design matrix and the logCPM values with mean-variance weights from the voom() function, four people followed the logic of ordinary designs and added the 'higher/secondary-level' contrasts as below:

ct <- makeContrasts(g2v1 = (group2_dead + group2_alive) - (group1_dead + group1_alive),
                    levels = design)  # levels= is required; any further contrasts are not shown in the post

b <- eBayes(contrasts.fit(lmFit(data, design), contrasts = ct))
summary(decideTests(b))


My question is: does this approach give DE results with any meaningful interpretation, or should it be discarded completely in favour of dividing by the number of groups being pooled, as below?

ct <- makeContrasts(g2v1 = (group2_dead + group2_alive)/2 - (group1_dead + group1_alive)/2,
                    levels = design)  # levels= is required; any further contrasts are not shown in the post

b <- eBayes(contrasts.fit(lmFit(data, design), contrasts = ct))
summary(decideTests(b))

sessionInfo( )
R version 4.0.1 (2020-06-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
[5] BiocParallel_1.24.1 genefilter_1.72.1   mgcv_1.8-31         nlme_3.1-148
[9] oligo_1.54.1        Biostrings_2.58.0   XVector_0.30.0      IRanges_2.24.1
[13] S4Vectors_0.28.1    oligoClasses_1.52.0 affy_1.68.0         forcats_0.5.1
[21] tidyr_1.1.3         tibble_3.1.1        ggplot2_3.3.5       tidyverse_1.3.1
[25] limma_3.46.0        GEOquery_2.58.0     Biobase_2.50.0      BiocGenerics_0.36.1

@james-w-macdonald-5106
United States

When you fit a linear model and make comparisons, you are always computing the average for each group, and you make comparisons by calculating differences between those averages. In your first contrast you are computing sums, whereas in the second you are computing averages. In other words,

g2v1=(group2_dead+group2_alive) - (group1_dead+group1_alive)


is the sum of group 2 minus the sum of group 1, which isn't something you would normally care to know, whereas

g2v1=(group2_dead+group2_alive)/2 - (group1_dead+group1_alive)/2


is the average of group 2 minus the average of group 1, which is a readily interpretable quantity.
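
Writing the two contrasts out in terms of the fitted coefficients may make the difference concrete. The symbols here are introduced only for illustration: each β stands for the fitted mean log-expression of one of the four groups in the design (for example, β_2d for group2_dead).

```latex
\begin{aligned}
C_{\text{sum}}  &= (\beta_{2d} + \beta_{2a}) - (\beta_{1d} + \beta_{1a}) \\
C_{\text{mean}} &= \tfrac{1}{2}(\beta_{2d} + \beta_{2a}) - \tfrac{1}{2}(\beta_{1d} + \beta_{1a})
                 = \tfrac{1}{2}\, C_{\text{sum}}
\end{aligned}
```

Since each β is on the log2 scale, C_mean reads as an ordinary log-fold-change between the pooled groups, while C_sum is twice that value.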


Very many thanks for the reply! @ James MacDonald

Indeed, it is probably unnecessary to use g2v1=(group2_dead+group2_alive), hence the question about interpretation vis-a-vis the concept of DE. Part of why I asked about interpretability is that a 'non-expert' was asking me whether the inputs are all sums of log data.

I guess you are indicating that such a contrast is not interpretable.


The two different contrast matrices you give will yield identical lists of DE genes, p-values and FDRs. The only difference will be in the log-fold-changes, which will differ by a factor of 2 for the third contrast. As long as you know what the logFCs mean, both choices lead to the same conclusions, but I would myself always use the mean-mean contrast rather than the sum-sum one.
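
A minimal sketch of why the two contrasts give the same test results, using base R on one simulated gene rather than limma itself. All data values, the seed, and the helper name `contrast_stats` are hypothetical. It uses an ordinary t-statistic; limma's moderated t replaces the variance estimate but is, as far as I know, unchanged by rescaling a contrast in the same way, because the estimate and its standard error are both multiplied by the same factor.

```r
# Simulated log-expression for ONE gene: four groups of 3 samples (hypothetical values)
set.seed(1)
group <- factor(rep(c("group1_alive", "group1_dead", "group2_alive", "group2_dead"),
                    each = 3))
y <- rnorm(12, mean = rep(c(5, 5.5, 6, 7), each = 3), sd = 0.3)

# Group-means design (no intercept), so each coefficient is a group average
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

fit     <- lm.fit(design, y)
beta    <- fit$coefficients
df      <- nrow(design) - ncol(design)
s2      <- sum(fit$residuals^2) / df      # residual variance
XtX_inv <- solve(crossprod(design))

# Estimate, t-statistic and two-sided p-value for a contrast weight vector w
contrast_stats <- function(w) {
  est   <- sum(w * beta)
  se    <- sqrt(s2 * drop(t(w) %*% XtX_inv %*% w))
  tstat <- est / se
  c(logFC = est, t = tstat, p = 2 * pt(-abs(tstat), df))
}

# Weights in the same order as the design columns
w_sum  <- c(group1_alive = -1, group1_dead = -1, group2_alive = 1, group2_dead = 1)
w_mean <- w_sum / 2

res_sum  <- contrast_stats(w_sum)
res_mean <- contrast_stats(w_mean)
print(rbind(sum = res_sum, mean = res_mean))
```

In the printed table the `t` and `p` columns should agree between the two rows, while the `logFC` in the `sum` row is twice that in the `mean` row.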