Limma for cell components effect on response
1
0
Entering edit mode
antgomo • 0
@antgomo-21296
Last seen 3.0 years ago
Spain

I am using limma to find the cell composition effect on response against treatment

my phenodata is as follows

...

Code  RESP Age SEX         CD8T      CD4T      NK          Bcell      Mono     Neu

30105    R    39   F    0.08900827 0.10813018 0.04065231 0.03384039 0.08240881 0.6639053
30106   NR   34   F    0.07089437 0.13440019 0.05485091 0.03168903 0.09493677 0.6452768
30119    R    73   F    0.10066220 0.23214008 0.07688516 0.04285787 0.07616766 0.4949032
30121    R    58   F    0.09589028 0.12685706 0.05535219 0.03765947 0.06127022 0.6421821
30122    R    47   F    0.04024961 0.07496977 0.02534626 0.02226978 0.05101141 0.8014216
30125   NR   66   F    0.02996649 0.05638210 0.02400648 0.02612157 0.02844631 0.8519205
30126    R    53   F    0.05369147 0.16694206 0.02350887 0.04463168 0.07090899 0.6591133
30128   NR   76   F    0.05227852 0.25069129 0.03144042 0.03237537 0.13886930 0.5275622
30134   NR   47   F    0.08675013 0.17954926 0.03897045 0.08915519 0.10016838 0.5315112
30135    R    55   F    0.06359675 0.15270431 0.03647699 0.04048208 0.07990695 0.6537159

...

So i was thinking i something like that, let's say that i want to test each effect separately to get the effect of this particular cell component in response to treatment

design <- model.matrix(~0+pd$RESP+pd$RESP:pd$Neu+ pd$SEX + pd$Age)

In this case, i have to test the5 and 6 coefficients, am i right?

pd$RESPNR pd$RESPR pd$SEXM pd$Age pd$RESPNR:pd$Neu pd$RESPR:pd$Neu
  1         0        1                      0     39        0.0000000       0.6639053
  2         1        0                      0     34        0.6452768       0.0000000
  3         0        1                      0     73        0.0000000       0.4949032
  4         0        1                      0     58        0.0000000       0.6421821
  5         0        1                      0     47        0.0000000       0.8014216
  6         0        1                      0     66        0.0000000       0.8519205

However, i don't know if this is the real design that i have to follow, maybe i have to consider splines and hence

X <- ns(df$Neu, df=3)

design <- model.matrix(~pd$RESP*X+ pd$Sex + pd$Age)

Any hint would be greatly appreciated

Many thanks in advance

450k EPIC minfi limma cell components • 1.4k views
ADD COMMENT
0
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 53 minutes ago
The city by the bay

You should throw all the cell compositions into the model at once. This protects you against co-linearity in the compositions. For example, if CD8 and CD4 T cell abundances are strongly correlated, a single model that contains terms for all cell abundances will not be able to confidently associate a particular gene to either cell type. This is the correct outcome - using separate models will suggest that a gene is associated with both CD8 and CD4 T cell abundance, which could be misleading.

It also protects against variance inflation and loss of power from a mis-specified model. For example, let's say that the expression profiles of many genes are associated with B cell abundance. But if you don't include B cells in your (say) neuron-specific model, the apparent variance for those genes will be inflated. This reduces the power to detect any association with neuron abundance - but more importantly, due to empirical Bayes shrinkage, the inflated variances for these genes will also reduce power for other genes, which is not good.

So just throw in pd$RESP:pd$XXX terms for all cell types XXX. And hope that you have enough residual d.f. to fit the model and estimate the variances. I wouldn't worry about splines unless you really have loads and loads of samples to be able to afford them.

ADD COMMENT
0
Entering edit mode

Many thanks Aaron,

so i will use

design <- model.matrix(~pd$RESP*cell_comp+ pd$Sex + pd$Age)

without splines, in this case, i will have to look individually for each pd$RESP:pd$XXX individually, isn't it?

if i want to check the effect of cell_comp in DE genes between R and NR, i have to do contrasts

con <- makeContrasts( RESP_XXX - NRESP_XXX,
levels=design) # replacing ":" with "_" in colnames(design)
ADD REPLY
0
Entering edit mode

Yes and yes. You may also want to log-transform the components, I would expect log-expression to increase linearly with log-abundance rather than linearly with raw abundance.

ADD REPLY

Login before adding your answer.

Traffic: 359 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6