Question

How to adjust for different cell type mixtures in differential expression analysis?

aec ▴ 90

@aec-9409

Dear all,

I computed enrichment scores for 64 cell types with xCell from my bulk RNA-seq samples. Now I would like to detect differential expression across 3 groups (control, case1, case2) while adjusting for differences in cell type composition (continuous variables). I was thinking of including only the cell types that vary most across samples (fewer than 10). Is it really necessary to cut the continuous variables into smaller bins, as the DESeq2 FAQ says?

Would this model be enough?

model.matrix(~group+cell_type1+cell_type2+cell_type_n)

Thanks,

cell types differential expression adjustment mixtures rnaseq • 2.2k views

updated 6.8 years ago by Ryan C. Thompson ★ 7.9k • written 6.8 years ago by aec ▴ 90

score 1 · Answer 1 · 2017-07-13

As I understand it, one problem with putting the cell type compositions directly into the model as numeric covariates is that the scales are incompatible: cell fractions are expected to have a linear relationship with gene abundance, while the model coefficients are fit on a log scale. You might be better off using SVA to estimate surrogate variables, which can account for cell type composition as well as any other sources of systematic variation. Surrogate variables estimated by SVA will be on the correct scale to add directly to the design matrix. SVA can also estimate how many surrogate variables to include, so you don't have to pick the number by hand. This isn't too different from what you've already done, since your cell type compositions were also estimated from the data.
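
For what it's worth, here is a rough sketch of how the SVA approach could be wired into a DESeq2 analysis. It is only an illustration under assumptions: the counts are simulated, the `hidden` factor is a made-up stand-in for unmeasured cell type composition, and the object names (`cts`, `coldata`, `dds`) are hypothetical. It uses `svaseq()` from the sva package with `n.sv` left unset, so the number of surrogate variables is estimated from the data.

```r
# Sketch only: simulated data, hypothetical object names.
# Requires the Bioconductor packages DESeq2 and sva.
library(DESeq2)
library(sva)

set.seed(1)
n_genes <- 2000
group <- factor(rep(c("control", "case1", "case2"), each = 6),
                levels = c("control", "case1", "case2"))
n_samples <- length(group)

# Assumption: one hidden per-sample factor standing in for composition.
hidden <- rnorm(n_samples)

# The first 400 genes track the hidden factor on the log scale.
base_mu <- 2^runif(n_genes, 4, 10)
effect <- c(rnorm(400, sd = 0.5), rep(0, n_genes - 400))
log_mu <- log(base_mu) + outer(effect, hidden)

cts <- matrix(rnbinom(n_genes * n_samples, mu = exp(log_mu), size = 10),
              nrow = n_genes)
coldata <- data.frame(group = group,
                      row.names = paste0("s", seq_len(n_samples)))
dimnames(cts) <- list(paste0("gene", seq_len(n_genes)), rownames(coldata))

dds <- DESeqDataSetFromMatrix(cts, coldata, design = ~ group)
dds <- estimateSizeFactors(dds)

# svaseq() works on normalized counts; drop very low-count genes first.
dat <- counts(dds, normalized = TRUE)
dat <- dat[rowMeans(dat) > 1, ]

mod  <- model.matrix(~ group, data = coldata)  # full model
mod0 <- model.matrix(~ 1, data = coldata)      # null model
svseq <- svaseq(dat, mod, mod0)                # n.sv estimated from the data

# Add any estimated surrogate variables to the design, ahead of group.
if (svseq$n.sv > 0) {
  sv <- as.matrix(svseq$sv)
  colnames(sv) <- paste0("SV", seq_len(ncol(sv)))
  for (nm in colnames(sv)) colData(dds)[[nm]] <- sv[, nm]
  design(dds) <- reformulate(c(colnames(sv), "group"))
}

dds <- DESeq(dds)
res <- results(dds, contrast = c("group", "case1", "control"))
```

With real data you would start from your own count matrix and sample table rather than the simulated ones, and it is worth checking whether the estimated surrogate variables correlate with your xCell scores before relying on them.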