I am facing a tricky problem of experimental design that I would like to share with you.
We are working with several groups of samples, each containing sorted cells of the same cell type. In groups A and B, a certain percentage of the cells has been modified in different ways; this percentage is the continuous value in the table below. Group C is a control group without any modification.
The goal of this experiment is to assess the impact of the modification on the cells by comparing A vs C and B vs C while controlling for the number of modified cells in each group.
In addition, we don't know whether this modification has a linear impact on the cells: for example, 1% of modified cells could have 10 times less impact than 10%, which could itself have 1000 times less impact than 20%.
The design table:
sample     group   continuous_value (%)
sample_a   A       35
sample_b   A       10
sample_c   B       1
sample_d   B       4
sample_e   C       0
sample_f   C       0
I have already tried two different approaches with DESeq2 to work on this dataset:
a. Use the continuous value as a numeric covariate in the design (with or without log2 transformation). I obtained some differentially expressed genes, but I don't know how to check whether this was the right thing to do.
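One detail of the log2 variant: the controls sit at 0%, where log2 is undefined, so some offset is needed before transforming. Below is a minimal sketch of the two encodings of the covariate, written in plain Python only to show the numbers. The pseudocount of 1 is an assumption made for illustration, not something DESeq2 prescribes.

```python
import math

# Percentages of modified cells, copied from the design table above.
percent = {
    "sample_a": 35, "sample_b": 10,
    "sample_c": 1, "sample_d": 4,
    "sample_e": 0, "sample_f": 0,
}

PSEUDOCOUNT = 1  # assumed offset so that the 0% controls remain defined

# log2(x + 1) keeps the controls at exactly 0 and compresses the high values.
log2_percent = {s: math.log2(p + PSEUDOCOUNT) for s, p in percent.items()}

for sample, p in percent.items():
    print(f"{sample}\t{p}\t{log2_percent[sample]:.3f}")
```

On the raw scale sample_a is 35 times sample_c; after this transform the ratio is much smaller, so the two encodings correspond to quite different assumptions about how the effect grows with the percentage.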
b. Transform the value into small bins. Unfortunately this did not work ("Error in DESeqDataSet(se, design = design, ignoreRank) : the model matrix is not full rank, so the model cannot be fit as specified. one or more variables or interaction terms in the design formula are linear combinations of the others and must be removed"). As you can see, my controls are always at 0, the percentages in group B are quite low, and the percentages in group A are always higher than the rest. Moreover, we currently don't have enough biological evidence to say that cells between one given percentage and another can be placed in the same bin.
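To make the "not full rank" message concrete, here is a minimal sketch in plain Python that builds the model matrices by hand for the design table above and computes their rank. The bin edges (0, below 10, 10 and above) are hypothetical, chosen only for illustration, and the dummy coding with group C and the empty bin as reference levels is an assumption about how the design would be encoded.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Look for a usable pivot at or below the current rank row.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


groups = ["A", "A", "B", "B", "C", "C"]
percent = [35, 10, 1, 4, 0, 0]


def bin_of(p):
    # Hypothetical bin edges, for illustration only.
    if p == 0:
        return "none"
    return "low" if p < 10 else "high"


bins = [bin_of(p) for p in percent]

# Columns: intercept, groupA, groupB, percent (numeric covariate).
numeric_design = [
    [1, int(g == "A"), int(g == "B"), p] for g, p in zip(groups, percent)
]

# Columns: intercept, groupA, groupB, binLow, binHigh.
binned_design = [
    [1, int(g == "A"), int(g == "B"), int(b == "low"), int(b == "high")]
    for g, b in zip(groups, bins)
]

print("numeric:", matrix_rank(numeric_design), "of", len(numeric_design[0]))
print("binned: ", matrix_rank(binned_design), "of", len(binned_design[0]))
```

With these bins, every sample in group A falls in the high bin and every sample in group B in the low bin, so the bin columns duplicate the group columns and the rank is lower than the number of columns. The numeric design is full rank, but the slope on the percentage is then informed only by the differences within groups A and B.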
Since I don't understand the mechanics behind linear models well enough, I would like some advice on this design. How can I compare my groups while controlling for this variable number of modified cells, with or without assuming that the percentage has a linear impact?
Up to now I have used DESeq2, but I could also use other packages if they are more suitable for this kind of messy design.
Sorry for this long post,
Thanks in advance for your answers!