Hello,
We have single-cell data from 12 breast cancer patients, with 3 biopsies per patient (baseline, treatment one, treatment two), so 36 samples in total. Of the 12 patients, 4 are responders (R) and 8 are non-responders (NR). I have done cell typing and sub-typing for all cells in my dataset. I want to test for differential expression between responders and non-responders for each cell type and sub-type at each time point (baseline, treatment one, and treatment two). I also want to test baseline vs. treatment one, baseline vs. treatment two, and treatment one vs. treatment two for each cell type and sub-type within each response category (i.e., R and NR).
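To make the set of comparisons concrete, here is a minimal sketch that enumerates them for a single cell type or sub-type. The labels and the `list_contrasts` helper are hypothetical names chosen for illustration. Note that the time-point comparisons are within the same patients, so patient would probably need to be accounted for in the model.

```python
from itertools import combinations

# Hypothetical labels for the three biopsies and the two response groups.
TIME_POINTS = ["Baseline", "Treatment1", "Treatment2"]
RESPONSES = ["R", "NR"]


def list_contrasts(time_points=TIME_POINTS, responses=RESPONSES):
    """Return the comparisons to run for one cell type or sub-type."""
    contrasts = []
    # Responders vs non-responders at each time point (between patients).
    for tp in time_points:
        contrasts.append(
            {"kind": "response", "within": tp, "a": responses[0], "b": responses[1]}
        )
    # Pairwise time-point comparisons within each response group
    # (the same patients are sampled at every time point).
    for resp in responses:
        for tp_a, tp_b in combinations(time_points, 2):
            contrasts.append({"kind": "time", "within": resp, "a": tp_a, "b": tp_b})
    return contrasts


contrasts = list_contrasts()
for c in contrasts:
    print(f"{c['kind']:8s} within {c['within']:10s}: {c['a']} vs {c['b']}")
```

That is 3 response contrasts plus 6 time-point contrasts, to be repeated for every cell type and sub-type.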
Based on https://www.nature.com/articles/s41467-021-25960-2, I am performing a pseudo-bulk DE analysis using DESeq2/edgeR and was wondering how robust that would be. As I understand it, there are two other ways to do this: 1) run DESeq2/edgeR/MAST on the single cells directly instead of on pseudo-bulk, and 2) perform a rank-sum test at the single-cell level and estimate the error per sample. I wasn't able to find the thread, but I remember reading a discussion about this from one of Michael Love's publications.
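For clarity, by pseudo-bulk I mean summing the raw counts of all cells that belong to the same sample and cell type, so that the biopsy rather than the cell is the unit of replication. A minimal sketch of that aggregation step follows. The `pseudobulk` helper, the sample IDs, and the toy counts are made up for illustration, and the resulting per-sample counts would then go into DESeq2/edgeR.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum raw counts over all cells sharing (sample_id, cell_type).

    `cells` is an iterable of (sample_id, cell_type, {gene: count}) tuples.
    Returns {(sample_id, cell_type): {gene: summed_count}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample_id, cell_type, counts in cells:
        for gene, n in counts.items():
            totals[(sample_id, cell_type)][gene] += n
    # Convert nested defaultdicts to plain dicts for easier downstream use.
    return {key: dict(genes) for key, genes in totals.items()}


# Toy, made-up cells: (sample, cell type, raw counts per gene).
cells = [
    ("P01_Baseline", "T cell", {"CD3E": 4, "GZMB": 1}),
    ("P01_Baseline", "T cell", {"CD3E": 2}),
    ("P01_Baseline", "B cell", {"MS4A1": 5}),
    ("P02_Baseline", "T cell", {"CD3E": 1, "GZMB": 3}),
]

bulk = pseudobulk(cells)
for key in sorted(bulk):
    print(key, bulk[key])
```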
Thank you for your time and suggestions in advance.
Alright, thanks, that answers my question.
The only thing I would add here is that, in my hands, limma-trend is preferable for single-cell data (voom does not seem to properly correct for library size), because you may end up comparing cell types (clusters...) that intrinsically have different library sizes, for example because they express notably different numbers of genes (in my case, neutrophils, which are transcriptionally not very active, versus progenitors, which are still quite active). So testing the edgeR-calculated logCPMs with limma-trend might in some cases be preferable. As usual, look at the MA-plots, which are a great diagnostic.
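To illustrate the library-size point, here is a rough sketch of a log-CPM transform and the M (log-ratio) and A (average) values that go into an MA-plot. This is a simplified formula, not edgeR's exact computation (edgeR scales the prior count by library size), and the function names, gene names, and toy counts are made up for illustration.

```python
import math


def log_cpm(counts, prior_count=2.0):
    """Simplified log2 counts-per-million for one pseudo-bulk profile.

    Assumption: a constant prior count is added to every gene; edgeR's
    cpm(log=TRUE) instead scales the prior by library size.
    """
    lib_size = sum(counts.values())
    return {
        gene: math.log2((n + prior_count) / (lib_size + 2 * prior_count) * 1e6)
        for gene, n in counts.items()
    }


def ma_values(logcpm_a, logcpm_b):
    """Return {gene: (M, A)} for the genes present in both profiles."""
    shared = sorted(set(logcpm_a) & set(logcpm_b))
    return {
        gene: (logcpm_a[gene] - logcpm_b[gene], (logcpm_a[gene] + logcpm_b[gene]) / 2)
        for gene in shared
    }


# Toy profiles with the same composition but a 10-fold library-size difference.
low_activity = {"geneA": 10, "geneB": 30, "geneC": 60}
high_activity = {"geneA": 100, "geneB": 300, "geneC": 600}

ma = ma_values(log_cpm(low_activity), log_cpm(high_activity))
for gene, (m, a) in ma.items():
    print(f"{gene}: M = {m:+.3f}, A = {a:.3f}")
```

On raw counts the log2 ratio for every gene here would be about -3.3, while after the log-CPM transform M stays close to zero. A cloud of M values that is shifted away from zero in a real MA-plot would suggest the library-size correction is not working.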