Question

Robust way of dealing with low number of samples for Differential Gene Expression

Satoshi • 0

Hello,

We have single-cell data from 12 breast cancer patients, with 3 biopsies from each patient (baseline, treatment one, treatment two), so 36 samples in total. Of the 12 patients, 4 are responders (R) and 8 are non-responders (NR). I have done cell typing and sub-typing for all cells in my dataset. I want to test for differential expression between responders and non-responders for each cell type and sub-type at each time point (baseline, treatment one, and treatment two). I also want to test baseline vs treatment one, baseline vs treatment two, and treatment one vs treatment two for each cell type and sub-type within each response category (i.e., R and NR).

Based on https://www.nature.com/articles/s41467-021-25960-2, I am performing a pseudo-bulk DE analysis using DESeq2/edgeR and was wondering how robust that would be. As I understand it, there are two other ways to do this: 1) run DESeq2/edgeR/MAST on individual cells instead of pseudo-bulk, and 2) perform a rank-sum test on individual cells and estimate the error per sample. I remember reading a discussion of this from one of Michael Love's publications, but I wasn't able to find the thread.

Thank you for your time and suggestions in advance.

limma edgeR DESeq2 MAST • 1.6k views

score 3 · Accepted Answer · 2024-10-27

Michael Love 43k

I find pseudo-bulking to be a robust approach to DE, provided that cell types are identified reliably across samples and that technical variation is appropriately controlled for, using methods like RUVSeq.
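For readers unfamiliar with the term, pseudo-bulking means summing the raw counts of all cells that share a sample and a cell type, so that each sample contributes one count profile per cell type. The following is a minimal, language-agnostic sketch in plain Python, not the workflow used in this thread; the toy records, the field names (`sample`, `cell_type`, `counts`) and the helper `pseudobulk` are all hypothetical. In practice the aggregation would typically be done in R before passing the summed counts to DESeq2 or edgeR.

```python
from collections import defaultdict

# Hypothetical toy data: one record per cell, with raw (unnormalized) counts.
cells = [
    {"sample": "P1_baseline", "cell_type": "T cell", "counts": {"GeneA": 3, "GeneB": 0}},
    {"sample": "P1_baseline", "cell_type": "T cell", "counts": {"GeneA": 1, "GeneB": 2}},
    {"sample": "P1_baseline", "cell_type": "B cell", "counts": {"GeneA": 0, "GeneB": 5}},
    {"sample": "P2_baseline", "cell_type": "T cell", "counts": {"GeneA": 4, "GeneB": 1}},
]


def pseudobulk(cells):
    """Sum raw counts over all cells sharing (sample, cell type).

    Returns two dicts keyed by (sample, cell_type): the summed gene counts
    and the number of cells that went into each sum.
    """
    totals = defaultdict(lambda: defaultdict(int))
    n_cells = defaultdict(int)
    for cell in cells:
        key = (cell["sample"], cell["cell_type"])
        n_cells[key] += 1
        for gene, count in cell["counts"].items():
            totals[key][gene] += count
    return {key: dict(genes) for key, genes in totals.items()}, dict(n_cells)


profiles, n_cells = pseudobulk(cells)
for key in sorted(profiles):
    print(key, profiles[key], "cells:", n_cells[key])
```

Keeping the per-group cell count alongside the sums makes it easy to drop sample and cell type combinations with too few cells before testing; any such threshold is a judgment call and is not specified in this thread.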

"3 biopsies from each patient... out of 12 patients, 4 are responders (R) and 8 are non-responders (NR)"

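With such a design, where the repeated biopsies from one patient are correlated and response is a between-patient factor, the analysis may be better approached with mixed effects models, e.g. using duplicateCorrelation with limma-voom.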
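One way to see why a mixed model suits this design (my reading, not something spelled out in the answer) is that each patient belongs to exactly one response group. A fixed patient term is therefore confounded with the responder effect. The sketch below, in plain Python, builds the 36-sample design from the question and checks the rank of the design matrix with and without patient indicator columns. The patient labels, the assignment of the first four patients to R, and the helpers `design_row` and `matrix_rank` are all hypothetical.

```python
from fractions import Fraction

# Hypothetical labels: 12 patients, the first 4 assumed to be responders.
patients = [f"P{i}" for i in range(1, 13)]
response = {p: ("R" if i < 4 else "NR") for i, p in enumerate(patients)}
timepoints = ["baseline", "treatment1", "treatment2"]
samples = [(p, t) for p in patients for t in timepoints]  # 36 samples


def design_row(patient, timepoint, with_patient):
    """Intercept, responder indicator, two time indicators, and optionally
    indicator columns for every patient except the reference (P1)."""
    row = [1, int(response[patient] == "R")]
    row += [int(timepoint == t) for t in timepoints[1:]]
    if with_patient:
        row += [int(patient == p) for p in patients[1:]]
    return row


def matrix_rank(rows):
    """Rank via exact Gaussian elimination on rational numbers."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot: this column depends on earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


fixed_patient = [design_row(p, t, True) for p, t in samples]
no_patient = [design_row(p, t, False) for p, t in samples]

print("with patient columns:", len(fixed_patient[0]), "columns, rank", matrix_rank(fixed_patient))
print("without patient columns:", len(no_patient[0]), "columns, rank", matrix_rank(no_patient))
```

With patient indicators the matrix has more columns than its rank, so the responder effect cannot be estimated alongside a fixed patient effect. Treating patient as a blocking factor instead, which is what the `block` argument of duplicateCorrelation is for, accounts for the within-patient correlation without adding those columns.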