Hi,
I would like to normalise my bulk data using DESeq2. Since the samples are normalised against an averaged reference, this assumes that at least 50% of the genes are not differentially expressed across the samples [1]. I was wondering why the samples aren't normalised against one of the samples instead. This would only require the weaker assumption that, between any pair of samples (more specifically, the pairs that include the reference), less than 50% of the genes are differential (see [1], section "Clustering to weaken the non-DE assumption"). This approach is followed for normalising single cells across clusters in the scran package and for normalising single cells across batches in the multiBatchNorm function of the batchelor package.
References: [1] Lun, Aaron T. L., Karsten Bach, and John C. Marioni. "Pooling across cells to normalize single-cell RNA sequencing data with many zero counts." Genome Biology 17.1 (2016): 75.
Why exactly are you arguing from single-cell methods when dealing with bulk data?
Because the principle (median-based normalisation) is the same. The single-cell methods I'm referring to normalise averages of batches of single cells, which are effectively bulk samples.
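To make the comparison concrete, here is a minimal sketch of median-based size factors computed two ways: against an averaged reference (per-gene geometric mean across samples) and against a single reference sample. This is only an illustration in plain Python, not the actual code of DESeq2, scran or batchelor; the function names `size_factors_average_ref` and `size_factors_single_ref` and the toy counts are made up for this example.

```python
import math
from statistics import median


def _log_counts(counts):
    """Log-transform counts (rows = genes, columns = samples),
    keeping only genes that are nonzero in every sample."""
    return [[math.log(c) for c in gene]
            for gene in counts if all(c > 0 for c in gene)]


def size_factors_average_ref(counts):
    """Median of ratios against an averaged reference
    (hypothetical helper: per-gene geometric mean across all samples)."""
    logs = _log_counts(counts)
    n_samples = len(logs[0])
    log_ref = [sum(gene) / n_samples for gene in logs]
    return [math.exp(median(gene[j] - r for gene, r in zip(logs, log_ref)))
            for j in range(n_samples)]


def size_factors_single_ref(counts, ref=0):
    """Median of ratios against one chosen sample
    (hypothetical helper: column `ref` is the reference)."""
    logs = _log_counts(counts)
    n_samples = len(logs[0])
    log_ref = [gene[ref] for gene in logs]
    return [math.exp(median(gene[j] - r for gene, r in zip(logs, log_ref)))
            for j in range(n_samples)]


# Toy data: sample 2 is sequenced twice as deep as sample 1,
# sample 3 half as deep, and no gene is differential.
counts = [[b, 2 * b, b / 2] for b in (10, 20, 30, 40, 50)]

avg_sf = size_factors_average_ref(counts)
one_sf = size_factors_single_ref(counts, ref=0)
print(avg_sf, one_sf)
```

With no differential genes, both references recover the same scaling between samples. The two can only diverge once genes are differential: the averaged reference needs a majority of genes to be stable across all samples at once, whereas the single reference only needs a majority to be stable in each pair that includes the reference sample.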