I have a 10x single-cell dataset with 6 replicates, each containing cells from the same 5 donors. For the sake of simplicity, let's assume I have only two clusters, perturbed and unperturbed. I'd like to run pseudobulk differential expression testing, comparing the two clusters. But I want to pseudobulk each donor -- not each replicate. The complication is that each donor appears in all replicates.
One way to do this is to first aggregate the replicates using cellranger aggr, which takes care of normalization across replicates. Then I'd pseudobulk the donors and run DE testing as below:
y <- Seurat2PB(seurat_obj, sample="donor", cluster="perturbation_status")
y <- normLibSizes(y)
donor <- factor(y$samples$sample)
cluster <- as.factor(y$samples$cluster)
design <- model.matrix(~cluster+donor)
...
fit <- glmQLFit(y, design, robust = TRUE)
qlf <- glmQLFTest(fit, contrast = contrast_matrix)
My question is, what is the correct way to do this on an integrated Seurat object (i.e., without aggregating the replicates)? It seems to me that pseudobulking the donors across replicates as above in an integrated Seurat object would be wrong because the library sizes differ between replicates.
Obviously, I can run the tests for each donor in each replicate separately. But that would reduce the power due to decreased cell counts in each test. Also, I'd rather run just one test for each donor than 6.
Thank you!
If your replicate samples were from different cells but the same biological samples, then you should probably group cells that come from the same donor, the same replicate, and the same cluster. In your case, you would have 5x6x2 = 60 pseudo-bulk samples.
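To make the grouping concrete, here is a minimal base-R sketch of forming one pseudo-bulk sample per donor, replicate and cluster combination. Everything in it is simulated for illustration: the toy count matrix, the `cells` table and the names `pb_id`, `pb_counts` and `pb_samples` are assumptions, not objects from the original post. In a Seurat workflow, something similar could presumably be achieved by adding a combined donor/replicate metadata column and passing its name as `sample` to Seurat2PB, but that is not shown or tested here.

```r
# Toy data only: 20 genes, 100 simulated cells per donor/replicate/cluster combination
set.seed(1)
n_genes <- 20
cells <- expand.grid(cell      = 1:100,
                     donor     = paste0("D", 1:5),
                     replicate = paste0("R", 1:6),
                     cluster   = c("perturbed", "unperturbed"),
                     stringsAsFactors = FALSE)
counts <- matrix(rpois(n_genes * nrow(cells), lambda = 2),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", 1:n_genes), NULL))

# One pseudo-bulk label per cell: donor x replicate x cluster
pb_id <- paste(cells$donor, cells$replicate, cells$cluster, sep = "_")

# Sum raw counts over the cells sharing a label; result is genes x pseudo-bulk samples
pb_counts <- t(rowsum(t(counts), group = pb_id))

# Sample annotation, reordered to match the columns of pb_counts
pb_samples <- unique(data.frame(id = pb_id,
                                cells[, c("donor", "replicate", "cluster")]))
pb_samples <- pb_samples[match(colnames(pb_counts), pb_samples$id), ]

dim(pb_counts)  # 20 genes by 60 pseudo-bulk samples (5 x 6 x 2)
```

The resulting `pb_counts` and `pb_samples` could then be wrapped in a DGEList and analysed as in the question; how donor and replicate should enter the design matrix is left open here, since the thread does not settle it.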
Can you please clarify what the replicates represent? Are you simply resequencing the same libraries so that they are purely technical replicates? Or are the replicates different cells from the same biological samples? Or are the replicates separate tissue samples? It's not at all obvious what the situation is.
Hi Gordon, all cells come from lab-grown cell cultures. Same cell line from 5 different donors... Each replicate contains a different set of cells from the same 5-donor mixture.