findMarkers() {scran} doesn't really find markers
Entering edit mode
Last seen 2.3 years ago

I am trying to use the function findMarkers() from the package scran to find the marker genes of different cellt ypes.

My experiment / samples: I have 3 different cell types with 3 replicates each.

Data input: on the sequencing reads I have run DESeq2 and used varianceStabilizingTransformation() to get a matrix of normalized counts:

sampleinfo[,1:2] Sample Celltype AM1 AM1 Alveolarmacrophages AM2 AM2 Alveolarmacrophages AM3 AM3 Alveolarmacrophages IM1 IM1 Interstitialmacrophages IM2 IM2 Interstitialmacrophages IM3 IM3 Interstitialmacrophages T1 T1 Tcells T2 T2 Tcells T3 T3 T_cells

dds <- DESeqDataSetFromMatrix(countData = onlycounts, colData = sampleinfo, design= ~ Celltype) dds <- DESeq(dds) vsd <- varianceStabilizingTransformation(dds, blind=T) vsdmat <- assay(vsd)

head(vsd_mat) AM1 AM2 AM3 IM1 IM2 IM3 T1 T2 T3 let-7a-1-3p 10.3550933 9.8872116 10.5713000 8.4606248 8.8959434 8.641476 8.8700965 8.9457874 9.9124382 let-7a-2-3p -0.5877171 -0.5877171 -0.5877171 0.6988212 0.7829733 2.546719 -0.5877171 -0.5877171 -0.5877171 let-7a-5p 18.5178483 18.1865091 18.8431510 16.6119296 16.7121333 17.419055 18.6922011 18.8095962 18.8838528

Then I used the findMarkers() function

ncol(vsdmat)==length(sampleinfo$Celltype) #check if it´s T markergenes <- findMarkers(vsdmat, groups=sampleinfo$Cell_type, test.type="wilcox", pval.type="all")

No errors, but I was surprised that no transcripts had FDR < 0,05 for any of the cell types: I do not think this is right because in PCA plots and heatmaps the cell types are very nicely separated:

PCA plot

Hierarchical clustering with pairwise correlations

I have tried to use other systems for p-value:

markergenessome <- findMarkers(vsdmat, groups=sampleinfo$Celltype, test.type="wilcox", pval.type="some") markergenesany <- findMarkers(vsdmat, groups=sampleinfo$Celltype, test.type="wilcox", pval.type="any")

but this doesn´t improve the results either.

Do you know what could be going wrong? Or do you know any other package/functions to do this? I have googled but all similar functions I have found are for single-cell sequencing data. DESEq2 is also not useful because it doesn´t doesn´t do all the contrasts automatically, you need to type them one by one (I have now only 3 cell types, but later I will have more, so it´s not convenient).

findMarkers scran • 1.0k views
Entering edit mode
Aaron Lun ★ 28k
Last seen 14 hours ago
The city by the bay

It seems like you have a bulk experiment. With only 3 replicates per condition, you're not going to get any power for the Wilcoxon test. Let's have a look - what's the lowest p-value that you can get from 3 replicates?

wilcox.test(1:3, 100:102)$p.value
## [1] 0.1

And obviously the adjusted p-value will be even higher. You would probably get better mileage out of using test.type="t", which does a t-test that is more responsive to the effect size, but that won't take advantage of the various benefits offered by empirical Bayes shrinkage when dealing with bulk data.

The best approach is to not try to pretend it's single-cell data (which is what findMarkers is designed to work with) and just perform pairwise analyses with your DE framework of choice. See, for example, the code chunk here that describes how to use such DE methods with scran's consolidation machinery.


Login before adding your answer.

Traffic: 431 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6