marker genes in single cell experiment
@lirongrossmann-23954

Hi,

I am trying to create a reference dataset from 40 different cell types (fpkm_matrix is a 26,000 x 40 log count matrix) and I am trying to find marker genes for each cell type.

I used the following code:

library(SingleCellExperiment)
library(scran)  # provides pairwiseTTests()
cell.matrix <- SingleCellExperiment(list(logcounts = as.matrix(fpkm_matrix)))
colLabels(cell.matrix) <- colnames(fpkm_matrix)  # one label per column
out <- pairwiseTTests(cell.matrix, cell.matrix$label, direction = "up")

and got the following error:

Error in .compute_mean_var(x, BPPARAM = BPPARAM, subset.row = subset.row,  : 
  no residual d.f. in any level of 'block' for variance estimation

Based on that, I suspect there may not be much difference between the cell types, but I know that there is.

Any input would be appreciated.

Thanks, Liron

single cell singlecellexperiment gene markers • 1.3k views
@peter-langfelder-4469

My guess is that your cell.matrix contains no 'label' component, since you supplied only the expression values when you created it. In other words, cell.matrix$label is NULL. Maybe you have another variable (a list) that contains the labels (component 'label'); if so, you would want to use that instead of cell.matrix$label.

Thanks, Peter. I forgot to include the line of code that sets the labels in my question (it was in my original code). I have added it to the question, but I still get the above error.

Aaron Lun ★ 28k
@alun

I am going to guess that each column name is unique, in which case there are no replicates for any of the labels, and computing a p-value for differential comparisons is not possible. This is reflected in the error message, which is telling you that there are no residual degrees of freedom for the t-test.
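
To make the degrees-of-freedom point concrete, here is a minimal base R sketch. The labels are made up for illustration, and the residual d.f. is computed as the number of columns minus the number of groups, as in a one-way layout. It illustrates the idea only and is not scran's internal code.

```r
# Hypothetical labels: one column per cell type, as with unique column names.
unique_labels <- c("CD4", "CD8", "B", "NK")

# Hypothetical labels with two replicates per cell type.
replicated_labels <- c("CD4", "CD4", "CD8", "CD8", "B", "B", "NK", "NK")

# Residual degrees of freedom for a one-way layout:
# number of samples minus number of groups.
residual_df <- function(labels) {
  length(labels) - length(unique(labels))
}

df_unique <- residual_df(unique_labels)
df_replicated <- residual_df(replicated_labels)
print(df_unique)
print(df_replicated)
```

With one column per label, nothing is left over to estimate the within-group variance, which is what the t-test needs.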

Check if the labels are something like CD4_rep1, CD4_rep2, etc. in which case you can just sub() out the _repX to get consistent labels for the same cell type. However, if you actually only have one column per cell type, you're stuffed. There's no way to compute p-values here. Perhaps use SingleR::getClassicMarkers() instead to get the top markers with the largest log-fold changes.
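
The label clean-up could look something like the sketch below. The sample names and the "_repN" suffix pattern are assumptions for illustration; adjust the regular expression to whatever the real column names look like.

```r
# Hypothetical column names; the "_repN" suffix pattern is an assumption.
sample_names <- c("CD4_rep1", "CD4_rep2", "CD8_rep1", "CD8_rep2", "B_rep1")

# Strip the trailing replicate suffix to get one label per cell type.
labels <- sub("_rep[0-9]+$", "", sample_names)

# Count columns per label; any label with a count of 1 has no replicate.
label_counts <- table(labels)
print(label_counts)

unreplicated <- names(label_counts)[label_counts < 2]
print(unreplicated)
```

If every label ends up with at least two columns, the cleaned labels can be passed as the groups to pairwiseTTests() in place of the raw column names.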

For either function, one would typically use the log-transformed values.
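
For the no-replicate case, the idea of ranking genes by log-fold change can be sketched in base R. This uses made-up values and a hypothetical helper, top_up(); it is not SingleR's implementation. With log-transformed values, the log-fold change between two cell types is simply the difference between their columns.

```r
# Made-up log-expression matrix: 5 genes x 3 cell types, one column each.
log_expr <- matrix(
  c(5, 1, 1,
    1, 6, 2,
    1, 1, 7,
    2, 2, 2,
    4, 3, 0),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), c("A", "B", "C"))
)

# Hypothetical helper: top n genes up-regulated in `target` relative to
# `other`, ranked by the difference in log-expression (the log-fold change).
top_up <- function(x, target, other, n = 2) {
  lfc <- x[, target] - x[, other]
  names(sort(lfc, decreasing = TRUE))[seq_len(n)]
}

markers_A_vs_B <- top_up(log_expr, "A", "B")
markers_C_vs_A <- top_up(log_expr, "C", "A")
print(markers_A_vs_B)
print(markers_C_vs_A)
```

No p-values come out of this; the ranking only says which genes differ most between two columns, which is all that can be done without replicates.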

Thanks, Aaron. You are right! There are no replicates in the dataset. The SingleR function worked! Thanks!
