Question

SingleR using TPM in a bulk RNA-seq dataset

mea2712 • 0

@3bf40994

Last seen 2.1 years ago

Sweden

Hey there, I would like to know your thoughts on using the SingleR classic method with a bulk reference in TPM.
I don't have access to the raw counts.
I am getting the opposite of the expected results when using log(TPM+1) as the reference.
Thanks!



SingleR • 955 views

updated 2.1 years ago by ATpoint ★ 4.6k • written 2.1 years ago by mea2712 • 0

SingleR is correlation-based, so you want a like-for-like comparison between the reference and your data. If, for example, the reference dataset is 3'-tagged and your data are full-length, performance might be suboptimal. The same goes for length normalization: if one of the two is length-divided, as in TPM, and the other is not, performance might suffer, because the relationship between length-corrected and uncorrected values (assuming there really is one between the reference and your data) is probably not the same. An approach that does not rely on correlation assumptions is, for example, UCell, which uses a rank-based score of expression level.
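
To make the length-division point concrete, here is a minimal sketch in plain Python (not SingleR code). The five genes, their counts, and their lengths are made-up numbers, and the helper names `ranks`, `spearman`, and `tpm` are hypothetical. It takes one sample, converts its counts to TPM, and compares the two versions with a Spearman (rank) correlation, which is the kind of correlation SingleR's scores are built on.

```python
# Toy illustration with made-up numbers: dividing by gene length can reorder
# genes within a sample, which lowers a rank correlation against data that
# was not length-divided.

def ranks(values):
    """Return 1-based ranks, averaging ranks over tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def tpm(counts, lengths_kb):
    """Counts -> TPM: divide by length, then scale the sample to one million."""
    rates = [c / l for c, l in zip(counts, lengths_kb)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]

# Hypothetical sample: five genes with different lengths (in kb).
counts = [100, 200, 300, 400, 500]
lengths_kb = [0.5, 4.0, 1.0, 10.0, 2.0]

same_units = spearman(counts, counts)                    # counts vs counts
mixed_units = spearman(counts, tpm(counts, lengths_kb))  # counts vs TPM

print(f"same units:  {same_units:.2f}")
print(f"mixed units: {mixed_units:.2f}")
```

Both profiles come from the same sample, yet the mixed-unit correlation is far lower, purely because length division reorders the genes. Note that log(x+1) is monotonic, so the log step alone does not change the ranks; in this toy setup the mismatch comes entirely from the length division.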

2.1 years ago ATpoint ★ 4.6k