Raw counts for reference dataset in single-cell RNA-seq
@lirongrossmann-13938

Hi,

I am using the HumanPrimaryCellAtlas from the celldex package and was wondering if there is a way to obtain the raw counts for this dataset. I am trying to combine it with my own reference dataset to build a larger reference, but I need to make sure both are normalized together using the same method, so I need the original raw counts.

Thanks, Liron

single cell celldex singleR • 164 views

HumanPrimaryCellAtlas is an array dataset, are you aware of that? What exactly do you mean by "combine"? Since these are not single-cell datasets, you cannot plug them into an integration pipeline and pretend that there are no huge batch effects between datasets/platforms. Better to check the section on using multiple references in the SingleR vignette, http://bioconductor.org/packages/release/bioc/vignettes/SingleR/inst/doc/SingleR.html#62usingmultiple_references, which you can do with the data that are available via the package you mention.
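
One reason a shared normalization matters less for annotation than it might seem: as far as I understand, SingleR scores each test cell against the reference samples with Spearman (rank) correlation, and ranks do not change under a monotone transform such as log2(x + 1). The Python sketch below is a toy demonstration of that property only, not SingleR's implementation; the expression values and the helper names `ranks`, `pearson` and `spearman` are made up for illustration.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = average_rank
        i = j + 1
    return out


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical expression values for six genes (illustration only).
test_cell = [0, 3, 12, 1, 40, 7]
ref_linear = [5.0, 20.0, 45.0, 8.0, 300.0, 90.0]
# The same reference profile on a log scale, as stored in celldex.
ref_log = [math.log2(v + 1) for v in ref_linear]

rho_linear = spearman(test_cell, ref_linear)
rho_log = spearman(test_cell, ref_log)
print(rho_linear, rho_log)  # the two scores are identical
```

This only shows that a rank-based score is the same on the linear and the log scale. It does not remove platform or batch differences between references, which is why keeping the references separate is the safer route.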


I am aware that this is an array dataset. Unless I am misinterpreting it, it does contain a log-expression matrix with columns referring to cell lines and rows as genes, no? I would definitely expect batch effects when combining it with another dataset, but I was hoping that with the raw counts I would at least be able to normalize them all together (while still having batch effects).

Aaron Lun ★ 26k
@alun

What ATpoint said. I'll further add that, even for the RNA-seq datasets in celldex, I don't have the raw counts. When we first started the Bioconductor submission from Dvir's original SingleR package, we inherited only the log-expression matrices; we tried to reconstruct the raw counts but this was just too much work. So we just shrugged and made a mental note to keep better track of stuff in the future.