how can I annotate my scRNAseq clusters from a Rat experiment?
pcantalupo • 0
@pcantalupo-8617
United States

Hello,

Can anybody recommend software or a general workflow for annotating clusters from a Rat experiment? I'm having trouble finding Rat reference databases for this, and it seems that Bioconductor packages such as celldex only contain Human and Mouse reference information.

Thank you,

scRNAseq Rattus_norvegicus_Data annotation

Aaron Lun ★ 26k
@alun
The city by the bay

I would go with @atpoint's second point above. I'm guessing that well-curated Rat references are pretty hard to come by.

If you're working with Ensembl gene IDs in your rat dataset, you can just do something like this:

library(celldex)
ref <- MouseRNAseqData()

library(biomaRt)
mart <- useMart("ensembl", dataset="mmusculus_gene_ensembl")

# Not the cleanest code, but whatever.
ens.mapping <- getBM(c("mgi_symbol", "ensembl_gene_id"), filters="mgi_symbol", values=rownames(ref), mart=mart)
homo.mapping <- getBM(c("ensembl_gene_id", "rnorvegicus_homolog_ensembl_gene"), filters="ensembl_gene_id",
    values=unique(ens.mapping$ensembl_gene_id), mart=mart)
mapping <- merge(ens.mapping, homo.mapping)
rat.genes <- mapping$rnorvegicus_homolog_ensembl_gene[match(rownames(ref), mapping$mgi_symbol)]
keep <- !is.na(rat.genes) & rat.genes!=""
rat.ref <- ref[keep,]
rownames(rat.ref) <- rat.genes[keep]


Same principle applies for the human celldex datasets - just swap mgi_symbol for hgnc_symbol and replace mmusculus with hsapiens. As usual, YMMV on cross-species comparisons, though hopefully the defining aspects of the major cell types are still preserved.
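One thing the snippet above does not handle is orthologs that are not one-to-one: `match()` takes the first rat homolog listed for each mouse symbol, and two mouse genes can point to the same rat ID, which would leave duplicated row names. Below is a minimal base-R sketch of one possible way to restrict the table to one-to-one pairs. It runs on a made-up mapping table; the gene IDs are placeholders rather than real Ensembl IDs, and `one_to_one` is a hypothetical helper, not part of biomaRt or celldex.

```r
# Toy stand-in for the merged 'mapping' table; all IDs are placeholders.
mapping <- data.frame(
    mgi_symbol = c("GeneA", "GeneB", "GeneB", "GeneC", "GeneD", "GeneE"),
    rnorvegicus_homolog_ensembl_gene = c("RAT1", "RAT2", "RAT3", "RAT4", "RAT4", ""),
    stringsAsFactors = FALSE
)

# Hypothetical helper: keep only pairs where each source gene has exactly
# one target homolog and each target homolog has exactly one source gene.
one_to_one <- function(map, from, to) {
    # Drop rows without a homolog, then drop exact duplicate pairs.
    has.homolog <- !is.na(map[[to]]) & map[[to]] != ""
    map <- unique(map[has.homolog, c(from, to)])

    # Count how often each gene appears on either side of the mapping.
    from.n <- table(map[[from]])
    to.n <- table(map[[to]])
    ok <- as.vector(from.n[map[[from]]]) == 1 & as.vector(to.n[map[[to]]]) == 1
    map[ok, , drop = FALSE]
}

clean <- one_to_one(mapping, "mgi_symbol", "rnorvegicus_homolog_ensembl_gene")
```

With the real tables, `clean` would stand in for `mapping` before the `match()` call. Whether dropping ambiguous orthologs is preferable to keeping the first hit depends on how many genes are lost, so it is worth checking `nrow(clean)` against `nrow(mapping)`.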

ATpoint ▴ 650
@atpoint-13662
Germany

Others might have different suggestions but I see two strategies:

1) search the literature for previous work on Rat and see whether there are published collections of marker genes for the tissues you are working with.

2) convert the gene names from rat to e.g. the mouse orthologs and then use the extensive resources on mouse to get an idea of what the clusters are. This obviously needs to be interpreted carefully, but it might serve well as a first impression, e.g. as a basis for downstream experiments that validate the identities of the clusters.

Edit: See Aaron Lun's answer for a code suggestion towards 2).
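For strategy 1), once a marker list has been collected, a rough first pass could score each cluster by the average expression of each marker set. This is only a sketch in base R under stated assumptions: the expression values are made up, the marker sets are illustrative and not a curated rat reference, and the names `avg`, `markers`, and `scores` are hypothetical.

```r
# Toy cluster-average log-expression matrix (genes x clusters); values are made up.
avg <- matrix(
    c(5.0, 0.2, 0.1,
      4.5, 0.1, 0.3,
      0.1, 6.0, 0.2,
      0.2, 5.5, 0.1,
      0.3, 0.2, 4.0),
    ncol = 3, byrow = TRUE,
    dimnames = list(c("Cd3e", "Cd3d", "Cd19", "Cd79a", "Lyz2"),
                    c("cluster1", "cluster2", "cluster3"))
)

# Marker sets as they might be collected from the literature (illustrative only).
markers <- list(
    "T cells" = c("Cd3e", "Cd3d"),
    "B cells" = c("Cd19", "Cd79a"),
    "Myeloid" = c("Lyz2")
)

# Score each cluster by the mean expression of each marker set,
# ignoring markers that are absent from the matrix.
# The result is a clusters x cell types matrix.
scores <- sapply(markers, function(g) {
    colMeans(avg[intersect(g, rownames(avg)), , drop = FALSE])
})

# Assign each cluster the label with the highest score.
labels <- colnames(scores)[max.col(scores, ties.method = "first")]
names(labels) <- rownames(scores)
```

In practice `avg` would come from the per-cluster averages of the actual rat dataset, and it is worth looking at the full `scores` matrix instead of only the top label, since a cluster with uniformly low scores probably has no good match in the marker list.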