Hello,
I have a long list of splice variants (ENSTxxx.y) from a DESeq2 experiment that I want to convert to their "clone_based_ensembl_transcript" and "clone_based_ensembl_gene" names.
I used biomaRt v2.34.2 to test 3 transcripts in the list to make sure it works before converting the entire list:
library(biomaRt)
mart <- useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
getBM(attributes = c('ensembl_transcript_id', 'ensembl_gene_id',
                     'clone_based_ensembl_transcript', 'clone_based_ensembl_gene'),
      filters = 'ensembl_transcript_id',
      values = c('ENST00000416008.1', 'ENST00000473913.1', 'ENST00000598996.2'),
      mart = mart)
This returned no conversion:
[1] ensembl_transcript_id ensembl_gene_id
[3] clone_based_ensembl_transcript clone_based_ensembl_gene
<0 rows> (or 0-length row.names)
So I checked Ensembl Human GRCh38.p10 (which biomaRt uses for dataset = "hsapiens_gene_ensembl" in the first line of code), and all 3 splice variants have a "clone_based_ensembl_transcript" and a "clone_based_ensembl_gene":
splice variant       clone_based_ensembl_transcript   clone_based_ensembl_gene
ENST00000416008.1    AC068535.1-201                   AC068535.1
ENST00000473913.1    AC009108.1-201                   AC009108.1
ENST00000598996.2    FENDRR-205                       FENDRR
Then, I removed the ".y" from the splice variant IDs and re-ran the code:
getBM(attributes = c('ensembl_transcript_id', 'ensembl_gene_id',
                     'clone_based_ensembl_transcript', 'clone_based_ensembl_gene'),
      filters = 'ensembl_transcript_id',
      values = c('ENST00000416008', 'ENST00000473913', 'ENST00000598996'),
      mart = mart)
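For the full list, the ".y" version suffix can be removed programmatically rather than by hand. A minimal sketch in base R, assuming the IDs are held in a character vector (the names ids and ids_noversion are hypothetical, not from my actual script):

```r
# Versioned transcript IDs as they come out of the DESeq2 results
# (hypothetical variable name; here just the three test IDs)
ids <- c("ENST00000416008.1", "ENST00000473913.1", "ENST00000598996.2")

# Remove a trailing ".<digits>" version suffix, if present
ids_noversion <- sub("\\.[0-9]+$", "", ids)

print(ids_noversion)
```

The resulting vector can then be passed as the values argument of getBM() with the 'ensembl_transcript_id' filter, as in the call above.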
This time, biomaRt returned the Ensembl transcript and gene IDs for all 3 splice variants, but the clone-based names for only the first 2:
ensembl_transcript_id ensembl_gene_id clone_based_ensembl_transcript
1 ENST00000416008 ENSG00000227157 AC068535.1-201
2 ENST00000473913 ENSG00000243697 AC009108.1-201
3 ENST00000598996 ENSG00000268388
clone_based_ensembl_gene
1 AC068535.1
2 AC009108.1
3
Is this because "ensembl_transcript_id" is not the correct filter for the input splice variants? If so, which filter is correct? And why is the 3rd splice variant (ENST00000598996.2) not converted even in the correct ensembl_transcript_id format (ENST00000598996)?
Many thanks for helping to shed light on these problems!
Please ask this as a new question. Thanks!