Why am I finding a mismatch between refseq_dna and ensembl_transcript_id?
@mauedealiceit-3511
Last seen 9.6 years ago
I downloaded the following file from miRDB: http://mirdb.org/miRDB/download/MirTarget2_v3.0_prediction_result.txt.gz

I have checked that the miRDB Gene_Bank_Accession_Number (for human it is something like NM_xxxxx) corresponds to the BioMart "refseq_dna" attribute.

I have a vector containing 253 Gene_Bank_Accession_Numbers:

> length(tmp_miRNA_GB)
[1] 253
> tmp_miRNA_GB[1:5]
[1] "NM_203390" "NM_024639" "NM_001017989" "NM_203331" "NM_001879"

I use this vector as the input filter to getBM to obtain the respective ensembl_transcript_id values. Surprisingly, only 246 ensembl_transcript_ids are found:

> gene.map <- getBM(attributes = c("hgnc_symbol", "ensembl_gene_id", "refseq_dna", "ensembl_transcript_id"), filters = "refseq_dna", values = tmp_miRNA_GB, mart = hmart)
> dim(gene.map)
[1] 246 4

I thought there would be a 1-1 correspondence between the two attributes "refseq_dna" and "ensembl_transcript_id". Am I mistaken?

Thank you in advance for correcting my misunderstandings,
Maura
biomaRt biomaRt • 832 views
@sean-davis-490
Last seen 12 weeks ago
United States
On Wed, Jul 29, 2009 at 12:01 AM, <mauede@alice.it> wrote:

> I thought there would be a 1-1 correspondence between the two attributes
> "refseq_dna" and "ensembl_transcript_id".
> Am I mistaken?

Hi, Maura.

Yes, unfortunately, there is not a 1-1 correspondence. Ensembl and NCBI (the curator of RefSeq) are independent organizations, each with its own build policies and annotation processes for transcripts. So, in general in this field (genomics/bioinformatics), there is RARELY a 1-1 correspondence between any two entities. I would suggest that 246/253 is actually quite a good result; I might have expected somewhat fewer a priori.

Sean
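To see where the correspondence breaks down, it can help to look at the mapping in both directions rather than at the row count alone, since the number of rows returned by getBM does not by itself say how many input IDs matched. The following is only a minimal sketch on toy data: `query_ids` stands in for tmp_miRNA_GB, `gene_map` stands in for the getBM result, and the Ensembl IDs ("ENST_A" and so on) are made-up placeholders, not real annotations.

```r
# Toy stand-ins (assumptions for illustration only):
# `query_ids` plays the role of tmp_miRNA_GB, `gene_map` the role of the
# data frame returned by getBM(). The ENST_* values are placeholders.
query_ids <- c("NM_203390", "NM_024639", "NM_001017989", "NM_203331")

gene_map <- data.frame(
  refseq_dna            = c("NM_203390", "NM_203390", "NM_024639", "NM_203331"),
  ensembl_transcript_id = c("ENST_A", "ENST_B", "ENST_C", "ENST_C"),
  stringsAsFactors      = FALSE
)

# RefSeq IDs from the query that have no row at all in the result
unmapped <- setdiff(query_ids, gene_map$refseq_dna)

# Distinct Ensembl transcripts per RefSeq ID (values > 1: one-to-many)
per_refseq <- tapply(gene_map$ensembl_transcript_id, gene_map$refseq_dna,
                     function(x) length(unique(x)))

# Distinct RefSeq IDs per Ensembl transcript (values > 1: many-to-one)
per_enst <- tapply(gene_map$refseq_dna, gene_map$ensembl_transcript_id,
                   function(x) length(unique(x)))

print(unmapped)
print(per_refseq[per_refseq > 1])
print(per_enst[per_enst > 1])
```

With a real result, the same three checks would be run with tmp_miRNA_GB in place of `query_ids` and gene.map in place of `gene_map`.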