biomaRt getBM returns different results for the same attribute, depending on which other attributes I request
efoss

Here are two lines of code that create data frames using biomaRt's "getBM" function: 

exons7p2 <- getBM(attributes = c("refseq_dna"), filters = "chromosome_name", values = "X", mart = mart.mm)
exons7p3 <- getBM(attributes = c("refseq_dna", "gene_exon_intron"), filters = "chromosome_name", values = "X", mart = mart.mm)

The only thing that differs is that I'm requesting one additional attribute in the second assignment. But if I look at what is in the "refseq_dna" columns of these two data frames, it's completely different. 

head(exons7p2$"refseq_dna") gives a bunch of gene names of the "NM_010498" type

head(exons7p3$"refseq_dna") gives a bunch of sequences

Clearly, there is something fundamental I'm misunderstanding about biomaRt. I would appreciate any guidance. 

Thanks. 

Eric

below is how I made mart.mm and also my sessionInfo()

mart.mm <- useDataset("mmusculus_gene_ensembl", mart = mart.mm)

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.4 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] org.Mm.eg.db_3.1.2                      RSQLite_1.0.0                          
 [3] DBI_0.3.1                               TxDb.Mmusculus.UCSC.mm9.knownGene_3.1.2
 [5] GenomicFeatures_1.20.5                  AnnotationDbi_1.30.1                   
 [7] Biobase_2.28.0                          GenomicRanges_1.20.8                   
 [9] GenomeInfoDb_1.4.3                      IRanges_2.2.7                          
[11] S4Vectors_0.6.6                         BiocGenerics_0.14.0                    
[13] biomaRt_2.24.1                         

loaded via a namespace (and not attached):
 [1] XVector_0.8.0           zlibbioc_1.14.0         GenomicAlignments_1.4.1 BiocParallel_1.2.21    
 [5] tools_3.2.2             lambda.r_1.1.7          futile.logger_1.4.1     rtracklayer_1.28.10    
 [9] futile.options_1.0.0    bitops_1.0-6            RCurl_1.95-4.7          Biostrings_2.36.4      
[13] Rsamtools_1.20.4        XML_3.98-1.3           

 

Thomas Maurel

Dear Eric,

This is because the "gene_exon_intron" attribute returns the exon + intron sequence of a gene, whereas the "refseq_dna" attribute gives you the RefSeq DNA external reference linked to Ensembl Gene IDs. If you have a look at the BioMart interface, you will see that when you ask for a sequence, all the other attributes are placed in the header (please see: http://www.ensembl.org/biomart/martview?VIRTUALSCHEMANAME=default&ATTRIBUTES=hsapiens_gene_ensembl.default.sequences.ensembl_gene_id|hsapiens_gene_ensembl.default.sequences.ensembl_transcript_id|hsapiens_gene_ensembl.default.sequences.gene_exon&FILTERS=hsapiens_gene_ensembl.default.filters.chromosome_name."1"&VISIBLEPANEL=resultspanel). I believe this is what is happening here: all the NM_XXXX IDs are located in the header of the result, but because the sequences are so long, the header is hard to see.

You can find more information about the sequence attributes in the biomaRt documentation: http://bioconductor.org/packages/release/bioc/vignettes/biomaRt/inst/doc/biomaRt.pdf (section 4.7)

 

All the mart data is linked to Ensembl Gene IDs, so the results will be easier to interpret if you add Ensembl Gene/Transcript/Translation/Exon IDs to your query.

Hope this helps,

Regards,

Thomas

 


Hi Thomas, 

Perhaps I'm not understanding your answer, but I don't think that that is my problem. Below I have two data frames that I made with getBM, as described above. One of them has just "refseq_dna" as an attribute and the other has both "refseq_dna" and "gene_exon_intron":

> names(exons7p2)
[1] "refseq_dna"
> names(exons7p3)
[1] "refseq_dna"       "gene_exon_intron"

 

But if I ask for the first few "refseq_dna" items in the "exons7p2" data frame, I get gene names, whereas the first few "refseq_dna" items in the "exons7p3" data frame (the very same attribute, "refseq_dna", not "gene_exon_intron") are DNA sequences. (I have listed only the start of the first sequence; many more followed.)


> head(exons7p2$refseq_dna)
[1] "XM_006527846" "NM_010498"    "XM_006527845" "NM_001290562" "NM_001290561" "NM_011123"   
> head(exons7p3$refseq_dna)

[1] "GTCAGTGCACAACTGCCAACTGGGATGCAGAACACTGCTCACGCCAACCATCCTGAAAGCCAACTATAAAAAGCAGAGAGATACTCTGCACCTTTTCAGTGAGGTCCAGATACCCACAGAGCAGAGACAGTCGCTCACACATGATGAGGGTCATCATCCTCCTGCTCACACTGCATGTGCTAGGCGTCTCCAGTGTGATGAGTCTCAAAAAGAAGGTAGCAGACCTGTGTGGAAGGGGGCTGTATGTGGTGGGCATGTTGGGCAGAGACAAACAGACAGAGAGAGGCTTGGGAGG

So items in one data frame that I retrieved by asking for "refseq_dna" give me one type of data (names), whereas asking for the same thing from another data frame gives me a different type of data.

Could it be that there is a bug that mixes up attribute names? If I ask for "gene_exon_intron" items from the data frame that has them, it gives me gene names like those in the "refseq_dna" column from the other data frame (though not the same ones). 
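One quick way to test the "mixed-up names" hypothesis is to compare string lengths per column, since accessions are short and sequences are long. The sketch below is only an illustration: `mock` is a hand-made stand-in for the getBM() result (with the sequence shortened), not real query output, and `max_width` is a name introduced here.

```r
# Hand-made stand-in for the data frame in question; the sequence is a
# shortened copy of the one shown in this thread, not real query output.
mock <- data.frame(
  refseq_dna       = "GTCAGTGCACAACTGCCAACTGGGATGCAGAACACTGCTC",
  gene_exon_intron = "NM_017471",
  stringsAsFactors = FALSE
)

# Longest string found in each column. A column of accessions should be
# only a handful of characters wide; a column of sequences much wider.
max_width <- vapply(mock, function(col) max(nchar(col)), integer(1))
print(max_width)
```

If the column named "refseq_dna" turns out to be the wide one, the labels and the contents do not match.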

Thanks. 

 

Eric

Thomas Maurel

Dear Eric,

I think it will be easier to explain with a particular example. Let's look at the Ensembl Gene ID ENSMUSG00000000003 (http://www.ensembl.org/Mus_musculus/Gene/Summary?db=core;g=ENSMUSG00000000003;r=X:77837901-77853623).

Below I am filtering on ENSMUSG00000000003 to only get data mapped to this gene:

> exons7p2 <- getBM(attributes = c("refseq_dna","ensembl_gene_id"), filters = "ensembl_gene_id", values="ENSMUSG00000000003", mart = mart.mm)
> exons7p3 <- getBM(attributes = c("refseq_dna", "gene_exon_intron","ensembl_gene_id"), filters = "ensembl_gene_id", values = "ENSMUSG00000000003", mart = mart.mm)

If we look at the first query, the mart returns NM_017471 as the value of the "refseq_dna" attribute, and you can see that it is mapped to the Ensembl Gene ID ENSMUSG00000000003.

> exons7p2
  refseq_dna    ensembl_gene_id
1  NM_017471 ENSMUSG00000000003

Now if you add the "gene_exon_intron" attribute to the query, you will see that the mart runs the above query but also fetches the exon + intron sequence for ENSMUSG00000000003.

> exons7p3
  refseq_dna
1 GTCAGTGCACAACTGCCAACTGGGATGCAGAACACTGCTCACGCCAACCATCCTGAAAGCCAACTATAAAAAGCAGAGAGATACTCTGCACCTTTTCAGTGAGGTCCAGATACCCACAGAGCAGAGACAGTCGCTCACACATGATGAGGGTCATCATCCTCCTGCTCACACTGCATGTGCTAGGCGTCTCCAGTGTGATGAGTCTCAAAAAGAAGGTAGCAGACCTGTGTGGAAGGGGGCTGTATGTGGTGGGCATGTTGGGCAGAGACAAACAGACAGAGAGAGGCTTGGGAGGGGGCTTTGTGCAGGTGGGTGGCGACAGCAGGAGACAGGGAGGCAGATGCAGAAAGACTTTAAAGGGGGAAACGAATGTTCTGGAACTTGGTATTTGCTAAGTCTGGTAATAGAGAATAACAAATGTGTTGGGGGGACATAGTTGAGTCTCATAAATGCTTTTCTGAGAGAAAAAGCAGAAGGACAAATACCAAGAGATAGAAACAGAGATAAATGAGAGAGTTCCTGGAGAATTTAACTAGGGATGGACAACTACCCTCTGGAAGTTGCTACTCTTTACATCAGGAATGGAGAGAGACAAAGAAAGAGAGAGTGATCTAAAGGGTGGTTTTCTGAGATAGAGTCACAAAGAAAGGCCAAGAGACAAAGACAGAGAGAATAATGTCTCTGAAGATAGGTTTTTGGTGTGTCTTTGTGTGACAGAGAGAAATGGGAATGGGGAAGAATGTTCTGGAAGTATTCTCTACATCTCAGCATGGAAAGCAGCAGGGAGAGAGAATAGTCTAAATGGTGGTTTTCTGATAGATAGAGACTGAGGACTGAGAGACCAGAGACCTGGAGAACCCGGAGAGAGACATAGAGAGAACATCCATCAGTCTGGAACCAAGGGTGTATCTGACTTTCCCAGAAAATCCAAACACCTACATCACTTCCTATGTTTTCACTCTAGTCCCTCACAAAGCCATGATGACGAATACAATGAAATAATTTTTTTCTCTTTTTTTTATTAGGTATTTTCCTCATTTACATTTCCAATGCTATCCCAAAAGACCCCCATACCCACCCACCCCCAATCCCCTACCCACACACTCCCCCTTTTTGGCCCTGGCCTTCCCCTGTACTGGGGCATATAAAGTTTGCAAGTCCAATGGGCCTCTCTTTCCAGTGATGGCTGACTAGGCCATCTTTTGATACATATGCAGCTAGAGTCAAGAGCTCCGGGGTATTGGTTAGTTCATATTGTTGTTCCACCTATAGGGTTGCAGTTCCCTTTAGCTCCTTGGATACTTTCTCTAGCTCCTCCATTAGGGGCTGTGTGACCCATCCAATAGCTGATTGTGAGCATCAACTTATGTGTTTGCTAGGCCCCAGCATAGTCTCACAAGAGACAGCTATATCTGGGTCCTTTCAGCAAAATCTTGCTAGTGTATGCAATGGTGTCAGCGTTTGGAAGCTGATTATGGGATGGATCCCTGGATACGGCAATCACAAGATGGCCCATCCTTTCGTCACAGCTCCAAATTTTGTCTCTGTAACTCCTTCCATGGGTGTTTTGTTCCCATTTCTAAGAAGGGGCAAAGTGTCCACACTTTGATCTTCCTTCTTCTTGAGTTTCATGCATTTAGCAAATTGTATCTTATATCTTGGGTATCCTAAGTTTCTGGGCTAATATCCACTTATCAGTGAGTACATATTGTGCGAGTTCCTTTGTGATTGTGTTACCTCACTCAGGATGATGCCCTCCAGGTCCATCCATTTGCTTAGGAATTTCATAAATTCATTCTTTTAATAGCTGAGTAGCACTTCATTGTGATAATGTACCACATTTTCTGTATCCATTCCTCTGTTGAGGGGCATCTGGGTTCTTTCCAGCTTCTGGCTATTATAAATAAGGCTGTTATGAACATAGTGGAGCATGAGTCCTTCTTACGGGTTGGGACATCTTCTGGATATATGTCCAGGAGAGGTATTGCTGGATCTTCC
GGTAGTACTATGTCCAATTTTCTGAGGAACCGCCAGACTGATTTCCAGAGTGGTTGTACAAGCTTGCAATCCCACCAAAAATGGAGGAGTGTTCCTCTTTCTCCACATCCTCGCCAGCATCTGCTGTCACCTGAATTTTTGATCTTAGCCATTCTGACTGGTGTGAGGTGGAATCTCAGGGTTGTTTTGATTTGCATTTCCCTGATGATTAAGGATGTTGAACATTTTTCAGGTGCTTCTCTGCCATTCGGTATTCCTCAGGTGAGAATTCTTTGTTCAGCTCTGAGCCCCATTTTTTAATGGGCTATTTGATTTTCTGGAGTCCACCTTCTTGAGTTCTTTATATATATATTGGATATTAGTCCCCTATCCGATTTGGAATAGGTAAATATCCTTTCCCAATCTGTTGGTGGTCTTTTTTTATTTTTTTTATTAGGTATTTTCCTCGTTTACATTTTCAATGCTATCCCAAAAGTCCGCCATACCCACCCCCCCAATCCCCTACCCACCCACCCCCCTTTTTGGCCCTGGCGTTCCCCTGTACTGGGGCATATAAAGTTTGCAACTCCAATGGGCCTCTCTTTCCAGTGATAGCCGACTAGGCCATCTTTTAATACATATGCAGCTAGAGTCAAGAGCTCCGGGGTACTGGTTAGTTCATATTGTTGTTCCACCTATAGGGTTGCAGTTCCCTTTAGCTCCTTGGGTAATTTCACTAGCTCCTCCACTGGGGGCCGTGTGACCCATCCAATAGTTGACCGTGAGCATCCACTTATGTGTTTGCTAGGCCCCAGCATAGTCTCACAAGAGACAGCTATATCTGGGTCCTTTCAGCAAAATCTTGCTAGTGTATGCAATGGTGTCAGCGTTTGGAAGCTGATTATGGGATGGATCCCTGCATATGGCAATCACAGATAGTCCATCCTTTTGTCACAGCTCCACATTTTGTCTCTGTAACTCCTTCCATGGGTGTTTTGTTCCCATTTCTAAGAAGGGGCAAAGTGTCTACACTTTGGTCTTCATTCTTCTTGAATTTCATGCGTTTAGCAAATTGTATCTTATATCTTGGGTATCCTAAGTTTCTGGGCTAATATCCACTTATCAGTGAGTACATATTGTGCGAGTTCCTTTGTGATTGTGTTACCTCACTCAGGATGATGCCCTCCAGGTCCATCCATTTGACTAGGAATTTCATAAATTCATTTTTTAATAGCTGAATAGTATTCCATTGTGTAAATGTGCCACATTTTCAGTATCCATTCCTCTGTTGAGGGGCATCTGGTTTCTTTCCAGCTTCTGGCTATTATAAATAAGGCTGCTATGAACACAGTGGAGCATGTGTCCTTCTTGCCGGTTGGGACATCTTCTGGATATATGCCCAGGAGAGGTATTGCAGGATCCTCCAGTAGTACTATGTCCAATTTTCTGAGGAACCGCCAGACTGATTTCCAGAGTGGTTTACAAGCTTGCAATTCCACCAACAATGGAAGAGTGTTCCTCTTTCTCCACATCCTTGCCAGCATCTGCTGTCACCTGAATTTTTGATCTTAGCCATTCTGACTGGTGTGAGGTGGAATCTCAGGGTTGTTTTGATTTGCATTTCCCTGATGATTAAGGATGTTGAACATTTTTCAGGTGCTTCTCAGCCATTCGGTATTCCTCAGGTGAGAATTCTTTGTTCGCCTCTGAGCCCCATTTTTTAATGGGGTTATTTGATTTTCTGGAGTCCACCTTCTTGAGTTCTTTATATATGTTGGATATTAGTCCCCTATCCGATTTGGGAATGGTAAATATCCTTTCCCAATCTGTTGGTGGTCTTTTTGTCTTATTGACAGTGTCTTTTGCCTTGCAGAAGCTTTGCAATTTTATGAGGTCCCATTTATCGATTCTCGATCTTACAGCACAAGCCATTGCTGTTCTATTCAGGAATTTTTCCCCTGTACCCATATCTTCGAGGCTTTTCCCTGTTTTCTCCTCTATAAGTTTCCGTGTCTCTGGTT
TTATTTGGAGTTCCTTAATCCACTTAGATTTGACCTTAGTACAAGGAGATAGGAATGGATCAATTTGCATTTTTCTACATGATAACTGCCAGCTGTGCCAGCACCATTTGTTGAAAATGCTGTCTTTTTTCCACTGGATGGTTTTAGCTCCCTTGTCAAAGATCAAGTGACCACAGGTGTGTGGGTTCATCTCTGGGTCTTCAATTCTGTTCCATTGGTCTACTTGTCTGTCACTATACCAGTACCATGCAGGTTTTATCACAATTGCTCTTTAGTACAGCTTTAGATCAGGCATGGTGATTCCATCAGAGGTTCTTTTATCCTTGAGAAGAGTTTTTGCTATCCTAGGTTTTTTGTTATTCCAGATGAATCTGCCAATTGCCCTTTCTAATTCGTTGAAGAATTGAGTTGGAATTTTGATGGGGATTGCATTGAATCTGTAGATTGCTTTTGGCAAGATAGCCATTTTTACTATATTGATCCTGCAAATCCATGAGCATGGGAGATCTTTCCATCTTCTGAGATCTTCTTTAATTTCTTTCTTCAGAGACTTGAAGTTCTTATCATACAGATCTTTCACTTTCTTAAGGGGTGAATTGCACTGACATTATGCTTACAGTGGCAGTTGTCTACTTTGGCCTGTTATCCAAAACATTAAAAGTAAGAACTAAAAAAGATCTGCAGAGAAAGCAAGATGACTTTATTCAAGGCAAAACTAGAGTACAGATTACAGACAGACACACTGCCAAAGATGGGCAGCTGCCCACTTGAGTGACTGGACTCAAGGTCTCCAAGCTATCATGTATGTCTCTATCAGGAGTCCCAAGGAGACCAATAAGCTCGTTACTAAGGTGCCAGATTTATTTTGATTAACAGATTAAAACATTTATTTCTCTAACTGTGTATGCCCCTTCTAGTAATACATCTGATTTTAATCATGCATAGGCCAAATTACACAAAGTCATAATGTGGCCATCTGAAGAACATGTTTTTGTCTTACTATGTAAACACCCCAAACCAGTTCAGGCAAAAATAGTGCTTCTGTTCTCCATACAAGGATACACTGAGAATATAGTTGAATGGAAACTGTCTGGACTTGATCATTATTAGGGTTGTGGGAAGTCAGAGACAGGTGTATGACTAGCTGGATACAGGGTTGGGGACTATGAAGGGTTCTTTTGTGTACTCTACAGCCTTGTTCACAGATGAGTGTGTGAGTATTCTAATACAGGTGGGGAGGAGGAAGAAGGGGTTGTTATATGGAGAACTCTATAGCATTGCTCAGAGAGGGGTGTGTGAGTAACCAGACACAGTTGGCTTGGGGAGGGGTCTAAAATCATGATTGGGTTCATTGAATAGAGTATGCATGAACATAAAAAAAGTTAAGCTGCCTAATGCCTTTTTATATAATCAAGTTGTGAATTATCCTTATCCAAGTGGTTTGGAATTCCCAGTTACTACACTATCCCCTACTTTTAAATGCATTTTTTCAATTGCTGCTTTTTTTCCAGATTGACGGGCCTTGGCAAACAATTTACTTAGCTGCCAGTACCATGGAAAAGATAAATGAAGGCTCACCATTGAGAACCTACTTCCGTCACATTTTGTGTGGGAGGAGATGCAACCAAGTCTACCTCTATTTTTTTATTAAGTAAGATATATATAGACAGAACAATCCATGTGATGGTGTGTGGGTAGAGAAATGAAATGTTTTCAGTCAACACTCATATCAGTGAACACACACAGCACTAAGTGAGGATCCTTGTATCACTTTGCTCTCTGTGGCCCTGAAAAGAACATATGATAGCTGTACTTTTAATTTAGAAAGAGTTAATTTATTTATTTTATAGCAAGCCAAAGTAAAATAAATAATATCAAGAGATGGGTGGTGATTATATGTAGATGATTTATACATAGATAATAGATATAAAATTAGATAGATGATACATATATGACTATGTAGGTGATAGATACATAATTAATAGATATAAAAGTAGATA
AATGATACATATACACAAATGTAGGTGTTAGATAATAGACAATTAGCTAGACATGTGATAGATATATACATAGATATGTAGGTAGACAGATAATAGATGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGAAAATATATATAGATAAATAATAGATGGATACATGCATGTATATACATAAATATATACATAGGTAGATTTAGATAGTATATAGATATATGAAAGAGACAGGCCTGGGTGTAAAGTACAATTGGAAAACATATTCAAAATCAGCTCCTTGTTGAAGTCAAACATTACAAGACAATTTACCAAAATAACAATTACTTAATTTTCTATTTATCACTGCTCAAAAGCCCAGGTGGTAGAGAAGATATATCAAAATAAATTTTGGGATAGTTTCAGTGCTTTTAAACTATATTGATATGAGTGATATTATCCATCGAGGATATATTTCTATCTGGGAGGAACAAAGTTTTTGTTAATTATGTGCTGATATTTGAACCTTTTCATAAATTAGCTATTTATAAATTCAGGTTTCTGAGAGTCTCTGTACATTGTTCTTGCTTTGAACTCAATATGGAGTAAAAAATAAACTTTTTAATTTTTATTTATTATATACACATATAATAAATAAATTAATAAACATATTGTATTTTTTCTGGACAGTGTTTCTCTGTGTAGCCCTGGCTACCCTGAAACTCACACTGTAGACCAGGCTGGTCTCAAAATCAGAGCCTCTACCTTCTGGGATTAGAGGTGTGTGTCATCACTACCTGGCTAATTTATATATTTTTAAAGACAGGTTCTCCTTATATAACCCTGGCTGTCCTGGAATTCTCTATGTACACCAGACTGGACTTGAACTCACAGAGATCTACCTGCCTCTACCTCTCAGGTGCTGGGTTCAAAAGTGTGTACCACCACAAGCAGTCTTTTTATGTTTTTATTTATTTTTGTTTTTAATTAATTTTTTTTAGAATGATAGTGAACCTATCCTCCTGAATCAGTCTGTTATGGATTCCACCATGCTTGACTTGATATTTAACATACCTAAAATTATTTAAAAGAGTAGTAGTTCAAAATCAAGTAGGTGTATACACTAGAATGTTGAAGAGATAATTTTATTTTTAAATACGTTTTTAAACTAAAACATAATTACATCATTTCCCCCTTCCATTTTCTCCCTCCAATCCTTCCCATGTTCCCTCCTCCCTTGGTCTTTTTCAAATCCTAGACTCCTTTTTCTTTAATTGTTCTCGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGCGCGCGCGCGTTCTTATTATACAACCTGTTCATTCCATATAATGTTACTTGTATATATATGATTTTTACAGTAACCATTTGATATTCAAAAAACTAACTGAGAGGGAGCACTTTTCTTCAGGGAAGACTACAAAATATCCTAAAATTCAATGGAGCCAGAAAAGATCCTGTGGCAGCTAAAACAATCCTGAGCAAATAGAACAGTGCTGGAAGGATTGCCATTCCAGATATGAGTGATTTTTTTTTACATGCAGAGAATGGACCCCCAAGGCCTTCAACAGGCTTTGTGTAACCACACCCTGCTATTAAACAGTTATTTCCATTTTTTGTTTGTTTTTGGTGGTGTGCTTTTAATGACTGAGTCATCTCTCCAGGCCAGTTATTCTCATGTTTTTAGAGCCATAAAGACAAATCTAAGCTGTGGCTGTCACTATTAGTCACATGGGTCTCTTCTGAGTGAATACAAATCTGAATGCATGACTATGCCTACAACTCTATGGATCTGTTGATGCACATCGCAACTTAAATCTTCTCAGAATACAATTGTAGGCTGAATTTAAATCCATTTAAATGCACAATTCAATTGCTGCTACAGAATCTATAAGATTGATGACATTTATGAGAGGGACAAAAAGTAAATACATGATCCCAATGTAAATATGGATGCTAGACTATAATGTATGTAGG
GCACTGAATGTGCTCTGCACCAGGTGGCCTTTCACCCTTTAAACCCAATTCTACAGACATTTCTCATCTCAGATTGGGGTAGAATTTTTGCTCAATATTTTGGATTTTTTTCACTTATCAAAAGTTGTTATTGGGTGCTGGGGATACAGCTGTGTTGGGAGTGATTTAACAGCATGAACCACAGACCCCAGGGCTCCATACTCTGCACAAAGTTCAGTATTACATTCCCATCATCCCAGCACTTGGTGAGTAGAGGCAGGAAGATCAGGGGTTCAAGGACATTCATTGGTCAGAGGTCAGGATAGGCAACAGATCACACAGAGGAGCTAAATCTGCCCTAACAGAGACAACTGCAAACAACTAACAGGGTGATTGATGACTTTTAAGGATGTTATTAATTATATTTAAGGAAAATGCCCTACAACTACATGTGCTTATAAGAGATCTACAGAGGAAAATGTGGTTTGCAATAAACTATCCTGTCATTTATGTGGAATCAGGACCCTGGAGTAATATACAATACGACATTTGTTTTTCAGGAAAGGGACTAAGTGCCAACTGTATAAAGTCATAGGAAGGAAAAAACAAGAAGTTTACTATGCACAGTGTGAGTAAGCAATGCCGGAATGAATGCACGACCTTAATTGTTTCTTTTTTCAAAATTCACTGTCTTGGTTGAATTTTGCACACCTTGATATCTACATTCCAATGTAATTGTCTCTACAAATTACAGATGCAATCTTGAGGAAAGCCTCCTCATTTTCTTCCTAGTCATTTTGGAACTAATAAGAGATTTAATTTGTTTGTTTAGTCTTTAATGACTTTTAAAAACTTTTTTATTTATATGGTGTATACCTCTACCAAGCACACTTCCCTGGTTTAGCTTTACGGGTAAACTAAATCACAGACAAGAGTCTGAGAATGGTTAAAAGAATTAAAAGGTGATTTGAGCACTTTAAAAAATAACGATGCAAAATAGGAACTAGGATTTTATCAACTATGAAAATGACTCCCAGCAATTCTGAAACATGTGTCTTGTACAAGGAATGATTAATTGTCTAAATTTTATAATTTTATTGGGGAAAATGATTTACAATATATAAAATTAAGCATCATGAATAGTCTTAGCTCTTTAGTTCTGAAAGGCAAGGTATATGACATACTGAGATCCATTTATAAATAATTCAGGTATCAATTTAATTTGTCATGAATCAAGTGGTAAAGATACTTTCATTATTTTCTATTCTACCATAAAGAGTCTATTTAACGCAATACATTAAACTGATAAACTAAACATGTCATGGACTCTAGGTTGGTTTATAGATAAGTTTATAGAAGAGTTTATTGTACTTAAATAGAGGAGCCGAATCAACTCACAAGTATATATCCCTCCAAAAAGCCCCACACACACCTCCTAGTGTGTTATGTAATTTAGTTGGAGGCCAGCCTGATCTAAACAACAAGTTTCAGGCCAGCCAAGGATACATAGTAAGACCCGTCTCAGAAGGAAAAGGAGTTACCACTGACATAAAAAGGAACAGTCATCATATCAATGAGGAGGATCTTGAGTCAACCCCTTCCCCTCTTTAAGAAGTCTTGCTAGTGAAGCAGTGAGTGTATTGGAGTCACATACAAGAACATTTGGAAATATGGCTGCATTCCTGAAAAGCCCAACTCAGCTTGGATGACACTCCTGAAAGTTTTATCCTTGGAGCTTTAGCAAGGCTTACAGACAGCTAAGCTAGTCAGAGTCTCTTCTAGAGGGCTTGTTCACTGTTTCCCATAACGTTGTGGAGGGGCCTACTGAAAGCTGTTGGTCTCAAAATCATCCTGAGATTTGTGGGGTTTTTACTTCTTGAATCTCATGAGCTACTAAGACAGAATGTTCCAGTTTGGAGAAAATCACTATGTGGCACCCTCCCTTTTTTCTGACATGCCTCTGAGGAAAATTGGTAAAGTTAGATAATTGCACCACTAAATGTAGATCAGCTCCATGAAAA
TGTTGAGATATATATATATATATATATATATGTATATAAAATCAGAGAGTTTTCCTCAATATGTGTATATATGTGTATATAATTCCACCATTAAATGTATTAAATGTATTAGGATCCATGAAAATGCCTATATATGTGCACACACACTATATATATATATATATATATATATATATATATATATATAATCTTTCCTCAATATATGTACTATAAACAATTCCACCATTAAATTTAATAGGATCAATGAAATGTATGCACATATTGCATATACATATAATCATACAGCTATTCTCAGCAAGTAAGAGGACTGCAGTATGTCAGCTTCTATACATTCACACATCTTGGTATCAAGAACATAGCTATTCCAAACCAATATAGGTTCTACATATTTGTAGGTAACTGTGGGATGTTAGATATATTTTATATCTTACAACAGAATTATAAACTTGCTTTATAATTCCCTGTTTGTCTTGACTAGTTTCAACACATGGAGCAGAATATTGTAGGTTAGGGACTGTATACTCCATATATACATTCATAACTGTGCAGGTAGAATGAGTCTGAGCAAGCATTTTGTTGACTTTATATGTATTTATACAAATGTCTTACCTATCTCAGATGAAGGGAGCATAGCATTCATGTTAAAGATGGTAAATGAGAAGATATTGTTGTTTCATTATTTTAACAAGAACAGAAGAAATGATGTCACACGAGTGGCTGGAGTTTTGGGTAAGTGTCACACATAGAACTTGTAATGTGAGAGTGTGATTCAAGGATATGAATGTATGCATATCTCTGTACCCAAAGATCACAATGTGTTTGTGTATCTGATTCTGTATGTCAGGTTGATAAATCACTGAAGCTGTCATAGTTAGCTTTGAAGATCAACTTGGAACAACCCAGAATGCCCTGGGAAGGGTCTCAGTAGGGAGATTAAGCAACAGAAATATTTTATTAGTGTGCTGGGTACTAATGTTGTGACATATTCCCTATACACCGTGAAACGCAGCAAAAGGCAAACAACTGAATAAGGAGGAGATGACGGAGTTCATGAACTTAGTGGAAGAAATGGGCATTGAGGAAGAGAATGTACAGCGTATCATGGACACGGGTATAGTAGCAACCTACGTGTGTAACTTCTTACTTTGTATTTTTATAATAAATGTTATTTTTTTATTATTGAGGGTGTTGGGAAGTTGGCTCAGTGGGAAAAGTGCACACAAGCATGAGGGCCCAAGTTTGGATCCCAGCACCCATTTAGAAAGCCAAGAATAACAGAGCTGGAAAGGGGAGGGGACAAGAGGAGCTCTCTAGTAATAAATCTATCTGAAAGGTCATATTTCAAAACTAACAATGTGTAGAATGATTAAAGGAGGTGTTAAAAAGTCAGCCTCTTACCTATGCATGCGTGGGCATACACATGTCATTCACACGACTATGAACACACCAATAAACCATGCAGATAGAAACACACACACACAAAATCTACCTGAAATGAAACTCAGTTCCTAGGTCCCTCTGTCCAGATACTCTAGTCAAAACCTAGCTTTATATCTGCATTTTTATGAAAACTGTTCATATGTACTAATGTTATTACTAATATTATGTTGAAGTACTTTATTACTTTAAAATAATACTGTTGTTTTAAAATAATACTTTACTTTAAATGTTAGTATGTTAAAAATAATATATTTCCGTGTTTACTCTCCTTTAGACAACTGTCCAAGCAAGATCAAACCTTAGTGACTCAACAAGATCAGGATTAGGTGAGTCAAAGCACATTAATTTTATATCTTGAAGTTTTAATTTTAATTTTATTTTAATATTTAAGTTATTAATATTTAATATTTAAAGTAAATAATATTTTATTTGTTCTTTATCCTTTTCATATATTTATATAATGTATATTTATCTTTCATACTTCCACACACCTCATCCAAGCAACACATCTCCCTCTATACTTCATGGCTTTTAAATTTTATTATTATTTATTATTATTA
TTACCTCAATGAATCCAATTGGTGCTATTTCTGAATAGGGTATGTCAATTCCTGGAGGCATGAGTAATGTAATATACCAGGGCTATTATGTAATATAAAAAAATGGACTGCAGGGAGCTGGTGGCACATGCATTTAACCTAGCGCTCAAGAGGCAGAAGCAAGCATATGTCTGCGTTTTAGATAACCCTAGTCTACAGAGTGAGTTCCAGGACAGCAAGGGCTACTCAGAGAAACCCTGTCTCATAAAACCAAAAAGTATATTTTTTAAAGGCAGAAGAGGAAAGGAGGGAGGGAGGGAGGGAAAGAGGCAGACAGACAGACAGGCAGGCAGGCAGGCAAAAAGAAGGAAAAAGAAAAATATGGACCCTCCCTCTCCCAGAAACTATCAACTGGCAATAGCTCTTTGGTGGGGGGGGGGGGTCCTGAATCCCTCTCTGCTCAATTCTAGAATGTTAATTAGCTTGATCTGACCAGGGTCTTGTGCAGAACACCACAGTTGCTGTGAGTTCATGATTGTAACAGATCTGTCATGTTCAGAAAACAGAATTTTATGGCTCTACTCCCCATCCAGATGATAATTGATGGCTAAGCATCCACAGTCACTTGTCCTCATCCTTTGACCAGCTACAAATTTCTGCACAAACCACTCCCCACTGTAAAAAGTTGAGCAGAAGTCAAACATGGAGGCACACACCCTTAATCCCAACACTTGGGAGGCACGGGTAGATGAATCTCTGTGAGTATCAAGCCAGCCTGATCTACATACTGAGCTCCAGAAGATCCAGGTACATAGTCTCTATCTAAACAAACAAACACAAAGTTAAAAAGTTGATTTGACCAAAATTGAGAGAAGCATAAATCTATGAGTATTTTTAAGACCGTGGCTTGGAAACATGACAGTTCATCACCACTGGTCTCCTTCATAGGCTCCATGAGCTCCATTGTCACAGGCTTTTGACTCGAATTACAATAGAAACCCACCCTTACCCCTGTTCTTCCATAGATATGAAGTTCCTTCCATGGAGCTGGCATGAAATTTAATCAGAGAGTGCTTGGCTCCCCAATACAGCCTTTATTGCACCAGTGGACGCAAGGATGATTTTATAGTGTGTTGGGGAAGCATGGTGACATCACTGACATCTTATCCCACATACACAGCCTATTAAGTACATCTAAGCACTGTGAACGAGTTTCCTAGTCCATTTGACATCGATTTCTTGATGCCCTACAGCCACAGCATTTGGTGTCTTCAGCAATAGTGCCCTACCATTTAGATATGGTGTATAGTCAAGAGATATGGCAATAGCCAAGTTATTTTGGTTATTCCAAGGCTTCCTTCCATTAATAAATATCATGGTGGTACCCCCATGACTAAAAATTAGATTTTCACTGAATAAACCATGTCTTCTGGGAACAGCATTATACCATTGCAGGGATCCTCTGCTGAAACTTTTTTAATACTATATTTTTACTTAGCTTACAAACTAGTGGATTTCTGTAAGACTTCATTTACCTTCAGTTTCAGTTGACCCTCCCCTACCCTGTTCTTCCCTATGCCCAAACACATCCACACCTACTCCTCTAGCCCACAGCTCTCACTTTCTAATCTTCCCTGTCACCAGTGCCCAATTATATCGCCTGTATTATTATATTTTAATCACACAACCATAGGTTTCCATATGAGGTTTTAATAACCCTTCATTCTGGTTAAACCTTCCACCACACCCTGATTTCCCCATTCCACAACCAACTCCATGATTAAGCCTTCCTGCCCCAAGTATTCTTCTTTATACTTCATTTTAATGGCATTACATTTGATGGACCCACTTCCTTGATGGACCCAATTCAAACCAGTTTCTAATTACCTGGATTCCTTACATACTCCATATTATGCATACAAAATAAAAGATTCAAGTCTAATGTCCACATGTGAGATAGAATGTGCAGTTTGTCTTTCTGAGCCTGAGTGGCCTCATTAAGTATAATAATTTCCAGTTCC
TTCTATTTACTTGGAAATTTCATATTTTCATTGTTCTCTATGGCTGAGTAATATTCCATCTTATACATTACATTTTCCTTATCCATTCATTAGTTGATGAACAGTTGGGTCAACTTCGTTTCTTAGCTATTATGAACTTAACCTCAGTGAGCATGGACATTCAAAGGTCTCTGTAACAGAATATAAACCCCTTTGTGTACATATCTAGAAATGGAGGAACTAGAGAAAGCACCCAAGGAACTAAAGGGAACTGCAACCCTATAGGTGGAACAACAATATGAACTAAGCAGTACCCCGGAGCTCTTGTCTCTAGCTGCATATCTATCAAAAGATGGCCTAGTCGGCCATCACTGCAAAGAGAGGCCCATTGGACTTGCAAACTTTATATGCCCCAGTACAGGGGAATGCCAGGGCCAAAAAGGGGGGATGGGTGGGTAGGGGAGTGGGGGGGGGTTGGGGACTTTTTGTATAGCATTGGAAATGTAAATGAGCTAAATACCTAATAAAAAATGGAAAAAAAAAGAAATGGTGTAACTGAGTCACATGGGAAGTCTTTTTCTAATCTTTGGATTTGTTTATTTGATGTTCATAGTTTTTGGGTTTTGGTTGGTTTTGTTCTAAGTTCTTTGTATATTGTAGACACTAATCCTCCATCACATGTGTAGTTGGCAAAGATCTCCATTCCCCGAGATACCTGTGCATTTAATTGACAGCTTCCTTTGCCGTAGTTTTTAATTCCATGATATCTGACAAGTGTTTGTCTTACTTTCTTGCTACAAGAATCCTATTCAAAGAATCCATACCTGTGTCTATGGTTAACACACACTCCATGCTTTCTTCTCTATCAGCTTCAGGTTACCATGTCTTATGACACGGTCTTTGAATCATTTGGAGTTGAGGTTTTTTTCAAGGTGACAGGGAAGAGCCCAGGTTCATTTCTCTGTATCCTGATGTCCACTTTTTCCTATTCGGTCTATTTATGGAATTATATATTTTTATGTTAGGTCATTTTTCAGTGGAGGCATTAACAATATCCAGAAGGGGACTATTTCTTACTAGTGTTGAGATGGTATTCTCCTATATGGGGCTGGTACTGAAGGCAGCAAGTTCTACCCCTAGTCTTCCTGTGAGATTCAACTACTATCTGGGACCTCGAGTGAGACTCTGTCTGTAGGATACATGGGGCTTGGTAAACTCCAATGTGAAACAAAAAATATATAATTTTAGTTTAGATTCATAGAAACTACATCCTCAAATAAACACATAAGTTCTAAAAAGTACCAATTTAGGTCTTGATATAAGATCATTTGTCATATAAAAATTTTCCATATAAGGAAAATTTCCATACAAAGTTCATGTATATTTCCAAATATACAAAATTCTGTAAAATGTTTTTGCATGATACATCTTGTCATTGTTTGCCTCTTTAATGGCTTGTATTTGTTTCATTTTCTACTCTCATCAAATATCATGTATTACTATCCTAAATATATGAAATAATTCTGTTCCAGCATTACAGATGACATCAGGAATTTTCCAGTATATTTTTCCTGGAACCTGAAACATCAATATGAAGATGAAGCAATCTTGTCTCTCAGATCATATTTTCCTATTTATTGCAAATTACAATTCCTGTCTCTGTACTTTCTCTTTCACTCATTGTTTCCCATGTTCTAATCGGTATTAGTGCATCTTTGAATGTTTAAATAAATTTATTTTACTTGC
  gene_exon_intron    ensembl_gene_id
1        NM_017471 ENSMUSG00000000003

 

Because the sequence is so long and biomaRt displays the remaining column headers after the sequence, it can be a bit hard to spot. You are right, something is wrong here: the "gene_exon_intron" and "refseq_dna" headers seem to be inverted in the second dataset. I will report this to the main biomaRt developer.
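Until the headers are fixed, a possible stopgap is to relabel the two columns based on their content. This is only a sketch under assumptions: `looks_like_refseq`, `looks_like_sequence` and `fix_swapped_headers` are hypothetical helpers written for this thread (they are not part of biomaRt), the accession pattern covers only NM/NR/XM/XR-style IDs, and `mock` is a shortened stand-in for the real result.

```r
# Hypothetical helpers (not part of biomaRt): classify a column by content.
non_empty <- function(x) x[!is.na(x) & nzchar(x)]

looks_like_refseq <- function(x) {
  vals <- non_empty(x)
  length(vals) > 0 && all(grepl("^[NX][MR]_[0-9]+$", vals))
}

looks_like_sequence <- function(x) {
  vals <- non_empty(x)
  length(vals) > 0 && all(grepl("^[ACGTN]+$", vals))
}

# Swap the two column names only when the contents are clearly inverted;
# otherwise return the data frame unchanged.
fix_swapped_headers <- function(df, id_col = "refseq_dna",
                                seq_col = "gene_exon_intron") {
  if (looks_like_sequence(df[[id_col]]) && looks_like_refseq(df[[seq_col]])) {
    nm <- names(df)
    nm[match(c(id_col, seq_col), nm)] <- c(seq_col, id_col)
    names(df) <- nm
  }
  df
}

# Shortened stand-in for the exons7p3 result shown above.
mock <- data.frame(
  refseq_dna       = "GTCAGTGCACAACTGCCAACTGGGATGCAGAACACTGCTC",
  gene_exon_intron = "NM_017471",
  ensembl_gene_id  = "ENSMUSG00000000003",
  stringsAsFactors = FALSE
)
fixed <- fix_swapped_headers(mock)
print(names(fixed))
```

The check is deliberately conservative: if either column does not match its expected pattern, nothing is renamed, so a correctly labelled result passes through untouched.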

Hope this helps,

Regards,

Thomas

 


Dear Thomas, 

Yes, that helps a lot. Thanks very much. 

Best wishes, 

Eric

Thomas Maurel

Dear Eric,

This is indeed down to the fact that the BioMart website interface does not allow joins between sequences and external references, whereas the biomaRt package allows them. There is a setting, "bmHeader=TRUE", that you can turn on in your query to get the BioMart website headers instead of the biomaRt package headers. Please see the example below:

> exons7p3 <- getBM(attributes = c("refseq_dna", "gene_exon_intron","ensembl_gene_id"), filters = "ensembl_gene_id", values = "ENSMUSG00000000003", mart = mart.mm, bmHeader=TRUE)
> exons7p3
    Unspliced (Gene)
1 GTCAGTGCACAACTGCCAACTGGGATGCAGAACACTGCTCACGCCAACCATCCTGAAAGCCAACTATAAAAAGCAGAGAGATACTCTGCACCTTTTCAGTGAGGTCCAGATACCCACAGAGCAGAGACAGTCGCTCACACATGATGAGGGTCATCATCCTCCTGCTCACACTGCATGTGCTAGGCGTCTCCAGTGTGATGAGTCTCAAAAAGAAGGTAGCAGACCTGTGTGGAAGGGGGCTGTATGTGGTGGGCATGTTGGGCAGAGACAAACAGACAGAGAGAGGCTTGGGAGGGGGCTTTGTGCAGGTGGGTGGCGACAGCAGGAGACAGGGAGGCAGATGCAGAAAGACTTTAAAGGGGGAAACGAATGTTCTGGAACTTGGTATTTGCTAAGTCTGGTAATAGAGAATAACAAATGTGTTGGGGGGACATAGTTGAGTCTCATAAATGCTTTTCTGAGAGAAAAAGCAGAAGGACAAATACCAAGAGATAGAAACAGAGATAAATGAGAGAGTTCCTGGAGAATTTAACTAGGGATGGACAACTACCCTCTGGAAGTTGCTACTCTTTACATCAGGAATGGAGAGAGACAAAGAAAGAGAGAGTGATCTAAAGGGTGGTTTTCTGAGATAGAGTCACAAAGAAAGGCCAAGAGACAAAGACAGAGAGAATAATGTCTCTGAAGATAGGTTTTTGGTGTGTCTTTGTGTGACAGAGAGAAATGGGAATGGGGAAGAATGTTCTGGAAGTATTCTCTACATCTCAGCATGGAAAGCAGCAGGGAGAGAGAATAGTCTAAATGGTGGTTTTCTGATAGATAGAGACTGAGGACTGAGAGACCAGAGACCTGGAGAACCCGGAGAGAGACATAGAGAGAACATCCATCAGTCTGGAACCAAGGGTGTATCTGACTTTCCCAGAAAATCCAAACACCTACATCACTTCCTATGTTTTCACTCTAGTCCCTCACAAAGCCATGATGACGAATACAATGAAATAATTTTTTTCTCTTTTTTTTATTAGGTATTTTCCTCATTTACATTTCCAATGCTATCCCAAAAGACCCCCATACCCACCCACCCCCAATCCCCTACCCACACACTCCCCCTTTTTGGCCCTGGCCTTCCCCTGTACTGGGGCATATAAAGTTTGCAAGTCCAATGGGCCTCTCTTTCCAGTGATGGCTGACTAGGCCATCTTTTGATACATATGCAGCTAGAGTCAAGAGCTCCGGGGTATTGGTTAGTTCATATTGTTGTTCCACCTATAGGGTTGCAGTTCCCTTTAGCTCCTTGGATACTTTCTCTAGCTCCTCCATTAGGGGCTGTGTGACCCATCCAATAGCTGATTGTGAGCATCAACTTATGTGTTTGCTAGGCCCCAGCATAGTCTCACAAGAGACAGCTATATCTGGGTCCTTTCAGCAAAATCTTGCTAGTGTATGCAATGGTGTCAGCGTTTGGAAGCTGATTATGGGATGGATCCCTGGATACGGCAATCACAAGATGGCCCATCCTTTCGTCACAGCTCCAAATTTTGTCTCTGTAACTCCTTCCATGGGTGTTTTGTTCCCATTTCTAAGAAGGGGCAAAGTGTCCACACTTTGATCTTCCTTCTTCTTGAGTTTCATGCATTTAGCAAATTGTATCTTATATCTTGGGTATCCTAAGTTTCTGGGCTAATATCCACTTATCAGTGAGTACATATTGTGCGAGTTCCTTTGTGATTGTGTTACCTCACTCAGGATGATGCCCTCCAGGTCCATCCATTTGCTTAGGAATTTCATAAATTCATTCTTTTAATAGCTGAGTAGCACTTCATTGTGATAATGTACCACATTTTCTGTATCCATTCCTCTGTTGAGGGGCATCTGGGTTCTTTCCAGCTTCTGGCTATTATAAATAAGGCTGTTATGAACATAGTGGAGCATGAGTCCTTCTTACGGGTTGGGACATCTTCTGGATATATGTCCAGGAGAGGTATTGCTGGATCTTCC
GGTAGTACTATGTCCAATTTTCTGAGGAACCGCCAGACTGATTTCCAGAGTGGTTGTACAAGCTTGCAATCCCACCAAAAATGGAGGAGTGTTCCTCTTTCTCCACATCCTCGCCAGCATCTGCTGTCACCTGAATTTTTGATCTTAGCCATTCTGACTGGTGTGAGGTGGAATCTCAGGGTTGTTTTGATTTGCATTTCCCTGATGATTAAGGATGTTGAACATTTTTCAGGTGCTTCTCTGCCATTCGGTATTCCTCAGGTGAGAATTCTTTGTTCAGCTCTGAGCCCCATTTTTTAATGGGCTATTTGATTTTCTGGAGTCCACCTTCTTGAGTTCTTTATATATATATTGGATATTAGTCCCCTATCCGATTTGGAATAGGTAAATATCCTTTCCCAATCTGTTGGTGGTCTTTTTTTATTTTTTTTATTAGGTATTTTCCTCGTTTACATTTTCAATGCTATCCCAAAAGTCCGCCATACCCACCCCCCCAATCCCCTACCCACCCACCCCCCTTTTTGGCCCTGGCGTTCCCCTGTACTGGGGCATATAAAGTTTGCAACTCCAATGGGCCTCTCTTTCCAGTGATAGCCGACTAGGCCATCTTTTAATACATATGCAGCTAGAGTCAAGAGCTCCGGGGTACTGGTTAGTTCATATTGTTGTTCCACCTATAGGGTTGCAGTTCCCTTTAGCTCCTTGGGTAATTTCACTAGCTCCTCCACTGGGGGCCGTGTGACCCATCCAATAGTTGACCGTGAGCATCCACTTATGTGTTTGCTAGGCCCCAGCATAGTCTCACAAGAGACAGCTATATCTGGGTCCTTTCAGCAAAATCTTGCTAGTGTATGCAATGGTGTCAGCGTTTGGAAGCTGATTATGGGATGGATCCCTGCATATGGCAATCACAGATAGTCCATCCTTTTGTCACAGCTCCACATTTTGTCTCTGTAACTCCTTCCATGGGTGTTTTGTTCCCATTTCTAAGAAGGGGCAAAGTGTCTACACTTTGGTCTTCATTCTTCTTGAATTTCATGCGTTTAGCAAATTGTATCTTATATCTTGGGTATCCTAAGTTTCTGGGCTAATATCCACTTATCAGTGAGTACATATTGTGCGAGTTCCTTTGTGATTGTGTTACCTCACTCAGGATGATGCCCTCCAGGTCCATCCATTTGACTAGGAATTTCATAAATTCATTTTTTAATAGCTGAATAGTATTCCATTGTGTAAATGTGCCACATTTTCAGTATCCATTCCTCTGTTGAGGGGCATCTGGTTTCTTTCCAGCTTCTGGCTATTATAAATAAGGCTGCTATGAACACAGTGGAGCATGTGTCCTTCTTGCCGGTTGGGACATCTTCTGGATATATGCCCAGGAGAGGTATTGCAGGATCCTCCAGTAGTACTATGTCCAATTTTCTGAGGAACCGCCAGACTGATTTCCAGAGTGGTTTACAAGCTTGCAATTCCACCAACAATGGAAGAGTGTTCCTCTTTCTCCACATCCTTGCCAGCATCTGCTGTCACCTGAATTTTTGATCTTAGCCATTCTGACTGGTGTGAGGTGGAATCTCAGGGTTGTTTTGATTTGCATTTCCCTGATGATTAAGGATGTTGAACATTTTTCAGGTGCTTCTCAGCCATTCGGTATTCCTCAGGTGAGAATTCTTTGTTCGCCTCTGAGCCCCATTTTTTAATGGGGTTATTTGATTTTCTGGAGTCCACCTTCTTGAGTTCTTTATATATGTTGGATATTAGTCCCCTATCCGATTTGGGAATGGTAAATATCCTTTCCCAATCTGTTGGTGGTCTTTTTGTCTTATTGACAGTGTCTTTTGCCTTGCAGAAGCTTTGCAATTTTATGAGGTCCCATTTATCGATTCTCGATCTTACAGCACAAGCCATTGCTGTTCTATTCAGGAATTTTTCCCCTGTACCCATATCTTCGAGGCTTTTCCCTGTTTTCTCCTCTATAAGTTTCCGTGTCTCTGGTT
TTATTTGGAGTTCCTTAATCCACTTAGATTTGACCTTAGTACAAGGAGATAGGAATGGATCAATTTGCATTTTTCTACATGATAACTGCCAGCTGTGCCAGCACCATTTGTTGAAAATGCTGTCTTTTTTCCACTGGATGGTTTTAGCTCCCTTGTCAAAGATCAAGTGACCACAGGTGTGTGGGTTCATCTCTGGGTCTTCAATTCTGTTCCATTGGTCTACTTGTCTGTCACTATACCAGTACCATGCAGGTTTTATCACAATTGCTCTTTAGTACAGCTTTAGATCAGGCATGGTGATTCCATCAGAGGTTCTTTTATCCTTGAGAAGAGTTTTTGCTATCCTAGGTTTTTTGTTATTCCAGATGAATCTGCCAATTGCCCTTTCTAATTCGTTGAAGAATTGAGTTGGAATTTTGATGGGGATTGCATTGAATCTGTAGATTGCTTTTGGCAAGATAGCCATTTTTACTATATTGATCCTGCAAATCCATGAGCATGGGAGATCTTTCCATCTTCTGAGATCTTCTTTAATTTCTTTCTTCAGAGACTTGAAGTTCTTATCATACAGATCTTTCACTTTCTTAAGGGGTGAATTGCACTGACATTATGCTTACAGTGGCAGTTGTCTACTTTGGCCTGTTATCCAAAACATTAAAAGTAAGAACTAAAAAAGATCTGCAGAGAAAGCAAGATGACTTTATTCAAGGCAAAACTAGAGTACAGATTACAGACAGACACACTGCCAAAGATGGGCAGCTGCCCACTTGAGTGACTGGACTCAAGGTCTCCAAGCTATCATGTATGTCTCTATCAGGAGTCCCAAGGAGACCAATAAGCTCGTTACTAAGGTGCCAGATTTATTTTGATTAACAGATTAAAACATTTATTTCTCTAACTGTGTATGCCCCTTCTAGTAATACATCTGATTTTAATCATGCATAGGCCAAATTACACAAAGTCATAATGTGGCCATCTGAAGAACATGTTTTTGTCTTACTATGTAAACACCCCAAACCAGTTCAGGCAAAAATAGTGCTTCTGTTCTCCATACAAGGATACACTGAGAATATAGTTGAATGGAAACTGTCTGGACTTGATCATTATTAGGGTTGTGGGAAGTCAGAGACAGGTGTATGACTAGCTGGATACAGGGTTGGGGACTATGAAGGGTTCTTTTGTGTACTCTACAGCCTTGTTCACAGATGAGTGTGTGAGTATTCTAATACAGGTGGGGAGGAGGAAGAAGGGGTTGTTATATGGAGAACTCTATAGCATTGCTCAGAGAGGGGTGTGTGAGTAACCAGACACAGTTGGCTTGGGGAGGGGTCTAAAATCATGATTGGGTTCATTGAATAGAGTATGCATGAACATAAAAAAAGTTAAGCTGCCTAATGCCTTTTTATATAATCAAGTTGTGAATTATCCTTATCCAAGTGGTTTGGAATTCCCAGTTACTACACTATCCCCTACTTTTAAATGCATTTTTTCAATTGCTGCTTTTTTTCCAGATTGACGGGCCTTGGCAAACAATTTACTTAGCTGCCAGTACCATGGAAAAGATAAATGAAGGCTCACCATTGAGAACCTACTTCCGTCACATTTTGTGTGGGAGGAGATGCAACCAAGTCTACCTCTATTTTTTTATTAAGTAAGATATATATAGACAGAACAATCCATGTGATGGTGTGTGGGTAGAGAAATGAAATGTTTTCAGTCAACACTCATATCAGTGAACACACACAGCACTAAGTGAGGATCCTTGTATCACTTTGCTCTCTGTGGCCCTGAAAAGAACATATGATAGCTGTACTTTTAATTTAGAAAGAGTTAATTTATTTATTTTATAGCAAGCCAAAGTAAAATAAATAATATCAAGAGATGGGTGGTGATTATATGTAGATGATTTATACATAGATAATAGATATAAAATTAGATAGATGATACATATATGACTATGTAGGTGATAGATACATAATTAATAGATATAAAAGTAGATA
AATGATACATATACACAAATGTAGGTGTTAGATAATAGACAATTAGCTAGACATGTGATAGATATATACATAGATATGTAGGTAGACAGATAATAGATGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGAAAATATATATAGATAAATAATAGATGGATACATGCATGTATATACATAAATATATACATAGGTAGATTTAGATAGTATATAGATATATGAAAGAGACAGGCCTGGGTGTAAAGTACAATTGGAAAACATATTCAAAATCAGCTCCTTGTTGAAGTCAAACATTACAAGACAATTTACCAAAATAACAATTACTTAATTTTCTATTTATCACTGCTCAAAAGCCCAGGTGGTAGAGAAGATATATCAAAATAAATTTTGGGATAGTTTCAGTGCTTTTAAACTATATTGATATGAGTGATATTATCCATCGAGGATATATTTCTATCTGGGAGGAACAAAGTTTTTGTTAATTATGTGCTGATATTTGAACCTTTTCATAAATTAGCTATTTATAAATTCAGGTTTCTGAGAGTCTCTGTACATTGTTCTTGCTTTGAACTCAATATGGAGTAAAAAATAAACTTTTTAATTTTTATTTATTATATACACATATAATAAATAAATTAATAAACATATTGTATTTTTTCTGGACAGTGTTTCTCTGTGTAGCCCTGGCTACCCTGAAACTCACACTGTAGACCAGGCTGGTCTCAAAATCAGAGCCTCTACCTTCTGGGATTAGAGGTGTGTGTCATCACTACCTGGCTAATTTATATATTTTTAAAGACAGGTTCTCCTTATATAACCCTGGCTGTCCTGGAATTCTCTATGTACACCAGACTGGACTTGAACTCACAGAGATCTACCTGCCTCTACCTCTCAGGTGCTGGGTTCAAAAGTGTGTACCACCACAAGCAGTCTTTTTATGTTTTTATTTATTTTTGTTTTTAATTAATTTTTTTTAGAATGATAGTGAACCTATCCTCCTGAATCAGTCTGTTATGGATTCCACCATGCTTGACTTGATATTTAACATACCTAAAATTATTTAAAAGAGTAGTAGTTCAAAATCAAGTAGGTGTATACACTAGAATGTTGAAGAGATAATTTTATTTTTAAATACGTTTTTAAACTAAAACATAATTACATCATTTCCCCCTTCCATTTTCTCCCTCCAATCCTTCCCATGTTCCCTCCTCCCTTGGTCTTTTTCAAATCCTAGACTCCTTTTTCTTTAATTGTTCTCGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGCGCGCGCGCGTTCTTATTATACAACCTGTTCATTCCATATAATGTTACTTGTATATATATGATTTTTACAGTAACCATTTGATATTCAAAAAACTAACTGAGAGGGAGCACTTTTCTTCAGGGAAGACTACAAAATATCCTAAAATTCAATGGAGCCAGAAAAGATCCTGTGGCAGCTAAAACAATCCTGAGCAAATAGAACAGTGCTGGAAGGATTGCCATTCCAGATATGAGTGATTTTTTTTTACATGCAGAGAATGGACCCCCAAGGCCTTCAACAGGCTTTGTGTAACCACACCCTGCTATTAAACAGTTATTTCCATTTTTTGTTTGTTTTTGGTGGTGTGCTTTTAATGACTGAGTCATCTCTCCAGGCCAGTTATTCTCATGTTTTTAGAGCCATAAAGACAAATCTAAGCTGTGGCTGTCACTATTAGTCACATGGGTCTCTTCTGAGTGAATACAAATCTGAATGCATGACTATGCCTACAACTCTATGGATCTGTTGATGCACATCGCAACTTAAATCTTCTCAGAATACAATTGTAGGCTGAATTTAAATCCATTTAAATGCACAATTCAATTGCTGCTACAGAATCTATAAGATTGATGACATTTATGAGAGGGACAAAAAGTAAATACATGATCCCAATGTAAATATGGATGCTAGACTATAATGTATGTAGG
GCACTGAATGTGCTCTGCACCAGGTGGCCTTTCACCCTTTAAACCCAATTCTACAGACATTTCTCATCTCAGATTGGGGTAGAATTTTTGCTCAATATTTTGGATTTTTTTCACTTATCAAAAGTTGTTATTGGGTGCTGGGGATACAGCTGTGTTGGGAGTGATTTAACAGCATGAACCACAGACCCCAGGGCTCCATACTCTGCACAAAGTTCAGTATTACATTCCCATCATCCCAGCACTTGGTGAGTAGAGGCAGGAAGATCAGGGGTTCAAGGACATTCATTGGTCAGAGGTCAGGATAGGCAACAGATCACACAGAGGAGCTAAATCTGCCCTAACAGAGACAACTGCAAACAACTAACAGGGTGATTGATGACTTTTAAGGATGTTATTAATTATATTTAAGGAAAATGCCCTACAACTACATGTGCTTATAAGAGATCTACAGAGGAAAATGTGGTTTGCAATAAACTATCCTGTCATTTATGTGGAATCAGGACCCTGGAGTAATATACAATACGACATTTGTTTTTCAGGAAAGGGACTAAGTGCCAACTGTATAAAGTCATAGGAAGGAAAAAACAAGAAGTTTACTATGCACAGTGTGAGTAAGCAATGCCGGAATGAATGCACGACCTTAATTGTTTCTTTTTTCAAAATTCACTGTCTTGGTTGAATTTTGCACACCTTGATATCTACATTCCAATGTAATTGTCTCTACAAATTACAGATGCAATCTTGAGGAAAGCCTCCTCATTTTCTTCCTAGTCATTTTGGAACTAATAAGAGATTTAATTTGTTTGTTTAGTCTTTAATGACTTTTAAAAACTTTTTTATTTATATGGTGTATACCTCTACCAAGCACACTTCCCTGGTTTAGCTTTACGGGTAAACTAAATCACAGACAAGAGTCTGAGAATGGTTAAAAGAATTAAAAGGTGATTTGAGCACTTTAAAAAATAACGATGCAAAATAGGAACTAGGATTTTATCAACTATGAAAATGACTCCCAGCAATTCTGAAACATGTGTCTTGTACAAGGAATGATTAATTGTCTAAATTTTATAATTTTATTGGGGAAAATGATTTACAATATATAAAATTAAGCATCATGAATAGTCTTAGCTCTTTAGTTCTGAAAGGCAAGGTATATGACATACTGAGATCCATTTATAAATAATTCAGGTATCAATTTAATTTGTCATGAATCAAGTGGTAAAGATACTTTCATTATTTTCTATTCTACCATAAAGAGTCTATTTAACGCAATACATTAAACTGATAAACTAAACATGTCATGGACTCTAGGTTGGTTTATAGATAAGTTTATAGAAGAGTTTATTGTACTTAAATAGAGGAGCCGAATCAACTCACAAGTATATATCCCTCCAAAAAGCCCCACACACACCTCCTAGTGTGTTATGTAATTTAGTTGGAGGCCAGCCTGATCTAAACAACAAGTTTCAGGCCAGCCAAGGATACATAGTAAGACCCGTCTCAGAAGGAAAAGGAGTTACCACTGACATAAAAAGGAACAGTCATCATATCAATGAGGAGGATCTTGAGTCAACCCCTTCCCCTCTTTAAGAAGTCTTGCTAGTGAAGCAGTGAGTGTATTGGAGTCACATACAAGAACATTTGGAAATATGGCTGCATTCCTGAAAAGCCCAACTCAGCTTGGATGACACTCCTGAAAGTTTTATCCTTGGAGCTTTAGCAAGGCTTACAGACAGCTAAGCTAGTCAGAGTCTCTTCTAGAGGGCTTGTTCACTGTTTCCCATAACGTTGTGGAGGGGCCTACTGAAAGCTGTTGGTCTCAAAATCATCCTGAGATTTGTGGGGTTTTTACTTCTTGAATCTCATGAGCTACTAAGACAGAATGTTCCAGTTTGGAGAAAATCACTATGTGGCACCCTCCCTTTTTTCTGACATGCCTCTGAGGAAAATTGGTAAAGTTAGATAATTGCACCACTAAATGTAGATCAGCTCCATGAAAA
TGTTGAGATATATATATATATATATATATATGTATATAAAATCAGAGAGTTTTCCTCAATATGTGTATATATGTGTATATAATTCCACCATTAAATGTATTAAATGTATTAGGATCCATGAAAATGCCTATATATGTGCACACACACTATATATATATATATATATATATATATATATATATATATAATCTTTCCTCAATATATGTACTATAAACAATTCCACCATTAAATTTAATAGGATCAATGAAATGTATGCACATATTGCATATACATATAATCATACAGCTATTCTCAGCAAGTAAGAGGACTGCAGTATGTCAGCTTCTATACATTCACACATCTTGGTATCAAGAACATAGCTATTCCAAACCAATATAGGTTCTACATATTTGTAGGTAACTGTGGGATGTTAGATATATTTTATATCTTACAACAGAATTATAAACTTGCTTTATAATTCCCTGTTTGTCTTGACTAGTTTCAACACATGGAGCAGAATATTGTAGGTTAGGGACTGTATACTCCATATATACATTCATAACTGTGCAGGTAGAATGAGTCTGAGCAAGCATTTTGTTGACTTTATATGTATTTATACAAATGTCTTACCTATCTCAGATGAAGGGAGCATAGCATTCATGTTAAAGATGGTAAATGAGAAGATATTGTTGTTTCATTATTTTAACAAGAACAGAAGAAATGATGTCACACGAGTGGCTGGAGTTTTGGGTAAGTGTCACACATAGAACTTGTAATGTGAGAGTGTGATTCAAGGATATGAATGTATGCATATCTCTGTACCCAAAGATCACAATGTGTTTGTGTATCTGATTCTGTATGTCAGGTTGATAAATCACTGAAGCTGTCATAGTTAGCTTTGAAGATCAACTTGGAACAACCCAGAATGCCCTGGGAAGGGTCTCAGTAGGGAGATTAAGCAACAGAAATATTTTATTAGTGTGCTGGGTACTAATGTTGTGACATATTCCCTATACACCGTGAAACGCAGCAAAAGGCAAACAACTGAATAAGGAGGAGATGACGGAGTTCATGAACTTAGTGGAAGAAATGGGCATTGAGGAAGAGAATGTACAGCGTATCATGGACACGGGTATAGTAGCAACCTACGTGTGTAACTTCTTACTTTGTATTTTTATAATAAATGTTATTTTTTTATTATTGAGGGTGTTGGGAAGTTGGCTCAGTGGGAAAAGTGCACACAAGCATGAGGGCCCAAGTTTGGATCCCAGCACCCATTTAGAAAGCCAAGAATAACAGAGCTGGAAAGGGGAGGGGACAAGAGGAGCTCTCTAGTAATAAATCTATCTGAAAGGTCATATTTCAAAACTAACAATGTGTAGAATGATTAAAGGAGGTGTTAAAAAGTCAGCCTCTTACCTATGCATGCGTGGGCATACACATGTCATTCACACGACTATGAACACACCAATAAACCATGCAGATAGAAACACACACACACAAAATCTACCTGAAATGAAACTCAGTTCCTAGGTCCCTCTGTCCAGATACTCTAGTCAAAACCTAGCTTTATATCTGCATTTTTATGAAAACTGTTCATATGTACTAATGTTATTACTAATATTATGTTGAAGTACTTTATTACTTTAAAATAATACTGTTGTTTTAAAATAATACTTTACTTTAAATGTTAGTATGTTAAAAATAATATATTTCCGTGTTTACTCTCCTTTAGACAACTGTCCAAGCAAGATCAAACCTTAGTGACTCAACAAGATCAGGATTAGGTGAGTCAAAGCACATTAATTTTATATCTTGAAGTTTTAATTTTAATTTTATTTTAATATTTAAGTTATTAATATTTAATATTTAAAGTAAATAATATTTTATTTGTTCTTTATCCTTTTCATATATTTATATAATGTATATTTATCTTTCATACTTCCACACACCTCATCCAAGCAACACATCTCCCTCTATACTTCATGGCTTTTAAATTTTATTATTATTTATTATTATTA
TTACCTCAATGAATCCAATTGGTGCTATTTCTGAATAGGGTATGTCAATTCCTGGAGGCATGAGTAATGTAATATACCAGGGCTATTATGTAATATAAAAAAATGGACTGCAGGGAGCTGGTGGCACATGCATTTAACCTAGCGCTCAAGAGGCAGAAGCAAGCATATGTCTGCGTTTTAGATAACCCTAGTCTACAGAGTGAGTTCCAGGACAGCAAGGGCTACTCAGAGAAACCCTGTCTCATAAAACCAAAAAGTATATTTTTTAAAGGCAGAAGAGGAAAGGAGGGAGGGAGGGAGGGAAAGAGGCAGACAGACAGACAGGCAGGCAGGCAGGCAAAAAGAAGGAAAAAGAAAAATATGGACCCTCCCTCTCCCAGAAACTATCAACTGGCAATAGCTCTTTGGTGGGGGGGGGGGGTCCTGAATCCCTCTCTGCTCAATTCTAGAATGTTAATTAGCTTGATCTGACCAGGGTCTTGTGCAGAACACCACAGTTGCTGTGAGTTCATGATTGTAACAGATCTGTCATGTTCAGAAAACAGAATTTTATGGCTCTACTCCCCATCCAGATGATAATTGATGGCTAAGCATCCACAGTCACTTGTCCTCATCCTTTGACCAGCTACAAATTTCTGCACAAACCACTCCCCACTGTAAAAAGTTGAGCAGAAGTCAAACATGGAGGCACACACCCTTAATCCCAACACTTGGGAGGCACGGGTAGATGAATCTCTGTGAGTATCAAGCCAGCCTGATCTACATACTGAGCTCCAGAAGATCCAGGTACATAGTCTCTATCTAAACAAACAAACACAAAGTTAAAAAGTTGATTTGACCAAAATTGAGAGAAGCATAAATCTATGAGTATTTTTAAGACCGTGGCTTGGAAACATGACAGTTCATCACCACTGGTCTCCTTCATAGGCTCCATGAGCTCCATTGTCACAGGCTTTTGACTCGAATTACAATAGAAACCCACCCTTACCCCTGTTCTTCCATAGATATGAAGTTCCTTCCATGGAGCTGGCATGAAATTTAATCAGAGAGTGCTTGGCTCCCCAATACAGCCTTTATTGCACCAGTGGACGCAAGGATGATTTTATAGTGTGTTGGGGAAGCATGGTGACATCACTGACATCTTATCCCACATACACAGCCTATTAAGTACATCTAAGCACTGTGAACGAGTTTCCTAGTCCATTTGACATCGATTTCTTGATGCCCTACAGCCACAGCATTTGGTGTCTTCAGCAATAGTGCCCTACCATTTAGATATGGTGTATAGTCAAGAGATATGGCAATAGCCAAGTTATTTTGGTTATTCCAAGGCTTCCTTCCATTAATAAATATCATGGTGGTACCCCCATGACTAAAAATTAGATTTTCACTGAATAAACCATGTCTTCTGGGAACAGCATTATACCATTGCAGGGATCCTCTGCTGAAACTTTTTTAATACTATATTTTTACTTAGCTTACAAACTAGTGGATTTCTGTAAGACTTCATTTACCTTCAGTTTCAGTTGACCCTCCCCTACCCTGTTCTTCCCTATGCCCAAACACATCCACACCTACTCCTCTAGCCCACAGCTCTCACTTTCTAATCTTCCCTGTCACCAGTGCCCAATTATATCGCCTGTATTATTATATTTTAATCACACAACCATAGGTTTCCATATGAGGTTTTAATAACCCTTCATTCTGGTTAAACCTTCCACCACACCCTGATTTCCCCATTCCACAACCAACTCCATGATTAAGCCTTCCTGCCCCAAGTATTCTTCTTTATACTTCATTTTAATGGCATTACATTTGATGGACCCACTTCCTTGATGGACCCAATTCAAACCAGTTTCTAATTACCTGGATTCCTTACATACTCCATATTATGCATACAAAATAAAAGATTCAAGTCTAATGTCCACATGTGAGATAGAATGTGCAGTTTGTCTTTCTGAGCCTGAGTGGCCTCATTAAGTATAATAATTTCCAGTTCC
TTCTATTTACTTGGAAATTTCATATTTTCATTGTTCTCTATGGCTGAGTAATATTCCATCTTATACATTACATTTTCCTTATCCATTCATTAGTTGATGAACAGTTGGGTCAACTTCGTTTCTTAGCTATTATGAACTTAACCTCAGTGAGCATGGACATTCAAAGGTCTCTGTAACAGAATATAAACCCCTTTGTGTACATATCTAGAAATGGAGGAACTAGAGAAAGCACCCAAGGAACTAAAGGGAACTGCAACCCTATAGGTGGAACAACAATATGAACTAAGCAGTACCCCGGAGCTCTTGTCTCTAGCTGCATATCTATCAAAAGATGGCCTAGTCGGCCATCACTGCAAAGAGAGGCCCATTGGACTTGCAAACTTTATATGCCCCAGTACAGGGGAATGCCAGGGCCAAAAAGGGGGGATGGGTGGGTAGGGGAGTGGGGGGGGGTTGGGGACTTTTTGTATAGCATTGGAAATGTAAATGAGCTAAATACCTAATAAAAAATGGAAAAAAAAAGAAATGGTGTAACTGAGTCACATGGGAAGTCTTTTTCTAATCTTTGGATTTGTTTATTTGATGTTCATAGTTTTTGGGTTTTGGTTGGTTTTGTTCTAAGTTCTTTGTATATTGTAGACACTAATCCTCCATCACATGTGTAGTTGGCAAAGATCTCCATTCCCCGAGATACCTGTGCATTTAATTGACAGCTTCCTTTGCCGTAGTTTTTAATTCCATGATATCTGACAAGTGTTTGTCTTACTTTCTTGCTACAAGAATCCTATTCAAAGAATCCATACCTGTGTCTATGGTTAACACACACTCCATGCTTTCTTCTCTATCAGCTTCAGGTTACCATGTCTTATGACACGGTCTTTGAATCATTTGGAGTTGAGGTTTTTTTCAAGGTGACAGGGAAGAGCCCAGGTTCATTTCTCTGTATCCTGATGTCCACTTTTTCCTATTCGGTCTATTTATGGAATTATATATTTTTATGTTAGGTCATTTTTCAGTGGAGGCATTAACAATATCCAGAAGGGGACTATTTCTTACTAGTGTTGAGATGGTATTCTCCTATATGGGGCTGGTACTGAAGGCAGCAAGTTCTACCCCTAGTCTTCCTGTGAGATTCAACTACTATCTGGGACCTCGAGTGAGACTCTGTCTGTAGGATACATGGGGCTTGGTAAACTCCAATGTGAAACAAAAAATATATAATTTTAGTTTAGATTCATAGAAACTACATCCTCAAATAAACACATAAGTTCTAAAAAGTACCAATTTAGGTCTTGATATAAGATCATTTGTCATATAAAAATTTTCCATATAAGGAAAATTTCCATACAAAGTTCATGTATATTTCCAAATATACAAAATTCTGTAAAATGTTTTTGCATGATACATCTTGTCATTGTTTGCCTCTTTAATGGCTTGTATTTGTTTCATTTTCTACTCTCATCAAATATCATGTATTACTATCCTAAATATATGAAATAATTCTGTTCCAGCATTACAGATGACATCAGGAATTTTCCAGTATATTTTTCCTGGAACCTGAAACATCAATATGAAGATGAAGCAATCTTGTCTCTCAGATCATATTTTCCTATTTATTGCAAATTACAATTCCTGTCTCTGTACTTTCTCTTTCACTCATTGTTTCCCATGTTCTAATCGGTATTAGTGCATCTTTGAATGTTTAAATAAATTTATTTTACTTGC
  RefSeq DNA ID [e.g. NM_203373]    Ensembl Gene ID
1                      NM_017471 ENSMUSG00000000003

 

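Note that the column headers in the result are the BioMart website attribute names (for example "RefSeq DNA ID [e.g. NM_203373]") rather than the biomaRt package attribute names (for example "refseq_dna").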
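If you would rather work with the package-style names, you can rename the columns afterwards. The snippet below is only a rough sketch: it builds a mock data frame from the two-column output shown above (so it runs without network access), and the lookup vector `header_to_attribute` is a hypothetical, hand-written mapping that you would need to check against your own mart.

```r
# Mock of the result shown above. check.names = FALSE keeps the
# website-style headers intact; a real getBM() call is not made here.
res <- data.frame(
  "RefSeq DNA ID [e.g. NM_203373]" = "NM_017471",
  "Ensembl Gene ID" = "ENSMUSG00000000003",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical lookup (an assumption, written by hand):
# website header -> biomaRt attribute name.
header_to_attribute <- c(
  "RefSeq DNA ID [e.g. NM_203373]" = "refseq_dna",
  "Ensembl Gene ID" = "ensembl_gene_id"
)

# Rename only the columns that appear in the lookup; leave others untouched.
known <- names(res) %in% names(header_to_attribute)
names(res)[known] <- unname(header_to_attribute[names(res)[known]])

print(names(res))
```

After this, `res$refseq_dna` and `res$ensembl_gene_id` can be used as usual.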

 

Hope this helps,

Regards,

Thomas
