intron sequences from biomaRt
Dario Greco ▴ 310
@dario-greco-1536
Last seen 9.6 years ago
Dear list,

I need to get the intron sequences for a group of Entrez Gene IDs. Is there any way to do it using biomaRt? Apparently there is no option in the getSequence() function.

    > sessionInfo()
    R version 2.5.0 (2007-04-23)
    i686-redhat-linux-gnu

     biomaRt    RCurl      XML
    "1.10.0"  "0.8-1"  "1.7-3"

Any suggestions? Thanks for your help.

Yours,
d

--
Dario Greco
Institute of Biotechnology - University of Helsinki
Building Cultivator II
P.O.Box 56, Viikinkaari 4
FIN-00014 Finland
Office: +358 9 191 58951
Fax: +358 9 191 58952
Mobile: +358 44 023 5780
biomaRt biomaRt • 2.2k views
@steffen-durinck-1780
Last seen 9.6 years ago
On Fri 6/15/2007 9:13 AM, Dario Greco wrote:
> I need to get the intron sequences for a group of Entrez Gene IDs.
> Is there any way to do it using biomaRt?

Dear Dario,

No, there is currently no possibility to select intronic sequences directly. You could request this at helpdesk at ensembl.org and see if they want to add this feature in future versions of Ensembl. For now, the only way to do it with biomaRt would be to retrieve the transcript or gene sequences and the exon sequences, and then find the intronic sequences by splitting the transcript sequences on the different exons.

Hope this helps,
Steffen
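Steffen's suggestion of splitting the unspliced transcript on its exons can be sketched in base R roughly as follows. This is only an illustration, not code from the thread: the function name `extract_introns` and the toy sequences are hypothetical, and it assumes the unspliced transcript and its exon sequences (in transcript order, non-overlapping) are already available as plain character strings, however they were retrieved.

```r
# Hypothetical helper: return the stretches of `unspliced` that lie
# between consecutive exons. `exons` must be in transcript order.
extract_introns <- function(unspliced, exons) {
  introns <- character(0)
  pos <- 1L  # first base of the transcript not yet consumed
  for (i in seq_along(exons)) {
    # Search only downstream of the previous exon, so that a short exon
    # cannot match by chance inside an earlier part of the transcript.
    rest <- substring(unspliced, pos)
    hit <- as.integer(regexpr(exons[i], rest, fixed = TRUE))
    if (hit == -1L) stop("exon ", i, " not found in the transcript")
    start <- pos + hit - 1L
    # Everything between the previous exon and this one is an intron.
    if (i > 1L) introns <- c(introns, substr(unspliced, pos, start - 1L))
    pos <- start + nchar(exons[i])
  }
  introns
}

# Toy data (made up for illustration): three exons and two introns.
exons <- c("ATGGCCAAG", "GTTCTGGAA", "CTGTAA")
unspliced <- paste0(exons[1], "GTAAGTCTTCAG",
                    exons[2], "GTGAGTAACCAG",
                    exons[3])

introns <- extract_introns(unspliced, exons)
print(introns)
```

Searching sequentially, rather than matching every exon against the whole transcript, is a design choice that keeps repeated or very short exons from being placed at the wrong position. The sketch handles one transcript at a time; for a gene with several transcripts it would have to be applied to each transcript separately.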
Hi Steffen,

Thank you very much for your email. I asked the Ensembl helpdesk, and this is what they replied:

"...The only way to retrieve intronic sequences in batch mode is by using the Ensembl Core Perl API: http://www.ensembl.org/info/software/core/index.html I hope this answers your question. Please let us know if you have any other questions or problems..."

Is there any plan to implement or use this from R? Or should I start from scratch with BioPerl?

Thanks again,
yours
d

--
Dario Greco
Lab WebPage: http://www.biocenter.helsinki.fi/bi/dna-microarray/
Personal WebPage: http://www.biocenter.helsinki.fi/bi/dna-microarray/dario.htm
@steffen-durinck-1780
Last seen 9.6 years ago
On Mon 6/18/2007 7:14 AM, Dario Greco wrote:
> "...The only way to retrieve intronic sequences in batch mode is by using
> the Ensembl Core Perl API..."

Hi Dario,

If the BioMart web server implemented this, it would become readily available in R, but there are currently no plans for them to do so, and we do not currently intend to implement it ourselves in biomaRt either (unless you want to contribute this function ;) ). Would my previous suggestion of subtracting the exonic sequences from the unspliced transcript sequences not work? If not, you will indeed have to use the Ensembl Core Perl API and write a script to retrieve the intronic sequences.

Best regards,
Steffen
Hi Steffen,

Thank you once again for your answer. Your advice of subtracting the exons from the unspliced transcripts worked out. I have done it within R using the biomaRt and Biostrings facilities.

At the moment my code works fine, but it is quite ugly and needs optimization. However, if there is any interest, I will be happy to share it (after some needed cosmetics ;-) ).

Thanks again,
d
Dario Greco wrote:
> Your advice of subtracting the exons from the unspliced
> transcripts worked out.
> I have done it within R using the biomaRt and Biostrings facilities.

Hi Dario,

There is a new function 'mask' in Biostrings 2.5.10 that may be useful for your problem. For example, let's say that you have already managed to get the starts/ends for the 30 exons belonging to gene "FBgn0025803" (chromosome 3R) of the fly:

    exons_start, exons_end: integer vectors of length 30

Note that the start and end of gene "FBgn0025803" are 'min(exons_start)' and 'max(exons_end)'.

    library(BSgenome.Dmelanogaster.FlyBase.r51)
    # One view on chromosome 3R per exon
    exons <- views(Dmelanogaster[["3R"]], exons_start, exons_end)
    exons
    # The regions of the chromosome not covered by any exon
    introns <- mask(exons)
    # Drop the first and last regions, which lie outside the gene
    introns <- introns[-c(1, length(introns))]
    introns

Note that the introns obtained by this method are the portions of the gene that don't belong to any of the exons. This is different from what you would get if you extracted the introns by looking at each individual splicing. Also note that, in the example above, we get only 24 introns: this is because there are overlaps among the 30 exons. See ?mask for other examples.

Cheers,
H.
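The idea behind the 'mask' example, taking the parts of the gene that no exon covers, can also be sketched with plain coordinates in base R. The following is a hedged illustration, not the Biostrings implementation: the function name `intron_ranges` and the toy coordinates are made up, and exon starts and ends are assumed to be 1-based, inclusive integer positions on the same strand of one chromosome.

```r
# Hypothetical helper: given exon start/end coordinates (possibly
# overlapping), return the gaps between them as a data frame.
intron_ranges <- function(exons_start, exons_end) {
  stopifnot(length(exons_start) == length(exons_end),
            length(exons_start) > 0L,
            all(exons_start <= exons_end))
  ord <- order(exons_start, exons_end)
  s <- exons_start[ord]
  e <- exons_end[ord]
  n <- length(s)
  # Furthest position covered by the exons seen so far
  reach <- cummax(e)
  # A gap opens before an exon when it starts beyond everything covered
  # so far; overlapping or directly adjacent exons produce no gap.
  gap <- s[-1L] > reach[-n] + 1L
  data.frame(start = reach[-n][gap] + 1L,
             end   = s[-1L][gap] - 1L)
}

# Toy coordinates (made up): five exons, two pairs of which overlap.
exons_start <- c(100L, 150L, 300L, 500L, 520L)
exons_end   <- c(200L, 250L, 400L, 600L, 560L)

res <- intron_ranges(exons_start, exons_end)
print(res)
```

Because overlapping exons are merged before the gaps are taken, five exons give only two intronic regions here, which mirrors the remark above that 30 exons yielded 24 introns. As in the 'mask' approach, these are regions not covered by any exon of the gene, not the introns of one particular transcript.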
Hi Steffen,

I'm wondering if you could share this code? Unless 11 years is not enough time for cosmetics ;-)

Thank you!

Nico
