Exon-exon junction library for human
shirley zhang ★ 1.0k
@shirley-zhang-2038
Last seen 9.6 years ago
Dear List,

I am working on a next-generation sequencing data set from human. I am wondering whether a public exon-exon junction library is available.

Thanks,
Shirley
Sequencing • 1.2k views
@steve-lianoglou-2771
Last seen 13 months ago
United States
Hi Shirley,

I'm not sure I understand what you mean, but perhaps this will help.

I recently got ERANGE [1] working for some RNA-seq analysis. One of the preliminary steps is to create a FASTA file of sequences that span known exon/exon junctions.

If you want to make such a file, read the instructions in the README.build-rds.txt file of the ERANGE package.

That said, I've built such a file for 32bp reads against the human hg19/NCBI37 genome, which I can give you if you like (with the disclaimer that I can't be held responsible if the file is actually wrong).

-steve

[1] ERANGE: http://woldlab.caltech.edu/rnaseq/

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
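To make the idea of a junction sequence concrete: it is the tail of one exon joined to the head of the next, so that a read crossing the splice site can align to it. The following is a minimal Python sketch under stated assumptions, not ERANGE's own code. The function name junction_sequences, the min_overhang parameter, the record names, and the 0-based half-open exon coordinates are all hypothetical choices made for illustration. It also assumes a single forward-strand transcript whose exons are already sorted.

```python
def junction_sequences(chrom_seq, exons, read_len=32, min_overhang=4):
    """Yield (name, sequence) for each pair of consecutive exons.

    chrom_seq    -- chromosome sequence as a string
    exons        -- sorted list of (start, end), 0-based half-open,
                    forward strand (an assumption of this sketch)
    read_len     -- read length the junction library is built for
    min_overhang -- hypothetical parameter: minimum number of bases a
                    read must place on each side of the junction
    """
    # Taking read_len - min_overhang bases from each side means any read
    # of length read_len that fits in the junction sequence must cross
    # the junction with at least min_overhang bases on both sides.
    flank = read_len - min_overhang
    if flank <= 0:
        raise ValueError("min_overhang must be smaller than read_len")

    for i in range(len(exons) - 1):
        up_start, up_end = exons[i]
        down_start, down_end = exons[i + 1]
        # Tail of the upstream exon (clipped if the exon is short).
        left = chrom_seq[max(up_start, up_end - flank):up_end]
        # Head of the downstream exon (clipped if the exon is short).
        right = chrom_seq[down_start:min(down_end, down_start + flank)]
        yield f"junction_{i}_{i + 1}", left + right


if __name__ == "__main__":
    toy_chrom = "AAAAACCCCCGGGGGTTTTT"
    toy_exons = [(0, 5), (10, 15)]  # the C run plays the intron
    for name, seq in junction_sequences(toy_chrom, toy_exons):
        print(f">{name}\n{seq}")
```

A real library would also need to handle the reverse strand and pull exon coordinates from a gene annotation, both of which are left out here.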
Hi Steve,

Thank you for your quick response. If you can share your file with me, that would be fantastic.

Thanks again,
Shirley

--
Xiaoling (Shirley) Zhang
Ph.D. Candidate in Bioinformatics
Boston University, Boston, MA
Tel: (857) 233-9862
Email: zhangxl@bu.edu
Hi,

You can get the file here:

http://cbio.mskcc.org/~lianos/files/hg19_splices.fa.zip

That was created with the hg19/masked FASTA files. The leading and trailing NN dinucleotides in each sequence were inserted by ERANGE.

I'm actually not sure this file will help you directly unless you're planning to use ERANGE's pipeline, though.

-steve
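If the junction sequences are to be used outside ERANGE, the NN padding described above may need to come off first. Here is a small Python sketch of one way to do that, assuming every record carries exactly two leading and two trailing N bases as described. The helper names read_fasta and strip_nn_padding and the record names j1 and j2 are hypothetical. The real header format of the file is not shown in this thread.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def strip_nn_padding(seq):
    """Remove one leading and one trailing 'NN' if present.

    Only the two-base padding is removed; any further N bases that are
    part of the sequence itself are left alone.
    """
    if seq.startswith("NN"):
        seq = seq[2:]
    if seq.endswith("NN"):
        seq = seq[:-2]
    return seq


if __name__ == "__main__":
    toy = [">j1", "NNACGT", "ACNN", ">j2", "NNTTNN"]
    for name, seq in read_fasta(toy):
        print(f">{name}\n{strip_nn_padding(seq)}")
```

To run it on the unzipped file, pass an open file handle to read_fasta in place of the toy list.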
Thanks. I am planning to use ERANGE and TopHat. Your file will definitely help, especially since it is built against the human hg19/NCBI37 genome.

Thanks again,
Shirley
Hey,

Sorry, that splices file was actually made against the unmasked hg19 FASTA file.

I'm not sure, but you might have to make your own hg19 database for cistematic at some point down the road, so you might want to do that now and build your own splices file anyway. But that's certainly your choice ...

-steve
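Since the masked/unmasked distinction caused some confusion here, a quick check of a sequence file can help. The Python sketch below reports what fraction of a sequence is lowercase (the usual convention for soft-masked repeats) and what fraction is N (the usual convention for hard-masked repeats). The function name masking_summary is hypothetical, and which convention a given hg19 download follows is an assumption to verify against its README. Any NN padding added by ERANGE will count toward the N fraction.

```python
def masking_summary(seq):
    """Return the fraction of lowercase bases and of N bases in seq.

    A high lowercase fraction suggests soft-masking; a high N fraction
    suggests hard-masking (or assembly gaps). This is only a heuristic.
    """
    total = len(seq)
    if total == 0:
        return {"lowercase": 0.0, "n": 0.0}
    lowercase = sum(1 for base in seq if base.islower())
    n_bases = sum(1 for base in seq if base in "Nn")
    return {"lowercase": lowercase / total, "n": n_bases / total}


if __name__ == "__main__":
    print(masking_summary("ACGTacgtNN"))
```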
I see. Thanks again for your help,
Shirley