Question: Need a BSgenome.Btaurus.UCSC.bostau9 package
t.nguyen3 wrote (4 months ago):

Hi there,

I need to create an annotation package to run RnBeads with the ARS-UCD1.2/bosTau9 assembly. The assembly is available from UCSC at ftp://hgdownload.soe.ucsc.edu/goldenPath/bosTau9/database/, but I can't find a corresponding BSgenome package on Bioconductor. Does anyone have one, or know how to forge it? I looked at the BSgenomeForge vignette (http://bioconductor.org/packages/release/bioc/vignettes/BSgenome/inst/doc/BSgenomeForge.pdf), but it seems very hard to me since I am very new to R.

Many thanks,
Loan

Answer: Need a BSgenome.Btaurus.UCSC.bostau9 package
James W. MacDonald (United States) wrote (4 months ago):

I'm not sure you need to make a BSgenome package these days; you can instead simply download the 2bit file and import it.

> library(rtracklayer)
> download.file("ftp://hgdownload.soe.ucsc.edu/goldenPath/bosTau9/bigZips/bosTau9.2bit", "bosTau9.2bit", mode = "wb")  # "wb" keeps the binary file intact on Windows
trying URL 'ftp://hgdownload.soe.ucsc.edu/goldenPath/bosTau9/bigZips/bosTau9.2bit'
downloaded 679.5 MB
> z <- TwoBitFile("bosTau9.2bit")
> z
TwoBitFile object
resource: bosTau9.2bit 
> zz <- import(z)
> zz
  A DNAStringSet instance of length 2211
           width seq                                        names               
   [1] 158534110 GTACACTGATCACGTGGCTG...AGATAAATCCATTAAATGA chr1
   [2] 103308737 TGGAAATTAAAGGGAAGAAT...TAGCGTTAGGGTTCGCGTA chr10
   [3] 106982474 TCATGCACTGATCACGTGGC...GTGTTGGCCAGGGAAGTAT chr11
   [4]  87216183 TTTCATGATCAAAAGCCACG...AGGAGTGGAATTGGTGAGC chr12
   [5]  83472345 ATGCACACATCAGGTGGCTT...AGGGGGGTTGGGTTAGGGT chr13
   ...       ... ...
[2207]   1124660 TAGTGTGGAGGATCACTATT...ATAACTACTGAAGCCTGTG chrUn_NW_020192291v1
[2208]   9309904 CCATACAGCACCAATGATAA...ATATTCAGTACCTATTTAT chrUn_NW_020192292v1
[2209]     27572 TCTCCTGTGTTGTAGGAAGA...TGATGCCTGTTCAGTGATT chrUn_NW_020192293v1
[2210]    986051 TGCCCATGTGTATATACCTG...ACAAAAAATCGAGTACTCT chrUn_NW_020192294v1
[2211] 139009144 CCTAACCCTAACCCTAACCC...TGTATTTCTCTTTCTTTTT chrX

You can also import just a portion of a sequence:

> import(z, which = GRanges("chr1:1-100000"))
  A DNAStringSet instance of length 1
     width seq
[1] 100000 GTACACTGATCACGTGGCTGATCATGCACAAAT...CTTACCTTTTCATGCACTGATTACCTGGCTATC
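
If the unplaced chrUn_* scaffolds are not needed, the same `which` argument can probably be used to restrict the import to the assembled chromosomes. Here is a minimal sketch, assuming bosTau9.2bit is in the working directory. The helper name `import_assembled` and the name pattern are my own illustrative assumptions, not part of rtracklayer:

```r
library(rtracklayer)

# Hypothetical helper: import only the sequences whose names match `pattern`.
import_assembled <- function(path, pattern = "^chr([0-9]+|X|Y|M)$") {
  tb <- TwoBitFile(path)
  si <- seqinfo(tb)                                # sequence names and lengths only
  keep <- grep(pattern, seqnames(si), value = TRUE)
  regions <- as(si[keep], "GRanges")               # one full-length range per sequence
  seqs <- import(tb, which = regions)
  names(seqs) <- as.character(seqnames(regions))   # label each sequence by its name
  seqs
}
```

Calling `import_assembled("bosTau9.2bit")` should then return a DNAStringSet without the scaffolds. Adjust the pattern if you use different naming.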


Hi James,

Thank you very much for your suggestion. I tried to follow your steps, but I got the error message below:

> zz <- import(z)
Error in .seqlengthsTwoBitFile(con) : UCSC library operation failed
In addition: Warning message:
In .seqlengthsTwoBitFile(con) : End of file reading 4 bytes

Do you have any idea why it does not work?

Thanks, Loan

Reply written 4 months ago by t.nguyen3

Try downloading it directly, using a browser.
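
An "End of file" error like this usually points to a truncated or corrupted file. One possible cause on Windows, offered as an assumption rather than a confirmed diagnosis: download.file() writes in text mode ("w") by default, which can corrupt binary files such as a .2bit, so mode = "wb" is needed there. Below is a minimal sketch that downloads in binary mode and checks that the file is readable before importing it. `twobit_readable` and `fetch_twobit` are hypothetical helper names, not rtracklayer functions:

```r
library(rtracklayer)

# Hypothetical helper: TRUE if rtracklayer can read the sequence index of `path`.
twobit_readable <- function(path) {
  if (!file.exists(path)) return(FALSE)
  si <- tryCatch(suppressWarnings(seqinfo(TwoBitFile(path))),
                 error = function(e) NULL)
  !is.null(si)
}

# Hypothetical wrapper: download in binary mode unless a readable copy already
# exists, then fail early if the result still cannot be read.
fetch_twobit <- function(url, dest) {
  if (!twobit_readable(dest)) {
    download.file(url, dest, mode = "wb")  # "wb" avoids text-mode corruption on Windows
  }
  if (!twobit_readable(dest)) {
    stop("'", dest, "' does not look like a readable 2bit file")
  }
  invisible(dest)
}
```

For example, `fetch_twobit("ftp://hgdownload.soe.ucsc.edu/goldenPath/bosTau9/bigZips/bosTau9.2bit", "bosTau9.2bit")` followed by `import(TwoBitFile("bosTau9.2bit"))`.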

Reply written 4 months ago by James W. MacDonald

Hi James, I think the problem was caused by using Windows. I ran it on macOS and it worked.

Thanks again for all your help and your time.

Loan

Reply written 4 months ago by t.nguyen3