question about forgeBSgenomeDataPkg function
@herve-pages-1542
Seattle, WA, United States
Hi Brian,

I'm putting this on the mailing list since this might actually affect other users.

Brian Herb wrote:
> Herve-
>
> You previously helped me with building the BSgenome package for the Rat,
> and now I am helping my lab mate create a BSgenome package for the
> rhesus monkey. We are running into an error when he reads in the gap files:
>
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines,
> na.strings, :
> scan() expected 'an integer', got 'fragment'
>
> we wonder if the issue is that the gap files are in a slightly different
> format than what I am used to with the rat:
>
> Example Rat gap file:
>
> 585 chr1 1360 2576 2 N 1216 fragment yes
> 585 chr1 5378 5428 4 N 50 fragment yes
> 585 chr1 13845 13895 6 N 50 fragment yes
> 585 chr1 23435 23485 8 N 50 fragment yes
> 585 chr1 25955 26005 10 N 50 fragment yes
> 585 chr1 33306 33356 12 N 50 fragment yes
> 585 chr1 35384 40627 14 N 5243 fragment yes
> 585 chr1 45904 46169 16 N 265 fragment yes
>
> Example rhesus monkey gap file:
>
> chr1 17248 17350 2 N 102 fragment yes
> chr1 26206 26619 4 N 413 fragment yes
> chr1 27937 28130 6 N 193 fragment yes
> chr1 47170 48593 8 N 1423 fragment yes
> chr1 83907 85189 10 N 1282 fragment yes
> chr1 95455 96505 12 N 1050 fragment yes
> chr1 100303 100323 14 N 20 fragment yes
> chr1 132263 132283 16 N 20 fragment yes
> chr1 151325 152178 18 N 853 fragment yes

Good catch! It seems that we indeed can't assume that UCSC uses a consistent schema for its 'gap' table. For Rat and every other organism I've seen so far, the columns are the following:

http://genome.ucsc.edu/cgi-bin/hgTables?db=rn4&hgta_group=map&hgta_track=gap&hgta_table=gap&hgta_doSchema=describe+table+schema

but for Rhesus, the 'bin' column is missing:

http://genome.ucsc.edu/cgi-bin/hgTables?db=rheMac2&hgta_group=map&hgta_track=gap&hgta_table=gap&hgta_doSchema=describe+table+schema

I've tried to accommodate this in read.gapMask().
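To make the schema difference concrete, here is a minimal standalone sketch in Python. It is not the read.gapMask() code and is not taken from IRanges. It assumes the only difference between the two layouts is the optional leading 'bin' column, so a row has either 9 or 8 whitespace-separated fields. The names `Gap` and `parse_gap_line` are hypothetical, chosen for illustration only.

```python
from typing import NamedTuple

# Assumption: without 'bin', a gap row has these 8 fields:
# chrom, chromStart, chromEnd, ix, n, size, type, bridge
FIELDS_WITHOUT_BIN = 8


class Gap(NamedTuple):
    """Hypothetical record holding the fields of interest from one gap row."""
    chrom: str
    start: int
    end: int
    size: int
    gap_type: str
    bridge: str


def parse_gap_line(line: str) -> Gap:
    """Parse one gap row, with or without the leading 'bin' column."""
    fields = line.split()
    if len(fields) == FIELDS_WITHOUT_BIN + 1:
        # Rat-style row: drop the leading 'bin' value.
        fields = fields[1:]
    elif len(fields) != FIELDS_WITHOUT_BIN:
        raise ValueError(f"unexpected number of fields ({len(fields)}): {line!r}")
    chrom, start, end, _ix, _n, size, gap_type, bridge = fields
    return Gap(chrom, int(start), int(end), int(size), gap_type, bridge)


# One row from each example file quoted above.
rat = parse_gap_line("585 chr1 1360 2576 2 N 1216 fragment yes")
rhesus = parse_gap_line("chr1 17248 17350 2 N 102 fragment yes")
```

Counting fields per row is only one possible way to detect the layout; it would break if a table differed in some other column, so treat it as an illustration of the problem rather than a general parser.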
This change will be available in IRanges >= 1.5.56 (devel) and IRanges >= 1.4.13 (release). Both versions should become available through biocLite() in the next 24 hours.

Please let me know if you encounter any further issues.

Thanks for the report!

H.

> we wonder if the missing column in the monkey gap file is throwing off
> the forgeMasksFiles function, and if there is something that we can
> stipulate in this function to change which column it is looking for.
>
> > sessionInfo()
> R version 2.10.0 (2009-10-26)
> x86_64-unknown-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.iso885915 LC_NUMERIC=C
> [3] LC_TIME=en_US.iso885915 LC_COLLATE=en_US.iso885915
> [5] LC_MONETARY=C LC_MESSAGES=en_US.iso885915
> [7] LC_PAPER=en_US.iso885915 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] BSgenome_1.14.2 Biostrings_2.14.10 IRanges_1.4.9
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.6.0
>
> Kind Regards,
> Brian
>
> --
> Brian Herb
> Graduate Program in Biochemistry, Cellular and Molecular Biology
> Johns Hopkins School of Medicine
> Dr. Andrew Feinberg Laboratory
> Rangos 580
> 855 N. Wolfe St.
> Baltimore, MD 21205
> Phone: 410-614-3479
> Fax: 410-614-9819

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
Cancer BSgenome IRanges • 541 views