Uploading BED file: rtracklayer
swaraj basu ▴ 50
@swaraj-basu-4629
Hello Everybody,

I was using the rtracklayer package to upload CAGE tag BED files as genomic ranges objects. I used the commands

    library(rtracklayer)
    tmp <- import("filename.bed", asRangedData=FALSE)

and they were working fine. My BED file contains 7 fields in the BED format:

    chr1 3783 3784 ctss1 0.128643307100390 + 1
    chr1 3788 3789 ctss2 0.128643307100390 + 1
    chr1 3798 3799 ctss3 0.128643307100390 + 1
    chr1 3822 3823 ctss4 0.128643307100390 + 1
    chr1 3830 3831 ctss5 0.128643307100390 + 1
    chr1 3843 3844 ctss6 0.128643307100390 + 1
    chr1 3862 3863 ctss7 0.128643307100390 + 1
    chr1 6254 6255 ctss8 0.128643307100390 - 1
    chr1 7219 7220 ctss9 0.128643307100390 + 1
    chr1 14028 14029 ctss10 0.128643307100390 + 1

Recently I updated all the installed Bioconductor packages using the commands

    source("http://bioconductor.org/biocLite.R")
    biocLite(character(), ask=FALSE)

Now when I try to upload the same file I get an error:

    tmp <- import("filename.bed", asRangedData=FALSE)
    Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") :
      solving row 1: range cannot be determined from the supplied arguments (too many NAs)

However, if I upload another BED file with all 12 fields, I do not get any error message, e.g.

    chr1 25630757 25636172 ul45 0 + 0 0 0,0,0 5 1142,2065,91,323,482, 0,1399,4271,4518,4933,
    chr1 25631826 25634542 ul46 0 - 0 0 0,0,0 1 2716, 0,
    chr1 25631826 25635826 ul47 0 - 0 0 0,0,0 1 4000, 0,
    chr1 28599188 28604017 ul48 0 - 0 0 0,0,0 4 1086,116,336,221, 0,1199,1425,4608,
    chr1 40911858 40912963 ul49 0 + 0 0 0,0,0 1 1105, 0,

Is this a change made in the update, or can the new version of rtracklayer only accommodate complete BED files? Can someone please help?

Regards,

--
Swaraj Basu
PhD Student (Bioinformatics - Functional Genomics)
Animal Physiology and Evolution
Stazione Zoologica Anton Dohrn
Naples
@michael-lawrence-3846
Hi Swaraj,

You're splitting the BED file at a place that does not make a lot of sense to me, i.e., between the thickStart and thickEnd columns. When rtracklayer goes to load the thick regions, it can find only the start, not the end, and it fails. Anyway, this is a bug in rtracklayer and I will go ahead and make it assume the missing thickEnd is equal to the feature end.

Michael
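To make the column mismatch concrete, here is a minimal base R sketch that labels one record from the 7-field file with the standard BED field names in order. It assumes whitespace-separated fields; the variable names (`bed_fields`, `record`, `values`) are made up for illustration, and this is not how rtracklayer parses the file internally.

```r
# Standard BED field names, in the order defined by the UCSC BED format.
bed_fields <- c("chrom", "chromStart", "chromEnd", "name", "score", "strand",
                "thickStart", "thickEnd", "itemRgb",
                "blockCount", "blockSizes", "blockStarts")

# One record from the 7-field file in the question.
record <- "chr1 3783 3784 ctss1 0.128643307100390 + 1"
values <- strsplit(record, "[ \t]+")[[1]]

# Label each value with the standard field it would be read as.
names(values) <- bed_fields[seq_along(values)]
print(values)

# The seventh value lands in thickStart, and there is no thickEnd to pair it with.
has_thick_start <- "thickStart" %in% names(values)
has_thick_end <- "thickEnd" %in% names(values)
```

With seven fields, the last value is interpreted as thickStart while thickEnd is absent, which is the situation the reply describes.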
Should be fixed in 1.17.9. If you're using release, might be easiest to just get rid of that last column.

Michael
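For anyone on the release version, dropping the last column can be done in base R before importing. The sketch below is one possible way, assuming a tab-separated file with no track or browser header lines. The temporary files stand in for `filename.bed`, and the names `bed7_path` and `bed6_path` are illustrative only.

```r
# Write a small 7-column example to a temporary file (stand-in for filename.bed).
lines <- c("chr1\t3783\t3784\tctss1\t0.128643307100390\t+\t1",
           "chr1\t6254\t6255\tctss8\t0.128643307100390\t-\t1")
bed7_path <- tempfile(fileext = ".bed")
writeLines(lines, bed7_path)

# Read every column as text so the values are written back unchanged.
bed7 <- read.table(bed7_path, sep = "\t", header = FALSE,
                   colClasses = "character", quote = "", comment.char = "")

# Keep only the first six columns (chrom through strand).
bed6 <- bed7[, 1:6]

# Write a 6-column BED file without row names, column names, or quoting.
bed6_path <- tempfile(fileext = ".bed")
write.table(bed6, bed6_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

The resulting file at `bed6_path` can then be passed to `import()` in place of the original. Note that this discards whatever the seventh column holds.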
Thanks Michael for your help. The thickStart column in my data actually holds the number of tags mapped to each position. Since rtracklayer was previously able to upload the data along with this column, I did not look for a workaround. I will keep your suggestion in mind for future use of the package.

--
Swaraj Basu
Ok, I get it now. You have a BED6+1 file. This is supported in the devel version of rtracklayer, by specifying the extraCols = "tagCount" argument. The columns named in extraCols are assumed to be at the end of the file. It simply loads them and names them in the order of that vector.

Michael
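As a rough sketch of loading a BED6+1 file with a named extra column: the base R part below reads the seventh column as `tagCount` into a data frame. The rtracklayer call is guarded so it only runs if the package is installed. It uses the named-vector form `extraCols = c(tagCount = "integer")` found in later rtracklayer documentation, which differs from the `extraCols = "tagCount"` form in the reply above, so treat the exact argument form as an assumption to check against the installed version. The file path and variable names are illustrative.

```r
# Small tab-separated BED6+1 example (stand-in for filename.bed).
lines <- c("chr1\t3783\t3784\tctss1\t0.128643307100390\t+\t1",
           "chr1\t6254\t6255\tctss8\t0.128643307100390\t-\t1")
path <- tempfile(fileext = ".bed")
writeLines(lines, path)

# Base R view of the same idea: six standard fields plus one named extra column.
bed_df <- read.table(path, sep = "\t", header = FALSE,
                     col.names = c("chrom", "start", "end", "name",
                                   "score", "strand", "tagCount"),
                     stringsAsFactors = FALSE)
print(bed_df$tagCount)

# Guarded rtracklayer call: the extra column is named and typed via extraCols.
# The named-vector form is an assumption about the installed version.
if (requireNamespace("rtracklayer", quietly = TRUE)) {
  gr <- rtracklayer::import(path, format = "BED",
                            extraCols = c(tagCount = "integer"))
  print(gr)
}
```

Naming the extra column keeps the tag counts attached to each range instead of having them read as thickStart.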