makeTranscriptDbFromBiomart error
Stefanie ▴ 360
@stefanie-5192
Last seen 10.3 years ago
Hi,

here is my code:

    library(GenomicFeatures)
    humanDB = makeTranscriptDbFromBiomart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")

This is the error I get:

    Download and preprocess the 'transcripts' data frame ... OK
    Download and preprocess the 'chrominfo' data frame ... OK
    Download and preprocess the 'splicings' data frame ... Error in .extractCdsRangesFromBiomartTable(bm_table) :
      BioMart data anomaly: some 5' UTR have a start > end

This seems to be a problem with BioMart rather than with R. Does anybody have an idea?

Best,
Stefanie

    sessionInfo()
    R version 2.14.1 (2011-12-22)
    Platform: i386-apple-darwin9.8.0/i386 (32-bit)

    locale:
    [1] de_AT.UTF-8/de_AT.UTF-8/de_AT.UTF-8/C/de_AT.UTF-8/de_AT.UTF-8

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] GenomicFeatures_1.6.5 AnnotationDbi_1.16.10 Biobase_2.14.0 GenomicRanges_1.6.4 IRanges_1.12.5 biomaRt_2.10.0

    loaded via a namespace (and not attached):
    [1] Biostrings_2.22.0 BSgenome_1.22.0 DBI_0.2-5 RCurl_1.8-0 RSQLite_0.11.1 rtracklayer_1.14.4 tools_2.14.1 XML_3.6-2
    [9] zlibbioc_1.0.0
@steve-lianoglou-2771
Last seen 22 months ago
United States
Hi Stefanie,

On Wed, Jun 6, 2012 at 10:03 AM, Stefanie <stefanie.tauber at univie.ac.at> wrote:
> [snip]
> Seems to be a problem of biomart not of R?
> Anybody any idea?

I don't think this is the problem, but I'd start by updating R to 2.15.x and reinstalling your Bioconductor packages (via `biocLite`) so that you're playing with the latest and greatest.

Second, you might try building the DB you need via UCSC using their ensGene table. I guess it should be pretty much what you need, no? For example:

    R> library(GenomicFeatures)
    R> txdb <- makeTranscriptDbFromUCSC(genome="hg19", tablename="ensGene")

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
Hi,

I just tried it with R 2.15 and I get the same error.

If I follow your suggestion:

    > txdb <- makeTranscriptDbFromUCSC(genome="hg19", tablename="ensGene")

I get:

    Download the ensGene table ... OK
    Extract the 'transcripts' data frame ... OK
    Extract the 'splicings' data frame ... OK
    Download and preprocess the 'chrominfo' data frame ... Error in download.file(url, destfile, quiet = TRUE) :
      cannot open URL 'http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/chromInfo.txt.gz'
    In addition: There were 50 or more warnings (use warnings() to see the first 50)
    > warnings()
    Warning messages:
    1: In .extractUCSCCdsStartEnd(cdsStart[i], cdsEnd[i], exon_locs$start[[i]], ... :
      UCSC data anomaly in transcript ENST00000513161: the cds cumulative length is not a multiple of 3
    2: In .extractUCSCCdsStartEnd(cdsStart[i], cdsEnd[i], exon_locs$start[[i]], ... :
      UCSC data anomaly in transcript ENST00000417833: the cds cumulative length is not a multiple of 3
    3: In .extractUCSCCdsStartEnd(cdsStart[i], cdsEnd[i], exon_locs$start[[i]], ... :
      UCSC data anomaly in transcript ENST00000450884: the cds cumulative length is not a multiple of 3

Hmm, what else could I try? In any case, thanks a lot for any suggestion!

On 06.06.2012 at 21:59, Steve Lianoglou wrote:
> [snip]

DI Stefanie Tauber
Center for Integrative Bioinformatics Vienna (CIBIV)
(CIBIV is a joint institute of Vienna University, Medical University, and University of Veterinary Medicine, Vienna, Austria)
Max F. Perutz Laboratories (MFPL)
Campus Vienna Biocenter 5 (VBC5), Ebene 1, Room 1812.2
Dr. Bohr Gasse 9
A-1030 Wien, Austria
Phone: ++43 +1 / 42772-4030
Fax: ++43 +1 / 42772-4098
email: stefanie.tauber@univie.ac.at
www.cibiv.at
Hi Stefanie,

On Thu, Jun 7, 2012 at 5:16 AM, Stefanie Tauber <stefanie.tauber at univie.ac.at> wrote:
> I just tried it with R 2.15, I get the same error.
> [snip]
> Download and preprocess the 'chrominfo' data frame ... Error in
> download.file(url, destfile, quiet = TRUE) :
>   cannot open URL
> 'http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/chromInfo.txt.gz'
> In addition: There were 50 or more warnings (use warnings() to see the first
> 50)

[snip]

Strange ... I also get the same warnings you get (the "cds cumulative length is not a multiple of 3") for some transcripts, but I think this is something beyond our control. I don't get any errors when downloading and building the TxDB, so it completes fine for me.

I'm actually running the *-devel versions of the Bioconductor packages with R-2.15.x, so it's not very easy for me to check the currently released GenomicFeatures package, but I'd be a bit surprised if the error is there.

Could you paste the output of `sessionInfo()` after you call `library(GenomicFeatures)` when running your new R-2.15.x install?

-steve
Hi,

here is my sessionInfo:

    > sessionInfo()
    R version 2.15.0 (2012-03-30)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C LC_NAME=C
     [9] LC_ADDRESS=C LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] GenomicFeatures_1.8.0 AnnotationDbi_1.18.0 Biobase_2.16.0
    [4] GenomicRanges_1.8.1 IRanges_1.14.2 BiocGenerics_0.2.0

    loaded via a namespace (and not attached):
     [1] biomaRt_2.12.0 Biostrings_2.24.0 bitops_1.0-4.1 BSgenome_1.24.0
     [5] DBI_0.2-5 RCurl_1.91-1 Rsamtools_1.8.0 RSQLite_0.11.1
     [9] rtracklayer_1.16.0 stats4_2.15.0 tools_2.15.0 XML_3.9-4
    [13] zlibbioc_1.2.0

I updated GenomicFeatures to 1.8.1, but unfortunately that did not help.

BUT: makeTranscriptDbFromUCSC did work :)

    > txdb <- makeTranscriptDbFromUCSC(genome="hg19", tablename="ensGene")
    Download the ensGene table ... OK
    Extract the 'transcripts' data frame ... OK
    Extract the 'splicings' data frame ... OK
    Download and preprocess the 'chrominfo' data frame ... OK
    Prepare the 'metadata' data frame ... metadata: OK
    Make the TranscriptDb object ... OK
    There were 50 or more warnings (use warnings() to see the first 50)
    > txdb
    TranscriptDb object:
    | Db type: TranscriptDb
    | Supporting package: GenomicFeatures
    | Data source: UCSC
    | Genome: hg19
    | Genus and Species: Homo sapiens
    | UCSC Table: ensGene
    | Resource URL: http://genome.ucsc.edu/
    | Type of Gene ID: Ensembl gene ID
    | Full dataset: yes
    | miRBase build ID: NA
    | transcript_nrow: 181648
    | exon_nrow: 541825
    | cds_nrow: 278798
    | Db created by: GenomicFeatures package from Bioconductor
    | Creation time: 2012-06-07 17:48:45 +0200 (Thu, 07 Jun 2012)
    | GenomicFeatures version at creation time: 1.8.1
    | RSQLite version at creation time: 0.11.1
    | DBSCHEMAVERSION: 1.0
    > warnings()
    Warning messages:
    1: In .extractUCSCCdsStartEnd(cdsStart[i], cdsEnd[i], exon_locs$start[[i]], ... :
      UCSC data anomaly in transcript ENST00000513161: the cds cumulative length is not a multiple of 3
    2: In .extractUCSCCdsStartEnd(cdsStart[i], cdsEnd[i], exon_locs$start[[i]], ... :
      UCSC data anomaly in transcript ENST00000417833: the cds cumulative length is not a multiple of 3
    3: In .extractUCSCCdsStartEnd(cdsStart[i], cdsEnd[i], exon_locs$start[[i]], ... :
      UCSC data anomaly in transcript ENST00000450884: the cds cumulative length is not a multiple of 3

Best,
Stefanie

On 07.06.2012 at 16:25, Steve Lianoglou wrote:
> [snip]
Hi Stefanie,

This is related to a bug with the 5' and 3' UTR starts/ends that was in the latest version of BioMart. We reported it to them a couple of weeks ago because it immediately started to break some of our quality control tests for GenomicFeatures. At that time, they told us that it had been fixed, but that it would still take a couple of weeks for their correction to propagate out. In the meantime, using either makeTranscriptDbFromUCSC() or the stock annotation packages for human might be a good workaround for you.

The warning that you saw for makeTranscriptDbFromUCSC() comes from another quality control check. When an annotation resource tells us the ranges for a CDS, we expect the cumulative length of those ranges to be a multiple of three. When it is not, we issue the warning you were seeing for makeTranscriptDbFromUCSC().

Hope that this clarifies things,

Marc

On 06/07/2012 08:50 AM, Stefanie Tauber wrote:
> [snip]
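To make the "multiple of three" check concrete, here is a minimal base-R sketch of the idea. It is not the GenomicFeatures implementation: `cds_length_ok` is a hypothetical helper, the coordinates are made up, and it assumes 1-based closed intervals for the CDS segments.

```r
# Hypothetical helper (not part of GenomicFeatures): add up the widths of a
# transcript's CDS segments and test whether the total is a multiple of 3.
cds_length_ok <- function(cds_start, cds_end) {
  stopifnot(length(cds_start) == length(cds_end))
  # Assumes 1-based, closed intervals: width = end - start + 1
  widths <- cds_end - cds_start + 1L
  sum(widths) %% 3L == 0L
}

# Made-up coordinates for two toy transcripts (not real annotation data).
tx_a <- list(start = c(101L, 201L), end = c(160L, 230L))  # widths 60 + 30
tx_b <- list(start = c(101L, 201L), end = c(160L, 231L))  # widths 60 + 31

ok <- c(tx_a = cds_length_ok(tx_a$start, tx_a$end),
        tx_b = cds_length_ok(tx_b$start, tx_b$end))
print(ok)
```

In this toy input, `tx_b` is the kind of transcript that would trigger the warning, since a complete coding sequence is expected to consist of whole codons.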
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 8.4 years ago
United States
One more thing:

The uswest Ensembl BioMart mirror has apparently been updated with the fix (for reasons that are not known to me, the default host has still not been updated). So if you look at the manual page,

    ?makeTranscriptDbFromBiomart

you can see an example of how to use the uswest.ensembl.org host by specifying the biomart and host arguments.

Marc

On 06/07/2012 10:40 AM, Marc Carlson wrote:
> [snip]
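For readers who want to see roughly what that looks like, here is a hedged sketch rather than a copy of the manual page. `build_txdb_from_mirror` is a hypothetical wrapper, and the mart name "ENSEMBL_MART_ENSEMBL" is an assumption about how the mirror names its mart, so it is worth checking against the output of `listMarts()` first. The function is only defined here, not called, because it needs network access, a mirror that is still online, and a GenomicFeatures release that still provides makeTranscriptDbFromBiomart() (later releases renamed it to makeTxDbFromBiomart()).

```r
# Sketch only; hypothetical wrapper around the call described above.
build_txdb_from_mirror <- function(host = "uswest.ensembl.org",
                                   biomart = "ENSEMBL_MART_ENSEMBL",
                                   dataset = "hsapiens_gene_ensembl") {
  # A mirror may name its mart differently from the default "ensembl",
  # so list what the host offers before building the database.
  print(biomaRt::listMarts(host = host))
  # The mart name passed here is an assumption; use a name from the list above.
  GenomicFeatures::makeTranscriptDbFromBiomart(biomart = biomart,
                                               dataset = dataset,
                                               host = host)
}

# Not run here (needs network access and the packages installed):
# txdb <- build_txdb_from_mirror()
```

Keeping host and mart name as arguments makes it easy to switch back to the default host once the fix has propagated there.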
Stefanie ▴ 360
@stefanie-5192
Last seen 10.3 years ago
Hi Marc,

thanks for the background info, always nice to know the source of some errors or warnings... In any case, using makeTranscriptDbFromUCSC() is fine for me.

Thanks!
Stefanie