Building a TranscriptDb object from a lncRNA GTF file
@fong-chun-chan-5706
Hi,

I am trying to use the makeTranscriptDbFromGFF() function from the GenomicFeatures R package to build a transcriptDB from a lncRNA gff file available from http://www.lncipedia.org/download (version 2.1).

It gives this error when I run it:

$> lncRNADb <- makeTranscriptDbFromGFF( '~/share/references/lncipedia_2_1.gtf', format = 'gtf', dataSource = ' http://www.lncipedia.org/', species = 'all' )

extracting transcript information
Estimating transcript ranges.
Extracting gene IDs
Processing splicing information for gtf file.
Deducing exon rank from relative coordinates provided
Prepare the 'metadata' data frame ... metadata: OK
Now generating chrominfo from available sequence names. No chromosome length information is available.
Error in .normargSplicings(splicings, transcripts_tx_id) :
  'splicings$cds_start' must be an integer vector
In addition: Warning messages:
1: In .deduceExonRankings(exs, format = "gtf") :
  Infering Exon Rankings. If this is not what you expected, then please be sure that you have provided a valid attribute for exonRankAttributeName
2: In matchCircularity(chroms, circ_seqs) :
  None of the strings in your circ_seqs argument match your seqnames.

Has anyone encountered this error before? Any help would be greatly appreciated. Below is my sessionInfo().

Thanks,

Fong

---

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] GenomicFeatures_1.10.1 AnnotationDbi_1.20.3 Biobase_2.18.0
[4] GenomicRanges_1.10.6 IRanges_1.16.4 BiocGenerics_0.4.0

loaded via a namespace (and not attached):
 [1] biomaRt_2.14.0 Biostrings_2.26.3 bitops_1.0-5 BSgenome_1.26.1
 [5] DBI_0.2-5 parallel_2.15.2 RCurl_1.95-3 Rsamtools_1.10.2
 [9] RSQLite_0.11.2 rtracklayer_1.18.2 stats4_2.15.2 tools_2.15.2
[13] XML_3.95-0.1 zlibbioc_1.4.0
Marc Carlson ★ 7.2k
@marc-carlson-2264
United States
Hi Fong,

I downloaded this file. After removing the very first line of the file (which contained only "##gtf"), this ran fine for me. Did you look at the file before you ran it and see the line of cruft at the top? Also, what was your sessionInfo()?

Marc

On 02/14/2013 12:54 PM, Fong Chun Chan wrote:
> Hi,
> [...]
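The workaround described in this reply (dropping the leading "##gtf" line before building the database) can be sketched in base R as follows. This is only a minimal illustration: strip_gtf_header() is a hypothetical helper, not part of GenomicFeatures, and the ".clean.gtf" output path in the commented usage is an assumption. The demonstration runs on a small made-up temporary file, not the real LNCipedia download.

```r
# Hypothetical helper (not part of GenomicFeatures): copy a GTF file,
# dropping the first line only if it is a "##" header such as "##gtf".
strip_gtf_header <- function(infile, outfile) {
  lines <- readLines(infile)
  if (length(lines) > 0 && grepl("^##", lines[1])) {
    lines <- lines[-1]
  }
  writeLines(lines, outfile)
  invisible(outfile)
}

# Self-contained demonstration on a made-up two-line file.
demo_in  <- tempfile(fileext = ".gtf")
demo_out <- tempfile(fileext = ".gtf")
writeLines(
  c("##gtf",
    'chr1\tlncipedia\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";'),
  demo_in
)
strip_gtf_header(demo_in, demo_out)
cleaned <- readLines(demo_out)

# With the real file, the cleaned copy would then be passed to the call from
# the question (requires GenomicFeatures; the output path is an assumption):
# library(GenomicFeatures)
# strip_gtf_header('~/share/references/lncipedia_2_1.gtf',
#                  '~/share/references/lncipedia_2_1.clean.gtf')
# lncRNADb <- makeTranscriptDbFromGFF(
#   '~/share/references/lncipedia_2_1.clean.gtf', format = 'gtf',
#   dataSource = 'http://www.lncipedia.org/', species = 'all')
```

Writing a cleaned copy rather than editing the download in place keeps the original file available for comparison if the import still fails.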