Hi,
I am trying to make a TxDb for Arabidopsis lyrata. The annotation file can be downloaded here:
I am using the following command to create the TxDb:
txdb <- makeTxDbFromGFF(file="/PATH/rawdata/annotations/Arabidopsis_lyrata.v.1.0.30.chrb.gff3",
format=c("auto", "gff3", "gtf"),
dataSource="gtf file for Arabidopsis lyrata",
organism="Arabidopsis lyrata")
The above command creates the TxDb as shown below:
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
> txdb
TxDb object:
# Db type: TxDb
# Supporting package: GenomicFeatures
# Data source: gtf file for Arabidopsis lyrata
# Organism: Arabidopsis lyrata
# Taxonomy ID: 59689
# miRBase build ID: NA
# Genome: NA
# transcript_nrow: 31478
# exon_nrow: 170022
# cds_nrow: 154686
# Db created by: GenomicFeatures package from Bioconductor
# Creation time: 2017-12-14 14:59:40 +0200 (Thu, 14 Dec 2017)
# GenomicFeatures version at creation time: 1.28.4
# RSQLite version at creation time: 2.0
# DBSCHEMAVERSION: 1.1
However, seqinfo(txdb) shows the sequence names but no sequence lengths, circularity flags, or genome:
> seqinfo(txdb)
Seqinfo object with 8 sequences from an unspecified genome; no seqlengths:
seqnames seqlengths isCircular genome
chr1 NA NA <NA>
chr2 NA NA <NA>
chr3 NA NA <NA>
chr4 NA NA <NA>
chr5 NA NA <NA>
chr6 NA NA <NA>
chr7 NA NA <NA>
chr8 NA NA <NA>
I would expect it to look similar to:
> library("BSgenome.Alyrata.JGI.v1")
> seqinfo(Alyrata)
Seqinfo object with 8 sequences from Assembly V1.0 genome:
seqnames seqlengths isCircular genome
chr1 33132539 FALSE Assembly V1.0
chr2 19320864 FALSE Assembly V1.0
chr3 24464547 FALSE Assembly V1.0
chr4 23328337 FALSE Assembly V1.0
chr5 21221946 FALSE Assembly V1.0
chr6 25113588 FALSE Assembly V1.0
chr7 24649197 FALSE Assembly V1.0
chr8 22951293 FALSE Assembly V1.0
I would really appreciate it if you could pinpoint the problem, or tell me whether there is a better way to make the TxDb.
Kind regards,
Nader
sessionInfo() please! Your version of Bioconductor seems outdated.
Note that makeTxDbFromGFF() uses rtracklayer::import.gff3() internally as a first step, to import the GFF3 file as a GRanges object. And even though the sequence lengths are present in the file, for some reason rtracklayer::import.gff3() fails to import them.
You could either ask a new question on this site with tag rtracklayer and focus on the rtracklayer::import.gff3() issue, or open an issue on GitHub: https://github.com/lawremi/rtracklayer/issues
Thanks,
H.
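In the meantime, one possible workaround is to read the sequence lengths from the GFF3 header yourself and hand them to makeTxDbFromGFF(). The sketch below is only that, a sketch. It assumes the lengths are recorded as "##sequence-region seqid start end" header lines and that every region starts at 1, so the end coordinate equals the length. read_sequence_regions() is a hypothetical helper written for this post, not a function from GenomicFeatures or rtracklayer, and the small GFF3 file is a toy stand-in for the real annotation.

```r
# Hypothetical helper (not part of GenomicFeatures or rtracklayer):
# collect "##sequence-region <seqid> <start> <end>" header lines from a GFF3
# file and return them as a data frame with columns chrom, length, is_circular.
read_sequence_regions <- function(path) {
  lines <- readLines(path)
  pragmas <- grep("^##sequence-region[[:space:]]", lines, value = TRUE)
  fields <- strsplit(trimws(pragmas), "[[:space:]]+")
  # Keep only well-formed directives: pragma, seqid, start, end
  fields <- fields[lengths(fields) == 4L]
  data.frame(
    chrom = vapply(fields, `[`, character(1), 2L),
    # Assumption: regions start at 1, so the end coordinate is the length
    length = as.integer(vapply(fields, `[`, character(1), 4L)),
    # Circularity is not recorded in the header, so leave it unknown
    is_circular = rep(NA, length(fields)),
    stringsAsFactors = FALSE
  )
}

# Toy GFF3 file for illustration only (two header lines and one feature)
gff <- tempfile(fileext = ".gff3")
writeLines(c(
  "##gff-version 3",
  "##sequence-region chr1 1 33132539",
  "##sequence-region chr2 1 19320864",
  "chr1\tJGI\tgene\t100\t200\t.\t+\t.\tID=gene1"
), gff)

chrominfo <- read_sequence_regions(gff)
print(chrominfo)

# Not run here (needs Bioconductor). As far as I can tell, makeTxDbFromGFF()
# takes a 'chrominfo' argument in this layout; check ?makeTxDbFromGFF for
# your installed version before relying on it.
# txdb <- GenomicFeatures::makeTxDbFromGFF(file = gff, format = "gff3",
#                                          chrominfo = chrominfo)
```

If the header lines are missing from your file, the helper returns an empty data frame. In that case the lengths would have to come from somewhere else, for example from seqinfo(Alyrata) in the BSgenome package shown in the question.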