I'm trying to use the function makeTxDbFromGFF and am getting errors. There are three .gff3 files I'm using, all from https://download.xenbase.org/pub/Genomics/JGI/Xenla9.2/
XENLA_9.2_Xenbase.gff3.gz XENLA_9.2_GCA.gff3.gz XENLA_9.2_GCF.gff3.gz
I don't know what the differences between them are, so I'm trying all three to see which gives me the best result. My R commands are:
TxDb.xlaevis_xenbase <- makeTxDbFromGFF("XENLA_9.2_Xenbase.gff3")
TxDb.xlaevis_GCF <- makeTxDbFromGFF("XENLA_9.2_GCF.gff3")
TxDb.xlaevis_GCA <- makeTxDbFromGFF("XENLA_9.2_GCA.gff3")
The first command, using XENLA_9.2_Xenbase.gff3, gives the following error:
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... Error in as.vector(x, mode) :
coercing an AtomicList object to an atomic vector is supported only for
objects with top-level elements of length <= 1
Does anyone know why this is or how to fix it?
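One way to narrow this down might be to check which attribute tags in column 9 of the file carry comma-separated values, since the GFF3 format uses commas to separate multiple values of one attribute. This is only a minimal sketch in base R, under the unconfirmed assumption that the error comes from a field that is expected to hold one value per feature but is parsed as a list. `find_multivalued_tags` is a hypothetical helper, not a function from GenomicFeatures or rtracklayer, and the example lines are shortened from the records shown further down.

```r
# Hypothetical helper (not part of any package): count, per attribute tag,
# how many GFF3 attribute entries contain a comma (i.e. look multi-valued).
find_multivalued_tags <- function(gff_lines) {
  # Drop comment/directive lines and blank lines
  feature_lines <- gff_lines[nzchar(gff_lines) & !grepl("^#", gff_lines)]
  # Column 9 of a tab-separated GFF3 record holds the attributes
  fields <- strsplit(feature_lines, "\t", fixed = TRUE)
  attrs <- vapply(fields,
                  function(f) if (length(f) >= 9) f[9] else "",
                  character(1))
  # Split "tag=value;tag=value" into individual pairs
  pairs <- unlist(strsplit(attrs, ";", fixed = TRUE))
  pairs <- pairs[nzchar(pairs)]
  tags <- sub("=.*$", "", pairs)
  values <- sub("^[^=]*=", "", pairs)
  # Keep only the tags whose value contains a comma
  multi <- grepl(",", values, fixed = TRUE)
  table(tags[multi])
}

# Small in-memory example (assumed, tab-separated as in a real GFF3 file)
example_lines <- c(
  "##gff-version 3",
  paste("MT", "Xenbase", "gene", "4799", "5770", ".", "+", ".",
        "ID=gene41609;Name=nd1.L;Dbxref=GeneID:2642086,Xenbase:XB-GENE-6251959",
        sep = "\t"),
  paste("MT", "Xenbase", "mRNA", "4799", "5770", ".", "+", ".",
        "ID=rna100005;Name=rna100005;Parent=gene41609",
        sep = "\t")
)
multi_tags <- find_multivalued_tags(example_lines)
print(multi_tags)

# On the real file this would be something like:
# find_multivalued_tags(readLines("XENLA_9.2_Xenbase.gff3"))
```

Commas in tags such as Dbxref are normal for GFF3. Commas showing up in tags like ID, Name, gene_id or transcript_id would be more suspicious, although whether that is what triggers the error here is a guess.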
The command calling the GCF file works but gives me warnings. The command calling the GCA file works perfectly.
I tried running XENLA_9.2_Xenbase.gff3 through http://genometools.org/cgi-bin/gff3validator.cgi and it tells me that the .gff3 is too large.
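Since the full file is too large for the validator, one option could be to validate a small slice of it instead. The following is a rough sketch in base R; `write_gff_subset` is a hypothetical helper name, and the cutoff of feature lines is arbitrary. Cutting the file at a fixed line count may split the last gene model from its child features, so a validator complaint about the final records of the subset would not necessarily reflect a problem in the original file.

```r
# Hypothetical helper: copy the header lines plus the first `n_features`
# feature lines of a GFF3 file into a smaller file.
write_gff_subset <- function(in_path, out_path, n_features = 1000L) {
  con <- file(in_path, open = "r")
  on.exit(close(con))
  kept <- character(0)
  n_seen <- 0L
  while (n_seen < n_features) {
    line <- readLines(con, n = 1L)
    if (length(line) == 0L) break          # reached end of file
    kept <- c(kept, line)
    # Only non-empty, non-comment lines count as features
    if (nzchar(line) && !startsWith(line, "#")) n_seen <- n_seen + 1L
  }
  writeLines(kept, out_path)
  invisible(length(kept))                  # number of lines written
}

# Demonstration on a tiny made-up file (2 header lines, 5 features)
demo_in <- tempfile(fileext = ".gff3")
writeLines(c("##gff-version 3",
             "#species Xenopus laevis",
             paste0("MT\tXenbase\tgene\t", 1:5, "\t", 11:15,
                    "\t.\t+\t.\tID=gene", 1:5)),
           demo_in)
demo_out <- tempfile(fileext = ".gff3")
n_written <- write_gff_subset(demo_in, demo_out, n_features = 3L)
subset_lines <- readLines(demo_out)

# For the real file, something like:
# write_gff_subset("XENLA_9.2_Xenbase.gff3", "XENLA_9.2_Xenbase_head.gff3")
```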
The first lines of XENLA_9.2_Xenbase.gff3:
#gff-version 3
#data-version 2017-08-28
#species Xenopus laevis
#genome build 9.2
#genome assembler NCBI
#genome accession GCF_001663975.1
#genome FASTA file ftp://ftp.xenbase.org/pub/Genomics/JGI/Xenla9.2/XL9_2.fa.gz
#RefSeq-Accn converted to Sequence-Name via ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/001/663/975/GCF_001663975.1_Xenopus_laevis_v2/GCF_001663975.1_Xenopus_laevis_v2_assembly_report.txt
MT Xenbase gene 2136 2204 . + . ID=gene42065;Alias=XL-9_2-gene42065;Name=mt-trna-phe.L;Dbxref=Xenbase:XB-GENE-22251956;Note=NAR: 1461;Original=RefSeq_rna62470;anticodon=(pos:2166..2168);gbkey=tRNA;product=tRNA-Phe;curie=Xenbase:XB-GENE-22251956;gene_id=Xenbase:XB-GENE-22251956;Ontology_term=SO:0001272
MT Xenbase tRNA 2136 2204 . + . ID=rna100000;Alias=XL-9_2-rna100000;Name=rna100000;Parent=gene42065;curie=modelID:XL-9_2-rna100000;transcript_id=modelID:XL-9_2-rna100000;Ontology_term=SO:0000253
MT Xenbase exon 2136 2204 . + . ID=id856313;Alias=XL-9_2-id856313;Parent=rna100000;gbkey=tRNA
MT Xenbase gene 2205 3023 . + . ID=gene34778;Alias=XL-9_2-gene34778;Name=mt-rnr1.L;Dbxref=Xenbase:XB-GENE-22251886;Original=RefSeq_rna62492;gbkey=rRNA;product=12S ribosomal RNA;curie=Xenbase:XB-GENE-22251886;gene_id=Xenbase:XB-GENE-22251886;Ontology_term=SO:0001637
MT Xenbase rRNA 2205 3023 . + . ID=rna100001;Alias=XL-9_2-rna100001;Name=rna100001;Parent=gene34778;curie=modelID:XL-9_2-rna100001;transcript_id=modelID:XL-9_2-rna100001;Ontology_term=SO:0000252
MT Xenbase exon 2205 3023 . + . ID=id733233;Alias=XL-9_2-id733233;Parent=rna100001;gbkey=rRNA
MT Xenbase gene 3024 3092 . + . ID=gene48202;Alias=XL-9_2-gene48202;Name=mt-trna-val.L;Dbxref=Xenbase:XB-GENE-22251991;Original=RefSeq_rna62471;anticodon=(pos:3054..3056);gbkey=tRNA;product=tRNA-Val;curie=Xenbase:XB-GENE-22251991;gene_id=Xenbase:XB-GENE-22251991;Ontology_term=SO:0001272
MT Xenbase tRNA 3024 3092 . + . ID=rna100002;Alias=XL-9_2-rna100002;Name=rna100002;Parent=gene48202;curie=modelID:XL-9_2-rna100002;transcript_id=modelID:XL-9_2-rna100002;Ontology_term=SO:0000253
MT Xenbase exon 3024 3092 . + . ID=id956642;Alias=XL-9_2-id956642;Parent=rna100002;gbkey=tRNA
MT Xenbase gene 3093 4723 . + . ID=gene44770;Alias=XL-9_2-gene44770;Name=mt-rnr2.L;Dbxref=Xenbase:XB-GENE-22251891;Original=RefSeq_rna62493;gbkey=rRNA;product=16S ribosomal RNA;curie=Xenbase:XB-GENE-22251891;gene_id=Xenbase:XB-GENE-22251891;Ontology_term=SO:0001637
MT Xenbase rRNA 3093 4723 . + . ID=rna100003;Alias=XL-9_2-rna100003;Name=rna100003;Parent=gene44770;curie=modelID:XL-9_2-rna100003;transcript_id=modelID:XL-9_2-rna100003;Ontology_term=SO:0000252
MT Xenbase exon 3093 4723 . + . ID=id903228;Alias=XL-9_2-id903228;Parent=rna100003;gbkey=rRNA
MT Xenbase gene 4724 4798 . + . ID=gene43253;Alias=XL-9_2-gene43253;Name=mt-trna-leu1.L;Dbxref=Xenbase:XB-GENE-22251946;Original=RefSeq_rna62472;anticodon=(pos:4759..4761);gbkey=tRNA;product=tRNA-Leu;curie=Xenbase:XB-GENE-22251946;gene_id=Xenbase:XB-GENE-22251946;Ontology_term=SO:0001272
MT Xenbase tRNA 4724 4798 . + . ID=rna100004;Alias=XL-9_2-rna100004;Name=rna100004;Parent=gene43253;curie=modelID:XL-9_2-rna100004;transcript_id=modelID:XL-9_2-rna100004;Ontology_term=SO:0000253
MT Xenbase exon 4724 4798 . + . ID=id878652;Alias=XL-9_2-id878652;Parent=rna100004;gbkey=tRNA
MT Xenbase gene 4799 5770 . + . ID=gene41609;Alias=XL-9_2-gene41609;Name=nd1.L;Dbxref=GeneID:2642086,Xenbase:XB-GENE-6251959;gbkey=Gene;gene=nd1.L;gene_biotype=protein_coding;curie=Xenbase:XB-GENE-6251959;gene_id=Xenbase:XB-GENE-6251959;Ontology_term=SO:0001217
MT Xenbase mRNA 4799 5770 . + . ID=rna100005;Alias=XL-9_2-rna100005;Name=rna100005;Parent=gene41609;curie=modelID:XL-9_2-rna100005;transcript_id=modelID:XL-9_2-rna100005;Ontology_term=SO:0000234
MT Xenbase CDS 4799 5770 . + 0 ID=cds781946;Alias=XL-9_2-cds781946;Parent=rna100005;gbkey=CDS;protein_id=modelID:XL-9_2-cds781946
sessionInfo()
```
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicAlignments_1.32.1 Rsamtools_2.12.0
[3] Biostrings_2.64.1 XVector_0.36.0
[5] openxlsx_4.2.5.1 xlaevis.db_3.2.3
[7] org.Xl.eg.db_3.15.0 ChIPseeker_1.32.1
[9] rtracklayer_1.56.1 TxDb.Hsapiens.UCSC.hg38.knownGene_3.15.0
[11] GenomicFeatures_1.48.4 AnnotationDbi_1.58.0
[13] ChIPQC_1.32.2 BiocParallel_1.30.4
[15] DiffBind_3.6.5 SummarizedExperiment_1.26.1
[17] Biobase_2.56.0 MatrixGenerics_1.8.1
[19] matrixStats_0.63.0 GenomicRanges_1.48.0
[21] GenomeInfoDb_1.32.4 IRanges_2.30.1
[23] S4Vectors_0.34.0 BiocGenerics_0.42.0
[25] ggplot2_3.4.0 BiocManager_1.30.19
loaded via a namespace (and not attached):
[1] utf8_1.2.2 tidyselect_1.2.0
[3] RSQLite_2.2.18 htmlwidgets_1.6.1
[5] grid_4.2.2 scatterpie_0.1.8
[7] munsell_0.5.0 codetools_0.2-18
[9] interp_1.1-3 systemPipeR_2.2.2
[11] withr_2.5.0 colorspace_2.0-3
[13] GOSemSim_2.22.0 filelock_1.0.2
[15] rstudioapi_0.14 rJava_1.0-6
[17] DOSE_3.22.1 bbmle_1.0.25
[19] GenomeInfoDbData_1.2.8 mixsqp_0.3-48
[21] hwriter_1.3.2.1 polyclip_1.10-4
[23] bit64_4.0.5 farver_2.1.1
[25] coda_0.19-4 vctrs_0.5.0
[27] treeio_1.20.2 TxDb.Rnorvegicus.UCSC.rn4.ensGene_3.2.2
[29] generics_0.1.3 BiocFileCache_2.4.0
[31] R6_2.5.1 apeglm_1.18.0
[33] graphlayouts_0.8.4 invgamma_1.1
[35] RVenn_1.1.0 locfit_1.5-9.7
[37] bitops_1.0-7 cachem_1.0.6
[39] fgsea_1.22.0 gridGraphics_0.5-1
[41] DelayedArray_0.22.0 assertthat_0.2.1
[43] BiocIO_1.6.0 scales_1.2.1
[45] ggraph_2.1.0 enrichplot_1.16.2
[47] gtable_0.3.1 tidygraph_1.2.2
[49] xlsx_0.6.5 rlang_1.0.6
[51] splines_4.2.2 lazyeval_0.2.2
[53] yaml_2.3.6 reshape2_1.4.4
[55] TxDb.Dmelanogaster.UCSC.dm3.ensGene_3.2.2 qvalue_2.28.0
[57] tools_4.2.2 ggplotify_0.1.0
[59] ellipsis_0.3.2 gplots_3.1.3
[61] RColorBrewer_1.1-3 Rcpp_1.0.9
[63] plyr_1.8.8 progress_1.2.2
[65] zlibbioc_1.42.0 purrr_1.0.1
[67] RCurl_1.98-1.9 prettyunits_1.1.1
[69] deldir_1.0-6 viridis_0.6.2
[71] ashr_2.2-54 chipseq_1.46.0
[73] ggrepel_0.9.2 magrittr_2.0.3
[75] data.table_1.14.6 TxDb.Hsapiens.UCSC.hg18.knownGene_3.2.2
[77] DO.db_2.9 truncnorm_1.0-8
[79] mvtnorm_1.1-3 SQUAREM_2021.1
[81] amap_0.8-19 TxDb.Mmusculus.UCSC.mm9.knownGene_3.2.2
[83] hms_1.1.2 xlsxjars_0.6.1
[85] patchwork_1.1.2 XML_3.99-0.13
[87] emdbook_1.3.12 jpeg_0.1-10
[89] gridExtra_2.3 compiler_4.2.2
[91] biomaRt_2.52.0 bdsmatrix_1.3-6
[93] tibble_3.1.8 shadowtext_0.1.2
[95] KernSmooth_2.23-20 crayon_1.5.2
[97] htmltools_0.5.4 ggfun_0.0.9
[99] ggVennDiagram_1.2.2 tidyr_1.2.1
[101] aplot_0.1.9 DBI_1.1.3
[103] tweenr_2.0.2 dbplyr_2.3.0
[105] MASS_7.3-58.1 rappdirs_0.3.3
[107] boot_1.3-28 ShortRead_1.54.0
[109] Matrix_1.5-3 cli_3.4.1
[111] parallel_4.2.2 igraph_1.3.5
[113] pkgconfig_2.0.3 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[115] numDeriv_2016.8-1.1 TxDb.Celegans.UCSC.ce6.ensGene_3.2.2
[117] xml2_1.3.3 ggtree_3.4.4
[119] yulab.utils_0.0.6 stringr_1.5.0
[121] digest_0.6.31 fastmatch_1.1-3
[123] tidytree_0.4.2 restfulr_0.0.15
[125] GreyListChIP_1.28.1 curl_5.0.0
[127] gtools_3.9.4 rjson_0.2.21
[129] jsonlite_1.8.4 lifecycle_1.0.3
[131] nlme_3.1-160 viridisLite_0.4.1
[133] limma_3.52.4 BSgenome_1.64.0
[135] fansi_1.0.3 pillar_1.8.1
[137] lattice_0.20-45 Nozzle.R1_1.1-1.1
[139] plotrix_3.8-2 KEGGREST_1.36.3
[141] fastmap_1.1.0 httr_1.4.4
[143] GO.db_3.15.0 glue_1.6.2
[145] zip_2.2.2 png_0.1-7
[147] bit_4.0.4 ggforce_0.4.1
[149] stringi_1.7.8 blob_1.2.3
[151] TxDb.Mmusculus.UCSC.mm10.knownGene_3.10.0 latticeExtra_0.6-30
[153] caTools_1.18.2 memoise_2.0.1
[155] dplyr_1.0.10 irlba_2.3.5.1
```
Thanks so much, I really appreciate it.