How to fix warning in makeTxDbFromGFF() when using Gencode M30 (mouse)?
Pratik Mehta ▴ 10
@0512b16f

Hello Bioconductor community,

I am getting a warning from makeTxDbFromGFF(). My goal is to use tximeta, via se <- tximeta(coldata), to import my salmon quant.sf files from a bulk RNA-seq experiment.

The warning I get (below) happens during se <- tximeta(coldata).

I think the pathway is roughly this: AnnotationHub does not have Gencode M30 (mouse) on its server, so tximeta falls back to makeTxDbFromGFF() to build the TxDb from the Gencode .gtf file (https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M30/gencode.vM30.annotation.gtf.gz).

This works, but also gives a warning:

Warning message:
In .get_cds_IDX(mcols0$type, mcols0$phase) :
  The "phase" metadata column contains non-NA values for features of type stop_codon. This information was ignored.

Below is a reproducible example, with the issue isolated to GenomicFeatures. I obtained the .gtf from Gencode; the direct link to the file I am using is:

https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M30/gencode.vM30.annotation.gtf.gz

library(GenomicFeatures)
txdb <- makeTxDbFromGFF("/data/references/gencode.vM30.annotation.gtf")


Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning message:
In .get_cds_IDX(mcols0$type, mcols0$phase) :
  The "phase" metadata column contains non-NA values for features of type stop_codon. This information was ignored.
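
To see what the warning refers to, one can tabulate the phase column (column 8 of a GTF) by feature type. The sketch below uses three made-up GTF records, not lines copied from the Gencode file, and plain base R; the column names are chosen here for illustration. On the real annotation the same tabulation could be applied to the table read from gencode.vM30.annotation.gtf.

```r
# Toy GTF records (hypothetical, for illustration only; not taken from Gencode M30).
# Columns: seqname, source, type, start, end, score, strand, phase, attributes
gtf_lines <- c(
  "chr1\tTOY\texon\t100\t200\t.\t+\t.\tgene_id \"g1\"; transcript_id \"t1\";",
  "chr1\tTOY\tCDS\t120\t197\t.\t+\t0\tgene_id \"g1\"; transcript_id \"t1\";",
  "chr1\tTOY\tstop_codon\t198\t200\t.\t+\t0\tgene_id \"g1\"; transcript_id \"t1\";"
)

# Read the tab-separated records; "." is the GTF placeholder for a missing value.
gtf <- read.delim(
  text = gtf_lines, header = FALSE, quote = "", comment.char = "#",
  stringsAsFactors = FALSE, na.strings = ".",
  col.names = c("seqname", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")
)

# Feature types that carry a non-NA phase value.
types_with_phase <- unique(gtf$type[!is.na(gtf$phase)])
print(types_with_phase)

# Types other than CDS that still have a phase. In this toy data that is
# stop_codon, the feature type named in the warning.
non_cds_with_phase <- setdiff(types_with_phase, "CDS")
print(non_cds_with_phase)
```

If the real file behaves like the toy records, stop_codon rows would show up in the second result, which is consistent with the warning text.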

I realize this may not be a top priority right now. I am going to dig through the code on GitHub; if anyone has hints on where to look in the code to resolve this, I'd be really grateful. Thank you in advance.

SessionInfo:

> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GenomicFeatures_1.49.6      AnnotationDbi_1.59.1        BiocParallel_1.31.12        GenomicState_0.99.15        AnnotationHub_3.5.1        
 [6] BiocFileCache_2.5.0         dbplyr_2.2.1                forcats_0.5.2               stringr_1.4.1               dplyr_1.0.10               
[11] purrr_0.3.4                 readr_2.1.2                 tidyr_1.2.1                 tibble_3.1.8                tidyverse_1.3.2            
[16] viridis_0.6.2               viridisLite_0.4.1           EnhancedVolcano_1.15.0      ggrepel_0.9.1               plotly_4.10.0              
[21] ggplot2_3.3.6               tximeta_1.15.2              biomaRt_2.53.2              DESeq2_1.37.6               SummarizedExperiment_1.27.3
[26] Biobase_2.57.1              MatrixGenerics_1.9.1        matrixStats_0.62.0          GenomicRanges_1.49.1        GenomeInfoDb_1.33.7        
[31] IRanges_2.31.2              S4Vectors_0.35.3            BiocGenerics_0.43.4        

loaded via a namespace (and not attached):
  [1] readxl_1.4.1                  backports_1.4.1               plyr_1.8.7                    lazyeval_0.2.2               
  [5] splines_4.2.1                 digest_0.6.29                 ensembldb_2.21.4              htmltools_0.5.3              
  [9] fansi_1.0.3                   magrittr_2.0.3                memoise_2.0.1                 googlesheets4_1.0.1          
 [13] tzdb_0.3.0                    Biostrings_2.65.6             annotate_1.75.0               modelr_0.1.9                 
 [17] prettyunits_1.1.1             colorspace_2.0-3              blob_1.2.3                    rvest_1.0.3                  
 [21] rappdirs_0.3.3                haven_2.5.1                   xfun_0.33                     crayon_1.5.1                 
 [25] RCurl_1.98-1.8                jsonlite_1.8.0                tximport_1.25.1               genefilter_1.79.0            
 [29] survival_3.4-0                glue_1.6.2                    gtable_0.3.1                  gargle_1.2.1                 
 [33] zlibbioc_1.43.0               XVector_0.37.1                DelayedArray_0.23.2           SingleCellExperiment_1.19.0  
 [37] scales_1.2.1                  pheatmap_1.0.12               DBI_1.1.3                     Rcpp_1.0.9                   
 [41] xtable_1.8-4                  progress_1.2.2                bit_4.0.4                     DT_0.25                      
 [45] dittoSeq_1.9.3                htmlwidgets_1.5.4             httr_1.4.4                    RColorBrewer_1.1-3           
 [49] ellipsis_0.3.2                pkgconfig_2.0.3               XML_3.99-0.10                 locfit_1.5-9.6               
 [53] utf8_1.2.2                    tidyselect_1.1.2              rlang_1.0.5                   later_1.3.0                  
 [57] munsell_0.5.0                 BiocVersion_3.16.0            cellranger_1.1.0              tools_4.2.1                  
 [61] cachem_1.0.6                  cli_3.4.0                     generics_0.1.3                RSQLite_2.2.17               
 [65] ggridges_0.5.3                broom_1.0.1                   evaluate_0.16                 fastmap_1.1.0                
 [69] yaml_2.3.5                    knitr_1.40                    bit64_4.0.5                   fs_1.5.2                     
 [73] KEGGREST_1.37.3               AnnotationFilter_1.21.0       mime_0.12                     xml2_1.3.3                   
 [77] compiler_4.2.1                rstudioapi_0.14               filelock_1.0.2                curl_4.3.2                   
 [81] png_0.1-7                     interactiveDisplayBase_1.35.0 reprex_2.0.2                  geneplotter_1.75.0           
 [85] stringi_1.7.8                 lattice_0.20-45               ProtGenerics_1.29.0           Matrix_1.5-1                 
 [89] vctrs_0.4.1                   pillar_1.8.1                  lifecycle_1.0.2               BiocManager_1.30.18          
 [93] cowplot_1.1.1                 data.table_1.14.2             bitops_1.0-7                  httpuv_1.6.6                 
 [97] rtracklayer_1.57.0            R6_2.5.1                      BiocIO_1.7.1                  promises_1.2.0.1             
[101] gridExtra_2.3                 codetools_0.2-18              assertthat_0.2.1              rjson_0.2.21                 
[105] withr_2.5.0                   GenomicAlignments_1.33.1      Rsamtools_2.13.4              GenomeInfoDbData_1.2.8       
[109] parallel_4.2.1                hms_1.1.2                     grid_4.2.1                    rmarkdown_2.16               
[113] googledrive_2.0.0             shiny_1.7.2                   lubridate_1.8.0               restfulr_0.0.15   
@james-w-macdonald-5106

There isn't an error! You are getting a warning telling you that the incoming GFF file has non-NA phase values on stop_codon features, and that those values were ignored. Is there something about the warning that is unclear?


Yup. No error here; this happens for all Gencode annotations, and you can just ignore it.
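
If the warning clutters a script, one option is to muffle just this message and let any other warning through. This is a minimal base R sketch, not something GenomicFeatures or tximeta provides: noisy_import() is a hypothetical stand-in that raises the same warning text so the example runs without the GTF file, and muffle_phase_warning() is a helper name made up here.

```r
# Hypothetical stand-in for makeTxDbFromGFF(); it only reproduces the warning text.
noisy_import <- function() {
  warning('The "phase" metadata column contains non-NA values for features ',
          "of type stop_codon. This information was ignored.")
  "txdb placeholder"
}

# Muffle only warnings whose message mentions the "phase" metadata column;
# every other warning is still reported as usual.
muffle_phase_warning <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl('"phase" metadata column', conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

result <- muffle_phase_warning(noisy_import())
print(result)
```

With the real call this would read muffle_phase_warning(makeTxDbFromGFF("/data/references/gencode.vM30.annotation.gtf")). Matching on the message text is fragile if the wording changes between package versions, so simply ignoring the warning remains the simpler choice.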
