Help with Pathview: No annotation package for a species, gene symbols not mapped!
bertranbio • 0

Hello pathview users and developers in the Bioconductor community,

This is my first post asking for help with an R package, so I apologize in advance if any basic elements are missing from my question.

My issue is the following: I am trying to run pathview with expression data from potato RNA-seq. My reads were originally mapped to PGSC gene identifiers, which I am able to convert into NCBI GeneID (Entrez) identifiers using g:Profiler (https://biit.cs.ut.ee/gprofiler/gost).

An example data file can be found in my GitHub repository (https://github.com/andrebertran/andrebertran/blob/main/pathviewdata.csv). To prepare my file for pathview I did the following in R:

# Read the table. The first (unnamed) column holds the Entrez IDs and is read as character, not numeric.
library(readr)
library(pathview)
pathviewdata <- read_csv("pathviewdata.csv", 
    col_types = cols(...1 = col_character()))
View(pathviewdata)

# Use the Entrez identifiers as row names
library(tidyverse)
pathviewdata <- column_to_rownames(pathviewdata, var = "...1") 

# First trial, using "entrez" as gene.idtype
pathview(gene.data = pathviewdata, pathway.id = "04075", gene.idtype = "entrez", species = "sot", limit = list(gene = 7, cpd = 7), out.suffix = "test27")

# Second trial, using "kegg" as gene.idtype
pathview(gene.data = pathviewdata, pathway.id = "04075", gene.idtype = "kegg", species = "sot", limit = list(gene = 7, cpd = 7), out.suffix = "test28")

In both cases I get the same warning message from the package:


#Warning: No annotation package for the species sot, gene symbols not mapped!

I have checked data(korg) to make sure that my species of interest is supported by KEGG, and potato (sot) is listed there. I realize that my question is similar to those in previous posts (Pathview with minor species and http://seqanswers.com/forums/showthread.php?t=35472#6), but I honestly couldn't follow the answers given there and would really appreciate a step-by-step explanation of how to solve this.

My vague understanding is that I need to download a table from the NCBI FTP site that maps potato Entrez gene IDs to gene names (the gene abbreviations KEGG uses in its pathway maps) and then supply that table to pathview as a reference, but I don't know how to do either step. The threads mentioned above were not clear enough for me on this point.

Any help is appreciated! Perhaps you could also point me to other software like pathview that can take PGSC identifiers directly?
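
The lookup described above can be sketched in R as follows. This is only a minimal illustration under stated assumptions: it assumes a table with `GeneID` and `Symbol` columns, as in NCBI gene_info files. The IDs and symbols are made-up placeholders rather than real potato genes, and `entrez_to_symbol` is a hypothetical helper name. It covers only the ID-to-symbol lookup itself and says nothing about how pathview would consume such a table.

```r
# Toy stand-in for an NCBI gene_info-style table.
# Placeholder values only, NOT real potato gene IDs or symbols.
gene_info <- data.frame(
  GeneID = c("1001", "1002", "1003"),
  Symbol = c("geneA", "geneB", "geneC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the symbol for each Entrez ID,
# with NA for IDs that are absent from the table.
entrez_to_symbol <- function(ids, table) {
  lookup <- setNames(table$Symbol, table$GeneID)  # named vector: ID -> symbol
  unname(lookup[as.character(ids)])
}

ids <- c("1002", "9999", "1001")
symbols <- entrez_to_symbol(ids, gene_info)
print(symbols)
```

IDs are kept as character throughout so that they match the character row names of the expression table.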

Cheers,

André

sessionInfo( )
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.2.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] forcats_0.5.1   stringr_1.4.0   dplyr_1.0.8     purrr_0.3.4     tidyr_1.2.0     tibble_3.1.6   
 [7] ggplot2_3.3.5   tidyverse_1.3.1 readr_2.1.2     pathview_1.34.0 writexl_1.4.0  

loaded via a namespace (and not attached):
 [1] Biobase_2.54.0         httr_1.4.2             bit64_4.0.5            vroom_1.5.7           
 [5] jsonlite_1.7.3         modelr_0.1.8           assertthat_0.2.1       stats4_4.1.2          
 [9] blob_1.2.2             GenomeInfoDbData_1.2.7 cellranger_1.1.0       pillar_1.7.0          
[13] RSQLite_2.2.9          backports_1.4.1        glue_1.6.1             XVector_0.34.0        
[17] rvest_1.0.2            colorspace_2.0-2       XML_3.99-0.8           pkgconfig_2.0.3       
[21] broom_0.7.12           haven_2.4.3            zlibbioc_1.40.0        scales_1.1.1          
[25] tzdb_0.2.0             KEGGREST_1.34.0        generics_0.1.2         IRanges_2.28.0        
[29] ellipsis_0.3.2         cachem_1.0.6           withr_2.4.3            BiocGenerics_0.40.0   
[33] cli_3.2.0              magrittr_2.0.2         crayon_1.5.0           readxl_1.3.1          
[37] memoise_2.0.1          KEGGgraph_1.54.0       fs_1.5.2               fansi_1.0.2           
[41] xml2_1.3.3             graph_1.72.0           tools_4.1.2            hms_1.1.1             
[45] org.Hs.eg.db_3.14.0    lifecycle_1.0.1        S4Vectors_0.32.3       munsell_0.5.0         
[49] reprex_2.0.1           AnnotationDbi_1.56.2   Biostrings_2.62.0      compiler_4.1.2        
[53] GenomeInfoDb_1.30.1    rlang_1.0.1            grid_4.1.2             RCurl_1.98-1.6        
[57] rstudioapi_0.13        bitops_1.0-7           gtable_0.3.0           DBI_1.1.2             
[61] R6_2.5.1               lubridate_1.8.0        fastmap_1.1.0          bit_4.0.4             
[65] utf8_1.2.2             Rgraphviz_2.38.0       stringi_1.7.6          parallel_4.1.2        
[69] Rcpp_1.0.8             vctrs_0.3.8            png_0.1-7              dbplyr_2.1.1          
[73] tidyselect_1.1.1
annotation_package not_available_in_BioConductor pathview • 226 views