Hello everybody,
I'm trying to run a GO enrichment analysis through the Bioconductor package topGO. I'm sticking to its vignette. I went straight to paragraph 4.3 Custom Annotation.
Annotations need to be provided either as gene-to-GOs or as GO-to-genes mappings. An example of such mapping can be found in the "topGO/examples" directory. The file "geneid2go.map" contains gene-to-GOs mappings. For each gene identifier are listed the GO terms to which this gene is specifically annotated. We use the readMappings function to parse this file.
However, while trying to run this on my dataset:
```r
geneID2GO <- readMappings(file = system.file("examples/geneid2go.map", package = "topGO"))
```
I get:
```
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
  no lines available in input
In addition: Warning message:
In file(file, "rt") :
  file("") only supports open = "w+" and open = "w+b": using the former
```
Apparently, I need to provide an annotation file similar to the geneid2go.map file mentioned above. The problem is that I can't find any link to this file in the tutorial, nor any 'example' folder, so I have no idea what it should look like. This is probably what is causing my error. Can anybody help here? Am I missing something obvious?
This is what my file looks like:
```
SMEL4.1_01g039380.1 GO:0000009, GO:0004376, GO:0006506
SMEL4.1_09g002640.1 GO:0000009, GO:0004376, GO:0006506, GO:0046983
SMEL4.1_03g017790.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096
SMEL4.1_03g019100.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096
SMEL4.1_03g028120.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096
SMEL4.1_03g028130.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096
SMEL4.1_06g027920.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096
SMEL4.1_09g001290.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096
```
Thanks in advance for your help!
Here is my session info:
```
R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=Italian_Italy.1252 LC_CTYPE=Italian_Italy.1252 LC_MONETARY=Italian_Italy.1252
[4] LC_NUMERIC=C LC_TIME=Italian_Italy.1252
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] topGO_2.38.1 SparseM_1.81 GO.db_3.10.0 AnnotationDbi_1.48.0 IRanges_2.20.2
[6] S4Vectors_0.24.4 Biobase_2.46.0 graph_1.64.0 BiocGenerics_0.32.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.8.3 bit_4.0.4 lattice_0.20-45 rlang_1.0.2 fastmap_1.1.0 blob_1.2.3
[7] tools_3.6.3 grid_3.6.3 cli_3.3.0 DBI_1.1.2 matrixStats_0.62.0 bit64_4.0.5
[13] vctrs_0.4.1 memoise_2.0.1 cachem_1.0.6 RSQLite_2.2.14 compiler_3.6.3 pkgconfig_2.0.3
```
Thanks for your answer!
Oddly enough, I solved the problem by simply taking the system.file function and package = "topGO" parameter out.
This is what it looks like:
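Something along these lines, as a rough sketch only: `my_gene2go.map` is a placeholder name rather than the real file, `tempdir()` stands in for the working directory, and the columns are assumed to be tab-separated, which is the default separator of `readMappings`.

```r
# Placeholder file name; the real annotation file is not named in this thread.
map_file <- file.path(tempdir(), "my_gene2go.map")

# Write two lines in the assumed gene<TAB>GO, GO, ... layout so the sketch
# is self-contained.
writeLines(c(
  "SMEL4.1_01g039380.1\tGO:0000009, GO:0004376, GO:0006506",
  "SMEL4.1_09g002640.1\tGO:0000009, GO:0004376, GO:0006506, GO:0046983"
), map_file)

# Pass the path directly; system.file() is not needed for a file you own.
# The call is guarded so the sketch still runs where topGO is not installed.
if (requireNamespace("topGO", quietly = TRUE)) {
  geneID2GO <- topGO::readMappings(file = map_file)
  str(geneID2GO)
}
```

With a file that already sits in the working directory, the call reduces to `readMappings(file = "my_gene2go.map")`.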
The `system.file` function is used in that context because the author wants to provide an example and needs some data to be provided in the package to do so. In other words, the author has added some data to the `topGO` package that can then be used to show examples, and `system.file` is a convenient way to figure out where the package is installed, which can differ depending on the OS and the end user's preferences.

You will almost never need to use `system.file` yourself, and certainly not for a file that is in your working directory. You should never be putting files in your R library directory, so the only reason you should ever use `system.file` is if you are running an example and want to see what the input data look like.
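The behaviour is easy to check with base R alone. The sketch below uses the `stats` package only because it is always installed, and the `.map` path is a made-up name that does not exist inside it. By default `system.file` returns an empty string for a file it cannot find, which would be consistent with the `file("")` warning in the question.

```r
# A file that ships with an installed package resolves to a full path.
# Every installed package has a DESCRIPTION file at its top level.
shipped <- system.file("DESCRIPTION", package = "stats")

# A file that is not inside the package resolves to "" without any
# error or warning at this point (the path here is hypothetical).
missing_path <- system.file("examples/my_gene2go.map", package = "stats")

print(nzchar(shipped))       # TRUE when the file was found
print(nzchar(missing_path))  # FALSE when nothing was found

# mustWork = TRUE turns the silent "" into an immediate error instead.
result <- tryCatch(
  system.file("examples/my_gene2go.map", package = "stats", mustWork = TRUE),
  error = function(e) "not found in package"
)
print(result)
```

If you do run a package example through `system.file`, passing `mustWork = TRUE` makes a wrong path fail right away instead of surfacing later inside `read.table`.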