Trouble importing my own Gene Ontology Annotation into topGO
Raito92 ▴ 60
@raito92-20399
Last seen 22 months ago
Italy

Hello everybody,

I'm trying to run a GO enrichment analysis with the Bioconductor package topGO. I'm following its vignette and went straight to section 4.3, Custom Annotation:

Annotations need to be provided either as gene-to-GOs or as GO-to-genes mappings. An example of such mapping can be found in the "topGO/examples" directory. The file "geneid2go.map" contains gene-to-GOs mappings. For each gene identifier are listed the GO terms to which this gene is specifically annotated. We use the readMappings function to parse this file.

However, while trying to run this on my dataset:

geneID2GO <- readMappings(file = system.file("examples/geneid2go.map", package = "topGO"))

I get:

Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  no lines available in input
In addition: Warning message:
In file(file, "rt") :
  file("") only supports open = "w+" and open = "w+b": using the former

Apparently, I need to provide an annotation file similar to the geneid2go.map mentioned above. The problem is that I can't find a link to this file anywhere in the vignette, nor any "examples" folder, so I have no idea what it should look like. This is probably what is causing my error. Can anybody help? Am I missing something obvious?

This is what my file looks like:

SMEL4.1_01g039380.1 GO:0000009, GO:0004376, GO:0006506
SMEL4.1_09g002640.1 GO:0000009, GO:0004376, GO:0006506, GO:0046983
SMEL4.1_03g017790.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096
SMEL4.1_03g019100.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096
SMEL4.1_03g028120.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096
SMEL4.1_03g028130.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096
SMEL4.1_06g027920.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096
SMEL4.1_09g001290.1 GO:0000015, GO:0000287, GO:0004634, GO:0006096

Thanks in advance for your help!

Here is my session info:

```
R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252    LC_MONETARY=Italian_Italy.1252
[4] LC_NUMERIC=C                   LC_TIME=Italian_Italy.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] topGO_2.38.1         SparseM_1.81         GO.db_3.10.0         AnnotationDbi_1.48.0 IRanges_2.20.2      
[6] S4Vectors_0.24.4     Biobase_2.46.0       graph_1.64.0         BiocGenerics_0.32.0 

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8.3       bit_4.0.4          lattice_0.20-45    rlang_1.0.2        fastmap_1.1.0      blob_1.2.3        
 [7] tools_3.6.3        grid_3.6.3         cli_3.3.0          DBI_1.1.2          matrixStats_0.62.0 bit64_4.0.5       
[13] vctrs_0.4.1        memoise_2.0.1      cachem_1.0.6       RSQLite_2.2.14     compiler_3.6.3     pkgconfig_2.0.3  
```
Annotation GeneOntology topGO GO • 1.6k views
@james-w-macdonald-5106
Last seen 4 hours ago
United States

You might have a broken install. But regardless, the file format is the same as what you have.

> z <- read.table(system.file("examples/geneid2go.map", package = "topGO"), sep = "\t")
> head(z)
      V1
1  68724
2 119608
3  49239
4  67829
5 106331
6 214717
                                                                                                                                                                                              V2
1                                                                                                                                     GO:0005488, GO:0003774, GO:0001539, GO:0006935, GO:0009288
2                                                                                                                         GO:0005634, GO:0030528, GO:0006355, GO:0045449, GO:0003677, GO:0007275
3                                     GO:0016787, GO:0017057, GO:0005975, GO:0005783, GO:0005792, GO:0004345, GO:0005788, GO:0047936, GO:0006098, GO:0005488, GO:0006006, GO:0055114, GO:0016491
4 GO:0045926, GO:0016616, GO:0000287, GO:0030145, GO:0005739, GO:0000166, GO:0005575, GO:0006099, GO:0005524, GO:0008152, GO:0006102, GO:0005759, GO:0005975, GO:0004449, GO:0055114, GO:0016491
5                                                                         GO:0043565, GO:0000122, GO:0003700, GO:0005634, GO:0045597, GO:0006355, GO:0045595, GO:0045449, GO:0003677, GO:0007275
6                                                                                                             GO:0004803, GO:0005634, GO:0008270, GO:0003677, GO:0000228, GO:0046872, GO:0046983

So you can just use readMappings on your file.
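
To make the expected layout concrete, here is a minimal base-R sketch of what parsing such a file amounts to. It assumes one gene per line, a tab between the gene ID and the GO list, and commas between GO terms (which, as far as I know, match the readMappings defaults). The helper name parse_gene2go is made up for illustration; it is not part of topGO and is not the package's implementation.

```r
# Hypothetical helper (NOT part of topGO): read "geneID<TAB>GO:..., GO:..." lines
# into a named list of character vectors, the same shape readMappings() returns.
parse_gene2go <- function(file, sep = "\t", id_sep = ",") {
  tab <- read.delim(file, header = FALSE, sep = sep, quote = "",
                    colClasses = "character")
  # Split the second column on commas, then strip surrounding spaces
  go_lists <- strsplit(tab[[2]], split = id_sep, fixed = TRUE)
  go_lists <- lapply(go_lists, trimws)
  names(go_lists) <- trimws(tab[[1]])
  go_lists
}

# Two lines copied from the question, written to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "SMEL4.1_01g039380.1\tGO:0000009, GO:0004376, GO:0006506",
  "SMEL4.1_09g002640.1\tGO:0000009, GO:0004376, GO:0006506, GO:0046983"
), tmp)

gene2go <- parse_gene2go(tmp)
str(gene2go)
```

If the gene ID and the GO list in your file are separated by something other than a tab, the separator has to be changed accordingly; readMappings takes a sep argument for that.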


Thanks for your answer!

Oddly enough, I solved the problem by simply removing the system.file call and its package = "topGO" argument, and passing the file name directly:

geneID2GO <- readMappings(file = "topGO_Annotations.txt") 

This is what the result looks like:

> str(head(geneID2GO))

List of 6
 $ SMEL4.1_01g039380.1: chr [1:3] "GO:0000009" "GO:0004376" "GO:0006506"
 $ SMEL4.1_09g002640.1: chr [1:4] "GO:0000009" "GO:0004376" "GO:0006506" "GO:0046983"
 $ SMEL4.1_03g017790.1: chr [1:4] "GO:0000015" "GO:0000287" "GO:0004634" "GO:0006096"
 $ SMEL4.1_03g019100.1: chr [1:4] "GO:0000015" "GO:0000287" "GO:0004634" "GO:0006096"
 $ SMEL4.1_03g028120.1: chr [1:4] "GO:0000015" "GO:0000287" "GO:0004634" "GO:0006096"
 $ SMEL4.1_03g028130.1: chr [1:4] "GO:0000015" "GO:0000287" "GO:0004634" "GO:0006096"

The system.file function is used in that context because the author wants to provide an example and needs some data to be provided in the package to do so. In other words, the author has added some data to the topGO package that can then be used to show examples, and system.file is a convenient way to figure out where the package is installed, which can differ depending on the OS and the end user's preferences.

You will almost never need to use system.file yourself, and certainly not for a file that is in your working directory. You should never be putting files in your R library directory, so the only reason you should ever use system.file is if you are running an example and want to see what the input data look like.
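
A short base-R sketch of the difference, for illustration. system.file returns an empty string when it cannot find the requested file, and handing "" to a reader produces the 'file("") only supports open = "w+"' warning and the "no lines available in input" error from the question. The file name topGO_Annotations.txt is taken from the thread; the temporary directory and the safe_path helper are stand-ins I made up for this example.

```r
# system.file() returns "" when the file (or the package) cannot be found
missing_path <- system.file("examples/no_such_file.map", package = "stats")
nzchar(missing_path)   # FALSE: nothing was found

# A file of your own is referenced by its path (relative to getwd(), or absolute).
# Here a temporary directory stands in for your working directory.
my_file <- file.path(tempdir(), "topGO_Annotations.txt")
writeLines("SMEL4.1_01g039380.1\tGO:0000009, GO:0004376, GO:0006506", my_file)
file.exists(my_file)   # TRUE

# Hypothetical guard: fail early with a clear message instead of reading ""
safe_path <- function(path) {
  if (!nzchar(path) || !file.exists(path)) {
    stop("file not found: '", path, "'")
  }
  path
}

safe_path(my_file)          # returns the path unchanged
# safe_path(missing_path)   # would stop with "file not found: ''"
```

Checking the path this way before calling readMappings makes it obvious whether the problem is the location of the file or its contents.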
