I am currently analysing Illumina 450K data to identify differentially methylated CpG markers, using the lme4 package. I would now like to perform a Gene Ontology enrichment analysis with the topGO package, but I am having some difficulties.
Below is the procedure I am trying to follow, along with my questions:
- I want to focus on CpG markers that are located within genes (Body, TSS, 5' or 3' UTR). Therefore, I discard all markers that are intergenic.
- For each CpG marker, I have the following information (a small excerpt):
ID          Chromosome  Start     End       Strand  GeneSymbol  GeneRegion  CpG_content
cg05672930  chrY        21729555  21729556  +       CYorf15A    5'UTR       Island
cg15422579  chrY        22737424  22737425  -       EIF1AY      TSS200      Shore
cg02884332  chrY        22737946  22737947  +       EIF1AY      Body        Island
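To make the filtering step concrete, here is a minimal sketch in base R. The data frame name `annot_df`, the extra intergenic row, and the convention that intergenic probes have an NA or empty `GeneSymbol` are assumptions for illustration only; the real annotation file may mark intergenic probes differently.

```r
# Toy annotation table mirroring the excerpt above. The last row is a
# made-up intergenic probe (hypothetical ID), added only for illustration.
annot_df <- data.frame(
  ID         = c("cg05672930", "cg15422579", "cg02884332", "cg_intergenic_example"),
  GeneSymbol = c("CYorf15A", "EIF1AY", "EIF1AY", NA),
  GeneRegion = c("5'UTR", "TSS200", "Body", NA),
  stringsAsFactors = FALSE
)

# Assumption: intergenic probes carry an NA or empty gene symbol.
is_genic <- !is.na(annot_df$GeneSymbol) & nzchar(annot_df$GeneSymbol)

# Keep only the probes that map to a gene.
genic_df <- annot_df[is_genic, ]
```

After this step, `genic_df` holds only the three probes annotated to CYorf15A and EIF1AY.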
Question 1: Should I perform the Gene Ontology enrichment on genes only (CYorf15A, EIF1AY), or may I also perform it on the individual markers (e.g. cg05672930)?
Question 2: When trying to build the GO data object as described in the topGO tutorial (http://bioconductor.org/packages/release/bioc/html/topGO.html), I am using the following code:
sampleGOdata <- new("topGOdata", description = "Simple session", ontology = "BP", allGenes = geneList, geneSel = topDiffGenes, nodeSize = 10, annot = ???, affyLib = ???)
Q2.1 - The geneList parameter should contain a vector of p-values calculated with lme4, whose names are all the CpG markers (or genes) on the chip, correct?
Q2.2 - The geneSel parameter should contain a boolean vector indicating which CpG markers (or genes) have a p-value below the alpha threshold, correct?
Q2.3 - What should the "annot" and "affyLib" parameters contain?
Q2.4 - My guess is that once I am able to build a correct GO data object, I will be able to perform the Fisher test and then interpret the ontologies (repeating the sampleGOdata construction for Molecular Function and Cellular Component), right?
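To illustrate the inputs I mean in Q2.1 and Q2.2, here is a minimal base R sketch. The p-values are made up, and collapsing CpG p-values to one value per gene by taking the minimum is only one possible choice, assumed here for illustration. As far as I understand the topGO vignette, geneSel is given as a function that takes the score vector and returns a logical vector, so the sketch writes it that way.

```r
# Hypothetical per-CpG p-values (made-up numbers, not real results).
cpg_p    <- c(cg05672930 = 0.20, cg15422579 = 0.003, cg02884332 = 0.04)
cpg_gene <- c(cg05672930 = "CYorf15A", cg15422579 = "EIF1AY", cg02884332 = "EIF1AY")

# One possible gene-level summary: the smallest p-value among a gene's CpGs.
# Assumption for illustration; it ignores how many probes each gene has.
gene_p   <- tapply(cpg_p, cpg_gene[names(cpg_p)], min)
geneList <- setNames(as.numeric(gene_p), names(gene_p))

# Selection expressed as a function of the scores, returning a logical vector.
alpha <- 0.01
topDiffGenes <- function(allScore) allScore < alpha

# Which genes would count as "significant" under this threshold.
selected <- topDiffGenes(geneList)
```

Here `geneList` is a named numeric vector with one entry per gene, and `selected` flags the genes whose summary p-value falls below `alpha`.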
I am sorry for these naive questions; I am very new to topGO, and I couldn't find any tutorials focusing specifically on gene methylation.
Thank you very much for your kind help and responses.
P.S. Also, is there any straightforward way (or package) to associate gene symbols (e.g. SAMD11, KLHL17, etc.) or CpG markers (cg...) with GO terms in Homo sapiens?
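For the gene-symbol case, one route that seems to fit is topGO's own `annFUN.org` annotation function together with the org.Hs.eg.db package. The sketch below is a guess at how the pieces fit together, not a verified pipeline: the wrapper name `build_go_data` is hypothetical, and it assumes `geneList` is named by gene symbol and that both Bioconductor packages are installed. As far as I can tell from the topGO documentation, affyLib is only needed with the Affymetrix-chip annotation function, so it is not passed here.

```r
# Sketch only: needs the Bioconductor packages topGO and org.Hs.eg.db.
# build_go_data is a hypothetical helper name, not part of topGO.
build_go_data <- function(geneList, geneSel, ontology = "BP") {
  if (!requireNamespace("topGO", quietly = TRUE) ||
      !requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
    stop("topGO and org.Hs.eg.db must be installed")
  }
  suppressPackageStartupMessages(library(topGO))

  # annFUN.org maps the gene names to GO terms through an org.XX.eg.db
  # package; ID = "symbol" says that geneList is named by gene symbol.
  new("topGOdata",
      description = "Simple session",
      ontology    = ontology,
      allGenes    = geneList,
      geneSel     = geneSel,
      nodeSize    = 10,
      annot       = annFUN.org,
      mapping     = "org.Hs.eg.db",
      ID          = "symbol")
}

# Intended usage (not run here), with geneList and topDiffGenes as in Q2:
# GOdata <- build_go_data(geneList, topDiffGenes, ontology = "BP")
# res    <- runTest(GOdata, algorithm = "classic", statistic = "fisher")
# GenTable(GOdata, classicFisher = res, topNodes = 10)
```

Calling the helper again with `ontology = "MF"` or `ontology = "CC"` would cover the other two ontologies. This covers gene symbols only; CpG IDs would first have to be mapped to genes through the array annotation.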