Juan Fernández Tajes
Dear List,
I have a list of gene symbols that looks like this:
>head(mylist)
$cluster.1
[1] "HSP90AB1" "INMT" "CKB" "NR2E1" "ME3" "FAM162A"
"KIRREL2"
$cluster.2
[1] "ENSG00000212860" "TRADD" "C1QBP" "KIAA1967"
"ENSG00000137379" "MAP3K3" "TNFRSF1B" "BAG2"
[9] "ENSG00000212866" "RIPK3" "EPRS" "HSPA6"
"HSPA5" "IKBKG" "TBK1" "TRAF2"
[17] "MAP3K7" "NFKB1" "MAP3K14" "HSPA1A"
"MAP3K7IP2" "HSPBP1" "NFKB2" "DNAJA1"
[25] "TNFRSF1A" "TRAF3IP2" "NFKBIA" "HSPA9"
"ENSG00000183311" "TUBB" "TUBA3D" "TANK"
[33] "ENSG00000215292" "REL" "MAP3K1" "HSPA1B"
"HSPA8" "NFKBIB" "PGAM5" "EEF1A2"
[41] "MAP3K8" "CLTC" "RCN2" "MAP3K7IP1"
"RARS" "TRAF1" "TUBA3C" "HSPA1L"
[49] "MYO1D" "NOD1" "HSP90AA2" "CAD"
"RELB" "AIFM1" "TUBB2B" "RIPK2"
[57] "CDC37" "IKBKB" "ERLIN1" "RIPK1"
"TNIP2" "STUB1" "TUBB4" "HSPA2"
[65] "CHUK" "DNAJC3" "CCDC50" "SLC25A5"
"NFKBIE" "AK3" "TICAM1" "TIMM50"
[73] "ANKRD17" "OTUD7B" "TNFAIP3" "RPS27L"
"TRPC4AP" "TUBB6" "DNAJC6" "PXMP2"
[81] "FLJ25006"
$cluster.3
[1] "ACTB" "PFN1" "XPO6" "VASP" "ZYX" "PFN2"
"DIAPH1"
"APBB1IP" "DIAPH2" "PARVG" "ENAH" "PCYT1B" "PFN4" "CNN2"
"NSMAF" "PFN3"
[17] "LMOD1"
$cluster.4
[1] "UBB" "HERC3" "KLRK1" "ULBP1"
"RAET1E" "MICA" "HCST"
"ENSG00000184444"
[9] "ENSG00000206449" "ULBP2" "ZNF385A" "ULBP3"
"RAET1G"
$cluster.5
[1] "YWHAZ" "SLAIN2" "ZC3H13" "C12orf51" "PGLYRP1" "ATL3"
$cluster.6
[1] "ACTG1" "EPS8L3" "PARVG" "TMSB4Y" "B3GALT1" "UGT1A6"
I want to extract the GO terms for every cluster (i.e. each list
component), but excluding some of them based on their evidence codes
(such as IEA or NR). The code I'm using is the following:
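library(org.Hs.eg.db)  # provides org.Hs.egSYMBOL and org.Hs.egGO
# Table of Entrez gene IDs and their symbols
e2s <- toTable(org.Hs.egSYMBOL)
# Entrez IDs for the symbols in each cluster
p <- lapply(mylist, function(x) e2s$gene_id[e2s$symbol %in% x])
# Subset the Entrez-to-GO map by each cluster's IDs
entrezIDs <- lapply(p, function(x) org.Hs.egGO[x])
# One data.frame of GO annotations per cluster
list.GO <- lapply(entrezIDs, toTable)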
With this approach I get a list of data.frames (list.GO) from which I
can exclude evidence codes afterwards. However, I would like to know
whether there is any way to exclude the evidence codes before mapping
my Entrez IDs to GO terms.
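To make the goal concrete, here is a minimal sketch of the kind of "filter first, map afterwards" step I have in mind. It is an illustration under assumptions, not a tested solution: the toy data frame stands in for the full table returned by toTable(org.Hs.egGO) (assumed to have the columns gene_id, go_id, Evidence and Ontology), and the gene IDs, GO IDs and cluster contents are made-up placeholders.

```r
# Toy stand-in for toTable(org.Hs.egGO); all values are placeholders.
go_tab <- data.frame(
  gene_id  = c("1", "1", "2", "3", "3"),
  go_id    = c("GO:0000001", "GO:0000002", "GO:0000001",
               "GO:0000003", "GO:0000004"),
  Evidence = c("IEA", "IDA", "NR", "TAS", "IEA"),
  Ontology = c("BP", "MF", "BP", "CC", "BP"),
  stringsAsFactors = FALSE
)

# Hypothetical clusters of Entrez IDs (plays the role of the list `p`).
p <- list(cluster.1 = c("1", "2"), cluster.2 = c("3"))

# Drop the unwanted evidence codes once, before any per-cluster lookup.
excluded <- c("IEA", "NR")
go_kept <- go_tab[!go_tab$Evidence %in% excluded, ]

# Per-cluster GO annotations taken from the pre-filtered table.
list.GO <- lapply(p, function(ids) go_kept[go_kept$gene_id %in% ids, ])
```

With the real annotation package, go_tab would presumably be replaced by toTable(org.Hs.egGO), so the evidence filter runs once on the whole table rather than once per cluster.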
My final aim is to calculate a semantic similarity index within each
cluster.
Many thanks for the help.
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets
methods
base
other attached packages:
[1] annotate_1.40.0 GO.db_2.10.1 org.Hs.eg.db_2.10.1
RSQLite_0.11.4 DBI_0.2-7 AnnotationDbi_1.24.0
Biobase_2.22.0
[8] BiocGenerics_0.8.0 pheatmap_0.7.7 RColorBrewer_1.0-5
plyr_1.8
loaded via a namespace (and not attached):
[1] grid_3.0.2 IRanges_1.20.5 stats4_3.0.2 tools_3.0.2
XML_3.95-0.2
xtable_1.7-1
---
Juan Fernandez Tajes, phD
Grupo Xenomar Área de Genética
Facultad de Ciencias A Zapateira
Universidad de A Coruña
Spain
Tlf - +34 981 16700
Email: Jfernandezt@udc.es