Hi everybody,
I was wondering whether there is a package to group a list of genes into
their different GO categories.
My problem is as follows.
I have a list of genes (a tab-delimited file):
id            flybasename_gene  flybase_gene_id  entrezgene  GOMF
1616608_a_at  Gpdh              FBgn0001128      33824       carboxylesterase activity  hydrolase activity  3',5'-cyclic-nucleotide phosphodiesterase activity  protein binding
1622892_s_at  CG33057           FBgn0053057      318833      nucleotide binding  protein binding  ATP binding  chaperone binding  ammonium transmembrane transporter activity
1622892_s_at  mkg-p             FBgn0035889      38955       nucleotide binding  protein binding  ATP binding  chaperone binding  ammonium transmembrane transporter activity
1622893_at    IM3               FBgn0040736      50209       aminopeptidase activity  metalloexopeptidase activity  hydrolase activity  manganese ion bindin
1622894_at    CG15120           FBgn0034454      37248       protein binding
I would like to group the genes by the GO categories listed in the last
columns. The GO categories span more than one column, and the number of
columns differs from line to line, depending on the depth of the
annotation for each gene.
Is there a way of transforming the table so that the first column holds
each GO category and the rest of the line lists the gene IDs annotated
with it? (Which ID type is used is not important, as I can convert
between them as needed.)
I would like to end up with something like this:
GO                                          genes
protein binding                             FBgn0001128 FBgn0053057 FBgn0035889 etc.
ammonium transmembrane transporter activity FBgn0053057 FBgn0035889
hydrolase activity                          FBgn0040736 FBgn0001128
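To make the transformation concrete, here is a minimal Python sketch of the regrouping I have in mind. It is only an illustration, not an existing package: the function name `invert_go_table` and its parameters `id_column` and `first_go_column` are hypothetical, and it assumes that every GO term sits in its own tab-separated field from the fifth column onwards. The sample rows are an abbreviated copy of the table above.

```python
from collections import defaultdict


def invert_go_table(lines, id_column=2, first_go_column=4):
    """Map each GO term to the list of gene IDs annotated with it.

    Assumes tab-separated lines where every field from `first_go_column`
    onwards holds one GO term (a variable number of fields per line).
    """
    go_to_genes = defaultdict(list)
    for line in lines:
        fields = [f.strip() for f in line.rstrip("\n").split("\t")]
        if len(fields) <= first_go_column:
            continue  # no GO annotation on this line
        gene_id = fields[id_column]
        for term in fields[first_go_column:]:
            # Skip empty fields and avoid listing a gene twice per term.
            if term and gene_id not in go_to_genes[term]:
                go_to_genes[term].append(gene_id)
    return dict(go_to_genes)


# Abbreviated sample data; the first row is the header and is skipped below.
rows = [
    "id\tflybasename_gene\tflybase_gene_id\tentrezgene\tGOMF",
    "1616608_a_at\tGpdh\tFBgn0001128\t33824\tcarboxylesterase activity"
    "\thydrolase activity\tprotein binding",
    "1622892_s_at\tCG33057\tFBgn0053057\t318833\tnucleotide binding"
    "\tprotein binding\tammonium transmembrane transporter activity",
    "1622892_s_at\tmkg-p\tFBgn0035889\t38955\tnucleotide binding"
    "\tprotein binding\tammonium transmembrane transporter activity",
    "1622893_at\tIM3\tFBgn0040736\t50209\taminopeptidase activity"
    "\thydrolase activity",
    "1622894_at\tCG15120\tFBgn0034454\t37248\tprotein binding",
]

grouped = invert_go_table(rows[1:])

# One line per GO term: the term, a tab, then the gene IDs.
for term, genes in grouped.items():
    print(term + "\t" + " ".join(genes))
```

Gene IDs are listed in file order here, so the order within a line may differ from my hand-written example above.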
I would appreciate any kind of help or ideas.
Thanks
Assa