Hello,
I have an object with GO IDs (values) and their genes (ind), created with the getgo function from goseq:
all23kGos <- stack(getgo(assayed.genes,'hg19','geneSymbol'))
sorted<-all23kGos[order(all23kGos$values),]
> tail(sorted,10)
values ind
1755978 GO:2001300 ALOX15
227119 GO:2001301 ALOX12
1755979 GO:2001301 ALOX15
227120 GO:2001302 ALOX12
1755980 GO:2001302 ALOX15
227121 GO:2001303 ALOX12
1755981 GO:2001303 ALOX15
227122 GO:2001304 ALOX12
227123 GO:2001306 ALOX12
895887 GO:2001311 ACP6
Now I would like to write these genes to a tab-delimited text file so that each row holds one GO term followed by all of its genes, separated by tabs. In the end, every GO term should have its own row.
Is this possible with a simple one-liner, or do I need to write a complicated for loop?
Thanks in advance!
Ben
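
One way to do this in base R is sketched here. It is a minimal example and not necessarily the solution referred to in the reply below. The toy data frame stands in for all23kGos, and the output path is a hypothetical temporary file.

```r
# Toy stand-in for all23kGos (same columns as stack() produces: values, ind)
all23kGos <- data.frame(
  values = c("GO:2001301", "GO:2001300", "GO:2001301", "GO:2001311"),
  ind    = factor(c("ALOX12", "ALOX15", "ALOX15", "ACP6")),
  stringsAsFactors = FALSE
)

# Group gene symbols by GO ID; split() returns a named list ordered by GO ID
genes_by_go <- split(as.character(all23kGos$ind), all23kGos$values)

# Build one line per GO term: the GO ID first, then its genes, tab-separated
go_lines <- vapply(
  names(genes_by_go),
  function(go) paste(c(go, genes_by_go[[go]]), collapse = "\t"),
  character(1)
)

# Hypothetical output location; replace with a real file path as needed
out_file <- tempfile(fileext = ".txt")
writeLines(go_lines, out_file)
```

Rows will have different numbers of fields, so reading the file back with read.table() would need fill=TRUE.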

Thank you, that's a great solution. Is it possible to easily squeeze a column with term description between the GO value and the genes? From another object called newdf:
> head(newdf)
     values       term
[1,] "GO:0048518" "positive regulation of biological process"
[2,] "GO:0048519" "negative regulation of biological process"
[3,] "GO:0019222" "regulation of metabolic process"
[4,] "GO:0009605" "response to external stimulus"
[5,] "GO:0048522" "positive regulation of cellular process"
[6,] "GO:0051173" "positive regulation of nitrogen compound metabolic process"
Sure. Assuming you don't really have the quotes around the GO terms in your new data.frame, you can merge them using left_join(). If you do have the quotes then I think you'll need to strip them before the matching will work.
I first got an error with the suggestion that I add copy=T, and that worked fine! Thanks for your help!
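
The lookup described above can be sketched in base R, using match() as a stand-in for left_join() so the example runs without dplyr. The gene names and the genes_by_go list are made up for illustration, and newdf is assumed to be a character matrix, as the [1,] row labels suggest.

```r
# Toy stand-in for newdf: a character matrix with GO IDs and descriptions
newdf <- cbind(
  values = c("GO:0048518", "GO:0048519"),
  term   = c("positive regulation of biological process",
             "negative regulation of biological process")
)

# Hypothetical genes grouped by GO ID (one list element per GO term)
genes_by_go <- list("GO:0048518" = c("GENE1", "GENE2"),
                    "GO:0009605" = "GENE3")

# Convert the matrix to a data frame and strip any literal quote characters
terms <- as.data.frame(newdf, stringsAsFactors = FALSE)
terms$values <- gsub('"', "", terms$values, fixed = TRUE)

# Left-join-style lookup: keep every GO ID, blank where no description exists
term_for_go <- terms$term[match(names(genes_by_go), terms$values)]
term_for_go[is.na(term_for_go)] <- ""

# One line per GO term: GO ID, description, then the genes, tab-separated
go_lines <- mapply(
  function(go, term, genes) paste(c(go, term, genes), collapse = "\t"),
  names(genes_by_go), term_for_go, genes_by_go,
  USE.NAMES = FALSE
)
```

Converting the matrix to a data frame first is also one way to avoid needing copy=T with left_join(), since both inputs are then ordinary data frames.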
Hi, I was wondering where you got that "term" column for each of the GO values? Thank you!
The term column comes from the output you get from the goseq function itself, but only with later versions of goseq.