yet another gene universe question
Max Kuhn ▴ 60
@max-kuhn-2554
United States
I have access to gene sets from 19 different databases (including GO and KEGG). Some of these are highly curated collections for one specific biological area (such as metabolism), while others are larger (~6K gene sets). The distribution of gene sets per database is:

    > stem(tbl)

      The decimal point is 3 digit(s) to the right of the |

      0 | 01122333446688925
      2 | 4
      4 |
      6 | 3

Appropriately defining the universe is critical, as people on this list have previously demonstrated. Does anyone have an opinion about how to define the gene universe when:

1) the number of genes included across all the gene sets is small (say 20% of the total number of genes).
2) only specific gene sets across databases are tested at once. For example, someone might want to get all the gene sets for a specific area (say cell cycle) across the different databases and test those at once.

I've been thinking that the universe ought to be the set of genes that are available across all the gene sets being tested. In case 1 above, this seems too small, while in case 2 it seems excessively large (cue the Goldilocks jokes).

Thanks,

Max
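To make the question concrete, here is a minimal sketch of how the choice of universe changes a one-sided hypergeometric (over-representation) p-value. Everything in it is an assumption for illustration: the gene names, the two "cell cycle" sets, the selected gene list, and the helper functions `hypergeom_upper_tail` and `enrichment_p` are hypothetical, and the hypergeometric test is only one of several tests one might use here. It is written in plain Python rather than with any Bioconductor package.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn without replacement from a
    universe of N genes, K of which belong to the gene set."""
    top = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, top + 1))
    return hits / comb(N, n)


def enrichment_p(selected, gene_set, universe):
    """Over-representation p-value after restricting both the selected
    genes and the gene set to the chosen universe."""
    universe = set(universe)
    selected = set(selected) & universe
    gene_set = set(gene_set) & universe
    k = len(selected & gene_set)
    return hypergeom_upper_tail(k, len(gene_set), len(selected), len(universe))


# Hypothetical toy data: 100 measured genes, only 20% of them annotated
# in any gene set (case 1 in the question).
all_genes = [f"g{i}" for i in range(100)]
annotated_genes = all_genes[:20]

# Two hypothetical "cell cycle" sets from different databases (case 2).
cell_cycle_a = {"g0", "g1", "g2", "g3", "g4"}
cell_cycle_b = {"g3", "g4", "g5", "g6", "g7", "g8", "g9"}
tested_genes = cell_cycle_a | cell_cycle_b

# Hypothetical list of selected (e.g. differentially expressed) genes.
selected = {"g0", "g1", "g2", "g10", "g50", "g60"}

# Same gene set and same selected list under three candidate universes.
p_all = enrichment_p(selected, cell_cycle_a, all_genes)
p_annotated = enrichment_p(selected, cell_cycle_a, annotated_genes)
p_tested = enrichment_p(selected, cell_cycle_a, tested_genes)

for label, p in [("all measured genes", p_all),
                 ("genes in any gene set", p_annotated),
                 ("genes in the tested sets only", p_tested)]:
    print(f"{label:32s} p = {p:.4g}")
```

In this toy setup the same overlap looks less surprising as the universe shrinks, because selected genes that fall outside the universe are dropped and the gene set makes up a larger share of what remains. That does not say which universe is right; it only shows how sensitive the answer is to the choice.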