Hi Yiwen,
Jim's suggestion would work well.
The gage package now has a secondary vignette on data preparation for
gene set or pathway analysis. It covers gene set and expression data
input, as well as probe set ID, transcript ID, and gene ID conversion.
You can find examples of probe set ID and gene ID conversion in
Sections 4-5.
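As a rough illustration of the idea (not the vignette's own code), here
is a minimal base R sketch of collapsing probe-level rows into
gene-level rows so that the expression IDs match the gene set IDs. The
probe names, gene names, and the probe2gene map are hypothetical toy
values. For real hgu133a data the map would have to come from an
annotation source such as the hgu133a.db package, which is an
assumption here. Averaging probes per gene is just one possible
summarization choice.

```r
# Toy expression matrix: rows are probe set IDs, columns are samples.
expr <- matrix(
  c(5.1, 5.3,
    6.0, 6.4,
    7.2, 7.0,
    4.4, 4.8),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("probe_a", "probe_b", "probe_c", "probe_d"),
                  c("sample1", "sample2"))
)

# Hypothetical probe-to-gene map (illustration only). probe_d has no
# gene annotation, so it is marked NA and dropped below.
probe2gene <- c(probe_a = "GENE1", probe_b = "GENE1",
                probe_c = "GENE2", probe_d = NA)

# Look up the gene for each row and keep only annotated probes.
genes <- probe2gene[rownames(expr)]
keep <- !is.na(genes)

# rowsum() adds up rows that share a gene; dividing by the number of
# probes per gene turns the sums into per-gene means.
sums <- rowsum(expr[keep, , drop = FALSE], group = genes[keep])
counts <- as.vector(table(genes[keep])[rownames(sums)])
gene_expr <- sums / counts

print(gene_expr)
```

The resulting matrix has one row per gene, so its row names can be
matched directly against gene sets that use the same ID type.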
Note that you need to update to the latest version of
gage: http://bioconductor.org/packages/2.12/bioc/html/gage.html.
You also need to install the latest gageData package.
Hope this addresses your problem. Thanks!
Weijun
On 3/25/2013 1:49 PM, He, Yiwen (NIH/CIT) [C] wrote:
>Hi,
>I am looking into using the gage package for gene set analysis. I
>would like to test run it on the human diabetic muscle microarray data
>used in the initial description of GSEA paper. I downloaded the
>expression data (Diabetes_hgu133a.gct) from the Broad institute
>website, and also downloaded the C2 gene set there (c2.symbols.gmt).
>However, the IDs in the expression dataset are Affymetrix probe IDs,
>while the IDs in the gene set are gene symbols (or Entrez gene IDs if
>I download another version.)
>Your manual says these two IDs should match, and I understand that.
>But what should I do when they don't match? The examples given in the
>manual have everything setup the right way already.
>I'm using R version 2.15.2 and gage_2.8.0 on Platform:
>i386-w64-mingw32/i386 (32-bit).
>Thank you very much!
>Yiwen He
>DCB/CIT/NIH/HHS