Hi all,
I'm new to R and have some very basic questions about using GOstats with
a non-model organism.
I'm trying to use the tutorial by Marc Carlson "How to Use GOstats
and...with unsupported model organisms."
I've created a GO to gene mapping file with the following 3 columns of
data:
Goterm Evidence GeneID
GO:0015893 IEA CNAG_00003
GO:0043231 IEA CNAG_00003
GO:0015203 IEA CNAG_00003
GO:0044425 IEA CNAG_00003
...
I can import it using read.table, but I don't seem to be able to invoke
the data frame correctly.
The tutorial reads:
library("org.Hs.eg.db")
frame = toTable(org.Hs.egGO)
goFrameData = data.frame(frame$go_id, frame$Evidence, frame$gene_id)
I imported the data into an object using read.table
> CneoGOanno <- read.table("Cneo_GOannot.txt")
I tried to create a frame using:
> frame = toTable(CneoGOanno)
Error in function (classes, fdef, mtable) :
unable to find an inherited method for function "toTable", for signature "data.frame"
Do I have to create some sort of database for this organism first? If
so, what is its format?
Any suggestions would be most appreciated.
Regards,
Maureen Donlin
At the risk of making this email too long, here's the session info:
> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] org.Hs.eg.db_2.4.6 GOstats_2.16.0 RSQLite_0.9-4 DBI_0.2-5
[5] graph_1.28.0 Category_2.16.0 AnnotationDbi_1.12.0 Biobase_2.10.0
loaded via a namespace (and not attached):
[1] annotate_1.28.0 genefilter_1.32.0 GO.db_2.4.5 GSEABase_1.12.2
[5] RBGL_1.26.0 splines_2.12.1 survival_2.36-2 tools_2.12.1
[9] XML_3.2-0 xtable_1.5-6
--
Maureen J. Donlin, Ph.D.
Research Associate Professor
Dept. of Molecular Microbiology & Immunology
Dept. of Biochemistry & Molecular Biology
Saint Louis University School of Medicine
507 Doisy Research Center
1100 S. Grand
St. Louis, MO 63104
Phone: 314-977-8858
Hi Maureen,
On 2/14/2011 3:27 PM, Maureen J. Donlin wrote:
> I can import it using read.table, but I don't seem to be able to invoke
> the data frame correctly.
When you read it in using read.table(), you automatically have a data.frame.
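To illustrate the point, here is a minimal base-R sketch. The inline text is only a stand-in for the first lines of Cneo_GOannot.txt, and the object names no_header and with_header are made up for this example. It shows that read.table() returns a data.frame directly, and that header=TRUE decides whether the first line becomes column names or a row of data.

```r
# Inline stand-in for the first lines of Cneo_GOannot.txt (illustrative only).
txt <- "Goterm Evidence GeneID
GO:0015893 IEA CNAG_00003
GO:0043231 IEA CNAG_00003
GO:0015203 IEA CNAG_00003"

# Without header = TRUE, the header line is read as an ordinary data row.
no_header <- read.table(text = txt)

# With header = TRUE, the first line supplies the column names.
with_header <- read.table(text = txt, header = TRUE)

class(with_header)   # "data.frame"
names(no_header)     # "V1" "V2" "V3"
names(with_header)   # "Goterm" "Evidence" "GeneID"
nrow(no_header)      # 4: the header line is counted as data
nrow(with_header)    # 3
```

No conversion step such as toTable() is needed afterwards; that function is only used in the tutorial to pull a table out of an annotation package.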
>
> The tutorial reads:
> library("org.Hs.eg.db")
> frame = toTable(org.Hs.egGO)
> goFrameData = data.frame(frame$go_id, frame$Evidence, frame$gene_id)
Yep, this is just some code that Marc uses to create a data.frame so he
can give an example.
>
> I imported the data into an object using read.table
> > CneoGOanno <- read.table("Cneo_GOannot.txt")
>
> I tried to create a frame using:
> > frame = toTable(CneoGOanno)
> Error in function (classes, fdef, mtable) :
> unable to find an inherited method for function "toTable", for signature
> "data.frame"
>
> Do I have to create some sort of database for this organism first? If
> so, what is its format?
>
> Any suggestions would be most appreciated.
Just go to the next step, which will be something like
goFrame <- GOFrame(CneoGOanno, organism = "Cryptococcus neoformans")
goAllFrame <- GOAllFrame(goFrame)
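Before that step it may be worth checking that the table has the layout used in the tutorial's example (GO ID, evidence code, gene ID, in that order). The sketch below is base R only; check_go_table is a hypothetical helper written for this message, not a function from GOstats or AnnotationDbi, and the inline text stands in for the real file. It catches one plausible mistake: reading the file without header=TRUE, so that the header line ends up as a data row.

```r
# Hypothetical helper (not part of GOstats or AnnotationDbi): checks that a
# table has three columns and that column 1 looks like GO IDs.
check_go_table <- function(df) {
  stopifnot(is.data.frame(df), ncol(df) == 3)
  ids <- as.character(df[[1]])
  bad <- unique(ids[!grepl("^GO:[0-9]{7}$", ids)])
  if (length(bad) > 0) {
    stop("column 1 has values that are not GO IDs: ",
         paste(head(bad), collapse = ", "))
  }
  invisible(TRUE)
}

# Inline stand-in for the annotation file (illustrative only).
txt <- "Goterm Evidence GeneID
GO:0015893 IEA CNAG_00003
GO:0043231 IEA CNAG_00003"

with_header <- read.table(text = txt, header = TRUE)
no_header   <- read.table(text = txt)

check_go_table(with_header)                       # passes silently
res <- try(check_go_table(no_header), silent = TRUE)
inherits(res, "try-error")                        # TRUE: "Goterm" is not a GO ID
```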
Best,
Jim
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should
not be used for urgent or sensitive issues
James,
Thanks for the reply. I figured out how to get the data into a data frame.
I was doing 2 things wrong, but here is the code that worked.
> CneoGO <- read.table("Cneo_GOannot.txt", header=TRUE)
> head(CneoGO)
Goterm Evidence GeneID
1 GO:0015893 IEA CNAG_00003
2 GO:0043231 IEA CNAG_00003
3 GO:0015203 IEA CNAG_00003
4 GO:0044425 IEA CNAG_00003
5 GO:0044444 IEA CNAG_00003
6 GO:0015846 IEA CNAG_00003
> goframeData = data.frame(CneoGO$Goterm, CneoGO$Evidence, CneoGO$GeneID)
> head(goframeData)
CneoGO.Goterm CneoGO.Evidence CneoGO.GeneID
1 GO:0015893 IEA CNAG_00003
2 GO:0043231 IEA CNAG_00003
3 GO:0015203 IEA CNAG_00003
4 GO:0044425 IEA CNAG_00003
5 GO:0044444 IEA CNAG_00003
6 GO:0015846 IEA CNAG_00003
So continuing with the tutorial guide, I executed the following:
> library("GSEABase")
Loading required package: annotate
> goFrame = GOFrame(goframeData, organism = "Cryptococcus neoformans")
Loading required package: GO.db
> goFrame
An object of class "GOFrame"
Slot "data":
CneoGO.Goterm CneoGO.Evidence CneoGO.GeneID
1 GO:0015893 IEA CNAG_00003
2 GO:0043231 IEA CNAG_00003
...
Slot "organism":
[1] "Cryptococcus neoformans"
> goAllFrame = GOAllFrame(goFrame)
> goAllFrame
An object of class "GOAllFrame"
Slot "data":
go_id evidence gene_id
1 GO:0000001 IEA CNAG_00006
2 GO:0000001 IEA CNAG_00088
...
Slot "organism":
[1] "Cryptococcus neoformans"
> gsc <- GeneSetCollection(goAllFrame, setType = GOCollection())
> gsc
GeneSetCollection
names: GO:0000001, GO:0000002, ..., GO:2000045 (6658 total)
unique identifiers: CNAG_00006, CNAG_00088, ..., CNAG_06995 (4822 total)
types in collection:
geneIdType: GOAllFrameIdentifier (1 total)
collectionType: GOCollection (1 total)
> universe = Lkeys(CneoGO)
Error in function (classes, fdef, mtable) :
unable to find an inherited method for function "Lkeys", for signature "data.frame"
Am I missing some data that is found in library("org.Hs.eg.db")? I
can run the same commands with it, and the structure of the goFrame,
goAllFrame and gsc seems to be the same.
Here's what I am trying to do. I have a microarray data set from a time
course experiment done with a fungal genome, C. neoformans. I have
clusters of genes which are associated based on how their expression
changed in relation to the other genes on the array. So what I have are
gene lists, with no expression data or fold changes. For each list of
genes, I want to know what GO terms are over-represented.
I apologize if these questions are too basic. It's just that most of
the software out there for gene enrichment analysis is designed for
model organisms.
Again, any help is greatly appreciated.
Regards,
Maureen
Hi Maureen,
On 2/14/2011 5:50 PM, Maureen J. Donlin wrote:
> > goframeData = data.frame(CneoGO$Goterm, CneoGO$Evidence, CneoGO$GeneID)
> > head(goframeData)
> CneoGO.Goterm CneoGO.Evidence CneoGO.GeneID
> 1 GO:0015893 IEA CNAG_00003
> 2 GO:0043231 IEA CNAG_00003
> 3 GO:0015203 IEA CNAG_00003
> 4 GO:0044425 IEA CNAG_00003
> 5 GO:0044444 IEA CNAG_00003
> 6 GO:0015846 IEA CNAG_00003
This step is unnecessary. The result of read.table() *is* a data.frame,
so you are just creating another data.frame here.
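A small base-R sketch of why the extra step adds nothing. The inline text stands in for the real file. Rebuilding the table with data.frame() gives the same values and only changes the column names.

```r
# Inline stand-in for Cneo_GOannot.txt (illustrative only).
CneoGO <- read.table(text = "Goterm Evidence GeneID
GO:0015893 IEA CNAG_00003
GO:0043231 IEA CNAG_00003", header = TRUE, stringsAsFactors = FALSE)

# The extra step from the transcript above.
goframeData <- data.frame(CneoGO$Goterm, CneoGO$Evidence, CneoGO$GeneID)

is.data.frame(CneoGO)   # TRUE: already a data.frame after read.table()
names(goframeData)      # "CneoGO.Goterm" "CneoGO.Evidence" "CneoGO.GeneID"

# The contents are the same; only the column names differ.
same_values <- all(as.matrix(goframeData) == as.matrix(CneoGO))
same_values             # TRUE

# If the columns of a file were in another order, indexing by name would
# put them in the GO ID / evidence / gene ID order without rebuilding.
reordered <- CneoGO[, c("Goterm", "Evidence", "GeneID")]
```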
> > universe = Lkeys(CneoGO)
> Error in function (classes, fdef, mtable) :
> unable to find an inherited method for function "Lkeys", for signature
> "data.frame"
So here you are getting mixed up between what Marc had to do to get his
example to run and what you need to do. The 'universe' is just the
complete set of gene IDs from which your significant set was chosen.
If you had an org.Cn.eg.db package, then you would do something similar.
However, you don't, which is the point of this exercise. The
corresponding set of gene IDs that you do have is the third column of
the data.frame you created above (goframeData or CneoGO).
Note here that you want to make sure that the gene IDs you use are
character values, not factors. The default for R when reading in a
data.frame is to convert a vector of strings to factor, so you either
want to use
CneoGO <- read.table("Cneo_GOannot.txt", header=TRUE, stringsAsFactors = FALSE)
and then
universe <- CneoGO[,3]
or proceed as you already have, but then
universe <- as.character(CneoGO[,3])
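Both routes can be compared in a short base-R sketch. The inline text stands in for the real file and the object names are illustrative. Setting stringsAsFactors explicitly keeps the result the same on any R version (from R 4.0.0 onward the default is FALSE). The unique() call is an optional extra that is not part of the two lines above; it only drops repeats, since a gene appears once per GO term it is annotated with.

```r
# Inline stand-in for Cneo_GOannot.txt (illustrative only).
txt <- "Goterm Evidence GeneID
GO:0015893 IEA CNAG_00003
GO:0043231 IEA CNAG_00003
GO:0000001 IEA CNAG_00006"

# Read the same text both ways.
as_factor <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
as_char   <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

class(as_factor$GeneID)   # "factor"
class(as_char$GeneID)     # "character"

# Either route ends with a character vector of gene IDs.
universe_a <- unique(as.character(as_factor[, 3]))
universe_b <- unique(as_char[, 3])

identical(universe_a, universe_b)   # TRUE
universe_b                          # "CNAG_00003" "CNAG_00006"
```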
In addition, note that you will need to construct your 'genes' vector
differently from what is shown on p.3 of the vignette, instead selecting
the set of significant genes from the results of your analysis (again,
using the CNAG gene IDs).
From that point on, you continue as Marc shows in the vignette.
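As a rough illustration of what 'genes' and 'universe' mean for the test, here is a base-R sketch on made-up data. It is not how GOstats implements the analysis (GOstats also works with the ancestor terms added by GOAllFrame); it only runs a one-sided hypergeometric test for a single term with phyper(). The annotation table, the cluster, and the chosen term are all invented for this example.

```r
# Toy annotation: 10 genes, the first 4 annotated to one term (made up).
ids <- sprintf("CNAG_%05d", 1:10)
anno <- data.frame(
  Goterm   = rep(c("GO:0000001", "GO:0000002"), times = c(4, 6)),
  Evidence = "IEA",
  GeneID   = ids,
  stringsAsFactors = FALSE
)

# Universe: every gene ID that could have been selected.
universe <- unique(anno$GeneID)

# One cluster from the array analysis (made up); IDs that are not in the
# universe are dropped before testing.
cluster <- c("CNAG_00001", "CNAG_00002", "CNAG_00003", "CNAG_00009", "CNAG_99999")
genes <- intersect(cluster, universe)

# Over-representation of one term in the cluster.
term    <- "GO:0000001"
in_term <- unique(anno$GeneID[anno$Goterm == term])
k <- length(intersect(genes, in_term))  # cluster genes with the term: 3
M <- length(in_term)                    # universe genes with the term: 4
N <- length(universe)                   # universe size: 10
n <- length(genes)                      # cluster size: 4

# P(X >= k) when drawing n genes at random from the universe.
p_over <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)
p_over   # 25/210, about 0.119
```

In the real analysis, 'genes' would be one of the cluster gene lists and 'universe' the CNAG IDs from the annotation table, both as character vectors.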
Best,
Jim
James,
Thanks ever so much. Following your advice, I was able to get this to
work quite nicely.
Regards,
Maureen