GEO and Limma
2
0
Entering edit mode
Voke AO ▴ 760
@voke-ao-4830
Last seen 7.2 years ago
Hi all, I've been struggling a bit with GEO and limma. I've come up with 2 questions that I'ld appreciate some input on. Question 1: I have a design matrix with 6columns for an expression set with 3 columns(3 levels, 2 agents). How do I create a model matrix to fit my design matrix for this dataset (eg. gds3715 below)? For a straight forward single factor analysis eg. gds161, the code below seems to work fine. gds161dat = getGEO('GDS161',destdir=".") gds161eset = GDS2eSet(gds161dat, do.log2=TRUE) m = pData(gds161eset)$metabolism design_gds161 = createDesignMatrix(gds161eset) design_gds161 = model.matrix(~m) fit = lmFit(gds161eset, design_gds161) fit2 = eBayes(fit) results = topTable(fit2, adjust ="BH", number = nrow(gds161eset)) excel = write.table(results, file = file.choose(), sep = ",") gds3715dat = getGEO('GDS3715',destdir=".") gds3715eset = GDS2eSet(gds3715dat, do.log2=TRUE) DIR = paste(pData(gds3715eset)$disease.state, pData(gds3715eset)$agent, sep =".") m = data.frame("fac1" = pData(gds3715eset)$disease.state, "fac2" = pData(gds3715eset)$agent) design_gds3715 = createDesignMatrix(gds3715eset) design_gds3715 = model.matrix(~m) fit = lmFit(gds3715eset, design_gds3715) fit2 = eBayes(fit) results = topTable(fit2, adjust ="BH", number = nrow(gds3715eset)) excel = write.table(results, file = file.choose(), sep = ",") Question 2: How can I run a similar (limma) analysis with GSE files? I can hardly figure out how to get from the expression set to the analysis...I've read the Using the GEOquery Package documentation several times over. I'm just have a hard time getting it to work. gse25724dat = getGEO('GSE25724', GSEMatrix = TRUE) This is supposed to give me an expressionset, I can't seem to figure out how to analyze it with limma. Please help. Thanks. Avoks sessionInfo() R version 2.13.1 (2011-07-08) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_xxx. LC_CTYPE=English_xxx [3] LC_MONETARY=English_xxx LC_NUMERIC=C [5] LC_TIME=English_xxx attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] puma_2.4.0 mclust_3.4.10 affy_1.30.0 limma_3.8.3 [5] GEOquery_2.19.4 Biobase_2.12.2 loaded via a namespace (and not attached): [1] affyio_1.20.0 preprocessCore_1.14.0 RCurl_1.6-10.1 [4] tools_2.13.1 XML_3.4-2.2
limma GEOquery limma GEOquery • 1.3k views
ADD COMMENT
0
Entering edit mode
@freudenberg-johannes-nihniehs-e-4789
Last seen 7.2 years ago
Hi Avoks, > I can hardly figure out how to get from the expression set to the analysis... I can't tell what the problem is but my best guess would be that you're dealing with a list here. Each list element correspond to a different platform. So all you need to do is something like > myGSEeset <- gse25724dat[[1]] > head(exprs(myGSEeset)) etc... I hope this helps. Please feel free to post more specific questions if it doesn't. Best, Johannes -----Original Message----- From: Ovokeraye Achinike-Oduaran [mailto:ovokeraye@gmail.com] Sent: Tuesday, October 04, 2011 10:28 AM To: bioconductor at r-project.org Subject: [BioC] GEO and Limma Hi all, I've been struggling a bit with GEO and limma. I've come up with 2 questions that I'ld appreciate some input on. Question 1: I have a design matrix with 6columns for an expression set with 3 columns(3 levels, 2 agents). How do I create a model matrix to fit my design matrix for this dataset (eg. gds3715 below)? For a straight forward single factor analysis eg. gds161, the code below seems to work fine. gds161dat = getGEO('GDS161',destdir=".") gds161eset = GDS2eSet(gds161dat, do.log2=TRUE) m = pData(gds161eset)$metabolism design_gds161 = createDesignMatrix(gds161eset) design_gds161 = model.matrix(~m) fit = lmFit(gds161eset, design_gds161) fit2 = eBayes(fit) results = topTable(fit2, adjust ="BH", number = nrow(gds161eset)) excel = write.table(results, file = file.choose(), sep = ",") gds3715dat = getGEO('GDS3715',destdir=".") gds3715eset = GDS2eSet(gds3715dat, do.log2=TRUE) DIR = paste(pData(gds3715eset)$disease.state, pData(gds3715eset)$agent, sep =".") m = data.frame("fac1" = pData(gds3715eset)$disease.state, "fac2" = pData(gds3715eset)$agent) design_gds3715 = createDesignMatrix(gds3715eset) design_gds3715 = model.matrix(~m) fit = lmFit(gds3715eset, design_gds3715) fit2 = eBayes(fit) results = topTable(fit2, adjust ="BH", number = nrow(gds3715eset)) excel = write.table(results, file = file.choose(), sep = ",") Question 2: How can I run a similar (limma) analysis with GSE files? I can hardly figure out how to get from the expression set to the analysis...I've read the Using the GEOquery Package documentation several times over. I'm just have a hard time getting it to work. gse25724dat = getGEO('GSE25724', GSEMatrix = TRUE) This is supposed to give me an expressionset, I can't seem to figure out how to analyze it with limma. Please help. Thanks. Avoks sessionInfo() R version 2.13.1 (2011-07-08) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_xxx. LC_CTYPE=English_xxx [3] LC_MONETARY=English_xxx LC_NUMERIC=C [5] LC_TIME=English_xxx attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] puma_2.4.0 mclust_3.4.10 affy_1.30.0 limma_3.8.3 [5] GEOquery_2.19.4 Biobase_2.12.2 loaded via a namespace (and not attached): [1] affyio_1.20.0 preprocessCore_1.14.0 RCurl_1.6-10.1 [4] tools_2.13.1 XML_3.4-2.2 _______________________________________________ Bioconductor mailing list Bioconductor at r-project.org https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
0
Entering edit mode
Voke AO ▴ 760
@voke-ao-4830
Last seen 7.2 years ago
Thanks a bunch, Johannes and Sean. I'll try the GSE data again as soon as I get in front of a comp. About the first question on creating a model matrix for a 3x2 factorial dataset like gds3715, any ideas? Thanks again. Avoks On 10/4/11, Sean Davis <sdavis2 at="" mail.nih.gov=""> wrote: > On Tue, Oct 4, 2011 at 10:28 AM, Ovokeraye Achinike-Oduaran > <ovokeraye at="" gmail.com=""> wrote: >> Hi all, >> >> I've been struggling a bit with GEO and limma. I've come up with 2 >> questions that I'ld appreciate some input on. >> >> Question 1: >> I have a design matrix with 6columns for an expression set with 3 >> columns(3 levels, 2 agents). How do I create a model matrix to fit my >> design matrix for this dataset (eg. gds3715 below)? For a straight >> forward single factor analysis eg. gds161, the code below seems to >> work fine. >> >> gds161dat = getGEO('GDS161',destdir=".") >> gds161eset = GDS2eSet(gds161dat, do.log2=TRUE) >> m = pData(gds161eset)$metabolism >> design_gds161 = createDesignMatrix(gds161eset) >> design_gds161 = model.matrix(~m) >> fit = lmFit(gds161eset, design_gds161) >> fit2 = eBayes(fit) >> results = topTable(fit2, adjust ="BH", number = nrow(gds161eset)) >> excel = write.table(results, file = file.choose(), sep = ",") >> >> gds3715dat = getGEO('GDS3715',destdir=".") >> gds3715eset = GDS2eSet(gds3715dat, do.log2=TRUE) >> DIR = paste(pData(gds3715eset)$disease.state, >> pData(gds3715eset)$agent, sep =".") >> m = data.frame("fac1" = pData(gds3715eset)$disease.state, "fac2" = >> pData(gds3715eset)$agent) >> design_gds3715 = createDesignMatrix(gds3715eset) >> design_gds3715 = model.matrix(~m) >> fit = lmFit(gds3715eset, design_gds3715) >> fit2 = eBayes(fit) >> results = topTable(fit2, adjust ="BH", number = nrow(gds3715eset)) >> excel = write.table(results, file = file.choose(), sep = ",") >> >> Question 2: >> How can I run a similar (limma) analysis with GSE files? >> I can hardly figure out how to get from the expression set to the >> analysis...I've read the Using the GEOquery Package documentation >> several times over. I'm just have a hard time getting it to work. >> >> gse25724dat = getGEO('GSE25724', GSEMatrix = TRUE) > > Hi, Avoks. The return value from this call is a LIST of > ExpressionSets, not a single ExpressionSet. You probably want the > first ExpressionSet from the list and can get that: > > gse25724eSet = gse25724dat[[1]] > > GEOquery works this way because many GSEs (so-called superseries) have > multiple series included within them, so returning a list is necessary > to be general. > > Sean > > >> This is supposed to give me an expressionset, I can't seem to figure >> out how to analyze it with limma. >> >> Please help. >> >> Thanks. >> >> Avoks >> >> sessionInfo() >> R version 2.13.1 (2011-07-08) >> Platform: i386-pc-mingw32/i386 (32-bit) >> >> locale: >> [1] LC_COLLATE=English_xxx. LC_CTYPE=English_xxx >> [3] LC_MONETARY=English_xxx LC_NUMERIC=C >> [5] LC_TIME=English_xxx >> >> attached base packages: >> [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? base >> >> other attached packages: >> [1] puma_2.4.0 ? ? ?mclust_3.4.10 ? affy_1.30.0 ? ? limma_3.8.3 >> [5] GEOquery_2.19.4 Biobase_2.12.2 >> >> loaded via a namespace (and not attached): >> [1] affyio_1.20.0 ? ? ? ? preprocessCore_1.14.0 RCurl_1.6-10.1 >> [4] tools_2.13.1 ? ? ? ? ?XML_3.4-2.2 >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >
ADD COMMENT
0
Entering edit mode
On Tue, Oct 4, 2011 at 2:04 PM, Ovokeraye Achinike-Oduaran <ovokeraye at="" gmail.com=""> wrote: > Thanks a bunch, Johannes and Sean. I'll try the GSE data again as soon > as I ?get in front of a comp. About the first question on creating a > model matrix ?for a 3x2 factorial dataset like gds3715, any ideas? You'll probably need a contrast matrix in there somewhere to pull out contrasts of interest. I'm not sure which you are interested in. Sean > Thanks again. > > Avoks > > On 10/4/11, Sean Davis <sdavis2 at="" mail.nih.gov=""> wrote: >> On Tue, Oct 4, 2011 at 10:28 AM, Ovokeraye Achinike-Oduaran >> <ovokeraye at="" gmail.com=""> wrote: >>> Hi all, >>> >>> I've been struggling a bit with GEO and limma. I've come up with 2 >>> questions that I'ld appreciate some input on. >>> >>> Question 1: >>> I have a design matrix with 6columns for an expression set with 3 >>> columns(3 levels, 2 agents). How do I create a model matrix to fit my >>> design matrix for this dataset (eg. gds3715 below)? For a straight >>> forward single factor analysis eg. gds161, the code below seems to >>> work fine. >>> >>> gds161dat = getGEO('GDS161',destdir=".") >>> gds161eset = GDS2eSet(gds161dat, do.log2=TRUE) >>> m = pData(gds161eset)$metabolism >>> design_gds161 = createDesignMatrix(gds161eset) >>> design_gds161 = model.matrix(~m) >>> fit = lmFit(gds161eset, design_gds161) >>> fit2 = eBayes(fit) >>> results = topTable(fit2, adjust ="BH", number = nrow(gds161eset)) >>> excel = write.table(results, file = file.choose(), sep = ",") >>> >>> gds3715dat = getGEO('GDS3715',destdir=".") >>> gds3715eset = GDS2eSet(gds3715dat, do.log2=TRUE) >>> DIR = paste(pData(gds3715eset)$disease.state, >>> pData(gds3715eset)$agent, sep =".") >>> m = data.frame("fac1" = pData(gds3715eset)$disease.state, "fac2" = >>> pData(gds3715eset)$agent) >>> design_gds3715 = createDesignMatrix(gds3715eset) >>> design_gds3715 = model.matrix(~m) >>> fit = lmFit(gds3715eset, design_gds3715) >>> fit2 = eBayes(fit) >>> results = topTable(fit2, adjust ="BH", number = nrow(gds3715eset)) >>> excel = write.table(results, file = file.choose(), sep = ",") >>> >>> Question 2: >>> How can I run a similar (limma) analysis with GSE files? >>> I can hardly figure out how to get from the expression set to the >>> analysis...I've read the Using the GEOquery Package documentation >>> several times over. I'm just have a hard time getting it to work. >>> >>> gse25724dat = getGEO('GSE25724', GSEMatrix = TRUE) >> >> Hi, Avoks. ?The return value from this call is a LIST of >> ExpressionSets, not a single ExpressionSet. ?You probably want the >> first ExpressionSet from the list and can get that: >> >> gse25724eSet = gse25724dat[[1]] >> >> GEOquery works this way because many GSEs (so-called superseries) have >> multiple series included within them, so returning a list is necessary >> to be general. >> >> Sean >> >> >>> This is supposed to give me an expressionset, I can't seem to figure >>> out how to analyze it with limma. >>> >>> Please help. >>> >>> Thanks. >>> >>> Avoks >>> >>> sessionInfo() >>> R version 2.13.1 (2011-07-08) >>> Platform: i386-pc-mingw32/i386 (32-bit) >>> >>> locale: >>> [1] LC_COLLATE=English_xxx. LC_CTYPE=English_xxx >>> [3] LC_MONETARY=English_xxx LC_NUMERIC=C >>> [5] LC_TIME=English_xxx >>> >>> attached base packages: >>> [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? base >>> >>> other attached packages: >>> [1] puma_2.4.0 ? ? ?mclust_3.4.10 ? affy_1.30.0 ? ? limma_3.8.3 >>> [5] GEOquery_2.19.4 Biobase_2.12.2 >>> >>> loaded via a namespace (and not attached): >>> [1] affyio_1.20.0 ? ? ? ? preprocessCore_1.14.0 RCurl_1.6-10.1 >>> [4] tools_2.13.1 ? ? ? ? ?XML_3.4-2.2 >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >> > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLY
0
Entering edit mode
Thanks Sean. Avoks On Tue, Oct 4, 2011 at 8:08 PM, Sean Davis <sdavis2 at="" mail.nih.gov=""> wrote: > On Tue, Oct 4, 2011 at 2:04 PM, Ovokeraye Achinike-Oduaran > <ovokeraye at="" gmail.com=""> wrote: >> Thanks a bunch, Johannes and Sean. I'll try the GSE data again as soon >> as I ?get in front of a comp. About the first question on creating a >> model matrix ?for a 3x2 factorial dataset like gds3715, any ideas? > > You'll probably need a contrast matrix in there somewhere to pull out > contrasts of interest. ?I'm not sure which you are interested in. > > Sean > > >> Thanks again. >> >> Avoks >> >> On 10/4/11, Sean Davis <sdavis2 at="" mail.nih.gov=""> wrote: >>> On Tue, Oct 4, 2011 at 10:28 AM, Ovokeraye Achinike-Oduaran >>> <ovokeraye at="" gmail.com=""> wrote: >>>> Hi all, >>>> >>>> I've been struggling a bit with GEO and limma. I've come up with 2 >>>> questions that I'ld appreciate some input on. >>>> >>>> Question 1: >>>> I have a design matrix with 6columns for an expression set with 3 >>>> columns(3 levels, 2 agents). How do I create a model matrix to fit my >>>> design matrix for this dataset (eg. gds3715 below)? For a straight >>>> forward single factor analysis eg. gds161, the code below seems to >>>> work fine. >>>> >>>> gds161dat = getGEO('GDS161',destdir=".") >>>> gds161eset = GDS2eSet(gds161dat, do.log2=TRUE) >>>> m = pData(gds161eset)$metabolism >>>> design_gds161 = createDesignMatrix(gds161eset) >>>> design_gds161 = model.matrix(~m) >>>> fit = lmFit(gds161eset, design_gds161) >>>> fit2 = eBayes(fit) >>>> results = topTable(fit2, adjust ="BH", number = nrow(gds161eset)) >>>> excel = write.table(results, file = file.choose(), sep = ",") >>>> >>>> gds3715dat = getGEO('GDS3715',destdir=".") >>>> gds3715eset = GDS2eSet(gds3715dat, do.log2=TRUE) >>>> DIR = paste(pData(gds3715eset)$disease.state, >>>> pData(gds3715eset)$agent, sep =".") >>>> m = data.frame("fac1" = pData(gds3715eset)$disease.state, "fac2" = >>>> pData(gds3715eset)$agent) >>>> design_gds3715 = createDesignMatrix(gds3715eset) >>>> design_gds3715 = model.matrix(~m) >>>> fit = lmFit(gds3715eset, design_gds3715) >>>> fit2 = eBayes(fit) >>>> results = topTable(fit2, adjust ="BH", number = nrow(gds3715eset)) >>>> excel = write.table(results, file = file.choose(), sep = ",") >>>> >>>> Question 2: >>>> How can I run a similar (limma) analysis with GSE files? >>>> I can hardly figure out how to get from the expression set to the >>>> analysis...I've read the Using the GEOquery Package documentation >>>> several times over. I'm just have a hard time getting it to work. >>>> >>>> gse25724dat = getGEO('GSE25724', GSEMatrix = TRUE) >>> >>> Hi, Avoks. ?The return value from this call is a LIST of >>> ExpressionSets, not a single ExpressionSet. ?You probably want the >>> first ExpressionSet from the list and can get that: >>> >>> gse25724eSet = gse25724dat[[1]] >>> >>> GEOquery works this way because many GSEs (so-called superseries) have >>> multiple series included within them, so returning a list is necessary >>> to be general. >>> >>> Sean >>> >>> >>>> This is supposed to give me an expressionset, I can't seem to figure >>>> out how to analyze it with limma. >>>> >>>> Please help. >>>> >>>> Thanks. >>>> >>>> Avoks >>>> >>>> sessionInfo() >>>> R version 2.13.1 (2011-07-08) >>>> Platform: i386-pc-mingw32/i386 (32-bit) >>>> >>>> locale: >>>> [1] LC_COLLATE=English_xxx. LC_CTYPE=English_xxx >>>> [3] LC_MONETARY=English_xxx LC_NUMERIC=C >>>> [5] LC_TIME=English_xxx >>>> >>>> attached base packages: >>>> [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? base >>>> >>>> other attached packages: >>>> [1] puma_2.4.0 ? ? ?mclust_3.4.10 ? affy_1.30.0 ? ? limma_3.8.3 >>>> [5] GEOquery_2.19.4 Biobase_2.12.2 >>>> >>>> loaded via a namespace (and not attached): >>>> [1] affyio_1.20.0 ? ? ? ? preprocessCore_1.14.0 RCurl_1.6-10.1 >>>> [4] tools_2.13.1 ? ? ? ? ?XML_3.4-2.2 >>>> >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>> >>> >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >> >
ADD REPLY

Login before adding your answer.

Traffic: 539 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6