Question: Need help loading gene expression data into R using either ArrayExpress or GEOquery R packages
arbet003 • 10 wrote:
I am trying to load gene expression and phenotype data into R from a particular study that can be accessed at the GEO database (id: GSE72680) or at the ArrayExpress database (id: E-GEOD-72680). I have tried using both the GEOquery and ArrayExpress R packages to load this data into R but have been unsuccessful. I am wondering if anyone could help show me how to load the gene expression and phenotype data from this study into R. Thanks!
modified 10 months ago by Sean Davis ♦ 21k • written 10 months ago by arbet003 • 10


Looks like this submission contains no processed data from the submitter.
For GEOquery, I tried the following:
GSEMatrix = F,
and then tried following the advice here for converting the GSE files to an ExpressionSet, but that did not work. I also tried loading the SOFT files directly into R:
gdata <- getGEO(filename = 'GSE72680_family.soft.gz')
but then I do not know how to extract the gene expression and phenotype data. Again, the problem is that I do not know how to convert the GSE files to an ExpressionSet; the tutorial I linked above does not work.
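For the SOFT-file route, a minimal sketch of pulling per-sample metadata out of the parsed GSE object might look like the following. It assumes GEOquery is installed and that GSE72680_family.soft.gz sits in the working directory; `meta_to_pheno` is a hypothetical helper written for this illustration, not a GEOquery function. Since the submission reportedly contains no processed data, the per-sample tables may well be empty, so this is mainly useful for the phenotype side.

```r
# Hypothetical helper (not part of GEOquery): collapse a named list of
# per-sample metadata lists into a data frame with one row per sample.
# Fields with several values (e.g. multiple characteristics_ch1 entries)
# are pasted into a single "; "-separated string; missing fields become NA.
meta_to_pheno <- function(meta_list) {
  fields <- unique(unlist(lapply(meta_list, names)))
  pheno <- data.frame(row.names = names(meta_list))
  for (f in fields) {
    pheno[[f]] <- vapply(meta_list, function(m) {
      if (is.null(m[[f]])) NA_character_ else paste(m[[f]], collapse = "; ")
    }, character(1), USE.NAMES = FALSE)
  }
  pheno
}

# Assumption: the SOFT file from the question has already been downloaded.
# The guard lets the script run (and do nothing) when it is absent.
if (requireNamespace("GEOquery", quietly = TRUE) &&
    file.exists("GSE72680_family.soft.gz")) {
  gse  <- GEOquery::getGEO(filename = "GSE72680_family.soft.gz")
  gsms <- GEOquery::GSMList(gse)                 # list of GSM objects
  pheno <- meta_to_pheno(lapply(gsms, GEOquery::Meta))
  # Per-sample data table; may have zero rows if no processed values
  # were deposited with the submission.
  str(GEOquery::Table(gsms[[1]]))
}
```

The phenotype fields of interest usually sit in the `characteristics_ch1` column as "key: value" strings and would still need to be split into separate columns.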
For ArrayExpress, I tried the following:
data = ArrayExpress("E-GEOD-72680")
This doesn't work; it says no raw data is available. I then tried:
data = getAE("E-GEOD-72680", type = "processed")
cn = getcolproc(data)
show(cn)
but this also doesn't appear to work, i.e. the cn object doesn't contain anything, so I can't create the processed data using procset(data, cn[2]).
You'll need to update R and GEOquery. That SSL error is due to a change at NCBI that has been addressed in recent GEOquery versions. As for ArrayExpress, there is no "processed" data, so that approach won't work. See my answer below.