Question: Problem retrieving data from NCBI GEO, and RNA-Seq data analysis
11 months ago by
hkarakurt10 wrote:

Hello, I am new to RNA-Seq data analysis and I want to analyze a data set, for example to find differentially expressed genes. My data set is from NCBI GEO, under accession GSE80336. Link is here:

I see that the data are not normalized. I used the getGEO() command, and when I then called exprs() on the result I got this message:

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘exprs’ for signature ‘"list"’

What is the real problem here? How can I get the non-normalized expression matrix?

I also wanted the raw data, but there is no RAW data file in the supplementary section.

How can I get the raw data, normalize it, and find significantly changed genes? I really need help, I am stuck.

Should I start with the SRA files?

Also, there is a file called "Counts.txt" in the supplementary files. What is this file, and can I use it?

Thank you.

11 months ago by
Aaron Lun wrote:
  1. Download the "Counts.txt" supplementary file, which contains... counts, unsurprisingly enough.
  2. Apply DE analysis workflows like
11 months ago by
Sean Davis wrote:

Echoing the answer by Aaron, here is the code to download all supplemental files for a given GEO accession.

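library(GEOquery)
# download all supplemental files into a directory named after the accession
sfiles = getGEOSuppFiles('GSE80336')
# the row names of the result are the paths of the downloaded files
fnames = rownames(sfiles)
# there is only one supplemental file
b2 = read.delim(fnames[1], header=TRUE)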
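
One way to take the downloaded counts through a differential expression analysis is sketched below with edgeR's quasi-likelihood pipeline. This is an illustration under assumptions, not necessarily the workflow Aaron had in mind: it assumes the first column of the table holds gene identifiers and every remaining column holds per-sample counts, and that you supply the group labels yourself (for example from the sample annotation on the GEO page). The function names split_counts and run_edger are made up here, and edgeR must be installed for run_edger to work.

```r
# Hypothetical helper: turn the table read by read.delim() into a numeric
# count matrix. ASSUMES column 1 = gene IDs, remaining columns = samples.
split_counts <- function(tab) {
  counts <- as.matrix(tab[, -1, drop = FALSE])
  storage.mode(counts) <- "numeric"
  rownames(counts) <- as.character(tab[[1]])
  counts
}

# Hypothetical wrapper around an edgeR quasi-likelihood analysis.
# `group` holds one label per sample, in the same order as the columns.
run_edger <- function(counts, group) {
  group <- factor(group)
  stopifnot(length(group) == ncol(counts))
  y <- edgeR::DGEList(counts = counts, group = group)
  keep <- edgeR::filterByExpr(y)            # drop very lowly expressed genes
  y <- y[keep, , keep.lib.sizes = FALSE]
  y <- edgeR::calcNormFactors(y)            # TMM normalization factors
  design <- stats::model.matrix(~ group)
  y <- edgeR::estimateDisp(y, design)
  fit <- edgeR::glmQLFit(y, design)
  res <- edgeR::glmQLFTest(fit, coef = 2)   # second group level vs. first
  edgeR::topTags(res, n = Inf)              # all genes, sorted by p-value
}
```

With the table read above, this would be called as run_edger(split_counts(b2), group), after checking head(b2) to confirm that the column layout matches the assumption.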

Yes, there is only one supplementary file, and it is the counts file. Most data sets have RAW data files among the supplementary files, but this one is different. I will probably use the counts data for my analyses.

Thank you for the answer.

11 months ago by hkarakurt10
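
On the exprs() error from the question: for a GSE accession, getGEO() returns a list of ExpressionSet objects (typically one per platform), so exprs() has to be applied to an element of that list rather than to the list itself. A minimal sketch, assuming the GEOquery and Biobase packages are installed; the helper name first_expression_matrix is made up for illustration. For RNA-Seq series the resulting matrix is often empty, because the counts are deposited as supplementary files instead.

```r
# Hypothetical helper: fetch a GEO series and return the expression matrix
# of its first ExpressionSet.
first_expression_matrix <- function(accession) {
  # For a GSE accession, getGEO() returns a *list* of ExpressionSet objects.
  gse_list <- GEOquery::getGEO(accession, GSEMatrix = TRUE)
  # exprs() has a method for ExpressionSet but not for list,
  # so pick one element before calling it.
  eset <- gse_list[[1]]
  Biobase::exprs(eset)
}

# Usage (needs network access):
# mat <- first_expression_matrix("GSE80336")
# dim(mat)
```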
