Normalization large data set
@phungduongmy1416222-19921

Hello, I am trying to normalize the dataset "GSE68465" (462 samples) with gcRMA. My work requires gcRMA, so I cannot switch to another method. Because the number of samples is large, the analysis fails with an error saying that R cannot allocate a vector of that size. Here is my code:

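# Work in the directory that holds the CEL files
setwd("D:/justforR1/meta_lung/GSE68465_RAW")
# Legacy Bioconductor installer, as originally run
source("http://bioconductor.org/biocLite.R")
biocLite()
library(affy)
library(gcrma)
# Read every CEL file in the working directory and run gcRMA
eset.gcrma = justGCRMA()
# Extract the expression matrix (gcRMA values are on the log2 scale)
exprSet.nologs = exprs(eset.gcrma)
write.table(exprSet.nologs, file="Normalizationtest.gcrma.txt", quote=FALSE, sep="\t")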


What should I do in this situation? Thank you

@james-w-macdonald-5106

That's a lot of arrays. You could try calling justGCRMA() with optimize.by = "memory", which is a bit more memory efficient than the default (optimize.by = "speed"). But it looks like you are working on a Windows box, which probably doesn't have much memory (maybe 16 GB, which is a 'normal' amount of RAM for a desktop), in which case it probably won't help.
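For reference, the optimize.by suggestion could look something like the sketch below. It is only an illustration: `run_gcrma_low_memory` and `cel_dir` are made-up names, the check for CEL files is an added convenience, and whether the memory-optimized path is enough for 462 arrays depends on the RAM available.

```r
# Sketch only: `run_gcrma_low_memory` and `cel_dir` are hypothetical names.
run_gcrma_low_memory <- function(cel_dir) {
  # Fail early if the directory holds no CEL files (optionally gzipped)
  cel_files <- list.files(cel_dir, pattern = "\\.cel(\\.gz)?$",
                          ignore.case = TRUE)
  if (length(cel_files) == 0) {
    stop("No CEL files found in ", cel_dir)
  }
  # gcrma is a Bioconductor package and must already be installed
  if (!requireNamespace("gcrma", quietly = TRUE)) {
    stop("The Bioconductor package 'gcrma' is not installed")
  }
  # optimize.by = "memory" trades speed for a smaller memory footprint;
  # celfile.path tells justGCRMA where to look for the CEL files
  gcrma::justGCRMA(celfile.path = cel_dir, optimize.by = "memory")
}
```

Usage would be along the lines of `eset.gcrma <- run_gcrma_low_memory("D:/justforR1/meta_lung/GSE68465_RAW")`, after which `exprs(eset.gcrma)` works as in the original code.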

An alternative would be to use AWS to spin up a large enough instance to run GCRMA on that number of files, and then get your data back and analyze the summarized data on your desktop. That wouldn't cost much, but it does take a bit to figure out how to do it (I find most of Amazon's documentation, um, let's say obscure). There are some instructions here that you could use.
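Whichever machine ends up running GCRMA, the summarized data can be moved around as a tab-delimited text file, as in the original post. A minimal sketch of the round trip, assuming a plain numeric expression matrix; `save_expr` and `load_expr` are hypothetical helper names, not part of any package:

```r
# Hypothetical helpers for illustration only.

# On the large machine: write the summarized expression matrix to disk.
save_expr <- function(expr_mat, path) {
  # col.names = NA writes an empty header cell above the row names,
  # so the header and data rows have the same number of fields
  write.table(expr_mat, file = path, quote = FALSE, sep = "\t",
              col.names = NA)
}

# On the desktop: read the file back into a numeric matrix.
load_expr <- function(path) {
  # row.names = 1 uses the first column (probe set IDs) as row names;
  # check.names = FALSE keeps the sample names exactly as written
  as.matrix(read.delim(path, row.names = 1, check.names = FALSE))
}
```

For example, `save_expr(exprs(eset.gcrma), "Normalizationtest.gcrma.txt")` on the remote instance, then `load_expr("Normalizationtest.gcrma.txt")` after copying the file back. The summarized matrix is far smaller than the raw CEL files, so it should load comfortably on a desktop.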