affybatch object too big


Noemi Andor ▴ 100

@noemi-andor-4128

Last seen 9.6 years ago

Hi,

I have a problem loading an AffyBatch object from CEL files. I have multiple CEL files and want to analyze them all together, but the data are too big to be loaded at once. I am interested only in a subset of genes within those CEL files. So I could load the CEL files into 3 or 4 AffyBatch objects, read the expression values of the genes of interest from each, and merge them again. But if I read them separately, the background correction will not be global, and I do not know how to read them without background correction:

    pd <- read.AnnotatedDataFrame("BI_samples_1.txt", header = TRUE, row.names = 1)
    a <- ReadAffy(filenames = rownames(pData(pd)), phenoData = pd, verbose = TRUE)
    eset <- rma(a)
    e206123_at <- exprs(eset["206123_at"])
    ...
    e210684_s_at <- exprs(eset["210684_s_at"])
    e1 <- rbind(e206123_at, ..., e210684_s_at)

If I do the same for BI_samples_2.txt, the corresponding e2 will not be comparable to e1, am I right?

I would be very grateful for a good solution to my problem.

Best regards,
Noemi
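On the comparability question: as far as I know, RMA's background correction is done array by array, so it does not depend on which other arrays are loaded. The steps that pool information across arrays are quantile normalization and the median-polish summarization. The sketch below uses a toy base-R quantile normalization to show that normalizing two batches separately gives different values than normalizing them together. The `quantile_normalize` helper and the simulated intensities are illustrative assumptions, not the affy implementation.

```r
# Toy quantile normalization in base R (illustrative only, NOT the affy code).
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  target <- rowMeans(apply(x, 2, sort))  # mean of the sorted columns
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

set.seed(1)
# Simulated log-intensities: 1000 probes x 6 arrays; arrays 4-6 are shifted by 1.
x <- matrix(rnorm(6000, mean = 8), nrow = 1000,
            dimnames = list(NULL, paste0("array", 1:6)))
x[, 4:6] <- x[, 4:6] + 1

# Normalize all six arrays together, then the two batches of three separately.
joint    <- quantile_normalize(x)
separate <- cbind(quantile_normalize(x[, 1:3]),
                  quantile_normalize(x[, 4:6]))

# Largest disagreement between the two approaches, and the batch offset
# that survives when the batches are normalized separately.
max_diff  <- max(abs(joint - separate))
batch_gap <- mean(separate[, 4:6]) - mean(separate[, 1:3])
```

In this simulation the joint normalization puts all six arrays on one common distribution, while the per-batch version leaves the offset between the two batches in place. That is the sense in which e1 and e2 from separate `rma()` runs would not be directly comparable.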

affy affy • 1.1k views

updated 13.9 years ago by Noe Andor ▴ 70 • written 13.9 years ago by Noemi Andor ▴ 100


James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 8 minutes ago

United States

Hi Noemi,

You don't mention your OS, which makes it harder to suggest a solution. Assuming you don't have access to a computer with more memory, the memory-bounded options are as follows.

The xps package from Bioconductor. It requires installation of ROOT, but the maintainer, Christian Stratowa, is very active and helpful on this list. You will need to go here first to get ROOT:

http://root.cern.ch/drupal/

Then a simple biocLite("xps") should get you started.

A non-BioC but still R-based choice is aroma.affymetrix. More info can be found here:

http://www.aroma-project.org/

If you are on Windows or Linux, you could use RMAexpress, which is standalone software for running RMA:

http://rmaexpress.bmbolstad.com/

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics

written 13.9 years ago by James W. MacDonald 65k


oligo may be of interest too (if the annotation package isn't available online, you'll need to build your own via pdInfoBuilder):

    library(oligo)
    library(ff)
    cels = list.celfiles()
    rawData = read.celfiles(cels)
    rmaData = rma(rawData)
    ## resFF is an ff object
    resFF = exprs(rmaData)
    resFF[1:10, 1:4]
    ## if you have enough RAM to store the exprs matrix
    resMatrix = resFF[]
    ## save in a tab-delimited file
    tmp = as.ffdf(resFF)
    write.table(tmp, file="results.csv", quote=FALSE, sep="\t")

The same applies to Exon and Gene ST arrays.

b

(In reply to James W. MacDonald's message of 14 June 2010, 21:49.)
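Once a full RMA expression matrix is available from any of the routes above, the probesets of interest can be pulled out in one indexing step rather than one `exprs()` call per probeset followed by `rbind()`. A minimal sketch, assuming the matrix has probeset IDs as row names. The matrix here is simulated and stands in for something like `exprs(eset)`; the sample names and the two extra probeset IDs are placeholders.

```r
# Simulated stand-in for an RMA expression matrix (probesets x samples).
# Values, sample names, and the extra probeset IDs are placeholders.
set.seed(42)
probesets <- c("206123_at", "210684_s_at", "1007_s_at", "1053_at")
expr <- matrix(rnorm(4 * 3, mean = 8), nrow = 4,
               dimnames = list(probesets, c("s1.CEL", "s2.CEL", "s3.CEL")))

# Probesets of interest; intersect() guards against IDs missing from the array.
wanted <- c("206123_at", "210684_s_at")
found  <- intersect(wanted, rownames(expr))

# drop = FALSE keeps a matrix even if only one probeset is found.
subset_expr <- expr[found, , drop = FALSE]
```

Indexing by row name keeps the rows in the order given in `wanted`, and `drop = FALSE` avoids the result collapsing to a plain vector when only one probeset matches.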

written 13.9 years ago by Benilton Carvalho ★ 4.3k
