affybatch object too big
Noemi Andor ▴ 100
@noemi-andor-4128
Hi,

I have a problem loading an AffyBatch object from CEL files. I have many CEL files and want to analyze them all together, but the data are too big to be loaded at once. I am interested only in a subset of genes within those CEL files, so I could load the CEL files into 3 or 4 AffyBatch objects, read the expression values of the genes of interest from each, and then merge the results. But if I read them separately, the background correction will not be global, and I do not know how to read them without background correction:

pd = read.AnnotatedDataFrame("BI_samples_1.txt", header=TRUE, row.names=1)
a = ReadAffy(filenames = rownames(pData(pd)), phenoData = pd, verbose = TRUE)
eset <- rma(a)
e206123_at <- exprs(eset["206123_at"])
...
e210684_s_at <- exprs(eset["210684_s_at"])
e1 <- rbind(e206123_at, ..., e210684_s_at)

If I do the same for BI_samples_2.txt, the corresponding e2 will not be comparable to e1, am I right?

I would be very grateful for a good solution to my problem.

Best regards,
Noemi
@james-w-macdonald-5106
Hi Noemi,

You don't mention your OS, which makes it harder to make a specific suggestion. Assuming you don't have access to a computer with more memory, the memory-bounded solutions are as follows.

The xps package from Bioconductor. It requires installation of ROOT, but the maintainer, Christian Stratowa, is very active and helpful on this list. You will need to go here first to get ROOT:

http://root.cern.ch/drupal/

Then a simple biocLite("xps") should get you started.

A non-BioC but still R-based choice is aroma.affymetrix. More info can be found here:

http://www.aroma-project.org/

If you are on Windows or Linux, you could use RMAExpress, which is standalone software for running RMA:

http://rmaexpress.bmbolstad.com/

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
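As a rough illustration of why the per-batch approach in the question gives values that are not directly comparable: in RMA, the quantile normalization and the probe-level summarization use all arrays together, so running rma() on separate groups of CEL files normalizes each group to a different target distribution. The sketch below is base R only and is not the affy implementation; the function quantile_normalize and the simulated intensities are hypothetical stand-ins chosen for illustration.

```r
# Toy quantile normalization in base R (a simplified stand-in for the
# normalization step of RMA; this is NOT the code used by affy).
quantile_normalize <- function(x) {
  sorted <- apply(x, 2, sort)          # sort each array (column)
  target <- rowMeans(sorted)           # mean intensity at each rank
  # replace every value by the target value of its within-array rank
  normalized <- apply(x, 2, function(col) target[rank(col, ties.method = "first")])
  dimnames(normalized) <- dimnames(x)
  normalized
}

set.seed(1)
# Simulated log-intensities: 1000 probes x 6 arrays (hypothetical data).
# Arrays 4-6 are shifted upwards to mimic a second group of CEL files.
intens <- cbind(matrix(rnorm(3000, mean = 8), ncol = 3),
                matrix(rnorm(3000, mean = 9), ncol = 3))
colnames(intens) <- paste0("array", 1:6)

# All arrays normalized together
joint <- quantile_normalize(intens)

# Arrays normalized in two separate batches, then merged
batch1 <- quantile_normalize(intens[, 1:3])
batch2 <- quantile_normalize(intens[, 4:6])
per_batch <- cbind(batch1, batch2)

# After joint normalization every array has the same distribution;
# after per-batch normalization the two batches still differ.
print(range(colMeans(joint)))
print(range(colMeans(per_batch)))
```

In the joint case the column means coincide, while in the per-batch case the shift between the two groups survives normalization, which is the reason the tools listed above process all arrays in one run with bounded memory.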
and oligo may be of interest too (if the annotation package isn't online, you'll need to build your own via pdInfoBuilder)...

library(oligo)
library(ff)
cels = list.celfiles()
rawData = read.celfiles(cels)
rmaData = rma(rawData)
## resFF is a ff object
resFF = exprs(rmaData)
resFF[1:10, 1:4]
## if you have RAM to store the exprs matrix
resMatrix = resFF[]
## save in a tab-file
tmp = as.ffdf(resFF)
write.table(tmp, file="results.csv", quote=FALSE, sep="\t")

The same applies to Exon and Gene ST arrays.

b

On 14 June 2010 21:49, James W. MacDonald <jmacdon at med.umich.edu> wrote:
> Assuming you don't have access to a computer with more memory, the
> memory-bounded solutions are as follows.
> [...]
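Once an expression matrix for all arrays is available (for example resMatrix above, or exprs(eset) from affy), the probe sets of interest can be pulled out in one step by row name, instead of creating one variable per probe set and combining them with rbind. A minimal sketch, assuming the matrix has probe-set IDs as row names; the toy matrix expr, the sample names, and the two "hypothetical_" IDs are made up for illustration, while 206123_at and 210684_s_at are the IDs from the question.

```r
# Toy stand-in for the expression matrix (probe sets in rows, samples in columns).
set.seed(42)
probe_ids <- c("206123_at", "210684_s_at", "hypothetical_1_at", "hypothetical_2_at")
expr <- matrix(rnorm(12, mean = 8), nrow = 4,
               dimnames = list(probe_ids, paste0("sample", 1:3)))

# Probe sets of interest, taken from the question
genes_of_interest <- c("206123_at", "210684_s_at")

# Keep only IDs that are actually present, to avoid a subscript error
found <- intersect(genes_of_interest, rownames(expr))

# drop = FALSE keeps the result a matrix even if only one probe set is found
e1 <- expr[found, , drop = FALSE]
print(e1)
```

Because all samples come from a single normalized matrix, the rows of e1 are on a common scale across samples.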
