Question: is it possible to find sample batch # in CEL files?
0
6.9 years ago by
Brian Tsai40
Brian Tsai40 wrote:
Hi, I've been downloading raw CEL files from the Gene Expression Omnibus and have been trying to process them. I'd like to account for batch effects when computing differential expression, but the authors didn't provide that information explicitly in their annotations. Is this information stored in the CEL files, and can it be retrieved through Bioconductor?
modified 4.6 years ago by suprun.maria0 • written 6.9 years ago by Brian Tsai40
Answer: is it possible to find sample batch # in CEL files?
0
6.9 years ago by
Sean Davis21k
United States
Sean Davis21k wrote:
Not a direct answer, but you might look at the sva package, which does not rely on externally defined batch effects.

Sean
Answer: is it possible to find sample batch # in CEL files?
0
6.9 years ago by
Guido Hooiveld2.5k
Wageningen University, Wageningen, the Netherlands
Guido Hooiveld2.5k wrote:
Hi,

Some time ago I came across these lines of code, which could be of help: http://bios.ucdenver.edu/images/a/a1/Affy_headerinfo.txt

I have never used it myself, though.

HTH,
Guido
Answer: is it possible to find sample batch # in CEL files?
0
6.9 years ago by
James F. Reid120
James F. Reid120 wrote:
Hi Brian,

You should be able to access the date the chip was scanned using the readCelHeader function provided in the affxparser package. Look for the 'datheader' entry.

James.
Hi,

I haven't tried this in a while, but as far as I can see the 'ReadAffy' function in the 'affy' package automatically populates the 'ScanDate' field in the resulting AffyBatch object, which you can access with syntax like

    protocolData(a)$ScanDate

where I have assumed that 'a' is an AffyBatch.

Best wishes,
Wolfgang

written 6.9 years ago by Wolfgang Huber13k
Answer: is it possible to find sample batch # in CEL files?
0
6.9 years ago by
Rob Dunne230
Rob Dunne230 wrote:
Hi Brian,

affxparser has a function called readCelHeader:

    library(affxparser)

    # 'files' is assumed to be a character vector of CEL file paths
    dates <- character(length(files))
    for (i in seq_along(files)) {
      datheader <- readCelHeader(files[i])$datheader
      # keep only the MM/DD/YY scan date from the DAT header string
      dates[i] <- gsub(".*([0-9]{2}/[0-9]{2}/[0-9]{2}).*", "\\1", datheader)
    }

Bye,
Rob
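Once the DAT header strings are in hand, the scan dates still have to be turned into batch labels. Here is a minimal base-R sketch of that step. The helper name `scan_date_from_datheader` and the example header strings are made up for illustration, and it assumes the header contains an MM/DD/YY date; headers written in another format would come back as NA.

```r
# Hypothetical helper (not part of affxparser): extract the first MM/DD/YY
# date from each DAT header string and return it as a Date.
scan_date_from_datheader <- function(datheader) {
  pattern <- "[0-9]{2}/[0-9]{2}/[0-9]{2}"
  hit <- regexpr(pattern, datheader)
  out <- rep(NA_character_, length(datheader))
  out[hit > 0] <- regmatches(datheader, hit)  # only matched elements are returned
  as.Date(out, format = "%m/%d/%y")
}

# Made-up header strings standing in for readCelHeader(f)$datheader
headers <- c(
  "[0..65534]  s1:CLS=4733 RWS=4733 XIN=3  YIN=3  VE=17  2.0 08/23/05 11:23:24  HG-U133A.1sq",
  "[0..65534]  s2:CLS=4733 RWS=4733 XIN=3  YIN=3  VE=17  2.0 08/23/05 12:01:10  HG-U133A.1sq",
  "[0..65534]  s3:CLS=4733 RWS=4733 XIN=3  YIN=3  VE=17  2.0 09/02/05 09:15:42  HG-U133A.1sq"
)

scan_dates <- scan_date_from_datheader(headers)

# One batch level per distinct scan day
batch <- factor(format(scan_dates))
```

Treating each scan day as one batch is itself an assumption; chips hybridized together may be scanned over several days, so it is worth checking the dates against whatever the authors do report.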
Answer: is it possible to find sample batch # in CEL files?
0
4.6 years ago by
United States
suprun.maria0 wrote:

We get the batch date using the following code:

    pData(protocolData(a)[sampleNames(a),])$ScanDate

Or this code to process all the samples:

    a$Batch <- sapply(pData(protocolData(a)[sampleNames(a),])$ScanDate, function(x){substr(x,1,10)})

If you use only protocolData(a)$ScanDate, the result may depend on how the samples are sorted, and some dates may be assigned to the wrong samples.
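The two points above (look dates up by sample name, then trim to the day) can be sketched in base R without an AffyBatch. The named vector `scan_date` stands in for the ScanDate column, and `parse_scan_date` is a hypothetical helper. As far as I know, ScanDate strings can appear either as ISO timestamps or as MM/DD/YY depending on the CEL file version, so the sketch tries both; treat that as an assumption and check your own data.

```r
# Hypothetical helper: convert ScanDate strings to Date, trying the ISO
# form (YYYY-MM-DD...) first and falling back to MM/DD/YY.
parse_scan_date <- function(x) {
  iso <- as.Date(substr(x, 1, 10), format = "%Y-%m-%d")
  us  <- as.Date(substr(x, 1, 8),  format = "%m/%d/%y")
  out <- iso
  out[is.na(iso)] <- us[is.na(iso)]
  out
}

# Made-up scan dates, deliberately stored in a different order than the samples
scan_date <- c(
  sampleB = "2010-03-12T14:22:05Z",
  sampleA = "08/23/05 11:23:24",
  sampleC = "2010-03-12T16:40:51Z"
)
samples <- c("sampleA", "sampleB", "sampleC")

# Index by sample name rather than by position, so the order cannot drift
parsed <- parse_scan_date(scan_date[samples])

batch <- factor(format(parsed))
names(batch) <- samples
```

Note that `substr(x, 1, 10)` on an MM/DD/YY string would also pick up the first digit of the hour, which is why the fallback trims to 8 characters.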