Question: Problem obtaining QC statistics from 'simpleaffy'
0
gravatar for Alvord, Greg DMS [Contr]
13.2 years ago by
Alvord, Greg DMS [Contr]160 wrote:
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060914/ b69996df/attachment.pl
• 442 views
ADD COMMENTlink modified 13.2 years ago by Jenny Drnevich2.2k • written 13.2 years ago by Alvord, Greg DMS [Contr]160
Answer: Problem obtaining QC statistics from 'simpleaffy'
0
gravatar for Jenny Drnevich
13.2 years ago by
Jenny Drnevich2.2k
Jenny Drnevich2.2k wrote:
See Crispin Miller's post to the list on August 2, 2006: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060802/64c53 b71/attachment.pl Looking through the code of qc.affy (what qc() actually uses), it wouldn't be that hard to modify it to get most of what you want. The output is: return(new("QCStats", scale.factors = sfs, target = target, percent.present = dpv, average.background = meanbg, minimum.background = minbg, maximum.background = maxbg, spikes = spike.vals, qc.probes = qc.probe.vals, bioBCalls = biobcalls)) You can get everything* except spikes, qc.probes and bioBCalls for any chip type, though you'll have to supply the cutoff alpha1 and alpha2 values when dpv is created; mas5calls() uses a generic 0.04 and 0.06 instead of chip specific values. *There doesn't appear to be any output on ratios in this function. Seeing as there will always be new chip types, perhaps when the chip-specific parameters are missing, instead of stopping with the error message, it would just skip those qc metrics and give a warning that some of them weren't able to be computed? At the very least, having the documentation for qc() say that not all chips types are supported might help to reduce the number of postings on this issue. I might volunteer to hack the qc.affy code, but I'm not sure if it would break other functions that use qc... I know there's a plot method in the QCReport package... Jenny At 02:03 PM 9/14/2006, Alvord, Greg \(DMS\) [Contr] wrote: >Dear List: > > > >We have a question about obtaining quality control statistics from >library 'simpleaffy'. > > > >We are working on differential gene expression analyses for six files >using the soybean genome (3 Hawaii/Resistant vs. 3 Taiwan/Susceptible). >We successfully read in the .CEL files and obtained the following >AffyBatch object, which contains 61,170 affyids. > > > > > soy.ab > >AffyBatch object > >size of arrays=1164x1164 features (63516 kb) > >cdf=Soybean (61170 affyids) > >number of samples=6 > >number of genes=61170 > >annotation=soybean > > > >As we are interested only in the 'Glycine max' subset which corresponds >to the geneNames for the soybean genome, we followed procedures >(generously supplied by Dr. Jenny Drnevich) and extracted those genes of >interest. This results in the following updated AffyBatch object of the >same name, soy.ab, which contains only (the correct) 37,744 affyids. > > > > > soy.ab > >AffyBatch object > >size of arrays=1164x1164 features (63516 kb) > >cdf=Soybean (37744 affyids) > >number of samples=6 > >number of genes=37744 > >annotation=soybean > > > >We then attempted to load the library 'simpleaffy' and obtain an object >named 'soy.ab.qc' with the following commands: > > > > > library('simpleaffy') > >Loading required package: genefilter > >Loading required package: survival > >Loading required package: splines > > > >Attaching package: 'simpleaffy' > > > > > > The following object(s) are masked _by_ .GlobalEnv : > > > > getBioC > > > > > >Then, in attempting to construct the soy.ab.qc object, we obtained: > > > > > soy.ab.qc <- qc(soy.ab) > >Background correcting > >Retrieving data from AffyBatch...done. > >Computing expression calls... > >......done. > >scaling to a TGT of 100 ... > >Scale factor for: hr.a5.24 0.410934556631521 > >Scale factor for: hr.b5.24 0.236518276485241 > >Scale factor for: hr.c5.24 0.265672825818162 > >Scale factor for: ts.a6.24 1.96443561512830 > >Scale factor for: ts.b6.24 0.320822770974345 > >Scale factor for: ts.c6.24 0.334603048637387 > >Error in qc.affy(unnormalised, ...) : I'm sorry, I do not know about >chip type: soybeancdf > > > >We received the error message above, indicating that qc.affy does not >recognize chip type soybeancdf. Further queries show that the soy.ab.qc >object does not exist at this point. > > > >Can anyone help us to successfully obtain the soy.ab.qc object such that >avbg(...), sfs(...), percent.present(...), ratios(...), etc. may be >used? > > > >Thank you very much in advance. > > > > Greg > > > > > > > >W. Gregory Alvord, Ph.D. > >Director, Statistical Consulting Services > >Computer and Statistical Services > >National Cancer Institute at Frederick > >Post Office Box B > >Frederick, MD 21702-1201 > >Phone: 301.846.5101 > >Facsimile: 301.846.6196 > >E-Mail gwa at css.ncifcrf.gov <mailto:gwa at="" css.ncifcrf.gov=""> > > > > > > > [[alternative HTML version deleted]] > >_______________________________________________ >Bioconductor mailing list >Bioconductor at stat.math.ethz.ch >https://stat.ethz.ch/mailman/listinfo/bioconductor >Search the archives: >http://news.gmane.org/gmane.science.biology.informatics.conductor Jenny Drnevich, Ph.D. Functional Genomics Bioinformatics Specialist W.M. Keck Center for Comparative and Functional Genomics Roy J. Carver Biotechnology Center University of Illinois, Urbana-Champaign 330 ERML 1201 W. Gregory Dr. Urbana, IL 61801 USA ph: 217-244-7355 fax: 217-265-5066 e-mail: drnevich at uiuc.edu
ADD COMMENTlink written 13.2 years ago by Jenny Drnevich2.2k
Hi Greg, I had a similar problem when trying to run the affy qc stuff on barley. After some digging I found that the necessary entries were not included for the barley chip in the packages data files "qc.probes.tab", "alpha.tab", and "spikes.tab" which are located in "/usr/local/lib/R/site-library/simpleaffy/data/" on my Linux system. Adding entries for the barley chip did NOT fix the problem however, as apparently this data isn't read from the data files once the plugin is compiled. To fix this problem, I re-wrote two functions which basically override the original QCReport(), and qc.affy() functions. I'm sure you could modify them for your own needs as well. I stored both functions in a file called "qc_barley.R" (shown below). To use them I run: >source("qc_barley.R") >QC_barley_Report(abatch) It is sort of a quick-hack, and from Crispin Miller's post (mentioned above) it looks like better things are on their way, but until then, I hope this helps. Sam Hunter BCB - University of Idaho Contents of my qc_barley.R: #NOTE: qc.probenames, spike.probenames and bb, may not contain the #correct probe names... but this will run require("affyQCReport") qc.barley <- #> qc.affy function (unnormalised, normalised = NULL, tau = 0.015, logged = TRUE, cdfn = cleancdfname(cdfName(unnormalised))) { print(encodeString("1:normalizing")) if (is.null(normalised)) { getAllSpikeProbes("hgu133acdf") normalised <- call.exprs(unnormalised, "mas5") } print(encodeString("2:normalizing done")) x <- exprs(normalised) det <- detection.p.val(unnormalised, tau = tau, alpha1 = 0.05, alpha2 = 0.065) print(encodeString("3:detection.p.val")) dpv <- apply(det$call, 2, function(x) { x[x != "P"] <- 0 x[x == "P"] <- 1 x <- as.numeric(x) return(100 * sum(x)/length(x)) }) print(encodeString("4:apply(det$call)")) sfs <- normalised at description@preprocessing$sfs target <- normalised at description@preprocessing$tgt if (!logged) { x <- log2(x) } print(encodeString("5:stats")) bgsts <- .bg.stats(unnormalised)$zonebg meanbg <- apply(bgsts, 1, mean) minbg <- apply(bgsts, 1, min) maxbg <- apply(bgsts, 1, max) stdvbg <- sqrt(apply(bgsts, 1, var)) qc.probenames <- c("AFFX-BioB-3_at", "AFFX-BioB-M_at", "AFFX-BioB-5_at", "AFFX-TrpnX-3_at", "AFFX-TrpnX-M_at", "AFFX-TrpnX-5_at") print(encodeString("6:end stats")) qc.probe.vals <- rbind(c(), (sapply(qc.probenames, function(y) { x[y, ] }))) rownames(qc.probe.vals) <- colnames(x) spike.probenames <- c("AFFX-r2-Ec-bioB-3_at", "AFFX-r2-Ec-bioC-3_at", "AFFX-r2-Ec-bioD-3_at", "AFFX-r2-P1-cre-3_at") spike.vals <- rbind(c(), (sapply(spike.probenames, function(y) { x[y, ] }))) rownames(spike.vals) <- colnames(x) #bb <- getBioB(cdfn) bb <- c("AFFX-r2-Ec-bioB-3_at") if (!is.na(bb)) { biobcalls <- det$call[bb, ] } else { biobcalls <- NULL } return(new("QCStats", scale.factors = sfs, target = target, percent.present = dpv, average.background = meanbg, minimum.background = minbg, maximum.background = maxbg, spikes = spike.vals, qc.probes = qc.probe.vals, bioBCalls = biobcalls)) } QC_barley_Report <- function (object, file = "AffyQCReport.pdf", ...) { pdf(file = file, width = 8, height = 11, onefile = TRUE) plot.window(c(1, 1), c(0, 1)) plot.new() titlePage(object) signalDist(object) plot(qc.barley(object)) borderQC1(object) borderQC2(object) correlationPlot(object) dev.off() return(TRUE) }
ADD REPLYlink written 13.2 years ago by Bio Sam60
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 463 users visited in the last hour