Question: Agi4x44PreProcess - filter.probes function
Neel Aluru (United States) wrote, 8.5 years ago:
Dear Pedro and Bioc Users,

I posted this question a couple of weeks ago and didn't hear from anyone. In the meantime I tried a couple of different things to get the filter.probes function working. One was to check whether all the column names in my data match the ones used in the filter.probes function; I didn't see anything missing. One thing I noticed is that my files (Agilent Feature Extraction output files) have PROBE_UID instead of PROBE_ID. I tried changing it to see whether that would help, and it still does not work. All the other functions work perfectly, and I just want to filter the probes so that all the controls are removed before doing statistical analysis. Any help will be greatly appreciated.

Also, the GSEA enrichment analysis function generates a file with the extension ".gct". Does anyone know how this information can be interpreted?

I used Feature Extraction software version 9.1, and the arrays are Agilent 4x44 zebrafish arrays.

Thank you,
Neel

R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
[R.app GUI 1.34 (5589) x86_64-apple-darwin9.8.0]

> library("Agi4x44PreProcess")
> targets=read.targets(infile="targets2.txt")

Target File
            FileName        Treatment GErep
103-AHRR-a1 103-AHRR-a1.txt     AHRRa     1
103-AHRR-a2 103-AHRR-a2.txt     AHRRa     1
103-AHRR-b1 103-AHRR-b1.txt     AHRRb     2
103-AHRR-b2 103-AHRR-b2.txt     AHRRb     2
102-CONT-1  102-CONT-1.txt       CONT     3
102-CONT-2  102-CONT-2.txt       CONT     3

> dd2=read.AgilentFE(targets, makePLOT=FALSE)
Read 103-AHRR-a1.txt
Read 103-AHRR-a2.txt
Read 103-AHRR-b1.txt
Read 103-AHRR-b2.txt
Read 102-CONT-1.txt
Read 102-CONT-2.txt

RGList:
dd$R: 'gProcessedSignal'
dd$G: 'gMeanSignal'
dd$Rb: 'gBGMedianSignal'
dd$Gb: 'gBGUsed'

> dim(dd2)
[1] 44407 6
> names(dd2)
[1] "R" "G" "Rb" "Gb" "targets" "genes" "other"

> CV.rep.probes(dd2,"zf.db",foreground="MeanSignal", raw.data=TRUE,writeR=FALSE,targets)
------------------------------------------------------
Non-CTRL Replicated probes
foreground: MeanSignal
FILTERING BY ControlType FLAG
RAW DATA: PROBES AFTER ControlType FILTERING: 42990
------------------------------------------------------
REPLICATED NonCtrl Probes 21495
UNIQUE probes 21495
DISTRIBUTION OF REPLICATED NonControl Probes
reps
    1
21495
# REPLICATED (redundant) probeNames 21495
------------------------------------------------------
MEDIAN % CV
103-AHRR-a1 103-AHRR-a2 103-AHRR-b1 103-AHRR-b2 102-CONT-1 102-CONT-2
      2.477       1.279       1.454       2.157      1.689      1.342
------------------------------------------------------

> genes.rpt.agi(dd2,"zf.db",raw.data=TRUE,WRITE.html=FALSE,REPORT=FALSE)
GENE SETS: same genes interrogated by different probes
FILTERING BY ControlType FLAG
RAW DATA: PROBES AFTER ControlType FILTERING: 42990
INPUT DATA: RAW
CHIP: zf.db
PROBE SETS (NON-CTRL prob rep. x 10): 21495
GEN-SETS (REPLICATED GENES): 2281
PROBES in gen-sets: 5012

> ddNORM=BGandNorm(dd2, BGmethod="half",NORMmethod="quantile",foreground="MeanSignal",background="BGMedianSignal",offset=50, makePLOTpre=FALSE, makePLOTpost=FALSE)
BACKGROUND CORRECTION AND NORMALIZATION
foreground: MeanSignal
background: BGMedianSignal
BGmethod: half
NORMmethod: quantile
OUTPUT in log-2 scale
------------------------------------------------------

> ddFILT=filter.probes(ddNORM, control=TRUE,wellaboveBG=TRUE, isfound=TRUE,wellaboveNEG=TRUE,sat=TRUE,PopnOL=TRUE,NonUnifOL=T, nas=TRUE,limWellAbove=75,limISF=75,limNEG=75,limPopnOL=75,limNonUnifOL=75, limNAS=100,makePLOT=F,annotation.package="zf.db",flag.counts=T, targets=targets)
FILTERING PROBES BY FLAGS
FILTERING BY ControlType FLAG
Error in data.frame(PROBE_ID, as.character(probe.chr), as.character(probe.seq), :
  arguments imply differing number of rows: 42990, 0

I did the remaining analysis, and it all worked well.

> summarize.probe(dd,makePLOT=TRUE,targets)
SUMMARIZATION OF non-CTRL PROBES
SUMMARIZED DATA: 21555 6
------------------------------------------------------
Hit <return> to see next plot:
Error in plot.new() : attempt to plot on null device

> ddPROC=summarize.probe(dd,makePLOT=TRUE, targets)
SUMMARIZATION OF non-CTRL PROBES
SUMMARIZED DATA: 21555 6
------------------------------------------------------
Hit <return> to see next plot:
Hit <return> to see next plot:

> eset.PROC=build.eset(ddPROC,targets,makePLOT=TRUE,annotation.package="zf.db")
> dim(eset.PROC)
Features  Samples
   21555        6
> write.eset(eset.PROC, ddPROC, "zf.db",targets)
> mappings=build.mappings(eset.PROC,annotation.package="zf.db")
The mapping process takes a while ...

> gsea.files(eset.PROC, targets, annotation.package="zf.db")
GSEA OUTPUT FILES
DataSet.gct and Phenotypes.cls
1 AHRRa-AHRRb unique gene symbols: 8428 samples: 4
2 AHRRa-CONT unique gene symbols: 8428 samples: 4
3 AHRRb-CONT unique gene symbols: 8428 samples: 4

Neel Aluru
Postdoctoral Scholar
Biology Department
Woods Hole Oceanographic Institution
Woods Hole, MA 02543 USA
508-289-3607
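
The filter.probes error above ("arguments imply differing number of rows: 42990, 0") is the message base R's data.frame() gives when one of its inputs has length zero, for example when a column looked up by name does not exist and the lookup returns NULL. The sketch below reproduces that class of error on a toy table and shows control probes being dropped by hand with the ControlType flag. The table, probe names, and the `Sequence` lookup are hypothetical stand-ins, and whether a missing column is what actually happens inside filter.probes for these files is an assumption that the session does not confirm.

```r
# Toy stand-in for the per-probe annotation of an RGList (hypothetical
# data, not the poster's arrays). In Agilent Feature Extraction output,
# ControlType 0 marks non-control features.
genes <- data.frame(
  ProbeName   = c("probe_A", "probe_B", "ctrl_pos_1", "probe_C"),
  ControlType = c(0, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Manually keep only non-control rows.
keep     <- genes$ControlType == 0
non_ctrl <- genes[keep, , drop = FALSE]

# Looking up a column that is absent returns NULL, i.e. length 0.
probe_seq <- non_ctrl$Sequence   # no such column in this toy table

# Combining a length-3 vector with a length-0 vector fails in the same
# way as the call reported in the session.
err <- tryCatch(
  data.frame(PROBE_ID = non_ctrl$ProbeName, SEQ = as.character(probe_seq)),
  error = function(e) conditionMessage(e)
)
print(err)
```

If the same pattern applies to the real data, comparing the column names of the `genes` component of the normalized object against the names the function expects would show which column comes back empty.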