Agi4x44PreProcess - Replicated genes
Neel Aluru ▴ 460
@neel-aluru-3760
Last seen 7.4 years ago
United States
Hello,

I am making progress in learning R but I must admit that I am really slow and without all your help I would have given up on this. I still have some recurring troubles with Agi4x44PreProcess. This time I am having issues with Replicated genes (genes.rpt.agi). It looks like I am missing something. I have posted the session info here and highlighted the problematic ones in "red". I really appreciate your help.

Thank you very much in advance,

Sincerely,
Neel

[R.app GUI 1.29 (5464) i386-apple-darwin8.11.1]

> source("http://bioconductor.org/biocLite.R")
> biocLite()
Using R version 2.9.2, biocinstall version 2.4.13.
Installing Bioconductor version 2.4 packages:
 [1] "affy" "affydata" "affyPLM" "annaffy" "annotate" "Biobase" "biomaRt"
 [8] "Biostrings" "DynDoc" "gcrma" "genefilter" "geneplotter" "hgu95av2.db" "limma"
[15] "marray" "multtest" "vsn" "xtable" "affyQCReport"
Please wait...
> library(org.Dr.eg.db)
Loading required package: AnnotationDbi
Loading required package: Biobase
> setwd("/Users/Neel/agilent")
> getwd()
[1] "/Users/Neel/agilent"
> library("Agi4x44PreProcess")
Loading required package: limma
Loading required package: annotate
Loading required package: genefilter
> targets=read.targets(infile="infile.txt")

Target File
      X     FileName  Treatment GErep
conta cont1 conta.txt control   1
contb cont2 contb.txt control   2
contc cont3 contc.txt control   3
contd cont4 contd.txt control   4
pcba  pcb1  pcba.txt  pcb       1
pcbb  pcb2  pcbb.txt  pcb       2
pcbc  pcb3  pcbc.txt  pcb       3
pcbd  pcb4  pcbd.txt  pcb       4

> aa=read.AgilentFE(targets, makePLOT=FALSE)
Read conta.txt
Read contb.txt
Read contc.txt
Read contd.txt
Read pcba.txt
Read pcbb.txt
Read pcbc.txt
Read pcbd.txt

RGList:
dd$R: 'gProcessedSignal'
dd$G: 'gMeanSignal'
dd$Rb: 'gBGMedianSignal'
dd$Gb: 'gBGUsed'

> aaNORM = BGandNorm(aa, BGmethod = "half", NORMmethod = "quantile", foreground = "MeanSignal", background = "BGMedianSignal", offset = 50, makePLOTpre = FALSE, makePLOTpost = FALSE)
Loading required package: vsn
BACKGROUND CORRECTION AND NORMALIZATION

foreground: MeanSignal
background: BGMedianSignal

BGmethod: half
NORMmethod: quantile
OUTPUT in log-2 scale
> CV.rep.probes(aa, "org.Dr.eg.db", foreground="MeanSignal", raw.data= TRUE, writeR=FALSE,targets)

------------------------------------------------------
Non-CTRL Replicated probes
foreground: MeanSignal
FILTERING BY ControlType FLAG
RAW DATA: PROBES AFTER ControlType FILTERING: 42990

------------------------------------------------------
REPLICATED NonCtrl Probes 21495
UNIQUE probes 21495
DISTRIBUTION OF REPLICATED NonControl Probes
reps
    1
21495
# REPLICATED (redundant) probeNames 21495
------------------------------------------------------
MEDIAN % CV
conta contb contc contd  pcba  pcbb  pcbc  pcbd
2.378 0.963 1.233 1.997 2.439 1.282 1.438 2.104

> genes.rpt.agi(aa, "org.Dr.eg", raw.data = TRUE, WRITE.html = FALSE, REPORT = FALSE)

GENE SETS: same genes interrogated by different probes
FILTERING BY ControlType FLAG
RAW DATA: PROBES AFTER ControlType FILTERING: 42990

INPUT DATA: RAW
CHIP: org.Dr.eg

PROBE SETS (NON-CTRL prob rep. x 10): 21495
Error in lookUp(PROBE_ID, annotation.package, "SYMBOL") :
  No keys provided

(Can anyone explain to me what keys means in R?)

> PROBE_ID = aa$ProbeUID$ProbeName
> GENE_ID = unlist(lookUp(PROBE_ID, "org.Dr.eg.db", "org.Dr.egACCNUM") )

Error in lookUp(PROBE_ID, "org.Dr.eg.db", "org.Dr.egACCNUM") :
  No keys provided

> head <- c("PROBE ID","org.Dr.egACCNUM","SYMBOL")
> ensembl.htmlpage(PROBE_ID,filename,"org.Dr.eg", title, table.head=head,table.center = TRUE)

Error in match.arg(annotation.package, c("hgug4112a.db", "mgug4122a.db", :
  'arg' should be one of “hgug4112a.db”, “mgug4122a.db”, “notAnnPack”

> ensembl.htmlpage(PROBE_ID,filename,"org.Dr.eg.db", title, table.head=head,table.center = TRUE)

Error in file(filename, "w") : cannot open the connection
In addition: Warning message:
In file(filename, "w") : cannot open file 'org.Dr.eg.db': Is a directory

(Do you think I should create annotation package to solve this?)
Neel Aluru
Postdoctoral Scholar
Biology Department
Woods Hole Oceanographic Institution
Woods Hole, MA 02543
USA
508-289-3607
Annotation GUI probe Agi4x44PreProcess • 1.0k views
Francois Pepin ★ 1.3k
@francois-pepin-1012
Last seen 9.6 years ago
Hi Neel,

There are a couple of issues here that I can see.

One is that you are not using the proper annotation package. org.Dr.eg.db is an organism package and does not contain the probe information expected by the functions you are calling. You will have to use the annotation package for the chip instead. I think you had this right the first time around; why did you change the annotation library?

In addition, the Agi4x44PreProcess package has a rather narrow scope, and many of its functions only work on the 4x44 Agilent human and mouse whole genome arrays (hgug4112a and mgug4122a). It would actually be easy, but possibly time-consuming, for the package authors to handle other chip types.

As a minor point, you are also calling internal functions, such as ensembl.htmlpage. This is generally not recommended, as they are less documented, are usually not as robust, and can change wildly between versions. Your call to ensembl.htmlpage also does not use the proper arguments, as the 3rd argument should be the file name, not the 2nd.

As it stands, you have 2 main options. The first would be to try to convince the maintainer of the Agi4x44PreProcess package to handle other chip types. The second is to use another package to do the quality control. arrayQualityMetrics contains a lot of the basic tools, and limma contains some useful functions as well.

Hope this helps,
Francois

On 11/16/2009 07:33 PM, Neel Aluru wrote:
> Hello,
>
> I am making progress in learning R but I must admit that I am really slow and without all your help I would have given up on this. I still have some recurring troubles with Agi4x44PreProcess. This time I am having issues with Replicated genes (genes.rpt.agi). It looks like I am missing something. I have posted the session info here and highlighted the problematic ones in "red". I really appreciate your help.
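On the question of what "keys" means: in the annotation packages a key is simply the identifier you look a value up by. A chip package is keyed by probe ID, whereas an organism package such as org.Dr.eg.db is keyed by Entrez Gene ID, so probe names will not match anything there. The sketch below uses only base R with made-up probe IDs and gene symbols. `toy_map`, `toy_lookup` and `fake_data` are hypothetical names, and the length check only imitates the kind of guard that produces a "No keys provided" message; it is not the annotate source.

```r
# Toy annotation map: probe ID (the key) -> gene symbol (the value).
# All IDs and symbols below are invented for illustration.
toy_map <- new.env()
assign("A_15_P100001", "geneA", envir = toy_map)
assign("A_15_P100002", "geneB", envir = toy_map)

# Hypothetical helper imitating a keyed lookup with an empty-input guard.
toy_lookup <- function(keys, map) {
  if (length(keys) < 1) {
    stop("No keys provided")
  }
  # mget() returns one list element per key; unknown keys become NA.
  mget(keys, envir = map, ifnotfound = NA)
}

# Keys that exist in the map return their values.
found <- toy_lookup(c("A_15_P100001", "A_15_P100002"), toy_map)
print(unlist(found))

# A key the map does not know about is not an error; it comes back as NA.
not_found <- toy_lookup("not_a_probe", toy_map)
print(not_found)

# Indexing a component that does not exist yields NULL, i.e. zero keys,
# and that is what trips the guard.
fake_data <- list(genes = data.frame(
  ProbeName = c("A_15_P100001", "A_15_P100002"),
  stringsAsFactors = FALSE
))
no_keys <- fake_data$ProbeUID$ProbeName  # NULL: there is no ProbeUID component
msg <- tryCatch(toy_lookup(no_keys, toy_map),
                error = function(e) conditionMessage(e))
print(msg)
```

If this is what is happening in the session above, it may be worth checking `length(PROBE_ID)` before the lookup; in a limma RGList the probe annotation usually sits under `aa$genes` (for example `aa$genes$ProbeName`), though that is an assumption about how this particular object was built.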