Question: Agi4x44PreProcess /filtering probenames from GeneName
8.6 years ago, Maria Raeder wrote:
Dear Mailing List,

I have been struggling for some time with some Agilent single-channel arrays. I believe they were scanned with an earlier version of AFE, because they do not contain the columns Sequence and chr coord. I have nevertheless tried to use the Agi4x44PreProcess package, with some adjustments (please see below).

My main problem now is that I cannot remove the Agilent probe names that are embedded in the gene symbol column for some genes. The reason for doing this is to prepare files for GSEA analysis. The function for doing this in the Agi4x44PreProcess package, gsea.files, does not work, probably because of the columns I am lacking, and filter.probes also returns an error message, probably for the same reason.

I would be very grateful for any comments and help.

Thanks,
Maria

Here is the code:

    library("Agi4x44PreProcess")
    library("hgug4112a.db")
    library("vsn")
    library("convert")
    library("GO.db")

    setwd("/mydirectory")

    # reading targets file
    targets = read.targets(infile = "targets_ec3.txt")
    targets[1:10, 1:5]
    names(targets)
    # Many (has skipped them, but included FileName, Treatment and GErep)

    # read in files with limma:
    dd <- read.maimages(targets$FileName, source = "agilent",
        columns = list(G = "gMedianSignal", Gb = "gBGUsed",
                       R = "gProcessedSignal", Rb = "gBGMedianSignal"),
        annotation = c("Row", "Col", "FeatureNum", "ControlType", "ProbeName",
                       "ProbeUID", "GeneName", "SystematicName", "Description",
                       "gIsWellAboveBG", "gIsFound", "gIsSaturated",
                       "gIsFeatPopnOL", "gIsFeatNonUnifOL"))
    # reads in 146 arrays

    ########## Quality control (skipped)

    ########## Background correction, normalization and log2 transformation:
    library(vsn)
    ddNORM = BGandNorm(dd, BGmethod = "half", NORMmethod = "quantile",
        foreground = "MeanSignal", background = "BGMedianSignal", offset = 50,
        makePLOTpre = FALSE, makePLOTpost = FALSE)

    # filtering:
    ddFILT = filter.probes(ddNORM, control = TRUE, wellaboveBG = TRUE,
        isfound = TRUE, wellaboveNEG = TRUE, sat = TRUE, PopnOL = TRUE,
        NonUnifOL = TRUE, nas = TRUE, limWellAbove = 75, limISF = 75,
        limNEG = 75, limSAT = 75, limPopnOL = 75, limNonUnifOL = 75,
        limNAS = 100, makePLOT = TRUE, annotation.package = "hgug4112a.db",
        flag.counts = FALSE, targets)

The last call prints:

    FILTERING PROBES BY FLAGS
    FILTERING BY ControlType FLAG
    Error in data.frame(PROBE_ID, as.character(probe.chr), as.character(probe.seq), :
      arguments imply differing number of rows: 43376, 0
Answer: Agi4x44PreProcess /filtering probenames from GeneName
8.6 years ago, Wolfgang Huber (EMBL European Molecular Biology Laboratory) wrote:
Dear Maria,

I am not sure I understood your question. In any case, would the 'strsplit' function of R perhaps help you? It lets you split strings and then extract components. For example, the idiom

    sapply(strsplit(x, ","), "[", 2)

will extract the text between the first and second comma in each string within x.

Best wishes,
Wolfgang

(In reply to Maria Raeder's message of Mar/18/11, 2:28 PM.)

--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
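To make the strsplit idiom above concrete, here is a minimal sketch in base R. The input strings, the gene names (GENE1, GENE2), and the assumption that unwanted entries look like Agilent probe IDs of the form A_<digits>_P<digits> are all hypothetical and used for illustration only; the real GeneName column may be laid out differently, so the separator and the pattern would need adjusting.

```r
# Toy input: the comma-separated layout is an assumption, not the real file format.
x <- c("A_23_P000001,GENE1,desc1",
       "A_23_P000002,GENE2,desc2",
       "A_24_P000003")              # no comma: nothing to extract

# Split each string on commas, then take the second piece of every result.
parts  <- strsplit(x, ",", fixed = TRUE)
second <- sapply(parts, "[", 2)     # NA where a string has fewer than two pieces

# Alternative (also an assumption): treat any GeneName that looks like a probe ID,
# or that simply repeats the ProbeName, as "no gene symbol" and drop it.
gene_name  <- c("GENE1", "A_24_P000003", "GENE2")
probe_name <- c("A_23_P000001", "A_24_P000003", "A_23_P000002")
is_probe_id <- grepl("^A_[0-9]+_P[0-9]+$", gene_name) | gene_name == probe_name
kept <- gene_name[!is_probe_id]
```

If the probe names really do sit in the GeneName column unchanged, the logical vector is_probe_id could also be used to subset the rows of the expression data before writing the files for GSEA.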