Question

RMA/QuantileNormalization results difference between oligo and aroma.affymetrix for Hugene


Mikhail Pachkov ▴ 110


Thank you! For further questions I will move to the aroma.affymetrix mailing list.

Best regards,
Mikhail


ADD COMMENT • link 14.2 years ago Mikhail Pachkov ▴ 110

score 0 · Answer 1 · 2010-03-01

On Fri, Feb 26, 2010 at 6:40 PM, Benilton Carvalho <beniltoncarvalho at gmail.com> wrote:
> If you're using the latest oligo, you can use the (not exposed - still
> working on further details) function getFidProbeset():
>
>     probeInfo = oligo:::getFidProbeset(rawdata)
>     idx = probeInfo[["fid"]]
>     ## probeset names are in probeInfo[["fsetid"]]
>     intensities = exprs(rawdata)[idx,]
>
> and work with 'intensities' (which includes the PMs and controls).

Thank you! It works!

> If you rather use only PMs:
>
>     pms = pm(rawdata)
>     pns = probeNames(rawdata)
>
> these are now regular matrices and you can use
> rma.background.correct() and normalize.quantiles().

Here I have got a problem. probeNames() does not work with my dataset.

    pms = pm(rawdata)
    pns = probeNames(rawdata)
    Error in sqliteExecStatement(con, statement, bind.data) :
      RS-DBI driver: (error in statement: no such column: man_fsetid)

Best regards,
Mikhail
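For readers unfamiliar with the normalization step mentioned above, here is a minimal base-R sketch of the idea behind quantile normalization on a plain matrix. It is an illustration only and not the preprocessCore implementation: the function name quantile_normalize, the toy matrix, and the tie-breaking rule (ties.method = "first") are all assumptions made for this example, and normalize.quantiles() may treat ties and missing values differently.

```r
# Toy quantile normalization in base R (illustration only; this is NOT
# the preprocessCore code and may differ from it for ties and NAs).
quantile_normalize <- function(x) {
  # Sort each column independently, then average across columns at each rank
  sorted <- apply(x, 2, sort)
  rank_means <- rowMeans(sorted)
  # Replace every value by the mean for its rank (ties: first occurrence wins)
  ranks <- apply(x, 2, rank, ties.method = "first")
  normalized <- apply(ranks, 2, function(r) rank_means[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# Hypothetical 4-probe x 3-sample intensity matrix (values are invented)
toy <- matrix(c(5, 2, 3, 4,
                4, 1, 4, 2,
                3, 4, 6, 8),
              nrow = 4,
              dimnames = list(paste0("probe", 1:4), paste0("sample", 1:3)))

qn <- quantile_normalize(toy)
print(qn)
```

After this step every column holds the same set of values, only arranged according to each sample's original ranking, which is the property quantile normalization is meant to enforce.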

score 0 · Answer 2 · 2010-03-01


Mikhail Pachkov ▴ 110


On Mon, Mar 1, 2010 at 11:51 AM, Benilton Carvalho <beniltoncarvalho at gmail.com> wrote:
> can you give me the results for sessionInfo()?

    sessionInfo()
    R version 2.10.1 (2009-12-14)
    x86_64-unknown-linux-gnu

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] pd.hugene.1.0.st.v1_3.0.0 RSQLite_0.8-3
    [3] DBI_0.2-5                 oligo_1.10.2
    [5] preprocessCore_1.8.0      oligoClasses_1.8.0
    [7] Biobase_2.6.1

    loaded via a namespace (and not attached):
    [1] affxparser_1.18.0  affyio_1.14.0      Biostrings_2.14.12 IRanges_1.4.11
    [5] splines_2.10.1     tools_2.10.1

ADD COMMENT • link 14.2 years ago Mikhail Pachkov ▴ 110


my bad... that's a mod that never made it to the code... in the meantime, can you please run the following bit once (which will set probeNames and after which everything will be fine):

    setMethod("probeNames", "GeneFeatureSet",
              function(object, subset=NULL){
                res <- dbGetQuery(db(object), "SELECT fsetid FROM pmfeature")[[1]]
                as.character(res)
              })

b

ADD REPLY • link 14.2 years ago Benilton Carvalho ★ 4.3k
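The workaround above relies on R's S4 system letting you register a method for a generic in the running session. A minimal sketch of that mechanism follows, using only the base methods package. ToyFeatureSet and toyProbeNames are hypothetical names invented for this illustration; they are not part of oligo, and the real GeneFeatureSet class is far richer than this stand-in.

```r
library(methods)

# Hypothetical stand-in class; the real GeneFeatureSet lives in oligoClasses
setClass("ToyFeatureSet", representation(ids = "integer"))

# Hypothetical generic playing the role that probeNames plays in oligo
setGeneric("toyProbeNames",
           function(object, ...) standardGeneric("toyProbeNames"))

# Registering a method in the current session, as the workaround does
setMethod("toyProbeNames", "ToyFeatureSet",
          function(object, ...) {
            # Return the ids as character, mirroring as.character(res) above
            as.character(object@ids)
          })

x <- new("ToyFeatureSet", ids = c(3L, 1L, 2L))
toy_names <- toyProbeNames(x)
print(toy_names)
```

A method registered this way lasts only for the current R session, which is why the snippet in the reply has to be run once per session until a fixed package version is installed.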


Bug fixed in oligo 1.10.3, which should appear online in about 24 hours.

b

ADD REPLY • link 14.2 years ago Benilton Carvalho ★ 4.3k


On Mon, Mar 1, 2010 at 12:06 PM, Benilton Carvalho <beniltoncarvalho at gmail.com> wrote:
> my bad... that's a mod that never made it to the code... in the
> meantime, can you please run the following bit once (which will set
> probeNames and after which everything will be fine):
>
>     setMethod("probeNames", "GeneFeatureSet",
>               function(object, subset=NULL){
>                 res <- dbGetQuery(db(object), "SELECT fsetid FROM pmfeature")[[1]]
>                 as.character(res)
>               })

That works. Thank you! However, I need both fsetid and fid, so I tried to modify the method a little:

    setMethod("probeNames", "GeneFeatureSet",
              function(object, subset=NULL){
                res <- dbGetQuery(db(object), "SELECT fid,fsetid FROM pmfeature")
                paste(res[,1],res[,2])
              })

Could you tell me if that is the correct way to get fid, fsetid pairs for the PMs?

Thank you in advance.

Best regards,
Mikhail

ADD REPLY • link 14.2 years ago Mikhail Pachkov ▴ 110


Don't change the probeNames method. Use probeNames() to get the fsetid (as character) and pmindex() to get the fid.

b

ADD REPLY • link 14.2 years ago Benilton Carvalho ★ 4.3k
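One way to read this advice: keep the two identifiers as separate parallel vectors and combine them only where a single label is needed, instead of overloading probeNames(). The sketch below uses invented stand-in vectors in place of the real return values of probeNames(rawdata) and pmindex(rawdata), and it assumes both accessors return their values in the same probe order, which is worth verifying on real data.

```r
# Stand-ins for probeNames(rawdata) and pmindex(rawdata); all values are
# invented, and the same-order assumption must be checked on real data.
fsetid <- c("100", "100", "200", "300")  # probeset id per PM probe (character)
fid    <- c(11L, 12L, 27L, 35L)          # feature id per PM probe (integer)

# Parallel vectors must have the same length to be paired safely
stopifnot(length(fsetid) == length(fid))

# Keep the pair as columns of an annotation table ...
pm_annot <- data.frame(fid = fid, fsetid = fsetid, stringsAsFactors = FALSE)

# ... and build combined labels only where a single string is required
labels <- paste(pm_annot$fsetid, pm_annot$fid, sep = ":")

# Hypothetical PM intensity matrix with one row per PM probe
pms <- matrix(seq_len(8), nrow = 4)
rownames(pms) <- labels
print(pms)
```

Keeping fid as its own integer column also makes it easy to sort or join on later, which a pasted string does not.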


Dear Benilton,

I have a problem obtaining probe indices along with probe names. My script:

    library(oligo);
    workingDir = getwd();
    celfiles<-list.files(path=workingDir,pattern=".CEL$|.cel$");
    rawdata=read.celfiles(celfiles);

    pms = pm(rawdata)
    rmadata=rma.background.correct(pms)
    qndata=normalize.quantiles(log2(rmadata))

    res <- dbGetQuery(db(rawdata), "SELECT fsetid,atom,fid FROM pmfeature")
    pid=paste(res[,1],res[,2],res[,3],sep=":")
    rownames(qndata)<-pid

    colnames(qndata)<-sampleNames(rawdata)

However, during analysis of the data it looked as if the probe names had been assigned incorrectly. I tried using pmindex() to extract the "fid" of the PM probes, which seems to be a list of numbers sorted in ascending order. I do the following:

    pnames=probeNames(rawdata)
    length(pnames)
    [1] 818005

    pmidx=pmindex(rawdata)
    length(pmidx)
    [1] 818005

    # first value in probe names
    pnames[1]
    [1] "7896737"

    # first value in pm indices
    pmidx[1]
    [1] 1056

If I check the pgf file for the probe with index "1056", it belongs to probeset "7981328", not "7896737" as given in pnames.

My question: how can I obtain probeset-probe_id pairs in the correct order for annotating the expression values in the "pms" matrix?

Best regards,
Mikhail

ADD REPLY • link 14.2 years ago Mikhail Pachkov ▴ 110


what's the array you're looking at?

sessionInfo()?

thanks,
b

ADD REPLY • link 14.2 years ago Benilton Carvalho ★ 4.3k


Dear Mikhail,

I was able to reproduce the issue you reported. The probeNames() method in 1.10.3 is missing a sort by fid.

    setMethod("probeNames", "GeneFeatureSet",
              function(object, subset=NULL){
                res <- dbGetQuery(db(object), "SELECT fsetid FROM pmfeature ORDER BY fid")[[1]]
                as.character(res)
              })

I'll get this fixed now.

b

ADD REPLY • link 14.2 years ago Benilton Carvalho ★ 4.3k


btw, the following is faster:

    setMethod("probeNames", "GeneFeatureSet",
              function(object, subset=NULL){
                res <- dbGetQuery(db(object), "SELECT fid, fsetid FROM pmfeature ORDER BY fid")
                idx <- order(res[["fid"]])
                as.character(res[idx, "fsetid"])
              })

b

ADD REPLY • link 14.2 years ago Benilton Carvalho ★ 4.3k


My copy/paste skills need to be improved. Apologies.

    setMethod("probeNames", "GeneFeatureSet",
              function(object, subset=NULL){
                res <- dbGetQuery(db(object), "SELECT fid, fsetid FROM pmfeature")
                idx <- order(res[["fid"]])
                as.character(res[idx, "fsetid"])
              })

ADD REPLY • link 14.2 years ago Benilton Carvalho ★ 4.3k
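The ordering problem in this thread can be reproduced without any array data. The sketch below uses a small, entirely hypothetical pmfeature table whose rows are stored out of fid order, and it assumes (as Mikhail observed for pmindex()) that the PM rows follow ascending fid. Table contents and variable names are invented for illustration.

```r
# Hypothetical pmfeature rows, stored in an order that is NOT sorted by fid
pmfeature <- data.frame(
  fid    = c(40L, 10L, 30L, 20L),
  fsetid = c("setD", "setA", "setC", "setB"),
  stringsAsFactors = FALSE
)

# Assumption taken from the thread: PM rows follow ascending fid
pm_fids <- sort(pmfeature$fid)

# Labels taken in storage order do not line up with pm_fids
labels_unsorted <- pmfeature$fsetid

# Reordering by fid (what the corrected method does with order()) aligns them
labels_sorted <- pmfeature$fsetid[order(pmfeature$fid)]

# match() gives the same answer without relying on any row order
labels_matched <- pmfeature$fsetid[match(pm_fids, pmfeature$fid)]

print(data.frame(fid = pm_fids,
                 unsorted = labels_unsorted,
                 sorted = labels_sorted))
```

The match() variant is a useful cross-check on real data: if it ever disagrees with the labels a method returns, the rows and the labels are out of step.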