unable to set sampleNames after combine (from beadarray package) on ExpressionSetIllumina


Adaikalavan Ramasamy ▴ 220

@adaikalavan-ramasamy-5765

Last seen 9.5 years ago

United Kingdom

Dear all,

Here is a possible bug in the combine() function from beadarray. I read two ExpressionSetIllumina objects with 36 and 12 samples, respectively. The combine() function works brilliantly, without errors or warnings, but I get an error message when I try to change the sample names.

    library(beadarray)

    ## create fake data to read in ##
    tmp <- data.frame( PROBE_ID = paste0("P", 1:10), SYMBOL = LETTERS[1:10],
                       S1.AVG_Signal = rnorm(10, mean=7),
                       S2.AVG_Signal = rnorm(10, mean=8),
                       S3.AVG_Signal = rnorm(10, mean=6) )
    write.table(tmp, file="SampleProbeProfile_1.txt", sep="\t", quote=FALSE, row.names=FALSE)
    rm(tmp)

    tmp <- data.frame( PROBE_ID = paste0("P", 1:10), SYMBOL = LETTERS[1:10],
                       S4.AVG_Signal = rnorm(10, mean=9),
                       S5.AVG_Signal = rnorm(10, mean=6) )
    write.table(tmp, file="SampleProbeProfile_2.txt", sep="\t", quote=FALSE, row.names=FALSE)
    rm(tmp)

    ## Read in and combine ##
    raw1 <- readBeadSummaryData(dataFile="SampleProbeProfile_1.txt", ProbeID="PROBE_ID",
                                columns=list(exprs="AVG_Signal"), skip=0)
    raw2 <- readBeadSummaryData(dataFile="SampleProbeProfile_2.txt", ProbeID="PROBE_ID",
                                columns=list(exprs="AVG_Signal"), skip=0)

    raw <- combine(raw1, raw2)   # no warnings or error

    dim(raw1)
    # Features  Samples Channels
    #       10        3        1

    dim(raw2)
    # Features  Samples Channels
    #       10        2        1

    dim(raw)
    # Features  Samples Channels
    #       10        5        1

raw1, raw2 and raw are all of ExpressionSetIllumina class.

And here is the problem:

    sampleNames(raw) <- paste0("Sample", 1:5)
    # Error in `sampleNames<-`(`*tmp*`, value = c("Sample1", "Sample2", "Sample3", :
    #   number of new names (5) should equal number of rows in AnnotatedDataFrame (3)

Alternatively, I could change the rownames of raw1 and raw2 separately and then combine, but I am just curious as to why this error message appears. Thank you.

Regards, Adai

beadarray • 1.5k views

updated 9.6 years ago by Martin Morgan 25k • written 9.6 years ago by Adaikalavan Ramasamy ▴ 220


Martin Morgan 25k

@martin-morgan-1513

Last seen 5 days ago

United States

Hi Adai --

On 09/09/2014 03:52 AM, Adaikalavan Ramasamy wrote:
> And here is the problem:
>
> sampleNames(raw) <- paste0("Sample", 1:5)
> # Error in `sampleNames<-`(`*tmp*`, value = c("Sample1", "Sample2", "Sample3", :
> #   number of new names (5) should equal number of rows in AnnotatedDataFrame (3)

The problem is that beadarray does not 'combine' the 'protocolData' slot and does not check that the resulting object is valid:

    > validObject(raw)
    Error in validObject(raw) :
      invalid class "ExpressionSetIllumina" object: 1: sample numbers differ between phenoData and protocolData
    invalid class "ExpressionSetIllumina" object: 2: sampleNames differ between phenoData and protocolData

A work-around is to update the protocolData slot yourself, until the package maintainer (cc'd) has a chance to fix the problem.

    > protocolData(raw) <- combine(protocolData(raw1), protocolData(raw2))
    > validObject(raw)
    [1] TRUE
    > sampleNames(raw) <- 1:5

Thanks for the nice reproducible example.

Martin

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
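The work-around above can be wrapped in a small helper so that the validity check is never skipped. This is only a sketch: `combine_checked` is a hypothetical name (it is not part of beadarray or Biobase), and it assumes both inputs are eSet-derived objects for which `combine()`, `protocolData()` and `validObject()` behave as shown in this thread.

```r
library(Biobase)  # attaches combine(), protocolData(), sampleNames()

# Hypothetical helper, not part of any package: combine two eSet-like
# objects and repair the protocolData slot if the result is invalid.
combine_checked <- function(x, y) {
  out <- combine(x, y)
  # validObject(test = TRUE) returns TRUE, or a character vector of problems
  if (!isTRUE(validObject(out, test = TRUE))) {
    # Same repair as in the reply above: combine the protocolData slots
    protocolData(out) <- combine(protocolData(x), protocolData(y))
    validObject(out)  # raises an error if the object is still invalid
  }
  out
}

# Usage with the objects from the question (requires beadarray):
# raw <- combine_checked(raw1, raw2)
# sampleNames(raw) <- paste0("Sample", 1:5)
```

On objects whose `combine()` method already handles protocolData, the helper should simply return the combined object unchanged.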

9.6 years ago Martin Morgan 25k


That's great. Thanks for sending the solution.

9.6 years ago Adaikalavan Ramasamy ▴ 220


Thanks for the easily reproduced bug report, Adai, and the neat solution, Martin. This has been fixed in beadarray version 2.15.4.

Mike

--
Mike Smith
Research Associate
Statistics & Computational Biology Laboratory
Cambridge University
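Since the fix is tied to a specific release, a script that has to run on older installations could check the installed version before deciding whether the protocolData work-around is needed. A minimal sketch, assuming only the version number stated above; `needs_protocol_fix` is a hypothetical helper name, not a beadarray function.

```r
# Hypothetical helper: TRUE when the given beadarray version predates
# 2.15.4, the release in which the combine() fix was reported.
needs_protocol_fix <- function(version) {
  # Accept either a string such as "2.15.3" or a package_version object
  if (is.character(version)) version <- package_version(version)
  version < package_version("2.15.4")
}

# Usage (requires beadarray to be installed):
# if (needs_protocol_fix(packageVersion("beadarray"))) {
#   protocolData(raw) <- combine(protocolData(raw1), protocolData(raw2))
# }
```

Running `validObject(raw)` after combining remains a reasonable sanity check regardless of the installed version.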

9.6 years ago Mike Smith ★ 6.5k
