subset in XPS

Zhibin Lu ▴ 80

@zhibin-lu-2882

Last seen 9.7 years ago

Hi, I am new to R/Bioconductor. I am using the xps package to analyze Affymetrix Gene ST 1.0 data. After loading the CEL files into a DataTreeSet and computing the expression levels with RMA, can I work on a subset of the data? Say I have 12 samples. After RMA, can I work on just 6 of them, divide them into two groups, and apply UniFilter to only those 6? Thanks, Zhibin

xps xps • 934 views

ADD COMMENT • link updated 15.9 years ago by cstrato ★ 3.9k • written 15.9 years ago by Zhibin Lu ▴ 80

cstrato ★ 3.9k

@cstrato-908

Last seen 5.6 years ago

Austria

Dear Zhibin

Since you have already done RMA, you now have an ExprTreeSet, called e.g. "data.rma". You can see its structure with:
> str(data.rma)

Since there is currently no direct way to use only a subset of an ExprTreeSet, you can create a new ExprTreeSet object in the following way:

1. Make a subset of slot "data", which is a data frame (assuming that you want to use samples 1,2,3,7,8,9):
   > subdata <- exprs(data.rma)
   > subdata <- subdata[,c(1:2,3:5, 9:11)]
   Please note that it is important to keep the first two columns.

2. Create a copy "sub.rma" of the object "data.rma":
   > sub.rma <- data.rma

3. Replace slot "data" with "subdata":
   > exprs(sub.rma) <- subdata

For the moment you need to replace slots "treenames" and "numtrees", too. I will change this in the future so that it is done automatically.

4. Replace slot "treenames" with the names of your subset:
   a, create a list containing the subsamples:
   > subtrees <- unlist(treeNames(data.rma))
   > subtrees <- as.list(subtrees[c(1:3,7:9)])
   b, check if the names are correct:
   > subtrees
   c, replace slot "treenames":
   > sub.rma@treenames <- subtrees

5. Replace slot "numtrees" with the number of subsamples:
   > sub.rma@numtrees <- length(subtrees)

6. Check if the new ExprTreeSet is correct:
   > str(sub.rma)

Now you can use the new ExprTreeSet "sub.rma" as input for method unifilter:
> rma.ufr <- unifilter(sub.rma, .......)

If you want to take advantage of the advanced capabilities of package "limma", then you can create a Biobase class "ExpressionSet" containing only your 6 samples, as described in Appendix A.3 of the vignette xps.pdf:

1. Extract the normalized expression data:
   > subdata <- validData(data.rma)

2. Since "subdata" is a data frame, simply create a subframe:
   > subdata <- subdata[,c(1:3,7:9)]

3. Create a Biobase class "ExpressionSet", called "subset":
   > subset <- new("ExpressionSet", exprs = as.matrix(subdata))

Now you have an ExpressionSet ready for use with "limma".

Please let me know if you succeeded with this info.

Best regards
Christian
_._._._._._._._._._._._._._._._
C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a A.u.s.t.r.i.a
e.m.a.i.l: cstrato at aon.at
_._._._._._._._._._._._._._._._
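The column arithmetic in step 1 is easy to get wrong, so here is a minimal base-R sketch of it. It does not use xps at all: the toy data frame, its column names and the variable `n_annot` are assumptions made for illustration, standing in for whatever `exprs(data.rma)` returns in a real session.

```r
# Toy stand-in for the data frame returned by exprs(data.rma): two leading
# annotation columns followed by one column per sample (12 samples here).
# All column names are made up for this illustration.
set.seed(1)
n_units <- 5
expr <- data.frame(
  UNIT_ID  = seq_len(n_units),
  UnitName = paste0("probeset_", seq_len(n_units)),
  matrix(rnorm(n_units * 12), nrow = n_units,
         dimnames = list(NULL, paste0("Sample", 1:12)))
)

n_annot <- 2L             # leading annotation columns that must be kept
samples <- c(1:3, 7:9)    # samples of interest

# Sample i sits in column i + n_annot, so samples 1:3 and 7:9 become
# columns 3:5 and 9:11, i.e. the c(1:2, 3:5, 9:11) used in the answer.
keep    <- c(seq_len(n_annot), samples + n_annot)
subdata <- expr[, keep]

colnames(subdata)
```

Computing the indices from the sample numbers, rather than typing the shifted ranges by hand, makes it harder to drop the two leading columns by accident.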

ADD COMMENT • link 15.9 years ago cstrato ★ 3.9k

Dear Christian,

Thanks so much for such a detailed explanation. I will try this when I get to work next week, and I do not see why I could not follow the directions.

Thanks again and have a nice weekend,

Zhibin

> Date: Sat, 28 Jun 2008 15:46:26 +0200
> From: cstrato@aon.at
> To: zhbluweb@hotmail.com
> CC: bioconductor@stat.math.ethz.ch
> Subject: Re: [BioC] subset in XPS

ADD REPLY • link 15.9 years ago Zhibin Lu ▴ 80

Dear Zhibin

Meanwhile, I have uploaded a new version to BioC devel:
http://bioconductor.org/packages/2.3/bioc/html/xps.html
which simplifies your request as follows:

1. Get the expression values:
   > value <- exprs(data.rma)

2. Select the treenames of your choice (no extension necessary):
   > treenames <- c("TestA2", "TestB1")

3. Make a copy of your object if you do not want to replace it:
   > sub.rma <- data.rma

4. Replace slot "data" with the subset:
   > exprs(sub.rma, treenames) <- value

5. Check if the new ExprTreeSet is correct:
   > str(sub.rma)

Best regards
Christian
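To illustrate what selecting by tree name "without extension" amounts to, here is a rough base-R sketch on a toy data frame. It is not how xps implements the replacement method: the helper `select_by_treename` is hypothetical, and the `.mdp_LEVEL` suffix on the sample columns is an assumption made for the example.

```r
# Toy expression table; the ".mdp_LEVEL" suffix on the sample columns is an
# assumption made for this illustration.
value <- data.frame(
  UNIT_ID  = 1:3,
  UnitName = c("a", "b", "c"),
  TestA1.mdp_LEVEL = c(7.1, 8.2, 6.3),
  TestA2.mdp_LEVEL = c(7.0, 8.4, 6.1),
  TestB1.mdp_LEVEL = c(9.5, 8.1, 6.0),
  TestB2.mdp_LEVEL = c(9.7, 8.0, 6.2)
)

# Hypothetical helper: keep the leading annotation columns plus every sample
# column whose name, with everything after the first dot removed, appears in
# `treenames`. Unknown names raise an error instead of being silently dropped.
select_by_treename <- function(df, treenames, n_annot = 2L) {
  sample_cols <- seq_len(ncol(df))[-seq_len(n_annot)]
  base_names  <- sub("\\..*$", "", colnames(df)[sample_cols])
  missing     <- setdiff(treenames, base_names)
  if (length(missing) > 0) {
    stop("unknown tree name(s): ", paste(missing, collapse = ", "))
  }
  df[, c(seq_len(n_annot), sample_cols[base_names %in% treenames])]
}

sub_value <- select_by_treename(value, c("TestA2", "TestB1"))
colnames(sub_value)
```

Selecting by name instead of by position means the subset stays correct even if the column order of the expression table changes.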

ADD REPLY • link 15.8 years ago cstrato ★ 3.9k
