combineAffyBatch for HG-U133A and HG-U133Av2 GeneChips
@kellie-j-archer-phd-644
I used the combineAffyBatch function in the matchprobes library to merge data from HG-U133A and HG-U133Av2 GeneChips, which seemed to work well. The difference between the two chips (with respect to the probes interrogated) seems to be very minor; most notably, the control probe sets for ribosomal RNAs were on the HG-U133A chips but are not on the HG-U133Av2 chip. However, after applying the combineAffyBatch function, the resulting $dat includes 5 of these rRNA probes. Additionally, intensities are reported for these probes on the version 2 chips. For other probe sets, I compared a sample of their PM/MM intensities against those in the GCOS probe tiling view, and they appear to be accurate.

My question is, how can I create an AffyBatch object that omits these 5 PM/MMs so I can apply rma() to the merged dataset? I am running R 1.9.1.

This is the code I used:

### Old chips ###
library(affy)
library(hgu133acdf)
hgu133a<-ReadAffy(filenames=filenames1)

### New chips ###
library(hgu133a2cdf)
hgu133a.2<-ReadAffy(filenames=filenames2)

### Merge both sets
library(matchprobes)
both<-combineAffyBatch(list(hgu133a,hgu133a.2),c("hgu133aprobe","hgu133a2probe"),"newhgu133",verbose=TRUE)
newhgu133<-both$cdf

### Check if rRNA probes omitted
pn<-names(unlist(indexProbes(both$dat,"pm")))
pn[grep("AFFX-r2-H",pn)]
[1] "AFFX-r2-Hs18SrRNA-3_s_at1" "AFFX-r2-Hs18SrRNA-3_s_at2"
[3] "AFFX-r2-Hs18SrRNA-5_at"    "AFFX-r2-Hs18SrRNA-M_x_at1"
[5] "AFFX-r2-Hs18SrRNA-M_x_at2"

Best,
Kellie J. Archer, Ph.D.
Assistant Professor, Department of Biostatistics
Virginia Commonwealth University
1101 East Marshall St. B1-066
Richmond, VA 23298-0032
phone: (804) 827-2039
fax: (804) 828-8900
e-mail: kjarcher@vcu.edu
website: www.people.vcu.edu/~kjarcher
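Editor's sketch: one way to see which probe set names exist for one chip type but not the other is to compare the names stored in the two CDF environments. The snippet below uses two toy environments as stand-ins for hgu133acdf and hgu133a2cdf, so it runs without the annotation packages. The variable names old_cdf, new_cdf and only_old are hypothetical, and the stored index values are dummies.

```r
# Toy stand-ins for the real CDF environments (hgu133acdf, hgu133a2cdf).
# The stored values are placeholders, not real probe indices.
old_cdf <- new.env()
new_cdf <- new.env()
assign("AFFX-r2-Hs18SrRNA-5_at", 1:3, envir = old_cdf)
assign("1007_s_at", 4:6, envir = old_cdf)
assign("1007_s_at", 4:6, envir = new_cdf)

# Probe set names defined for the old chip but missing from the new one
only_old <- setdiff(ls(old_cdf), ls(new_cdf))
print(only_old)
```

With the real packages loaded, the same setdiff() call on ls(hgu133acdf) and ls(hgu133a2cdf) should list the probe sets that are unique to the older chip, assuming both objects are ordinary environments keyed by probe set name.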
probe matchprobes
@wolfgang-huber-3550
Hi Kellie,

I am sorry - I think as one of the authors of matchprobes I should be able to answer your question, but it seems I can't fully parse it.

I am not sure whether this is the answer, but you can remove unwanted probe sets from the CDF environment that resulted from combineAffyBatch.

What exactly do you want to do? Is RMA failing, and where/what with?

Best wishes
Wolfgang

-------------------------------------
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Http: www.dkfz.de/abt0840/whuber
@wolfgang-huber-3550
Hi Kellie,

> I would like to remove unwanted probeset from the resulting $dat
> object prior to applying rma or any other expression summary method.

I think it should not make much of a difference whether you remove these probesets before or after RMA, since the R in RMA stands for "Robust".

> How would I remove these from the affybatch object prior to rma?

You can remove probesets with code like this:

comb <- both$cdf
gn <- ls(comb)
rm(grep("AFFX-r2-H", gn, value=TRUE), envir=comb)

I haven't tested this, but I hope you get the idea.

> One other question is that I have been trying to ascertain why these
> probes were retained by combineAffyBatch in the first place. How can
> one investigate that most efficiently?

You can copy the code for the function combineAffyBatch from the corresponding file in the "package source" tar.gz file, and then add various checkpoints and plots to see what is going on.

Hope this helps.

Best wishes
Wolfgang

Kellie J. Archer, Ph.D. wrote:
> Thanks Wolfgang,
>
> I would like to remove unwanted probeset from the resulting $dat
> object prior to applying rma or any other expression summary method.
> So far, I have been able to apply rma to the resulting $dat object
> then remove unwanted probe sets from it using
>
> both.rma<-rma(both$dat)
> gn<-geneNames(both.rma)
> both.rma.2<-both.rma[-grep("AFFX-r2-H",gn)]
>
> One other question is that I have been trying to ascertain why these
> probes were retained by combineAffyBatch in the first place. How can
> one investigate that most efficiently?
>
> Thanks again,
> Kellie Archer
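Editor's sketch: the post-hoc filter quoted above, both.rma[-grep("AFFX-r2-H",gn)], works when the pattern matches at least one name, but negative indexing with grep() has a general R pitfall: if nothing matches, grep() returns integer(0) and the subset comes back empty. The snippet below illustrates this on a toy matrix (not an exprSet or AffyBatch) and shows a logical filter with grepl() as one alternative. The names expr, keep, filtered and empty are hypothetical.

```r
# Toy expression matrix; row names stand in for probe set IDs.
expr <- matrix(1:6, nrow = 3,
               dimnames = list(c("AFFX-r2-Hs18SrRNA-5_at", "1007_s_at", "1053_at"),
                               c("chip1", "chip2")))
gn <- rownames(expr)

# Logical filter: keeps every row whose name does not match the pattern
keep <- !grepl("AFFX-r2-H", gn)
filtered <- expr[keep, , drop = FALSE]

# Pitfall with -grep(): when nothing matches, grep() returns integer(0),
# and indexing with -integer(0) selects zero rows rather than all rows.
no_match <- grep("NOT-PRESENT", gn)
empty <- expr[-no_match, , drop = FALSE]

print(rownames(filtered))
print(nrow(empty))
```

Whether the same logical indexing applies unchanged to the object returned by rma() depends on that class's subsetting method, so treat this as an illustration of the indexing behaviour only.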
@david-lee-duewer-941
>> I would like to remove unwanted probeset from the resulting $dat
>> object prior to applying rma or any other expression summary method.
>
> I think it should not make much of a difference whether you remove these
> probesets before or after RMA, since the R in RMA stands for "Robust".

Umm, no matter how nominally "robust" a method, you are always better off explicitly getting rid of KNOWN unwanted signal (i.e., noise) rather than relying on the methodology ignoring it. Even medians can be badly perturbed by the inclusion of relatively small proportions (10-15%) of asymmetric (all high or all low) "non-informative signal".

D

David Lee Duewer
Research Chemometrician
National Institute of Standards and Technology
100 Bureau Drive Stop 8390
Gaithersburg, MD 20899-8390 USA
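Editor's sketch: a small made-up numeric example of the point about medians. Eighty-five well-behaved values are mixed with fifteen values that are all far too high (15% one-sided contamination). The numbers are invented for illustration and are not microarray intensities.

```r
clean <- 1:85                     # 85 well-behaved values
noise <- rep(1000, 15)            # 15 values, all far too high
contaminated <- c(clean, noise)   # 15 of the 100 values are noise

med_clean <- median(clean)
med_contaminated <- median(contaminated)
mean_clean <- mean(clean)
mean_contaminated <- mean(contaminated)

print(c(med_clean, med_contaminated))
print(c(mean_clean, mean_contaminated))
```

The median moves from 43 to 50.5 while the mean moves from 43 to 186.55. The median is far less affected than the mean, yet it still shifts, because one-sided contamination pushes the middle of the ordered values toward the contaminated side. How much such a shift matters in practice depends on the data.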
@wolfgang-huber-3550
Hi Kellie,

> Thanks Wolfgang. I recognize the median polish applied per probe set
> will be unaffected, but I did not want these probe sets to be included
> in the quantile normalization. I also may want to eliminate all
> control probes prior to normalization and expression summaries. I
> tried the code below and received the following message:
>
>> rm(grep("AFFX-r2-H", gn, value=TRUE), envir=comb)

Try

rm(list=grep("AFFX-r2-H", gn, value=TRUE), envir=comb)

but please also read the manual pages and the documentation of the affy package on how AffyBatches are internally structured, and do some hands-on experimentation. You're doing quite a bit of untested under-the-hood manipulation here, and you want to know what you are doing.

David:

> Umm, no matter how nominally "robust" a method, you are always better
> off explicitly getting rid of KNOWN unwanted signal (i.e., noise) rather
> than relying on the methodology ignoring it. Even medians can be badly
> perturbed by the inclusion of relatively small proportions (10-15%) of
> asymmetric (all high or all low) "non-informative signal".

Sure, you are right. It's good to see that people are following these threads and are watchful :).

Best wishes
Wolfgang
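Editor's sketch: the difference between the two rm() calls can be tried on a toy environment. rm() accepts object names in its first arguments only as bare names or literal strings, so a character vector computed by grep() has to be passed through the list argument. The environment cdf_env below is a hypothetical stand-in for both$cdf, with dummy contents. The sketch also shows that assigning an environment to a second name does not copy it, so removals made through comb are visible through the original name as well.

```r
# Toy environment standing in for the CDF environment both$cdf.
cdf_env <- new.env()
for (nm in c("AFFX-r2-Hs18SrRNA-5_at", "AFFX-r2-Hs18SrRNA-3_s_at", "1007_s_at")) {
  assign(nm, 1:2, envir = cdf_env)   # dummy index values
}

comb <- cdf_env                      # same environment, not a copy
gn <- ls(comb)
unwanted <- grep("AFFX-r2-H", gn, value = TRUE)

# A computed character vector of names must go through the list argument
rm(list = unwanted, envir = comb)

remaining <- ls(cdf_env)             # the original name sees the removal too
print(remaining)
```

This only demonstrates the environment mechanics. Whether rma() then runs cleanly on the merged AffyBatch still needs to be checked on the real data, as the reply above advises.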