I used the combineAffyBatch function in the matchprobes package to
merge data from HG-U133A and HG-U133Av2 GeneChips, which seemed to
work well. The difference between the two chips (with respect to the
probes interrogated) seems to be very minor; most notably, the control
probe sets for ribosomal RNAs are on the HG-U133A chips but not on the
HG-U133Av2 chip. However, after applying the combineAffyBatch
function, the resulting $dat includes 5 of these rRNA probes, and
intensities are reported for them on the version 2 chips as well. For
other probe sets, I compared a sample of their PM/MM intensities
against those in the GCOS probe tiling view, and they appear to be
accurate. My question is: how can I create an AffyBatch object that
omits these 5 PM/MM pairs so I can apply rma() to the merged dataset?
I am running R 1.9.1.
This is the code I used:
### Old chips ###
library(affy)
library(hgu133acdf)
hgu133a<-ReadAffy(filenames=filenames1)
### New chips ###
library(hgu133a2cdf)
hgu133a.2<-ReadAffy(filenames=filenames2)
### Merge both sets
library(matchprobes)
both<-combineAffyBatch(list(hgu133a,hgu133a.2),c("hgu133aprobe","hgu133a2probe"),"newhgu133",verbose=TRUE)
newhgu133<-both$cdf
###Check if rRNA probes omitted
pn<-names(unlist(indexProbes(both$dat,"pm")))
pn[grep("AFFX-r2-H",pn)]
[1] "AFFX-r2-Hs18SrRNA-3_s_at1" "AFFX-r2-Hs18SrRNA-3_s_at2"
[3] "AFFX-r2-Hs18SrRNA-5_at" "AFFX-r2-Hs18SrRNA-M_x_at1"
[5] "AFFX-r2-Hs18SrRNA-M_x_at2"
Best,
Kellie J. Archer, Ph.D.
Assistant Professor, Department of Biostatistics
Virginia Commonwealth University
1101 East Marshall St. B1-066
Richmond, VA 23298-0032
phone: (804) 827-2039
fax: (804) 828-8900
e-mail: kjarcher@vcu.edu
website: www.people.vcu.edu/~kjarcher
Hi Kellie,
I am sorry - I think as one of the authors of matchprobes I should be
able to answer your question, but it seems I can't fully parse it.
I am not sure whether this is the answer, but you can remove unwanted
probe sets from the CDF environment that resulted from
combineAffyBatch.
What exactly do you want to do?
Is RMA failing, and where/what with?
Best wishes
Wolfgang
-------------------------------------
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Http: www.dkfz.de/abt0840/whuber
Hi Kellie,
> I would like to remove unwanted probeset from the resulting $dat
> object prior to applying rma or any other expression summary method.
I think it should not make much of a difference whether you remove
these probesets before or after RMA, since the R in RMA stands for
"Robust".
> How would I remove these from the affybatch object prior to rma?
You can remove probesets by some code like this
comb <- both$cdf
gn <- ls(comb)
rm(grep("AFFX-r2-H", gn, value=TRUE), envir=comb)
I haven't tested this but I hope you get the idea.
> One other question is that I have been trying to ascertain why these
> probes were retained by combineAffyBatch in the first place. How can
> one investigate that most efficiently?
You can copy the code for the function combineAffyBatch from the
corresponding file in the "package source" tar.gz file, and then add
various checkpoints and plots to see what is going on.
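As a rough sketch of that workflow without unpacking the tar.gz, base R's dump() can write an editable copy of a function's source to a file. The function combine_toy below is a hypothetical stand-in, not the real combineAffyBatch; with matchprobes loaded, the same dump() call should work on the real function name.

```r
# Hypothetical stand-in for the package function to be inspected.
combine_toy <- function(x, y) {
  common <- intersect(x, y)
  sort(common)
}

# Write an editable copy of the function's source to a file.
src_file <- tempfile(fileext = ".R")
dump("combine_toy", file = src_file)

# After editing the file by hand (for example, adding cat() or str()
# checkpoints), read the modified definition back in.
source(src_file)

# The re-sourced function behaves like the original until it is edited.
result <- combine_toy(c(3, 1, 2), c(2, 3, 4))
print(result)
```

Calling debug() on the function and stepping through it interactively is another option if you would rather not edit a copy.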
Hope this helps
Best wishes
Wolfgang
------------------------------------
Kellie J. Archer, Ph.D. wrote:
> Thanks Wolfgang,
>
> I would like to remove unwanted probeset from the resulting $dat
> object prior to applying rma or any other expression summary method.
> So far, I have been able to apply rma to the resulting $dat object
> then remove unwanted probe sets from it using
>
> both.rma<-rma(both$dat)
> gn<-geneNames(both.rma)
> both.rma.2<-both.rma[-grep("AFFX-r2-H",gn)]
>
> One other question is that I have been trying to ascertain why these
> probes were retained by combineAffyBatch in the first place. How can
> one investigate that most efficiently?
>
> Thanks again,
> Kellie Archer
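One caveat about the x[-grep(...)] idiom in the quoted code: when the pattern matches nothing, grep() returns integer(0), and negative indexing with an empty vector selects nothing rather than everything. A minimal sketch on a plain matrix (the probe set names other than the AFFX-r2-H prefix, and the values, are made up for illustration) shows a logical index with grepl() that avoids this:

```r
# Toy expression matrix; row names and values are illustrative only.
expr <- matrix(1:8, nrow = 4,
               dimnames = list(c("AFFX-r2-Hs18SrRNA-5_at", "1007_s_at",
                                 "1053_at", "117_at"),
                               c("s1", "s2")))

# Logical indexing keeps every row that does NOT match the pattern.
keep <- !grepl("AFFX-r2-H", rownames(expr), fixed = TRUE)
filtered <- expr[keep, , drop = FALSE]

# The pitfall: a pattern with no matches drops all rows with -grep().
no_match <- expr[-grep("NO-SUCH-PREFIX", rownames(expr)), , drop = FALSE]

# The grepl() form keeps all rows in the same situation.
safe <- expr[!grepl("NO-SUCH-PREFIX", rownames(expr)), , drop = FALSE]

print(rownames(filtered))
print(nrow(no_match))
print(nrow(safe))
```

In the thread the pattern does match, so the original code works as intended; the logical form is only a safer habit for reuse.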
>> I would like to remove unwanted probeset from the resulting $dat
>> object prior to applying rma or any other expression summary method.
>
> I think it should not make much of a difference whether you remove
> these probesets before or after RMA, since the R in RMA stands for
> "Robust".
Umm, no matter how nominally "robust" a method, you are always better
off explicitly getting rid of KNOWN unwanted signal (i.e., noise)
rather than relying on the methodology to ignore it. Even medians can
be badly perturbed by the inclusion of relatively small proportions
(10-15%) of asymmetric (all high or all low) "non-informative signal".
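A small numeric illustration of this point, using made-up values rather than anything from the chips in question: adding 15 one-sided high values to 100 evenly spaced ones moves the median noticeably, although far less than it moves the mean.

```r
# 100 "informative" values, evenly spaced (illustrative data only).
clean <- 1:100

# Add 15 one-sided, all-high "non-informative" values.
contaminated <- c(clean, rep(1000, 15))

med_clean <- median(clean)
med_contaminated <- median(contaminated)

# For comparison, the mean is shifted far more than the median.
mean_clean <- mean(clean)
mean_contaminated <- mean(contaminated)

print(c(med_clean, med_contaminated))
print(c(mean_clean, mean_contaminated))
```

Because the contamination is all on one side, the median is pulled from the 50th percentile of the clean data to roughly the 58th; symmetric contamination of the same size would leave it nearly unchanged.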
D
David Lee Duewer
Research Chemometrician
National Institute of Standards and Technology
100 Bureau Drive Stop 8390
Gaithersburg, MD 20899-8390
USA
Hi Kellie
> Thanks Wolfgang. I recognize the median polish applied per probe set
> will be unaffected, but I did not want these probe sets to be included
> in the quantile normalization. I also may want to eliminate all
> control probes prior to normalization and expression summaries. I
> tried the code below and received the following message:
>
>>rm(grep("AFFX-r2-H", gn, value=TRUE), envir=comb)
Try
rm(list=grep("AFFX-r2-H", gn, value=TRUE), envir=comb)
but also please read the manual pages and the documentation of the
affy package on how AffyBatches are internally structured, and do some
hands-on experimentation - you're doing quite a bit of untested
under-the-hood manipulation here and you want to know what you are
doing.
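The difference between the two rm() calls can be tried out on a toy environment in base R. In this sketch the environment comb and its contents are stand-ins (only the AFFX-r2-H prefix comes from the thread); it does not check how an AffyBatch reacts once entries are removed from its real CDF environment.

```r
# Toy environment standing in for a CDF environment.
comb <- new.env()
assign("AFFX-r2-Hs18SrRNA-5_at", 1:3, envir = comb)
assign("AFFX-r2-Hs18SrRNA-M_x_at", 4:6, envir = comb)
assign("1007_s_at", 7:9, envir = comb)

gn <- ls(comb)
unwanted <- grep("AFFX-r2-H", gn, value = TRUE)

# rm() expects its unnamed arguments to be object names or strings, not
# expressions to evaluate, so passing the grep() call directly fails.
res <- try(rm(grep("AFFX-r2-H", gn, value = TRUE), envir = comb),
           silent = TRUE)
print(inherits(res, "try-error"))

# Supplying the character vector through list= removes those objects.
rm(list = unwanted, envir = comb)
remaining <- ls(comb)
print(remaining)
```

Since environments are modified in place, `comb <- both$cdf` followed by rm(list=..., envir=comb) changes the environment that both$cdf refers to as well; no copy is made.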
David:
> Umm, no matter how nominally "robust" a method, you are always better
> off explicitly getting rid of KNOWN unwanted-signal (ie, noise) rather
> than relying on the methodology ignoring it. Even medians can be
> badly perturbed by the inclusion of relatively small proportions
> (10-15%) of asymmetric (all high or all low) "non-informative signal".
Sure, you are right. It's good to see that people are following these
threads and are watchful :).
Best wishes
Wolfgang