Methods to access GeneName to ProbeID in package Affy
Mark Dalphin ▴ 30
@mark-dalphin-571
Last seen 9.6 years ago
Hi,

I'm trying to extract information from an AffyBatch object. I want to prepare a table which contains:

    AffyID ProbeID PM-1 PM-2 PM-3 ...

Where:
AffyID is also called the GeneName.
ProbeID is the ID for the specific reporter on the chip.
PM-1, PM-2, ... are the PerfectMatch intensities for several different chips which are part of the expression set.

I can see using the 'pm' method to extract most of this, where the ProbeID and PM-1 will be row- and col-names in a matrix (this is great), but then I don't see how to associate the Affy-ID with ProbeID anywhere. A two-column data frame would be just fine.

Any help would be appreciated.

Thanks,
Mark

PS I am fairly familiar with R, but many of the data structures in BioConductor seem opaque to me. I believe this is due to the new S4 class structure being used, but I am not certain. Any pointers to how these data are represented and how to browse them would be appreciated; my old stand-by of str() is failing on these data:

    > str(spikein)
    List of 59
     $ : int [1:59] 1 2 3 4 5 6 7 8 9 10 ...
     $ :Error in .subset2(x, i) : subscript out of bounds

R version 1.8.0 under RedHat Linux 7.3

--
Mark Dalphin              email: mdalphin@amgen.com
Mail Stop: 29-2-A         phone: +1-805-447-4951 (work)
One Amgen Center Drive           +1-805-375-0680 (home)
Thousand Oaks, CA 91320   fax:   +1-805-499-9955 (work)
@wolfgang-huber-3550
Last seen 10 days ago
EMBL European Molecular Biology Laborat…
Hi Mark,

this is documented in the vignette to the affy package, section 7. To repeat it here, you can get the probe set names by

    psets = ls(hgu95av2cdf)

(replace "hgu95av2" by whatever your chip name is). You get the indices of the j-th probe set's PM and MM probes by

    get(psets[j], hgu95av2cdf)

and can use these to subset the exprs matrix of the AffyBatch object.

Alternatively, a data frame that maps probe IDs to probe set IDs (and provides the sequence as well) is available in the probe packages, e.g. hgu95av2probe.

Best wishes
Wolfgang

-------------------------------------
Wolfgang Huber
Division of Molecular Genome Analysis
German Cancer Research Center
Heidelberg, Germany
Phone: +49 6221 424709
Fax: +49 6221 42524709
Http: www.dkfz.de/abt0840/whuber
-------------------------------------

On Mon, 15 Dec 2003, Mark Dalphin wrote:
> I'm trying to extract information from an AffyBatch object. I want to prepare
> a table which contains:
>
> AffyID ProbeID PM-1 PM-2 PM-3 ...
> [...]
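The first approach above can be sketched in base R without the affy package installed. This is only an illustration under stated assumptions: `toy_cdf` is a made-up stand-in for an environment like hgu95av2cdf, and it assumes each entry is a matrix with "pm" and "mm" columns of row indices; `toy_exprs` is a made-up stand-in for the intensity matrix of the AffyBatch object.

```r
# Toy stand-in for a CDF environment such as hgu95av2cdf (names and indices
# are made up). Assumption: each entry is a matrix with columns "pm" and "mm"
# holding the row indices of that probe set's probes in the intensity matrix.
toy_cdf <- new.env()
assign("1000_at", cbind(pm = c(5L, 9L, 2L), mm = c(6L, 10L, 3L)), envir = toy_cdf)
assign("1001_at", cbind(pm = c(12L, 1L), mm = c(13L, 4L)), envir = toy_cdf)

# Toy intensity matrix: 13 probes (rows) by 2 chips (columns).
toy_exprs <- matrix(seq_len(26) * 10, nrow = 13,
                    dimnames = list(NULL, c("chip1", "chip2")))

psets <- ls(toy_cdf)  # probe set names, sorted alphabetically

# One block of rows per probe set: probe set ID, PM probe index, PM intensities.
blocks <- lapply(psets, function(ps) {
  pm_idx <- get(ps, envir = toy_cdf)[, "pm"]
  data.frame(AffyID = ps, ProbeID = pm_idx,
             toy_exprs[pm_idx, , drop = FALSE],
             stringsAsFactors = FALSE)
})
pm_table <- do.call(rbind, blocks)
rownames(pm_table) <- NULL
print(pm_table)
```

With real data, the toy environment and matrix would be replaced by the CDF environment for your chip and the intensities of your AffyBatch; the loop itself would stay the same.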
Thank you, Wolfgang, for your reply. I fear that I didn't understand that section of the vignette, so it didn't register as the section to return to when I got stuck.

When I tried your second method with the package hgu95aprobe, I don't see the connection back to the ProbeID (sorry about the wrapping):

    > str(hgu95aprobe)
    Classes probetable and `data.frame': 199091 obs. of 6 variables:
     $ sequence :Class 'AsIs' chr [1:199091] "TCTCCTTTGCTGAGGCCTCCAGCTT" "AGGCCTCCAGCTTCAGGCAGGCCAA" "CCAGCTTCAGGCAGGCCAAGGCCTT" "AGCTCAGGTGGCCCCAGTTCAATCT" ...
     $ x : int 399 544 530 617 459 408 484 548 578 498 ...
     $ y : int 559 185 505 349 489 545 311 333 369 465 ...
     $ Probe.Set.Name :Class 'AsIs' chr [1:199091] "1000_at" "1000_at" "1000_at" "1000_at" ...
     $ Probe.Interrogation.Position: int 1367 1379 1385 1445 1523 1595 1649 1655 1667 1673 ...
     $ Target.Strandedness : Factor w/ 2 levels "Antisense","Sense": 1 1 1 1 1 1 1 1 1 1 ...

To prepare a table of ProbeID versus AffyID, either for the whole collection or a subset, would I rely on the x and y to give me the coordinates of the feature on the chip and then use the xy2i() function to compute an index, 'i', to extract a row from the AffyBatch object?

Thanks,
Mark

On Monday 15 December 2003 02:56 pm, w.huber@dkfz-heidelberg.de wrote:
> this is documented in the vignette to the affy package, section 7.
> To repeat it here, you can get the probe set names by
>
> psets = ls(hgu95av2cdf)
> [...]
Hi Mark,

> To prepare a table of ProbeID versus AffyID either for the whole
> collection or a subset, would I rely on the x and y to give me the
> coordinates of the feature on the chip and then use the xy2i() function
> to compute an index, 'i', to extract a row from the AffyBatch object?

Yes, that's correct. There is a 1:1 mapping between (x,y) and i, and these serve as probe identifiers.

Best wishes
Wolfgang.
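As a rough illustration of the (x,y) to i mapping, here is a minimal base R sketch. The helper `xy_to_index` is hypothetical and is not the xy2i() shipped with the Bioconductor packages; it assumes zero-based coordinates, a row-by-row layout, and a chip width passed in by the caller, any of which may differ from the real convention. The probe table and intensity matrix are made-up toy data that only borrow the column names shown in the str() output earlier in the thread.

```r
# Hypothetical helper: convert zero-based chip coordinates to a one-based
# row index, assuming features are laid out row by row on a chip that is
# `ncol_chip` features wide. The real xy2i() may use a different convention.
xy_to_index <- function(x, y, ncol_chip) {
  x + ncol_chip * y + 1L
}

# Toy probe table with the same column names as hgu95aprobe (values made up).
toy_probe <- data.frame(
  Probe.Set.Name = c("1000_at", "1000_at", "1001_at"),
  x = c(0L, 3L, 1L),
  y = c(0L, 1L, 2L),
  stringsAsFactors = FALSE
)

ncol_chip <- 4L  # toy chip: 4 x 4 features, so 16 rows of intensities
# Stand-in for the AffyBatch intensities: one row per feature, one column per chip.
toy_intensity <- matrix(seq_len(32), nrow = 16,
                        dimnames = list(NULL, c("chip1", "chip2")))

# Each (x, y) pair maps to exactly one row index, which serves as the probe ID.
idx <- xy_to_index(toy_probe$x, toy_probe$y, ncol_chip)
id_table <- data.frame(AffyID = toy_probe$Probe.Set.Name,
                       ProbeID = idx,
                       toy_intensity[idx, , drop = FALSE],
                       stringsAsFactors = FALSE)
print(id_table)
```

Because the index is a one-to-one function of (x, y), either form can be used as the probe identifier; the index form is convenient because it directly selects a row of the intensity matrix.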