Obtaining Raw Intensity Values and Metadata for Exon Arrays in xps
@stephen-piccolo-6761
Dear List Members:

I asked Christian Stratowa, maintainer of the xps package, how to get raw intensity values for exon arrays, along with some other meta information, using his package. My question and his very helpful reply are below.

Regards,
-Steve

-----Original Message-----
From: cstrato [mailto:cstrato@aon.at]
Sent: Wednesday, May 21, 2008 1:49 PM
To: Steve Piccolo
Subject: Re: Obtaining a Matrix of Raw Values in xps

Dear Steve

This is possible in principle; however, it requires exporting a couple of scheme trees in addition to the cel-trees. Here is what you need to do to get the columns you mention:

3) raw intensity value:
You can get the CEL-file intensities for (x,y)-coordinates using:

    export(data.exon, treetype="cel", varlist = "fInten", outfile="Exon_int_cel.txt")

However, for 75 CEL-files the exported file will be pretty large, so I suggest exporting subsets:

    export.data(data.exon, treename=c("BreastA","BreastB"), varlist = "fInten", outfile="Exon_BreastAB_int_cel.txt")

You can even create a data.frame in R using:

    cel <- export.data(data.exon, treename=c("BreastA","BreastB"), varlist = "fInten", outfile="Exon_BreastAB_int_cel.txt", as.dataframe=TRUE)
    head(cel)

1) probe_id:
The probe_id for (x,y) can be obtained by exporting:

    export(scheme.exon, treetype="cxy", outfile="HuExon_cxy.txt")

However, currently the probe_id for exon arrays can also be calculated from (x,y):

    probe_id = x + ncol * y + 1

where ncol is the number of columns of the array, i.e. for the exon array ncol=2560.

2) genomic sequence:
I am not sure what you mean by "genomic sequence", but the probe sequences can be obtained by exporting:

    export(scheme.exon, treetype="prb", outfile="HuExon_prb.txt")

which gives you the probe sequence at (x,y).

4) probeset_id:
You can get the probeset_id by exporting:

    export(scheme.exon, treetype="pbs", outfile="HuExon_pbs.txt")

The first column will contain an internal UNIT_ID followed by the probeset_id. The internal UNIT_ID corresponds to the internal ProbeSetID of file "HuExon_pbs.txt". Alternatively, you can get the internal UNIT_ID for (x,y) by exporting:

    export(scheme.exon, treetype="scm", outfile="HuExon_scm.txt")

Now you need to combine the data from these files in order to get the matrix you want. I know that this seems complex, but sadly the exon array has a very complex structure.

BTW, every export method can import the data directly into R as a data.frame:

    dataframe <- export(object, ..., as.dataframe=TRUE)

Please let me know if you succeeded with this info.

Best regards
Christian

Steve Piccolo wrote:
> Hi Christian,
>
> One thing I'm trying to do is develop a novel statistical approach for
> "normalizing" exon data. What I need for a given CEL file is basically a
> matrix with the following columns: 1) probe_id, 2) genomic sequence, 3)
> raw intensity value, and 4) probeset_id. Can you tell me a little about
> how to get started in accomplishing this with xps?
>
> Regards,
> -Steve
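The combining step mentioned in the reply could look roughly like the R sketch below. It is only an illustration under stated assumptions: the data frames are small toy stand-ins for the exported "cel", "prb", "scm" and "pbs" tables, and every column name (X, Y, BreastA, ProbeSequence, UNIT_ID, probeset_id), every value, and the helper probe_id_from_xy are hypothetical. The real column headers in the exported files should be checked before adapting it. Only the probe_id formula and ncol=2560 come from the reply itself.

```r
# Sketch only: toy stand-ins for the tables exported from xps.
# All column names and values are hypothetical, not the real export headers.

# Hypothetical helper implementing probe_id = x + ncol * y + 1 from the reply.
probe_id_from_xy <- function(x, y, n_col = 2560L) {
  x + n_col * y + 1L
}

# Raw intensities at (x, y), standing in for the exported "cel" tree.
cel <- data.frame(X = c(0L, 1L, 5L), Y = c(0L, 0L, 2L),
                  BreastA = c(101.5, 87.0, 240.3))

# Probe sequences at (x, y), standing in for the "prb" tree (made-up strings).
prb <- data.frame(X = c(0L, 1L, 5L), Y = c(0L, 0L, 2L),
                  ProbeSequence = c("ACGTACGT", "TTGACCAA", "GGCATCGA"),
                  stringsAsFactors = FALSE)

# Internal UNIT_ID at (x, y), standing in for the "scm" tree.
scm <- data.frame(X = c(0L, 1L, 5L), Y = c(0L, 0L, 2L),
                  UNIT_ID = c(10L, 10L, 11L))

# UNIT_ID to probeset_id mapping, standing in for the "pbs" tree.
pbs <- data.frame(UNIT_ID = c(10L, 11L),
                  probeset_id = c(2315101L, 2315102L))

# Step 1: derive probe_id from the (x, y) coordinates.
cel$probe_id <- probe_id_from_xy(cel$X, cel$Y)

# Step 2: join the coordinate-keyed tables on (X, Y).
combined <- merge(cel, prb, by = c("X", "Y"))
combined <- merge(combined, scm, by = c("X", "Y"))

# Step 3: map the internal UNIT_ID to the probeset_id.
combined <- merge(combined, pbs, by = "UNIT_ID")

# Step 4: keep the four requested columns, ordered by probe_id.
result <- combined[order(combined$probe_id),
                   c("probe_id", "ProbeSequence", "BreastA", "probeset_id")]
rownames(result) <- NULL
print(result)
```

With real data, the toy data frames would be replaced by the results of the export calls with as.dataframe=TRUE, and the join keys adjusted to whatever the exported headers turn out to be. For 75 CEL files it is probably more practical to build the annotation table (probe_id, sequence, probeset_id) once and then merge intensities onto it one subset at a time.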