Affymetrix Mouse Gene 1.0 ST - Number of probes
Guest User ★ 13k
@guest-user-4897
Last seen 10.5 years ago
Hello,

I work on Affymetrix Mouse Gene 1.0 ST arrays.

I used two methods to match my database with my probes, and compared the number of unique probes from each method after RMA normalization:

-> 34 760 probes (control probes and main probes) with R/Bioconductor. I downloaded the Unsupported Mouse Gene 1.0 ST Array CDF (Technical documentation -> Library Files) from the Affymetrix website to get the CDF files and build my own CDF package.
-> 35 556 probes (control probes and main probes) with Expression Console. I downloaded the Mouse Gene 1.0 ST Array, Analysis files (Technical documentation -> Library Files) to get the files that Expression Console needs.

=> So I lost 796 probes. That is annoying!

Next, when I kept only the main probes (after matching my database with the annotation file available on the Affymetrix website), I had:

-> 28 104 probes with Bioconductor
-> 28 856 probes with Expression Console

=> So 752 main probes are missing if I run my analysis with Bioconductor. I'm worried because I am sometimes asked not to summarize the probes, in which case I can't use Expression Console and have to use Bioconductor. I lose a lot of probes.

I put the question to Affymetrix support and they answered:

> This difference can be due to a number of reasons.
>
> Firstly, the CDF file is the array layout information designed for 3' IVT array analysis, and is therefore not optimal for a WT array (the WT arrays use different library files, CLF and PGF). This is the reason why it is given an unsupported status (as seen in the name). This could explain the difference you see.
>
> Secondly, Bioconductor and Expression Console are different software, so the RMA algorithm may not work identically. Things like background correction, filtering and such might differ between the two.

What do you think of the answer from Affymetrix support? Personally, I don't think that the summarization (median polish) removes any probes. How would you explain the difference I found? What can I do to keep all the probes I need (main probes)?

Thank you,

Sophie LAMARRE
Biostatistician - Toulouse (FRANCE)

-- output of sessionInfo():

R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252
[4] LC_NUMERIC=C LC_TIME=French_France.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] affy_1.30.0 Biobase_2.12.2

loaded via a namespace (and not attached):
[1] affyio_1.20.0 preprocessCore_1.14.0 tools_2.13.0

--
Sent via the guest posting facility at bioconductor.org.
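One way to narrow down where the difference comes from is to compare the two ID lists directly rather than only their counts. The sketch below uses base R only. The counts are the ones quoted in the post, but the ID vectors are made-up toy values and `missing_ids` is a hypothetical helper name, not a function from any package.

```r
# Counts reported in the post
total_ec   <- 35556  # Expression Console, controls + main
total_bioc <- 34760  # Bioconductor (unsupported CDF), controls + main
main_ec    <- 28856  # Expression Console, main only
main_bioc  <- 28104  # Bioconductor, main only

lost_total <- total_ec - total_bioc   # 796
lost_main  <- main_ec - main_bioc     # 752
lost_other <- lost_total - lost_main  # 44 missing IDs that are not "main"

# Hypothetical helper: IDs present in the reference list but absent from the other
missing_ids <- function(ids_reference, ids_other) {
  setdiff(unique(ids_reference), unique(ids_other))
}

# Toy IDs for illustration only (not real probeset IDs)
ec_ids   <- c("toy_1", "toy_2", "toy_3", "toy_4")
bioc_ids <- c("toy_1", "toy_3")
toy_missing <- missing_ids(ec_ids, bioc_ids)
```

With the real ID columns from both outputs in place of the toy vectors, the returned IDs can be looked up in the Affymetrix annotation file to see what the missing entries have in common.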
Annotation cdf probe • 1.3k views
@james-w-macdonald-5106
Last seen 6 minutes ago
United States
Hi Sophie,

On 12/21/11 5:38 AM, Sophie LAMARRE [guest] wrote:
> [...]
> What do you think of the answer from Affymetrix support? Personally, I don't think that the summarization (median polish) removes any probes. How would you explain the difference I found? What can I do to keep all the probes I need (main probes)?

The short answer is that Affy technical support is correct. There are a host of problems associated with using the affy package to analyze WT arrays, which is why the oligo and xps packages exist. You will be better served by switching to either oligo or xps for the analysis of these data.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
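For readers who want to try the oligo route, here is a minimal sketch rather than a definitive workflow. It assumes the oligo package and the matching platform design package (pd.mogene.1.0.st.v1) are installed. The function name `summarize_gene_st` and the directory path are hypothetical, and the probeset count it yields is not guaranteed to match Expression Console exactly.

```r
# Sketch only: wraps the oligo calls in a function so nothing runs until it is called.
# Assumes oligo (and pd.mogene.1.0.st.v1) are installed; summarize_gene_st is a
# hypothetical name used for illustration.
summarize_gene_st <- function(cel_dir) {
  stopifnot(requireNamespace("oligo", quietly = TRUE))

  # Collect the CEL files in the given directory (base R, case-insensitive match)
  cel_files <- list.files(cel_dir, pattern = "\\.CEL$",
                          ignore.case = TRUE, full.names = TRUE)
  stopifnot(length(cel_files) > 0)

  # read.celfiles() picks the platform design package from the CEL headers,
  # so the unsupported CDF is not involved
  raw <- oligo::read.celfiles(cel_files)

  # RMA summarized at the transcript-cluster ("core") level
  oligo::rma(raw, target = "core")
}

# Usage (hypothetical path):
# eset <- summarize_gene_st("path/to/cel/files")
# nrow(Biobase::exprs(eset))  # number of summarized probesets
```

If only the main probesets are wanted afterwards, `getMainProbes()` from the affycoretools package is one option, assuming that package is installed.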