Question about hgu133plus2cdf?
@fabrice-tourre-4394
Dear list,

I am analysing hgu133plus2 arrays. I want a CDF from which probes containing SNPs have been removed, because I want to eliminate the noise caused by single nucleotide polymorphisms (SNPs) that differ between samples. I also want to exclude probe sets whose sequences map to multiple genomic positions.

Bioconductor has a package, hgu133plus2cdf. I also noticed a website that provides custom CDF files for hgu133plus2:

http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/CDF_download.asp
HGU133Plus2 (Version 15.0.0, ENTREZG)

Are these two CDFs the same? Or is the hgu133plus2cdf package taken directly from the Affymetrix CDF file?

Thank you very much in advance.
hgu133plus2 cdf affy
@delhommeemblde-3232
Dear Fabrice,

The hgu133plus2cdf package in Bioc is based on the information provided by Affymetrix.

The custom CDF from the website you mention contains probes re-aligned to the human genome, and only those probes that have a unique mapping are used. See their publication: Dai et al. Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data. Nucleic Acids Research (2005) vol. 33 (20) pp. e175.

That won't solve your SNP problem, but you can use the hgu133plus2probe package, which contains the probe sequences, or the probe package provided by Dai et al. Based on these sequences and their mapping, you should be able to filter out the probes that contain the SNPs you want to avoid. The IRanges functionality might prove helpful for that. Whether you then drop the whole probe set or re-create your own CDF is up to you.

If you want to create your own CDF, check the vignette of the makecdfenv package: vignette("makecdfenv"). You might also want to make sure your new probe sets are valid. This paper is a good starting point for that: Lu et al. Transcript-based redefinition of grouped oligonucleotide probe sets using AceView: high-resolution annotation for microarrays. BMC Bioinformatics (2007) vol. 8 pp. 108.

HTH,
Nico

Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Meyerhofstrasse 1 - Postfach 10.2209, 69102 Heidelberg, Germany
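To make the filtering step in the answer above concrete, here is a minimal sketch in base R. It is not the method used by any of the packages mentioned: the probe coordinates and SNP positions are made-up toy values, and the column names (`probeset`, `chr`, `start`, `end`, `pos`) are assumptions chosen for illustration. With real data the coordinates would come from aligning the probe sequences to the genome, and an overlap function such as `findOverlaps()` from IRanges would likely scale better than the loop shown here.

```r
# Toy probe table: one row per probe, with hypothetical genomic alignments.
# In practice these coordinates would come from aligning probe sequences
# (for example those in the hgu133plus2probe package) to the genome.
probes <- data.frame(
  probeset = c("ps1", "ps1", "ps1", "ps2", "ps2"),
  chr      = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  start    = c(100, 200, 300, 500, 600),
  stringsAsFactors = FALSE
)
probes$end <- probes$start + 24   # 25-mer probes

# Hypothetical SNP positions (toy values, not real dbSNP entries).
snps <- data.frame(
  chr = c("chr1", "chr2", "chr3"),
  pos = c(210, 900, 100),
  stringsAsFactors = FALSE
)

# TRUE for each probe that covers at least one SNP on the same chromosome.
has_snp <- vapply(seq_len(nrow(probes)), function(i) {
  any(snps$chr == probes$chr[i] &
      snps$pos >= probes$start[i] &
      snps$pos <= probes$end[i])
}, logical(1))

# Option 1: drop only the affected probes (the input for a rebuilt CDF).
clean_probes <- probes[!has_snp, ]

# Option 2: drop every probe set that contains an affected probe.
bad_sets   <- unique(probes$probeset[has_snp])
clean_sets <- setdiff(unique(probes$probeset), bad_sets)
```

The two options at the end correspond to the choice mentioned in the answer: removing single probes keeps more probe sets but requires building a new CDF, whereas removing whole probe sets works with the existing CDF at the cost of losing more data.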
@fabrice-tourre-4394
Dear Nico,

Thank you very much for your explanation.

I am wondering why the hgu133plus2cdf package in Bioc is not based on the custom CDF from Dai et al. It seems that unique mapping is better.
Hi Fabrice,

The default CDF packages from BioC are based on the manufacturer's data, as are the probe and annotation packages. We create these packages as a service to our end users, without making any claims about their suitability for any use (which, I might add, is true of all BioC packages, not just the metadata packages).

It is not in our interest (nor yours, I would guess) for us to decide which mapping is 'better' and then restrict what we supply. We have made the MBNI packages available via biocLite() for something like 6-7 years now, so that end users have easy access to whatever mapping they feel is appropriate to their analysis, and we leave that decision up to them.

Best,
Jim

James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100, Seattle WA 98105-6099
Well, I would not necessarily agree with that. The custom CDF removes a lot of information from the original CDF: ~30% of the probes. So it is better to have one package in R that gives you the complete original data as provided by the manufacturer, don't you agree? That way you can manipulate it however you want; for instance, the way Dai et al. build their packages might not suit your purpose.

Cheers,
Nico
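One way to see what a custom definition discards is to compare its probe list against the original one. The sketch below uses base R and entirely made-up identifiers; `original` and `custom` are hypothetical tables standing in for the probe-to-probe-set assignments of a manufacturer CDF and a remapped CDF, not the contents of any real package.

```r
# Toy probe-to-probe-set assignments; all identifiers are made up.
original <- data.frame(
  probe    = paste0("p", 1:10),
  probeset = rep(c("A_at", "B_at"), each = 5),
  stringsAsFactors = FALSE
)

# A hypothetical custom definition that keeps only some of the probes
# and regroups them by gene.
custom <- data.frame(
  probe = c("p1", "p2", "p3", "p6", "p7", "p8", "p9"),
  gene  = c(rep("g1", 3), rep("g2", 4)),
  stringsAsFactors = FALSE
)

# Which original probes survive in the custom definition?
kept <- original$probe %in% custom$probe

# Overall share of probes discarded by the custom definition.
frac_dropped <- mean(!kept)

# Number of discarded probes per original probe set.
dropped_by_set <- tapply(!kept, original$probeset, sum)
```

In this toy data 3 of the 10 probes are missing from the custom definition. Running the same comparison on real probe tables would show which probe sets lose the most probes, which may help decide whether a given custom CDF suits a particular analysis.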