cdf-package to .cdf-file using affxparser
Samuel Wuest ▴ 330
@samuel-wuest-2821
Last seen 9.6 years ago
Dear List,

I am trying to generate a .cdf file (one that could be used with programs such as dChip or aroma.affymetrix) from a CDF package downloaded from the Bioconductor homepage, which provides me with an environment containing the probe mappings.

Is there a direct way to do this, say something like writeCdf("myChipcdf-package")? If not, how can I generate the cdfheader and the cdf list used by the function writeCdf from the mappings in the environment? (I got my hands on a script that uses a flat file as input for the function, but I am having problems generating this flat file from the R environment.)

Thanks a million for any help on this.

Best wishes,
Sam

Wuest Samuel
Smurfit Institute of Genetics
Trinity College Dublin
Dublin 2, Ireland
http://www.tcd.ie/genetics/wellmer-2/index.html
@kasper-daniel-hansen-2979
Last seen 9 months ago
United States
Hi Sam,

The easiest way forward would be to download a CDF file from the Affymetrix site, run readCdf on that file, and study the output of that function. The input structure for writeCdf can look complicated at first, but it corresponds exactly to the output of readCdf.

Not all information in a CDF file is contained in a CDF package, so you cannot always go from package to file (although the additional information in a CDF file is often not used; that is very context dependent).

Why are you interested in doing this? Typically the CDF packages from Bioconductor are just packaged versions of the Affymetrix files, so why not use those files directly?

Kasper
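A minimal sketch of the inspect-and-round-trip idea in this reply, assuming the affxparser package is installed and that an original Affymetrix CDF file is available locally. The function name `round_trip_cdf` and the file names are hypothetical; the argument names passed to `writeCdf` should be checked against the affxparser help pages for your version.

```r
# Read a CDF file, show its structure, and write it back out unchanged.
# Returns FALSE (and does nothing) if affxparser or the input file is missing.
round_trip_cdf <- function(cdf_path, out_path) {
  if (!requireNamespace("affxparser", quietly = TRUE) || !file.exists(cdf_path)) {
    return(FALSE)
  }
  hdr   <- affxparser::readCdfHeader(cdf_path)  # chip-level header
  units <- affxparser::readCdf(cdf_path)        # one list element per probeset (unit)
  qc    <- affxparser::readCdfQc(cdf_path)      # QC units

  # Inspect what writeCdf will expect as input
  str(hdr)
  str(units[[1]], max.level = 2)

  affxparser::writeCdf(out_path, cdfheader = hdr, cdf = units, cdfqc = qc,
                       overwrite = TRUE)
  TRUE
}

# Hypothetical file names; replace with the real path to the Affymetrix CDF.
ok <- round_trip_cdf("ATH1-121501.cdf", "roundtrip.cdf")
```

If the round trip works, the same three objects can serve as the starting point for a modified CDF: keep the header and QC units, and replace or reorder the entries of `units`.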
Dear Kasper,

Thanks for your help. I understand that most CDF packages are packaged Affymetrix .cdf files; however, I am using the "tinesath1cdf" package (downloaded from the Bioconductor homepage, see the "experiment data" directory), which is a custom-made reannotation of the ATH1 Arabidopsis GeneChip. According to the author, no .cdf file is available. That is why I was wondering whether there is a direct way from R package to .cdf file.

You suggested reading in the Affymetrix .cdf file. I have done this already and found that it contains some identifiers that are not available in the custom-made package. However, the probes are mapped all right (I have X, Y, sequence and probe-set ID), and the identifiers I am missing are basically arbitrary numbers (e.g. unit IDs), which I could in theory just make up myself, unless you see any obvious problems with that.

Anyway, I'll stick to your suggestion and use the readCdf output as a guideline. Thanks a million and best wishes,
Sam
In that case I would obtain the original CDF file from Affymetrix. You can use the header of that file as a template, and you will also get all the extra information for all the probes. You will see that the output is essentially a list of lists of lists, and all the re-annotation does is essentially re-order these lists. That should be easy to figure out. In addition, you can use the QC probesets from the original CDF file if you want to include them.

One thing I would check is whether the same probe occurs in multiple probesets. Many routines depend on that not being the case, although nothing in the CDF specification requires that a probe belong to only one probeset.

Kasper
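The check for probes shared between probesets can be sketched in base R. This assumes the reannotation is available as a table with columns `x`, `y` and `Probe.Set.Name`, the usual layout of Bioconductor probe packages; the toy table and the helper name `find_shared_probes` are made up for illustration.

```r
# Return the rows whose (x, y) cell is assigned to more than one probeset.
find_shared_probes <- function(probes) {
  cell <- paste(probes$x, probes$y, sep = ",")
  # Number of distinct probesets that claim each cell
  sets_per_cell <- tapply(probes$Probe.Set.Name, cell,
                          function(s) length(unique(s)))
  shared <- names(sets_per_cell)[sets_per_cell > 1]
  probes[cell %in% shared, , drop = FALSE]
}

# Toy data: the cell (10, 5) appears in both setA and setB.
toy <- data.frame(
  x = c(10, 11, 10, 12),
  y = c(5, 5, 5, 7),
  Probe.Set.Name = c("setA", "setA", "setB", "setB"),
  stringsAsFactors = FALSE
)

shared <- find_shared_probes(toy)
print(shared)
```

An empty result means every probe belongs to exactly one probeset, which is the situation many downstream routines assume.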
Thanks a lot, the hint was perfect and I can now create an "Ath1121501"-like list. I got a .cdf file written out, and it can be read in again with readCdf.

I have basically used the "tinesath1probe" information rather than "tinesath1cdf" (I assume x and y are the same there). For the MM y-values I have just added 1 to the PM y-values; the x-values are the same for PM and MM (at least this is how it seems in the Ath1121501 CDF list).

So far, dChip has crashed when reading in the .cdf, but I am going to test the file with other applications, and as soon as I have a working file I'll upload it to the aroma.affymetrix page.

Thanks a million,
best,
Sam
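The coordinate rule described in this post (MM has the same x as PM and y + 1) can be sketched in base R as follows. The toy table stands in for a probe-package table such as tinesath1probe; the helper names, the toy values and the chip width of 20 columns are illustrative assumptions. The index formula assumes 0-based x and y and 1-based cell indices, which is the convention used by CDF environments; it gives one way to check that the probe package and the CDF package agree on probe positions. The result is not yet in the exact shape writeCdf expects.

```r
# Toy stand-in for a probe-package table (PM probes only).
pm <- data.frame(
  x = c(3, 8, 15),
  y = c(2, 4, 6),
  Probe.Set.Name = c("setA", "setA", "setB"),
  stringsAsFactors = FALSE
)

# Group PM probes by probeset and derive MM cells: same x, y + 1.
pm_mm_by_set <- function(pm) {
  lapply(split(pm, pm$Probe.Set.Name), function(p) {
    list(
      pm = data.frame(x = p$x, y = p$y),
      mm = data.frame(x = p$x, y = p$y + 1)
    )
  })
}

# 1-based cell index from 0-based (x, y); n_col is the number of columns
# on the chip (an assumed toy value is used below).
xy_to_index <- function(x, y, n_col) y * n_col + x + 1

sets <- pm_mm_by_set(pm)
pm_index <- xy_to_index(sets$setA$pm$x, sets$setA$pm$y, n_col = 20)
mm_index <- xy_to_index(sets$setA$mm$x, sets$setA$mm$y, n_col = 20)
```

With the real chip width, `pm_index` for a probeset could be compared against the "pm" column of the corresponding entry in the CDF environment to confirm that both packages refer to the same cells.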