Affymetrix exon arrays?
@jesse-salisbury-1818
Last seen 10.3 years ago
Hi All;

I have been working on the same problem, and have finally found a computer large enough to run it. It's a Beowulf node at Jackson Labs with 16 GB of memory. As it turns out, you need somewhere between 12.5 and 14 GB for the R makecdfenv package to process MoEx-1_0-st-v1.cdf into a usable R environment. The process takes about 5 hours to complete on an Opteron processor.

I would like to make the files available, but I'm not sure how to post environments on Bioconductor yet. If you would like it before then, drop me a line and I'll send you a link to an FTP site. Also, there are other MoEx-1_0-st-v1 environments already available on the Bioconductor website with alternative mappings. They can be used by passing cdfname="env_name" to the AffyBatch commands.

Jesse
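A minimal sketch of the workflow described above, assuming the Bioconductor packages makecdfenv and affy are installed and that the CDF file and the CEL files sit in the working directory. The cache file name and the environment name (`moex10stv1cdf`) are hypothetical; the block does nothing if the packages or the CDF file are missing.

```r
# Sketch: build a CDF environment once, cache it on disk, and reuse it by name.
cdf_file   <- "MoEx-1_0-st-v1.cdf"    # CDF named in the post; location assumed
env_name   <- "moex10stv1cdf"         # hypothetical environment name
cache_file <- "moex10stv1cdf.RData"   # hypothetical cache file name

have_pkgs <- requireNamespace("makecdfenv", quietly = TRUE) &&
  requireNamespace("affy", quietly = TRUE)

if (have_pkgs && file.exists(cdf_file)) {
  if (file.exists(cache_file)) {
    # Restores the environment under the name it was saved with.
    load(cache_file)
  } else {
    # The slow, memory-hungry step (reported above as 12.5-14 GB and ~5 hours).
    assign(env_name, makecdfenv::make.cdf.env(cdf_file))
    save(list = env_name, file = cache_file)
  }
  # With no file names given, ReadAffy() reads every CEL file in the working
  # directory; cdfname records which CDF environment later steps should use.
  abatch <- affy::ReadAffy(cdfname = env_name)
} else {
  message("makecdfenv/affy or ", cdf_file, " not available; nothing was run.")
}
```

Caching the environment with `save()` means the expensive parse only has to succeed once; later sessions can `load()` it on a smaller machine, provided the environment itself fits in memory there.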
cdf PROcess • 908 views
Seth Falcon ★ 7.4k
@seth-falcon-992
Last seen 10.3 years ago
Hi Jesse,

Jesse Salisbury <ltboots at geneserver.mine.nu> writes:
> Hi All;
> I have been working on the same problem, and have finally found a computer
> large enough to run it. It's a Beowulf node at Jackson Labs with 16 GB of
> memory. As it turns out, you need somewhere between 12.5 and 14 GB for the R
> makecdfenv package to process MoEx-1_0-st-v1.cdf into a usable R
> environment. The process takes about 5 hours to complete on an
> Opteron processor.
>
> I would like to make the files available, but I'm not sure how to post
> environments on Bioconductor yet.

We (Bioconductor) would be interested in hosting contributed annotation data packages. Right now the group here in Seattle is fairly busy preparing for the BioC2006 conference (next week!). If you'd like to contribute the data package, let us know (use the email listed in the instructions for contributing a BioC package).

It sounds like you might just have an environment object and not a package. Creating a data package from that isn't too hard, but it will require looking over the Writing R Extensions manual...

+ seth
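As a rough illustration of that last point: at minimum, a data package is a directory holding a DESCRIPTION file and the object saved under data/. The sketch below uses only base R, a toy environment in place of the real CDF environment, and a hypothetical package name (`moexdemocdf`); the DESCRIPTION values are placeholders, and this is not necessarily how Bioconductor's own CDF packages are laid out.

```r
# Toy stand-in for a CDF environment: probeset name -> matrix of probe indices.
# The real object would be the (very large) result of make.cdf.env().
moexdemocdf <- new.env()
assign("probeset_1", cbind(pm = 1:4, mm = NA_integer_), envir = moexdemocdf)
assign("probeset_2", cbind(pm = 5:8, mm = NA_integer_), envir = moexdemocdf)

pkg_name <- "moexdemocdf"                      # hypothetical package name
pkg_dir  <- file.path(tempdir(), pkg_name)     # written to a temp dir for the demo
dir.create(file.path(pkg_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Placeholder metadata; a real package needs accurate values here.
writeLines(c(
  paste("Package:", pkg_name),
  "Version: 0.0.1",
  "Title: Demo CDF Environment (Toy Data)",
  "Description: Placeholder package wrapping a toy CDF-like environment.",
  "Author: Placeholder Author",
  "Maintainer: Placeholder Author <placeholder@example.org>",
  "License: Artistic-2.0"
), file.path(pkg_dir, "DESCRIPTION"))

# An empty NAMESPACE suffices for this sketch because no R code is exported.
file.create(file.path(pkg_dir, "NAMESPACE"))

# Objects saved as data/<name>.rda are what data(<name>) loads after install.
save(moexdemocdf, file = file.path(pkg_dir, "data", "moexdemocdf.rda"))

list.files(pkg_dir, recursive = TRUE)
```

From there, `R CMD build` and `R CMD check` on the directory are the usual next steps; the Writing R Extensions manual covers the remaining requirements, such as documentation for the data object.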
@henrik-bengtsson-4333
Last seen 7 months ago
United States
A comment: For the advanced user, the affxparser package is a good start here. It is memory efficient and fast. I don't work with exon arrays myself, but I know that at least one person has used the affxparser package to read exon CDF and CEL files without problems. Note: if you can get hold of binary CDF files, reading them is *much* faster than reading ASCII CDF files. The same is true for CEL files.

Typically you do not have to read all of the data in at once, but only a subset, which is supported by affxparser. With readCel() you have access to the probe-level data ordered by (x,y), from the top-left corner to the bottom-right corner of the array. This way you can access the data you need in order to normalize it. With readCelUnits() you have access to the probe-level data ordered in probesets as defined by the CDF (I don't know how probesets are defined on exon arrays). This allows you to summarize data across arrays without having to load all of the data into memory at once.

FYI: I'm working on a package (aroma.affymetrix) that, among other things, allows you to (quantile) normalize virtually any number of arrays, e.g. I normalized the 90 CEPH 100K SNP arrays with <150 MB RAM. The idea is to work with (CEL) files directly (utilizing affxparser) without reading everything into memory at the same time. If I find the time (and a poster spot) I'll try to prepare a poster on this for the Bioconductor meeting in Seattle, if you happen to be there. If no poster, just grab me there and I'll show you on my laptop.

Cheers

Henrik

On 4/10/06, Johannes Rainer <johannes.rainer at tcri.at> wrote:
> Dear all,
> Actually, I also have the same problem:
> my server has been running since last Thursday trying to make a CDF package.
> Currently I use the Affymetrix ExACT software to normalize the exon data.
> As far as I have seen, the ExACT scripts are Perl scripts which compile and
> run smoothly on Unix (we had problems running the precompiled versions on
> Windows, so I compiled them from source on Linux).
> So currently I use ExACT for the normalization (quantile) and summarization
> (RMA, using just the PM) and analyze the normalized data in R.
>
> best, jo
>
> On 4/8/06, Michael Seewald <mseewald at gmail.com> wrote:
> >
> > Dear all,
> >
> > Is it possible to analyze Affymetrix exon arrays with R/Bioconductor? I
> > tried to generate a CDF environment with makecdfenv (as suggested by
> > James); however, the command never finished. The R process grows until
> > it takes about 8 GB of RAM, then it is stuck.
> >
> > I am grateful for any help or advice.
> >
> > Best wishes,
> > Michael
> >
> > On 11/23/05, James W. MacDonald <jmacdon at med.umich.edu> wrote:
> > >
> > > Natalia Becker wrote:
> > > > I have just started working with the GeneChip(r) Human Exon 1.0 ST
> > > > Array (v2 release version of the library files) from Affymetrix.
> > > >
> > > > Unfortunately the R package "affy" doesn't accept the .CLF and .PGF
> > > > files.
> > > >
> > > > Could you send me the HuEx-1_0-st-v2.cdf file or show me how I can
> > > > create the CDF file on my own?
> > >
> > > You can use make.cdf.package() or make.cdf.env() in the makecdfenv
> > > package.
> > >
> > > Best,
> > > Jim
> >
> > --
> > Dr. Michael Seewald
> > Bioinformatics
> > Bayer HealthCare AG
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> --
> Johannes Rainer, Msc
> Tyrolean Cancer Research Institute
> Innrain 66, 6020 Innsbruck, Austria
> Tel.: +43 512 570485 15
> Email: johannes.rainer at tcri.at
> johannes.rainer at tugraz.at
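The file-at-a-time idea in the reply above can be sketched in base R as a two-pass quantile normalization. This is an illustration under stated assumptions, not the aroma.affymetrix implementation: `read_array()` and `write_array()` are hypothetical stand-ins for CEL file I/O (with real data, the intensities could come from affxparser's `readCel()`), and the "arrays" are simulated vectors stored in temporary files.

```r
# Two-pass quantile normalization that holds only one array in memory at a time.
# read_array()/write_array() are hypothetical stand-ins for CEL file access.
read_array  <- function(path) readRDS(path)
write_array <- function(x, path) saveRDS(x, path)

quantile_normalize_files <- function(in_files, out_files) {
  stopifnot(length(in_files) == length(out_files), length(in_files) > 0)

  # Pass 1: accumulate the sorted intensities across arrays, one file at a time,
  # then divide to get the mean (target) distribution.
  target <- NULL
  for (f in in_files) {
    s <- sort(read_array(f))
    target <- if (is.null(target)) s else target + s
  }
  target <- target / length(in_files)

  # Pass 2: give each probe the target value at its rank within its own array.
  for (i in seq_along(in_files)) {
    x <- read_array(in_files[i])
    write_array(target[rank(x, ties.method = "first")], out_files[i])
  }
  invisible(target)
}

# Toy data: three simulated "arrays" of 1000 probes with different intensity levels.
set.seed(1)
in_files  <- file.path(tempdir(), sprintf("raw_%d.rds", 1:3))
out_files <- file.path(tempdir(), sprintf("norm_%d.rds", 1:3))
for (i in 1:3) saveRDS(rlnorm(1000, meanlog = 6 + i / 2), in_files[i])

target <- quantile_normalize_files(in_files, out_files)
```

Peak memory is roughly one array plus the running target vector, regardless of how many files are processed. The sketch assumes every array has the same number of probes and no missing values.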