Affymetrix exon arrays?


Jesse Salisbury ▴ 20

@jesse-salisbury-1818

Last seen 10.3 years ago

Hi All;

I have been working on the same problem, and have finally found a computer large enough to run it. It's a Beowulf node at Jackson Labs with 16 GB of memory. As it turns out, you need somewhere between 12.5 and 14 GB for the R makecdfenv package to process MoEx-1_0-st-v1.cdf into a usable R environment. The process takes about 5 hours to complete on an Opteron processor.

I would like to make the files available, but I'm not sure how to post environments on Bioconductor yet. If you would like them before then, drop me a line and I'll send you a link to an FTP site.

Also, there are other MoEx-1_0-st-v1 environments already available on the Bioconductor website with alternative mappings. They can be used by passing the cdfname="env_name" argument to the AffyBatch commands.

Jesse
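
The workflow described above might look roughly like the following in R. This is a sketch only, not Jesse's actual script: the helper names `build_cdf_env` and `read_with_cdf_env`, the environment name, and the paths are illustrative assumptions. It assumes the Bioconductor packages makecdfenv and affy are installed, and whether affy handles exon-array CEL files well is not something this sketch verifies.

```r
# Sketch only: helper names, environment name, and paths are hypothetical.
# Assumes the Bioconductor packages makecdfenv and affy are installed.

build_cdf_env <- function(cdf_file, env_name, cdf_dir = getwd()) {
  # make.cdf.env() parses the CDF file into an R environment; this is the
  # step reported above to need 12.5-14 GB for MoEx-1_0-st-v1.cdf.
  env <- makecdfenv::make.cdf.env(cdf_file, cdf.path = cdf_dir)
  # affy looks a CDF environment up by name, so bind it in the global env.
  assign(env_name, env, envir = globalenv())
  # Save it to disk so the long build does not have to be repeated.
  save(list = env_name, file = paste0(env_name, ".RData"), envir = globalenv())
  invisible(env)
}

read_with_cdf_env <- function(cel_dir, env_name) {
  # cdfname tells ReadAffy() which CDF environment to attach to the AffyBatch.
  affy::ReadAffy(celfile.path = cel_dir, cdfname = env_name)
}

# Usage (not run here; needs the CDF and CEL files and plenty of RAM):
# build_cdf_env("MoEx-1_0-st-v1.cdf", "moex10stv1cdf")
# abatch <- read_with_cdf_env("path/to/cel/files", "moex10stv1cdf")
```

Saving the environment once and reloading it with load() later avoids paying the multi-hour build cost on every session.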

cdf PROcess • 907 views

updated 18.4 years ago by Henrik Bengtsson ★ 2.4k • written 18.4 years ago by Jesse Salisbury ▴ 20


Seth Falcon ★ 7.4k

@seth-falcon-992

Last seen 10.3 years ago

Hi Jesse,

Jesse Salisbury <ltboots at geneserver.mine.nu> writes:
> Hi All;
> I have been working on the same problem, and have finally found a computer
> large enough to run it. Its a beowulf node at Jackson Labs with 16GB of
> memory. As it turns out, you need somwhere betwen 12.5 and 14GB for the R
> makecdfenv() package to process the MoEx-1_0-st-v1.cdf into a useable R
> environment. The process takes about 5hrs to complete on an
> Opteron processor.
>
> I would like to make the files available, but I'm not sure how to post
> environments on bioconductor yet.

We (Bioconductor) would be interested in hosting contributed annotation data packages. Right now the group here in Seattle is fairly busy preparing for the BioC2006 conference (next week!). If you'd like to contribute the data package let us know (use the email listed in the instructions for contributing a BioC package).

It sounds like you might just have an environment object and not a package. Creating a data package from that isn't too hard, but it will require looking over the Writing R Extensions Manual...

+ seth
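
One possible route from a CDF file to an installable data package is make.cdf.package() from makecdfenv, which writes a package source directory rather than a bare environment. The sketch below is an assumption-laden illustration, not the procedure Seth has in mind: the helper name `build_cdf_package`, the package name, and the species string are hypothetical, and the call is expected to need about as much memory as building the environment directly.

```r
# Sketch only: helper name, package name, and species value are hypothetical.
# Assumes the Bioconductor package makecdfenv is installed.

build_cdf_package <- function(cdf_file, pkg_name, species,
                              cdf_dir = getwd(), out_dir = getwd()) {
  # make.cdf.package() parses the CDF file and writes a package source
  # directory named pkg_name under out_dir.
  makecdfenv::make.cdf.package(cdf_file,
                               packagename = pkg_name,
                               cdf.path = cdf_dir,
                               package.path = out_dir,
                               species = species)
  # Return the path of the generated source directory.
  invisible(file.path(out_dir, pkg_name))
}

# Usage (not run here; needs the CDF file):
# build_cdf_package("MoEx-1_0-st-v1.cdf", "moex10stv1cdf", "Mus_musculus")
# Then, from a shell:
#   R CMD build moex10stv1cdf
#   R CMD INSTALL moex10stv1cdf_<version>.tar.gz
```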

18.4 years ago • Seth Falcon ★ 7.4k


Henrik Bengtsson ★ 2.4k

@henrik-bengtsson-4333

Last seen 7 months ago

United States

A comment: For the advanced user, the affxparser package is a good start here. It is memory efficient and fast. I don't work with exon arrays myself, but I know that at least one person has used the affxparser package to read exon CDF and CEL files without problems. Note: if you can get hold of binary CDF files, reading them is *much* faster than reading ASCII CDF files. The same is true for CEL files.

Typically you do not have to read all of the data in at once, but only a subset, which is supported by affxparser. With readCel() you have access to the probe-level data ordered from the top-left corner to the bottom-right corner of the array (ordered by (x,y)). This way you'll be able to access the data so you can normalize it. With readCelUnits() you have access to the probe-level data ordered in probesets as defined by the CDF (I don't know how probesets are defined on exon arrays). This allows you to summarize data across arrays without having to load all of the data into memory at once.

FYI: I'm working on a package (aroma.affymetrix) that, among other things, allows you to (quantile) normalize virtually any number of arrays, e.g. I normalized the 90 CEPH 100K SNP arrays with <150 MB of RAM. The idea is to work with (CEL) files directly (utilizing affxparser) without reading everything into memory at the same time. If I find the time (and a poster spot) I'll try to prepare a poster on this for the Bioconductor meeting in Seattle, if you happen to be there. If no poster, just grab me there and I'll show you on my laptop.

Cheers

Henrik

On 4/10/06, Johannes Rainer <johannes.rainer at tcri.at> wrote:
> Dear all,
> actually i have also the same problem,
> my server runs since last thursday trying to make a cdf package. currently i
> use the affymetrix ExACT software to normalize the exon data. as far as i
> have seen the ExACT scripts are perl scripts which compile and run smoothly
> in unix (we had problems running the precompiled versions on windows, so i
> compiled them from the source in linux).
> so currently i use ExACT for the normalization (quantile) and summarization
> (RMA, using just the PM) and analyze the normalized data in R
>
> best, jo
>
> On 4/8/06, Michael Seewald <mseewald at gmail.com> wrote:
> >
> > Dear all,
> >
> > Is it possible to analyze Affymetrix exon arrays with R/Bioconductor? I
> > tried to generate a cdf environment with makecdfenv (as suggested by James),
> > however the command never finished. The R process grows until it takes about
> > 8 GB of RAM, then it is stuck.
> >
> > I am grateful for any help or advice.
> >
> > Best wishes,
> > Michael
> >
> > On 11/23/05, James W. MacDonald <jmacdon at med.umich.edu> wrote:
> > >
> > > Natalia Becker wrote:
> > > > I have just started working with the GeneChip(r) Human Exon 1.0 ST Array
> > > > (v2 release version of the library files) from Affymetrix.
> > > >
> > > > Unfortunately the R package "affy" doesn't accept the .CLF and .PGF files.
> > > >
> > > > Could you send me the HuEx-1_0-st-v2.cdf file or show me the way how I can
> > > > create the CDF file by my own?
> > >
> > > You can use make.cdf.package() or make.cdf.env() in the makecdfenv
> > > package.
> > >
> > > Best,
> > > Jim
> >
> > --
> > Dr. Michael Seewald
> > Bioinformatics
> > Bayer HealthCare AG
> >
> > [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> --
> Johannes Rainer, Msc
> Tyrolean Cancer Research Institute
> Innrain 66, 6020 Innsbruck, Austria
> Tel.: +43 512 570485 15
> Email: johannes.rainer at tcri.at
>        johannes.rainer at tugraz.at
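
The idea of normalizing many arrays while holding only one in memory at a time can be illustrated with a two-pass quantile normalization in base R. This is a minimal sketch of the general technique, not the aroma.affymetrix implementation: the function names and the toy data are hypothetical, and it assumes every array has the same number of probes and no missing values. With real data, the reader could be something like `function(f) affxparser::readCel(f, readIntensities = TRUE)$intensities`.

```r
# Two-pass quantile normalization, one array in memory at a time.
# Sketch only: function names and toy data are hypothetical.

# Pass 1: average the sorted intensities over all arrays to get the
# target distribution. read_array(id) returns one array's intensities.
quantile_target <- function(ids, read_array) {
  total <- NULL
  for (id in ids) {
    s <- sort(read_array(id))
    total <- if (is.null(total)) s else total + s
  }
  total / length(ids)
}

# Pass 2: replace each value by the target value of the same rank.
normalize_one <- function(x, target) {
  # ties.method = "first" gives every probe a distinct integer rank.
  target[rank(x, ties.method = "first")]
}

# Toy stand-in for CEL files: three "arrays" of three probes each.
arrays <- list(a = c(5, 2, 3), b = c(4, 1, 6), c = c(3, 4, 8))
reader <- function(id) arrays[[id]]

target <- quantile_target(names(arrays), reader)
normalized <- lapply(names(arrays),
                     function(id) normalize_one(reader(id), target))
names(normalized) <- names(arrays)
```

After the second pass every array has exactly the same set of values (the target), arranged according to that array's own probe ranks. In a file-based version each normalized vector would be written back to disk inside the loop instead of being collected in a list.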

18.4 years ago • Henrik Bengtsson ★ 2.4k
