Question: Custom CDF files
10.9 years ago by
Edward Oakeley wrote:
Hi,

Sorry for not being clear. The Affymetrix BPMAP file is not much use as a format for me because we mostly use S. pombe tiling arrays. Three problems with the pombe array:

1) it has oligos that alternate in their strand orientation (so it goes +,-,+,-,...) but the strand info is not encoded in the BPMAP file, so you have to align it to the genome;
2) Affy only make a BPMAP file against an ancient sequence assembly for which the annotations are very, very poor;
3) only the Sanger have the annotations on their website (not the NCBI/UCSC), and these must be downloaded as GFF files.

Given 1-3 and the fact that I need to make RMA expression values for the gene regions, the simplest approach is to make CDF files, which is actually a trivial task as the CDF format for genomic intervals is very straightforward.

Given that someone somewhere is able to make packages for Bioconductor for expression arrays (which is what we now have in effect), I was just wondering if anyone knew how this can be done. I can condense them in RMAexpress, but it would be nice to make a simple workflow that can be used by anyone working with these chips (i.e. make a CDF package and add all the Sanger annotations to it).

Can anyone help?

Thanks
Edward

On Mon, 2008-12-01 at 01:18 -0200, Benilton Carvalho wrote:
> For tiling arrays, Affymetrix provides BPMAP files. You can use the
> pdInfoBuilder package to create an annotation package. Once the
> annotation package is installed, you can use the oligo package to read
> the data in.
>
> For example, with pdInfoBuilder, you can use something like the
> following:
>
> library(pdInfoBuilder)
> # BPMAP and CIF files supplied by Affymetrix for the array
> bpmapFile <- "Hs35b_P06R_v01-3_NCBIv36.bpmap"
> cifFile <- "Hs35b_P06R_v01.cif"
> obj <- new("AffyTilingPDInfoPkgSeed",
>            version="0.1",
>            author="Benilton Carvalho",
>            email="bcarvalh at jhsph.edu",
>            biocViews="AnnotationData",
>            genomebuild="NCBI Build 36",
>            bpmapFile=bpmapFile,
>            cifFile=cifFile)
> # write the annotation package source into the current directory
> makePdInfoPackage(obj, destDir=".")
>
> best,
>
> b
>
> On Nov 30, 2008, at 7:07 PM, Edward Oakeley wrote:
>
> > Hi there,
> >
> > I make custom CDF files for use with our Affy tiling arrays. Each one I
> > make by aligning the oligos to the latest genome build and then
> > intersecting the coordinates with the latest GFF annotation files. This
> > works fine with applications that can accept CDF files but I would like
> > to use Bioconductor to condense expression values. It would also be nice
> > if there was an easy way to link in things like GO terms etc that are
> > usually available from the genome repositories.
> >
> > Could you tell me how the CDF repositories are made for Bioconductor and
> > if some scripts exist for automating this process.
> >
> > Any ideas?
> >
> > Thanks
> >
> > Edward Oakeley
> > FMI, Basel
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
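The step described in the question (aligning the oligos to the genome, then intersecting their coordinates with the GFF annotation) might look roughly like the base R sketch below. It is only an illustration: the `probes` and `genes` tables, their column names and the `assign_probes` helper are hypothetical, and matching probes to genes on the same strand is an assumption that may need to be flipped depending on the labelling protocol.

```r
# Hypothetical probe alignments: one row per oligo, with the strand
# recovered by aligning the probe sequence to the genome.
probes <- data.frame(
  probe_id = c("p1", "p2", "p3", "p4"),
  chr      = c("I", "I", "I", "II"),
  start    = c(100, 130, 500, 100),
  end      = c(124, 154, 524, 124),
  strand   = c("+", "-", "+", "+"),
  stringsAsFactors = FALSE
)

# Hypothetical gene intervals, as they might be read from a GFF file.
genes <- data.frame(
  gene_id = c("geneA", "geneB"),
  chr     = c("I", "II"),
  start   = c(90, 50),
  end     = c(200, 300),
  strand  = c("+", "+"),
  stringsAsFactors = FALSE
)

# Pair each probe with every gene on the same chromosome and strand,
# then keep only the pairs where the probe lies fully inside the gene.
assign_probes <- function(probes, genes) {
  pairs <- merge(probes, genes, by = c("chr", "strand"),
                 suffixes = c(".probe", ".gene"))
  inside <- pairs$start.probe >= pairs$start.gene &
            pairs$end.probe   <= pairs$end.gene
  pairs[inside, c("gene_id", "probe_id")]
}

probe_sets <- assign_probes(probes, genes)

# One element per gene, listing the probes that would form its CDF unit.
units <- split(probe_sets$probe_id, probe_sets$gene_id)
```

For a whole genome the all-pairs `merge` would be slow; an interval-overlap package would be the more usual choice, but the logic is the same.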
Answer: Custom CDF files
10.9 years ago by
Mark Robinson wrote:
Edward.

Given that you have a CDF file (or can make one easily) and you want RMA-summarized values, I suggest you try one of:

1. use 'make.cdf.package' in the makecdfenv package to take the CDF file and create the annotation package ... then install the package with an 'R CMD INSTALL' ... then on with 'rma'.

2. use aroma.affymetrix -- CDF file goes in 1 directory, CEL files go in another, and you're set ... then on with your summaries.

Cheers,
Mark

On 01/12/2008, at 5:51 PM, Edward Oakeley wrote:
> [...]

------------------------------
Mark Robinson
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robinson at garvan.org.au
e: mrobinson at wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852
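The two routes suggested above could be sketched roughly as follows. This is a hedged outline, not a tested recipe for the pombe array: the wrapper function names, the species string and all file, directory, chip-type and data-set names are placeholders, and the aroma.affymetrix part assumes that package's usual layout (`annotationData/chipTypes/<chipType>/<chipType>.cdf` for the CDF and `rawData/<dataSet>/<chipType>/` for the CEL files).

```r
# Option 1: makecdfenv + affy.
# Builds the source of a CDF package from a CDF file; the package name is
# derived from the CDF file name unless 'packagename' is supplied.
build_cdf_package <- function(cdf_file, cdf_dir = getwd(), out_dir = getwd()) {
  makecdfenv::make.cdf.package(cdf_file,
                               cdf.path = cdf_dir,
                               package.path = out_dir,
                               species = "Schizosaccharomyces_pombe")
}
# Afterwards, from a shell: R CMD INSTALL <generated package directory>

# Reads the CEL files with the custom CDF and returns RMA expression values.
summarize_with_rma <- function(cel_dir, cdf_name) {
  # cdfname overrides the chip type recorded in the CEL file headers
  raw <- affy::ReadAffy(celfile.path = cel_dir, cdfname = cdf_name)
  affy::rma(raw)
}

# Option 2: aroma.affymetrix, run from the directory that contains
# annotationData/ and rawData/.
summarize_with_aroma <- function(chip_type, data_set) {
  library(aroma.affymetrix)
  cdf  <- AffymetrixCdfFile$byChipType(chip_type)
  cs   <- AffymetrixCelSet$byName(data_set, cdf = cdf)
  csBC <- process(RmaBackgroundCorrection(cs))          # background correction
  csN  <- process(QuantileNormalization(csBC, typesToUpdate = "pm"))
  plm  <- RmaPlm(csN)                                   # RMA probe-level model
  fit(plm)
  getChipEffectSet(plm)                                 # per-unit summaries
}
```

The functions are only defined here, not called, because they need real CDF and CEL files; something like `summarize_with_rma("cel_files", "mypombecdf")` would be the call once the package from `build_cdf_package` has been installed (both names hypothetical).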
And for making custom annotations (so you can get GO terms etc.) you should be able to use the SQLForge code in AnnotationDbi:

http://www.bioconductor.org/packages/2.4/bioc/html/AnnotationDbi.html

Marc
Comment written 10.9 years ago by Marc Carlson
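Whichever annotation tool is used, it needs a table linking the custom probe-set (gene) identifiers to GO terms. The base R sketch below shows one way such a table might be pulled out of GFF3 attribute strings. It is not SQLForge itself, and it rests on an assumption: that the annotation file stores GO terms in the GFF3 `Ontology_term` attribute. The example strings and the `get_attr` helper are hypothetical, so check the real file before relying on this.

```r
# Hypothetical column 9 (attributes) of two GFF3 gene records; in a real
# file this would be the ninth tab-separated column.
attrs <- c("ID=geneA;Ontology_term=GO:0003677,GO:0005634",
           "ID=geneB;Name=example")

# Return the value of one attribute key per record, or NA if it is absent.
# The key must start the string or directly follow a ';'.
get_attr <- function(attrs, key) {
  pattern <- paste0("^(?:.*;)?", key, "=([^;]*).*$")
  found <- grepl(pattern, attrs, perl = TRUE)
  value <- sub(pattern, "\\1", attrs, perl = TRUE)
  value[!found] <- NA_character_
  value
}

ids <- get_attr(attrs, "ID")
go  <- strsplit(get_attr(attrs, "Ontology_term"), ",", fixed = TRUE)

# Long-format table: one row per (gene, GO term) pair.
gene2go <- data.frame(
  gene_id = rep(ids, lengths(go)),
  go_id   = unlist(go),
  stringsAsFactors = FALSE
)
gene2go <- gene2go[!is.na(gene2go$go_id), ]
```

Genes without any GO annotation (geneB here) simply drop out of the table.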