barcode with custom CDF
Guest User ★ 13k
@guest-user-4897
Last seen 9.6 years ago
Dear BioC-ers,

I would like to run the function 'barcode' on a set of CEL files preprocessed with a custom CDF. I am wondering if there is a quick way to generate the needed vectors (mu and tau for the unexpressed distribution) in the same way as the package frmaTools allows for the fRMA necessary vectors. I hope I am not posting about an issue already treated in this mailing list, but searching it produced no obvious hints.

thanks a lot for your help and suggestions.
cheers
dario

-- output of sessionInfo():

R version 2.15.3 (2013-03-01)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] hgu133plus2barcodevecs_1.0.5 hgu133plus2frmavecs_1.1.12
[3] hgu133abarcodevecs_1.0.5 hthgu133acdf_2.11.0
[5] AnnotationDbi_1.20.7 affy_1.36.1
[7] frma_1.10.0 Biobase_2.18.0
[9] BiocGenerics_0.4.0 BiocInstaller_1.8.3

loaded via a namespace (and not attached):
[1] affxparser_1.30.2 affyio_1.26.0 Biostrings_2.26.3
[4] bit_1.1-10 codetools_0.2-8 DBI_0.2-5
[7] ff_2.2-11 foreach_1.4.0 GenomicRanges_1.10.7
[10] IRanges_1.16.6 iterators_1.0.6 MASS_7.3-23
[13] oligo_1.22.0 oligoClasses_1.20.0 parallel_2.15.3
[16] preprocessCore_1.20.0 RSQLite_0.11.2 splines_2.15.3
[19] stats4_2.15.3 tools_2.15.3 zlibbioc_1.4.0
cdf frma frmaTools • 1.5k views
@matthew-mccall-4459
Last seen 4.9 years ago
United States
Dario,

Generating the barcode vectors (estimating the null distribution for each probeset) typically isn't something one can run on a laptop. It takes about 1-2 days running in parallel on about 20 nodes of a computing cluster. If you have access to such resources, I'm happy to help you create your own implementation. Is the custom CDF you're using one of the Brain Array CDFs or something of your own design?

Best,
Matt
Dear Matt,

thanks a lot for the quick reply!

I'm working on data from 8 Homo sapiens Affymetrix platforms re-annotated with BrainArray CDFs (Ensembl gene). I have access to relatively large computer clusters, so that is not worrying me. The most obvious question is how much data from chipsets other than 133A and 133 Plus 2 I would need in order to generate meaningful estimates.

thanks
d
Dario,

For the barcode implementations in BioC, I used more than 10,000 arrays from each platform. I doubt this amount of data is available for all 8 Affy platforms you're using. If you don't mind giving me a brief overview of your research goals for this project (not cc'ing the BioC mailing list if you're more comfortable with that), I might be able to provide some alternatives to a full barcode implementation.

Best,
Matt
Hi Matt,

Sorry to interfere with this specific discussion, but I would also be interested in your suggestions on potential alternative approaches. Ideally I would like to apply a (your) barcoding approach to platforms that are less widely used than the HGU133 or MOE430 platforms, such as the HuGene and MoGene ST v1.x arrays.

Regards,
Guido
Guido,

No worries. It depends on what you plan to do with the data.

One option is to go back to the method described in the original barcode paper (Zilliox and Irizarry, Nat Methods 2007), which discards any genes that don't show a bimodal distribution.

Another option is to define the null distribution based on your specific data set. For example, you estimate the null distribution for each gene using, say, 50 untreated samples and then use that distribution to "barcode" treated samples (this is similar to the POE algorithm, http://astor.som.jhmi.edu/poe/). There are other options as well.

The reason the barcode implementations I make require so many arrays is that we are trying to perform well regardless of what the researcher is interested in. We give a bunch of examples of how to use the barcode algorithm for various tasks in the NAR 2011 paper.

As for a HuGene and MoGene ST barcode implementation: I'm working on this and hope to have something by the fall BioC release.

Best,
Matt
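The data-set-specific option described above (estimate a per-gene null from untreated samples, then "barcode" the treated ones) could be sketched roughly as follows. This is a minimal base R illustration, not the frma implementation: the function names `estimate_null` and `barcode_from_null`, the normal null, the simulated data, and the cutoff on the -log10 p-value scale are all assumptions made for the example.

```r
# Minimal sketch in base R; not the frma implementation.
# Rows are genes, columns are samples, values are log2 expression.

# Estimate a per-gene null (unexpressed) distribution from control samples.
# (Hypothetical helper: mu = per-gene mean, tau = per-gene standard deviation.)
estimate_null <- function(control_expr) {
  list(mu  = rowMeans(control_expr),
       tau = apply(control_expr, 1, sd))
}

# Call a gene "expressed" (1) when its value is improbably high under the null.
# The default cutoff is an illustrative assumption on the -log10(p) scale.
barcode_from_null <- function(expr, null, cutoff = 6.5) {
  stopifnot(nrow(expr) == length(null$mu))
  # mu and tau recycle down the columns, so each row uses its own gene's null
  z <- (expr - null$mu) / null$tau
  # -log10 of the upper-tail p-value; log.p = TRUE avoids underflow for large z
  lod <- -pnorm(z, lower.tail = FALSE, log.p = TRUE) / log(10)
  calls <- lod > cutoff
  storage.mode(calls) <- "integer"
  calls
}

# Simulated data (hypothetical): 5 genes, 50 untreated and 2 treated samples.
set.seed(1)
control <- matrix(rnorm(5 * 50, mean = 4, sd = 0.5), nrow = 5)
treated <- matrix(4, nrow = 5, ncol = 2)
treated[1, 1] <- 12  # one gene clearly switched on in the first treated sample
calls <- barcode_from_null(treated, estimate_null(control))
```

Because the null here comes from only a few dozen samples of one study, the calls are only meaningful relative to that study's untreated condition, which is the trade-off against the large multi-study reference sets mentioned above.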
@stephen-piccolo-6761
Last seen 3.6 years ago
United States
Hi Dario and Guido,

The UPC function in our SCAN.UPC package addresses this need. We use a "single-sample" approach to estimating barcodes. Essentially this means that we use the probe values within a given microarray sample to estimate a background distribution and then use that information to estimate whether each gene is "active" or "inactive" in that array. This is similar in concept to the barcode function (fRMA package) except that it does not require a large collection of reference samples, so it can easily be applied to Affy arrays from any platform.

We have performed a comparison using the Affy Latin Square data, and our approach compares favorably to the barcode function (manuscript in revision; we can send more details offline if you're interested). It's also straightforward to use alternative CDFs, such as those from BrainArray. This functionality is described in the package's documentation.

One caveat: the UPC function is currently available only in the "development" version of Bioconductor (it will be released to the main version in a couple of weeks). So if you want to try it out, you'll need to install the development version of R and then the development version of Bioconductor.

Please let us know if you have any questions!

Regards,
-Steve
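To make the "single-sample" idea above a little more concrete, here is a deliberately simplified stand-in in base R. It is not the UPC model and does not use the SCAN.UPC package: it just fits a two-component normal mixture by EM to the values of one array and reports, per gene, the posterior probability of belonging to the higher ("active") component. The function name `fit_two_component`, the simulated values, and the 0.5 threshold are assumptions for illustration only.

```r
# Simplified stand-in, NOT the UPC model: a two-component normal mixture
# (background vs. active) fitted by EM to the gene-level values of ONE array.
fit_two_component <- function(x, max_iter = 500, tol = 1e-8) {
  med <- median(x)
  mu <- c(mean(x[x <= med]), mean(x[x > med]))  # start from lower/upper halves
  sigma <- rep(sd(x), 2)
  p_active <- 0.5
  for (i in seq_len(max_iter)) {
    # E-step: posterior probability that each value comes from the active component
    d_bg  <- (1 - p_active) * dnorm(x, mu[1], sigma[1])
    d_act <- p_active * dnorm(x, mu[2], sigma[2])
    post  <- d_act / (d_bg + d_act)
    # M-step: update means, standard deviations, and mixing proportion
    mu_new <- c(weighted.mean(x, 1 - post), weighted.mean(x, post))
    sigma  <- c(sqrt(weighted.mean((x - mu_new[1])^2, 1 - post)),
                sqrt(weighted.mean((x - mu_new[2])^2, post)))
    p_active <- mean(post)
    shift <- max(abs(mu_new - mu))
    mu <- mu_new
    if (shift < tol) break
  }
  list(mu = mu, sigma = sigma, p_active = p_active, posterior = post)
}

# Simulated single array (hypothetical): 300 background and 100 active genes.
set.seed(42)
x <- c(rnorm(300, mean = 4, sd = 1), rnorm(100, mean = 10, sd = 1))
fit <- fit_two_component(x)
active <- fit$posterior > 0.5  # per-gene active/inactive call for this array
```

The point of the sketch is only that the background is estimated from the array itself, so no reference collection is needed; the real package's model and its handling of probe-level effects are described in its own documentation.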