Hi Dario and Guido,
The UPC function in our SCAN.UPC package addresses this need. We use a
"single-sample" approach to estimating barcodes. Essentially this means
that we use the probe values within a given microarray sample to estimate
a background distribution and then use that information to estimate
whether each gene is "active" or "inactive" in that array. This is similar
in concept to the barcode function (fRMA package) except that it does not
require a large collection of reference samples, so it can easily be
applied to Affy arrays from any platform. We have performed a comparison
using the Affy Latin Square data, and our approach compares favorably to
the barcode function (manuscript in revision; we can send more details
offline if you're interested).
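
To make the single-sample idea concrete, here is a toy sketch in base R. It is not the SCAN.UPC code: it assumes a plain two-component normal mixture fitted by EM to simulated log2 intensities from one sample, and every name in it (fit_two_normals, active_call, the simulated means and proportions) is made up for illustration. The point is only that the "inactive" background is estimated from the sample itself, with no reference collection.

```r
# Toy illustration only: a generic two-component normal mixture fitted by EM
# to simulated log2 intensities from ONE sample. Not the SCAN.UPC implementation.
set.seed(1)
n_inactive <- 700
n_active <- 300
x <- c(rnorm(n_inactive, mean = 6, sd = 1),     # simulated background ("inactive")
       rnorm(n_active, mean = 10, sd = 1.5))    # simulated signal ("active")

fit_two_normals <- function(x, max_iter = 200, tol = 1e-6) {
  # Crude starting values: split the data at the median.
  mu <- c(mean(x[x <= median(x)]), mean(x[x > median(x)]))
  sigma <- rep(sd(x), 2)
  p_active <- 0.5
  for (i in seq_len(max_iter)) {
    # E-step: posterior probability that each value belongs to the "active" component.
    d_inactive <- (1 - p_active) * dnorm(x, mu[1], sigma[1])
    d_active <- p_active * dnorm(x, mu[2], sigma[2])
    post <- d_active / (d_inactive + d_active)
    # M-step: update means, standard deviations, and the mixing proportion.
    mu_new <- c(weighted.mean(x, 1 - post), weighted.mean(x, post))
    sigma <- c(sqrt(weighted.mean((x - mu_new[1])^2, 1 - post)),
               sqrt(weighted.mean((x - mu_new[2])^2, post)))
    p_active <- mean(post)
    converged <- max(abs(mu_new - mu)) < tol
    mu <- mu_new
    if (converged) break
  }
  list(mu = mu, sigma = sigma, p_active = p_active, posterior = post)
}

fit <- fit_two_normals(x)
# Call a value "active" when its posterior probability exceeds 0.5.
active_call <- fit$posterior > 0.5
```

The posterior values play the role of a per-gene activity estimate on a 0 to 1 scale; the 0.5 cutoff is an arbitrary choice for this sketch.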
It's also straightforward to use alternative CDFs, such as those from
BrainArray. This functionality is described in the package's vignette.
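
A rough sketch of what such a call might look like is below. It assumes the argument names found in SCAN.UPC releases I am aware of (celFilePattern as the first argument of UPC, probeSummaryPackage, and the InstallBrainArrayPackage helper); these may differ in the version you install, so please check the vignette. The file paths, the BrainArray version string, and the "ensg" annotation choice are placeholders, not recommendations.

```r
# Sketch only: needs Bioconductor, the SCAN.UPC package, and CEL files on disk.
library(SCAN.UPC)

# Hypothetical location of the CEL files; adjust to your own data.
cel_pattern <- "cel_files/*.CEL"

# Default probe-to-gene mapping. The result is an ExpressionSet of
# UPC values, one estimate per feature and sample.
upc_default <- UPC(cel_pattern)

# Alternative CDF from BrainArray. The version ("17.1.0"), organism ("hs"),
# and annotation source ("ensg") are assumed example values.
pkg_name <- InstallBrainArrayPackage("cel_files/sample1.CEL",
                                     "17.1.0", "hs", "ensg")
upc_brainarray <- UPC(cel_pattern, probeSummaryPackage = pkg_name)
```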
One caveat: the UPC function is currently available only in the
"development" version of Bioconductor (it will be released to the main
version in a couple of weeks). So if you want to try it out, you'll need
to install the development version of R and then the development version
of Bioconductor.
Please let us know if you have any questions!
>Date: Tue, 26 Mar 2013 15:15:25 +0000
>From: "Hooiveld, Guido" <guido.hooiveld at wur.nl>
>To: "'Matthew McCall'" <mccallm at gmail.com>, "'Dario Greco'"
> <dario.greco at ki.se>
>Cc: "'Bioconductor at r-project.org'" <bioconductor at r-project.org>
>Subject: Re: [BioC] barcode with custom CDF
> <eb992c246eb7bf449bc1e6b12af7f65007db84c0 at scomp0933.wurnet.nl>
>Content-Type: text/plain; charset="us-ascii"
>Sorry to interfere with this specific discussion, but I would also be
>interested in your suggestions on potential alternative approaches.
>The reason I am interested is that ideally I would like to apply a
>(your) barcoding approach to platforms that are less used compared to
>the HGU133 or MOE430 platforms, such as the HuGene and MoGene ST v1.x
>arrays.
>From: bioconductor-bounces at r-project.org
>[mailto:bioconductor-bounces at r-project.org] On Behalf Of Matthew McCall
>Sent: Tuesday, March 26, 2013 15:59
>To: Dario Greco
>Cc: Bioconductor at r-project.org
>Subject: Re: [BioC] barcode with custom CDF
>For the barcode implementations in BioC, I used > 10,000 arrays from each
>platform. I doubt this amount of data is available for all 8 Affy
>platforms you're using. If you don't mind giving me a brief overview of
>your research goals for this project (not cc'ing the BioC mailing list if
>you're more comfortable with that), I might be able to provide some
>alternatives to a full barcode implementation.
>On Tue, Mar 26, 2013 at 10:47 AM, Dario Greco <dario.greco at ki.se> wrote:
>> Dear Matt,
>> thanks a lot for the quick reply!
>> i'm working on data from 8 homo sapiens affymetrix platforms
>>re-annotated with brainarray cdf (ensembl gene).
>> i can have access to relatively large computer clusters, so that is
>> not a problem.
>> the most obvious question is probably concerning what volume of data
>>from chipsets other than 133a and 133p2 i would need in order to
>>generate meaningful estimations.
>> On Mar 26, 2013, at 2:43 PM, Matthew McCall <mccallm at gmail.com> wrote:
>>> Generating the barcode vectors (estimating the null distribution for
>>> each probeset) typically isn't something one can run on a laptop. It
>>> takes about 1-2 days running in parallel on about 20 nodes of a
>>> computing cluster. If you have access to such resources, I'm happy to
>>> help you create your own implementation. Is the custom CDF you're
>>> using one of the Brain Array CDFs or something of your own design?
>>> On Tue, Mar 26, 2013 at 7:03 AM, Dario Greco [guest]
>>> <guest at bioconductor.org> wrote:
>>>> Dear BioC-ers,
>>>> I would like to run the function 'barcode' on a set of CEL files
>>>>preprocessed with a custom CDF.
>>>> I am wondering if there is a quick way to generate the needed vectors
>>>>(mu and tau for the unexpressed distribution) in the same way as the
>>>>package frmaTools allows for the fRMA necessary vectors.
>>>> I hope I am not posting about an issue already treated in this
>>>>mailing list, but searching it produced no obvious hints.
>>>> thanks a lot for your help and suggestions.
>>>> -- output of sessionInfo():
>>>> R version 2.15.3 (2013-03-01)
>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>>>  en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>> attached base packages:
>>>>  stats graphics grDevices utils datasets methods
>>>> other attached packages:
>>>>  hgu133plus2barcodevecs_1.0.5 hgu133plus2frmavecs_1.1.12
>>>>  hgu133abarcodevecs_1.0.5 hthgu133acdf_2.11.0
>>>>  AnnotationDbi_1.20.7 affy_1.36.1
>>>>  frma_1.10.0 Biobase_2.18.0
>>>>  BiocGenerics_0.4.0 BiocInstaller_1.8.3
>>>> loaded via a namespace (and not attached):
>>>>  affxparser_1.30.2 affyio_1.26.0 Biostrings_2.26.3
>>>>  bit_1.1-10 codetools_0.2-8 DBI_0.2-5
>>>>  ff_2.2-11 foreach_1.4.0
>>>>  IRanges_1.16.6 iterators_1.0.6 MASS_7.3-23
>>>>  oligo_1.22.0 oligoClasses_1.20.0 parallel_2.15.3
>>>>  preprocessCore_1.20.0 RSQLite_0.11.2 splines_2.15.3
>>>>  stats4_2.15.3 tools_2.15.3 zlibbioc_1.4.0
>>>> Sent via the guest posting facility at bioconductor.org.
>>> Matthew N McCall, PhD
>>> 112 Arvine Heights
>>> Rochester, NY 14611
>>> Cell: 202-222-5880