Question: How to use brainarray custom cdf with any bioconductor package?
4.9 years ago by
rishi.dasroy10
Finland
rishi.dasroy10 wrote:

I am trying to analyse Affymetrix exon arrays with the latest release of the BrainArray custom CDF.

I first tried the 'affy' package with the following command:

> Data <- ReadAffy(cdfname ='moex10stmmrefseqcdf')
Error:

The affy package is not designed for this array type.
Please use either the oligo or xps package.

Next I tried 'oligo' with the following command:

> affyExonFS <- read.celfiles(exonCELs,pkgname = "moex10stmmrefseqcdf")

Attaching package: ‘AnnotationDbi’

The following object is masked from ‘package:GenomeInfoDb’:

species

Error in (function (classes, fdef, mtable)  :
unable to find an inherited method for function ‘kind’ for signature ‘"environment"’

Please let me know what is wrong here.

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=fi_FI.UTF-8    LC_NUMERIC=C            LC_TIME=en_GB           LC_COLLATE=en_GB        LC_MONETARY=fi_FI.UTF-8
[6] LC_MESSAGES=en_GB       LC_PAPER=en_GB          LC_NAME=C               LC_ADDRESS=C            LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB    LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] pd.moex10st.mm.aceviewg_0.0.1 affy_1.44.0                   BiocInstaller_1.16.1          pd.moex.1.0.st.v1_3.10.0
[5] RSQLite_1.0.0                 DBI_0.3.1                     moex10stmmrefseqcdf_19.0.0    AnnotationDbi_1.28.1
[9] GenomeInfoDb_1.2.3            oligo_1.30.0                  Biostrings_2.34.0             XVector_0.6.0
[13] IRanges_2.0.0                 S4Vectors_0.4.0               Biobase_2.26.0                oligoClasses_1.28.0
[17] BiocGenerics_0.12.1

loaded via a namespace (and not attached):
[1] affxparser_1.38.0     affyio_1.34.0         bit_1.1-12            codetools_0.2-8       ff_2.2-13             foreach_1.4.2
[7] GenomicRanges_1.18.3  iterators_1.0.7       preprocessCore_1.28.0 splines_3.1.2         tools_3.1.2           zlibbioc_1.12.0     
modified 4.5 years ago • written 4.9 years ago by rishi.dasroy10
Answer: How to use brainarray custom cdf with any bioconductor package?
4.8 years ago by
United States
Stephen Piccolo560 wrote:

Rishi,

Sorry for the delay in replying. If you look at the code from the SCAN.UPC package (http://www.bioconductor.org/packages/release/bioc/html/SCAN.UPC.html), you can find an example of how to use BrainArray mappings with exon arrays. Alternatively, you might want to normalize with this package (the SCAN function) rather than writing something from scratch.

-Steve

Hi Steve,

Thanks for your answer. I have visited the SCAN.UPC page and it looks promising.

Can you explain in more detail, with examples, how to do DE of genes and exons using this?

Thanks

rishi

SCAN.UPC is used to normalize and summarize data, not test for differential expression. For that you would want to use something like the limma package. But if you want examples for how to use SCAN.UPC, you should read the vignette that can be accessed from the landing page:

http://bioconductor.org/packages/release/bioc/vignettes/SCAN.UPC/inst/doc/SCAN.vignette.pdf

Also note that your original post indicated that you want to summarize the data using the RefSeq mappings for the Mouse Exon 1.0 ST array. Since RefSeq is a transcript-based annotation database, you cannot use that to do DE of genes or exons. If you want genes, you should likely use the moex10stmmentrezgcdf package or the moex10stmmensgcdf package (for Entrez Gene and Ensembl gene mappings, respectively). If you care about exons, then I believe your only choice is moex10stmmensecdf.
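
The choice described above can be written down as a small lookup. This is only an illustrative sketch: the package names are the ones mentioned in this reply, while the function name and the level labels are made up for the example.

```r
# Map the level of analysis to the BrainArray package names mentioned above
# for the Mouse Exon 1.0 ST array. The function name and the `level` labels
# are illustrative and not part of any Bioconductor package.
brainarray_package <- function(level = c("entrez_gene", "ensembl_gene", "ensembl_exon")) {
  level <- match.arg(level)  # rejects anything other than the three labels
  switch(level,
    entrez_gene  = "moex10stmmentrezgcdf",
    ensembl_gene = "moex10stmmensgcdf",
    ensembl_exon = "moex10stmmensecdf"
  )
}

brainarray_package("ensembl_exon")
```

RefSeq is deliberately left out of the lookup because, as noted above, it is a transcript-based annotation.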

Hi James,

Thanks for your response. I have normalized the exons with the SCAN.UPC package using moex10st_Mm_ENSE.cdf to check alternative splicing. Although there are 337485 exons in this CDF, after normalization with SCAN.UPC I only get normalized values for 228057 exons.

However, normalization with RMA (FIRMA) gave intensities for all 337485 exons.

Why are there fewer exons when processing with SCAN.UPC?

I used the following command to normalize the exons:

normalized_exon_xprsn_set <- SCAN(celFilePath,
    outFilePath = "exon_normalize_moex10stmmenseprobe_19.0.0.txt",
    probeSummaryPackage = "moex10stmmenseprobe",
    exonArrayTarget = "probeset")


> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=fi_FI.UTF-8    LC_NUMERIC=C            LC_TIME=en_GB           LC_COLLATE=en_GB        LC_MONETARY=fi_FI.UTF-8
[6] LC_MESSAGES=en_GB       LC_PAPER=en_GB          LC_NAME=C               LC_ADDRESS=C            LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB    LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] moex10stmmenseprobe_19.0.0 AnnotationDbi_1.28.1       GenomeInfoDb_1.2.4         IRanges_2.0.1              S4Vectors_0.4.0
[6] Biobase_2.26.0             BiocGenerics_0.12.1        data.table_1.9.4           plyr_1.8.1                 limma_3.22.3
[11] biomaRt_2.22.0

loaded via a namespace (and not attached):
[1] bitops_1.0-6     chron_2.3-45     colorspace_1.2-4 DBI_0.3.1        digest_0.6.8     ggplot2_1.0.0    grid_3.1.2       gtable_0.1.2
[9] labeling_0.3     MASS_7.3-37      munsell_0.4.2    proto_0.3-10     Rcpp_0.11.3      RCurl_1.95-4.5   reshape2_1.4.1   RSQLite_1.0.0
[17] scales_0.2.4     stringr_0.6.2    tools_3.1.2      XML_3.98-1.1

When you used fRMA, did you somehow use BrainArray mappings?

The BrainArray mappings do not include all probes (some are considered to be low quality). So those would be excluded when you are using SCAN.UPC.

Yes, I used BrainArray gene and exon mappings for both FIRMA and SCAN.UPC. Both methods give the same number of genes, but the numbers differ for exons.

Rishi,

Ah OK. Since these are exon arrays, you would also want to specify a value for the "exonArrayTarget" parameter. There are three types of probes within exon arrays: "core," "extended," and "full," and these are supported by varying levels of evidence. By default it will use just "core" probes because they are of the highest quality. That is probably why you are seeing a difference in the number of exons. If you want to use all probes, you would specify exonArrayTarget="probeset". The SCAN.UPC documentation provides more detail on this.

Answer: How to use brainarray custom cdf with any bioconductor package?
4.5 years ago by
rishi.dasroy10
Finland
rishi.dasroy10 wrote:

Stephen,

Thank you so much for the quick response. However, I did normalize them with the exonArrayTarget="probeset" parameter.

I have also checked the number of exons defined in the BrainArray CDF. It is 337485, which is more than what I am getting from SCAN.UPC.

> t <- as.data.frame(table(moex10stmmenseprobe$Probe.Set.Name))
                   Var1 Freq
1 ENSMUSE00000097910_at    4
2 ENSMUSE00000097912_at    8
3 ENSMUSE00000097938_at    4
4 ENSMUSE00000097939_at    4
5 ENSMUSE00000097942_at    4
6 ENSMUSE00000097957_at    4
> dim(t)
[1] 337485      2
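
To see which probesets are missing rather than only how many, the two sets of IDs can be compared directly. The following is a minimal sketch in base R. It assumes the expected IDs come from something like unique(moex10stmmenseprobe$Probe.Set.Name) and the observed IDs from the row names of the normalized output. The helper name and the toy data are made up for illustration.

```r
# Compare the probeset IDs defined in the probe package with the IDs that
# actually appear in the normalized output. Both inputs are plain character
# vectors; where they come from in a real session is an assumption.
compare_probesets <- function(expected_ids, observed_ids) {
  expected_ids <- unique(as.character(expected_ids))
  observed_ids <- unique(as.character(observed_ids))
  list(
    n_expected = length(expected_ids),
    n_observed = length(observed_ids),
    missing    = setdiff(expected_ids, observed_ids),  # defined but absent from output
    unexpected = setdiff(observed_ids, expected_ids)   # in output but not defined
  )
}

# Toy example: the expected IDs are taken from the table above,
# the observed IDs are invented for the demonstration.
expected <- c("ENSMUSE00000097910_at", "ENSMUSE00000097912_at", "ENSMUSE00000097938_at")
observed <- c("ENSMUSE00000097910_at", "ENSMUSE00000097938_at")
res <- compare_probesets(expected, observed)
res$missing
```

If the missing IDs all fall at the end of the list, that would hint at a truncated output file rather than at probes being filtered out.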

Would you be willing to send me an email with a sample CEL file, the exact BrainArray version you are working with, and the R command you are using to normalize it? It would probably be best to send the CEL file via Dropbox or some other web-based service due to its size. Thanks.

I was trying to normalize 64 CEL files with a single command, and the system stopped responding (on a machine with 8 cores and 16 GB RAM, it ran for 56 hours). However, I still found the output file at "outFilePath". I now understand that the process may not have been able to write all the exons to the file.

Following your suggestion, I processed a single CEL file and got the expected number of exons. I will now process the files one by one.

A big sorry for wasting your time, and thank you very much for your help.
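
For anyone who lands here with the same problem, the one-file-at-a-time approach could be sketched roughly as follows. This is an assumption-laden outline and not something taken from the SCAN.UPC documentation: process_one_by_one and normalize_fun are hypothetical names, and the commented usage simply mirrors the SCAN arguments from the command posted earlier in this thread.

```r
# Normalize CEL files one at a time so that a failure on one file does not
# affect the others, and so that an interrupted run can be resumed.
# `normalize_fun` is a hypothetical wrapper taking (cel_file, out_file);
# it is not part of SCAN.UPC.
process_one_by_one <- function(cel_files, out_dir, normalize_fun) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  status <- vapply(cel_files, function(cel_file) {
    out_file <- file.path(
      out_dir,
      paste0(tools::file_path_sans_ext(basename(cel_file)), ".txt")
    )
    if (file.exists(out_file)) return("skipped")  # output left by an earlier run
    tryCatch({
      normalize_fun(cel_file, out_file)
      "ok"
    }, error = function(e) {
      paste("failed:", conditionMessage(e))
    })
  }, character(1))
  data.frame(cel_file = cel_files, status = unname(status),
             stringsAsFactors = FALSE)
}

# Assumed usage with SCAN.UPC ("cel_dir" and "normalized" are placeholders):
# library(SCAN.UPC)
# scan_one <- function(cel_file, out_file) {
#   SCAN(cel_file, outFilePath = out_file,
#        probeSummaryPackage = "moex10stmmenseprobe",
#        exonArrayTarget = "probeset")
# }
# cel_files <- list.files("cel_dir", pattern = "\\.CEL$", full.names = TRUE)
# process_one_by_one(cel_files, "normalized", scan_one)
```

The returned data frame records the status per file, which makes it easier to spot a file that failed than inspecting one combined output file. Note that a file whose output was only partially written would still be skipped on a rerun, so it is worth checking the row count of each output file as well.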