Question: Beadarray and illumina methylation arrays
10.7 years ago, Julian Lee wrote:
Hi Katrina and Mark,

Just a short comment. I'm also learning bits and pieces here, so correct me if I'm wrong.

> -Yes, currently the only way of reading methylation data into beadarray is by using the bead-level data.

I've been working with some Illumina GoldenGate methylation datasets (this is different from the whole-genome Infinium assay, right?). I took my Illumina methylation dataset and processed it in BeadStudio first, using Illumina's normalisation methods for its GoldenGate assay. I then exported the output to a txt file and read it in with beadarray's readBeadSummaryData. The values I read into beadarray ranged from 0 to 1, where 0 is unmethylated and 1 is methylated. No normalisation was done in R/Bioconductor.

I found the vignette very helpful:

    > beadarrayUsersGuide(topic = "beadsummary")

> 4. Some of my samples seem to have a large number of targets which have a p value detection rate above 0.05 (beadstudio output). Illumina have indicated that they disregard these. If I can not read in the bead summary data from bead studio, I am assuming that these detection p values can not be taken into account in the analysis? Or are there other methods that remove/down grade these less than optimal probes (most removed as outliers?).

For this problem, I used beadarray's Detection function:

    > probesinterest <- Detection(BLData) < 0.05

Cheers,
Julian

----- Original Message -----
From: "Mark Dunning" <md392@cam.ac.uk>
To: "Katrina bell" <katrina.bell at mcri.edu.au>, bioconductor at stat.math.ethz.ch
Sent: Wednesday, October 22, 2008 7:42:11 AM GMT -08:00 US/Canada Pacific
Subject: Re: [BioC] Beadarray and illumina methylation arrays

Hi Katrina,

I only have limited experience with methylation data, but hopefully I might be able to give you a few pointers.

-The error with readIllumina is quite hard to diagnose without seeing the example. I haven't seen any data from this type of Methylation array.
What do the first few lines of the bead-level text files (.csv in your case) look like? It could be that the x and y coordinates are in a slightly different format from what we have seen before.

-Yes, 25% of outliers does seem a little high, I'm afraid. Have you also looked at whereabouts the outliers are located on the arrays, or made some imageplots? We have just developed a new function for automatic artefact detection called BASH that will be available in the forthcoming Bioconductor release. It could be interesting to run that on your data, as Illumina do sometimes miss some beads in obvious artefacts and remove too many beads on the rest of the array. BASH should be a good compromise.

-Yes, currently the only way of reading methylation data into beadarray is by using the bead-level data.

-I'm not very familiar with the output of BeadStudio. Do you get separate detection values for the green and red channels? If so, then I don't think it would be a problem if a bead type was detected in one channel but not the other (since the channels measure methylated and unmethylated signal respectively). Bead types that are not detected in either channel could be a problem though.

-I haven't really seen many guidelines for normalization and this is something I would like to look into. There is an obvious dye bias that needs to be corrected, and the background normalisation used by Illumina might actually help in this regard (although I wouldn't usually recommend it for other Illumina data). Quantile normalisation has been used for other types of two-colour Illumina data (http://www.biomedcentral.com/1471-2105/9/409, http://genome.cshlp.org/cgi/content/abstract/17/3/368) so it could work here.
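To make the quantile normalisation idea concrete, here is a minimal sketch in base R. It is not beadarray's or Illumina's implementation: the function name `quantile_normalise` and the toy intensity matrix are hypothetical, and ties are broken naively by order of appearance. Whether this is appropriate for methylation data is exactly the open question raised in this thread.

```r
# Toy intensity matrix (made-up values): rows are bead types, columns are arrays.
x <- matrix(c(5, 2, 3, 4,
              4, 1, 4, 2,
              3, 4, 6, 8),
            nrow = 4,
            dimnames = list(paste0("probe", 1:4), paste0("array", 1:3)))

# Hypothetical helper: force every column to share the same empirical distribution.
quantile_normalise <- function(m) {
  # Rank of each value within its own column (ties broken by order of appearance).
  ranks <- apply(m, 2, rank, ties.method = "first")
  # Reference distribution: mean of the k-th smallest value across all columns.
  reference <- rowMeans(apply(m, 2, sort))
  # Replace each value by the reference value at its rank.
  out <- apply(ranks, 2, function(r) reference[r])
  dimnames(out) <- dimnames(m)
  out
}

x_norm <- quantile_normalise(x)
print(x_norm)
```

After this step every column contains the same set of values, only in a different order. For real data, `normalize.quantiles` from the preprocessCore package (listed in the session info in this thread) does the same job with better handling of ties and missing values.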
Hope this helps,

Mark

-----Original Message-----
From: bioconductor-bounces@stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Katrina bell
Sent: 21 October 2008 03:04
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] Beadarray and illumina methylation arrays

Hello,

This is the first time I have used beadarray. I am using it for the analysis of an Illumina 27 methylation array and I am having a few issues that I hope you can help me with.

1. The first time I tried to load the methylation data, I didn't include singleChannel=FALSE. It happily read in my 12 arrays with no problems whatsoever. I tried plotting a few things, which worked fine. Seeing my mistake, I then went back to reload my data with the red channel (singleChannel=FALSE) and got the following error.

    > BLData = readIllumina(arrayNames = targets$ChipBarcode, textType=".csv", targets=targets, backgroundMethod="none", singleChannel=FALSE)
    Found 12 arrays
    Reading pixels of ./4408100017_A_Grn.tif
    Calculating background
    Sharpening Image
    Calculating foregound
    Background correcting: method = none
    Reading pixels of ./4408100017_A_Red.tif
    Calculating background

     *** caught segfault ***
    address (nil), cause 'memory not mapped'

    Traceback:
     1: .C("readBeadImage", as.character(tifFiles2[i]), as.double(RedX[ord]), as.double(RedY[ord]), as.integer(numBeads), foreGround = double(length = numBeads), backGround = double(length = numBeads), as.integer(backgroundSize), as.integer(manip), as.integer(fground), PACKAGE = "beadarray")
     2: readIllumina(arrayNames = targets$ChipBarcode, textType = ".csv", targets = targets, backgroundMethod = "none", singleChannel = FALSE)

Session info is below. I ended up loading the data with images=FALSE, which worked, but I would like to be able to look at the background. Is there a way around this issue?

2. When I plotted the outliers (bar chart) I got an astounding 25% for the majority of my 12 samples, in both the red and green channels (unlogged data). In addition, 3 of the samples had a peak of intensity at 5 in the green channel, leading me to believe that I have some real quality control issues with my samples. Any opinions/suggestions on these results would be most welcome.

3. Is it correct that readBeadSummaryData is not set up for two-colour arrays such as the methylation arrays? So the only way to look at methylation data is through reading in BLData?

4. Some of my samples seem to have a large number of targets with a detection p-value above 0.05 (BeadStudio output). Illumina have indicated that they disregard these. If I cannot read in the bead summary data from BeadStudio, I am assuming that these detection p-values cannot be taken into account in the analysis? Or are there other methods that remove or downgrade these less-than-optimal probes (are most removed as outliers?).

5. Has anyone had any experience with normalisation of the methylation arrays? I know that many of the usual array methods are out of the question because the assumption that most probes on the array will not be differentially expressed is invalid. I read on a Bioconductor list someone suggesting quantile normalisation. I would really appreciate any feedback from people who have tried this or other methods, especially if they have verified their methylation results.

Thanks for any help/advice you may be able to give.
Cheers
Katrina

> sessionInfo() below

R version 2.7.0 (2008-04-22)
x86_64-redhat-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] beadarray_1.8.0      affy_1.18.2          preprocessCore_1.2.1
 [4] affyio_1.8.0         geneplotter_1.18.0   annotate_1.18.0
 [7] xtable_1.5-2         AnnotationDbi_1.2.2  RSQLite_0.6-9
[10] DBI_0.2-4            lattice_0.17-6       Biobase_2.0.1
[13] limma_2.14.5

loaded via a namespace (and not attached):
[1] grid_2.7.0         KernSmooth_2.22-22 RColorBrewer_1.0-2

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
Julian Lee
Bioinformatics Specialist
Cellular and Molecular Research
National Cancer Center Singapore
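The detection filtering discussed in this thread (disregarding targets whose detection p-value is above 0.05) can be sketched in base R without relying on any particular beadarray function. This assumes the BeadStudio summary values have already been loaded into two matrices, hypothetically named `beta` (values from 0 to 1) and `detection_p`; the probe IDs and numbers below are made up. Keeping only probes detected in every sample is one possible rule, not something prescribed in the thread.

```r
probes  <- paste0("cg", 1:5)      # hypothetical probe IDs
samples <- c("S1", "S2", "S3")    # hypothetical sample names

# Made-up methylation values (0 = unmethylated, 1 = methylated).
beta <- matrix(c(0.91, 0.10, 0.55, 0.02, 0.78,
                 0.88, 0.15, 0.60, 0.05, 0.70,
                 0.93, 0.12, 0.40, 0.03, 0.75),
               nrow = 5, dimnames = list(probes, samples))

# Made-up detection p-values with the same layout.
detection_p <- matrix(c(0.001, 0.010, 0.200, 0.001, 0.030,
                        0.002, 0.040, 0.090, 0.001, 0.020,
                        0.001, 0.020, 0.300, 0.060, 0.010),
                      nrow = 5, dimnames = list(probes, samples))

threshold <- 0.05
detected  <- detection_p < threshold

# Per-sample fraction of probes failing detection, as a quick quality check.
failed_fraction <- colMeans(!detected)
print(failed_fraction)

# Option 1: keep only probes detected in every sample.
keep <- rowSums(detected) == ncol(detected)
beta_filtered <- beta[keep, , drop = FALSE]
print(beta_filtered)

# Option 2: keep all probes but mask the individual undetected values.
beta_masked <- beta
beta_masked[!detected] <- NA
print(beta_masked)
```

Option 1 is the stricter choice and keeps the matrix complete; option 2 keeps more probes but leaves missing values that later steps have to tolerate.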