Julian Lee
Hi Katrina and Mark,
Just a short comment. I'm also learning bits and pieces here, so correct me if I'm wrong.
-Yes, currently the only way of reading methylation data into beadarray is by using the bead-level data.
I've been working with some Illumina GoldenGate methylation datasets (this is different from the whole-genome Infinium assay, right?).
I processed my methylation dataset in BeadStudio first, using Illumina's normalisation methods for the GoldenGate assay, exported the result to a text file and read it in with beadarray's readBeadSummaryData.
The values I read into beadarray ranged from 0 to 1, where 0 is unmethylated and 1 is methylated.
No normalisation was done in R/Bioconductor.
I found the vignette very helpful:
>beadarrayUsersGuide(topic = "beadsummary")
4. Some of my samples seem to have a large number of targets which have a detection p-value above 0.05 (BeadStudio output). Illumina have indicated that they disregard these. If I cannot read in the bead summary data from BeadStudio, I am assuming that these detection p-values cannot be taken into account in the analysis? Or are there other methods that remove or downgrade these less-than-optimal probes (most removed as outliers?).
For this problem, I used beadarray's Detection function, like so:
>probesinterest<-Detection(BLData)<0.05
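To make the idea concrete, here is a minimal base-R sketch of that kind of detection filter. It does not call beadarray at all: `detP` and `betas` are hypothetical probes-by-samples matrices with made-up values, standing in for whatever detection p-values and 0-to-1 methylation values you actually have.

```r
# Toy data: 4 probes x 3 samples. Names and values are hypothetical.
detP <- matrix(c(0.01, 0.20, 0.03,
                 0.00, 0.01, 0.02,
                 0.60, 0.70, 0.04,
                 0.04, 0.04, 0.04),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("probe", 1:4), paste0("s", 1:3)))
betas <- matrix(seq(0.1, 0.9, length.out = 12), nrow = 4,
                dimnames = dimnames(detP))

# TRUE where a probe counts as detected in a sample (p < 0.05)
detected <- detP < 0.05

# Option 1: mask only the individual measurements that were not detected
betas_masked <- betas
betas_masked[!detected] <- NA

# Option 2: keep only probes that were detected in every sample
keep <- rowSums(detected) == ncol(detected)
betas_filtered <- betas[keep, , drop = FALSE]
```

Masking keeps the probe for the samples where it was detected, while the all-samples filter is stricter and drops the probe entirely; which is more suitable probably depends on how many samples are affected.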
cheers
julian
----- Original Message -----
From: "Mark Dunning" <md392@cam.ac.uk>
To: "Katrina bell" <katrina.bell at mcri.edu.au>, bioconductor at stat.math.ethz.ch
Sent: Wednesday, October 22, 2008 7:42:11 AM GMT -08:00 US/Canada Pacific
Subject: Re: [BioC] Beadarray and illumina methylation arrays
Hi Katrina,
I only have limited experience with methylation data, but hopefully I might be able to give you a few pointers.
-The error with readIllumina is quite hard to diagnose without seeing the example. I haven't seen any data from this type of methylation array. What do the first few lines of the bead-level text files (.csv in your case) look like? It could be that the x and y coordinates are in a slightly different format from what we have seen before.
-Yes, 25% outliers does seem a little high, I'm afraid. Have you also looked at where the outliers are located on the arrays, or made some imageplots? We have just developed a new function for automatic artefact detection called BASH that will be available in the forthcoming Bioconductor release. It could be interesting to run that on your data, as Illumina do sometimes miss some beads in obvious artefacts and remove too many beads on the rest of the array. BASH should be a good compromise.
-Yes, currently the only way of reading methylation data into beadarray is by using the bead-level data.
-I'm not very familiar with the output of BeadStudio. Do you get separate detection values for the green and red channels? If so, then I don't think it would be a problem if a bead type was detected in one channel but not the other (since the channels are measures of either methylated or unmethylated signal respectively). Bead types that are not detected in either channel could be a problem though.
-I haven't really seen many guidelines for normalisation and this is something I would like to look into. There is an obvious dye bias that needs to be corrected, and the background normalisation used by Illumina might actually help in this regard (although I wouldn't usually recommend it for other Illumina data). Quantile normalisation has been used for other types of two-colour Illumina data (http://www.biomedcentral.com/1471-2105/9/409, http://genome.cshlp.org/cgi/content/abstract/17/3/368), so it could work here.
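For readers unfamiliar with the technique, quantile normalisation can be sketched in a few lines of base R. This is only an illustration on a made-up matrix: `quantile_normalise` is a hypothetical helper written for this example, not a beadarray function, and it assumes a numeric matrix with probes in rows, arrays in columns and no missing values. Whether it is appropriate for methylation data is exactly the open question here.

```r
# Hypothetical helper: force every column of x to share the same distribution.
quantile_normalise <- function(x) {
  # Rank of each value within its column (ties broken by order of appearance)
  ranks  <- apply(x, 2, rank, ties.method = "first")
  # Sort each column, then average across columns at each rank position
  sorted <- apply(x, 2, sort)
  target <- rowMeans(sorted)
  # Replace each value by the target value at its rank
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

# Toy intensities: 4 probes x 3 arrays (values are made up)
x <- matrix(c(5, 2, 3, 4,
              4, 1, 4, 2,
              3, 4, 6, 8), nrow = 4)
xn <- quantile_normalise(x)
```

After this step every column of `xn` contains the same set of values, only in a different order. In practice the `normalize.quantiles` function in the preprocessCore package does the same job, with averaging of ties rather than the first-come ordering used above.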
Hope this helps,
Mark
-----Original Message-----
From: bioconductor-bounces@stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Katrina bell
Sent: 21 October 2008 03:04
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] Beadarray and illumina methylation arrays
Hello,
This is the first time I have used beadarray. I am using it for the analysis of an Illumina 27 methylation array and I am having a few issues that I hope you can help me with.
1. The first time I tried to load the methylation data, I didn't specify singleChannel=FALSE. It happily read in my 12 arrays with no problems whatsoever. I tried plotting a few things, which worked fine. Seeing my mistake, I then went back to reload my data with the red channel (singleChannel=FALSE) and got the following error.
> BLData = readIllumina(arrayNames = targets$ChipBarcode, textType=".csv",
    targets=targets, backgroundMethod="none", singleChannel=FALSE)
Found 12 arrays
Reading pixels of ./4408100017_A_Grn.tif
Calculating background
Sharpening Image
Calculating foregound
Background correcting: method = none
Reading pixels of ./4408100017_A_Red.tif
Calculating background
*** caught segfault ***
address (nil), cause 'memory not mapped'
Traceback:
1: .C("readBeadImage", as.character(tifFiles2[i]), as.double(RedX[ord]), as.double(RedY[ord]), as.integer(numBeads), foreGround = double(length = numBeads), backGround = double(length = numBeads), as.integer(backgroundSize), as.integer(manip), as.integer(fground), PACKAGE = "beadarray")
2: readIllumina(arrayNames = targets$ChipBarcode, textType = ".csv", targets = targets, backgroundMethod = "none", singleChannel = FALSE)
Session info is below.
So I ended up loading the data with images=FALSE, which worked, but I would like to be able to look at the background. Is there a way around this issue?
2. When I plotted the outliers (bar chart) I got an astounding 25% for the majority of my 12 samples, in both the red and green channels (unlogged data). In addition, 3 of the samples had a peak of intensity at 5 in the green channel, leading me to believe that I have some real quality control issues with my samples. Any opinions/suggestions on these results would be most welcome.
3. Is it correct that readBeadSummaryData is not set up for two-colour arrays such as the methylation arrays? So the only way to look at methylation data is through reading in BLData?
4. Some of my samples seem to have a large number of targets which have a detection p-value above 0.05 (BeadStudio output). Illumina have indicated that they disregard these. If I cannot read in the bead summary data from BeadStudio, I am assuming that these detection p-values cannot be taken into account in the analysis? Or are there other methods that remove or downgrade these less-than-optimal probes (most removed as outliers?).
5. Has anyone had any experience with normalisation of the methylation arrays? I know that many of the usual array methods are out of the question, because their assumption that most probes on the array are not differentially expressed is invalid here. I read on a Bioconductor list someone suggesting quantile normalisation? I would really appreciate any feedback from people who have tried this or other methods, especially if they have verified their methylation results.
Thanks for any help/advice you may be able to give.
Cheers
Katrina
> sessionInfo() below
R version 2.7.0 (2008-04-22)
x86_64-redhat-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] tools stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] beadarray_1.8.0 affy_1.18.2 preprocessCore_1.2.1
[4] affyio_1.8.0 geneplotter_1.18.0 annotate_1.18.0
[7] xtable_1.5-2 AnnotationDbi_1.2.2 RSQLite_0.6-9
[10] DBI_0.2-4 lattice_0.17-6 Biobase_2.0.1
[13] limma_2.14.5
loaded via a namespace (and not attached):
[1] grid_2.7.0 KernSmooth_2.22-22 RColorBrewer_1.0-2
_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
--
Julian Lee
Bioinformatics Specialist
Cellular and Molecular Research
National Cancer Center Singapore