Dear list,
I used r/gPreProcessedSignals from Agilent FE output files as a
starting point for analysis, without any filtering. The density plot
(see attachment) indicates that both channels were quite consistent in
the high-intensity range. There were, however, separate red and green
peaks, located at log2 intensities of about 5 and 4 respectively. The
MA plots looked fairly normal (see attached).
The experiment was human colon cancer versus Stratagene universal human
cancer RNAs. To me, these two minor peaks may be more than can be
explained by dye bias alone, since r/gPreprocessedSignal is supposed to
have already gone through a lowess normalization or something similar.
Could they reflect "real" differences between the samples and the
universal reference?
Has anyone had similar observations? I would appreciate any comments.
##################################
Jianping Jin Ph.D.
Bioinformatics scientist
Center for Bioinformatics
Room 3133 Bioinformatics building
CB# 7104
University of Chapel Hill
Chapel Hill, NC 27599
Phone: (919)843-6105
FAX: (919)843-3103
E-Mail: jjin at email.unc.edu
Sorry, I made some mistakes attaching the plots to my last email, so I
am sending it again. Apologies for the multiple versions.
Dear list,
I used r/gPreProcessedSignals from Agilent FE output files as a
starting point for analysis, without filtering out any genes except the
control spots. The density plot indicates that both channels were well
matched in the high-intensity range. There were, however, separate red
and green peaks, located at log2 intensities of about 5 and 4
respectively. The MA plots looked fairly normal (the plots can be
viewed at <http://www.unc.edu/~jjin/graph/>).
The experiment was human colon cancer versus Stratagene universal human
cancer RNAs. To me, the two minor peaks may be more than can be
explained by dye bias alone, since r/gPreprocessedSignal is supposed to
have already gone through a lowess normalization or something similar.
Could they reflect "real" differences between the samples and the
universal reference?
Has anyone had similar observations? I would appreciate any comments.
thanks,
Jianping
I don't think you can say there are any real differences based on
those peaks. A log2 of 5 corresponds to an intensity value of only 32,
which is extremely low. In your MA plots you can see a kind of "blob"
in one direction, at the very left of the plots. It looks to me
(without being familiar with the actual processing you used) as though
anything below 8 or so on the A axis is produced by very low intensity
spots, and those measurements cannot be very reliable.
If you filter out low intensity spots (on BOTH channels, on all
slides) you will probably clean up that area of the graph and remove
the spots that produced those small peaks.
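
As a rough illustration of the filtering suggested above, here is a minimal Python sketch. The data layout (one list of (red, green) raw intensities per slide), the function name, and the cutoff of 2^8 are assumptions made for this example, not anything defined by the Agilent FE output. It also assumes one particular reading of the advice: a spot is dropped only when both channels are below the cutoff on every slide, so that spots with signal in just one channel are retained.

```python
def filter_low_intensity(slides, min_log2=8.0):
    """Return the indices of spots to keep.

    slides: list of slides; each slide is a list of (red, green) raw
    intensities, with spot i at the same index on every slide.
    A spot is dropped only when BOTH channels are below the cutoff
    on ALL slides (an assumed reading of the advice above).
    """
    threshold = 2.0 ** min_log2
    keep = []
    for i in range(len(slides[0])):
        low_everywhere = all(
            slide[i][0] < threshold and slide[i][1] < threshold
            for slide in slides
        )
        if not low_everywhere:
            keep.append(i)
    return keep


# Hypothetical toy data: two slides, four spots each.
slides = [
    [(32, 16), (1024, 900), (40, 2048), (20, 30)],
    [(28, 20), (1100, 950), (35, 1900), (300, 25)],
]
print(2 ** 5)                        # 32, the intensity behind a log2 of 5
print(filter_low_intensity(slides))  # [1, 2, 3]
```

Spot 0 is weak in both channels on both slides and is removed; spot 2 is kept because its green signal is strong even though its red signal is not.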
Jose
--
Dr. Jose I. de las Heras                    Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology  Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology      Fax: +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK
Hi Jose,
Thanks for your comments! I agree that the intensities of those small
peaks are too low to be reliable, and we can remove them from further
analyses; no question about that.
As for my earlier question of whether they could reflect "real"
differences between the colon cancer and the universal cancer cell line
RNAs, there may be more to consider than just removing those spots.
What I noticed was that some probes hybridized only with the reference
RNAs and some others only with the colon cancer samples (see
"RG_cutoff.jpeg" at <http://www.unc.edu/~jjin/graph/>). Taking one chip
as an example, 4548 genes showed green signals above 2^8 with red
signals below 2^6, and 1831 genes showed red signals above 2^8 with
green signals below 2^5. In both cases the maximum signal, red or
green, can be as high as 2^12. This observation suggests that there
are some real differences between the RNAs.
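
For anyone wanting to reproduce this kind of count, a minimal Python sketch follows. The cutoffs are the ones quoted above (2^8 for the strong channel, 2^6 and 2^5 for the weak red and green channels); the function name and the toy (red, green) values are made up for illustration and are not taken from the actual chip.

```python
def count_channel_specific(spots, high=2 ** 8, low_red=2 ** 6, low_green=2 ** 5):
    """Count spots with strong signal in one channel and weak in the other.

    spots: iterable of (red, green) intensities for one chip.
    Returns (green_only, red_only).
    """
    spots = list(spots)
    green_only = sum(1 for r, g in spots if g > high and r < low_red)
    red_only = sum(1 for r, g in spots if r > high and g < low_green)
    return green_only, red_only


# Hypothetical toy chip with five spots.
spots = [(40, 900), (50, 300), (1000, 20), (500, 500), (30, 10)]
print(count_channel_specific(spots))  # (2, 1)
```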
This raises another question: is the pooled universal cancer RNA an
ideal reference? It may make the results for some genes difficult to
interpret.
Any comments will be appreciated!
Jianping
--On Thursday, May 10, 2007 6:46 PM +0100 J.delasHeras at ed.ac.uk wrote:
> I don't think you can say there are any real differences based on
> those peaks. A log2 of 5 comes from an intensity value of only 32,
> that's extremely low. In your MA plots you can see a kind of a "blob"
> in one direction, at the very left of the plots... it looks to me
> (without being familiar with the actual processing you used), that
> anything below 8 or so is produced by very low intensity spots, and
> the measurements cannot be very reliable.
> If you do a filtering to remove low intensity spots (on BOTH
> channels, on all slides) you will probably clean up that area of the
> graph, and will remove the spots that produced those small peaks.
>
> Jose
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
Hi Jianping,
> In terms of my previous question of whether or not they could be
> "real" difference existing between the colon cancer and the universal
> cancer cell line RNAs, considerations may be given beyond just
> removing those spots. What I noticed was that some probes can only be
> hybridized with the reference RNAs and some others only with colon
> cancer samples (see "RG_cutoff.jpeg" at
> <http://www.unc.edu/~jjin/graph/>). Take one chip as an example, 4548
> genes showed green signals more than 2^8 with red signals less than
> 2^6, and 1831 genes showed red signal more than 2^8 with green signal
> less than 2^5. On both cases maximum signals, red or green, can be as
> high as 2^12. The observation suggested that there exist some real
> differences between RNAs.
I am not surprised that you can find individual genes that have signal
only in one of the samples, either the reference or the cancer one. In
fact, this is the sort of thing I am usually looking for: genes that
are either silenced or activated in cancer, with respect to a "normal"
reference.
The plot you're showing does not appear to come from normalised arrays,
in which case you can infer little from the differences in the
distributions. What it does show is that you have very weak signal on
both channels on both arrays...
Normalise your data (within arrays, probably using some "flavour" of
loess), and look at the MA plots: that's a better picture of what's
going on.
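
To make the M and A quantities concrete, here is a small Python sketch. It assumes the usual convention M = log2(red) - log2(green) and A = (log2(red) + log2(green)) / 2. The centring step shown is a plain global median shift, used here only as a simple stand-in: it is not loess, which fits an intensity-dependent curve through the M-versus-A cloud. The function names and the toy intensities are made up for illustration.

```python
import math
from statistics import median


def ma_values(red, green):
    """Return (M, A) lists for paired red/green intensities."""
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [(math.log2(r) + math.log2(g)) / 2 for r, g in zip(red, green)]
    return m, a


def median_centre(m):
    """Shift M so its median is zero (a crude stand-in, NOT loess)."""
    centre = median(m)
    return [x - centre for x in m]


# Hypothetical toy intensities for four spots on one array.
red = [1024, 512, 2048, 256]
green = [512, 512, 512, 512]
m, a = ma_values(red, green)
print(m)                 # [1.0, 0.0, 2.0, -1.0]
print(a)                 # [9.5, 9.0, 10.0, 8.5]
print(median_centre(m))  # [0.5, -0.5, 1.5, -1.5]
```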
In an ideal plot, genes that are only expressed in one sample tend to
cluster along the left two sides of an imaginary diamond. For
instance:
http://mcnach.com/MISC/MAplot.png
This is a very unusual MA plot, from an experiment where a great many
genes are activated (a cell line transfected with a strong
activator, hybridised against the non-transfected cells).
I drew the "imaginary diamond" in red, and numbered 1 and 2 the two
sides I was talking about. Along 1 you get genes that are activated in
one sample (with M>0), and along 2 you would get genes silenced in the
same sample (with M<0).
This experiment is unusual in that it lets you see clearly a "spike"
of activated genes along "1". In most experiments you don't see
anything like that, but that's the area where ideally you'll have this
sort of gene clustering. If there are many genes that only have
signal in one of your samples, you may see a well populated "cloud"
around these areas.
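
One way to see why such genes line up along the edges of the diamond: if one channel sits at a roughly constant background level, M and A are tied together by a straight line of slope 2. The Python sketch below checks this numerically. It assumes M = log2(red/green), A = the mean of the two log2 intensities, and a background level of 32 (2^5) in the silent channel; the function names and values are illustrative only.

```python
import math


def ma_point(red, green):
    """Return (M, A) for a single spot."""
    m = math.log2(red) - math.log2(green)
    a = (math.log2(red) + math.log2(green)) / 2
    return m, a


def edge_m_from_a(a, floor):
    """M expected on the diamond edge when green is stuck at `floor`."""
    return 2 * (a - math.log2(floor))


floor = 32  # assumed background-level intensity in the silent channel
for red in (64, 256, 1024, 4096):
    m, a = ma_point(red, floor)
    # The two M columns agree: the points fall on a line of slope 2.
    print(red, a, m, edge_m_from_a(a, floor))
```

Swapping the roles of the two channels gives the mirror-image edge with slope -2, which corresponds to the side with M<0.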
Your MA plots seem to me to indicate that this is the case (starting
from A around 8+; the stuff on the left seems a little artifactual)...
but you really need to dig in deeper if you want some clear answers ;)
> This raises another question. Is the pooled universal cancer RNA an
> ideal reference? It may create difficulties in explanation of results
> for some genes.
Ideal? It depends on the experiment, I suppose.
It all depends on what questions you're asking. Even very closely
related samples, from similar tissues, one cancerous and one normal,
have lots of expression differences. Your answers will of course be
determined by what comparisons you're making, what references you
choose, etc. A pooled "universal cancer" RNA can potentially contain
very different types of cells, etc., which can be good or bad,
depending on what you're after, really...
Jose
Hi Jianping,
> In terms of my previous question of whether or not they could be
> "real" difference existing between the colon cancer and the universal
> cancer cell line RNAs, considerations may be given beyond just
> removing those spots. What I noticed was that some probes can only be
> hybridized with the reference RNAs and some others only with colon
> cancer samples (see "RG_cutoff.jpeg" at
> <http://www.unc.edu/~jjin/graph/>). Take one chip as an example, 4548
> genes showed green signals more than 2^8 with red signals less than
> 2^6, and 1831 genes showed red signal more than 2^8 with green signal
> less than 2^5. On both cases maximum signals, red or green, can be as
> high as 2^12. The observation suggested that there exist some real
> differences between RNAs.
If these differences you see are really due to differences in the cell
types, you should see the reverse effect in the dye-swapped arrays.
You may also want to have a look at the raw signals, and also check
the quality of the slides (just because the slides are Agilent doesn't
mean they're perfect).
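
The dye-swap check mentioned above can be sketched in a few lines of Python. This assumes M = log2(red/green) for the same probe on a forward array and on its dye-swapped partner; the function name, the labels, and the cutoff of |M| >= 1 are arbitrary choices for illustration. A difference between the samples should flip sign when the dyes are swapped, whereas a dye or probe effect should keep the same sign.

```python
def classify_dye_swap(m_forward, m_swapped, min_abs_m=1.0):
    """Compare M for one probe on a forward and a dye-swapped array."""
    if abs(m_forward) < min_abs_m or abs(m_swapped) < min_abs_m:
        return "no clear change"
    if m_forward * m_swapped < 0:
        # Sign flips with the dyes: it follows the sample, not the dye.
        return "consistent with a real difference"
    # Same sign on both arrays: it follows the dye, not the sample.
    return "consistent with dye or probe bias"


# Hypothetical M values for three probes.
print(classify_dye_swap(3.0, -2.8))  # consistent with a real difference
print(classify_dye_swap(3.0, 2.9))   # consistent with dye or probe bias
print(classify_dye_swap(0.2, -0.1))  # no clear change
```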
Jeremy Davis-Turak
Thanks Jeremy,
You made a good point. Unfortunately, dye-swapped arrays were not
included in the original experimental design.
We did check the quality of the slides. We found the slides were good,
except for a couple of samples which we removed from the analysis due
to some gridding problems.
thanks again for your comment!
Jianping
--On Saturday, May 12, 2007 11:50 AM -0700 Jeremy Davis-Turak
<jeremydt at gmail.com> wrote:
> Hi Jianping,
>
> If these differences you see are really due to differences in the
> cell types, you should see the reverse effect in the dye-swapped
> arrays.
>
> You may also want to have a look at the raw signals, and also check
> the quality of the slides (just because the slides are Agilent
> doesn't mean they're perfect).
>
> Jeremy Davis-Turak