Hi
I have a set of Affymetrix Exon data comprising about 40 samples. The
last third of the samples were processed with a different kit, and I
have been asked to determine whether the change in kit is significant.
I have done clustering and PCA, and the results suggest it does make a
difference, but I would like to put some sort of statistic on it. What
is the best way to do this? I think this may be a limma-type problem,
but I am not sure how to get an overall statistic rather than just one
for individual probes.
Many thanks
Dan
--
**************************************************************
Daniel Brewer, Ph.D.
Institute of Cancer Research
Molecular Carcinogenesis
Email: daniel.brewer at icr.ac.uk
**************************************************************
The Institute of Cancer Research: Royal Cancer Hospital, a charitable
Company Limited by Guarantee, Registered in England under Company No.
534147 with its Registered Office at 123 Old Brompton Road, London SW7
3RP.
Hi Daniel,
I'm assuming that there should not be any differences between the
arrays run with the different kits. If the healthy samples were done
first and the diseased ones on the new kit, then you obviously won't be
able to separate the biological effect from the kit effect.
There are a few ways to find out whether the differences are
significant. If clustering clearly separates samples that should be
similar, then you could use the bootstrap (as in the pvclust package)
to assess the significance of the clusters. You could also look at the
probability of getting X differentially expressed probes/exons/genes
between the kits compared to random permutations of your samples. There
should be a number of other ways to get a p-value out of it.
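The permutation idea can be sketched roughly as follows. This is only
an illustration in plain Python on made-up data, not an analysis of the
arrays in this thread: the function names (welch_t, count_hits,
permutation_p), the |t| cutoff of 3, the number of permutations, and the
toy 8-versus-4 sample layout are all assumptions chosen for the example.
It counts the probes that differ between the kits, then asks how often
a random relabelling of the samples gives at least as many.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic comparing two lists of expression values."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def count_hits(data, labels, cutoff):
    """Number of probes whose |t| between the two kits exceeds cutoff."""
    hits = 0
    for row in data:
        old_kit = [x for x, k in zip(row, labels) if k == 0]
        new_kit = [x for x, k in zip(row, labels) if k == 1]
        if abs(welch_t(old_kit, new_kit)) > cutoff:
            hits += 1
    return hits


def permutation_p(data, labels, cutoff=3.0, n_perm=200, seed=0):
    """Observed hit count and its permutation p-value."""
    rng = random.Random(seed)
    observed = count_hits(data, labels, cutoff)
    shuffled = list(labels)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random reassignment of kit labels
        if count_hits(data, shuffled, cutoff) >= observed:
            at_least += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (at_least + 1) / (n_perm + 1)


# Hypothetical toy data: 50 probes, 8 old-kit and 4 new-kit samples,
# with the first 25 probes shifted upwards on the new kit.
toy_rng = random.Random(1)
labels = [0] * 8 + [1] * 4
data = []
for probe in range(50):
    shift = 3.0 if probe < 25 else 0.0
    data.append([toy_rng.gauss(0, 1) + shift * k for k in labels])

observed, p_value = permutation_p(data, labels)
print(observed, p_value)
```

Note that shuffling the kit labels does nothing about the confounding
mentioned above: if kit and sample type coincide, a small p-value here
cannot say which of the two is responsible.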
I hope this helps,
Francois
Hi Daniel,
I interpreted what you wanted differently from Francois. Does the last
third of the samples contain all sample types (i.e., they aren't all
experimental or all control)?
If so, you could always fit a linear model to the data that includes a
kit effect. You will then be able to test, for each probeset, whether
the 'kit' parameter is equal to zero or not.
When you mention putting a statistic on it, is this what you mean?
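As a rough sketch of what testing the 'kit' parameter means for a single
probeset, the following fits expression ~ intercept + sample type + kit
by ordinary least squares in plain Python. Everything in it is assumed
for illustration: the helper names (invert, fit_kit_effect), the 12-array
layout, and the made-up expression values. limma's lmFit fits this kind
of model for every probeset and eBayes then moderates the variances,
which this unmoderated version does not reproduce.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_kit_effect(y, group, kit):
    """OLS fit of y ~ 1 + group + kit; returns (kit coefficient, t-statistic)."""
    X = [[1.0, float(g), float(k)] for g, k in zip(group, kit)]
    p = 3
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    resid = [v - sum(b * x for b, x in zip(beta, r)) for r, v in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - p)  # residual variance
    se_kit = (sigma2 * inv[2][2]) ** 0.5               # std. error of kit term
    return beta[2], beta[2] / se_kit


# Hypothetical layout: 4 controls and 4 experimentals on the old kit,
# 4 experimentals on the new kit.
group = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
kit = [0] * 8 + [1] * 4
noise = [0.1, -0.1] * 6
y = [5.0 + 1.0 * g + 2.0 * k + e for g, k, e in zip(group, kit, noise)]

coef, t_stat = fit_kit_effect(y, group, kit)
print(coef, t_stat)
```

The t-statistic has n - 3 residual degrees of freedom here; a test of
whether the kit parameter is zero compares it against that t
distribution.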
Best,
Jim
_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
**********************************************************
Electronic Mail is not secure, may not be read every day, and should
not be used for urgent or sensitive issues.
Sorry this has taken me so long, but I have been away. Unfortunately,
the samples in the last third are all experimental, which makes life a
bit tricky. That said, the rest are a mixture of controls and
experimentals. If I did use the linear model, would it be fair to say
that if, say, 50% of probes have a "kit" effect, then the kit effect is
significant?
Many thanks
Dan
Hi Dan,
Luckily you have a mixture of control and experimental samples in the
first set. This is still not an ideal situation, because by fitting a
kit parameter you are assuming that any difference between the
experimentals in the first set and those in the second is completely
explained by the kit.
In other words, if there are any other differences between these
experimentals that are not due to the kit change, you won't be able to
detect them. In fact, you will ignore them.
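Why the mixed first set matters can be checked on the design matrix
itself. The sketch below, in plain Python with a hypothetical 12-array
layout and a hand-written rank helper (both assumptions for
illustration), compares a design where the old-kit arrays include both
sample types with one where sample type and kit coincide exactly. Only
in the first case are the sample-type and kit columns distinguishable.

```python
def rank(rows, tol=1e-9):
    """Rank of a matrix (list of rows) by Gaussian elimination."""
    m = [list(map(float, r)) for r in rows]
    found = 0
    for col in range(len(m[0])):
        pivot = None
        for r in range(found, len(m)):
            if abs(m[r][col]) > tol:
                pivot = r
                break
        if pivot is None:
            continue  # no new independent direction in this column
        m[found], m[pivot] = m[pivot], m[found]
        for r in range(found + 1, len(m)):
            factor = m[r][col] / m[found][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[found])]
        found += 1
    return found


def design(group, kit):
    """Rows of (intercept, sample type, kit) for each array."""
    return [[1, g, k] for g, k in zip(group, kit)]


kit = [0] * 8 + [1] * 4

# Old kit has controls and experimentals; new kit has experimentals only.
mixed = design([0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1], kit)

# Old kit has controls only; new kit has experimentals only.
confounded = design([0] * 8 + [1] * 4, kit)

rank_mixed = rank(mixed)
rank_confounded = rank(confounded)
print(rank_mixed, rank_confounded)
```

A full-rank design only means the kit term can be estimated. It is still
estimated purely from experimental samples, so anything else that
differs between the two sets of experimentals ends up in that term.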
As to your question, I don't think it is that simple. When you fit the
model in limma, you do so for each probeset individually. The effect of
the kit change is unlikely to be consistent across probesets (e.g.,
some probesets may not be affected at all, whereas others may show much
higher or lower binding). So when you fit the model you can see, for
each probeset, whether the change in kit affected it by checking
whether the kit (batch) effect is significant.
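Because the test is run once per probeset, the per-probeset p-values
would normally be adjusted for multiple testing before counting how many
probesets show a kit effect. The sketch below is a plain-Python version
of the Benjamini-Hochberg adjustment (the default adjustment in limma's
topTable); the function name and the four example p-values are made up
for illustration. It gives a count of affected probesets, not a single
overall statistic for the kit change.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for steps_from_end, idx in enumerate(reversed(order)):
        position = n - steps_from_end  # 1-based rank of this p-value
        running_min = min(running_min, p_values[idx] * n / position)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical kit-effect p-values for four probesets.
kit_p = [0.01, 0.04, 0.03, 0.005]
kit_p_adj = benjamini_hochberg(kit_p)
n_affected = sum(1 for p in kit_p_adj if p < 0.05)
print(kit_p_adj, n_affected)
```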
Best,
Jim