Is a change in kit significant?

Daniel Brewer ★ 1.9k

@daniel-brewer-1791

Last seen 9.6 years ago

Hi,

I have a set of Affymetrix Exon data with about 40 samples. The last third of the samples were processed with a different kit, and I have been asked to determine whether the change in kit is significant.

I have done clustering and PCA, and the results suggest it does make a difference, but I would like to put some sort of statistic on it. What is the best way to do this? I think this may be a limma-type problem, but I am not sure how to get an overall statistic rather than one for each individual probe.

Many thanks,

Dan

--
Daniel Brewer, Ph.D.
Institute of Cancer Research
Molecular Carcinogenesis

Clustering Cancer limma • 791 views

updated 17.0 years ago by Francois Pepin ★ 1.3k • written 17.0 years ago by Daniel Brewer ★ 1.9k

Francois Pepin ★ 1.3k

@francois-pepin-1012

Last seen 9.6 years ago

Hi Daniel,

I'm assuming that there should not be any differences between the arrays with the different kits. If the healthy samples were run first and the diseased ones on the new kit, then you obviously won't be able to separate the biological effect from the kit effect.

There are a few ways to find out whether the differences are significant. If clustering clearly separates samples that should be similar, you could use a bootstrap (for example, the pvclust package) to assess significance. You could also look at the probability of getting X differentially expressed probes/exons/genes between the kits compared with random permutations of your samples. There should be a number of other ways to get a p-value out of it.

I hope this helps,

Francois

On Wed, 2007-05-09 at 16:39 +0100, Daniel Brewer wrote:
> I have a set of Affymetrix Exon data which has about 40 samples. The
> last third of the samples have used a different kit for the experiment,
> and I have been asked to determine whether the change in kit is significant.
> [...]
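The permutation idea Francois describes can be sketched roughly as follows. This is only an illustration in plain Python, not something posted in the thread: the simulated data, the 26/14 sample split, the Welch t-statistic, and the cutoff of 3 are all assumptions chosen for the example. It counts probes that differ between the two kits, then asks how often a random relabelling of the samples gives at least as many. Like Francois's suggestion, it assumes the kit is the only systematic difference between the two sets of arrays.

```python
import random

def mean_var(xs):
    """Sample mean and unbiased sample variance of a list of numbers."""
    n = len(xs)
    m = sum(xs) / n
    return m, sum((x - m) ** 2 for x in xs) / (n - 1)

def welch_t(a, b):
    """Welch t-statistic comparing the means of two lists."""
    ma, va = mean_var(a)
    mb, vb = mean_var(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (ma - mb) / se if se > 0 else 0.0

def count_hits(expr, labels, cutoff=3.0):
    """Number of probes whose |t| between the two kit groups exceeds cutoff."""
    hits = 0
    for probe in expr:
        old = [x for x, k in zip(probe, labels) if k == 0]
        new = [x for x, k in zip(probe, labels) if k == 1]
        if abs(welch_t(old, new)) > cutoff:
            hits += 1
    return hits

def permutation_p_value(expr, labels, n_perm=200, cutoff=3.0, seed=0):
    """Compare the observed hit count with counts under shuffled kit labels."""
    rng = random.Random(seed)
    observed = count_hits(expr, labels, cutoff)
    shuffled = list(labels)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_hits(expr, shuffled, cutoff) >= observed:
            at_least += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)

def simulate(n_probes=100, n_old=26, n_new=14, shift=1.0, seed=1):
    """Hypothetical data: a quarter of the probes shift under the new kit."""
    rng = random.Random(seed)
    labels = [0] * n_old + [1] * n_new
    expr = []
    for i in range(n_probes):
        delta = shift if i < n_probes // 4 else 0.0
        expr.append([rng.gauss(8.0, 0.5) + delta * k for k in labels])
    return expr, labels

expr, labels = simulate()
observed, p = permutation_p_value(expr, labels)
print("probes over cutoff:", observed, "permutation p-value:", p)
```

With real data, `expr` would hold one list of normalised intensities per probeset and `labels` the kit used for each array. If kit and sample type are partly confounded, shuffling the labels freely mixes the two effects, so the resulting p-value would need careful interpretation.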

17.0 years ago • Francois Pepin ★ 1.3k

Hi Daniel,

I had a different interpretation of what you wanted than the one Francois describes. Did the last third of the samples contain all sample types (e.g., they aren't all just experimental or all just control)?

If so, you could fit a linear model to the data that includes a kit effect. You would then be able to test, for each probeset, whether the 'kit' parameter is equal to zero.

When you mention putting a statistic on it, is this what you mean?

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
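To make the per-probeset test concrete, here is a minimal sketch of the kind of model Jim describes, written in plain Python rather than limma. Everything in it is an assumption for illustration: the function names, the simulated intensities, and the 26/14 split. It uses ordinary least squares with a plain t-statistic, whereas limma would additionally moderate the variances across probesets.

```python
import random

def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def kit_t_statistic(y, group, kit):
    """Fit y = b0 + b1*group + b2*kit for one probeset; return (b2, t)."""
    X = [[1.0, float(g), float(k)] for g, k in zip(group, kit)]
    p = 3
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [v - sum(b * x for b, x in zip(beta, r)) for r, v in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - p)
    # Var(b2) = sigma2 * [inverse(X'X)]_33; solving X'X c = e3 gives that column.
    c = solve(xtx, [0.0, 0.0, 1.0])
    se = (sigma2 * c[2]) ** 0.5
    return beta[2], beta[2] / se

# Hypothetical probeset: both sample types appear under both kits.
rng = random.Random(0)
kit = [0] * 26 + [1] * 14
group = [i % 2 for i in range(40)]
y = [8.0 + 1.0 * g + 0.5 * k + rng.gauss(0.0, 0.1) for g, k in zip(group, kit)]

b2, t = kit_t_statistic(y, group, kit)
print("estimated kit effect:", round(b2, 2), "t-statistic:", round(t, 1))
```

The statistic `t` tests whether the kit coefficient is zero for this one probeset, with `len(y) - 3` residual degrees of freedom. Repeating the fit for every probeset gives one test per probeset, which is the output Jim refers to.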

17.0 years ago • James W. MacDonald 65k

Sorry this has taken me so long, but I have been away. Unfortunately, the last third contains only experimental samples, which makes life a bit tricky. That said, the rest are a mixture of controls and experimental samples. If I did use the linear model, would it be fair to say that if, say, 50% of probes have a "kit" effect, then the kit effect is significant?

Many thanks,

Dan

16.9 years ago • Daniel Brewer ★ 1.9k

Hi Dan,

Luckily you have a mixture of control and experimental samples for the first set. This is still not an ideal situation, because by fitting a kit parameter you are assuming that any difference between the experimentals in the first set and those in the second is completely explained by the kit. In other words, if there are any other differences between these experimentals that are not due to the kit change, you won't be able to detect them. In fact, you will ignore them.

As to your question, I don't think it is that simple. When you fit the model in limma, you do so for each probeset individually. The effect of the kit change is unlikely to be consistent across probesets (e.g., some probesets may not be affected at all, whereas others may have much higher or lower binding). So when you fit the model, you can see whether the change in kit affected a given probeset by checking whether the kit (batch) effect is significant for that probeset.

Best,

Jim
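Jim's point about the mixed first set can be illustrated by checking whether the design matrix has full column rank. The sketch below is a plain-Python illustration under assumed sample counts (13 controls and 13 experimentals on the old kit, 14 experimentals on the new kit); the thread only says "about 40 samples" with the last third on the new kit. With the layout Dan describes, the intercept, group, and kit columns are linearly independent, so a kit coefficient can be estimated. If the first set had been all controls, kit and group would be identical columns and the two effects could not be separated, which is the situation Francois warned about.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a small matrix, computed by Gaussian elimination."""
    m = [[float(x) for x in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a usable pivot.
        pivot = next((r for r in range(rank, len(m)) if abs(m[r][col]) > tol), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            f = m[r][col] / m[rank][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def design(group, kit):
    """Design matrix with columns: intercept, group (1 = experimental), kit (1 = new)."""
    return [[1, g, k] for g, k in zip(group, kit)]

# Layout described in the thread (counts are hypothetical):
# old kit has controls and experimentals, new kit has experimentals only.
group_mixed = [0] * 13 + [1] * 13 + [1] * 14
kit_mixed = [0] * 26 + [1] * 14

# Fully confounded layout: all controls on the old kit, all experimentals on the new.
group_conf = [0] * 26 + [1] * 14
kit_conf = [0] * 26 + [1] * 14

rank_mixed = matrix_rank(design(group_mixed, kit_mixed))
rank_conf = matrix_rank(design(group_conf, kit_conf))
print(rank_mixed)  # 3: the kit effect is estimable
print(rank_conf)   # 2: kit and group cannot be separated
```

Full rank only means the kit coefficient can be computed. As Jim notes, it still absorbs any other difference between the earlier and later experimental samples.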

16.9 years ago • James W. MacDonald 65k
