Effect of biological variation on tests

Yogi Sundaravadanam ▴ 320

@yogi-sundaravadanam-2312

Last seen 9.9 years ago

Hi all

I am working with biological replicates and I am a bit worried about the biological variation between samples.

For example, the abundance of a certain gene in sample 1 could be hundreds of times higher or lower than in sample 2. If this is the case, it will significantly affect the P-value in the t-test.

As such, my question is whether there is a way we can account for this fact in the statistical analysis?

I would be very grateful if you could shed some light on this topic.

Thank you
Yogi

biologicalreplicates statistical test • 1.6k views

ADD COMMENT • link updated 7.9 years ago by Gordon Smyth 51k • written 16.8 years ago by Yogi Sundaravadanam ▴ 320

Jenny Drnevich ★ 2.2k

@jenny-drnevich-382

Last seen 9.9 years ago

Hi Yogi,

> As such, my question is whether there is a way we can account for this fact in the statistical analysis?

I'm not sure what your question is. The fact that a large amount of biological variation among samples in one treatment group will affect the P-value in a t-test is EXACTLY how the statistical analysis accounts for a large amount of biological variation.

In simplified terms, a t-test calculates the difference in the means between two groups, then adjusts for the amount of biological variation within each group. The p-value is the probability of getting a t-value at least as extreme as the calculated one if the two groups had been randomly sampled from the same distribution. A low probability leads to the conclusion that the two groups were likely sampled from distributions with different means.

If this doesn't answer your question, perhaps you could elaborate on exactly how you want to account for biological variation in the statistical analysis?

Cheers,
Jenny

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
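To make the description above concrete, here is a minimal Python sketch of the classic pooled-variance two-sample t-statistic. The expression values are invented log2 intensities, not data from this thread, and the function name `pooled_t_statistic` is purely illustrative. It shows the difference in means being divided by a standard error built from the within-group variation.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Student's two-sample t-statistic with a pooled variance estimate."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.fmean(group_a) - statistics.fmean(group_b)
    # Sample variances (n - 1 denominator) measure the variation within each group.
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / standard_error


# Hypothetical log2 expression values for one gene (assumed for illustration).
normal = [6.0, 6.1, 5.9]
diseased_consistent = [8.0, 8.2, 8.1]
diseased_one_outlier = [8.0, 8.2, 12.0]

t_consistent = pooled_t_statistic(diseased_consistent, normal)
t_outlier = pooled_t_statistic(diseased_one_outlier, normal)
print(f"consistent replicates:  t = {t_consistent:.2f}")
print(f"one outlying replicate: t = {t_outlier:.2f}")
```

With these made-up numbers, the second comparison has the larger difference in means but a much smaller t-statistic, because the outlying replicate inflates the within-group variance that the test divides by.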

ADD COMMENT • link 16.8 years ago Jenny Drnevich ★ 2.2k

Hello,

Say, for example, I have 3 biological replicates of diseased cells and, for a certain gene, the expression in 1 replicate is much higher than in the other two replicates. I would like to know whether a t-test accounts for this variability. If a gene is differentially expressed in diseased cells compared to normal cells, I want to make sure that all the replicates of diseased cells did in fact have a similar gene expression profile.

I hope that makes sense. I'm pretty new to all this.

Thanks heaps,
Yogi

(In reply to Jenny Drnevich's message of Friday, 28 September 2007 1:33 AM, subject "Re: [BioC] Biological replicates".)

ADD REPLY • link 16.8 years ago Yogi Sundaravadanam ▴ 320

Ana Conesa ▴ 130

@ana-conesa-2246

Last seen 9.9 years ago

There will always be a difference in expression between biological replicates. If it is big, then you need bigger differences between conditions to find a significantly differentially expressed gene. It's not that this will skew the data a bit; it's that it will be harder to find significant changes. Big differences between replicates could have a technical origin or simply reflect biological variation. If you do not have technical replicates as well, you cannot tell the difference.

A

---- Original Message ----
From: Yogi Sundaravadanam
Subject: Re: [BioC] Biological replicates
Date: Fri, 28 Sep 2007 08:16:09 +1000

> I was just wondering what I should do if the difference of expression exists between the replicates itself... won't that skew the data a bit?
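The point that noisier replicates demand a bigger difference between conditions can be sketched numerically. The snippet below is only an illustration under stated assumptions: both groups share the same replicate standard deviation, there are 3 replicates per group, a pooled two-sample t-test is used, and 2.776 is the standard tabulated two-sided 5% critical value of Student's t with 4 degrees of freedom. The function name and the standard deviations are hypothetical.

```python
import math


def smallest_significant_difference(within_group_sd, n_per_group, t_critical):
    """Difference in group means at which a pooled two-sample t-test just reaches t_critical."""
    standard_error = within_group_sd * math.sqrt(2.0 / n_per_group)
    return t_critical * standard_error


# Two-sided 5% critical value of Student's t with 4 degrees of freedom
# (3 replicates per group gives 3 + 3 - 2 = 4), taken from standard t tables.
T_CRITICAL_DF4 = 2.776

needed_by_sd = {}
for sd in (0.1, 0.5, 1.0):  # hypothetical replicate standard deviations (log2 units)
    needed_by_sd[sd] = smallest_significant_difference(
        sd, n_per_group=3, t_critical=T_CRITICAL_DF4
    )
    print(f"replicate SD {sd:.1f} -> difference needed: {needed_by_sd[sd]:.2f}")
```

The required difference grows in direct proportion to the replicate standard deviation, which is the sense in which large replicate-to-replicate variation makes significant changes harder to find rather than biasing the result.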

ADD COMMENT • link 16.8 years ago Ana Conesa ▴ 130

Technical replication is usually not effective in determining biologically meaningful effects, but is certainly useful for determining whether an outlying sample is actually biologically different, or just part of the usual variability in the system (which is a mix of biological variation and technical variation). However, it is also useful to remember that the technical variation in the system can be due to the sample preparation as well as the hybridization. So a "bad" array might produce an almost identical technical replicate. All in all, if possible it is best to take another biological sample.

With small sample sizes, you cannot help seeing what appear to be unusual effects. To give you an idea, suppose that you have 4 biological replicates from the same treatment and you divide them arbitrarily into 2 groups of 2. There is a 1/3 probability that the 2 largest end up in one group and the 2 smallest in the other. On the other hand, there is also a 1/3 probability that the largest and smallest are in one group and the 2 middle ones in the other, which gives the false impression that the variability is higher in one group than the other.

--Naomi

At 06:51 PM 9/27/2007, Ana Conesa wrote:

> Big differences between replicates could have a technical origin or simply reflect biological variation. If you do not have technical replicates as well, you cannot tell the difference.

Naomi S. Altman
Associate Professor
Dept. of Statistics
Penn State University
University Park, PA 16802-2111
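The 1/3 probabilities in the example of 4 replicates split into 2 groups of 2 can be checked by brute-force enumeration. This is a small sketch that works only with ranks (1 = smallest, 4 = largest) and assumes every split is equally likely; the category labels are names chosen here for illustration.

```python
from fractions import Fraction
from itertools import combinations

ranks = [1, 2, 3, 4]  # 1 = smallest replicate value, 4 = largest


def classify(group_a, group_b):
    """Label a split by how the two groups of ranks sit relative to each other."""
    if max(group_a) < min(group_b) or max(group_b) < min(group_a):
        return "separated"  # 2 largest in one group, 2 smallest in the other
    if (min(group_a) < min(group_b) and max(group_a) > max(group_b)) or (
        min(group_b) < min(group_a) and max(group_b) > max(group_a)
    ):
        return "nested"  # largest and smallest together, 2 middle ones together
    return "interleaved"  # remaining case, e.g. {1, 3} versus {2, 4}


counts = {"separated": 0, "nested": 0, "interleaved": 0}
splits = list(combinations(ranks, 2))  # every way to choose the first group
for group_a in splits:
    group_b = tuple(r for r in ranks if r not in group_a)
    counts[classify(group_a, group_b)] += 1

probabilities = {label: Fraction(n, len(splits)) for label, n in counts.items()}
for label, p in probabilities.items():
    print(f"{label}: {p}")
```

Each of the three patterns covers 2 of the 6 equally likely assignments, so an apparent group difference or an apparent difference in variability each arise by chance one time in three with samples this small.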

ADD REPLY • link 16.8 years ago Naomi Altman ★ 6.0k

Yogi Sundaravadanam ▴ 320

@yogi-sundaravadanam-2312

Last seen 9.9 years ago

Naomi Altman wrote (Friday, 28 September 2007 1:01 AM, subject "Re: [BioC] Biological replicates"):

> This is exactly what the t-test is all about. If you want to state that a gene differentially expresses between 2 conditions, don't you mean that the difference in expression is higher than the difference between biological replicates of the same condition?
>
> --Naomi

I was just wondering what I should do if the difference in expression exists between the replicates themselves... won't that skew the data a bit?

ADD COMMENT • link 16.8 years ago Yogi Sundaravadanam ▴ 320

tidecrepep • 0

@tidecrepep-11333

Last seen 7.8 years ago

Thanks for the details. The DNA microarray platforms like Buserelin Acetate generally provided highly correlated data, while moderate correlations between microarrays and MPSS were obtained. Using the standard curve method and the same set of standard DNA for every qpcr run.

ADD COMMENT • link updated 7.7 years ago by Gordon Smyth 51k • written 7.9 years ago by tidecrepep • 0
