Apply non-Gaussian distribution data to limma
Guest User ★ 13k
@guest-user-4897
Hi,

I have high-throughput array data whose distribution has two peaks, both raw and after log2 transformation. From what I found by searching, it looks like a Gaussian mixture (two normal distributions), as some people have suggested. I believe limma assumes normally distributed data. Is there any way to fix this, or to convert my data to a normal distribution before applying limma?

Thanks,
Jiang

-- output of sessionInfo(): see question

-- Sent via the guest posting facility at bioconductor.org.
limma convert • 2.4k views
@james-w-macdonald-5106
United States
Hi Jiang,

The distributions you are talking about are within subject (e.g., the distribution of the expression values for multiple genes from an individual), but the comparisons you are making are between subjects.

It really doesn't matter what the within-subject distributions look like, unless you are concerned with intra-subject comparisons. The between-subject comparisons are usually based on far too few replicates to get a real sense of the distribution, and you are usually making thousands of such comparisons, so even if you could check the gene-wise distributions you would then have to 'fix' them one by one.

Luckily, the t-statistic is pretty robust to non-normally distributed data, so you (like thousands of people already) can just go ahead and fit the model using limma. If you are really worried, you could use a resistant regression, but the cost is power, which most microarray studies (at least in my experience) are lacking already.

Best,

Jim

(Replying to Jiang's post of Monday, December 09, 2013 10:02:25 PM.)

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
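To make the within-subject versus between-subject distinction concrete, here is a minimal simulation sketch. It is not limma code and not anything from the posts in this thread: the gene count, sample count, expression levels, noise sizes, and the `middle_fraction` helper are all hypothetical choices made for illustration. It builds toy log2 data in which half the genes sit at a low level and half at a high level, then compares the shape of one sample's values across genes with the shape of the gene-wise deviations across samples.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N_GENES = 2000   # hypothetical number of probes
N_SAMPLES = 6    # hypothetical number of arrays


def simulate_log2_expression():
    """Rows are genes, columns are samples (toy log2 intensities)."""
    rows = []
    for g in range(N_GENES):
        # Half the genes sit near a "low" level, half near a "high" level.
        centre = 4.0 if g % 2 == 0 else 10.0
        baseline = random.gauss(centre, 0.5)
        # Each gene varies across samples with ordinary Gaussian noise.
        rows.append([baseline + random.gauss(0.0, 0.3) for _ in range(N_SAMPLES)])
    return rows


def middle_fraction(values):
    """Fraction of values falling in the central third of their range."""
    lo, hi = min(values), max(values)
    third = (hi - lo) / 3.0
    inside = sum(1 for v in values if lo + third <= v <= hi - third)
    return inside / len(values)


expr = simulate_log2_expression()

# Within one sample, across genes: two peaks with a near-empty middle.
one_sample = [row[0] for row in expr]

# Within each gene, across samples: deviations from that gene's own mean.
residuals = []
for row in expr:
    mean = sum(row) / len(row)
    residuals.extend(v - mean for v in row)

print("within-sample middle fraction:", round(middle_fraction(one_sample), 3))
print("gene-wise residual middle fraction:", round(middle_fraction(residuals), 3))
```

In this toy setup the per-sample values are strongly bimodal (very few values in the middle of the range), while the gene-wise deviations pile up around zero. The second quantity is the one a gene-by-gene linear model works with, so a two-peaked intensity histogram by itself says little about whether the model's assumptions hold.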
Hi Jim,

Thank you very much for your reply. I understand what you mean. Yes, I am making comparisons between groups (subjects), and I will go ahead with limma.

Best,
Zhengyu

(Replying to Jim's message of Tue, 10 Dec 2013 09:04:48 -0500, Subject: Re: [BioC] Apply non-Gaussian distribution data to limma.)