duplicateCorrelation and design matrix
@gordon-smyth
WEHI, Melbourne, Australia
> Date: Sun, 03 Jul 2005 10:13:29 +0000
> From: Carolyn Fitzsimmons <carolyn.fitzsimmons at imbim.uu.se>
> Subject: Re: [BioC] duplicateCorrelation and design matrix
> To: bioconductor at stat.math.ethz.ch
>
> Hi Gordon, thanks for your reply. I have a few more questions:
>
> Quoting Gordon K Smyth <smyth at wehi.edu.au>:
>
>> > Date: Thu, 30 Jun 2005 11:44:02 +0000
>> > From: Carolyn Fitzsimmons <carolyn.fitzsimmons at imbim.uu.se>
>> > Subject: [BioC] duplicateCorrelation and design matrix
>> > To: Bioconductor list <bioconductor at stat.math.ethz.ch>
>> >
>> > Hello,
>> >
>> > I need an explanation of how the design matrix influences the consensus
>> > correlation of the duplicateCorrelation function when accounting for
>> > technical replicates. Here is my specific example:
>> >
>> > Design matrix:
>> > > design
>> >    RJf RJm WLf WLm
>> > 1    0   0   0   1
>> > 2    0   0   0   1
>> > 3    0   0   0   1
>> > 4    0   0   0   1
>> > 5    0   0   0   1
>> > 6    0   0   0   1
>> > 7    0   0   0   1
>> > 8    0   0   0   1
>> > 9    0   0   1   0
>> > 10   0   0   1   0
>> > 11   0   0   1   0
>> > 12   0   0   1   0
>> > 13   0   0   1   0
>> > 14   0   0   1   0
>> > 15   0   0   1   0
>> > 16   0   0   1   0
>> > 17   0   1   0   0
>> > 18   0   1   0   0
>> > 19   0   1   0   0
>> > 20   0   1   0   0
>> > 21   0   1   0   0
>> > 22   0   1   0   0
>> > 23   0   1   0   0
>> > 24   0   1   0   0
>> > 25   1   0   0   0
>> > 26   1   0   0   0
>> > 27   1   0   0   0
>> > 28   1   0   0   0
>> > 29   1   0   0   0
>> > 30   1   0   0   0
>> > 31   1   0   0   0
>> > 32   1   0   0   0
>> >
>> > Each second slide is a replicate of the first (e.g. 1 and 2 are replicates,
>> > then 3 and 4, etc.). There are also 4 groups that I want to compare, with 4
>> > individuals in each group (each duplicated).
>> > So I continue with the duplicateCorrelation:
>> >
>> > > cor <- duplicateCorrelation(Mmatrix_ny, design=design,
>> > +     block=c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15,16,16))
>> > > cor$cor
>> > [1] -0.03060575
>> >
>> > which is a pretty bad correlation so I probably should just use the
>> > technical replicates as biological replicates (the limma user guide says).
>> > But in another comparison I want to put all the arrays in 2 groups, see
>> > design matrix:
>> > > designWLRJ
>> >    RJ WL
>> > 1   0  1
>> > 2   0  1
>> > 3   0  1
>> > 4   0  1
>> > 5   0  1
>> > 6   0  1
>> > 7   0  1
>> > 8   0  1
>> > 9   0  1
>> > 10  0  1
>> > 11  0  1
>> > 12  0  1
>> > 13  0  1
>> > 14  0  1
>> > 15  0  1
>> > 16  0  1
>> > 17  1  0
>> > 18  1  0
>> > 19  1  0
>> > 20  1  0
>> > 21  1  0
>> > 22  1  0
>> > 23  1  0
>> > 24  1  0
>> > 25  1  0
>> > 26  1  0
>> > 27  1  0
>> > 28  1  0
>> > 29  1  0
>> > 30  1  0
>> > 31  1  0
>> > 32  1  0
>> >
>> > and then do the duplicateCorrelation function and get a different
>> > correlation.
>> >
>> > > corWLRJ <- duplicateCorrelation(Mmatrix_ny, design=designWLRJ,
>> > +     block=c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15,16,16))
>> > > corWLRJ$cor
>> > [1] 0.01745252
>> >
>> > Moreover when I compute the consensus correlation without using a design
>> > matrix I get 0.1073055. I know from looking through previous posts and a
>> > lot of help from Johan L. that the way the blocking is set up and using
>> > the design matrix in these situations is correct.
>>
>> You've used three different non-equivalent design matrices. No more than one
>> of these can be correct.
>
> But if I need to group the individuals differently to test for differential
> expression between different groupings of individuals (i.e. between
> WLm/WLf/RJm/RJf and WL/RJ), the use of 2 different design matrices in the
> duplicateCorrelation function is warranted, yes?

No. Unless you have a good reason to do otherwise, set the full design matrix and use contrasts.fit() to group the individuals for differential expression tests.

Gordon

>> > So how is the consensus correlation actually
>> > being calculated in the above situations? (in loose mathematical terms if
>> > possible, as you can probably tell from my question).
>>
>> In loose terms the correlation measures the variability between blocks
>> relative to the variation within blocks. Over-simplifying the design matrix
>> will increase the between-blocks variation, because it will now reflect
>> differences between your treatments as well as differences between
>> biological replicates. Hence the estimated correlation increases.
>
> Okay. Now I believe I understand how it is calculated. When you use a design
> matrix here you create blocks, then the blocking argument creates blocks within
> blocks. (Correct me if this is wrong).
>
> Best Regards, Carolyn
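Editor's illustration: the "between blocks relative to within blocks" description above can be sketched in base R. This is not what duplicateCorrelation does internally (limma fits a mixed model per gene and averages across genes); it is a simple one-way ANOVA moment estimate of the intraclass correlation for a single simulated gene. The effect sizes, variances, and the helper name `icc_given_design` are all assumptions made up for the sketch; only the layout (4 groups, 16 individuals, 2 slides each) follows the post.

```r
# Simulated data only: effect sizes and variances below are invented for illustration.
set.seed(1)
group <- factor(rep(c("WLm", "WLf", "RJm", "RJf"), each = 8))
line  <- factor(substr(as.character(group), 1, 2))    # "WL" or "RJ"
block <- rep(1:16, each = 2)                           # slide pairs = individuals

group_mean <- c(WLm = 0, WLf = 2, RJm = 4, RJf = 6)    # assumed treatment effects
bio <- rnorm(16, sd = 0.1)                             # small between-individual variation
y   <- unname(group_mean[as.character(group)]) + bio[block] + rnorm(32, sd = 1)

# Hypothetical helper: remove the effects named in `design`, then compare
# between-block and within-block mean squares of what is left.
icc_given_design <- function(y, design, block, k = 2) {
  res      <- lm.fit(design, y)$residuals
  n_blocks <- length(unique(block))
  # Within-block mean square: spread of replicates around their block mean.
  msw <- sum((res - ave(res, block))^2) / (length(y) - n_blocks)
  # Between-block mean square: spread of block means not explained by the design.
  msb <- k * sum(tapply(res, block, mean)^2) / (n_blocks - qr(design)$rank)
  (msb - msw) / (msb + (k - 1) * msw)
}

icc_full <- icc_given_design(y, model.matrix(~ 0 + group), block)        # 4 groups
icc_two  <- icc_given_design(y, model.matrix(~ 0 + line),  block)        # 2 groups
icc_none <- icc_given_design(y, matrix(1, nrow = 32, ncol = 1), block)   # no design
print(c(full = icc_full, two_groups = icc_two, no_design = icc_none))
```

Because the within-block mean square is the same for all three designs, any treatment difference left out of the design matrix ends up in the between-block term. With group effects this large, the estimate should rise from the full design to the two-group design to no design, the same ordering as the three values reported in the question.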
@carolyn-fitzsimmons-1318
Hello again Gordon,

> > But if I need to group the individuals differently to test for differential
> > expression between different groupings of individuals (i.e. between
> > WLm/WLf/RJm/RJf and WL/RJ), the use of 2 different design matrices in the
> > duplicateCorrelation function is warranted, yes?
>
> No. Unless you have a good reason to do otherwise, set the full design
> matrix and use contrasts.fit() to group the individuals for differential
> expression tests.
>
> Gordon

Then I would have to set a contrasts matrix for a comparison between WL and RJ like this:

    WL.RJ
RJf  -0.5
RJm  -0.5
WLf   0.5
WLm   0.5

Instead of this:

    WL.RJ
RJf    -1
RJm    -1
WLf     1
WLm     1

Because I get inflated M-values with the second matrix. Is this what you would do?

Regards, Carolyn

--
Carolyn Fitzsimmons
Dept. Medical Biochemistry and Microbiology
Uppsala University
Box 597/BMC
SE-751 24 SWEDEN
E-mail: Carolyn.Fitzsimmons at imbim.uu.se
Tel: +46 (0)18 471 4593
Mobile: +46 (0)73 704 1248
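Editor's illustration: the difference between the two contrast matrices can be checked with plain arithmetic in base R. The coefficient values below are hypothetical fitted group means for one gene, and the 8 slides per group come from the design in the question; nothing here calls limma itself.

```r
# Hypothetical fitted coefficients (group mean log-ratios) for one gene.
coef <- c(RJf = 1.0, RJm = 1.4, WLf = 2.0, WLm = 2.6)

avg_contrast <- c(RJf = -0.5, RJm = -0.5, WLf = 0.5, WLm = 0.5)
sum_contrast <- c(RJf = -1,   RJm = -1,   WLf = 1,   WLm = 1)

# Estimated contrast = weighted sum of the coefficients.
avg_diff <- sum(avg_contrast * coef)   # mean(WL) - mean(RJ)
sum_diff <- sum(sum_contrast * coef)   # (WLf + WLm) - (RJf + RJm)

# Standard error scales with the contrast too (8 slides per group, unit variance assumed).
se_factor <- function(contrast, n = 8) sqrt(sum(contrast^2 / n))
t_avg <- avg_diff / se_factor(avg_contrast)
t_sum <- sum_diff / se_factor(sum_contrast)

print(c(avg_diff = avg_diff, sum_diff = sum_diff))
print(c(t_avg = t_avg, t_sum = t_sum))
```

The ±1 version estimates twice the WL versus RJ difference, which is why its M-values look inflated, while the ±0.5 version gives the difference between the averaged WL and RJ groups on the usual log-ratio scale. Since the standard error doubles along with the estimate, the t-statistic is the same for both, so only the reported effect size depends on the choice. In limma the averaged contrast would typically be written as `makeContrasts(WL.RJ = (WLf + WLm)/2 - (RJf + RJm)/2, levels = design)`.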