Un-balanced dye-swaps and LIMMA
@matthew-hannah-621
Last seen 9.6 years ago
Hi,

I have some questions regarding some cDNA array data I've been asked to look at. The design differs slightly from the standard designs in that independent biological replicates (different plants within the same experiment) have been hybridised to different arrays. Therefore there are biological dye-swaps but not technical ones.

Array 1 - WT plant 1/Treated plant 1
Array 2 - Treated plant 2/WT plant 2
Array 3 - WT plant 3/Treated plant 3

Analysing these data using LIMMA, lmFit and eBayes (as in the HTML manual) produces some odd-looking QQ plots that I have some questions about. All analyses used print-tip loess followed by quantile normalisation, with different BG correction.

BG-none - produces a normal-looking single S-curved line, but both ends are on the same side of the 1/1 line.
BG-minimum - looks alright, although the extreme values at the upper end cross back over the 1/1 line.
BG-subtract - the QQ plot separates at both ends into 3 lines (presumably 1 for each array), which clearly isn't normal.

My question is whether these are likely to result from the unbalanced dye-swap or from the independent plant (rather than pooled) RNA used.

More generally, is it valid to treat the individual channels of cDNA data in any way similar to single-array data (like Affy?) after quantile normalising between arrays, or will the between-array differences always be too great?

Also, is it generally considered best to use local BG correction, or to use non-corrected values and then eliminate bad spots later?

Thanks,
Matt
limma limma • 867 views
Naomi Altman ★ 6.0k
@naomi-altman-380
Last seen 2.9 years ago
United States
Here is what I have been telling people about dye-swapping. I would be very interested if anyone thinks this is wrong.

The dye-by-gene interaction is due to the labeling chemistry. Therefore, unless there are big allelic differences between the biological samples, dye-swapping should be done between biological reps. Technical dye-swaps are a waste of resources unless the cost of taking biological samples is very high.

Regarding the QQ plots - I am not sure what you are plotting. If you are plotting the quantiles of WT vs quantiles of Treated, then the QQ plot should be straight in the center and curved at the ends. The curved ends are the differentially expressed genes. The curvature could be in either direction, depending on the thickness of the tails of the distribution relative to the center.

--Naomi

In reply to Matthew Hannah's message of 5/24/2004, 11:25 AM +0200.

Naomi S. Altman
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics
Penn State University
University Park, PA 16802-2111
814-865-3791 (voice)
814-863-7114 (fax)
814-865-1348 (Statistics)
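To make the unbalanced (2 vs. 1) dye-swap question concrete, here is a minimal base-R sketch, not taken from the thread. It assumes log-ratios M = log2(Cy5/Cy3), an orientation coding of 1, -1, 1 for the three arrays, and made-up values for the treatment effect (`beta`) and the gene-specific dye bias (`dye`). Under those assumptions, a treatment-only model leaves one third of the dye bias in the estimate, while adding an intercept absorbs it.

```r
# Orientation of each array for the Treated-vs-WT comparison (assumed coding):
# arrays 1 and 3 share one dye orientation, array 2 is the swap.
s <- c(1, -1, 1)

beta <- 1.0   # hypothetical true log2 fold change (Treated vs WT)
dye  <- 0.6   # hypothetical gene-specific dye bias on the log2 scale

# Noise-free log-ratios: the dye bias has the same sign on every array,
# while the treatment effect flips sign with the orientation.
M <- s * beta + dye

# Model 1: treatment effect only (single column, no intercept).
# Least squares gives beta + dye * mean(s) = beta + dye / 3.
fit1 <- lm(M ~ 0 + s)
est1 <- unname(coef(fit1)["s"])

# Model 2: an intercept is added and absorbs the dye bias,
# so the coefficient of s recovers beta.
fit2 <- lm(M ~ s)
est2 <- unname(coef(fit2)["s"])

print(c(treatment_only = est1, with_dye_term = est2))
```

With only three arrays, the second model leaves a single residual degree of freedom per gene, so the cleaner estimate comes at a real cost in precision. In a perfectly balanced design, `mean(s)` would be zero and the two estimates would agree.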
@matthew-hannah-621
Last seen 9.6 years ago
Naomi,

Thanks for the reply. I've been doing some investigations of my own since. The point on biological dye-swaps makes sense; this just leaves the question of whether having an unbalanced design (2 vs. 1) would affect the analysis.

As for the QQ plots - I've recently made some progress. I was plotting this:

> fit <- lmFit(MA, design)
> fit <- eBayes(fit)
> qqt(fit$t, df=fit$df.prior+fit$df.residual, pch=16, cex=0.1)
> abline(0,1)

The S-shape I describe is generally as you describe a QQ plot; however, there are problems.

With BG subtract, the curved ends are not a single line but separate into 3 lines. I've now found that this is due to missing values (BG>FG?) giving either 0, 1 or 2 for the residual df. Why are the prior and residual df added in the suggested plot?

With BG = none, the QQ plot is a nice shape, but the abline doesn't follow the straight central section; it cuts through the back of the upper curve of the 'S', leaving both curved ends on the same side of the line. If you plot it with df.prior alone (c. 3) it looks fine, but with the residual df added (c. 5) it has the problem I described. I guess this means the t values are not normally distributed. I'm not sure why, though - lots of low-end noise, the 2 vs. 1 in the dye-swaps...?

Any more thoughts?

Thanks,
Matt

In reply to Naomi Altman's message of Tuesday, 25 May 2004, 14:42 (Subject: Re: [BioC] Un-balanced dye-swaps and LIMMA).
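The three separate lines seen with BG subtract follow from the arithmetic Matt describes. Below is a rough base-R sketch, assuming one fitted coefficient per gene and a made-up log-ratio matrix in which `NA` marks spots lost after background subtraction. The `df_prior` value of 3 is an assumption based on the "c. 3" quoted above. In limma's model the moderated t-statistic is referred to a t-distribution with `df.prior + df.residual` degrees of freedom, which is why the two are added in the `qqt` call.

```r
# Hypothetical log-ratios for four genes on three arrays;
# NA marks spots lost after background subtraction.
M <- rbind(
  gene1 = c(1.2, -0.9, 1.1),
  gene2 = c(0.8,   NA, 1.0),
  gene3 = c( NA,   NA, 0.7),
  gene4 = c(0.1, -0.2, 0.0)
)

n_coef <- 1  # assumed: a single coefficient (Treated vs WT) per gene

# Residual df per gene = number of observed arrays minus fitted coefficients.
n_obs       <- rowSums(!is.na(M))
df_residual <- n_obs - n_coef

# Total df used as the reference distribution for the moderated t-statistic.
df_prior <- 3  # assumed value, close to the one reported in the thread
df_total <- df_prior + df_residual

# Each distinct df_total has its own reference quantiles, so genes fall
# into one group per df value when plotted against theoretical quantiles.
crit <- qt(0.975, df = df_total)

print(data.frame(n_obs, df_residual, df_total, crit))
```

Genes with different numbers of missing spots are therefore compared against different t-distributions, which is consistent with the tails of the plot splitting into one line per residual df value.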