Question

Fw: random location of duplicate spots and use of limma


Ingunn Berget ▴ 150

@ingunn-berget-1066

Last seen 9.7 years ago

Hello,

There are approximately 6000 different genes on the arrays, with two spots for each gene. The duplicated spots have random locations, which means that the number of spots between the two duplicates is not the same for every gene. This is the summary of the distances:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   4.00   32.00   71.00   86.59  135.00  244.00

(Distance here means the number of spots between the two duplicates.)

The function duplicateCorrelation in limma can be used to estimate the correlation between within-array duplicates, but the methodology is based on the assumption that duplicates are equally spaced. Since this assumption is not fulfilled here, does this mean that I cannot calculate the correlations and must take the average of the duplicates? Are there functions to do this in limma or other BioC packages?

Ingunn Berget
Norwegian University of Life Sciences
Department of Animal and Aquaculture

limma • 865 views

updated 19.3 years ago by Gordon Smyth 50k • written 19.3 years ago by Ingunn Berget ▴ 150

score 0 · Answer 1 · 2005-01-07

> Date: Fri, 7 Jan 2005 10:11:30 +0100
> From: "Ingunn Berget" <ingunn.berget@umb.no>
> Subject: [BioC] Fw: random location of duplicate spots and use of limma
> To: <bioconductor@stat.math.ethz.ch>
>
> Hello
>
> There are approximately 6000 different genes on the arrays, there are two spots for each gene
> The duplicated spots have random location, which means that the number of spots between each
> duplicate is not the same for every gene. This is the summary for the distances:
>
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>    4.00   32.00   71.00   86.59  135.00  244.00
>
> (Distance here means number of spots between the two duplicates)
>
> The function duplicateCorrelation in limma can be used to estimate correlation between
> within-array duplicates, the methodology is based on the assumption that duplicates are equally
> spaced. Since this assumption is not fulfilled here does this means that I cannot calculate the
> correlations and must take the average of the duplicates? Are there some functions to do this in
> limma or other BioC packages?

There are no functions in limma, or in other packages, that specifically handle this situation, either to compute correlations or to take averages.

However, none of the duplicates on your arrays are very far apart, so it might be reasonable to treat them as approximately equally spaced. Try sorting the data into gene ID order and then use ndups=2 and spacing=1. E.g.,

    o <- order(MA$genes$ID)
    MAsorted <- MA[o,]

This does assume that *every* probe occurs twice or an even number of times. If this is not so, you'll need to remove the exceptions first.

Gordon

> Ingunn Berget
> Norwegian University of Life Sciences
> Department of Animal and Aquaculture
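The sort-then-pair suggestion, including the step of removing exceptions first, could be sketched roughly as follows. This is only an illustration: `paired_order` is a hypothetical helper, not a limma function, and it is stricter than the answer requires because it keeps only IDs that occur exactly twice. The commented limma calls assume an MAList `MA` with IDs in `MA$genes$ID`, as in the answer, and a design matrix named `design`, which is an assumed name.

```r
# Hypothetical helper (not part of limma): return row indices that keep only
# IDs occurring exactly twice, ordered so the two duplicates are adjacent.
paired_order <- function(ids) {
  counts <- table(ids)                        # occurrences per ID (NA ignored)
  keep <- which(!is.na(ids) & ids %in% names(counts)[counts == 2])
  keep[order(ids[keep])]                      # sort the kept rows by ID
}

# Toy example: "g3" occurs once and "g4" three times, so both are dropped.
ids <- c("g2", "g1", "g4", "g3", "g1", "g4", "g2", "g4")
o <- paired_order(ids)
ids[o]

# With limma loaded, the same index could then be used as in the answer
# (`design` is an assumed design matrix, not given in the thread):
# MAsorted <- MA[paired_order(MA$genes$ID), ]
# corfit <- duplicateCorrelation(MAsorted, design, ndups = 2, spacing = 1)
# fit <- lmFit(MAsorted, design, ndups = 2, spacing = 1,
#              correlation = corfit$consensus.correlation)
```

After sorting, each gene occupies two consecutive rows, which is the layout that ndups=2 and spacing=1 describe.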