To many differentially expressed genes produced by LIMMA and dye-effect question

0

Entering edit mode

krasikov@science.uva.nl ▴ 100

@krasikovscienceuvanl-1517

Last seen 11.4 years ago

Dear Naomi Thanks a lot for the fast reply. Naomi Altman wrote: > Dear Vladimir,. > I am a bit confused by the output because, while the code seems to > indicate that the analysis is on the log2 scale, the DecideTests > output does not. > > If the data are on the natural scale, the distributional assumptions > of limma are incorrect. The denominator adjustment will pull the > estimated variance down too much, leading to large values of the test > statistic and high significance values. > > To check the scale of the data, just type: > > MA.scale[1:5,] The scale of the data should be ordinary log2 scale of limma as I do everything according to the limma vignette and help-pages of the particular functions. But anyway I checked the M values and they are indeed in log-scale. > which prints the first 5 genes. If the data are on the log scale, all > values should be beween 4 and 16. > > Also, in a dye-swap design, you really do need to consider the blocks > (see the limma manual). Presumably the technical replicates are less > variable than the biological replicates. Also, the additional > replication inflates the degrees of freedom. Both of these things > also pull the estimated variance down. I would need some help here... I would like to use the existing power of dye-swaps technical replicates to get as much info as possible Here are some variants which I've tried and all of them are producing strange staff: 1. (the one I already described earlier) >design <- modelMatrix(targets, ref="Ref") >design <-cbind(Dye=1, design) >fit1 <-lmFit(MA.scale,design) >cont.matrix<-makeContrasts(Dye, A=(A1+....+A11)/11, B=(B1+...+B7)/7, AvsB=((A1+....+A11)/11-(B1+...+B7)/7), design) >fit2<-contrasts.fit(fit1,cont.matrix) >fit3<-eBayes(fit2) >d<-decideTests(fit3, adjust.method="fdr", p.value=0.001, lfc=log2(1.5)) > summary(d) Dye A B AvsB -1 0 7012 7045 448 0 38127 23331 23479 36656 1 6 7790 7609 1029 amount of genes without cutoff on fold change is enormous - more than 16.000 !!! 2a. Blocked dye-swaps >design <- modelMatrix(targets, ref="Ref") >design <-cbind(Dye=1, design) >biolrep<-c(1,1,2,2,3,3,...,18,18) >corfit <- duplicateCorrelation(MA.scale, design, ndups = 1, block=biolrep) >corfit$consensus.correlation [1] NaN > What is wrong here? 2b. >design <- modelMatrix(targets, ref="Ref") >biolrep<-c(1,1,2,2,3,3,...,18,18) >corfit <- duplicateCorrelation(MA.scale, design, ndups = 1, block=biolrep) >corfit$consensus.correlation [1] NaN > What is wrong here? 2c. >corfit <- duplicateCorrelation(MA.scale, ndups = 1, block=biolrep) >corfit$consensus.correlation [1] -0.9645838 # finally I've got good correlation by excluding the design from the function >design <- modelMatrix(targets, ref="Ref") >design <-cbind(Dye=1, design) >fit1 <- lmFit(MA.scale, design, block = biolrep, cor = corfit$consensus) >cont.matrix<-makeContrasts(Dye, A=(A1+....+A11)/11, B=(B1+...+B7)/7, AvsB=((A1+....+A11)/11-(B1+...+B7)/7), design) >fit2<-contrasts.fit(fit1,cont.matrix) >fit3<-eBayes(fit2) > d <- decideTests(fit3, adjust.method="fdr", p.value=0.01) > summary(d) Dye A B AvsB -1 4818 6360 7950 7 0 28704 25049 22021 38098 1 4611 6724 8162 28 > # Oops :) No regulation # 2d. all the same as 2c but in lmFit() excluded block variable: >fit1 <- lmFit(MA.scale, design, cor = corfit$consensus) >cont.matrix<-makeContrasts(Dye, A=(A1+....+A11)/11, B=(B1+...+B7)/7, AvsB=((A1+....+A11)/11-(B1+...+B7)/7), design) >fit2<-contrasts.fit(fit1,cont.matrix) >fit3<-eBayes(fit2) > d <- decideTests(fit3, adjust.method="fdr", p.value=0.01) > summary(d) Dye RA SPA RAvsSPA -1 4818 15172 15440 10159 0 28704 6369 5335 18074 1 4611 16592 17358 9900 > > d <- decideTests(fit3, adjust.method="fdr", p.value=0.01, lfc=log2(1.5)) > summary(d) Dye RA SPA RAvsSPA -1 0 7013 7046 450 0 38127 23330 23477 36653 1 6 7790 7610 1030 > The same amount of regulation as in variant 1 > Please may you explain me how to use blocks and duplicateCorrelation because I don't understand outcome and which of the above scenarios is correct? The question about enormous amount of statistically reliable regulated genes still worries me... I can't explain it I will appreciate any comments on this matter Regards Vladimir Krasikov > --Naomi > > At 07:51 PM 1/6/2008, you wrote: >> Dear List >> I hope somebody may give me an explanation of strange phenomena >> >> I need some explanation about my results obtained by LIMMA analysis of >> my microarray data. >> I have some experience with limma and some other packages from >> Bioconductor >> however I must say that I'm rather biologist than bioinformatician. >> >> Briefly description of my experiment: >> I'm comparing two groups of patients >> with different forms of one disease (let's say group A and group B) >> RNA from 18 patients (11 from A and 7 from B) were hybridized to 36 44K >> Agilent human microarrays >> All of microarrays were performed against common reference and each one >> had a dye-swap hybridization. >> >> targets: >> filename sampleID Cy3 Cy5 >> 1 A1.ST.txt A1 Ref A1 >> 2 A1.DS.txt A1 A1 Ref >> .... >> 21 A11.ST.txt A11 Ref A11 >> 22 A11.DS.txt A11 A11 Ref >> 23 B1.ST.txt B1 Ref B1 >> 24 B1.DS.txt B1 B1 Ref >> .... >> 35 B7.ST.txt B7 Ref B7 >> 36 B7.DS.txt B7 B7 Ref >> >> after importing the data, removing all types of control spots from >> dataset, >> performing "loess" within array normalization like and "scale" between >> normalization: >> >MA.offset <- normalizeWithinArrays(RG, method="loess", >> bc.method="normexp", offset = 50) >> >MA.scale <- normalizeBetweenArrays(MA.offset, method="scale") >> >> >dim(MA.scale) >> [1] 38133 36 >> >> Design matrix is rather simple in my case: >> >design <- modelMatrix(targets, ref="Ref") >> and to account for the possible dye-effect include: >> >design <-cbind(Dye=1, design) >> >> and then linear model: >> >fit1 <-lmFit(MA.scale,design) >> >cont.matrix<-makeContrasts(Dye, >> + A=(A1+....+A11)/11, >> + B=(B1+...+B7)/7, >> + >> AvsB=((A1+....+A11)/11-(B1+...+B7)/7), >> + design) >> >fit2<-contrasts.fit(fit1,cont.matrix) >> >fit3<-eBayes(fit2) >> >> So far so good but then: >> >> >d<-decideTests(fit3, adjust.method="fdr", p.value=0.001) >> >summary(d) >> Dye A B AvsB >> -1 3026 14909 14530 8161 >> 0 32497 6720 7881 21544 >> 1 2610 16504 15722 8428 >> gives me terrible amount of regulations and dye-effects: >> >> even with incredibly stricken adjustments and p.value cutoff I'm getting >> ennormous amount of regulation: >> >d<-decideTests(fit3, adjust.method="bonferroni", p.value=0.0001) >> >summary(d) >> Dye A B AvsB >> -1 320 12390 11606 2584 >> 0 37496 12633 14242 31896 >> 1 317 13110 12285 3653 >> >> Even if to play with cut-off on fold change the amount of the data with >> fold change above 1.5 is 1000 genes for up-regulation...:( >> >> In general it can't be true as far as I understand biological problem >> and statistical surrounding of the data, >> It's virtualy not possible that different human patients could produce >> such a similar expression profile to each other (I mean within one >> group). >> Form other hand it is violating major assumption of microarray >> dif.expression testing that only small proportion of the genes could be >> differentially expressed. >> >> May anybody give me a hint on what I'm doing not correct in data >> treatment >> >> I tried to fit the model slightly different way and got completely >> ironic results which I can't interpret myself at all, >> so I'm lost afterwards >> >> >biolrep <- c(1,1,2,2,3,3,....,18,18) >> >corfit<-duplicateCorrelation(MA.scale, ndups=1, block=biolrep) >> >corfit$consensus >> [1] -0.9645838 >> which is not bad as I do understand that there is nice negative >> correlation between dye-swaps >> however there is nowhere stated that all this arrays are belonging to >> two different groups (A and B), >> (probably there is no matter for this function as it calculates the >> correlation between each pair only) >> >> Thanks a lot for any help and advise. >> >> > sessionInfo() >> R version 2.6.1 (2007-11-26) >> i386-pc-mingw32 >> >> locale: >> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United >> States.1252;LC_MONETARY=English_United >> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] statmod_1.3.1 limma_2.12.0 >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Which afterwards gives a really simple design: >> >> >> >> >> >> >> -- >> V. Krasikov >> Swammerdam Institute for Life Sciences >> Plant Pathology >> University of Amsterdam >> Kruislaan 318 >> 1098SM Amsterdam >> >> Telephone: +31(0)20 5257839 >> Telefax: +31(0)20 5257934 >> E-mail: krasikov at science.uva.nl >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor > > Naomi S. Altman 814-865-3791 (voice) > Associate Professor > Dept. of Statistics 814-863-7114 (fax) > Penn State University 814-865-1348 (Statistics) > University Park, PA 16802-2111 > -- V. Krasikov Swammerdam Institute for Life Sciences Plant Pathology University of Amsterdam Kruislaan 318 1098SM Amsterdam Telephone: +31(0)20 5257839 Telefax: +31(0)20 5257934 E-mail: krasikov at science.uva.nl

Microarray Normalization limma Microarray Normalization limma • 1.4k views

ADD COMMENT • link updated 18.0 years ago by Anguraj Sadanandam ▴ 40 • written 18.0 years ago by krasikov@science.uva.nl ▴ 100

0

Entering edit mode

Anguraj Sadanandam ▴ 40

@anguraj-sadanandam-2574

Last seen 11.4 years ago

Hi, I am trying to generate frequency plot using aCGH package. I am constantly getting the following error while trying to impute the data impute.lowess(ex.acgh, maxChrom = 24) the error is Processing chromosome 1 Error in lowess(kbr[ind], vecr[ind], f = smooth) : NA/NaN/Inf in foreign function call (arg 2) Please advice. Thanks, Anguraj

ADD COMMENT • link 18.0 years ago Anguraj Sadanandam ▴ 40

Login before adding your answer.