To many differentially expressed genes produced by LIMMA and dye-effect question
1
0
Entering edit mode
@krasikovscienceuvanl-1517
Last seen 9.6 years ago
Dear Naomi Thanks a lot for the fast reply. Naomi Altman wrote: > Dear Vladimir,. > I am a bit confused by the output because, while the code seems to > indicate that the analysis is on the log2 scale, the DecideTests > output does not. > > If the data are on the natural scale, the distributional assumptions > of limma are incorrect. The denominator adjustment will pull the > estimated variance down too much, leading to large values of the test > statistic and high significance values. > > To check the scale of the data, just type: > > MA.scale[1:5,] The scale of the data should be ordinary log2 scale of limma as I do everything according to the limma vignette and help-pages of the particular functions. But anyway I checked the M values and they are indeed in log-scale. > which prints the first 5 genes. If the data are on the log scale, all > values should be beween 4 and 16. > > Also, in a dye-swap design, you really do need to consider the blocks > (see the limma manual). Presumably the technical replicates are less > variable than the biological replicates. Also, the additional > replication inflates the degrees of freedom. Both of these things > also pull the estimated variance down. I would need some help here... I would like to use the existing power of dye-swaps technical replicates to get as much info as possible Here are some variants which I've tried and all of them are producing strange staff: 1. (the one I already described earlier) >design <- modelMatrix(targets, ref="Ref") >design <-cbind(Dye=1, design) >fit1 <-lmFit(MA.scale,design) >cont.matrix<-makeContrasts(Dye, A=(A1+....+A11)/11, B=(B1+...+B7)/7, AvsB=((A1+....+A11)/11-(B1+...+B7)/7), design) >fit2<-contrasts.fit(fit1,cont.matrix) >fit3<-eBayes(fit2) >d<-decideTests(fit3, adjust.method="fdr", p.value=0.001, lfc=log2(1.5)) > summary(d) Dye A B AvsB -1 0 7012 7045 448 0 38127 23331 23479 36656 1 6 7790 7609 1029 amount of genes without cutoff on fold change is enormous - more than 16.000 !!! 2a. Blocked dye-swaps >design <- modelMatrix(targets, ref="Ref") >design <-cbind(Dye=1, design) >biolrep<-c(1,1,2,2,3,3,...,18,18) >corfit <- duplicateCorrelation(MA.scale, design, ndups = 1, block=biolrep) >corfit$consensus.correlation [1] NaN > What is wrong here? 2b. >design <- modelMatrix(targets, ref="Ref") >biolrep<-c(1,1,2,2,3,3,...,18,18) >corfit <- duplicateCorrelation(MA.scale, design, ndups = 1, block=biolrep) >corfit$consensus.correlation [1] NaN > What is wrong here? 2c. >corfit <- duplicateCorrelation(MA.scale, ndups = 1, block=biolrep) >corfit$consensus.correlation [1] -0.9645838 # finally I've got good correlation by excluding the design from the function >design <- modelMatrix(targets, ref="Ref") >design <-cbind(Dye=1, design) >fit1 <- lmFit(MA.scale, design, block = biolrep, cor = corfit$consensus) >cont.matrix<-makeContrasts(Dye, A=(A1+....+A11)/11, B=(B1+...+B7)/7, AvsB=((A1+....+A11)/11-(B1+...+B7)/7), design) >fit2<-contrasts.fit(fit1,cont.matrix) >fit3<-eBayes(fit2) > d <- decideTests(fit3, adjust.method="fdr", p.value=0.01) > summary(d) Dye A B AvsB -1 4818 6360 7950 7 0 28704 25049 22021 38098 1 4611 6724 8162 28 > # Oops :) No regulation # 2d. all the same as 2c but in lmFit() excluded block variable: >fit1 <- lmFit(MA.scale, design, cor = corfit$consensus) >cont.matrix<-makeContrasts(Dye, A=(A1+....+A11)/11, B=(B1+...+B7)/7, AvsB=((A1+....+A11)/11-(B1+...+B7)/7), design) >fit2<-contrasts.fit(fit1,cont.matrix) >fit3<-eBayes(fit2) > d <- decideTests(fit3, adjust.method="fdr", p.value=0.01) > summary(d) Dye RA SPA RAvsSPA -1 4818 15172 15440 10159 0 28704 6369 5335 18074 1 4611 16592 17358 9900 > > d <- decideTests(fit3, adjust.method="fdr", p.value=0.01, lfc=log2(1.5)) > summary(d) Dye RA SPA RAvsSPA -1 0 7013 7046 450 0 38127 23330 23477 36653 1 6 7790 7610 1030 > The same amount of regulation as in variant 1 > Please may you explain me how to use blocks and duplicateCorrelation because I don't understand outcome and which of the above scenarios is correct? The question about enormous amount of statistically reliable regulated genes still worries me... I can't explain it I will appreciate any comments on this matter Regards Vladimir Krasikov > --Naomi > > At 07:51 PM 1/6/2008, you wrote: >> Dear List >> I hope somebody may give me an explanation of strange phenomena >> >> I need some explanation about my results obtained by LIMMA analysis of >> my microarray data. >> I have some experience with limma and some other packages from >> Bioconductor >> however I must say that I'm rather biologist than bioinformatician. >> >> Briefly description of my experiment: >> I'm comparing two groups of patients >> with different forms of one disease (let's say group A and group B) >> RNA from 18 patients (11 from A and 7 from B) were hybridized to 36 44K >> Agilent human microarrays >> All of microarrays were performed against common reference and each one >> had a dye-swap hybridization. >> >> targets: >> filename sampleID Cy3 Cy5 >> 1 A1.ST.txt A1 Ref A1 >> 2 A1.DS.txt A1 A1 Ref >> .... >> 21 A11.ST.txt A11 Ref A11 >> 22 A11.DS.txt A11 A11 Ref >> 23 B1.ST.txt B1 Ref B1 >> 24 B1.DS.txt B1 B1 Ref >> .... >> 35 B7.ST.txt B7 Ref B7 >> 36 B7.DS.txt B7 B7 Ref >> >> after importing the data, removing all types of control spots from >> dataset, >> performing "loess" within array normalization like and "scale" between >> normalization: >> >MA.offset <- normalizeWithinArrays(RG, method="loess", >> bc.method="normexp", offset = 50) >> >MA.scale <- normalizeBetweenArrays(MA.offset, method="scale") >> >> >dim(MA.scale) >> [1] 38133 36 >> >> Design matrix is rather simple in my case: >> >design <- modelMatrix(targets, ref="Ref") >> and to account for the possible dye-effect include: >> >design <-cbind(Dye=1, design) >> >> and then linear model: >> >fit1 <-lmFit(MA.scale,design) >> >cont.matrix<-makeContrasts(Dye, >> + A=(A1+....+A11)/11, >> + B=(B1+...+B7)/7, >> + >> AvsB=((A1+....+A11)/11-(B1+...+B7)/7), >> + design) >> >fit2<-contrasts.fit(fit1,cont.matrix) >> >fit3<-eBayes(fit2) >> >> So far so good but then: >> >> >d<-decideTests(fit3, adjust.method="fdr", p.value=0.001) >> >summary(d) >> Dye A B AvsB >> -1 3026 14909 14530 8161 >> 0 32497 6720 7881 21544 >> 1 2610 16504 15722 8428 >> gives me terrible amount of regulations and dye-effects: >> >> even with incredibly stricken adjustments and p.value cutoff I'm getting >> ennormous amount of regulation: >> >d<-decideTests(fit3, adjust.method="bonferroni", p.value=0.0001) >> >summary(d) >> Dye A B AvsB >> -1 320 12390 11606 2584 >> 0 37496 12633 14242 31896 >> 1 317 13110 12285 3653 >> >> Even if to play with cut-off on fold change the amount of the data with >> fold change above 1.5 is 1000 genes for up-regulation...:( >> >> In general it can't be true as far as I understand biological problem >> and statistical surrounding of the data, >> It's virtualy not possible that different human patients could produce >> such a similar expression profile to each other (I mean within one >> group). >> Form other hand it is violating major assumption of microarray >> dif.expression testing that only small proportion of the genes could be >> differentially expressed. >> >> May anybody give me a hint on what I'm doing not correct in data >> treatment >> >> I tried to fit the model slightly different way and got completely >> ironic results which I can't interpret myself at all, >> so I'm lost afterwards >> >> >biolrep <- c(1,1,2,2,3,3,....,18,18) >> >corfit<-duplicateCorrelation(MA.scale, ndups=1, block=biolrep) >> >corfit$consensus >> [1] -0.9645838 >> which is not bad as I do understand that there is nice negative >> correlation between dye-swaps >> however there is nowhere stated that all this arrays are belonging to >> two different groups (A and B), >> (probably there is no matter for this function as it calculates the >> correlation between each pair only) >> >> Thanks a lot for any help and advise. >> >> > sessionInfo() >> R version 2.6.1 (2007-11-26) >> i386-pc-mingw32 >> >> locale: >> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United >> States.1252;LC_MONETARY=English_United >> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] statmod_1.3.1 limma_2.12.0 >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Which afterwards gives a really simple design: >> >> >> >> >> >> >> -- >> V. Krasikov >> Swammerdam Institute for Life Sciences >> Plant Pathology >> University of Amsterdam >> Kruislaan 318 >> 1098SM Amsterdam >> >> Telephone: +31(0)20 5257839 >> Telefax: +31(0)20 5257934 >> E-mail: krasikov at science.uva.nl >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor > > Naomi S. Altman 814-865-3791 (voice) > Associate Professor > Dept. of Statistics 814-863-7114 (fax) > Penn State University 814-865-1348 (Statistics) > University Park, PA 16802-2111 > -- V. Krasikov Swammerdam Institute for Life Sciences Plant Pathology University of Amsterdam Kruislaan 318 1098SM Amsterdam Telephone: +31(0)20 5257839 Telefax: +31(0)20 5257934 E-mail: krasikov at science.uva.nl
Microarray Normalization limma Microarray Normalization limma • 1.1k views
ADD COMMENT
0
Entering edit mode
@anguraj-sadanandam-2574
Last seen 9.6 years ago
Hi, I am trying to generate frequency plot using aCGH package. I am constantly getting the following error while trying to impute the data impute.lowess(ex.acgh, maxChrom = 24) the error is Processing chromosome 1 Error in lowess(kbr[ind], vecr[ind], f = smooth) : NA/NaN/Inf in foreign function call (arg 2) Please advice. Thanks, Anguraj
ADD COMMENT

Login before adding your answer.

Traffic: 911 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6