Question

limma for finding differentialy expressed genes of several groups

0

Entering edit mode

Sean Davis 21k

@sean-davis-490

Last seen 3 months ago

United States

On Thu, Oct 18, 2012 at 7:57 AM, dhivyaa reddy <reddy.dhivyaa@gmail.com>wrote: > Shoudl I log transform the RMA normalized values before Ibegin the > analysis for differentially expressed genes ? Also I feel that the Adj- P > values are quite high and am sure that all of them cannot be insignificant > right ?? > > Please keep discussions on the list so that you can get the best answers possible. RMA-normalized data are already log-transformed and are appropriate for what you are doing. As for significance, it is quite possible and not that uncommon to have an experiment that shows little or no statistical significance. Since you have few replicates, it may be that your power to detect statistically significant changes is limited. That said, there may be some genes near the top of your list(s) that are still interesting; biological followup may still be warranted in this situation. Sean > On Thu, Oct 18, 2012 at 1:25 PM, Sean Davis <sdavis2@mail.nih.gov> wrote: > >> >> >> On Thu, Oct 18, 2012 at 4:24 AM, priya [guest] <guest@bioconductor.org>wrote: >> >>> >>> I would like to find the differentially expressed genes for several >>> variables using the limma package for several groups. >>> I have the rma normalized matrix in the following format : >>> >>> >>> ID_REF GSM362180 GSM362181 GSM362188 GSM362189 GSM362192 >>> 244901 5.094871713 4.626623079 4.554272515 4.748604391 4.759221647 >>> 244902 5.194528083 4.985930299 4.817426064 5.151654407 4.838741605 >>> 244903 5.412329253 5.352970877 5.06250609 5.305709079 8.365082403 >>> 244904 5.529220594 5.28134657 5.467445095 5.62968933 5.458388909 >>> 244905 5.024052699 4.714631878 4.792865831 4.843975286 4.657188246 >>> 244906 5.786557533 5.242403911 5.060605782 5.458148567 5.890061836 >>> >>> where the different columns correspond to four different types of >>> promoters and each of the four promoters has a biological replicate so >>> totally there are 8 columns.There are totally 22810 genes and I would like >>> to get a list of the genes which are differentially expressed >>> >>> I tried using the Limma package to find the differentially expressed >>> genes across several promoters ( with replicates). >>> This is the code that I used: >>> >>> Group <- factor(c("p1", "p1", "p2", "p2", "p3","p3","p3","p4","p4"), >>> levels = >>> c("GSM362180","GSM362181","GSM362188","GSM362189","GSM362192","GSM 362193","GSM362194","GSM362197","GSM362198")) >>> >>> design <- model.matrix(~0 + Group) >>> >>> colnames(design) <- >>> c("GSM362180","GSM362181","GSM362188","GSM362189","GSM362192","GSM 362193","GSM362194","GSM362197") >>> fit <- lmFit(modified, design) >>> >>> where modified is the rma normalized data matrix as inputted in the >>> above format. >>> I get the following error: >>> >>> Coefficients not estimable: GSM362180 GSM362181 GSM362188 GSM362189 >>> GSM362192 GSM362193 GSM362194 GSM362197 GSM362198 >>> Error in lm.fit(design, t(M)) : 0 (non-NA) cases >>> >>> >>> I managed to get help from the mailing list prior to this and was able >>> to correct it in the following way. >>> >>> >>> -- output of sessionInfo(): >>> >>> Group <- factor(c("p1", "p1", "p2", "p2", "p3","p3","p3","p4","p4")) >>> design <- model.matrix(~0+Group) >>> colnames(design) <- gsub("Group","", colnames(design)) >>> >>> For creating the contrast matrix I proceeded as : >>> fit<-lmFit(modified,design) >>> fit<-ebayes(fit) >>> fit<-lmFit(modified,design) >>> >>> contrast.matrix<-makeContrasts(p1-p2,p1-p3,p1-p4,p2-p3,p2-p4,p3-p4 ,levels=design) >>> fit2<-contrasts.fit(fit,contrast.matrix) >>> fit2<-eBayes(fit2) >>> topTable(fit2,coef=1,adjust="fdr") >>> logFC AveExpr t P.Value adj.P.Val B >>> 14865 -3.063442 11.939646 -20.85957 5.020817e-09 8.235097e-05 10.906936 >>> 15107 -3.316203 13.136888 -19.79194 8.041764e-09 8.235097e-05 10.543106 >>> 12037 2.806403 10.772050 19.10380 1.103823e-08 8.235097e-05 10.292274 >>> 15931 -3.469330 10.325303 -18.53793 1.444120e-08 8.235097e-05 10.075671 >>> 18327 3.198993 9.633795 17.57118 2.328424e-08 8.331092e-05 9.682365 >>> 7521 -2.419999 7.373064 -17.16080 2.873576e-08 8.331092e-05 9.505924 >>> 16564 3.268568 8.365454 17.09028 2.980775e-08 8.331092e-05 9.475007 >>> 3832 -2.685268 7.540418 -16.89167 3.307237e-08 8.331092e-05 9.386966 >>> 10364 2.466369 6.779762 16.71021 3.640344e-08 8.331092e-05 9.305265 >>> 4967 -2.453614 11.409188 -16.62282 3.813877e-08 8.331092e-05 9.265479 >>> o<-order(fit2$F.p.value) >>> fit2$genes[o[1:30],] >>> >>> >>> After the above step I get as NULL. I do not know where am making the >>> mistake. >>> >>> >>> clas <- decideTests(fit2, method = "nestedF", >>> + adjust.method = "fdr", p = 0.05) >>> >>> >>> I get the following output which I know is quite wrong : >>> Contrasts >>> p1 - p2 p1 - p3 p1 - p4 p2 - p3 p2 - p4 p3 - p4 >>> [1,] 0 0 0 0 0 0 >>> [2,] 0 0 0 0 0 0 >>> [3,] 0 0 0 0 0 0 >>> [4,] 0 0 0 0 0 0 >>> [5,] 0 0 0 0 0 0 >>> [6,] 0 0 0 0 0 0 >>> >> >> Hi, Priya. >> >> Nice job moving forward with your analysis. >> >> The output above looks fine to me. If you read the help for decideTests, >> you will notice that the output is 0/1 where a 1 signifies that a given >> probe was "significant" for a given contrast. What makes you think your >> results are wrong? >> >> Sean >> >> > [[alternative HTML version deleted]]

probe limma probe limma • 775 views

ADD COMMENT • link 11.5 years ago Sean Davis 21k