7.9 years ago by
Asma rabe290
Japan
Asma rabe290 wrote:
Hi All,

I have RNA-seq data for five time points: 0hr, 2hr, 4hr, 6hr and 8hr. I would like to use edgeR to find genes that are differentially expressed over the course of the experiment.

A recent edgeR user's manual suggests using a dispersion value of 0.1 when there are no replicates, for organisms that are genetically similar to humans. So it can presumably be used for mouse data, for example, but what about other organisms such as yeast?

Since the exact test finds DE genes between 2 samples and I need DE genes across my whole time course, I used a GLM. I ran the following commands:

    mm <- model.matrix(~0+my_groups_factor)
    fit <- glmFit(my_data, mm, dispersion_value)
    fit <- glmLRT(my_data, fit)
    topTags(fit)

-----
I got:

    Coefficient: group8
            logConc   logFC   PValue   FDR
    gene1   .....     .....   ....     ....
    gene2   ..        ....    ....     ....
    ...

-----
I have a basic question (sorry, I'm not a statistician): why was the last group, 8hr, chosen as the coefficient?

Thank you in advance.
Best regards,
Rabe
yeast edger • 580 views
modified 7.9 years ago by Mark Robinson870 • written 7.9 years ago by Asma rabe290
7.9 years ago by
Mark Robinson870
Mark Robinson870 wrote:
Hi Rabe,

Some comments below.

On 19.12.2011, at 10:57, Asma rabe wrote:

> In a recent edgeR user's manual, it has been suggested to use a dispersion
> value of 0.1 in case of no replicates for organisms that are genetically
> similar to humans. So it is possible to be used for mouse data for example,
> what about other organisms like yeast?

Yes, it is possible. But, as mentioned in the manual, none of these solutions are ideal.

> I have a basic question (sorry, I'm not a statistician):
> why was the last group, 8hrs, chosen to be the coef.?

Have a look at the documentation for glmLRT and specifically the 'coef' argument:

    ?glmLRT

----
coef: ... By default, the last column of the design matrix is dropped to form the design matrix for the null model.
----

You have 5 levels for your time course (therefore, the design matrix has 5 columns). To test for any difference between groups, I would suggest reparameterizing your design matrix and then specifying the 'coef' argument. Something like the following:

    mm <- model.matrix(~my_groups_factor)
    fit <- glmFit(my_data, mm, dispersion_value)
    fit <- glmLRT(my_data, fit, coef=2:5)
    topTags(fit)

Hope that helps.

Regards,
Mark

----------
Prof. Dr. Mark Robinson
Bioinformatics
Institute of Molecular Life Sciences
University of Zurich
Winterthurerstrasse 190
8057 Zurich
Switzerland

v: +41 44 635 4848
f: +41 44 635 6898
e: mark.robinson at imls.uzh.ch
o: Y32-J-34
w: http://tiny.cc/mrobin
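To see concretely why "group8" was reported and what `coef=2:5` changes, it can help to print the column names of the two design matrices. The following is a minimal base-R sketch, not code from the thread: the factor name `time_factor` and its level labels are assumptions made for illustration.

```r
# Hypothetical factor for five unreplicated time points (labels are assumptions)
time_factor <- factor(c("0", "2", "4", "6", "8"),
                      levels = c("0", "2", "4", "6", "8"))

# Group-means parameterization: one column per time point, no intercept
mm_means <- model.matrix(~0 + time_factor)
print(colnames(mm_means))

# Intercept parameterization: column 1 is the 0hr baseline,
# columns 2..5 are the differences of each later time point from 0hr
mm_baseline <- model.matrix(~time_factor)
print(colnames(mm_baseline))

# The default 'coef' quoted from ?glmLRT above is the last column
last_means    <- colnames(mm_means)[ncol(mm_means)]
last_baseline <- colnames(mm_baseline)[ncol(mm_baseline)]
```

In both matrices the last column belongs to the 8hr level, which is why the default test is reported against that group. With the intercept form, dropping columns 2 to 5 leaves only the baseline column, so the null model is "no change across the time course", which is the comparison the question was after.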
Hi Mark,

Thank you for your help. When I tried:

    mm <- model.matrix(~0+my_groups_factor)
    fit <- glmFit(my_data, mm, dispersion_value)
    fit <- glmLRT(my_data, fit)
    topTags(fit)

---------------
I got:

    Error in abs(object$table$logFC) :
      Non-numeric argument to mathematical function

----------------
Any idea?

Thank you,
Rabe
Hi Rabe,

You could make it easier by giving us more information. It's always a good plan to describe your objects and send the output of sessionInfo(). Since I don't know what is in all of your objects (e.g. 'my_groups_factor', 'my_data', etc.), I'll make some guesses given what you've described.

Below is a snippet of code (taken almost entirely from the glmFit/glmLRT documentation -- have you read this?) that hard-codes dispersion at 0.1 for 5 samples from different groups with no replication:

--------------
    library(edgeR)

    nlibs <- 5
    ntags <- 100
    dispersion.true <- 0.1

    # Make first a factor with 5 levels
    x <- factor(1:5)
    design <- model.matrix(~x)

    # Generate count data
    y <- rnbinom(ntags*nlibs, mu=8, size=1/dispersion.true)
    y <- matrix(y, ntags, nlibs)
    colnames(y) <- paste("s", 1:5, sep="")
    rownames(y) <- paste("Gene", 1:ntags, sep="")
    d <- DGEList(y)

    # Normalize
    d <- calcNormFactors(d)

    # Fit the NB GLMs
    fit <- glmFit(d, design, dispersion=0.1)

    # Likelihood ratio test for differences b/w groups
    results <- glmLRT(d, fit, coef=2:5)
    topTags(results)
--------------

Perhaps you can modify this to suit your purpose.

Of course, as I hope you can appreciate, this is not as ideal as having biological replicates and estimating the dispersion through the sophisticated methods available ...

Regards,
Mark
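For intuition about what a fixed dispersion of 0.1 assumes: in the `rnbinom(mu=, size=)` parameterization used in the snippet above, `size = 1/dispersion`, so a count with mean `mu` has variance `mu + dispersion * mu^2`. The following is a minimal base-R sketch of that relationship; the seed and the sample size are arbitrary choices for illustration and are not from the thread.

```r
set.seed(1)  # arbitrary seed, only so the simulation is reproducible

dispersion <- 0.1
mu <- 8

# Simulate many counts from a single negative binomial distribution
counts <- rnbinom(100000, mu = mu, size = 1 / dispersion)

# Theoretical NB variance: Poisson part (mu) plus extra part (dispersion * mu^2)
expected_var <- mu + dispersion * mu^2
observed_var <- var(counts)

# Square root of the dispersion: the coefficient of variation of the
# underlying expression level, beyond Poisson counting noise
cv_extra <- sqrt(dispersion)

print(c(expected = expected_var, observed = observed_var, cv = cv_extra))
```

The simulated variance should land close to the theoretical one. A dispersion of 0.1 therefore corresponds to roughly 32% extra variation between samples on top of counting noise, and without replicates there is no way to check from the data whether that figure suits a given organism or experiment.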
Hi Mark,

Thank you for your help.

When I ran your example code exactly as posted (from `nlibs <- 5` through `topTags(results)`), I got the same error:

    Error in abs(object$table$logFC) :
      Non-numeric argument to mathematical function
Dear Rabe,

Please read the posting guide:
http://bioconductor.org/help/mailing-list/posting-guide/

Again, you give very little detail. My guess is you are using an old version of R/edgeR, but you ignored my suggestion to give more information. That code example runs fine in my environment:

    > sessionInfo()
    R version 2.14.0 (2011-10-31)
    Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

    locale:
    [1] de_CH.UTF-8/de_CH.UTF-8/de_CH.UTF-8/C/de_CH.UTF-8/de_CH.UTF-8

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] BiocInstaller_1.2.1 edgeR_2.4.1

    loaded via a namespace (and not attached):
    [1] limma_3.10.0 tools_2.14.0

Mark
Hi Mark,

Thank you for your help. My session info is:

--------------
    > sessionInfo()
    R version 2.12.0 (2010-10-15)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
    [1] C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] edgeR_2.0.5

    loaded via a namespace (and not attached):
    [1] limma_3.6.9 ......

Your guess was correct, I'll update R.

Best regards,
Rabe
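Since the problem turned out to be an outdated installation, a quick version check before running the example can save a round trip. The following is a minimal sketch in base R. The thresholds are simply the versions reported as working earlier in this thread (R 2.14.0, edgeR 2.4.1), not documented minimum requirements, and the variable names are made up for illustration.

```r
# Versions reported as working in this thread (not official minimums)
working_r     <- "2.14.0"
working_edger <- "2.4.1"

# Is the running R at least as new as the one that worked?
r_ok <- getRversion() >= working_r

# Check edgeR only if it is installed; NA means "not installed"
edger_ok <- if (requireNamespace("edgeR", quietly = TRUE)) {
  packageVersion("edgeR") >= working_edger
} else {
  NA
}

# compareVersion() returns -1 when the first version is older:
# edgeR 2.0.5 (the failing setup) versus 2.4.1 (the working one)
failing_is_older <- compareVersion("2.0.5", working_edger) < 0

print(c(r_ok = r_ok, edger_ok = edger_ok, failing_is_older = failing_is_older))
```

Including the output of `sessionInfo()` in the first message, as the posting guide asks, gives helpers the same information without any extra code.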