Time course experiment....
@gordon-smyth
WEHI, Melbourne, Australia
Dear Sohail,

Well, there are lots of ways to generate such a table. Perhaps the simplest is

  fitsel <- fit2[sel.dif, ]
  as.data.frame( fitsel )

Best wishes
Gordon

>Date: Tue, 20 Dec 2005 14:03:43 -0500
>From: "Khan, Sohail" <khan at cshl.edu>
>Subject: [BioC] Time course experiment....
>To: <bioconductor at stat.math.ethz.ch>
>
>Dear List,
>
>I have performed a time course analysis using limma, as described in
>"Bioinformatics and Computational Biology Solutions ........".
>How can I get a list of differentially expressed genes? I've tried
>the code below:
>sel.dif<-p.adjust(fit2$F.p.value,method="fdr") <0.05
>This produces a logical vector, right? I would like a table of
>differentially expressed genes with p-values etc. Sorry if I missed
>this in the limma user's guide. Thanks for any suggestions.
>
>Sohail Khan
>Scientific Programmer
>COLD SPRING HARBOR LABORATORY
>Genome Research Center
>500 Sunnyside Boulevard
>Woodbury, NY 11797
>(516)422-4076
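For readers who want to try the selection step end to end, here is a minimal sketch on simulated data. The simulated matrix, the three-time-point design, and the successive-difference contrasts are assumptions made for illustration only; they are not Sohail's actual experiment. Only sel.dif, fit2 and fitsel follow the names used in the thread.

```r
library(limma)

set.seed(1)
# Simulated log-expression values (hypothetical): 200 genes,
# 3 time points with 3 replicate arrays each.
n_genes <- 200
y <- matrix(rnorm(n_genes * 9), nrow = n_genes,
            dimnames = list(sprintf("gene%03d", seq_len(n_genes)), NULL))
y[1:20, 7:9] <- y[1:20, 7:9] + 5   # first 20 genes change at the last time point

time <- factor(rep(c("t0", "t1", "t2"), each = 3))
design <- model.matrix(~ 0 + time)
colnames(design) <- levels(time)

# Contrasts between successive time points (an assumed parametrisation).
cont <- makeContrasts(t1 - t0, t2 - t1, levels = design)
fit2 <- eBayes(contrasts.fit(lmFit(y, design), cont))

# Logical vector: TRUE where the FDR-adjusted F-test p-value is below 0.05.
sel.dif <- p.adjust(fit2$F.p.value, method = "fdr") < 0.05

# Subset the fit to the selected genes and flatten it into a data frame.
fitsel <- fit2[sel.dif, ]
tab <- as.data.frame(fitsel)
```

The resulting data frame has one row per selected gene, with the per-gene components of the fit (coefficients, t-statistics, p-values, F and so on) as columns.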
limma limma • 1.5k views
Naomi Altman ★ 6.0k
@naomi-altman-380
United States
I did not get the original posting, but doesn't Sohail just need "TopTable" for this?

--Naomi

Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111
topTable() doesn't have a facility for sorting on or presenting the F-statistics, because it is orientated towards individual coefficients.

I have toyed with the idea that perhaps topTable() should output a table based on the F-statistic when the argument 'coef' is set to NULL. In other words, topTable() would give individual coef significance when a coef is specified; otherwise it would give overall significance. Would you find that a useful facility?

Best wishes
Gordon
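Until such a facility exists, a table ranked on the overall F-test can be assembled by hand from the components that eBayes() stores in the fit. The following is only a sketch on simulated data: the design, the contrasts, and the table layout (column names F, P.Value, adj.P.Val) are assumptions chosen for illustration, not the output of any limma function.

```r
library(limma)

set.seed(1)
# Hypothetical data: 100 genes, 3 time points x 3 replicates.
n_genes <- 100
y <- matrix(rnorm(n_genes * 9), nrow = n_genes,
            dimnames = list(sprintf("gene%03d", seq_len(n_genes)), NULL))
y[1:10, 7:9] <- y[1:10, 7:9] + 5

time <- factor(rep(c("t0", "t1", "t2"), each = 3))
design <- model.matrix(~ 0 + time)
colnames(design) <- levels(time)
cont <- makeContrasts(t1 - t0, t2 - t1, levels = design)
fit2 <- eBayes(contrasts.fit(lmFit(y, design), cont))

# Rank genes by the overall F-test p-value (smallest first).
o <- order(fit2$F.p.value)

# Hand-built table: coefficients, F-statistic, raw and BH-adjusted p-values.
f_table <- data.frame(
  fit2$coefficients[o, , drop = FALSE],
  F         = fit2$F[o],
  P.Value   = fit2$F.p.value[o],
  adj.P.Val = p.adjust(fit2$F.p.value, method = "BH")[o],
  check.names = FALSE
)
head(f_table, 10)
```

The adjustment is computed over all genes before reordering, so the adjusted values refer to the full set of tests rather than to the rows shown.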
I would like TopTable to print:

p-value
adjusted "p-value"

using whatever set of p-values is relevant - i.e. either the t-test for a coefficient or the F-test for a set of coefficients or the overall F-test.

--Naomi
At 07:12 PM 1/2/2006, Naomi Altman wrote:
>I would like TopTable to print:
>
>p-value
>adjusted "p-value"
>
>using whatever set of p-values is relevant - i.e. either the t-test
>for a coefficient or the F-test for a set of coefficients or the
>overall F-test.

I concur... it would be helpful to see whether or not p-value adjustment had been done in the object created by topTable, rather than having to look back to the code when topTable was called (if it's still around...). Furthermore, is it possible to indicate which adjustment has been done by the name of the column, e.g. BH.p or BY.p? I understand these requests may not be feasible if the resulting object needs to have the same number of columns with the same names every time.

A couple of other suggestions for minor improvements on topTable functionality: 1) have "ID" be an option for resort.by; 2) have "all" be an option (or even the default!) for number. This second request is really me just being lazy and not wanting to look up and type in the number of genes, or coding it by

  out.table <- topTable(fit, number=length(fit$genes[,1]))

Happy New Year!
Jenny

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801 USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu
At 03:21 AM 4/01/2006, Jenny Drnevich wrote:
>I concur... it would be helpful to see whether or not p-value
>adjustment had been done in the object created by topTable, rather
>than having to look back to the code when topTable was called (if
>it's still around...). Furthermore, it is possible to indicate
>which adjustment has been done by the name of the column, e.g. BH.p
>or BY.p? I understand these requests may not be feasible if the
>resulting object needs to have the same number of columns with the
>same names every time.

I am very reluctant to change the names of the columns in the data.frame output by topTable() depending on the adjustment method. This would make the output object difficult to program with. If someone writes a program using topTable(), they won't know what column to extract to get the P-values unless they know what adjustment was used.

I could write a print method for toptable objects which would display informative column names. However, that wouldn't change the column names in the output object itself, so you wouldn't get the informative column names if you save the results from topTable() and then write them to a file, which is what I think you're doing.

How about this? topTable() could output the name of the coefficient and the adjustment method before starting the tabular output, e.g.,

  > topTable(fit, coef=2)
  Coefficient name: MTvsWT
  Adjustment method: BH
  ProbeID M A t P.value Adj.P.value B
  etc

>A couple of other suggestions for minor improvements on topTable
>functionality: 1) have "ID" be an option for resort.by; 2) have
>"all" be an option (or even the default!) for number. This second
>request is really me just being lazy and not wanting to look up and
>type in the number of genes, or coding it by out.table <-
>topTable(fit, number=length(fit$genes[,1]))

Sorting on ID is hard for the reasons I explain to Naomi.

I can include an "all" option for the number of genes. The reason I haven't done so yet is that I wanted to discourage the use of topTable() for outputting a file with all the genes in it, because that is the purpose of write.fit() rather than topTable(). Can you try out write.fit() and give me feedback on whether it does what you want?

What I find that I want sometimes is a topTable()-type presentation for a specified list of genes. In other words, something like

  topTable(fit[ mygenes, ], sort="none")

is required.

Cheers
Gordon
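The two ideas in this post, a table for a chosen list of genes in their original order and a table containing every gene, can be sketched as follows. This assumes a limma release in which topTable() accepts number = Inf and sort.by = "none"; the simulated data, the WT/MT grouping, and the indices in mygenes are hypothetical.

```r
library(limma)

set.seed(2)
# Hypothetical two-group experiment: 50 genes, 3 WT and 3 MT arrays.
y <- matrix(rnorm(50 * 6), nrow = 50,
            dimnames = list(sprintf("gene%02d", 1:50), NULL))
group <- factor(rep(c("WT", "MT"), each = 3), levels = c("WT", "MT"))
design <- model.matrix(~ group)    # coefficient 2 is the MT vs WT difference
fit <- eBayes(lmFit(y, design))

# Row indices of the genes of interest (arbitrary choice for illustration).
mygenes <- c(7, 21, 3)

# Table for just those genes, kept in the order given.
# Note: the adjusted p-values are computed within this subset only.
adjust_used <- "BH"
tab_sel <- topTable(fit[mygenes, ], coef = 2, number = Inf,
                    sort.by = "none", adjust.method = adjust_used)

# Table with every gene, without typing the number of genes by hand.
tab_all <- topTable(fit, coef = 2, number = Inf, adjust.method = adjust_used)
```

Keeping the adjustment method in a variable such as adjust_used is one simple way to record which correction was applied when the table is later written to a file.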
On 1/2/06 8:00 PM, "Gordon Smyth" <smyth at wehi.edu.au> wrote:
> topTable() doesn't have a facility for sorting on or presenting the
> F-statistics, because it is individual coefficient orientated.
>
> I have toyed with the idea that perhaps topTable() should output a
> table based on the F-statistic when the argument 'coefficient' is set
> to NULL. In other words topTable() would give individual coef
> significance when a coef is specified, otherwise it would give
> overall significance. Would you find that a useful facility?

Gordon,

I do think your suggested additions would be useful. In addition, I often find myself generating a large spreadsheet based on the output of a number of individual coefficients so that the biologist can use that spreadsheet in something like Excel to sort and filter data at will.

While less statistically palatable than the current method and your F-stat extension, would it be possible to include a third variation such that the coefficient argument could be set to a vector? This then leaves the question of how to "rank" or "sort" genes, but one could choose arbitrarily to order by the input to topTable (the order of the fit object--I typically use ALL genes for output) or to use the first value in the coefficient vector as the "key" coefficient. For complicated experiments, I think many biologists like to see how a gene looks for multiple coefficients simultaneously.

Thanks,
Sean
Hi Sean,

I'm not sure what output you are thinking of for vector coef. How is it different from

  write.fit(fit)

or

  write.fit(fit[,selectedcoefs])

?

In general, the use of contrasts is intended for looking at subsets of coefficients. E.g., contrasts.fit() followed by topTable() with coef=NULL could rank genes on the F-statistic for any selection of coefficients or contrasts. My thought of what a vector coef argument to topTable() would be is that it should (i) compute the fit object for the set of contrasts defined by those coefficients and (ii) output a top-table ranked by the F-statistics for that set of contrasts.

Cheers
Gordon
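The behaviour described here, ranking on the F-statistic for a chosen set of contrasts, can be sketched as below. It assumes a limma release in which topTable() accepts a vector for coef, which is later than the version discussed in this thread; the simulated data and the contrast definitions are hypothetical.

```r
library(limma)

set.seed(4)
# Hypothetical time course: 100 genes, 3 time points x 3 replicates.
y <- matrix(rnorm(100 * 9), nrow = 100,
            dimnames = list(sprintf("gene%03d", 1:100), NULL))
y[1:10, 7:9] <- y[1:10, 7:9] + 5

time <- factor(rep(c("t0", "t1", "t2"), each = 3))
design <- model.matrix(~ 0 + time)
colnames(design) <- levels(time)

# (i) Refit the model for the contrasts of interest.
cont <- makeContrasts(t1 - t0, t2 - t1, levels = design)
fit2 <- eBayes(contrasts.fit(lmFit(y, design), cont))

# (ii) A vector coef asks for a table ranked on the F-statistic
#      for that set of contrasts (assumed to be supported by the
#      installed limma version).
tab_F <- topTable(fit2, coef = 1:2, number = 10)
```

With a single value of coef, the same call returns the usual per-coefficient table with t-statistics instead.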
On 1/3/06 7:40 AM, "Gordon K Smyth" <smyth at wehi.edu.au> wrote:
> Hi Sean,
>
> I'm not sure what output you are thinking of for vector coef. How is it
> different from
>
> write.fit(fit)
>
> or
>
> write.fit(fit[,selectedcoefs])
>
> ?

It isn't.... Thanks for pointing out the obvious to those of us who didn't see it.

Thanks,
Sean
Just as this discussion got underway, I met with my current microarray collaborator.

She asked that

a) the results be sorted by gene id
b) all the coefficients and their p-values be printed out to the same (big) table
c) all the gene annotations be added to the table

Very timely.

--Naomi
I give results to many of my collaborators in exactly the same way (except for sorting by gene ID, see below). They like to have all the results in an Excel spreadsheet so that they can play around with it a bit themselves. I use

  write.fit(fit)

to create such a file. Does this do what you want?

Regarding the gene IDs, there isn't a way for limma functions to sort results by gene ID, because there isn't a requirement that the 'genes' component of an MArrayLM object contains a column which is a unique probe identifier. Even if such a column does exist, there is no requirement that it has to be called "ID". So limma doesn't have a way to know which column to use to sort by gene ID.

However, with any specific data set, you will know which column is your ID column. So you could use

  o <- order( fit$genes$ID )
  write.fit( fit[o,] )

Cheers
Gordon
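A self-contained sketch of this sort-then-write recipe is shown below. The simulated data, the probe identifiers, the column name ID, and the temporary output file are all assumptions for illustration; in a real analysis the genes component usually comes from the array annotation rather than being attached by hand. The file argument is passed explicitly here.

```r
library(limma)

set.seed(3)
# Hypothetical data: 30 probes on 4 arrays (2 control, 2 treated).
ids <- sprintf("probe%03d", sample(30))      # identifiers in scrambled order
y <- matrix(rnorm(30 * 4), nrow = 30)
design <- cbind(Intercept = 1, Treat = c(0, 0, 1, 1))
fit <- eBayes(lmFit(y, design))

# Attach an annotation table; "ID" is the column we know to be the identifier.
fit$genes <- data.frame(ID = ids)

# Reorder the whole fit object by probe ID, then write every
# coefficient, statistic and annotation column to one tab-delimited file.
o <- order(fit$genes$ID)
fit_sorted <- fit[o, ]
out_file <- tempfile(fileext = ".txt")
write.fit(fit_sorted, file = out_file)
```

The resulting text file can be opened directly in a spreadsheet program for further sorting and filtering.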
Thanks, Gordon. I think that this is exactly what we need.

--Naomi