Question: topTable; fold-change and data extraction
10.9 years ago by
Wijchers, Patrick wrote:
Dear all,

Forgive me for asking help for such a basic issue. After following a course on R and microarray analysis, I am determined to get better at it. I believe I am very close to at least doing all the basic gene expression analysis (and hope to get more experienced from there), but I am stuck now at a relatively crucial stage.

My aim is straightforward: to extract sets of upregulated and downregulated genes between two conditions from experimental microarray data obtained with mouse430_2 arrays.

So, I obtained moderated t-stats after rma preprocessing:

> library(limma)
> design <- model.matrix(~factor(m31_eset$genotype))
> fit <- lmFit(m31_eset, design)
> ebayes <- eBayes(fit)
> etab <- topTable(ebayes, coef=2, number=50, adjust.method="fdr", p.value=0.05, lfc=0)
> etab
                ID      logFC   AveExpr          t      P.Value    adj.P.Val
37172   1452877_at -2.1867149 10.604163 -28.094315 4.561519e-10 2.057291e-05
17792   1433486_at -1.7610156  8.260051 -25.770171 9.826099e-10 2.215834e-05
25554   1441248_at -2.0141624  7.457400 -23.580934 2.159235e-09 3.246122e-05
17793   1433487_at -1.5573738  7.662227 -21.945441 4.078449e-09 4.598553e-05
14414   1430108_at -1.9558887  5.105592 -18.933815 1.497980e-08 1.288990e-04
941   1416610_a_at -1.6156069  8.287563 -18.644676 1.714805e-08 1.288990e-04
22672 1438366_x_at -1.6615440  9.385415 -16.024782 6.449002e-08 3.712528e-04
31808   1447502_at -1.9017788  6.208795 -15.986360 6.585269e-08 3.712528e-04
18573   1434267_at -1.3436798  7.025515 -13.057854 3.789018e-07 1.807532e-03
39737   1455442_at -1.4437945  6.134666 -12.972677 4.007742e-07 1.807532e-03
13110   1428804_at -1.1810039  4.567535 -12.800456 4.493937e-07 1.842555e-03
20345   1436039_at -0.9342354  8.330907 -10.566949 2.282503e-06 8.578598e-03
22898   1438592_at -1.5371677  4.240373 -10.361602 2.689797e-06 9.331735e-03
27128   1442822_at  0.7779881  6.374739  10.236780 2.976198e-06 9.587823e-03
21052   1436746_at  0.7136296  8.349428   9.664449 4.799472e-06 1.443073e-02
28394   1444088_at -0.8543858  6.742698  -9.354540 6.279629e-06 1.770110e-02
19280   1434974_at  0.6953667  6.055993   9.044643 8.278120e-06 2.077425e-02
12901   1428595_at -0.9757636  3.665744  -9.042911 8.291091e-06 2.077425e-02
25266   1440960_at  0.8017912  5.218203   8.634198 1.208269e-05 2.868112e-02
9419  1425113_x_at -0.8156575  8.968278  -8.348717 1.585442e-05 3.407913e-02
20142   1435836_at  0.6078861  7.724573   8.347830 1.586798e-05 3.407913e-02
             B
37172 6.877374
17792 6.751444
25554 6.602080
17793 6.464845
14414 6.132030
941   6.093091
22672 5.663773
31808 5.656270
18573 4.942105
39737 4.916286
13110 4.863031
20345 4.022656
22898 3.928929
27128 3.870369

The outcome looks fine, with genes with p-values below 0.05, independent of fold-change (and genes have been confirmed in 'Resolver').

However, I want separate files for probe sets with logFC>0 (upregulated) and logFC<0 (downregulated), but the option lfc=0 does not distinguish between positive or negative fold change. I have tried lfc>0 and <0, but the function does not recognise that.

There must be a simple way of obtaining these separate data files, but I have been banging my head over this for the whole weekend now, but cannot see the best solution. I have tried all kinds of things and feel I am really close, but nothing has done the job. I have also searched google and the bioconductor FAQs, but to no avail.

Any help is greatly appreciated, I do not want to give up, and I do not want to go back to 'Resolver' or 'Genespring' (even though they make some things very simple).
Thank you,

Patrick

sessionInfo()
R version 2.7.1 (2008-06-23)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] splines   tools     stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] vsn_3.6.0            lattice_0.17-8       genefilter_1.20.0
[4] survival_2.34-1      affy_1.18.2          preprocessCore_1.2.1
[7] affyio_1.8.1         Biobase_2.0.1        limma_2.14.5

loaded via a namespace (and not attached):
[1] annotate_1.18.0     AnnotationDbi_1.2.2 DBI_0.2-4
[4] grid_2.7.1          RSQLite_0.6-9

--
Patrick Wijchers
Gene control mechanisms and disease group
MRC Clinical Sciences Centre
Imperial College
Hammersmith Campus
Du Cane Road
London W12 0NN
Phone: +44 (0)20 8383 8317 (lab)
       +44 (0)20 8383 8500 (office)
Fax:   +44 (0)20 8383 8306
Email: patrick.wijchers@csc.mrc.ac.uk

Answer: topTable; fold-change and data extraction
10.9 years ago by
Thomas Hampton wrote:

Do not despair. I think the column you want to look at can be referenced as

etab$logFC

That should pull out the fold changes (log wise) of the probes in etab.

etab$logFC > 0

should create a truth table of all those that are greater than zero.
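As a small illustration of what such a truth table looks like, here is a minimal sketch. It assumes a toy data frame, `etab_demo`, with made-up probe names and values standing in for the real `etab`; none of these numbers come from the poster's data.

```r
# Toy stand-in for etab (hypothetical probe names and values)
etab_demo <- data.frame(
  ID    = c("probe_a", "probe_b", "probe_c", "probe_d"),
  logFC = c(-2.1, 0.8, -0.9, 0.6),
  stringsAsFactors = FALSE
)

# Comparing the column with 0 yields one TRUE/FALSE value per row
is_up <- etab_demo$logFC > 0
print(is_up)  # FALSE TRUE FALSE TRUE

# TRUE counts as 1 in arithmetic, so sum() counts the matching rows
n_up   <- sum(is_up)
n_down <- sum(etab_demo$logFC < 0)
```

Counting the TRUE values this way is a quick check that the up and down groups are the sizes you expect before extracting anything.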
Now, you can use this truth table to pluck out the rows of etab that you want by using this truth table as an index like this

etab[etab$logFC > 0,]

The syntax above says "select the rows that are TRUE". The comma with nothing is shorthand for "use all the columns" since we didn't specify any particular ones.

Hope this helps.

Tom

On Aug 10, 2008, at 5:44 PM, Wijchers, Patrick wrote:
> However, I want separate files for probe sets with logFC>0 (upregulated)
> and logFC<0 (downregulated), but the option lfc=0 does not distinguish
> between positive or negative fold change. I have tried lfc>0 and <0,
> but the function does not recognise that.
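Putting the indexing idea from the answer together with the original goal of separate files, one possible sketch is shown below. It is not the poster's actual workflow: `etab_demo` is a hypothetical toy table standing in for the `topTable()` result, and the file names and the use of `tempdir()` are placeholders chosen so the example runs on its own.

```r
# Toy stand-in for the topTable() result (hypothetical values)
etab_demo <- data.frame(
  ID        = c("probe_a", "probe_b", "probe_c", "probe_d"),
  logFC     = c(-2.1, 0.8, -0.9, 0.6),
  adj.P.Val = c(0.0001, 0.01, 0.02, 0.03),
  stringsAsFactors = FALSE
)

# Split the rows by the sign of logFC using logical indexing;
# the empty slot after the comma keeps all columns
up   <- etab_demo[etab_demo$logFC > 0, ]
down <- etab_demo[etab_demo$logFC < 0, ]

# Write each subset to its own tab-delimited file
# (placeholder file names; tempdir() keeps the sketch self-contained)
up_file   <- file.path(tempdir(), "upregulated.txt")
down_file <- file.path(tempdir(), "downregulated.txt")
write.table(up,   file = up_file,   sep = "\t", quote = FALSE, row.names = FALSE)
write.table(down, file = down_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

With real data, `etab_demo` would be replaced by `etab`. Two caveats are worth keeping in mind: `number=50` in the original `topTable()` call caps the table at 50 rows, so a larger value is needed to capture every significant probe set, and, as far as the limma documentation describes it, `lfc` is a cutoff on the absolute log fold change, which would explain why it cannot select one direction.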