Question

limma for time course dataset 2

Joseph J Hou ▴ 20

@joseph-j-hou-6407

Last seen 9.6 years ago

Hi Gordon,

I have a time-course dataset with multiple time points. When I set "coef" to NULL in the topTable function, it returns results from the F-statistic (call this list R1). If I instead set "coef" to each time point, one at a time, I get the DE genes (DEG) for each time point, and I combine all of those gene symbols into one list (call this R2). R1 and R2 overlap, but R2 usually contains more genes, many of which are absent from R1.

So, in this case, which approach gives the correct DEG list?

Best,
Joe

Joseph J Hou
----------------------------------------------
Jue Hou Ph.D.
Research Assistant
Center of Medical Physics and Technology
Hefei Institutes of Physical Science
Chinese Academy of Sciences
No.350 Shushanhu Road, Shushan District, Hefei, P.R. China
Tel. +86551-65595385
Email: joseph.houjue@gmail.com; houjue@cmpt.ac.cn

From: Joseph J Hou
Sent: 2014-02-17 10:30
To: smyth
Cc: bioconductor
Subject: limma for time course dataset

Hi Gordon,

My project is a time-course dataset with 9 time points and 21 subjects. When I use the "topTable" function to call differential genes, how should I set the "coef" argument? If I set it to each time point (specified by column name), one by one, it returns results from a t-test. If I set it to NULL, topTable uses the F-statistic to rank genes across all coefficients, and the number of differential genes from the F-test is usually smaller than from the t-tests at the individual time points. I'm worried that the F-statistic will miss some meaningful information. So what would you suggest for setting the coef parameter on a time-course dataset, to find both the overall DEG and the DEG at individual time points?

P.S. In a heatmap figure, is it usual to visualize the log-fold change or the normalized intensity of each group?

Best,
Joe

updated 10.2 years ago by houjue@cmpt.ac.cn ▴ 20 • written 10.2 years ago by Joseph J Hou ▴ 20

score 0 · Answer 1 · 2014-02-19

Hi Yunshun,

Thanks so much for your reply. I understand your points, and I have used both the one-way layout and a regression spline trend on my data. What confuses me is that the regression fit misses some DEG that are significantly different under the one-way approach. For example, Gene A is a DEG when I compare baseline (Day 0) with Day 14, but not under the regression approach. I want to make a single heatmap that includes all DEG; how can I achieve this in limma?

One more thing: in a heatmap, would you suggest using the intensity or the fold change of each group? I feel that intensity does not look very clear.

Best,
Joe

houjue@cmpt.ac.cn

In reply to Yunshun Chen's message of 2014-02-19 08:46.

score 0 · Answer 2 · 2014-02-19

Hi Yunshun,

Thanks so much. I get your points now. Previously, I selected the DEG at Day 14 from the regression model by requiring an absolute log2FC greater than log2(1.5) in the topTable output. I now guess that is not correct, and that it should be done with one-way pair-wise comparisons.

For the heatmap, do you mean taking the mean value of each group (for every gene) and then subtracting the baseline mean intensity? Is that right?

Best,
Joe

houjue@cmpt.ac.cn

In reply to Yunshun Chen's message of 2014-02-19 10:19.

score 0 · Answer 3 · 2014-02-19

Hi Joseph,

The way to analyse a time-course experiment depends on what scientific question you want to answer.

If, say, you are interested in the difference between time 1 and time 2 for all 21 subjects, then you can compare those two groups using the standard one-way layout analysis.

If you are interested in whether the expression levels of the genes change over time, then you can analyse the data by fitting a trend using a regression spline or a polynomial.

There is no information on how you fit your data and what you are trying to find. Did you use the one-way layout approach or fit a regression spline trend? That determines the meaning of the coefficients in your output. In other words, what 'coef' to use for 'topTable' depends on what model you fit to your data and what question you want to answer.

There is a time course experiment case study in the limma user's guide (section 9.6). It might be helpful to answer your question.

Regards,
Yunshun

-----------------------------
Yunshun Chen,
Research Officer,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
Email: yuchen at wehi.edu.au

In reply to Joseph J Hou's message of Tue, 18 Feb 2014 15:15:21 +0800 (Bioconductor Digest, Vol 132, Issue 19, Message 14).
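A minimal sketch of the two model types mentioned in this reply, to show why the meaning of 'coef' differs between them. The toy data, the day labels (0, 3, 7, 14), the number of replicates and the spline degrees of freedom are all assumptions made for illustration; they are not the poster's design.

```r
library(limma)
library(splines)

set.seed(1)
# Hypothetical toy data: 100 genes, 4 time points, 3 samples per time point.
day  <- rep(c(0, 3, 7, 14), each = 3)
expr <- matrix(rnorm(100 * length(day)), nrow = 100,
               dimnames = list(paste0("gene", 1:100), NULL))

# (a) One-way layout: each coefficient is the mean at one time point.
Time    <- factor(paste0("Day", day), levels = paste0("Day", c(0, 3, 7, 14)))
design1 <- model.matrix(~ 0 + Time)
colnames(design1) <- levels(Time)
fit1  <- lmFit(expr, design1)
cont  <- makeContrasts(Day14 - Day0, levels = design1)
fit1c <- eBayes(contrasts.fit(fit1, cont))
# A single coefficient gives a moderated t-test (here Day14 vs Day0).
tab_pair <- topTable(fit1c, coef = 1, number = Inf)

# (b) Regression spline: coefficients are spline basis terms, not time points.
X       <- ns(day, df = 2)
design2 <- model.matrix(~ X)
fit2    <- eBayes(lmFit(expr, design2))
# Several coefficients together give an F-test for any change over time.
tab_trend <- topTable(fit2, coef = 2:3, number = Inf)
```

In (a) a single coefficient or contrast answers a question about specific time points, while in (b) the individual spline coefficients have no direct time-point interpretation, so they are tested together.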

score 0 · Answer 4 · 2014-02-19

Hi Joe,

DEG (DE genes, I suppose) have to be defined under particular comparisons. It is very common that a gene is DE between group 1 and group 2, but not DE between group 1 and group 3.

Also, the one-way approach and the regression approach test different hypotheses. The one-way approach can produce results for pair-wise comparisons or for any other contrasts. The regression approach tends to detect general differences due to the time effect. I wonder how you compared Day 0 with Day 14 under the regression model.

When you refer to 'all DEG', how do you define it? Do you mean the genes that are differentially expressed in at least one of the pair-wise comparisons? If so, you can do all the pair-wise comparisons under the one-way approach and then take the union of them.

For the heatmap, you might need to subtract the average expression value of each gene from that gene's values.

Hope that helps.

Regards,
Yunshun

In reply to houjue@cmpt.ac.cn's message of Wednesday, 19 February 2014 12:12 PM.
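The "union of pair-wise comparisons" idea from this reply could be sketched roughly as follows. Everything here is an assumption for illustration: the simulated data, the three time points, the spiked-in change in the first ten genes, and the 0.05 cutoff on the adjusted p-value.

```r
library(limma)

set.seed(2)
# Hypothetical toy data: 200 genes, 3 time points, 4 samples each.
Time <- factor(rep(c("Day0", "Day7", "Day14"), each = 4),
               levels = c("Day0", "Day7", "Day14"))
expr <- matrix(rnorm(200 * 12), nrow = 200,
               dimnames = list(paste0("gene", 1:200), NULL))
# Spike in a change at Day14 only, for the first 10 genes.
expr[1:10, Time == "Day14"] <- expr[1:10, Time == "Day14"] + 6

# One-way layout: one coefficient per time point.
design <- model.matrix(~ 0 + Time)
colnames(design) <- levels(Time)
fit <- lmFit(expr, design)

# All pair-wise comparisons between time points.
cont <- makeContrasts(
  D7vsD0  = Day7  - Day0,
  D14vsD0 = Day14 - Day0,
  D14vsD7 = Day14 - Day7,
  levels = design
)
fit2 <- eBayes(contrasts.fit(fit, cont))

# DE genes for each contrast separately (adjusted p-value below 0.05).
deg_by_contrast <- lapply(colnames(cont), function(cf) {
  rownames(topTable(fit2, coef = cf, number = Inf, p.value = 0.05))
})
names(deg_by_contrast) <- colnames(cont)

# Union: genes that are DE in at least one pair-wise comparison.
deg_union <- Reduce(union, deg_by_contrast)
```

The resulting deg_union vector can then be used to subset the expression matrix for a heatmap. Note that the union list answers "DE in at least one comparison", which is a different question from the overall F-test.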

score 0 · Answer 5 · 2014-02-19

Hi Joe,

For the heatmap, what I meant was to subtract the mean value of each gene from the intensity values of that gene. Alternatively, you can set scale="row" in the heatmap function.

Regards,
Yunshun

In reply to houjue@cmpt.ac.cn's message of Wednesday, 19 February 2014 1:41 PM.
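A small sketch of the row-centering described in this reply, using only base R. The expression matrix is simulated and its dimensions and names are assumptions; in practice it would be the log-expression values of the selected DE genes.

```r
set.seed(3)
# Hypothetical log-expression matrix: 20 genes x 12 samples.
expr <- matrix(rnorm(20 * 12, mean = 8), nrow = 20,
               dimnames = list(paste0("gene", 1:20), paste0("s", 1:12)))

# Subtract each gene's mean from that gene's values (row-centering).
centered <- sweep(expr, 1, rowMeans(expr), "-")

# Option 1: plot the centered values with no further scaling.
# Colv = NA keeps the samples in their original (e.g. time) order.
heatmap(centered, scale = "none", Colv = NA)

# Option 2: let heatmap() centre and scale each row itself.
heatmap(expr, scale = "row", Colv = NA)
```

The two options differ in that scale="row" also divides each row by its standard deviation, so genes with large and small changes get a comparable colour range.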