Question: maSigPro and "vars" argument
8.2 years ago by Andrea Grilli (Italy, Bologna, Rizzoli Orthopaedic Institute)
Andrea Grilli wrote:
Hi all,

I'm analyzing a time series experiment with the maSigPro package for the first time, and I have trouble understanding whether my experimental design is correct. In particular, I'm unsure about the "vars" argument.

The data come from Affymetrix gene chips: 2 different cell lines, 4 time points, 2 replicates at each time point. I normalized with RMA and filtered out genes with low expression or low variation, going from the initial 54k probes to about 12k probes.

I'm interested in genes that vary (i) in either cell line between the different time points, and (ii) between the two cell lines across time.

I ran the analysis with the vars argument set to "groups" and got these comparisons:

    > (ts.analysis$sig.genes$)
    ts.analysis$sig.genes$Group1
    ts.analysis$sig.genes$Group2vsGroup1

So, if I understood correctly, I have 2 sets of significant genes: the first contains the genes changing across time in Group1 cells, the second those changing in Group2 vs Group1 cells across time.

My questions: how can I also get the significant genes for Group2? Should I split the experiment into two parts and analyze them separately?
Last question: what exactly do I get with vars = "each"? I mean, biologically speaking...

This is my design matrix:

              Time  Replicates  Group1  Group2
    wt22_g21    21           1       1       0
    wt22_g7      7           2       1       0
    wt36_g21    21           1       1       0
    wt36_g7      7           2       1       0
    Saos1_g21   21           5       0       1
    Saos2_g21   21           5       0       1
    Saos1_g7     7           6       0       1
    Saos2_g7     7           6       0       1
    wt22_g0      0           3       1       0
    wt22_g14    14           4       1       0
    wt36_g0      0           3       1       0
    wt36_g14    14           4       1       0
    Saos1_g0     0           7       0       1
    Saos2_g0     0           7       0       1
    Saos1_g14   14           8       0       1
    Saos2_g14   14           8       0       1

This is the command line:

    ts.analysis <- maSigPro(Data, parameters2, min.obs = 4, rsq = 0.7,
        step.method = "backward", pdf = TRUE, main = "./results.pdf",
        alfa = 0.05, degree = 2, k = 9, vars = "groups")

I checked the Bioconductor documentation, but things are still unclear to me.
Any clarification is really appreciated.

Thanks,
Andrea

Dr. Andrea Grilli
andrea.grilli at ior.it
phone 051/63.66.756
Laboratory of Experimental Oncology
Rizzoli Orthopaedic Institute
Codivilla Putti Research Center
via di Barbiano 1/10
40136 - Bologna - Italy
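For readers who want to reproduce the layout, the design matrix in the question can be rebuilt in base R roughly as follows. This is only a sketch: the object name `edesign` is an assumption (the question calls its own object `parameters2`), and the sanity checks at the end are an addition, not part of the original analysis.

```r
# Sample names, in the order listed in the question.
samples <- c("wt22_g21", "wt22_g7", "wt36_g21", "wt36_g7",
             "Saos1_g21", "Saos2_g21", "Saos1_g7", "Saos2_g7",
             "wt22_g0", "wt22_g14", "wt36_g0", "wt36_g14",
             "Saos1_g0", "Saos2_g0", "Saos1_g14", "Saos2_g14")

# Columns copied from the design matrix in the question.
time       <- c(21, 7, 21, 7, 21, 21, 7, 7, 0, 14, 0, 14, 0, 0, 14, 14)
replicates <- c(1, 2, 1, 2, 5, 5, 6, 6, 3, 4, 3, 4, 7, 7, 8, 8)
group1     <- c(1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0)

# 'edesign' is a hypothetical name for this matrix.
edesign <- cbind(Time = time, Replicates = replicates,
                 Group1 = group1, Group2 = 1 - group1)
rownames(edesign) <- samples

# Sanity checks: every sample belongs to exactly one group, and every
# replicate id (one per time point and cell line) covers two arrays.
stopifnot(all(rowSums(edesign[, c("Group1", "Group2")]) == 1))
stopifnot(all(table(edesign[, "Replicates"]) == 2))
```

Checks like these make it easier to spot a mislabelled array before fitting any model.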
masigpro • 697 views
modified 8.2 years ago by Mª José Nueda • written 8.2 years ago by Andrea Grilli
8.2 years ago by Mª José Nueda (Spain)
Mª José Nueda wrote:
Dear Andrea,

1) Your experimental design is correct.

2) Your interpretation of the 2 gene lists you get with vars="groups" is also correct. Normally the first group is the reference (the control group), and maSigPro looks for genes that differ between the other treatments and the control. If you want to find genes that change over time in the second group, you can do one of two things: set Group2 as the reference (the first group) or, as you suggest, split the data into 2 groups. The latter option, however, does not give you the genes that differ between groups.

3) Using vars="each" you get as many lists as there are variables in the model. The meaning "biologically speaking" depends on the study. This option lets you address specific questions (other than those covered by "all" or "groups") that you may be interested in, for instance finding all the genes with linear but not quadratic changes. You can combine these gene lists to answer the question you have in mind.

If anything in my answer is unclear, please contact me again. Thank you for using maSigPro.

María J. Nueda.

--------------------------------------------------
From: andrea.grilli@ior.it
Sent: Thursday, September 08, 2011 5:41 PM
To: bioconductor at r-project.org
Subject: [BioC] maSigPro and "vars" argument
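Point 2 of the reply suggests making Group2 the reference by placing it first among the group columns. A minimal sketch of that reordering is shown here on a toy design matrix with the same layout as the one in the question; the object names and the generic sample names are assumptions. The optional last step calls maSigPro's make.design.matrix() only if the package is installed, and its output is not shown because it was not checked here.

```r
# Toy stand-in for the design matrix in the question (hypothetical names):
# 2 groups, 4 time points, 2 replicate arrays per time point and group.
edesign <- cbind(
  Time       = rep(c(0, 7, 14, 21), each = 2, times = 2),
  Replicates = rep(1:8, each = 2),
  Group1     = rep(c(1, 0), each = 8),
  Group2     = rep(c(0, 1), each = 8)
)
rownames(edesign) <- paste0("sample", seq_len(nrow(edesign)))

# Put Group2 first among the group columns so that, as described in the
# reply, it is treated as the reference group.
edesign_ref2 <- edesign[, c("Time", "Replicates", "Group2", "Group1")]

# Optional: build the regression design with the reordered groups and
# inspect the names of the resulting variables.
if (requireNamespace("maSigPro", quietly = TRUE)) {
  design <- maSigPro::make.design.matrix(edesign_ref2, degree = 2)
  print(colnames(design$dis))
}
```

Rerunning the same analysis on the reordered matrix should then report the changes over time for Group2 and the differences of Group1 versus Group2, so both groups are covered without splitting the data.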
Hi María,

thank you for your detailed explanation.

Some doubts remain regarding point 3: taking my data as an example, I get 6 different variables (and therefore 6 different results), but it is difficult for me to understand what each variable corresponds to, and in particular how to handle the 6 different sets of clusters I get as results. Maybe my problem is understanding how the variables themselves are created: I know they come from the regression model, but nothing more.

Thanks in advance,
Andrea
Hi Andrea,

The regression model of each gene includes only the "statistically significant" variables, that is, the variables that change the response (gene expression). Each gene can have several significant variables. Examples:

- A gene with a flat profile in group1 and a linear trend in group2: in the model, time and time^2 will not be significant, and time_group2 will be significant.
- A gene with a linear profile in group1 and a quadratic profile in group2: time and time^2_group2 will be significant, and time^2 will not be significant.

In any case, this is difficult to interpret. If you are looking for genes that change in any way, I think you should choose vars="all", which gives you a single group of significant genes. If you have a specific question, then you have to decide how to use the 6 groups of genes you get with vars="each".

Good luck,
María.

--------------------------------------------------
From: andrea.grilli@ior.it
Sent: Monday, September 12, 2011 2:59 PM
To: "Mª José Nueda" (mj.nueda at ua.es)
Cc: bioconductor at r-project.org
Subject: Re: [BioC] maSigPro and "vars" argument
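The first example in the reply (flat in group1, linear in group2) can be illustrated with a small simulation. This is a sketch under stated assumptions: the expression values are simulated, the variable names are illustrative rather than maSigPro's internal names, and a plain lm() fit of the full quadratic model is used in place of maSigPro's stepwise selection.

```r
set.seed(1)

# Same layout as in the question: 2 groups, 4 time points, 2 replicates.
time   <- rep(c(0, 7, 14, 21), each = 2, times = 2)
group2 <- rep(c(0, 1), each = 8)   # dummy variable: 0 = group1, 1 = group2

# Simulated gene: flat in group1, linear increase in group2, small noise.
expr <- 8 + 0.2 * time * group2 + rnorm(length(time), sd = 0.05)

# Variables of a quadratic (degree = 2) model with one dummy variable.
# Together with the intercept this gives 6 terms.
vars <- data.frame(
  expr         = expr,
  group2       = group2,
  time         = time,
  time_group2  = time * group2,
  time2        = time^2,
  time2_group2 = time^2 * group2
)

# Full model fitted with ordinary least squares (no stepwise selection).
fit   <- lm(expr ~ group2 + time + time_group2 + time2 + time2_group2,
            data = vars)
coefs <- summary(fit)$coefficients
print(round(coefs, 4))
```

In this simulated gene the time_group2 term carries the signal, which is the situation described in the first example. With vars="each", a gene like this would be expected in the list for that variable, although the exact lists depend on the stepwise procedure and the chosen thresholds.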