limma modeling, paired samples and continuous variable
2
0
Entering edit mode
Guest User ★ 13k
@guest-user-4897
Last seen 9.6 years ago
Hi, I'm sorry for re-posting the message, but I cannot find it in the archive Thanks a lot for attention Hi, I would like to model and retrieve differential expression data regarding an experimental design in which different patients (9) have different disease classes (3 disease classes) and a feature represented with a percentage (0, 0.50, 0.75,1). some conditions are replicated 2 or 3 times, regarding the disease condition Till now I have done an analysis considering Genotype and Disease in the model (as a paired samples analysis) design <- model.matrix(~Genotype+Disease) or design <- model.matrix(~0+Genotype+Disease) now I would like to model also considering a continuous variable , namely r this way: design <- model.matrix(~Genotype+Disease+r) to see if differential gene expression between two classes of disease are correlated with the r status but till now it is not possible to gain results Coefficients not estimable: r0,5 r0,75 r1 Warning message: Partial NA coefficients for 15246 probe(s) if I model design <- model.matrix(~Disease+r) it goes well, but it would not consider the different genotypes I thank you very much for attention Thanks a lot Michela -- output of sessionInfo(): R version 3.0.2 (2013-09-25) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] limma_3.18.13 loaded via a namespace (and not attached): [1] tools_3.0.2 -- Sent via the guest posting facility at bioconductor.org.
• 2.2k views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 14 hours ago
United States
Hi Michela, On 4/15/2014 5:05 AM, michela riba [guest] wrote: > Hi, > I'm sorry for re-posting the message, but I cannot find it in the archive > Thanks a lot for attention > > > Hi, > I would like to model and retrieve differential expression data > regarding an experimental design in which different patients (9) have different disease classes (3 disease classes) and a feature represented with a percentage (0, 0.50, 0.75,1). > some conditions are replicated 2 or 3 times, regarding the disease condition > Till now I have done an analysis considering Genotype and Disease in the model (as a paired samples analysis) > > design <- model.matrix(~Genotype+Disease) > or > design <- model.matrix(~0+Genotype+Disease) > > now I would like to model also considering > a continuous variable , namely r > > this way: design <- model.matrix(~Genotype+Disease+r) > > to see if differential gene expression between two classes of disease are correlated with the r status > > but till now it is not possible to gain results > Coefficients not estimable: r0,5 r0,75 r1 > Warning message: > Partial NA coefficients for 15246 probe(s) This indicates that R is using your 'r' data as factor rather than numeric. I assume that is not what you want? If so you need to ensure that R thinks that 'r' is a numeric vector. If you really are trying to treat 'r' as a factor, then note that you have either an over-specified model (meaning you are trying to estimate more parameters than you have observations), or that three of the coefficients for 'r' are linear combinations of existing coefficients when you already have genotype and disease in the model. Best, Jim > > if I model > design <- model.matrix(~Disease+r) > it goes well, but it would not consider the different genotypes > > I thank you very much for attention > > Thanks a lot > > Michela > > -- output of sessionInfo(): > > R version 3.0.2 (2013-09-25) > Platform: x86_64-apple-darwin10.8.0 (64-bit) > > locale: > [1] it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] limma_3.18.13 > > loaded via a namespace (and not attached): > [1] tools_3.0.2 > > -- > Sent via the guest posting facility at bioconductor.org. > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD COMMENT
0
Entering edit mode
Hi, thanks a lot for your kind answer. I have to specify an additional observation: the "r"parameter is indeed a numeric variable and also in this situation the result is the same. Would be reasonable to try and model the design as: design<- <- model.matrix(~0+r) #where "r"is a numeric variable? for the points about the coefficients I have to reason about Thanks a lot Michela Il giorno 15/apr/2014, alle ore 16:23, James W. MacDonald <jmacdon@uw.edu> ha scritto: > Hi Michela, > > On 4/15/2014 5:05 AM, michela riba [guest] wrote: >> Hi, >> I'm sorry for re-posting the message, but I cannot find it in the archive >> Thanks a lot for attention >> >> >> Hi, >> I would like to model and retrieve differential expression data >> regarding an experimental design in which different patients (9) have different disease classes (3 disease classes) and a feature represented with a percentage (0, 0.50, 0.75,1). >> some conditions are replicated 2 or 3 times, regarding the disease condition >> Till now I have done an analysis considering Genotype and Disease in the model (as a paired samples analysis) >> >> design <- model.matrix(~Genotype+Disease) >> or >> design <- model.matrix(~0+Genotype+Disease) >> >> now I would like to model also considering >> a continuous variable , namely r >> >> this way: design <- model.matrix(~Genotype+Disease+r) >> >> to see if differential gene expression between two classes of disease are correlated with the r status >> >> but till now it is not possible to gain results >> Coefficients not estimable: r0,5 r0,75 r1 >> Warning message: >> Partial NA coefficients for 15246 probe(s) > > This indicates that R is using your 'r' data as factor rather than numeric. I assume that is not what you want? If so you need to ensure that R thinks that 'r' is a numeric vector. > > If you really are trying to treat 'r' as a factor, then note that you have either an over-specified model (meaning you are trying to estimate more parameters than you have observations), or that three of the coefficients for 'r' are linear combinations of existing coefficients when you already have genotype and disease in the model. > > Best, > > Jim > > >> >> if I model >> design <- model.matrix(~Disease+r) >> it goes well, but it would not consider the different genotypes >> >> I thank you very much for attention >> >> Thanks a lot >> >> Michela >> >> -- output of sessionInfo(): >> >> R version 3.0.2 (2013-09-25) >> Platform: x86_64-apple-darwin10.8.0 (64-bit) >> >> locale: >> [1] it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] limma_3.18.13 >> >> loaded via a namespace (and not attached): >> [1] tools_3.0.2 >> >> -- >> Sent via the guest posting facility at bioconductor.org. >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > Dr. Michela Riba Genome Function Unit Center for Translational Genomics and Bioinformatics San Raffaele Scientific Institute Via Olgettina 58 20132 Milano Italy lab: +39 02 2643 9114 skype: mic_mir32 riba.michela@gmail.com riba.michela@hsr.it Se avete ricevuto il presente messaggio per errore, Vi preghiamo di darne immediata comunicazione al mittente e di provvedere alla sua cancellazione dal vostro computer. Grazie If you have received this e-mail in error, please let the sender know and delete it from your computer. Thank you [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

> Date: Thu, 17 Apr 2014 09:26:33 +0200
> From: Riba Michela <riba.michela at="" gmail.com="">
> To: "James W. MacDonald" <jmacdon at="" uw.edu="">
> Cc: bioconductor at r-project.org
> Subject: Re: [BioC] limma modeling, paired samples and continuous
>       variable
>
> Hi,
> thanks a lot for your kind answer.
> I have to specify an additional observation:
> the "r"parameter is indeed a numeric variable and also in this situation the result is the same.

Actually it is not possible to get the same message as before if you have correctly code r as a numeric variable.

> Would be reasonable to try and model the design as:
> design<- <- model.matrix(~0+r)
> #where "r"is a numeric variable?
>
> for the points about the coefficients I have to reason about

No.

To answer your question "if differential gene expression between two classes of disease are correlated with the r status", you probably need a Genotype:r iteraction term in your model.

You probably need to show us the whole targets frame for us to help you
further.  In other words, we need to see:

   data.frame(Genotype,Disease,r)

Best wishes
Gordon

ADD COMMENT
0
Entering edit mode
Hi, thanks a lot for your answer and I'm forwarding the covariate matrix of our design. target<- readTargets("targetPTpGSp.txt") head(target) Genotype <- factor(target$Genotype) Disease<- factor(target$Disease, levels=c("stageA", "stageB", "stageC")) # Condition <-factor(target$Condition) r<-target$Condition #this should be numeric I'm just recalling the most striking parts of what I ideally would try to do and what I have already did. Till now I have performed a paired samples analysis using design <- model.matrix(~Genotype+Disease) but I would like to include also a continuous parameter ("Condition") in the model because it seems that differentially expressed genes in two different stages of the disease e.g. "stageB" results in the fit , coming from the above specified paired sample design and indicating stageB-stageA differentially expressed genes) somehow correlate with the "Condition"parameter At the model I could make it function using design<- model.matrix(~Disease+r) but not using design <- model.matrix(~Genotype+r) nor using design <- model.matrix(~Genotype+Disease+r) I'm not sure on what design I should place to try and face the question and the simplest I could imagine: design <- model.matrix(~Genotype+Disease+r) does not work I thank you very much for your supportive help Michela Il giorno 18/apr/2014, alle ore 03:43, Gordon K Smyth <smyth at="" wehi.edu.au=""> ha scritto: > >> Date: Thu, 17 Apr 2014 09:26:33 +0200 >> From: Riba Michela <riba.michela at="" gmail.com=""> >> To: "James W. MacDonald" <jmacdon at="" uw.edu=""> >> Cc: bioconductor at r-project.org >> Subject: Re: [BioC] limma modeling, paired samples and continuous >> variable >> >> Hi, >> thanks a lot for your kind answer. >> I have to specify an additional observation: >> the "r"parameter is indeed a numeric variable and also in this situation the result is the same. > > Actually it is not possible to get the same message as before if you have correctly code r as a numeric variable. > >> Would be reasonable to try and model the design as: >> design<- <- model.matrix(~0+r) >> #where "r"is a numeric variable? >> >> for the points about the coefficients I have to reason about > > No. > > To answer your question "if differential gene expression between two classes of disease are correlated with the r status", you probably need a Genotype:r iteraction term in your model. > > You probably need to show us the whole targets frame for us to help you further. In other words, we need to see: > > data.frame(Genotype,Disease,r) > > Best wishes > Gordon > >> Thanks a lot >> >> Michela >> Il giorno 15/apr/2014, alle ore 16:23, James W. MacDonald <jmacdon at="" uw.edu=""> ha scritto: >> >>> Hi Michela, >>> >>> On 4/15/2014 5:05 AM, michela riba [guest] wrote: >>>> Hi, >>>> I'm sorry for re-posting the message, but I cannot find it in the archive >>>> Thanks a lot for attention >>>> >>>> >>>> Hi, >>>> I would like to model and retrieve differential expression data >>>> regarding an experimental design in which different patients (9) have different disease classes (3 disease classes) and a feature represented with a percentage (0, 0.50, 0.75,1). >>>> some conditions are replicated 2 or 3 times, regarding the disease condition >>>> Till now I have done an analysis considering Genotype and Disease in the model (as a paired samples analysis) >>>> >>>> design <- model.matrix(~Genotype+Disease) >>>> or >>>> design <- model.matrix(~0+Genotype+Disease) >>>> >>>> now I would like to model also considering >>>> a continuous variable , namely r >>>> >>>> this way: design <- model.matrix(~Genotype+Disease+r) >>>> >>>> to see if differential gene expression between two classes of disease are correlated with the r status >>>> >>>> but till now it is not possible to gain results >>>> Coefficients not estimable: r0,5 r0,75 r1 >>>> Warning message: >>>> Partial NA coefficients for 15246 probe(s) >>> >>> This indicates that R is using your 'r' data as factor rather than numeric. I assume that is not what you want? If so you need to ensure that R thinks that 'r' is a numeric vector. >>> >>> If you really are trying to treat 'r' as a factor, then note that you have either an over-specified model (meaning you are trying to estimate more parameters than you have observations), or that three of the coefficients for 'r' are linear combinations of existing coefficients when you already have genotype and disease in the model. >>> >>> Best, >>> >>> Jim >>> >>> >>>> >>>> if I model >>>> design <- model.matrix(~Disease+r) >>>> it goes well, but it would not consider the different genotypes >>>> >>>> I thank you very much for attention >>>> >>>> Thanks a lot >>>> >>>> Michela >>>> >>>> -- output of sessionInfo(): >>>> >>>> R version 3.0.2 (2013-09-25) >>>> Platform: x86_64-apple-darwin10.8.0 (64-bit) >>>> >>>> locale: >>>> [1] it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8 >>>> >>>> attached base packages: >>>> [1] stats graphics grDevices utils datasets methods base >>>> >>>> other attached packages: >>>> [1] limma_3.18.13 >>>> >>>> loaded via a namespace (and not attached): >>>> [1] tools_3.0.2 >>>> >>>> -- >>>> Sent via the guest posting facility at bioconductor.org. >>>> >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >>> -- >>> James W. MacDonald, M.S. >>> Biostatistician >>> University of Washington >>> Environmental and Occupational Health Sciences >>> 4225 Roosevelt Way NE, # 100 >>> Seattle WA 98105-6099 >>> >> >> Dr. Michela Riba >> Genome Function Unit >> Center for Translational Genomics and Bioinformatics >> San Raffaele Scientific Institute >> Via Olgettina 58 >> 20132 Milano >> Italy >> >> lab: +39 02 2643 9114 >> skype: mic_mir32 >> riba.michela at gmail.com >> riba.michela at hsr.it > > ______________________________________________________________________ > The information in this email is confidential and intended solely for the addressee. > You must not disclose, forward, print or use it without the permission of the sender. > ______________________________________________________________________ Dr. Michela Riba Genome Function Unit Center for Translational Genomics and Bioinformatics San Raffaele Scientific Institute Via Olgettina 58 20132 Milano Italy lab: +39 02 2643 9114 skype: mic_mir32 riba.michela at gmail.com riba.michela at hsr.it Se avete ricevuto il presente messaggio per errore, Vi preghiamo di darne immediata comunicazione al mittente e di provvedere alla sua cancellazione dal vostro computer. Grazie If you have received this e-mail in error, please let the sender know and delete it from your computer. Thank you
ADD REPLY
0
Entering edit mode

Dear Riba,

Well, a couple of points.

First, if you want to treat Condition as numeric, then you must not declare it to be a factor.  In R, a "factor" is a variable that is categorical instead of numeric.

Second, Condition is entirely confounded with Genotype in your experiment. Samples for the same Genotype always have the same Condition. For example, all samples of Genotype pt01 have Condition=0, all samples of Genotype
pt06 have Condition=0.5, and so on.  Hence you cannot include Condition and Genotype in the same model, because they are giving the same information.  If you adjust for Genotype in the model, then you have necessarily also adjusted for the Condition.

You need:

   target<- readTargets("targetPTpGSp.txt")
   Genotype <- factor(target$Genotype)
   Disease<- factor(target$Disease)
   Condition <- target$Condition

Then can use either:

   design <- model.matrix(~Genotype+Disease)

or

   design <- mode.matrix(~Condition+Disease)

Best wishes
Gordon

ADD REPLY
0
Entering edit mode
Hi Professor Gordon, thanks for you answer. I just want to add some observations: -about the factor I actually declared as a factor, but afterwards I used another : >> r<-target$Condition #this should be numeric where actually I returned back to the target file and extracted the column of the numeric variable producing an object, "r". The class of "r"is "numeric sorry for the misleading mentioning of a factor that I did eventually not use or better when I found that was considered as factor I considered how to render it numeric -I see the second point even if I would add that yes each Genotype has the same Condition parameter but some Condition is shared between Genotypes in this sense they do not represent exactly the same information or better only in one sense to my understanding in any case I would go through the results of the models you propose and see differentially expressed genes. And see how I could put the information coming from the two parametrizations together Anyhow if you have additional comments about this point I would appreciate this very much thanks a lot and Happy Easter Michela Il giorno 19/apr/2014, alle ore 01:53, Gordon K Smyth <smyth@wehi.edu.au> ha scritto: > Dear Riba, > > Well, a couple of points. > > First, if you want to treat Condition as numeric, then you must not declare it to be a factor. In R, a "factor" is a variable that is categorical instead of numeric. > > Second, Condition is entirely confounded with Genotype in your experiment. Samples for the same Genotype always have the same Condition. For example, all samples of Genotype pt01 have Condition=0, all samples of Genotype pt06 have Condition=0.5, and so on. Hence you cannot include Condition and Genotype in the same model, because they are giving the same information. If you adjust for Genotype in the model, then you have necessarily also adjusted for the Condition. > > You need: > > target<- readTargets("targetPTpGSp.txt") > Genotype <- factor(target$Genotype) > Disease<- factor(target$Disease) > Condition <- target$Condition > > Then can use either: > > design <- model.matrix(~Genotype+Disease) > > or > > design <- mode.matrix(~Condition+Disease) > > Best wishes > Gordon > >> Date: Fri, 18 Apr 2014 11:19:56 +0200 >> From: Riba Michela <riba.michela@gmail.com> >> To: Gordon K Smyth <smyth@wehi.edu.au> >> Cc: Bioconductor mailing list <bioconductor@r-project.org>, James W. > MacDonald <jmacdon@uw.edu> >> Subject: Re: limma modeling, paired samples and continuous variable >> >> Hi, >> thanks a lot for your answer and I'm forwarding the covariate matrix >> of our design. >> >> target<- readTargets("targetPTpGSp.txt") >> head(target) >> >> Genotype <- factor(target$Genotype) >> Disease<- factor(target$Disease, levels=c("stageA", "stageB", >> "stageC")) >> >> # Condition <-factor(target$Condition) >> r<-target$Condition #this should be numeric >> >> I'm just recalling the most striking parts of what I ideally would >> try to do and what I have already did. >> >> Till now I have performed a paired samples analysis using >> design <- model.matrix(~Genotype+Disease) >> >> but I would like to include also a continuous parameter ("Condition") >> in the model because it seems >> that differentially expressed genes in two different stages of the >> disease e.g. "stageB" results in the fit , coming from the above >> specified paired sample design and indicating stageB-stageA >> differentially expressed genes) >> somehow correlate with the "Condition"parameter >> >> At the model I could make it function using >> design<- model.matrix(~Disease+r) >> but not using >> design <- model.matrix(~Genotype+r) >> nor using >> design <- model.matrix(~Genotype+Disease+r) >> >> I'm not sure on what design I should place to try and face the >> question >> and the simplest I could imagine: >> design <- model.matrix(~Genotype+Disease+r) >> >> does not work >> >> I thank you very much for your supportive help >> >> Michela >> >> > >>> From smyth@wehi.edu.au Fri Apr 18 11:43:34 2014 >>> Date: Fri, 18 Apr 2014 11:43:28 +1000 (AUS Eastern Standard Time) >>> From: Gordon K Smyth <smyth@wehi.edu.au> >>> To: Riba Michela <riba.michela@gmail.com> >>> Cc: Bioconductor mailing list <bioconductor@r-project.org>, James W. > MacDonald <jmacdon@uw.edu> >>> Subject: limma modeling, paired samples and continuous variable >>> >>> >>>> Date: Thu, 17 Apr 2014 09:26:33 +0200 >>>> From: Riba Michela <riba.michela@gmail.com> >>>> To: "James W. MacDonald" <jmacdon@uw.edu> >>>> Cc: bioconductor@r-project.org >>>> Subject: Re: [BioC] limma modeling, paired samples and continuous >>>> variable >>>> >>>> Hi, >>>> thanks a lot for your kind answer. >>>> I have to specify an additional observation: >>>> the "r"parameter is indeed a numeric variable and also in this > situation >>>> the result is the same. >>> >>> Actually it is not possible to get the same message as before if you > have >>> correctly code r as a numeric variable. >>> >>>> Would be reasonable to try and model the design as: >>>> design<- <- model.matrix(~0+r) >>>> #where "r"is a numeric variable? >>>> >>>> for the points about the coefficients I have to reason about >>> >>> No. >>> >>> To answer your question "if differential gene expression between two > classes >>> of disease are correlated with the r status", you probably need a > Genotype:r >>> iteraction term in your model. >>> >>> You probably need to show us the whole targets frame for us to help you >>> further. In other words, we need to see: >>> >>> data.frame(Genotype,Disease,r) >>> >>> Best wishes >>> Gordon >>> >>>> Thanks a lot >>>> >>>> Michela >>>> Il giorno 15/apr/2014, alle ore 16:23, James W. MacDonald > <jmacdon@uw.edu> >>>> ha scritto: >>>> >>>>> Hi Michela, >>>>> >>>>> On 4/15/2014 5:05 AM, michela riba [guest] wrote: >>>>>> Hi, >>>>>> I'm sorry for re-posting the message, but I cannot find it in the >>>>>> archive >>>>>> Thanks a lot for attention >>>>>> >>>>>> >>>>>> Hi, >>>>>> I would like to model and retrieve differential expression data >>>>>> regarding an experimental design in which different patients (9) > have >>>>>> different disease classes (3 disease classes) and a feature >>>>>> represented with a percentage (0, 0.50, 0.75,1). >>>>>> some conditions are replicated 2 or 3 times, regarding the > disease >>>>>> condition >>>>>> Till now I have done an analysis considering Genotype and Disease in >>>>>> the model (as a paired samples analysis) >>>>>> >>>>>> design <- model.matrix(~Genotype+Disease) >>>>>> or >>>>>> design <- model.matrix(~0+Genotype+Disease) >>>>>> >>>>>> now I would like to model also considering >>>>>> a continuous variable , namely r >>>>>> >>>>>> this way: design <- model.matrix(~Genotype+Disease+r) >>>>>> >>>>>> to see if differential gene expression between two classes of > disease >>>>>> are correlated with the r status >>>>>> >>>>>> but till now it is not possible to gain results >>>>>> Coefficients not estimable: r0,5 r0,75 r1 >>>>>> Warning message: >>>>>> Partial NA coefficients for 15246 probe(s) >>>>> >>>>> This indicates that R is using your 'r' data as factor rather than >>>>> numeric. I assume that is not what you want? If so you need to ensure >>>>> that R thinks that 'r' is a numeric vector. >>>>> >>>>> If you really are trying to treat 'r' as a factor, then note that you >>>>> have either an over-specified model (meaning you are trying to > estimate >>>>> more parameters than you have observations), or that three of the >>>>> coefficients for 'r' are linear combinations of existing coefficients >>>>> when you already have genotype and disease in the model. >>>>> >>>>> Best, >>>>> >>>>> Jim >>>>> >>>>> >>>>>> >>>>>> if I model >>>>>> design <- model.matrix(~Disease+r) >>>>>> it goes well, but it would not consider the different genotypes >>>>>> >>>>>> I thank you very much for attention >>>>>> >>>>>> Thanks a lot >>>>>> >>>>>> Michela >>>>>> >>>>>> -- output of sessionInfo(): >>>>>> >>>>>> R version 3.0.2 (2013-09-25) >>>>>> Platform: x86_64-apple-darwin10.8.0 (64-bit) >>>>>> >>>>>> locale: >>>>>> [1] it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8 >>>>>> >>>>>> attached base packages: >>>>>> [1] stats graphics grDevices utils datasets methods base >>>>>> >>>>>> other attached packages: >>>>>> [1] limma_3.18.13 >>>>>> >>>>>> loaded via a namespace (and not attached): >>>>>> [1] tools_3.0.2 >>>>>> >>>>> -- >>>>> James W. MacDonald, M.S. >>>>> Biostatistician >>>>> University of Washington >>>>> Environmental and Occupational Health Sciences >>>>> 4225 Roosevelt Way NE, # 100 >>>>> Seattle WA 98105-6099 >>>>> >>>> >>>> Dr. Michela Riba >>>> Genome Function Unit >>>> Center for Translational Genomics and Bioinformatics >>>> San Raffaele Scientific Institute >>>> Via Olgettina 58 >>>> 20132 Milano >>>> Italy >>>> >>>> lab: +39 02 2643 9114 >>>> skype: mic_mir32 >>>> riba.michela@gmail.com >>>> riba.michela@hsr.it > > ______________________________________________________________________ > The information in this email is confidential and intended solely for the addressee. > You must not disclose, forward, print or use it without the permission of the sender. > ______________________________________________________________________ Dr. Michela Riba Genome Function Unit Center for Translational Genomics and Bioinformatics San Raffaele Scientific Institute Via Olgettina 58 20132 Milano Italy lab: +39 02 2643 9114 skype: mic_mir32 riba.michela@gmail.com riba.michela@hsr.it Se avete ricevuto il presente messaggio per errore, Vi preghiamo di darne immediata comunicazione al mittente e di provvedere alla sua cancellazione dal vostro computer. Grazie If you have received this e-mail in error, please let the sender know and delete it from your computer. Thank you [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
On Sat, 19 Apr 2014, Riba Michela wrote: > Hi Professor Gordon, > thanks for you answer. > > I just want to add some observations: > -about the factor > I actually declared as a factor, > but afterwards I used another : > >>> r<-target$Condition #this should be numeric > > where actually I returned back to the target file and extracted the column of the numeric variable > producing an object, "r". > The class of "r"is "numeric OK. > sorry for the misleading mentioning of a factor that I did eventually > not use or better when I found that was considered as factor I > considered how to render it numeric > > > -I see the second point even if I would add that yes each Genotype has > the same Condition parameter but some Condition is shared between > Genotypes in this sense they do not represent exactly the same > information or better only in one sense to my understanding All information in Condition is also contained in Genotype, but not the other way around. You can examine whether Condition (one design column) is an adequate substitute for Genotype (8 design columns). Best Gordon > in any case I would go through the results of the models you propose and > see differentially expressed genes. And see how I could put the > information coming from the two parametrizations together > > Anyhow if you have additional comments about this point > I would appreciate this very much > > thanks a lot > > and > Happy Easter > > > Michela > > Il giorno 19/apr/2014, alle ore 01:53, Gordon K Smyth <smyth at="" wehi.edu.au=""> ha scritto: > >> Dear Riba, >> >> Well, a couple of points. >> >> First, if you want to treat Condition as numeric, then you must not declare it to be a factor. In R, a "factor" is a variable that is categorical instead of numeric. >> >> Second, Condition is entirely confounded with Genotype in your experiment. Samples for the same Genotype always have the same Condition. For example, all samples of Genotype pt01 have Condition=0, all samples of Genotype pt06 have Condition=0.5, and so on. Hence you cannot include Condition and Genotype in the same model, because they are giving the same information. If you adjust for Genotype in the model, then you have necessarily also adjusted for the Condition. >> >> You need: >> >> target<- readTargets("targetPTpGSp.txt") >> Genotype <- factor(target$Genotype) >> Disease<- factor(target$Disease) >> Condition <- target$Condition >> >> Then can use either: >> >> design <- model.matrix(~Genotype+Disease) >> >> or >> >> design <- mode.matrix(~Condition+Disease) >> >> Best wishes >> Gordon >> >>> Date: Fri, 18 Apr 2014 11:19:56 +0200 >>> From: Riba Michela <riba.michela at="" gmail.com=""> >>> To: Gordon K Smyth <smyth at="" wehi.edu.au=""> >>> Cc: Bioconductor mailing list <bioconductor at="" r-project.org="">, James W. >> MacDonald <jmacdon at="" uw.edu=""> >>> Subject: Re: limma modeling, paired samples and continuous variable >>> >>> Hi, >>> thanks a lot for your answer and I'm forwarding the covariate matrix >>> of our design. >>> >>> target<- readTargets("targetPTpGSp.txt") >>> head(target) >>> >>> Genotype <- factor(target$Genotype) >>> Disease<- factor(target$Disease, levels=c("stageA", "stageB", >>> "stageC")) >>> >>> # Condition <-factor(target$Condition) >>> r<-target$Condition #this should be numeric >>> >>> I'm just recalling the most striking parts of what I ideally would >>> try to do and what I have already did. >>> >>> Till now I have performed a paired samples analysis using >>> design <- model.matrix(~Genotype+Disease) >>> >>> but I would like to include also a continuous parameter ("Condition") >>> in the model because it seems >>> that differentially expressed genes in two different stages of the >>> disease e.g. "stageB" results in the fit , coming from the above >>> specified paired sample design and indicating stageB-stageA >>> differentially expressed genes) >>> somehow correlate with the "Condition"parameter >>> >>> At the model I could make it function using >>> design<- model.matrix(~Disease+r) >>> but not using >>> design <- model.matrix(~Genotype+r) >>> nor using >>> design <- model.matrix(~Genotype+Disease+r) >>> >>> I'm not sure on what design I should place to try and face the >>> question >>> and the simplest I could imagine: >>> design <- model.matrix(~Genotype+Disease+r) >>> >>> does not work >>> >>> I thank you very much for your supportive help >>> >>> Michela >>> >>> >> >>>> From smyth at wehi.edu.au Fri Apr 18 11:43:34 2014 >>>> Date: Fri, 18 Apr 2014 11:43:28 +1000 (AUS Eastern Standard Time) >>>> From: Gordon K Smyth <smyth at="" wehi.edu.au=""> >>>> To: Riba Michela <riba.michela at="" gmail.com=""> >>>> Cc: Bioconductor mailing list <bioconductor at="" r-project.org="">, James W. >> MacDonald <jmacdon at="" uw.edu=""> >>>> Subject: limma modeling, paired samples and continuous variable >>>> >>>> >>>>> Date: Thu, 17 Apr 2014 09:26:33 +0200 >>>>> From: Riba Michela <riba.michela at="" gmail.com=""> >>>>> To: "James W. MacDonald" <jmacdon at="" uw.edu=""> >>>>> Cc: bioconductor at r-project.org >>>>> Subject: Re: [BioC] limma modeling, paired samples and continuous >>>>> variable >>>>> >>>>> Hi, >>>>> thanks a lot for your kind answer. >>>>> I have to specify an additional observation: >>>>> the "r"parameter is indeed a numeric variable and also in this >> situation >>>>> the result is the same. >>>> >>>> Actually it is not possible to get the same message as before if you >> have >>>> correctly code r as a numeric variable. >>>> >>>>> Would be reasonable to try and model the design as: >>>>> design<- <- model.matrix(~0+r) >>>>> #where "r"is a numeric variable? >>>>> >>>>> for the points about the coefficients I have to reason about >>>> >>>> No. >>>> >>>> To answer your question "if differential gene expression between two >> classes >>>> of disease are correlated with the r status", you probably need a >> Genotype:r >>>> iteraction term in your model. >>>> >>>> You probably need to show us the whole targets frame for us to help you >>>> further. In other words, we need to see: >>>> >>>> data.frame(Genotype,Disease,r) >>>> >>>> Best wishes >>>> Gordon >>>> >>>>> Thanks a lot >>>>> >>>>> Michela >>>>> Il giorno 15/apr/2014, alle ore 16:23, James W. MacDonald >> <jmacdon at="" uw.edu=""> >>>>> ha scritto: >>>>> >>>>>> Hi Michela, >>>>>> >>>>>> On 4/15/2014 5:05 AM, michela riba [guest] wrote: >>>>>>> Hi, >>>>>>> I'm sorry for re-posting the message, but I cannot find it in the >>>>>>> archive >>>>>>> Thanks a lot for attention >>>>>>> >>>>>>> >>>>>>> Hi, >>>>>>> I would like to model and retrieve differential expression data >>>>>>> regarding an experimental design in which different patients (9) >> have >>>>>>> different disease classes (3 disease classes) and a feature >>>>>>> represented with a percentage (0, 0.50, 0.75,1). >>>>>>> some conditions are replicated 2 or 3 times, regarding the >> disease >>>>>>> condition >>>>>>> Till now I have done an analysis considering Genotype and Disease in >>>>>>> the model (as a paired samples analysis) >>>>>>> >>>>>>> design <- model.matrix(~Genotype+Disease) >>>>>>> or >>>>>>> design <- model.matrix(~0+Genotype+Disease) >>>>>>> >>>>>>> now I would like to model also considering >>>>>>> a continuous variable , namely r >>>>>>> >>>>>>> this way: design <- model.matrix(~Genotype+Disease+r) >>>>>>> >>>>>>> to see if differential gene expression between two classes of >> disease >>>>>>> are correlated with the r status >>>>>>> >>>>>>> but till now it is not possible to gain results >>>>>>> Coefficients not estimable: r0,5 r0,75 r1 >>>>>>> Warning message: >>>>>>> Partial NA coefficients for 15246 probe(s) >>>>>> >>>>>> This indicates that R is using your 'r' data as factor rather than >>>>>> numeric. I assume that is not what you want? If so you need to ensure >>>>>> that R thinks that 'r' is a numeric vector. >>>>>> >>>>>> If you really are trying to treat 'r' as a factor, then note that you >>>>>> have either an over-specified model (meaning you are trying to >> estimate >>>>>> more parameters than you have observations), or that three of the >>>>>> coefficients for 'r' are linear combinations of existing coefficients >>>>>> when you already have genotype and disease in the model. >>>>>> >>>>>> Best, >>>>>> >>>>>> Jim >>>>>> >>>>>> >>>>>>> >>>>>>> if I model >>>>>>> design <- model.matrix(~Disease+r) >>>>>>> it goes well, but it would not consider the different genotypes >>>>>>> >>>>>>> I thank you very much for attention >>>>>>> >>>>>>> Thanks a lot >>>>>>> >>>>>>> Michela >>>>>>> >>>>>>> -- output of sessionInfo(): >>>>>>> >>>>>>> R version 3.0.2 (2013-09-25) >>>>>>> Platform: x86_64-apple-darwin10.8.0 (64-bit) >>>>>>> >>>>>>> locale: >>>>>>> [1] it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8 >>>>>>> >>>>>>> attached base packages: >>>>>>> [1] stats graphics grDevices utils datasets methods base >>>>>>> >>>>>>> other attached packages: >>>>>>> [1] limma_3.18.13 >>>>>>> >>>>>>> loaded via a namespace (and not attached): >>>>>>> [1] tools_3.0.2 >>>>>>> >>>>>> -- >>>>>> James W. MacDonald, M.S. >>>>>> Biostatistician >>>>>> University of Washington >>>>>> Environmental and Occupational Health Sciences >>>>>> 4225 Roosevelt Way NE, # 100 >>>>>> Seattle WA 98105-6099 >>>>>> >>>>> >>>>> Dr. Michela Riba >>>>> Genome Function Unit >>>>> Center for Translational Genomics and Bioinformatics >>>>> San Raffaele Scientific Institute >>>>> Via Olgettina 58 >>>>> 20132 Milano >>>>> Italy >>>>> >>>>> lab: +39 02 2643 9114 >>>>> skype: mic_mir32 >>>>> riba.michela at gmail.com >>>>> riba.michela at hsr.it >> >> ______________________________________________________________________ >> The information in this email is confidential and intended solely for the addressee. >> You must not disclose, forward, print or use it without the permission of the sender. >> ______________________________________________________________________ > > Dr. Michela Riba > Genome Function Unit > Center for Translational Genomics and Bioinformatics > San Raffaele Scientific Institute > Via Olgettina 58 > 20132 Milano > Italy > > lab: +39 02 2643 9114 > skype: mic_mir32 > riba.michela at gmail.com > riba.michela at hsr.it > > > Se avete ricevuto il presente messaggio per errore, Vi preghiamo di darne immediata comunicazione al mittente e di provvedere alla sua cancellazione dal vostro computer. Grazie > > If you have received this e-mail in error, please let the sender know and delete it from your computer. Thank you > > > > > ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
ADD REPLY
0
Entering edit mode

Sorry, I meant "Dear Michela".

Gordon

ADD REPLY

Login before adding your answer.

Traffic: 876 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6