Dear Christos,
Thanks for pointing out this research paper. It is interesting!
I am wondering whether non-specific filtering based on variance is a
good way to reduce the number of genes (probes) in this case. Let's
say we exclude the genes above a particular variance cutoff (e.g.,
above the 90th percentile).
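
A minimal base-R sketch of such a filter, for illustration only. The
helper name filter_by_variance, its arguments, and the simulated matrix
are all assumptions made up for this example; none of them come from
limma. The filter looks only at the overall per-probe variance and
never at the group labels, which is what makes it non-specific.

```r
# Hypothetical helper (not a limma function): filter an expression
# matrix (rows = probes, columns = arrays) by per-probe variance.
filter_by_variance <- function(expr, prob = 0.9,
                               exclude = c("above", "below")) {
  exclude <- match.arg(exclude)
  v <- apply(expr, 1, var)                        # variance of each probe
  cutoff <- quantile(v, probs = prob, names = FALSE)
  keep <- if (exclude == "above") v <= cutoff else v >= cutoff
  expr[keep, , drop = FALSE]
}

# Simulated data, for illustration only: 1000 probes on 24 arrays.
set.seed(1)
expr <- matrix(rnorm(1000 * 24), nrow = 1000,
               dimnames = list(paste0("probe", 1:1000), NULL))

# The rule as stated above: exclude probes above the 90th percentile.
low_var <- filter_by_variance(expr, prob = 0.9, exclude = "above")

# The opposite and more common convention: exclude the least variable
# probes (here, those below the median variance).
high_var <- filter_by_variance(expr, prob = 0.5, exclude = "below")

dim(low_var)
dim(high_var)
```

The filtered matrix could then be passed to lmFit in place of the full
one. Which side of the cutoff to exclude is a judgment call: the rule
as written removes the most variable probes, whereas the usual
convention removes the least variable ones.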
Thx
S.
On Mon, Jan 25, 2010 at 3:30 PM, Christos Hatzis
<christos.hatzis@nuverabio.com> wrote:
> This strategy is bound to be less efficient, though.
> See a recent article on this subject:
>
> http://www.biomedcentral.com/1471-2105/10/402
>
> -Christos
>
>
> Christos Hatzis, Ph.D.
> Nuvera Biosciences, Inc.
> 400 West Cummings Park, Suite 5350
> Woburn, MA 01801
> 781-938-3844
>
>
>
> -----Original Message-----
> From: bioconductor-bounces@stat.math.ethz.ch
> [mailto:bioconductor-bounces@stat.math.ethz.ch] On Behalf Of sabrina s
> Sent: Monday, January 25, 2010 3:17 PM
> To: Sunny Srivastava
> Cc: bioconductor@stat.math.ethz.ch
> Subject: Re: [BioC] question about lmFit model
>
> Dear Sunny:
> Thanks for your input. Personally, I prefer to combine the p-value
> and FC, because you cannot validate every gene detected, but
> validating a few with higher FC is probably feasible.
>
> Sabrina
>
>
>
> On Mon, Jan 25, 2010 at 12:05 AM, Sunny Srivastava
> <research.baba@gmail.com> wrote:
>
> > Dear Sabrina,
> > Experienced members of the group will have better things to say,
> > but here is my $0.25.
> > As a statistician, I would prefer Design 1. The reason is that
> > data should never be ignored.
> >
> > Also, the more data there is, the more information limma can use
> > in the empirical Bayes estimation of the S.D. The lower p-values
> > are a consequence of this. (Using less data might result in
> > inflated SDs, which in turn give higher p-values.)
> >
> > Comparing differential expression and fold change is like
> > comparing apples and oranges. Whether a gene is differentially
> > expressed has nothing to do with how low its fold change is.
> > As a statistician, I would always trust differential expression
> > more than fold change.
> > If you think that fold change is important for you, then you should
> > select the differentially expressed genes ONLY if their log
> > fold change is above, say, 2.
> >
> > You can do this in limma using topTable and/or decideTests.
> >
> > Please correct me if I am wrong.
> >
> > Thx
> > S.
> >
> > On Thu, Jan 21, 2010 at 1:32 PM, sabrina s <sabrina.shao@gmail.com> wrote:
> >
> >> Hi, Jenny:
> >> Thanks for the quick reply, and thanks for pointing out the
> >> etiquette about posting. I thought maybe my subject was not good
> >> enough to be noticed, and that is why I posted again. This is my
> >> first post, so I have a long way to go!
> >> Regarding your second point: I don't think my question is a
> >> general one about why ANOVA is better than a series of t-tests. I
> >> actually did both, and realized that the single model (using all
> >> samples) gave me much lower p-values, but when I looked at the
> >> expression values, the fold change was negligible, about 0.5.
> >> That is why I wonder whether the inflated DOF gave me the much
> >> lower p-values. Any thoughts on that?
> >>
> >> Thanks!
> >>
> >> Sabrina
> >>
> >> On Thu, Jan 21, 2010 at 12:05 PM, Jenny Drnevich <drnevich@illinois.edu> wrote:
> >>
> >> > Hi Sabrina,
> >> >
> >> > First, a little list etiquette. If you don't get a response to a
> >> > post within a day, it's not considered polite to just repost the
> >> > same question verbatim the next day under a different Subject.
> >> >
> >> > Second: your question isn't specific to the modeling of lmFit.
> >> > Instead, it's a general statistical question about why it's
> >> > better to fit one ANOVA model instead of a series of t-tests. I
> >> > suggest you consult a basic statistical textbook or a local
> >> > statistician to find the answer.
> >> >
> >> > Cheers,
> >> > Jenny
> >> >
> >> >
> >> > At 10:39 AM 1/21/2010, sabrina s wrote:
> >> >
> >> >> Hello, everyone:
> >> >>
> >> >> I have a question related to the conceptual understanding of lmFit.
> >> >>
> >> >> I have the following experiment that I want to conduct, but I am
> >> >> not sure which is the right way to use the design matrix and
> >> >> contrasts. Here is the experiment:
> >> >>
> >> >> Say I have 3 genetically different strains, A, B and C, where A
> >> >> is the control. I also have two different treatments, T1 and T2.
> >> >> For each strain, I have 4 arrays for each treatment, so in total
> >> >> I have 24 arrays. What I want to find are the significantly
> >> >> differentially expressed genes for the following comparisons:
> >> >> 1) for control strain A: T1 vs. T2
> >> >> 2) under T1, B vs. A (control)
> >> >> 3) under T1, C vs. A
> >> >> 4) for B, T1 vs. T2
> >> >> 5) for C, T1 vs. T2
> >> >> 6) interaction term of A and B, T1 and T2
> >> >> 7) interaction term of A and C, T1 and T2
> >> >>
> >> >> There are two ways I could use lmFit.
> >> >>
> >> >> One:
> >> >>
> >> >> For the design matrix, I include all 3 strains and 2 conditions,
> >> >> laid out as follows:
> >> >>           A_T1, A_T2, B_T1, B_T2, C_T1, C_T2
> >> >> sample1:  1,    0,    0,    0,    0,    0
> >> >> sample2:
> >> >>
> >> >> Then I make a contrast matrix and run the code below:
> >> >>
> >> >> fitGene <- lmFit(gene, design = design, weights = arrayWt)
> >> >> fitGene2 <- contrasts.fit(fitGene, cont.matrix)
> >> >> fitGene2 <- eBayes(fitGene2, proportion = p)
> >> >>
> >> >> Two:
> >> >> Instead of fitting all samples at once in a single lmFit call, I
> >> >> use two design matrices: one that involves only A and B (T1 and
> >> >> T2), and a second that involves only A and C (T1 and T2). I make
> >> >> a contrast matrix for each and fit them separately, and later on
> >> >> I can compare the two results if I want to.
> >> >>
> >> >>
> >> >>
> >> >> The question I have is: which one is the right one? With the
> >> >> first method, I have a larger DOF and much lower p-values, but it
> >> >> is testing the same thing as the second one, so am I creating an
> >> >> artifact? Thanks for your help!
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Sabrina
> >> >>
> >> >> [[alternative HTML version deleted]]
> >> >>
> >> >> _______________________________________________
> >> >> Bioconductor mailing list
> >> >> Bioconductor@stat.math.ethz.ch
> >> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> >> Search the archives:
> >> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >> >>
> >> >
> >> > Jenny Drnevich, Ph.D.
> >> >
> >> > Functional Genomics Bioinformatics Specialist
> >> > W.M. Keck Center for Comparative and Functional Genomics
> >> > Roy J. Carver Biotechnology Center
> >> > University of Illinois, Urbana-Champaign
> >> >
> >> > 330 ERML
> >> > 1201 W. Gregory Dr.
> >> > Urbana, IL 61801
> >> > USA
> >> >
> >> > ph: 217-244-7355
> >> > fax: 217-265-5066
> >> > e-mail: drnevich@illinois.edu
> >> >
> >>
> >>
> >>
> >> --
> >> Sabrina
> >>
> >
> >
>
>
> --
> Sabrina
>