I'm confused about LIMMA statistics
Amy Johnson ▴ 40
@amy-johnson-3014
Last seen 10.2 years ago
Hi,

I'm new to microarray data analysis and have a quick, perhaps "simple", question about LIMMA that I hope someone can help with.

We have data from 6 Agilent arrays (3 controls and 3 treated samples). I would like to use LIMMA to identify differentially expressed genes. I'm confused about the LIMMA statistics (logFC, AveExpr, t, P.Value, adj.P.Val, and B). Which one do researchers typically use to select differentially expressed genes? Can I simply use adj.P.Val < 0.05 or P.Value < 0.05? Or should I use a combination of these statistics?

Thanks.
Amy
Microarray limma • 1.2k views
Mete Civelek ▴ 180
@mete-civelek-4566
Last seen 10.2 years ago
Hi All,

I want countGenomicOverlaps to output a weighted hit count such that when a read maps to, for example, four loci, a feature at one of those loci would get 1/4th of a count from that read. At the moment, countGenomicOverlaps doesn't behave the way I expect it to.

Consider this example:

subj = GRangesList(feature1=GRanges(seq='1', IRanges(10,30), strand='+'))
qry = GRangesList(read1=GRanges(seq='1', IRanges(c(10,60,100),c(20,70,110)), strand='+'))
countGenomicOverlaps(qry, subj, resolution='divide')

I would have expected the hit count to be 1/3 but instead it reports it as 1/2. Am I using this function correctly?

My sessionInfo is:

R version 2.12.2 (2011-02-25)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] GenomicRanges_1.4.0 IRanges_1.10.0
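For readers following the arithmetic, here is a minimal base-R sketch of the weighting the post expects: each of a read's n alignments carries weight 1/n, and a feature collects the weight of every alignment it overlaps. The helper `weighted_hits` is hypothetical and is not part of GenomicRanges; it ignores sequence names and strand and treats ranges as closed intervals, so it only illustrates the expected 1/3, not how countGenomicOverlaps is implemented.

```r
# Hypothetical helper (NOT a GenomicRanges function): fractional counting for
# one read with several alignments against a single feature.
# Assumption: all ranges are on the same sequence and strand.
weighted_hits <- function(read_start, read_end, feat_start, feat_end) {
  n <- length(read_start)                                   # number of loci the read maps to
  overlaps <- read_start <= feat_end & read_end >= feat_start  # closed-interval overlap test
  sum(overlaps) / n                                         # each overlapping alignment adds 1/n
}

# Coordinates taken from the example in the post: three alignments, one feature.
hit <- weighted_hits(c(10, 60, 100), c(20, 70, 110), 10, 30)
print(hit)
```

Only the first alignment (10-20) overlaps the feature (10-30), so under this scheme the feature receives one third of a count.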
Hi Mete,

Yes, you are using the function correctly and you have found a bug. I'll let you know as soon as it's fixed.

Thanks,
Valerie

On 04/25/2011 04:38 PM, Mete Civelek wrote:
> I would have expected the hit count to be 1/3 but instead it reports it as
> 1/2. Am I using this function correctly?
Mete,

The bug is now fixed in the devel trunk (version 1.5.5) and the release branch (version 1.4.2). It will be a day before the new package versions propagate through the build system and are available through biocLite. If you want to retrieve them directly, they are available via svn at

release: https://hedgehog.fhcrc.org/bioconductor/branches/RELEASE_2_8/madman/Rpacks/GenomicRanges
devel: https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/GenomicRanges

I've included an additional example on the man page for countGenomicOverlaps illustrating the handling of split reads.

Let me know if you run into other problems.

Take care,
Valerie

On 04/26/2011 03:56 PM, Valerie Obenchain wrote:
> Hi Mete,
>
> Yes, you are using the function correctly and you have found a bug.
> I'll let you know as soon as it's fixed.
>
> Thanks,
> Valerie
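A quick way to tell whether a given GenomicRanges version carries the fix is to compare it against the two version numbers mentioned above. The sketch below is only an illustration: `has_overlap_fix` is a hypothetical helper, and it assumes that 1.4.x is the release series and 1.5.x the devel series, as the message implies. In a live session the version string would come from `packageVersion("GenomicRanges")`.

```r
# Hypothetical helper: TRUE if the version is at or past the fixed version on
# its own branch (release 1.4.x fixed in 1.4.2, devel 1.5.x fixed in 1.5.5).
has_overlap_fix <- function(v) {
  v <- package_version(v)
  if (v < "1.5.0") v >= "1.4.2" else v >= "1.5.5"
}

# Version from the sessionInfo in the original report.
reported <- has_overlap_fix("1.4.0")
print(reported)
```

A plain `>= "1.4.2"` test would wrongly pass devel versions 1.5.0 through 1.5.4, which is why the two branches are checked separately.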
@sean-davis-490
Last seen 3 months ago
United States
On Mon, Apr 25, 2011 at 6:30 PM, Amy Johnson wrote:
> I'm confused about LIMMA statistics (logFC, AveExpr, t, P.Value,
> adj.P.Val, and B). Which one do researchers typically use to select
> differentially expressed genes? Can I simply use adj.P.Val < 0.05 or
> P.Value < 0.05? Or should I use a combination of these statistics?

Hi, Amy.

Your best bet is to thoroughly read the limma User's Guide and the help pages for ALL the commands you used to generate your topTable results. It will also help to get some basics of statistics under your belt. There are no hard-and-fast rules about what should be used, but many folks use adj.P.Val (the P-value adjusted for multiple testing, which controls the false discovery rate) with or without a logFC cutoff.

Sean
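To make the adj.P.Val-plus-logFC selection concrete, here is a small base-R sketch. The data frame `tt` is made-up illustration data standing in for a topTable() result, not output from a real experiment, and the cutoffs (0.05 and an absolute logFC of 1) are arbitrary choices for the example. The adjustment uses `p.adjust(..., method = "BH")`, the Benjamini-Hochberg method that topTable uses by default.

```r
# Made-up values standing in for a topTable() result; NOT real data.
tt <- data.frame(
  gene    = c("g1", "g2", "g3", "g4", "g5"),
  logFC   = c(2.1, -1.2, 1.4, -1.8, 0.2),
  P.Value = c(0.0004, 0.045, 0.004, 0.011, 0.6),
  stringsAsFactors = FALSE
)

# Benjamini-Hochberg adjustment of the raw p-values (controls the FDR).
tt$adj.P.Val <- p.adjust(tt$P.Value, method = "BH")

# Keep genes passing both the FDR cutoff and an absolute log fold-change cutoff.
selected <- tt[tt$adj.P.Val < 0.05 & abs(tt$logFC) >= 1, ]
print(selected$gene)
```

In this toy table g2 has a raw P.Value below 0.05 but an adjusted value above it, so it is dropped. That gap between raw and adjusted p-values is the usual reason for filtering on adj.P.Val rather than P.Value when thousands of genes are tested at once.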