Hello community, I'm using DESeq and edgeR to conduct RNA-Seq data
analysis.
I want to get log2FC adjusted for (possible) lane effects. For
example, in edgeR I use:
design = model.matrix(~ lane + condition, data=tmp)
de = DGEList(counts, group=tmp$condition)
de = calcNormFactors(de)
de = estimateGLMCommonDisp(de, design)
de = estimateGLMTrendedDisp(de, design)
de = estimateGLMTagwiseDisp(de, design)
fit = glmFit(de, design)
lrt = glmLRT(fit, coef='conditiontreated')
while in DESeq2, I have:
rawData <- DESeqDataSetFromMatrix(counts, pd, ~ lane + condition)
dds <- DESeq(rawData, test='LRT', reduced= ~ lane)
>From the documentation, this seems the right way of getting the DE
genes when I want to account for the lane effect. Is this correct? Or
would be something like
DESeq(rawData)
sufficient? (and of course getting the results for the proper
coefficient) Does the same apply for edgeR?
-- output of sessionInfo():
R version 3.0.3 Patched (2014-03-06 r65200)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel splines stats graphics grDevices utils
datasets
[8] methods base
other attached packages:
[1] DESeq2_1.2.10 RcppArmadillo_0.4.200.0 Rcpp_0.11.1
[4] GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7
[7] BiocGenerics_0.8.0 edgeR_3.4.2 limma_3.18.13
loaded via a namespace (and not attached):
[1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0
[4] DBI_0.2-7 genefilter_1.44.0 grid_3.0.3
[7] lattice_0.20-29 locfit_1.5-9.1 RColorBrewer_1.0-5
[10] RSQLite_0.11.4 stats4_3.0.3 survival_2.37-7
[13] XML_3.95-0.2 xtable_1.7-3
--
Sent via the guest posting facility at bioconductor.org.
hi Rafael,
On Tue, Apr 15, 2014 at 9:50 AM, Rafael Moreira [guest]
<guest at="" bioconductor.org=""> wrote:
>
>
> while in DESeq2, I have:
> rawData <- DESeqDataSetFromMatrix(counts, pd, ~ lane + condition)
> dds <- DESeq(rawData, test='LRT', reduced= ~ lane)
>
> >From the documentation, this seems the right way of getting the DE
genes when I want to account for the lane effect. Is this correct? Or
would be something like
>
> DESeq(rawData)
>
> sufficient? (and of course getting the results for the proper
coefficient)
These are both correct for DESeq2. The first formulation performs the
likelihood ratio test (by default not shrinking LFCs and the second
performs the Wald test by default shrinking LFCs). See ?DESeq for more
information. Either way results() will give you the log2 fold change
for condition.
Mike
> Date: Tue, 15 Apr 2014 06:50:34 -0700 (PDT)
> From: "Rafael Moreira [guest]" <guest at="" bioconductor.org="">
> To: bioconductor at r-project.org, rafaelmoreira at gmail.com
> Subject: [BioC] Which log2FC to report?
>
>
> Hello community, I'm using DESeq and edgeR to conduct RNA-Seq data
analysis.
> I want to get log2FC adjusted for (possible) lane effects. For
example, in edgeR I use:
>
> design = model.matrix(~ lane + condition, data=tmp)
> de = DGEList(counts, group=tmp$condition)
> de = calcNormFactors(de)
> de = estimateGLMCommonDisp(de, design)
> de = estimateGLMTrendedDisp(de, design)
> de = estimateGLMTagwiseDisp(de, design)
> fit = glmFit(de, design)
> lrt = glmLRT(fit, coef='conditiontreated')
Yes, this is the right way to test for DE genes for condition adjusted
for
lane effects. It is also the right way to get the corresponding
log-fold-changes, although for that purpose the glmLRT() step is not
needed.
Note that edgeR gives shrunk logFC, and the amount of shrinkage is
controlled by the prior.count argument to glmFit(). The default is to
do
a minimal but useful amount of shrinkage, which is almost always
preferable to raw fold changes.
The edgeR User's guide includes two fully worked case studies that are
very closely analogous to your experiment. In the oral carcinoma case
study (Section 4.4), the patients are equivalent to the lane effects
in
your experiment. In the arabdopsis case study (Section 4.5), the
batch
effects are equivalent to the lane effects in your experiment.
> while in DESeq2, I have:
> rawData <- DESeqDataSetFromMatrix(counts, pd, ~ lane + condition)
> dds <- DESeq(rawData, test='LRT', reduced= ~ lane)
>
>> From the documentation, this seems the right way of getting the DE
genes when I want to account for the lane effect. Is this correct? Or
would be something like
>
> DESeq(rawData)
>
> sufficient? (and of course getting the results for the proper
coefficient)
> Does the same apply for edgeR?
Does what apply to edgeR?
Best wishes
Gordon
> -- output of sessionInfo():
>
> R version 3.0.3 Patched (2014-03-06 r65200)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] parallel splines stats graphics grDevices utils
datasets
> [8] methods base
>
> other attached packages:
> [1] DESeq2_1.2.10 RcppArmadillo_0.4.200.0 Rcpp_0.11.1
> [4] GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7
> [7] BiocGenerics_0.8.0 edgeR_3.4.2 limma_3.18.13
>
> loaded via a namespace (and not attached):
> [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0
> [4] DBI_0.2-7 genefilter_1.44.0 grid_3.0.3
> [7] lattice_0.20-29 locfit_1.5-9.1 RColorBrewer_1.0-5
> [10] RSQLite_0.11.4 stats4_3.0.3 survival_2.37-7
> [13] XML_3.95-0.2 xtable_1.7-3
______________________________________________________________________
The information in this email is confidential and
intend...{{dropped:4}}