Entering edit mode
Guest User
★
13k
@guest-user-4897
Last seen 10.6 years ago
I have implemented a linear model to fit RNAseq read counts for ~1000
samples using ~200 covariates plus the intercept. The pipeline follows
the example in the user manual, i.e. v=voom(y,design) ->
fit=lmFit(v,design) -> fit2=eBayes(fit). Then I am using decideTests()
to assess the effect of each covariate on gene expression. However,
the output is less complete than topTable(), so I used this function
to obtain values on fold-changes and individual p-values. At this
stage, values become difficult to interpret. If the LM is:
gene_i,j = intercept_i + alfa_i*Factor1_j + beta_i*Factor2_j + (~200
other factors)
and, to simplify, let's say covariates are binary. Then what do the
fold-changes for topTable(fit2,coef="alfa") mean?
My interpretation was the up/down-regulation due to the presence of
covariate alfa, but if I boxplot or summarize the expression values
for the two groups (alfa = 1 vs alfa = 0) I get quite contradictory
results. For example this is the most DE gene when the sample has a
mutation in KRAS (the covariate, 0 if wild-type, 1 if mutated):
> topTable(fit2,coef="metadataKRAS",number=1)
hgnc_symbol logFC AveExpr t P.Value adj.P.Val
B
GOLGA6A -1.314213 -5.9000494 -7.375175 3.800569e-13 7.204358e-09
18.80961
The summary of gene expression values for wild-type KRAS vs. mutated
is:
> by(E["GOLGA6A",],design[,"metadataKRAS"],summary)
design[, "metadataKRAS"]: 0
Min. 1st Qu. Median Mean 3rd Qu. Max.
-7.882 -6.865 -6.461 -6.021 -5.770 1.548
----------------------------------------------------------------
design[, "metadataKRAS"]: 1
Min. 1st Qu. Median Mean 3rd Qu. Max.
-7.657 -6.247 -5.487 -5.239 -4.582 -1.311
This is an up-regulation, and of magnitude way lower than the one
indicated by topTable. Is this because I am misusing the function
topTable? I cannot wrap my head around this. Thanks for your help!
-- output of sessionInfo():
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] sv_SE.UTF-8/sv_SE.UTF-8/sv_SE.UTF-8/C/sv_SE.UTF-8/sv_SE.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] edgeR_3.4.2 limma_3.18.9
loaded via a namespace (and not attached):
[1] tools_3.0.2
--
Sent via the guest posting facility at bioconductor.org.