Question: edgeR DGEList and design matrix
ctstackh wrote:

I'm new to edgeR and am trying to perform a differential expression analysis on my samples, but I'm having difficulty getting started. I have 8 pairs of samples, most with 3 replicates each (1 replicate from each of 2 groups had to be thrown out because it didn't pass QC).

So in general, each pair consists of 3 control and 3 treated biological replicates.

What I've tried to do is this:

library(edgeR)

group <- factor(c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4,5,5,5,5,5,6,6,6,6,6,6,7,7,7,7,7,8,8,8,8,8,8))
treat <- factor(c(1,1,1,2,2,2,1,1,1,2,2,2,1,1,1,2,2,2,1,1,1,2,2,2,1,1,1,2,2,1,1,1,2,2,2,1,1,2,2,2,1,1,1,2,2,2))
## Set up y and design
y <- DGEList(counts = x, group = group)
y <- calcNormFactors(y)
design <- model.matrix(~treat+group, data = y$samples)
y <- estimateDisp(y, design)

However, when I run estimateDisp I get the following error:

Error in lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth,  : 
  newsplit: out of vertex space

Am I setting up the DGEList object and design matrix correctly? If not, what should I do differently?

Thanks! -Christian

Answer: edgeR DGEList and design matrix
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

To use the edgeR pipelines, you first need to filter out non-expressed or very lowly expressed genes using filterByExpr(). See the User's Guide or one of the examples:

https://bioconductor.org/packages/release/workflows/vignettes/RnaSeqGeneEdgeRQL/inst/doc/edgeRQL.html

The error you report does not have anything to do with the DGEList or the design matrix. I have never myself seen this error triggered in 10 years of using edgeR and I've only heard of it reported once before:

https://support.bioconductor.org/p/84696/

The error comes from the locfit package, which is called by edgeR to fit a lowess curve, and I think the error will go away if you filter your data as recommended.

Alternatively, running estimateDisp with trend.method="loess" will also make the error go away at the cost of making estimateDisp very slightly slower.
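
A minimal sketch of the two options above, run on simulated counts. The simulated matrix, its dimensions, and the object names (x, y, y_loess) are assumptions for illustration and are not the poster's data; the edgeR calls are the ones named in this thread.

```r
library(edgeR)

set.seed(1)
# Hypothetical stand-in data: 8 groups x 2 treatments x 3 replicates = 48 samples
group <- factor(rep(1:8, each = 6))
treat <- factor(rep(rep(1:2, each = 3), times = 8))

# 500 near-silent genes plus 1500 expressed genes (assumed values)
mu <- c(rep(0.05, 500), 2^seq(4, 10, length.out = 1500))
x <- matrix(rnbinom(length(mu) * 48, mu = mu, size = 10), nrow = length(mu))
rownames(x) <- paste0("gene", seq_len(nrow(x)))

y <- DGEList(counts = x, group = group)
design <- model.matrix(~ treat + group)

# Option 1: drop non-expressed and very lowly expressed genes first
keep <- filterByExpr(y, group = treat)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)
y <- estimateDisp(y, design)

# Option 2: use the loess trend instead of the default locfit trend
y_loess <- estimateDisp(y, design, trend.method = "loess")
```

In practice, filtering is the recommended fix and trend.method="loess" is only a fallback; the two are combined here just to show both calls.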


keep <- filterByExpr(x)
x <- x[keep,]

solved the issue!

I'm still not sure what my design should be: treat+group, group+treat, group*treat, or group:treat? I want to compare treated and untreated within groups, but I also want to compare all treated vs. all untreated. Would I use et for these comparisons?

Thanks! Christian

Reply written 4 months ago by ctstackh

Your design matrix is correct as it is.

For the filtering, you should use

filterByExpr(x, group=treat)

or

filterByExpr(y, group=treat)
Reply written 4 months ago by Gordon Smyth
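
For the question about comparing all treated vs. all untreated samples, one possible continuation is sketched below. It is an assumption about how the analysis might proceed and not part of the replies above. With the additive design ~treat+group, the treat2 coefficient is the overall treated vs. untreated effect adjusted for group, and it can be tested with edgeR's quasi-likelihood functions glmQLFit() and glmQLFTest(). exactTest() is meant for single-factor comparisons, so it would not adjust for group. The counts are simulated and all object names are hypothetical.

```r
library(edgeR)

set.seed(2)
# Hypothetical stand-in data with the same layout as the question
group <- factor(rep(1:8, each = 6))
treat <- factor(rep(rep(1:2, each = 3), times = 8))
mu <- 2^seq(2, 10, length.out = 1000)   # assumed range of expression levels
x <- matrix(rnbinom(length(mu) * 48, mu = mu, size = 10), nrow = length(mu))
rownames(x) <- paste0("gene", seq_len(nrow(x)))

y <- DGEList(counts = x, group = group)
design <- model.matrix(~ treat + group)

# Filter using the treatment factor, as suggested above
keep <- filterByExpr(y, group = treat)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)
y <- estimateDisp(y, design)

# Column 2 of the design is "treat2": treated vs. untreated, adjusted for group
fit <- glmQLFit(y, design)
res <- glmQLFTest(fit, coef = 2)
topTags(res)
```

Because the data are simulated without any treatment effect, the top-ranked genes here are not expected to be meaningful; the sketch only shows the order of the calls.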
Answer: edgeR DGEList and design matrix
Yunshun Chen (Australia) wrote:

Is your x a count matrix?

If so, have you performed filtering? Lowly expressed genes (especially those with zero counts across all samples) can cause problems in dispersion estimation. You can perform filtering using the filterByExpr() function.
