Question: Error in lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, : newsplit: out of vertex space in paired edgeR comparison
3.2 years ago, manasishah8630 wrote:

I am getting a warning from estimateDisp: "In estimateDisp(x, Design) : No residual df: setting dispersion to NA"

This is followed by an error in glmFit(x, Design): Error in qr.coef(qr(design), matrix(beta.mean, nrow= nlibs, ncol = ngenes

'qr' and 'y' must have same number of genes

Phenotype <- c("biopsy_carcinoma", "biopsy_carcinoma", "biopsy_carcinoma", "biopsy_carcinoma", "fecal_carcinoma", "fecal_carcinoma", "fecal_carcinoma", "fecal_carcinoma", "fecal_carcinoma")


Pair_factor <- c("WFB5", "WFB4",  "WFB2",  "WFB3", "WFB2", "WFB4", "WFB5", "WFB3", "WFB1")

design = model.matrix(~Pair_factor + Phenotype)


colnames(design)

x = DGEList(counts = x, group = Phenotype, genes = gene_table, remove.zeros = TRUE)

where dim(x) = 894 10

dim(gene_table) = 894 7

then,

x = calcNormFactors(x, method = "RLE")

x = estimateDisp(x, design)  # this is where I get the warning

fit = glmFit(x, design)  # this is where I get the error

Any idea what might be going wrong? Any input would be much appreciated.

Thanks a tonne,

Manasi Shah, MS

PhD Epidemiology Candidate, UTSPH

modified 3.2 years ago by Gordon Smyth • written 3.2 years ago by manasishah8630

Further edit:

I cleared my environment and realized my code had been picking up some old objects. Now I get a new error in the estimateDisp(x, design) step:

Error in as.vector(x, mode) :
cannot coerce type 'closure' to vector of type 'any'

How did my x become of type closure?

Even further edit: I seem to have fixed the coerce error. I took the exact same code that worked for a similar study and used it for another subset.

I got a new error, and this seems to be the final one:

Error in lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth,  :
newsplit: out of vertex space

Any idea what causes this?

Thanks,

Manasi


You should quit R, and restart using

R --vanilla

to make sure you aren't loading up a bunch of cruft from whatever .RData file you have floating around in that directory. As an aside, I never save my R workspace - I might save one or two objects, but when I quit R, it's automatically set to not save anything. In my opinion there is no profit in having stuff get automatically loaded into an R session - it often takes a long time to load, and then there are all these objects that I don't remember creating that can wreak havoc on the analysis.

Once you have reloaded a clean R session, try running your code again.
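As a rough sketch of how one might check for leftover or mis-bound objects before rerunning, the helper below reports whether each name is defined in a given environment and whether it is bound to a function (a "closure") rather than data. The helper name describe_objects and the scratch environment are made up for illustration and are not part of edgeR. One general way to end up with a closure where data was expected is that a name missing from the workspace but matching a function on the search path (for example df) resolves to that function; whether that happened in this case is not known.

```r
# Report, for each name, whether it is defined in `envir` itself and
# whether it is bound to a function (a "closure") instead of data.
describe_objects <- function(names, envir = globalenv()) {
  defined <- vapply(names, exists, logical(1),
                    envir = envir, inherits = FALSE, USE.NAMES = FALSE)
  is_function <- vapply(names, function(n) {
    exists(n, envir = envir, inherits = FALSE) &&
      is.function(get(n, envir = envir, inherits = FALSE))
  }, logical(1), USE.NAMES = FALSE)
  data.frame(name = names, defined = defined, is_function = is_function)
}

# Hypothetical scratch environment standing in for a cluttered workspace.
scratch <- new.env()
assign("counts", matrix(1:4, nrow = 2), envir = scratch)
assign("design", function() NULL, envir = scratch)  # wrongly bound to a function

report <- describe_objects(c("counts", "design", "gene_table"), envir = scratch)
print(report)
```

In a real session the call would be describe_objects(c("x", "design", "gene_table")) against the global environment, right before the estimateDisp step.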

I have tried to keep my R session clean with more discipline (and developed more faith in the reproducibility of my code and results) since this post :) Thank you

Answer: No residual df, setting dispersion to NA, Error: 'qr' and ''y' must have same nu
3.2 years ago, James W. MacDonald (United States) wrote:

You get that error if your design matrix has as many columns as your count matrix, or more, so you appear to be passing in a matrix of counts that doesn't match your design matrix. But without seeing more of your code it's impossible to say more than that.
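To make the column-count point concrete, here is a minimal base R sketch (no edgeR needed) that computes the residual degrees of freedom as the number of samples minus the rank of the design matrix. The helper residual_df and the toy factors pair and group are assumptions for illustration, not the poster's data.

```r
# Residual degrees of freedom: number of samples (rows of the design)
# minus the rank of the design matrix.
residual_df <- function(design) {
  nrow(design) - qr(design)$rank
}

# Hypothetical paired layout: 3 subjects, each measured under 2 conditions.
pair  <- factor(c("P1", "P2", "P3", "P1", "P2", "P3"))
group <- factor(c("A", "A", "A", "B", "B", "B"))

additive  <- model.matrix(~ pair + group)  # 4 columns for 6 samples
saturated <- model.matrix(~ pair * group)  # 6 columns for 6 samples

residual_df(additive)   # 2: dispersion can be estimated
residual_df(saturated)  # 0: nothing is left over to estimate dispersion
```

When the result is zero, there is no replication left after fitting the model, which is the situation the "No residual df" warning describes.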

Hi James, I checked my dimensions and have edited the question. I did try posting the entire code multiple times earlier, but it kept giving me a formatting error when I tried to submit, possibly because I copied it directly from my R console.

My design matrix has 6 columns counting the intercept, whereas my count matrix has 10 columns for 5 pairs of samples.

Can you please have a look? I have been stuck on this for two days :/

Thanks,

Manasi

Well, R has lexical scoping, so it shouldn't be getting confused about all those 'x' objects you have floating around, but who knows? What happens if you use something more descriptive like 'dglst'?

dglst <- DGEList(counts = x, group = Phenotype, genes = gene_table, remove.zeros = TRUE)

dglst <- calcNormFactors(dglst)

dglst <- estimateDisp(dglst, design)

Also at the point before calling estimateDisp, what do you get for

dim(dglst)

Same error:

Error in lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth,  :
newsplit: out of vertex space

if I use descriptive names at each step: x_dge, x_norm, x

> dim(x_norm)
[1] 1168   10
> dim(design)
[1] 10  6
> dim(x_dge)
[1] 1168   10
> dim(gene_table)
[1] 1168    7

I edited the question title to reflect this, thanks
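For reference, the dimension checks above can be written as a small base R helper. This is only a sketch: check_inputs is a made-up name, and the objects below are empty placeholders with the same shapes as the reported dimensions, not the real data.

```r
# Check that samples and genes line up across the three inputs.
check_inputs <- function(counts, design, genes) {
  c(
    samples_match = ncol(counts) == nrow(design),  # one design row per sample
    genes_match   = nrow(counts) == nrow(genes)    # one annotation row per gene
  )
}

# Placeholders shaped like the reported dimensions (contents are dummy values).
counts <- matrix(0L, nrow = 1168, ncol = 10)
design <- matrix(0, nrow = 10, ncol = 6)
genes  <- data.frame(matrix(NA, nrow = 1168, ncol = 7))

check_inputs(counts, design, genes)
```

Both checks pass for these shapes, which is consistent with the remaining error coming from the data values and not from mismatched dimensions.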

Answer: Error in lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth,
3.2 years ago, Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

The lfproc error message is from the locfit package, which is called by estimateDisp(). I haven't seen this error message triggered before, which leads me to think there must be something very weird with your data.


Thank you for your response. I had some features with very low variance. I removed these and did not see this error anymore. Is there any variance threshold one should keep in mind when using edgeR / limma?
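A minimal base R sketch of the kind of filtering I mean is below. The count matrix is a made-up toy, and the cutoff min_var is only an illustrative assumption, not a recommended threshold.

```r
# Hypothetical count matrix: rows are features, columns are samples.
counts <- rbind(
  constant = c(5, 5, 5, 5),
  all_zero = c(0, 0, 0, 0),
  variable = c(1, 20, 3, 40)
)

# Per-feature variance across samples.
feature_var <- apply(counts, 1, var)

# Keep features whose variance exceeds the cutoff. A cutoff of 0 only
# drops constant rows; it is an illustrative choice, not a recommendation.
min_var <- 0
filtered <- counts[feature_var > min_var, , drop = FALSE]

rownames(filtered)
```

Looking at summary(feature_var) before picking any cutoff shows how many features would be removed.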