Question: Error in lfproc(x, y, weights = weights, cens = cens,... : newsplit: out of vertex space in dba.analyze(diff_contrast, method=DBA_DESEQ2)
16 months ago by
U of Michigan
zhaolin200420080 wrote:
Hi all, I am working on an ATAC-seq data set with two cell types and two conditions, and >= 3 replicates per group.

diff <- dba(sampleSheet="SampleSheet041217.csv")
diff_count <- dba.count(diff, minOverlap=2)

20 Samples, 649843 sites in matrix:

ID Tissue Condition Treatment Replicate Caller Intervals FRiP
1   4471_64502   CD4p      CLAp        0h         1 counts    649843 0.13
2   4471_64506   CD4p      CLAp        0h         2 counts    649843 0.16
..
20 99011_70076   CD8p      CLAp        0h         6 counts    649843 0.18
diff_contrast_CD4 <- dba.contrast(diff_count,categories=DBA_CONDITION,block=diff$masks$CD4p)
diff_analysis_CD4 <- dba.analyze(diff_contrast_CD4, method=DBA_DESEQ2)

converting counts to integer mode
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
converting counts to integer mode
DESeq2 multi-factor analysis
gene-wise dispersion estimates
mean-dispersion relationship
Error in lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth,  :
newsplit: out of vertex space
In addition: There were 18 warnings (use warnings() to see them)

I have read that the error is related to the maxk parameter in locfit (default 100), and that the suggested fix is to pass a larger value (> 500) for maxk when calling locfit. But I am not sure how to set this value from within DiffBind. Could you help me solve the problem? Thanks a lot.

P.S. I also tried method=DBA_ALL_METHODS and got the same problem.

16 months ago by
EMBL European Molecular Biology Laboratory
Wolfgang Huber wrote:

Can you do some exploratory data analysis (EDA) and visualisation to see what the mean-variance relationship of your data looks like?
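A minimal base-R sketch of such a mean-variance check follows. It runs on simulated negative binomial counts, which are an assumption standing in for the real read-count matrix (sites in rows, samples in columns); with real data the matrix would be taken from the counted DBA object instead.

```r
# Simulated counts (assumption): 2000 sites x 6 samples, negative binomial.
set.seed(42)
n_sites <- 2000
n_samples <- 6
mu <- rexp(n_sites, rate = 1 / 50) + 1          # per-site expected count
counts <- matrix(
  rnbinom(n_sites * n_samples, mu = rep(mu, n_samples), size = 5),
  nrow = n_sites
)

# Per-site mean and variance across samples.
site_mean <- rowMeans(counts)
site_var <- apply(counts, 1, var)

# Log-log scatter; sites with zero mean or variance cannot be shown on log axes.
keep <- site_mean > 0 & site_var > 0
plot(site_mean[keep], site_var[keep], log = "xy", pch = 16, cex = 0.4,
     xlab = "mean count per site", ylab = "variance per site")
abline(0, 1, col = "red")                        # reference line: variance = mean
```

Points lying well above the red line indicate overdispersion relative to Poisson. A cloud with no clear trend, or with several separate bands, would hint that the data do not follow the smooth dispersion-mean trend that the fit expects.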

@Wolfgang,

I will try to analyze and visualize the mean-variance relationship. Do you mean the mean-variance relationship is too large to fit in the vertex space?

Not too large, but too different from what DESeq2 expects.

You can use fitType="mean", which at least will not throw an error. But I would also examine plotDispEsts(dds).
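As a rough illustration of this suggestion in plain DESeq2 (not through DiffBind, which this thread does not show how to configure), the sketch below fits a simulated data set with fitType="mean" and then plots the dispersion estimates. The simulated DESeqDataSet is an assumption standing in for the real counts.

```r
library(DESeq2)

set.seed(1)
# Simulated example data (assumption): 1000 features, 6 samples, two conditions.
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# fitType = "mean" uses the mean of the gene-wise dispersion estimates
# instead of a fitted dispersion-mean trend, so no locfit fit is attempted.
dds <- DESeq(dds, fitType = "mean", quiet = TRUE)

# Gene-wise, fitted, and final dispersions against the mean of normalized counts.
plotDispEsts(dds)

res <- results(dds)
```

With fitType="mean" the fitted values form a flat line in the plot, so the gene-wise estimates are the part to inspect for an unusual shape.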

Thank you for the suggestion. The samples are the same cell type from different patients, which is why I assumed DiffBind would normalize them properly before the differential enrichment analysis.

Also, I have found a solution, but it applies to a DESeq2 R script. I am going to open a new post to ask how to debug the issue in a DiffBind R script.

estimateDispersionsFit(object, fitType = c("parametric", "local", "mean"), minDisp = 1e-08, quiet = FALSE)


16 months ago by
U of Michigan
zhaolin200420080 wrote:

I am updating my answer. I was not able to change fitType in DiffBind as suggested. Instead, I changed the datasets in the DBA project. Using plot(diff_count) and dba.plotPCA(diff_count, DBA_TREATMENT, label=DBA_ID), I could check the distribution of the datasets and the relationships between them. Instead of one big DBA project, I grouped the samples into several smaller DBA projects. I also used DBA_SCORE_RPKM for dba.count. I don't know which of these changes made the difference, but it works: dba.analyze ran DESeq2 to the end.