DiffBind - Counting reads - counts not found
ryclb22 • 0


I'm having an issue during the 'Counting reads' and 'Performing the differential analysis' steps. I am receiving these messages/errors:

## Performing the differential analysis # .... [TRUNCATED] 
Warning message:
In data(Condition_counts) : data set ‘Condition_counts’ not found


## Performing the differential analysis ##
> Condition <- dba.analyze(Skin)
converting counts to integer mode
gene-wise dispersion estimates
mean-dispersion relationship
Error in estimateDispersionsFit(object, fitType = fitType, quiet = quiet) : 
  all gene-wise dispersion estimates are within 2 orders of magnitude
  from the minimum value, and so the standard curve fitting techniques will not work.
  One can instead use the gene-wise estimates as final estimates:
  dds <- estimateDispersionsGeneEst(dds)
  dispersions(dds) <- mcols(dds)$dispGeneEst
  ...then continue with testing using nbinomWaldTest or nbinomLRT
In addition: Warning message:
In data(Condition_counts) : data set ‘Condition_counts’ not found

This is the code I run:

## Reading in the peaksets ##
samples <- read.csv(file.path(system.file("extra", package = "DiffBind"), "CellLine1_CellLine2.csv"))
#Create DBA
Condition <- dba(sampleSheet="CellLine1_CellLine2.csv", dir=system.file("extra", package="DiffBind"))

## Counting reads ##
Condition <- dba.count(Condition, summits=250)

## Establishing a contrast ##
Condition <- dba.contrast(Condition, categories=DBA_CONDITION, minMembers = 2)

## Performing the differential analysis ##
Condition <- dba.analyze(Condition)
plot(Condition, contrast=1)

I'd appreciate any help in understanding these messages/errors and fixing them so that I can perform my differential analysis.

diffbind differential analysis genetics #chip-seq
Rory Stark ★ 4.3k
CRUK, Cambridge, UK

It looks like you are following the vignette script very closely, but given that you are working with another dataset, some of the specific references do not apply.

In the first line you are reading in a samplesheet. In the vignette, the example samplesheet is included with the package installation, which is why the file.path(system.file("extra", package = "DiffBind"), ...) syntax is used. For your own data, you should not place the samplesheet CellLine1_CellLine2.csv in the package directory; put it in a data directory in your own filespace and read it from there.
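As a rough sketch of what that could look like: the helper name read_samplesheet and the "data" directory below are placeholders I have made up, not part of DiffBind. The idea is just to read the sheet from your own directory and fail early if it is not where you think it is, before handing it to dba().

```r
# Hypothetical helper (not a DiffBind function): read a samplesheet from
# your own project directory instead of the package installation directory.
read_samplesheet <- function(sheet_dir, sheet_file) {
  sheet_path <- file.path(sheet_dir, sheet_file)
  if (!file.exists(sheet_path)) {
    stop("Samplesheet not found: ", sheet_path)
  }
  read.csv(sheet_path, stringsAsFactors = FALSE)
}

# Usage (directory and file name are assumptions; adjust to your layout):
# samples   <- read_samplesheet("data", "CellLine1_CellLine2.csv")
# Condition <- DiffBind::dba(sampleSheet = samples)
```

dba() accepts either a file name or a data frame for sampleSheet, so passing the data frame you have just read should work. Any relative paths inside the sheet (bamReads, Peaks, etc.) then need to resolve from your working directory.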

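All the lines where you use the data() call are not meaningful, and they are the source of the "data set ‘Condition_counts’ not found" warning. In the vignette and examples, data() is used to load the sample data objects shipped with DiffBind. Renaming those objects to match your own data does not work, because no data set with that name is shipped with the package. You can simply remove those lines.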

The error you are getting from DESeq2 ("all gene-wise dispersion estimates are within 2 orders of magnitude from the minimum value...") occurs when the aligned sequencing reads are too similar to each other, i.e. there is not much variation between samples in the number of reads in peaks. Often this is because the same bam file is being used more than once.
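One quick way to check for this is to look for repeated entries in the read-file column of the samplesheet. This is only a sketch in base R: the helper name find_repeated_bams is made up, the toy sample IDs and file names are invented, and it assumes your sheet uses the usual bamReads column.

```r
# Hypothetical helper (not a DiffBind function): list read files that appear
# more than once in a samplesheet. Assumes a DiffBind-style `bamReads` column.
find_repeated_bams <- function(samples, column = "bamReads") {
  if (!column %in% names(samples)) {
    stop("Column not found in samplesheet: ", column)
  }
  files <- as.character(samples[[column]])
  unique(files[duplicated(files)])
}

# Toy samplesheet (made-up IDs and file names) with one repeated bam file
samples <- data.frame(
  SampleID = c("A1", "A2", "B1", "B2"),
  bamReads = c("reads/A1.bam", "reads/A1.bam", "reads/B1.bam", "reads/B2.bam"),
  stringsAsFactors = FALSE
)
find_repeated_bams(samples)
```

For the toy sheet this returns "reads/A1.bam". On your real sheet, an empty result means no file name is listed twice, although it would not catch the same reads copied under different file names.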

It would help to see your samplesheet.

