Question

DiffBind - Counting reads - counts not found

ryclb22 • 0

Hi,

I'm having an issue during the 'Counting reads' and 'Performing the differential analysis' step. I am receiving these messages/errors:

## Performing the differential analysis # .... [TRUNCATED] 
Warning message:
In data(Condition_counts) : data set ‘Condition_counts’ not found

OR

## Performing the differential analysis ##
> Condition <- dba.analyze(Skin)
converting counts to integer mode
gene-wise dispersion estimates
mean-dispersion relationship
Error in estimateDispersionsFit(object, fitType = fitType, quiet = quiet) : 
  all gene-wise dispersion estimates are within 2 orders of magnitude
  from the minimum value, and so the standard curve fitting techniques will not work.
  One can instead use the gene-wise estimates as final estimates:
  dds <- estimateDispersionsGeneEst(dds)
  dispersions(dds) <- mcols(dds)$dispGeneEst
  ...then continue with testing using nbinomWaldTest or nbinomLRT
In addition: Warning message:
In data(Condition_counts) : data set ‘Condition_counts’ not found

This is the code I run:

## Reading in the peaksets ##
samples <- read.csv(file.path(system.file("extra", package = "DiffBind"), "CellLine1_CellLine2.csv"))
names(samples)
#Create DBA
Condition <- dba(sampleSheet="CellLine1_CellLine2.csv", dir=system.file("extra", package="DiffBind"))
#plot(Condition)

## Counting reads ##
Condition <- dba.count(Condition, summits=250)
data(Condition_counts)
#plot(Condition)

## Establishing a contrast ##
Condition <- dba.contrast(Condition, categories=DBA_CONDITION, minMembers = 2)

## Performing the differential analysis ##
Condition <- dba.analyze(Condition)
plot(Condition, contrast=1)

I'd appreciate any help in understanding these messages/errors and fixing them so that I can perform my differential analysis.

diffbind differential analysis genetics #chip-seq • 1.2k views

updated 5.1 years ago by Rory Stark ★ 5.2k • written 5.2 years ago by ryclb22 • 0

score 0 · Answer 1 · 2019-03-13

It looks like you are following the vignette script very closely, but given that you are working with another dataset, some of the specific references do not apply.

In the first line you are reading in a samplesheet. In the vignette, the example samplesheet is included with the package installation, which is why the file.path(system.file("extra", package = "DiffBind"), ...) syntax is used. For your own data, you should not place the samplesheet CellLine1_CellLine2.csv in the package directory; put it in a data directory in your own filespace and read it from there.
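As a minimal sketch of reading the samplesheet from your own filespace: the directory name below is a hypothetical placeholder, not something taken from your setup, and the dba() call is left as a comment because it needs DiffBind loaded and your real files in place.

```r
# Hypothetical project directory; replace with wherever your samplesheet lives.
data_dir <- file.path("~", "chipseq_project")
sheet <- file.path(data_dir, "CellLine1_CellLine2.csv")

if (file.exists(sheet)) {
  # Inspect the samplesheet columns before handing it to DiffBind.
  samples <- read.csv(sheet)
  print(names(samples))
  # With DiffBind loaded, the DBA object can then be built from the same file:
  # Condition <- dba(sampleSheet = sheet)
} else {
  message("Samplesheet not found at: ", sheet)
}
```

Keep in mind that any relative bam or peak file paths inside the samplesheet also need to resolve from where you run the analysis.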

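None of the lines where you use the data() call are meaningful here. In the vignette and examples, these calls load the sample data shipped with DiffBind. It does not make sense to change the names of the data objects to your own, because no dataset with that name ships with the package; this is what produces the "data set 'Condition_counts' not found" warning. You can simply remove those lines.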
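To see where the warning comes from, here is a small base R sketch that asks data() for a name that no package provides and captures the resulting warning. The commented lines at the end only illustrate, under the assumption that your samplesheet loads correctly, how the counting step reads without the data() call.

```r
# data() only looks for datasets bundled with installed packages (or a local
# "data" directory), so a name you made up yourself is not found.
msg <- tryCatch(
  data(list = "Condition_counts"),
  warning = function(w) conditionMessage(w)
)
print(msg)

# The counts from dba.count() are already stored in the object it returns,
# so the workflow can simply omit the data() line:
# Condition <- dba.count(Condition, summits = 250)
# Condition <- dba.contrast(Condition, categories = DBA_CONDITION, minMembers = 2)
```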

The error you are getting from DESeq2 ("all gene-wise dispersion estimates are within 2 orders of magnitude from the minimum value...") occurs when the read counts are too similar across samples (not much variation in the number of reads in peaks between samples). Often this is because the same bam file is being used more than once.
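One quick way to check for a repeated bam file is to look for duplicated entries in the bamReads column of the samplesheet. The sketch below uses a toy data frame with made-up sample IDs and file names purely for illustration; in practice you would read your own samplesheet with read.csv() instead.

```r
# Toy samplesheet with made-up IDs and file names, for illustration only.
samples <- data.frame(
  SampleID  = c("A1", "A2", "B1", "B2"),
  Condition = c("CellLine1", "CellLine1", "CellLine2", "CellLine2"),
  bamReads  = c("reads/A1.bam", "reads/A1.bam", "reads/B1.bam", "reads/B2.bam"),
  stringsAsFactors = FALSE
)

# Flag every row whose bam file appears more than once in the sheet.
dup <- duplicated(samples$bamReads) | duplicated(samples$bamReads, fromLast = TRUE)
dup_rows <- samples[dup, c("SampleID", "bamReads")]
print(dup_rows)
```

In this toy example, samples A1 and A2 are flagged because they point at the same file.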

It would help to see your samplesheet.