Question

Error "if (is.na(peaks)) { : argument is of length zero" in DiffBind

maria.kondili ▴ 10

@mariakondili-15560

Last seen 6.0 years ago

Hello,

I'd like to run a DiffBind analysis on my peak sets and alignment files, with 2 replicates, but I keep getting this error:

`Error in if (is.na(peaks)) { : argument is of length zero`

It comes from this line of code:

`IRF5data <- dba(sampleSheet="IRF5_Samples_Descr.csv") `

>WT_0    Liver    IRF5    WT    CCL4    1   bam/WT_0m_R1.bam    bam/KO_0m_R1.bam    KO_0   BroadPeaks/WT_0m_R1.bed    bed     NA raw
Error in if (is.na(peaks)) { : argument is of length zero

My peaks (in .bed format) come from MACS2, so the files are tab-delimited, with the score in the 5th column.

I verified that the file paths are correct and that the header strings are the ones dba requires, so I don't know what else could be causing the problem.

Any suggestion would be really helpful.

I am working with R version 3.4.2 (2017-09-28) -- "Short Summer" and DiffBind v2.6.6 on Ubuntu 16.04.

Here is my csv file:

| SampleID | Tissue | Factor | Condition | Treatment | Replicate | bamReads | bamControl | ControlID | Peaks | PeakCaller |
|---|---|---|---|---|---|---|---|---|---|---|
| WT_0 | Liver | IRF5 | WT | CCL4 | 1 | bam/WT_0m_R1.bam | bam/KO_0m_R1.bam | KO_0 | BroadPeaks/WT_0m_R1.bed | bed |
| WT_0 | Liver | IRF5 | WT | CCL4 | 2 | bam/WT_0m_R2.bam | bam/KO_0m_R2.bam | KO_0 | BroadPeaks/WT_0m_R2.bed | bed |
| WT_120 | Liver | IRF5 | WT | CCL4 | 1 | bam/WT_120m_R1.bam | bam/KO_120m_R1.bam | KO_120 | BroadPeaks/WT_120m_R1.bed | bed |
| WT_120 | Liver | IRF5 | WT | CCL4 | 2 | bam/WT_120m_R2.bam | bam/KO_120m_R2.bam | KO_120 | BroadPeaks/WT_120m_R2.bed | bed |

diffbind chipseq peaks dba is.na • 4.1k views

maria.kondili ▴ 10

Yes, thanks Mr Stark!

The separator in the csv file was indeed the problem: I had written the file with sep="\t", so it was actually tab-delimited. I replaced the tabs with commas and the samples are now read correctly.

From a user's perspective, since we're used to tab-delimited files, could you make dba accept those too?

Best

mk
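For anyone hitting the same thing, the tab-to-comma conversion described above can be sketched in base R roughly as follows. This is only an illustration: the temporary file names and the shortened set of columns are made up for the example, and the commented-out `dba()` calls assume DiffBind is installed and that `sampleSheet` accepts either a .csv file name or a data frame, as I understand its documentation.

```r
# Stand-in for a tab-delimited sample sheet (file name and rows are illustrative).
tab_sheet <- tempfile(fileext = ".tsv")
writeLines(c(
  "SampleID\tTissue\tFactor\tReplicate\tPeaks\tPeakCaller",
  "WT_0\tLiver\tIRF5\t1\tBroadPeaks/WT_0m_R1.bed\tbed",
  "WT_0\tLiver\tIRF5\t2\tBroadPeaks/WT_0m_R2.bed\tbed"
), tab_sheet)

# Read the sheet with the separator it was actually written with.
samples <- read.delim(tab_sheet, stringsAsFactors = FALSE)

# Write it back out as a genuine comma-separated file.
csv_sheet <- sub("\\.tsv$", ".csv", tab_sheet)
write.csv(samples, csv_sheet, row.names = FALSE)

# Round-trip check: the csv should have the same shape as the original table.
reread <- read.csv(csv_sheet, stringsAsFactors = FALSE)
identical(dim(reread), dim(samples))

# With DiffBind installed, either form could then be passed on (not run here):
# IRF5data <- dba(sampleSheet = csv_sheet)
# IRF5data <- dba(sampleSheet = samples)
```

Reading with `read.delim()` and handing the resulting data frame on directly would avoid rewriting the file at all, if the data-frame route works in your DiffBind version.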

I'll log that suggestion as a feature request; it may appear at some point...

6.0 years ago Rory Stark ★ 5.2k

score 2 · Accepted Answer · 2018-04-16

My guess is that your sample sheet is tab-separated, not comma-separated.

The output message is surprising, as it prints out not only the SampleID, Tissue, Factor, Condition, Replicate, and PeakCaller, but also the paths for bamReads, bamControl, and Peaks. There isn't anyplace in DiffBind where that happens! (I just did some greps of the source code to confirm that.)

Something is causing DiffBind to pick up the values for multiple columns as a single value. As a result, by the time it looks to read the PeakCaller, there is nothing there, so it defaults to raw. The error you are seeing is consistent with this, as raw looks in the fourth column for the score; in MACS2 broad peaks format, the fourth column is a name string which cannot be coerced into a numerical value to treat as a score, which causes that error. (Note to self: I should catch this condition and print out a more informative error message.)
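The "multiple columns as a single value" effect is easy to reproduce in plain R, without DiffBind. The sketch below writes a small tab-delimited sheet (values borrowed from the question, columns shortened for brevity), parses it as if it were comma-separated, and shows how a missing column leads to the same error text. Whether DiffBind internally reaches the message through exactly this path is an assumption; the snippet only illustrates one way base R produces it.

```r
# Write a small tab-delimited sample sheet (values taken from the question).
sheet <- tempfile(fileext = ".csv")
writeLines(c(
  "SampleID\tTissue\tFactor\tPeaks\tPeakCaller",
  "WT_0\tLiver\tIRF5\tBroadPeaks/WT_0m_R1.bed\tbed"
), sheet)

# Parsed as comma-separated, each whole line becomes a single field.
as_csv <- read.csv(sheet, stringsAsFactors = FALSE)
ncol(as_csv)           # one column instead of five
is.null(as_csv$Peaks)  # there is no separate Peaks column

# is.na(NULL) is logical(0), and if() rejects a zero-length condition.
msg <- tryCatch(
  if (is.na(as_csv$Peaks)) "no peaks" else "has peaks",
  error = function(e) conditionMessage(e)
)
msg

# Parsed with the separator it was written with, the columns come back.
as_tsv <- read.delim(sheet, stringsAsFactors = FALSE)
ncol(as_tsv)
```

In an English locale, `msg` holds the text "argument is of length zero", matching the error reported in the question.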

Have a close look at the sample sheet. This should be a .csv file, which means it should be comma-separated, not tab-separated. Check to see if there really is exactly one comma between each column. If it is a well-formed .csv file, you can send it to me (IRF5_Samples_Descr.csv) and I'll have a look at what is going on internally.
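A quick way to run that check from R, as a rough sketch: `guess_delimiter()` below is a hypothetical helper written for this post (it is not part of DiffBind), and the two temporary files stand in for a real sample sheet. It counts commas and tabs in the header line, and `count.fields()` from base R then reports how many comma-separated fields each line has.

```r
# Hypothetical helper (not a DiffBind function): report which separator
# dominates the header line of a sample sheet.
guess_delimiter <- function(path) {
  header <- readLines(path, n = 1)
  count <- function(ch) {
    lengths(regmatches(header, gregexpr(ch, header, fixed = TRUE)))
  }
  n_comma <- count(",")
  n_tab <- count("\t")
  if (n_comma == 0 && n_tab == 0) return("unknown")
  if (n_comma >= n_tab) "comma" else "tab"
}

# Two stand-in sheets: one tab-delimited despite its .csv name, one a real csv.
tab_sheet <- tempfile(fileext = ".csv")
writeLines("SampleID\tTissue\tFactor\tCondition", tab_sheet)
csv_sheet <- tempfile(fileext = ".csv")
writeLines("SampleID,Tissue,Factor,Condition", csv_sheet)

guess_delimiter(tab_sheet)
guess_delimiter(csv_sheet)

# Fields per line when split on commas; a well-formed csv gives the same
# count on every line, equal to the number of columns.
fields_tab <- count.fields(tab_sheet, sep = ",")
fields_csv <- count.fields(csv_sheet, sep = ",")
```

A sheet that reports "tab", or a single field per line under `sep = ","`, is the situation described above, whatever the file extension says.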

-R