Question: Error "if (is.na(peaks)) { : argument is of length zero " in DiffBind
7 months ago, maria.kondili wrote:

Hello,

I'd like to run a DiffBind analysis on my peaksets and alignment files with 2 replicates, but I keep getting this frustrating error:

`Error in if (is.na(peaks)) { : argument is of length zero` 

coming from the code line :

`IRF5data <- dba(sampleSheet="IRF5_Samples_Descr.csv") `

>WT_0    Liver    IRF5    WT    CCL4    1   bam/WT_0m_R1.bam    bam/KO_0m_R1.bam    KO_0   BroadPeaks/WT_0m_R1.bed    bed     NA raw
Error in if (is.na(peaks)) { : argument is of length zero

My peaks (in .bed format) come from MACS2, so they are tab-delimited with the score in the 5th column.

I verified that the file directories are correct and that the header strings are as required by dba, so I don't know what else could be the source of the problem.

Any suggestion would be really helpful.

I am working on R version 3.4.2 (2017-09-28) -- "Short Summer", with DiffBind v2.6.6  in Ubuntu 16.04.

Here is my csv file :

SampleID  Tissue  Factor  Condition  Treatment  Replicate  bamReads            bamControl          ControlID  Peaks                      PeakCaller
WT_0      Liver   IRF5    WT         CCL4       1          bam/WT_0m_R1.bam    bam/KO_0m_R1.bam    KO_0       BroadPeaks/WT_0m_R1.bed    bed
WT_0      Liver   IRF5    WT         CCL4       2          bam/WT_0m_R2.bam    bam/KO_0m_R2.bam    KO_0       BroadPeaks/WT_0m_R2.bed    bed
WT_120    Liver   IRF5    WT         CCL4       1          bam/WT_120m_R1.bam  bam/KO_120m_R1.bam  KO_120     BroadPeaks/WT_120m_R1.bed  bed
WT_120    Liver   IRF5    WT         CCL4       2          bam/WT_120m_R2.bam  bam/KO_120m_R2.bam  KO_120     BroadPeaks/WT_120m_R2.bed  bed

7 months ago, Rory Stark (CRUK, Cambridge, UK) wrote:

My guess is that your sample sheet is tab-separated, not comma-separated.

The output message is surprising, as it prints out not only the SampleID, Tissue, Factor, Condition, Replicate, and PeakCaller, but also the paths for bamReads, bamControl, and Peaks. There isn't anyplace in DiffBind where that happens! (I just did some greps of the source code to confirm that.)

Something is causing DiffBind to pick up the values for multiple columns as a single value. As a result, by the time it looks to read the PeakCaller, there is nothing there, so it defaults to raw. The error you are seeing is consistent with this, as raw looks in the fourth column for the score; in MACS2 broad peaks format, the fourth column is a name string which cannot be coerced into a numerical value to treat as a score, which causes that error. (Note to self: I should catch this condition and print out a more informative error message.)
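
For reference, the message itself is R's generic complaint when an `if` condition has length zero. The sketch below reproduces it in base R with a hypothetical `peaks` variable set to `NULL`. It only illustrates the mechanism and is not DiffBind's actual code path.

```r
# Hypothetical stand-in for a value that was never filled in.
peaks <- NULL

# is.na(NULL) returns logical(0), and if() cannot branch on a
# zero-length condition, so R raises an error instead.
msg <- tryCatch(
  {
    if (is.na(peaks)) "missing" else "present"
  },
  error = function(e) conditionMessage(e)
)

print(msg)
```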

Have a close look at the sample sheet. This should be a .csv file, which means it should be comma-separated, not tab-separated. Check to see if there really is exactly one comma between each column. If it is a well-formed .csv file, you can send it to me (IRF5_Samples_Descr.csv) and I'll have a look at what is going on internally.
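
One quick way to run that check is to count the separators in the header line. This is a minimal base R sketch; `guess_delimiter` is a hypothetical helper written for illustration, not part of DiffBind, and the demo file is a temporary stand-in for the real sample sheet.

```r
# Guess the delimiter of a sample sheet from its header line.
# guess_delimiter is a hypothetical helper, not a DiffBind function.
guess_delimiter <- function(path) {
  header <- readLines(path, n = 1, warn = FALSE)
  if (length(header) == 0) return("unknown")  # empty file

  # Count literal occurrences of a character in the header.
  count_char <- function(ch) {
    lengths(regmatches(header, gregexpr(ch, header, fixed = TRUE)))
  }
  n_comma <- count_char(",")
  n_tab <- count_char("\t")

  if (n_comma == 0 && n_tab == 0) return("unknown")
  if (n_comma >= n_tab) "comma" else "tab"
}

# Demo on a temporary file that has a .csv name but tab-separated content.
demo <- tempfile(fileext = ".csv")
writeLines("SampleID\tTissue\tFactor\tCondition\tReplicate", demo)
result <- guess_delimiter(demo)
print(result)
```

To check the real sheet, call `guess_delimiter("IRF5_Samples_Descr.csv")` from the directory that contains it.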

-R

7 months ago, maria.kondili wrote:

Yes, thanks Mr Stark!

It seems that the separator in the csv file was the problem. I had set sep="\t" and created a tab-delimited file. I turned the tabs into commas and it now reads the samples nicely.
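
For anyone in the same situation, the conversion can be sketched in base R as below. The file names are temporary placeholders and the two-line sheet is a shortened illustration, not the full sample sheet from this thread.

```r
# Placeholder files for illustration; substitute the real sample sheet paths.
tab_sheet <- tempfile(fileext = ".tsv")
csv_sheet <- tempfile(fileext = ".csv")

# A shortened, tab-delimited sheet standing in for the real one.
writeLines(c(
  "SampleID\tTissue\tReplicate\tPeaks",
  "WT_0\tLiver\t1\tBroadPeaks/WT_0m_R1.bed"
), tab_sheet)

# Read the tab-delimited sheet and write it back out comma-separated.
# quote = FALSE is only safe when no field itself contains a comma.
samples <- read.delim(tab_sheet, stringsAsFactors = FALSE)
write.csv(samples, csv_sheet, row.names = FALSE, quote = FALSE)

converted <- readLines(csv_sheet)
print(converted)

# Assumption to verify against the documentation of the installed DiffBind
# version: dba() is documented as also accepting a data frame as sampleSheet,
# in which case the intermediate .csv file would not be needed:
# IRF5data <- dba(sampleSheet = samples)
```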

From a user's perspective, since we're used to tab-delimited files, could you make dba accept those too?

Best

mk

 


I'll log that suggestion as a feature request; it may appear at some point...

(Reply written 7 months ago by Rory Stark)