Question: Error "if (is.na(peaks)) { : argument is of length zero " in DiffBind
7 months ago by
maria.kondili0 wrote:

Hello,

I'd like to run a DiffBind analysis on my peakset and alignment files with 2 replicates, but I keep getting this frustrating error:

Error in if (is.na(peaks)) { : argument is of length zero

It comes from this line of code:

IRF5data <- dba(sampleSheet="IRF5_Samples_Descr.csv") 

>WT_0    Liver    IRF5    WT    CCL4    1   bam/WT_0m_R1.bam    bam/KO_0m_R1.bam    KO_0   BroadPeaks/WT_0m_R1.bed    bed     NA raw
Error in if (is.na(peaks)) { : argument is of length zero

My peaks (in .bed format) come from MACS2, so they are tab-delimited with the score in the 5th column.

I verified that the file paths are correct and that the header strings are as required by dba, so I don't know what else could be the source of the problem.

Any suggestion would be really helpful.

I am working with R version 3.4.2 (2017-09-28) -- "Short Summer", with DiffBind v2.6.6, on Ubuntu 16.04.

Here is my csv file :

SampleID  Tissue  Factor  Condition  Treatment  Replicate  bamControl  ControlID  Peaks  PeakCaller
WT_0  Liver  IRF5  WT  CCL4  1  bam/WT_0m_R1.bam bam/KO_0m_R1.bam  KO_0  bed
WT_0  Liver  IRF5  WT  CCL4  2  bam/WT_0m_R2.bam bam/KO_0m_R2.bam  KO_0  bed
WT_120  Liver  IRF5  WT  CCL4  1  bam/WT_120m_R1.bam bam/KO_120m_R1.bam  KO_120  bed
WT_120  Liver  IRF5  WT  CCL4  2  bam/WT_120m_R2.bam bam/KO_120m_R2.bam  KO_120  bed
7 months ago by
Rory Stark (CRUK, Cambridge, UK) wrote:

My guess is that your sample sheet is tab-separated, not comma-separated.

The output message is surprising, as it prints out not only the SampleID, Tissue, Factor, Condition, Replicate, and PeakCaller, but also the paths for bamReads, bamControl, and Peaks. There isn't anyplace in DiffBind where that happens! (I just did some greps of the source code to confirm that.)

Something is causing DiffBind to pick up the values for multiple columns as a single value. As a result, by the time it looks to read the PeakCaller, there is nothing there, so it defaults to raw. The error you are seeing is consistent with this, as raw looks in the fourth column for the score; in MACS2 broad peaks format, the fourth column is a name string which cannot be coerced into a numerical value to treat as a score, which causes that error. (Note to self: I should catch this condition and print out a more informative error message.)
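
For anyone puzzled by the wording of the message itself: in base R, "argument is of length zero" is what if() reports when its condition has no elements. The sketch below reproduces that with a stand-in variable named peaks set to NULL. That value is an assumption for illustration only; the thread does not show what DiffBind holds internally at that point.

```r
# Hypothetical stand-in for an internal value that was never filled in.
peaks <- NULL

# is.na() on a zero-length object returns logical(0), not TRUE or FALSE.
na_check <- is.na(peaks)
length(na_check)

# Using that zero-length result as an if() condition raises the error,
# which is captured here as a string instead of stopping the script.
msg <- tryCatch(
  {
    if (is.na(peaks)) "missing" else "present"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```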

Have a close look at the sample sheet. This should be a .csv file, which means it should be comma-separated, not tab separated. Check to see if there really is exactly one comma between each column. If it is a well-formed .csv file, you can send it to me (IRF5_Samples_Descr.csv) and I'll have a look at what is going on internally.
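
One quick way to do that check from R is to count commas and tabs on every line of the sheet. This is only a sketch in base R: the temporary file and its three columns are made up so the snippet runs on its own, and count_char is a helper name invented here. Point sheet_path at your real sample sheet instead.

```r
# Build a small stand-in sample sheet that is tab-separated (hypothetical content).
sheet_path <- tempfile(fileext = ".csv")
writeLines(c("SampleID\tTissue\tFactor", "WT_0\tLiver\tIRF5"), sheet_path)

# Helper (hypothetical name): number of occurrences of a literal character per line.
count_char <- function(lines, ch) {
  lengths(regmatches(lines, gregexpr(ch, lines, fixed = TRUE)))
}

lines <- readLines(sheet_path)
n_commas <- count_char(lines, ",")
n_tabs <- count_char(lines, "\t")

# A well-formed .csv has the same non-zero number of commas on every line.
delimiter <- if (all(n_commas > 0) && length(unique(n_commas)) == 1) {
  "comma"
} else if (all(n_tabs > 0) && length(unique(n_tabs)) == 1) {
  "tab"
} else {
  "unknown"
}
print(data.frame(line = seq_along(lines), commas = n_commas, tabs = n_tabs))
print(delimiter)
```

If the comma count is zero on every line while the tab count is constant, the file is tab-separated despite its .csv extension.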

-R

7 months ago by
maria.kondili0 wrote:

Yes, thanks Mr Stark!

It seems that the separator in the csv file was the problem. I had set sep="\t" when writing it, which created a tab-delimited file. I turned the tabs into commas and it now reads the samples nicely.
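
For reference, the conversion can be done in base R rather than by hand. The snippet below is a minimal sketch: the temporary files and the reduced set of columns are placeholders so that it runs on its own, not the real sample sheet.

```r
# Stand-in for a tab-separated sample sheet (hypothetical, reduced columns).
tab_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "SampleID\tTissue\tFactor\tReplicate\tbamReads\tPeakCaller",
  "WT_0\tLiver\tIRF5\t1\tbam/WT_0m_R1.bam\tbed"
), tab_path)

# read.delim() reads tab-separated text with a header row.
samples <- read.delim(tab_path, stringsAsFactors = FALSE)

# Write it back out comma-separated, without row names or quotes.
csv_path <- tempfile(fileext = ".csv")
write.csv(samples, csv_path, row.names = FALSE, quote = FALSE)
csv_lines <- readLines(csv_path)
print(csv_lines)

# Assumption to verify against your DiffBind version: the dba() help page
# describes sampleSheet as either a file name or a data frame, in which case
# the data frame read above could be passed directly:
# IRF5data <- dba(sampleSheet = samples)
```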

From a user's perspective, since we're used to tab-delimited files, could you make dba accept those too?

Best

mk