Error "if (is.na(peaks)) { : argument is of length zero " in DiffBind
@mariakondili-15560

Hello,

I'd like to run a DiffBind analysis on my peaksets and alignment files, with 2 replicates per condition, but I keep getting the following error:

`Error in if (is.na(peaks)) { : argument is of length zero` 

coming from the code line :

`IRF5data <- dba(sampleSheet="IRF5_Samples_Descr.csv") `

>WT_0    Liver    IRF5    WT    CCL4    1   bam/WT_0m_R1.bam    bam/KO_0m_R1.bam    KO_0   BroadPeaks/WT_0m_R1.bed    bed     NA raw
Error in if (is.na(peaks)) { : argument is of length zero

My peaks (in .bed format) come from MACS2, so they are tab-delimited, with the score in the 5th column.

I verified that the file paths are correct and that the header strings are as required by dba, so I don't know what else could be the source of the problem.

Any suggestion would be really helpful.

I am working on R version 3.4.2 (2017-09-28) -- "Short Summer", with DiffBind v2.6.6  in Ubuntu 16.04.

Here is my csv file :

| SampleID | Tissue | Factor | Condition | Treatment | Replicate | bamReads | bamControl | ControlID | Peaks | PeakCaller |
|---|---|---|---|---|---|---|---|---|---|---|
| WT_0 | Liver | IRF5 | WT | CCL4 | 1 | bam/WT_0m_R1.bam | bam/KO_0m_R1.bam | KO_0 | BroadPeaks/WT_0m_R1.bed | bed |
| WT_0 | Liver | IRF5 | WT | CCL4 | 2 | bam/WT_0m_R2.bam | bam/KO_0m_R2.bam | KO_0 | BroadPeaks/WT_0m_R2.bed | bed |
| WT_120 | Liver | IRF5 | WT | CCL4 | 1 | bam/WT_120m_R1.bam | bam/KO_120m_R1.bam | KO_120 | BroadPeaks/WT_120m_R1.bed | bed |
| WT_120 | Liver | IRF5 | WT | CCL4 | 2 | bam/WT_120m_R2.bam | bam/KO_120m_R2.bam | KO_120 | BroadPeaks/WT_120m_R2.bed | bed |
Rory Stark ★ 4.1k
@rory-stark-5741
CRUK, Cambridge, UK

My guess is that your sample sheet is tab-separated, not comma-separated.

The output message is surprising, as it prints out not only the SampleID, Tissue, Factor, Condition, Replicate, and PeakCaller, but also the paths for bamReads, bamControl, and Peaks. There isn't anywhere in DiffBind where that happens! (I just did some greps of the source code to confirm that.)

Something is causing DiffBind to pick up the values for multiple columns as a single value. As a result, by the time it looks to read the PeakCaller, there is nothing there, so it defaults to raw. The error you are seeing is consistent with this, as raw looks in the fourth column for the score; in the MACS2 broad peaks format, the fourth column is a name string, which cannot be coerced into a numerical value to treat as a score, and that causes the error. (Note to self: I should catch this condition and print out a more informative error message.)
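To illustrate how several columns can end up as a single value, here is a minimal base R sketch. It does not use DiffBind and is not a trace of DiffBind's internals: the four-column `tab_text` sheet is a made-up stand-in, and the final step only shows the generic R behaviour behind the message "argument is of length zero".

```r
# Hypothetical sample sheet written with tabs instead of commas.
tab_text <- paste(
  "SampleID\tTissue\tPeaks\tPeakCaller",
  "WT_0\tLiver\tBroadPeaks/WT_0m_R1.bed\tbed",
  sep = "\n"
)

# read.csv() splits on commas; there are none, so each row stays one field.
sheet <- read.csv(text = tab_text, stringsAsFactors = FALSE)
ncol(sheet)    # a single column instead of four
names(sheet)   # one merged header name

# No column is named "Peaks", so the lookup returns NULL.
peaks <- sheet$Peaks
is.null(peaks)

# is.na(NULL) has length zero, and `if` rejects a zero-length condition.
msg <- tryCatch(
  if (is.na(peaks)) "missing" else "present",
  error = function(e) conditionMessage(e)
)
msg
```

Any `if (is.na(x))` test fails this way when `x` is `NULL` or otherwise empty, so the message mainly signals that an expected value was never found.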

Have a close look at the sample sheet. This should be a .csv file, which means it should be comma-separated, not tab-separated. Check that there really is exactly one comma between each pair of adjacent columns. If it is a well-formed .csv file, you can send it to me (IRF5_Samples_Descr.csv) and I'll have a look at what is going on internally.
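One quick way to do that check from R is to count the delimiters on the header line. This is a minimal base R sketch; `count_delims` is a hypothetical helper, not a DiffBind function, and the header string here is typed in by hand rather than read from the real file.

```r
# Hypothetical helper: count commas and tabs in one line of text.
count_delims <- function(header_line) {
  chars <- strsplit(header_line, "")[[1]]
  c(comma = sum(chars == ","), tab = sum(chars == "\t"))
}

# In practice: header <- readLines("IRF5_Samples_Descr.csv", n = 1)
header <- paste(
  c("SampleID", "Tissue", "Factor", "Condition", "Treatment", "Replicate",
    "bamReads", "bamControl", "ControlID", "Peaks", "PeakCaller"),
  collapse = "\t"
)

counts <- count_delims(header)
counts
```

For an 11-column comma-separated sheet, the header should show 10 commas and no tabs; the tab-joined header above gives the opposite.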

-R

@mariakondili-15560

Yes, thanks Mr Stark!

It seems that the missing commas in the csv file were the problem. I had set sep="\t" when writing it, so I had actually created a tab-delimited file. I turned the tabs into commas and the samples are now read correctly.
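For anyone in the same situation, the conversion can also be done in base R. This is only a sketch: the temporary files and the single sample row stand in for the real sheet.

```r
# Temporary files keep the sketch self-contained; use real paths in practice.
tsv_path <- tempfile(fileext = ".tsv")
csv_path <- tempfile(fileext = ".csv")

# Write a small tab-separated sheet (header plus one sample).
writeLines(c(
  paste(c("SampleID", "Tissue", "Factor", "Condition", "Treatment",
          "Replicate", "bamReads", "bamControl", "ControlID", "Peaks",
          "PeakCaller"), collapse = "\t"),
  paste(c("WT_0", "Liver", "IRF5", "WT", "CCL4", "1", "bam/WT_0m_R1.bam",
          "bam/KO_0m_R1.bam", "KO_0", "BroadPeaks/WT_0m_R1.bed", "bed"),
        collapse = "\t")
), tsv_path)

# Read with tab as the separator, then write back out comma-separated.
samples <- read.delim(tsv_path, stringsAsFactors = FALSE)
write.csv(samples, csv_path, row.names = FALSE, quote = FALSE)

# Confirm that the result parses into 11 separate columns.
check <- read.csv(csv_path, stringsAsFactors = FALSE)
dim(check)
```

The resulting file can then be given to `dba(sampleSheet = ...)`. The DiffBind documentation also describes `sampleSheet` as accepting a data frame, which would make it possible to pass `samples` directly, but whether that applies to v2.6.6 is an assumption I have not checked.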

From a user's perspective, since we're used to tab-delimited files, could you make dba accept those too?

Best

mk

 


I'll log that suggestion as a feature request; it may appear at some point...
