Error with DiffBind
bdk • 0
Last seen 6.1 years ago
United States

I was trying to use the DiffBind package and could not get past the third step. The dba object is created, but without any metadata, and I think that is what is causing the problem.

Thanks for any help.


> samples = read.csv(file.path("/n/projects/bdk/fly/chip_seq_final/","Book1.csv"))
> samples
   sampleID antibody replicates                   bamReads ControlID
1        s1      Ubx          1  Ultrabithorax-Agarose.bam        C1
2        s2      Ubx          2 Ultrabithorax-Magbeads.bam        C2
3        s3    Med19          1                Med19-4.bam        C3
4        s4    Med19          2                Med19-5.bam        C4
5        s5      chn          1                  Chn-4.bam        C5
6        s6      chn          2                  Chn-5.bam        C6
7        s7      hth          1                  Hth-4.bam        C7
8        s8      hth          2                  Hth-5.bam        C8
9        s9      exd          1          Extradenticle.bam        C9
10      s10      exd          2                  Exd-4.bam       C10
11      s11     AbdA          1            abdominal-A.bam       C11
12      s12     AbdA          2                abd-A-4.bam       C12
    bamControl               Peaks peakCaller
1    Input.bam   ./peaks/Ubx_1.txt        raw
2    Input.bam   ./peaks/Ubx_2.txt        raw
3    Input.bam ./peaks/med19_1.txt        raw
4  Input-5.bam ./peaks/med19_2.txt        raw
5    Input.bam   ./peaks/chn_1.txt        raw
6  Input-5.bam   ./peaks/chn_2.txt        raw
7    Input.bam   ./peaks/hth_1.txt        raw
8  Input-5.bam   ./peaks/hth_2.txt        raw
9    Input.bam   ./peaks/exd_1.txt        raw
10   Input.bam   ./peaks/exd_2.txt        raw
11   Input.bam  ./peaks/abdA_1.txt        raw
12   Input.bam  ./peaks/abdA_2.txt        raw

> hox = dba(sampleSheet = "Book1.csv")
1     NA raw
2     NA raw
3     NA raw
4     NA raw
5     NA raw
6     NA raw
7     NA raw
8     NA raw
9     NA raw
10     NA raw
11     NA raw
12     NA raw

> hox
12 Samples, 43577 sites in matrix (62180 total):
   ID Caller Intervals
1   1    raw     32208
2   2    raw     32208
3   3    raw     31161
4   4    raw     31406
5   5    raw     33776
6   6    raw     18907
7   7    raw     38511
8   8    raw     14651
9   9    raw     40903
10 10    raw     27576
11 11    raw     27362
12 12    raw     27398

> plot(hox)
Error in ncol(atts):1 : argument of length 0



R version 3.2.0 (2015-04-16)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: CentOS release 6.6 (Final)

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] DiffBind_1.14.5         RSQLite_1.0.0           DBI_0.3.1
 [4] locfit_1.5-9.1          GenomicAlignments_1.4.1 limma_3.24.15
 [7] CoverageView_1.5.2      rtracklayer_1.28.9      Rsamtools_1.20.4
[10] Biostrings_2.36.3       XVector_0.8.0           GenomicRanges_1.20.5
[13] GenomeInfoDb_1.4.2      IRanges_2.2.7           S4Vectors_0.6.3
[16] BiocGenerics_0.14.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.0            lattice_0.20-33        GO.db_3.1.2
 [4] gtools_3.5.0           digest_0.6.8           plyr_1.8.3
 [7] BatchJobs_1.6          futile.options_1.0.0   ShortRead_1.26.0
[10] ggplot2_1.0.1          gplots_2.17.0          zlibbioc_1.14.0
[13] annotate_1.46.1        gdata_2.17.0           Matrix_1.2-2
[16] checkmate_1.6.2        systemPipeR_1.2.21     proto_0.3-10
[19] GOstats_2.34.0         splines_3.2.0          BiocParallel_1.2.20
[22] stringr_1.0.0          pheatmap_1.0.7         RCurl_1.95-4.7
[25] munsell_0.4.2          sendmailR_1.2-1        base64enc_0.1-3
[28] BBmisc_1.9             fail_1.2               edgeR_3.10.2
[31] XML_3.98-1.3           AnnotationForge_1.10.1 MASS_7.3-43
[34] bitops_1.0-6           grid_3.2.0             RBGL_1.44.0
[37] xtable_1.7-4           GSEABase_1.30.2        gtable_0.1.2
[40] magrittr_1.5           scales_0.2.5           graph_1.46.0
[43] KernSmooth_2.23-15     amap_0.8-14            stringi_0.5-5
[46] hwriter_1.3.2          reshape2_1.4.1         genefilter_1.50.0
[49] latticeExtra_0.6-26    futile.logger_1.4.1    brew_1.0-6
[52] rjson_0.2.15           lambda.r_1.1.7         RColorBrewer_1.1-2
[55] tools_3.2.0            Biobase_2.28.0         Category_2.34.2
[58] survival_2.38-3        AnnotationDbi_1.30.1   colorspace_1.2-6
[61] caTools_1.17.1

diffbind • 1.7k views
Rory Stark ★ 4.1k
Last seen 23 days ago
CRUK, Cambridge, UK

Hi Bony-

Indeed this problem relates to the lack of metadata in the DBA object. There are two things going on here.

First: the reason that there is no metadata is that the column names are incorrect. "sampleID" should be "SampleID" (case sensitive) and "replicates" should be "Replicate". I suspect that you'd be better off re-naming "antibody" to "Factor" as well. Also "peakCaller" should be "PeakCaller", but as the default is "raw" it ends up the same. See the man page ?dba for the metadata column names that DiffBind recognizes.
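
As a rough illustration of the renaming described above, here is a minimal base R sketch. It rebuilds only the first two rows of the sample sheet from the question, and the output file name Book1_fixed.csv is a hypothetical choice, not something from the original post. The DiffBind call is left commented out because it needs the real peak and BAM files.

```r
# Stand-in for the sample sheet: first two rows from the question only.
samples <- data.frame(
  sampleID   = c("s1", "s2"),
  antibody   = c("Ubx", "Ubx"),
  replicates = c(1, 2),
  bamReads   = c("Ultrabithorax-Agarose.bam", "Ultrabithorax-Magbeads.bam"),
  ControlID  = c("C1", "C2"),
  bamControl = c("Input.bam", "Input.bam"),
  Peaks      = c("./peaks/Ubx_1.txt", "./peaks/Ubx_2.txt"),
  peakCaller = c("raw", "raw"),
  stringsAsFactors = FALSE
)

# Old header -> header named in the answer (names are case sensitive).
rename_map <- c(
  sampleID   = "SampleID",
  antibody   = "Factor",
  replicates = "Replicate",
  peakCaller = "PeakCaller"
)

# Rename only the columns that appear in the map; leave the rest untouched.
hits <- names(samples) %in% names(rename_map)
names(samples)[hits] <- unname(rename_map[names(samples)[hits]])

print(names(samples))

# Assumed follow-up (hypothetical file name, requires DiffBind and the real files):
# write.csv(samples, "Book1_fixed.csv", row.names = FALSE)
# hox <- dba(sampleSheet = "Book1_fixed.csv")
```

Editing the header row of Book1.csv by hand achieves the same thing; the sketch only makes the mapping explicit.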

Second: there is no good reason for DiffBind to blow up in this case, even if there is no metadata. This is a bug that I have now fixed; the fix will be available from version 1.14.6 onwards.



bdk • 0
Last seen 6.1 years ago
United States

Thanks a lot. It works.


bdk • 0
Last seen 6.1 years ago
United States

I was trying to generate contrasts from the sample sheet above.

hox = dba.contrast(hox,categories = DBA_FACTOR)

Warning message:
No contrasts added. Perhaps try more categories, or lower value for minMembers


Thanks for any help.


Rory Stark ★ 4.1k
Last seen 23 days ago
CRUK, Cambridge, UK

It looks like you have two replicates of each factor. By default, DiffBind wants at least three replicates to generate statistically meaningful results. There are no pairs of factors where each group has at least three replicates.

You can override this by setting minMembers=2, which will contrast every possible pair of factors. Be aware that the low replicate numbers may result in unreliable results. In this case, at a minimum, I would run dba.analyze() with method=c(DBA_EDGER,DBA_DESEQ2) to check how well the two methods agree, but any results would need to be validated in another way.
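
To make the replicate arithmetic concrete, here is a small sketch. The first part is plain base R and uses the factor labels from the sample sheet in the question. The second part only runs if DiffBind is installed and a DBA object named hox already exists in the session, and it assumes reads have already been counted with dba.count(); treat it as an outline of the calls mentioned above rather than a tested workflow.

```r
# Factor labels from the sample sheet in the question: six factors, two replicates each.
factor_col <- rep(c("Ubx", "Med19", "chn", "hth", "exd", "AbdA"), each = 2)
reps <- table(factor_col)

# Factors with enough replicates under the default (3) and the relaxed (2) threshold.
eligible_default <- names(reps)[reps >= 3]
eligible_two     <- names(reps)[reps >= 2]

# Number of pairwise comparisons available under each threshold.
n_pairs_default <- choose(length(eligible_default), 2)
n_pairs_two     <- choose(length(eligible_two), 2)
print(c(default = n_pairs_default, minMembers2 = n_pairs_two))

# Guarded DiffBind calls: skipped unless the package and a counted `hox` object exist.
if (requireNamespace("DiffBind", quietly = TRUE) && exists("hox")) {
  library(DiffBind)
  hox <- dba.contrast(hox, categories = DBA_FACTOR, minMembers = 2)
  hox <- dba.analyze(hox, method = c(DBA_EDGER, DBA_DESEQ2))
}
```

With no factor reaching three replicates, the default threshold leaves nothing to compare, which matches the "No contrasts added" warning; lowering the threshold to two makes every pair of the six factors eligible.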



