Question

Error with Diffbind

bdk • 0

I was trying to use the DiffBind package and could not get past the third step. I can see that the dba object is created, but without any metadata. I think that is what is causing the problem.

Thanks for any help.

Bony

> samples = read.csv(file.path("/n/projects/bdk/fly/chip_seq_final/","Book1.csv"))
> samples
    sampleID  antibody  replicates  bamReads                    ControlID
 1  s1        Ubx       1           Ultrabithorax-Agarose.bam   C1
 2  s2        Ubx       2           Ultrabithorax-Magbeads.bam  C2
 3  s3        Med19     1           Med19-4.bam                 C3
 4  s4        Med19     2           Med19-5.bam                 C4
 5  s5        chn       1           Chn-4.bam                   C5
 6  s6        chn       2           Chn-5.bam                   C6
 7  s7        hth       1           Hth-4.bam                   C7
 8  s8        hth       2           Hth-5.bam                   C8
 9  s9        exd       1           Extradenticle.bam           C9
10  s10       exd       2           Exd-4.bam                   C10
11  s11       AbdA      1           abdominal-A.bam             C11
12  s12       AbdA      2           abd-A-4.bam                 C12
    bamControl   Peaks                peakCaller
 1  Input.bam    ./peaks/Ubx_1.txt    raw
 2  Input.bam    ./peaks/Ubx_2.txt    raw
 3  Input.bam    ./peaks/med19_1.txt  raw
 4  Input-5.bam  ./peaks/med19_2.txt  raw
 5  Input.bam    ./peaks/chn_1.txt    raw
 6  Input-5.bam  ./peaks/chn_2.txt    raw
 7  Input.bam    ./peaks/hth_1.txt    raw
 8  Input-5.bam  ./peaks/hth_2.txt    raw
 9  Input.bam    ./peaks/exd_1.txt    raw
10  Input.bam    ./peaks/exd_2.txt    raw
11  Input.bam    ./peaks/abdA_1.txt   raw
12  Input.bam    ./peaks/abdA_2.txt   raw

> hox = dba(sampleSheet = "Book1.csv")
1 NA raw
2 NA raw
3 NA raw
4 NA raw
5 NA raw
6 NA raw
7 NA raw
8 NA raw
9 NA raw
10 NA raw
11 NA raw
12 NA raw

> hox
12 Samples, 43577 sites in matrix (62180 total):
    ID  Caller  Intervals
 1  1   raw     32208
 2  2   raw     32208
 3  3   raw     31161
 4  4   raw     31406
 5  5   raw     33776
 6  6   raw     18907
 7  7   raw     38511
 8  8   raw     14651
 9  9   raw     40903
10  10  raw     27576
11  11  raw     27362
12  12  raw     27398

> plot(hox)
Error in ncol(atts):1 : argument of length 0

R version 3.2.0 (2015-04-16)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: CentOS release 6.6 (Final)

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C
[9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] DiffBind_1.14.5         RSQLite_1.0.0           DBI_0.3.1
[4] locfit_1.5-9.1          GenomicAlignments_1.4.1 limma_3.24.15
[7] CoverageView_1.5.2      rtracklayer_1.28.9      Rsamtools_1.20.4
[10] Biostrings_2.36.3       XVector_0.8.0           GenomicRanges_1.20.5
[13] GenomeInfoDb_1.4.2      IRanges_2.2.7           S4Vectors_0.6.3
[16] BiocGenerics_0.14.0

loaded via a namespace (and not attached):
[1] Rcpp_0.12.0            lattice_0.20-33        GO.db_3.1.2
[4] gtools_3.5.0           digest_0.6.8           plyr_1.8.3
[7] BatchJobs_1.6          futile.options_1.0.0   ShortRead_1.26.0
[10] ggplot2_1.0.1          gplots_2.17.0          zlibbioc_1.14.0
[13] annotate_1.46.1        gdata_2.17.0           Matrix_1.2-2
[16] checkmate_1.6.2        systemPipeR_1.2.21     proto_0.3-10
[19] GOstats_2.34.0         splines_3.2.0          BiocParallel_1.2.20
[22] stringr_1.0.0          pheatmap_1.0.7         RCurl_1.95-4.7
[25] munsell_0.4.2          sendmailR_1.2-1        base64enc_0.1-3
[28] BBmisc_1.9             fail_1.2               edgeR_3.10.2
[31] XML_3.98-1.3           AnnotationForge_1.10.1 MASS_7.3-43
[34] bitops_1.0-6           grid_3.2.0             RBGL_1.44.0
[37] xtable_1.7-4           GSEABase_1.30.2        gtable_0.1.2
[40] magrittr_1.5           scales_0.2.5           graph_1.46.0
[43] KernSmooth_2.23-15     amap_0.8-14            stringi_0.5-5
[46] hwriter_1.3.2          reshape2_1.4.1         genefilter_1.50.0
[49] latticeExtra_0.6-26    futile.logger_1.4.1    brew_1.0-6
[52] rjson_0.2.15           lambda.r_1.1.7         RColorBrewer_1.1-2
[55] tools_3.2.0            Biobase_2.28.0         Category_2.34.2
[58] survival_2.38-3        AnnotationDbi_1.30.1   colorspace_1.2-6
[61] caTools_1.17.1

Updated 8.6 years ago by Rory Stark • written 8.6 years ago by bdk

score 0 · Answer 1 · 2015-09-10

Hi Bony-

Indeed this problem relates to the lack of metadata in the DBA object. There are two things going on here.

First: there is no metadata because the column names are incorrect. "sampleID" should be "SampleID" (the names are case sensitive) and "replicates" should be "Replicate". I suspect that you'd be better off renaming "antibody" to "Factor" as well. Also, "peakCaller" should be "PeakCaller", but as the default is "raw" it ends up the same. See the man page ?dba for the metadata column names that DiffBind recognizes.
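For anyone hitting the same thing, here is a minimal base R sketch of the renaming described above. The two-row data frame only stands in for the real Book1.csv, and the output file name Book1_fixed.csv is a made-up example, not something from the thread.

```r
# Stand-in for read.csv("Book1.csv"): same column names as the sheet in the
# question, but only the first two rows (illustration only).
samples <- data.frame(
  sampleID   = c("s1", "s2"),
  antibody   = c("Ubx", "Ubx"),
  replicates = c(1L, 2L),
  bamReads   = c("Ultrabithorax-Agarose.bam", "Ultrabithorax-Magbeads.bam"),
  ControlID  = c("C1", "C2"),
  bamControl = c("Input.bam", "Input.bam"),
  Peaks      = c("./peaks/Ubx_1.txt", "./peaks/Ubx_2.txt"),
  peakCaller = c("raw", "raw"),
  stringsAsFactors = FALSE
)

# Old name -> name DiffBind recognizes (see ?dba for the full list).
rename_map <- c(
  sampleID   = "SampleID",
  antibody   = "Factor",
  replicates = "Replicate",
  peakCaller = "PeakCaller"
)

# Rename only the columns that are present; everything else is left alone.
old_names <- names(samples)
to_fix <- old_names %in% names(rename_map)
names(samples)[to_fix] <- unname(rename_map[old_names[to_fix]])

print(names(samples))

# Hypothetical next steps (file name is an assumption):
# write.csv(samples, "Book1_fixed.csv", row.names = FALSE)
# hox <- dba(sampleSheet = "Book1_fixed.csv")
```

Editing the header row of the CSV by hand works just as well; the point is only that the names must match exactly, including case.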

Second: there is no good reason for DiffBind to blow up in this case, even if there is no metadata. This is a bug that I have now fixed; the fix will be available from version 1.14.6 onwards.

Cheers-

Rory

score 0 · Answer 2 · 2015-09-10

Thanks a lot. It works.

Bony

score 0 · Answer 3 · 2015-09-10

I then tried to generate contrasts from the sample sheet above.

> hox = dba.contrast(hox,categories = DBA_FACTOR)
Warning message:
No contrasts added. Perhaps try more categories, or lower value for minMembers

Thanks for any help.

Bony

score 0 · Answer 4 · 2015-09-10

It looks like you have two replicates of each factor. By default, DiffBind wants at least three replicates to generate statistically meaningful results. There are no pairs of factors where each group has at least three replicates.
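To make the arithmetic concrete, here is a small base R sketch that counts replicates per factor for the sample sheet in the question and works out how many pairwise contrasts are possible at a given threshold. It only mimics the counting logic described above and does not call DiffBind; the helper name count_pairwise_contrasts is made up for this illustration.

```r
# Factor labels taken from the "antibody" column in the question:
# six factors, two replicates each.
antibody <- rep(c("Ubx", "Med19", "chn", "hth", "exd", "AbdA"), each = 2)

# Hypothetical helper: number of factor pairs in which both groups have at
# least `min_members` replicates.
count_pairwise_contrasts <- function(factor_labels, min_members) {
  reps_per_factor <- table(factor_labels)
  eligible <- names(reps_per_factor)[reps_per_factor >= min_members]
  choose(length(eligible), 2)
}

n_default <- count_pairwise_contrasts(antibody, min_members = 3)
n_relaxed <- count_pairwise_contrasts(antibody, min_members = 2)

print(table(antibody))
print(c(min_members_3 = n_default, min_members_2 = n_relaxed))
```

With a threshold of three no factor qualifies, so no pair can be formed; with a threshold of two all six factors qualify, which gives 15 possible pairs.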

You can override this by setting minMembers=2, which will contrast every possible pair of factors. Be aware that the low replicate numbers may result in unreliable results. In this case, at a minimum, I would run dba.analyze() with method=c(DBA_EDGER,DBA_DESEQ2) to check how well the two methods agree, but any results would need to be validated in another way.
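A rough sketch of those two steps follows, written as a function so that nothing runs until you call it. It assumes DiffBind is installed and that `hox` already contains read counts from dba.count(), which the thread does not show. The wrapper name compare_methods_two_reps is made up, and the calls follow the DiffBind interface from the time of this thread; later releases may behave differently.

```r
# Sketch only: assumes the DiffBind package is installed and that `hox`
# already holds read counts (i.e. dba.count() has been run on it).
compare_methods_two_reps <- function(hox) {
  # Allow groups with only two replicates, so every pair of factors
  # becomes a contrast.
  hox <- DiffBind::dba.contrast(
    hox,
    categories = DiffBind::DBA_FACTOR,
    minMembers = 2
  )

  # Run both analysis methods so their results can be compared.
  hox <- DiffBind::dba.analyze(
    hox,
    method = c(DiffBind::DBA_EDGER, DiffBind::DBA_DESEQ2)
  )

  # List the contrasts, including how many differentially bound sites
  # each method reports.
  print(DiffBind::dba.show(hox, bContrasts = TRUE))

  invisible(hox)
}

# Usage (not run here):
# hox <- compare_methods_two_reps(hox)
```

If the two methods report very different numbers of sites for a contrast, treat that as a sign that the two-replicate results for that pair are not trustworthy without independent validation.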

-R