Question

diffbind covariates

Alpesh Querer ▴ 220

@alpesh-querer-4895

Last seen 7 weeks ago

United States

Is there a way to specify other covariates, such as batch, individual, height, or exercise_level, in dba.contrast? The documentation says that only the factors listed below are allowed in the design formula. Essentially, I want to use DESeq2 to set up contrasts among the treatment levels. Is the code below valid?


If a design formula is specified, it must be composed from the following allowable factors:

Tissue

Factor

Condition

Treatment

Replicate

Caller




library(DiffBind)


dba_obj <- dba(sampleSheet = "sample_sheet.csv")


dba_obj <- dba.blacklist(dba_obj, blacklist = DBA_BLACKLIST_HG38, greylist = TRUE)


dba_obj <- dba.count(dba_obj,
                     filter = 0,
                     minCount = 1,
                     summits = FALSE,
                     bParallel = TRUE,
                     bUseSummarizeOverlaps = FALSE,
                     score = DBA_SCORE_READS,
                     minOverlap = 1,
                     bScaleControl = TRUE)


count_data <- dba.peakset(dba_obj, bRetrieve = TRUE, DataType = DBA_DATA_FRAME)


dba_obj <- dba.contrast(dba_obj,
                        categories = DBA_TREATMENT,  
                        minMembers = 2,
                        design = "~batch + individual + height + exercise_level + treatment")

dba_obj <- dba.analyze(dba_obj)


report <- dba.report(dba_obj)


head(report)
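
To make the concern concrete, here is a minimal base R sketch that compares the variable names in the design string above against the allowable factor list quoted earlier. It does not call DiffBind at all. The exact, case-sensitive comparison is an assumption about how DiffBind matches factor names, and the object names (`allowed`, `design_vars`, `not_allowed`) are only illustrative.

```r
# Allowable factor names, copied from the DiffBind message quoted above.
allowed <- c("Tissue", "Factor", "Condition", "Treatment", "Replicate", "Caller")

# The design string passed to dba.contrast in this post.
design <- "~batch + individual + height + exercise_level + treatment"

# all.vars() extracts the variable names used in a formula.
design_vars <- all.vars(as.formula(design))

# Names that do not match an allowable factor exactly.
# Assumption: the match is case-sensitive, so "treatment" != "Treatment".
not_allowed <- setdiff(design_vars, allowed)
print(not_allowed)
```

Under that assumption, every term in my design string falls outside the allowable list, including the lowercase `treatment`, which is why I am unsure the call is valid.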

Also, is there any difference between generating raw counts with

dba_obj <- dba.count(dba_obj,
                     filter = 0,
                     minCount = 1,
                     summits = FALSE,
                     bParallel = TRUE,
                     bUseSummarizeOverlaps = FALSE,
                     score = DBA_SCORE_READS,
                     minOverlap = 1,
                     bScaleControl = TRUE)

and then normalizing them with DESeq2 as a dds object, versus requesting normalized counts directly with

dba_obj <- dba.count(dba_obj,
                     filter = 0,
                     minCount = 1,
                     summits = FALSE,
                     bParallel = TRUE,
                     bUseSummarizeOverlaps = FALSE,
                     score = DBA_SCORE_NORMALIZED,
                     minOverlap = 1,
                     bScaleControl = TRUE)

If there is no difference, will the two methods produce the same count matrix? Thanks for your help.
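
To illustrate why I suspect the two matrices could differ, here is a small base R sketch on made-up counts. It compares median-of-ratios size factors (the idea behind DESeq2's default size factor estimate) with simple library-size scaling. Whether DBA_SCORE_NORMALIZED corresponds to either of these depends on how DiffBind's normalization is configured, which I have not confirmed, so this is only a sketch of the general point and not a reproduction of what either package does internally.

```r
# Toy raw count matrix (made-up numbers): 4 peaks x 3 samples.
counts <- matrix(c( 10,  20,  30,
                   100, 210, 290,
                    50,  90, 160,
                     5,  10, 400),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("peak", 1:4), paste0("s", 1:3)))

# Median-of-ratios size factors. This simplified version assumes no zero
# counts; a full implementation has to skip peaks with zeros.
median_of_ratios <- function(m) {
  log_geo_means <- rowMeans(log(m))  # per-peak log geometric mean
  apply(m, 2, function(col) exp(median(log(col) - log_geo_means)))
}

# Library-size factors: total reads per sample, scaled to have mean 1.
lib_size_factors <- function(m) {
  totals <- colSums(m)
  totals / mean(totals)
}

sf_mor <- median_of_ratios(counts)
sf_lib <- lib_size_factors(counts)

# Divide each sample (column) by its size factor.
norm_mor <- sweep(counts, 2, sf_mor, "/")
norm_lib <- sweep(counts, 2, sf_lib, "/")

print(round(sf_mor, 3))
print(round(sf_lib, 3))
print(isTRUE(all.equal(norm_mor, norm_lib)))
```

In this toy example the single very high peak in `s3` inflates that sample's library size but has little effect on the median of ratios, so the two normalized matrices are not the same.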

> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 8.10 (Ootpa)

Matrix products: default
BLAS:   /usr/lib64/libblas.so.3.8.0
LAPACK: /usr/lib64/liblapack.so.3.8.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] clusterProfiler_4.6.2
 [2] org.Hs.eg.db_3.16.0
 [3] TxDb.Hsapiens.UCSC.hg38.knownGene_3.16.0
 [4] GenomicFeatures_1.50.4
 [5] AnnotationDbi_1.60.2
 [6] ChIPseeker_1.34.1
 [7] variancePartition_1.28.9
 [8] BiocParallel_1.32.6
 [9] limma_3.54.2
[10] ggplot2_3.5.1
[11] DiffBind_3.8.4
[12] SummarizedExperiment_1.28.0
[13] Biobase_2.58.0
[14] MatrixGenerics_1.10.0
[15] matrixStats_1.1.0
[16] GenomicRanges_1.50.2
[17] GenomeInfoDb_1.40.1
[18] IRanges_2.32.0
[19] S4Vectors_0.36.2
[20] BiocGenerics_0.44.0

DiffBind ATACSeq

updated 7 weeks ago by Michael Love • written 7 weeks ago by Alpesh Querer