Is there a way to specify other covariates, such as batch, individual, height, or exercise_level, in dba.contrast? The documentation says that only the factors listed below are allowable in the design formula. Essentially, I want to use DESeq2 to set up contrasts among the treatment levels. Is the following code valid?
If a design formula is specified, it must be composed from the following allowable factors:
Tissue
Factor
Condition
Treatment
Replicate
Caller
library(DiffBind)
dba_obj <- dba(sampleSheet = "sample_sheet.csv")
dba_obj <- dba.blacklist(dba_obj, blacklist = DBA_BLACKLIST_HG38, greylist = TRUE)
dba_obj <- dba.count(dba_obj,
                     filter = 0,
                     minCount = 1,
                     summits = FALSE,
                     bParallel = TRUE,
                     bUseSummarizeOverlaps = FALSE,
                     score = DBA_SCORE_READS,
                     minOverlap = 1,
                     bScaleControl = TRUE)
count_data <- dba.peakset(dba_obj, bRetrieve = TRUE, DataType = DBA_DATA_FRAME)
dba_obj <- dba.contrast(dba_obj,
                        categories = DBA_TREATMENT,
                        minMembers = 2,
                        design = "~batch + individual + height + exercise_level + treatment")
dba_obj <- dba.analyze(dba_obj)
report <- dba.report(dba_obj)
head(report)
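For reference, the design string above can be compared against the allowable factor names quoted from the documentation. This is only a minimal base-R sketch, not a DiffBind call: the variable names `allowed`, `design`, `terms_used` and `not_allowed` are made up for illustration, and the comparison is case-sensitive, which is an assumption about how DiffBind matches names.

```r
# Allowable factor names, copied from the documentation excerpt quoted above.
allowed <- c("Tissue", "Factor", "Condition", "Treatment", "Replicate", "Caller")

# Design string exactly as passed to dba.contrast() in the code above.
design <- "~batch + individual + height + exercise_level + treatment"

# all.vars() returns the variable names that appear in the formula.
terms_used <- all.vars(as.formula(design))

# Case-sensitive comparison (an assumption, see the note above).
not_allowed <- setdiff(terms_used, allowed)
print(not_allowed)
```

Under that assumption every term in my formula is flagged, including the lowercase `treatment`, which is why I am unsure whether the call is valid as written.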
Also, is there any difference between generating raw counts with
dba_obj <- dba.count(dba_obj,
                     filter = 0,
                     minCount = 1,
                     summits = FALSE,
                     bParallel = TRUE,
                     bUseSummarizeOverlaps = FALSE,
                     score = DBA_SCORE_READS,
                     minOverlap = 1,
                     bScaleControl = TRUE)
and then normalizing with DESeq2 as a dds object, versus directly requesting normalized scores with
dba_obj <- dba.count(dba_obj,
                     filter = 0,
                     minCount = 1,
                     summits = FALSE,
                     bParallel = TRUE,
                     bUseSummarizeOverlaps = FALSE,
                     score = DBA_SCORE_NORMALIZED,
                     minOverlap = 1,
                     bScaleControl = TRUE)
If there is no difference, will the two methods result in the same count matrix? Thanks for your help.
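To make the second question concrete: whether the two matrices agree presumably depends on which normalization each route applies. The base-R sketch below uses a made-up toy count matrix and compares two common schemes, scaling by library size and a median-of-ratios size factor computed by hand. It does not call DiffBind or DESeq2, column sums stand in for library size (an assumption), and all object names are hypothetical.

```r
# Toy counts: 3 peaks x 2 samples (invented numbers, not real data).
counts <- matrix(c(10, 20, 30,
                   20, 40, 600),
                 nrow = 3,
                 dimnames = list(paste0("peak", 1:3), c("s1", "s2")))

# Scheme 1: library-size scaling, using column sums as the library size.
lib_sizes <- colSums(counts)
lib_factors <- lib_sizes / mean(lib_sizes)
lib_norm <- sweep(counts, 2, lib_factors, "/")

# Scheme 2: median-of-ratios size factors, computed by hand.
# Each sample's factor is the median ratio of its counts to the
# per-peak geometric mean across samples.
log_geo_mean <- rowMeans(log(counts))
size_factors <- apply(counts, 2, function(x) exp(median(log(x) - log_geo_mean)))
mor_norm <- sweep(counts, 2, size_factors, "/")

# The two normalized matrices can be compared directly.
print(lib_norm)
print(mor_norm)
print(isTRUE(all.equal(lib_norm, mor_norm)))
```

With these toy numbers the two schemes give different scaling factors, so the normalized matrices differ. If the two routes in my question use different schemes, I would expect the same to happen with real data.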
> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 8.10 (Ootpa)
Matrix products: default
BLAS: /usr/lib64/libblas.so.3.8.0
LAPACK: /usr/lib64/liblapack.so.3.8.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] clusterProfiler_4.6.2
[2] org.Hs.eg.db_3.16.0
[3] TxDb.Hsapiens.UCSC.hg38.knownGene_3.16.0
[4] GenomicFeatures_1.50.4
[5] AnnotationDbi_1.60.2
[6] ChIPseeker_1.34.1
[7] variancePartition_1.28.9
[8] BiocParallel_1.32.6
[9] limma_3.54.2
[10] ggplot2_3.5.1
[11] DiffBind_3.8.4
[12] SummarizedExperiment_1.28.0
[13] Biobase_2.58.0
[14] MatrixGenerics_1.10.0
[15] matrixStats_1.1.0
[16] GenomicRanges_1.50.2
[17] GenomeInfoDb_1.40.1
[18] IRanges_2.32.0
[19] S4Vectors_0.36.2
[20] BiocGenerics_0.44.0