Hi all,
I'm new to diffbind.
I've been given a set of ATAC-seq data computed with nf-core/atacseq and asked if there is any peak associated with the sex of the individuals.
In my samplesheet i set the following columns:
Tissue tissue1 || tissue2 || tissue3
Condition M || F # male of female
Factor NA
my workflow , copied from a former colleague's code, contains the following calls:
dba(samplesheet)
dba.count(DBA, bParallel=FALSE) ## my job used to be killed for dba.countby our cluster until I downsampled the bams
dba.normalize(DBA)
dba.contrast(DBANorm, categories = DBA_CONDITION)
dba.analyze(DBAContrast, method=c(DBA_DESEQ2))
dba.report(DNAAnalysis, th = 1, bCalled=TRUE, method=DBA_DESEQ2)
and then I save the file:
df <- data.frame(CHROM=seqnames(DBA.report),
starts=start(DBA.report)-1,
ends=end(DBA.report),
names=c(paste0("peak",1:length(DBA.report))),
scores=c(rep("0", length(DBA.report))),
strands=c(rep(".", length(DBA.report))),
conc = DBA.report\$Conc,
conc_cond1=mcols(DBA.report)[,2],
conc_cond2=mcols(DBA.report)[,3],
fold = DBA.report\$Fold,
pval = DBA.report\$"p-value",
FDR = DBA.report\$FDR
)
1) for dba.contrast
, currently my phenotype is a binary one. In general is it ok if the column Condition contains more than two items (eg: brown eye, blue eye, gray eye).
2) what is group1 and group2 in my context ? male / female ? female / make ? how do i know the context ? is there a group3 if my phenotype is not binary ?
3) after dba.analyze
I tried dba.plotVenn(DBA,contrast=1,method=DBA_DESEQ2)
to plot a venn diagram of male vs female but it failed (I don't hae the error message anymore). Why ? I'm not sure about this . I don't really get the idea that dba.show returns contrasts associated with a DBA object.
. So should I 'loop' over all the contrasts returned by dba.show
?
4) if I want to find the difference by cell-type AND sex I should use dba.contrast(DBA categories = c(DBA_CONDITION,Tissue))` right ?
5) unless I'm wrong, even If I use DBA <- dba.contrast(DBA , categories = DBA_TISSUE)
I see Conc_F Conc_M
in the DBA.report. Is the report only for "Condition" ?
> head(DBA.report)
GRanges object with 6 ranges and 8 metadata columns:
seqnames ranges strand | Conc Conc_F Conc_M Fold
<Rle> <IRanges> <Rle> | <numeric> <numeric> <numeric> <numeric>
1 chr1 9915-10315 * | 6.61318 6.74846 6.49502 0.000946031
2 chr1 13124-13524 * | 4.57819 4.32214 4.75464 -0.001598787
3 chr1 17175-17575 * | 5.03964 5.39257 4.67800 0.004456263
4 chr1 29117-29517 * | 6.12669 5.23852 6.57850 -0.004478228
5 chr1 181182-181582 * | 6.91771 7.39266 6.37842 0.002882007
6 chr1 183675-184075 * | 6.24583 6.55481 5.93992 0.002751221
`
6) if I'm interested in the peaks in the autosomal chromosomes associated to sex, should I remove the data in the sexual chromosome not remove any bias ?
Thank you for your help
Pierre