Search
Question: a diffBind problem
0
5.4 years ago by
Rory Stark2.5k
CRUK, Cambridge, UK
Rory Stark2.5k wrote:
Hi Walter - You did the right thing by explicitly calling dba.contrast to add the WT vs KO contrast as there are only two knockouts. However you have two different factors (trimethyl and monomethly, I assume these are histone marks) for each condition. The warning from dba.analyze is that there were no differentially bound sites identified (with an FDR < 0.10 using edgeR) -- this is of course possible! Indeed as the sites for each of these two histone marks may not overlap very much, it is not too surprising The error you're seeing in calling dba.report is a typo  in the message you used a minus sign instead of an equal (assignment) sign. There are a few things you could do here. You may want to check out the overlaps of the peaks: > dba.plotVenn(rats,rats$condition$trimethly) > dba.plotVenn(rats,rats$condition$monomethly) To see if there are common peaks within each group being contrasted. After the analysis, you may want to have a look at the MA plot (dba.plotMA) to make sure things got normalized OK, and to see if it looks like there are sites that really are different between the conditions. You can also examine the stats for all the sites: > rats.DB = dba.report(rats, th=1) This would let you know what sites are identified with the most confidence, even if none have a low enough FDR. If you really do want to contrast all the mono- and tri-methly peaks together within each condition, you may want to try a multifactorial design, matching the marks across conditions: > rats = dba.contrast(rats, categories=DBA_CONDITION,block=DBA_FACTOR) When you run dba.analyze, you'll get both a single-factor analysis (as you have already done) and the matched analysis. To retrieve the stats for the multifactor analysis: > rats.DB = dba.report(rats, method=DBA_EDGER_BLOCK, th=1) There may be some more specific questions you can address with your data set, which doesn't have much replication (only monomethly WT is replicated, and that has only two replicates), using a more sophisticated multifactorial design, in which case you'll need to work with edgeR (and/or DESeq) directly. Cheers- Rory From: Walter Taylor <walter.taylor@dartmouth.edu<mailto:walter.taylor@dartmouth.edu>> Date: Thu, 31 Jan 2013 23:28:45 +0000 To: Rory Stark <rory.stark@cancer.org.uk<mailto:rory.stark@cancer.org.uk>> Subject: a diffBind problem Rory, If you have time to take a quick look at my case, I would be grateful. I have been able to get through most of the steps in creating a report in DiffBind with a fairly simple data set, but it usually reports that I have no differentially bound sites. Here is my sample sheet: And here is a copy of my commands in Rstudio: > rats = dba(sampleSheet="schneid2.csv") 1 wt trimethyl wild type 1 raw 2 wt monomethyl wild type 1 raw 3 wt monomethyl wild type 2 raw 4 ko trimethyl knockout 1 raw 5 ko monomethyl knockout 1 raw > rats = dba.count(rats) Calculating library sizes from column totals. > rats = dba.contrast(rats, categories=DBA_CONDITION) Warning message: No contrasts added. Perhaps try more categories, or lower value for minMembers. > rats = dba.contrast(rats, categories=DBA_CONDITION, minMembers=2) > > > rats = dba.analyze(rats) Calculating library sizes from column totals. Warning message: In dba.analyze(rats) : No correlation heatmap plotted -- contrast 1 does not have enough differentially bound sites. > > rats.DB - dba.report(rats) Error: object 'rats.DB' not found I am using DiffBind 1.4.1 Any hints would be greatly appreciated. Walter Walter Taylor, PhD Dartmouth-Hitchcock Medical Center Norris Cotton Cancer Center 1 Medical Center Drive Lebanon, NH 03756 walter.taylor@dartmouth.edu<mailto:walter.taylor@dartmouth.edu> 603-653-9174 [[alternative HTML version deleted]]
modified 5.4 years ago • written 5.4 years ago by Rory Stark2.5k
0
5.4 years ago by
Rory Stark2.5k
CRUK, Cambridge, UK
Rory Stark2.5k wrote:
Hi Walter- 1. The masks you are passing in to dba.plotVenn aren't quite right, instead of rats$condition$trimethyl, it should be rats$masks$trimethly: > dba.plotVenn(rats,rats$masks$trimethly) > dba.plotVenn(rats,rats$masks$monomethyl) 2. You should probably do the dba.plotVenn calls before calling dba.count. (You can try them after as well and compare the results). 3. When setting up the blocking contrast, you also have to specify minMembers: > rats$contrasts NULL > rats = dba.contrast(rats, categories=DBA_CONDITION,block=DBA_FACTOR,minMembers=2) > rats = dba.analyze(rats, bCorPlot=F) > par(mfrow=c(2,1)) > dba.plotMA(rats, method=c(DBA_EDGER,DBA_EDGER_BLOCK)) > rats.DB = dba.report(rats, method=DBA_EDGER_BLOCK, th=1) Hopefully we'll see you in June at EBI! Cheers- Rory From: Walter Taylor <walter.taylor@dartmouth.edu<mailto:walter.taylor@dartmouth.edu>> Date: Fri, 1 Feb 2013 16:44:58 +0000 To: Rory Stark <rory.stark@cruk.cam.ac.uk<mailto:rory.stark@cruk.cam.ac.uk>> Subject: Re: a diffBind problem Rory, Thanks for answering my email. I know you must get lots of requests for help. I tried all of your suggestions, and I think the best way to show what happened is to show you the console listing from RStudio: > rats = dba(sampleSheet="schneid2.csv") MS2-MS1 wt trimethyl wild type 1 raw MS3-MS1 wt monomethyl wild type 1 raw MS8-MS7 wt monomethyl wild type 2 raw MS5-MS4 ko trimethyl knockout 1 raw MS6-MS4 ko monomethyl knockout 1 raw > > rats = dba.count(rats) Calculating library sizes from column totals. > rats = dba.contrast(rats, categories=DBA_CONDITION, minMembers=2) > rats = dba.analyze(rats) Calculating library sizes from column totals. Warning message: In dba.analyze(rats) : No correlation heatmap plotted -- contrast 1 does not have enough differentially bound sites. > rats.db = dba.report(rats) Warning message: No sites above threshold > > > dba.plotVenn(rats,rats$condition$trimethly) Error in pv$sites[[A]] : attempt to select less than one element > > dba.plotVenn(rats,rats$condition$monomethly) Error in pvsites[[A]] : attempt to select less than one element > > rats.DB = dba.report(rats, th=1) > > rats = dba.contrast(rats, categories=DBA_CONDITION,block=DBA_FACTOR) Warning messages: 1: No contrasts added. Perhaps try more categories, or lower value for minMembers. 2: Blocking factor invalid for all contrasts: 3: Blocking factor has only one value > > rats.DB = dba.report(rats, method=DBA_EDGER_BLOCK, th=1) Error in pv.DBAreport(pv = DBA, contrast = contrast, method = method, : Specified contrast number is greater than number of contrasts > > > > rats = dba.contrast(rats, categories=DBA_CONDITION, minMembers=1) Error in dba.contrast(rats, categories = DBA_CONDITION, minMembers = 1) : minMembers must be at least 2. Use of replicates strongly advised. It looks to me like your comment about there really not being much difference between the knockout and control may be true, based on the output of DiffBind and that when we look at the bedgraph from MAC2,we see little difference. If we do a subtract operation in Cistrome, there are differences, but I cannot tell from a wiggle track how different, because it is not quantitation, only positional. Unless you have other ideas, I will repeat everything from the start (the alignment, MACS2, etc) and to see if I made some kind of obvious slip-up. Again, I really appreciate your help. I noticed that there is an Advanced RNA-seq and ChIP-seq Data Analysis course at EMBL-EBI in June, and that you are one of the instructors. I am think seriously of trying to attend this. If I do, you can count on me buying you many beers (if you want). Walter On Feb 1, 2013, at 6:53 AM, Rory Stark <rory.stark@cruk.cam.ac.uk<mailto:rory.stark@cruk.cam.ac.uk>> wrote: Hi Walter - You did the right thing by explicitly calling dba.contrast to add the WT vs KO contrast as there are only two knockouts. However you have two different factors (trimethyl and monomethly, I assume these are histone marks) for each condition. The warning from dba.analyze is that there were no differentially bound sites identified (with an FDR < 0.10 using edgeR) -- this is of course possible! Indeed as the sites for each of these two histone marks may not overlap very much, it is not too surprising The error you're seeing in calling dba.report is a typo  in the message you used a minus sign instead of an equal (assignment) sign. There are a few things you could do here. You may want to check out the overlaps of the peaks: > dba.plotVenn(rats,ratscondition$trimethly) > dba.plotVenn(rats,rats$condition\$monomethly) To see if there are common peaks within each group being contrasted. After the analysis, you may want to have a look at the MA plot (dba.plotMA) to make sure things got normalized OK, and to see if it looks like there are sites that really are different between the conditions. You can also examine the stats for all the sites: > rats.DB = dba.report(rats, th=1) This would let you know what sites are identified with the most confidence, even if none have a low enough FDR. If you really do want to contrast all the mono- and tri-methly peaks together within each condition, you may want to try a multifactorial design, matching the marks across conditions: > rats = dba.contrast(rats, categories=DBA_CONDITION,block=DBA_FACTOR) When you run dba.analyze, you'll get both a single-factor analysis (as you have already done) and the matched analysis. To retrieve the stats for the multifactor analysis: > rats.DB = dba.report(rats, method=DBA_EDGER_BLOCK, th=1) There may be some more specific questions you can address with your data set, which doesn't have much replication (only monomethly WT is replicated, and that has only two replicates), using a more sophisticated multifactorial design, in which case you'll need to work with edgeR (and/or DESeq) directly. Cheers- Rory From: Walter Taylor <walter.taylor@dartmouth.edu<mailto:walter.taylor@dartmouth.edu>> Date: Thu, 31 Jan 2013 23:28:45 +0000 To: Rory Stark <rory.stark@cancer.org.uk<mailto:rory.stark@cancer.org.uk>> Subject: a diffBind problem Rory, If you have time to take a quick look at my case, I would be grateful. I have been able to get through most of the steps in creating a report in DiffBind with a fairly simple data set, but it usually reports that I have no differentially bound sites. Here is my sample sheet: And here is a copy of my commands in Rstudio: > rats = dba(sampleSheet="schneid2.csv") 1 wt trimethyl wild type 1 raw 2 wt monomethyl wild type 1 raw 3 wt monomethyl wild type 2 raw 4 ko trimethyl knockout 1 raw 5 ko monomethyl knockout 1 raw > rats = dba.count(rats) Calculating library sizes from column totals. > rats = dba.contrast(rats, categories=DBA_CONDITION) Warning message: No contrasts added. Perhaps try more categories, or lower value for minMembers. > rats = dba.contrast(rats, categories=DBA_CONDITION, minMembers=2) > > > rats = dba.analyze(rats) Calculating library sizes from column totals. Warning message: In dba.analyze(rats) : No correlation heatmap plotted -- contrast 1 does not have enough differentially bound sites. > > rats.DB - dba.report(rats) Error: object 'rats.DB' not found I am using DiffBind 1.4.1 Any hints would be greatly appreciated. Walter Walter Taylor, PhD Dartmouth-Hitchcock Medical Center Norris Cotton Cancer Center 1 Medical Center Drive Lebanon, NH 03756 walter.taylor@dartmouth.edu<mailto:walter.taylor@dartmouth.edu> 603-653-9174 Walter Taylor, PhD Dartmouth-Hitchcock Medical Center Norris Cotton Cancer Center 1 Medical Center Drive Lebanon, NH 03756 walter.taylor@dartmouth.edu<mailto:walter.taylor@dartmouth.edu> 603-653-9174 [[alternative HTML version deleted]]