Bam files causing Blacklist errors in dba.analyze() but not in dba.blacklist()
ahua217 ▴ 10
@ahua217-24323
Last seen 3.1 years ago

Hello! When I ran dba.analyze() on my DBA object, which links to bam files, I got errors. My command was:

anaScramblecst <- dba.analyze(Scramblecst)

It returned the errors "No matching chromosomes found for file: ..bam" and "Blacklist error: Error in BamFile(file, character(0)): 'file' must be character(1) and not NA".

Then I tried

Scramblecstbl<-dba.blacklist(Scramblecst,blacklist=DBA_BLACKLIST_HG19,greylist=FALSE)

It successfully applied the blacklist and removed some intervals. I then passed the filtered object to dba.analyze() again.

anaScramblecstbl <- dba.analyze(Scramblecstbl)
anaScramblecstbl <- dba.analyze(Scramblecstbl,bBlacklist = TRUE)
anaScramblecstbl <- dba.analyze(Scramblecstbl,bBlacklist = FALSE)

None of these commands worked either. Each of them returned the errors "Blacklist error: Error in strsplit(genome, "BSgenome."): non-character argument" and "Unable to apply Blacklist/Greylist."

To figure out what was wrong, I also moved one of my bam files into the sample folder of the DiffBind vignette and renamed it to match one of the vignette files. I then ran dba() and dba.analyze(tamoxifen), and both worked. So I am confused about what the Blacklist errors mean. How can I fix them?

Thank you very much!

DiffBind • 2.7k views
Rory Stark ★ 5.2k
@rory-stark-5741
Last seen 14 days ago
Cambridge, UK

It looks like the problem is in the greylist and not the blacklist, since it worked when you called dba.blacklist() with greylist=FALSE.

You can skip greylists completely by setting bGreylist=FALSE in the call to dba.analyze(), or

Scramblecstbl$config$doGreylist <- FALSE

before running dba.analyze().

The question is, do you have separate control bam files, and do you want to use them to generate greylists? If you don't have control reads, I'd need to see a copy of your DBA object to see what is going on. If you do have controls, then either a) something is wrong with how you specified one or more of them (I notice that the error message in your question refers to a file called ..bam, which could be an issue), or b) something is wrong with the bam file itself.
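As a rough sketch of the decision above: one could check whether the sample sheet lists any control bam files before deciding whether to ask for greylists. The helper name has_controls and the toy sample sheet are hypothetical. The code assumes controls are given in a bamControl column, as in a standard DiffBind sample sheet. The dba.analyze() call is left as a comment because it needs a real DBA object.

```r
# Hypothetical helper: TRUE if at least one sample lists a control bam file.
has_controls <- function(sheet) {
  "bamControl" %in% names(sheet) &&
    any(!is.na(sheet$bamControl) & nzchar(as.character(sheet$bamControl)))
}

# Toy sample sheet without a control column (illustrative values only).
sheet <- data.frame(
  SampleID = c("KWScr_1", "KWScr_2"),
  bamReads = c("a.bam", "b.bam"),
  stringsAsFactors = FALSE
)

use_greylist <- has_controls(sheet)

# With DiffBind loaded and a counted DBA object called Scramblecstbl:
# ana <- dba.analyze(Scramblecstbl, bGreylist = use_greylist)
```

With no controls in the sheet, use_greylist is FALSE, which corresponds to the bGreylist=FALSE workaround.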


Really appreciate your help, Dr. Stark. Setting bGreylist=FALSE did indeed let dba.analyze() work. The data sheet for my DBA object had no control reads, only spike-in bam files. Control reads would be something like the input reads in ChIP-seq, right? My samples are not from ChIP-seq. (My data sheet was attached here as an image, which is not included.)

After running dba(), dba.count(), dba.normalize(), dba.contrast() and dba.analyze(), I got a DBA object like this:

  ID      Tissue   Condition Treatment           Replicate Reads    FRiP
1 KWScr_2 Weri_Rb1 Scramble  Scramble_H3K36Me2   2         7694871  0.24
2 KWsh5_2 Weri_Rb1 sh5       sh5_H3K36Me2        2         9184486  0.12
3 KWScr_1 Weri_Rb1 Scramble  Scramble_H3K36Me2   1         5194043  0.27
4 KWsh5_1 Weri_Rb1 sh5       sh5_H3K36Me2        1         6161260  0.11
5 IWScr_2 Weri_Rb1 Scramble  Scramble_Rabbit_IgG 2         9565498  0.11
6 IWsh5_2 Weri_Rb1 sh5       sh5_Rabbit_IgG      2         11111771 0.10
7 IWScr_1 Weri_Rb1 Scramble  Scramble_Rabbit_IgG 1         6471552  0.10
8 IWsh5_1 Weri_Rb1 sh5       sh5_Rabbit_IgG      1         7626725  0.09

Design: [~Treatment] | 2 Contrasts:

  Factor    Group             Samples Group2              Samples2 DB.DESeq2
1 Treatment Scramble_H3K36Me2 2       Scramble_Rabbit_IgG 2        239615
2 Treatment sh5_H3K36Me2      2       sh5_Rabbit_IgG      2        8349

The .bam files have very long names, so in my last post I shortened them to ..bam, by which I meant a placeholder like xxx.bam. Sorry about the confusion.

dba.plotVenn() now works on the DBA object with the contrasts I set, but I still have 3 questions to make sure what I did was right.

1) At which step is the normalization done? Since the Spikein bam files were included in the data sheet, were they taken into account at an early step such as dba.count(), or only at dba.normalize()? If the latter, which parameters of dba.normalize() determine whether the spike-in reads are used?

2) If I supply my own normalization factors to dba.normalize() like this

dba.normalize(DBA, normalize=c(10.90851692,1.148264939,16.14085674,1.727614635,0.552050451556888,0.450021325196539,0.8128489004399,0.702649236076779))

does that mean the spike-in bam files from the data sheet are automatically ignored during normalization? I would also like to know whether my user-supplied numeric vector meets the requirements. I think the numbers should be in the same order as the sample rows, right?

3) The third question is a follow-up to my earlier post, "Error report when loading a 0-byte bed file for DBA object in Diffbind", on which I would like some advice. My data sheet includes some 0-peak bed files, which I generated manually for two reasons. (1) A 0-byte bed file produced by the peak caller could not be loaded into my DBA object. (2) I also need some IgG vs IgG peak sets (I expect them to contain 0 peaks, but the peak caller could not generate bed files for them) to make it easier to set contrasts and use plotVenn.

Thank you very much!


1) To utilize spike-in reads, you need to specify a value for spikein= when calling dba.normalize(). If spikein=TRUE, then the total number of reads in the Spikein bam file will be used to compute normalization factors. See the help page for dba.normalize().

2) If you explicitly specify a vector of normalization factors, this will override the ones that would be computed using the spike-in reads. There should be one value per sample, in the order that the samples appear in the sample sheet.

3) I'll look again at the 0-byte bed file for DBA object in DiffBind issue and see if I can suggest something.
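To illustrate points 1) and 2), here is a minimal sketch, not a definitive recipe. The helpers order_factors and normalize_dba are hypothetical names, and the factor values are made up for illustration. The sample IDs are taken from the DBA summary earlier in this thread. The first part only uses base R to put factors keyed by sample ID into sample sheet order. The wrapper assumes DiffBind is installed and that dbObj is a DBA object that has already been through dba.count().

```r
# Sample IDs in sample sheet order (from the DBA summary in this thread).
sample_ids <- c("KWScr_2", "KWsh5_2", "KWScr_1", "KWsh5_1",
                "IWScr_2", "IWsh5_2", "IWScr_1", "IWsh5_1")

# Factors keyed by sample ID, deliberately shuffled; values are made up.
factors_by_id <- c(IWsh5_1 = 0.8, KWScr_1 = 1.3, IWScr_2 = 0.5, KWsh5_2 = 1.2,
                   IWScr_1 = 0.7, KWScr_2 = 1.1, IWsh5_2 = 0.6, KWsh5_1 = 1.4)

# Hypothetical helper: return one factor per sample, in sample sheet order.
order_factors <- function(factors, ids) {
  missing_ids <- setdiff(ids, names(factors))
  if (length(missing_ids) > 0) {
    stop("No factor supplied for: ", paste(missing_ids, collapse = ", "))
  }
  unname(factors[ids])
}

norm_vec <- order_factors(factors_by_id, sample_ids)

# Hypothetical wrapper: use spike-in reads unless explicit factors are given.
normalize_dba <- function(dbObj, factors = NULL) {
  if (!requireNamespace("DiffBind", quietly = TRUE)) {
    stop("DiffBind is not installed")
  }
  if (is.null(factors)) {
    DiffBind::dba.normalize(dbObj, spikein = TRUE)       # point 1)
  } else {
    DiffBind::dba.normalize(dbObj, normalize = factors)  # point 2)
  }
}
```

With a real object, normalize_dba(dbObj) would request spike-in based normalization, while normalize_dba(dbObj, norm_vec) would pass the explicit factors instead.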


Really appreciate your help, Dr. Stark. After confirming the answers, I will apply the right setting to all my samples.
