Reading data into chromVAR
1
0
Entering edit mode
Chris • 0
@3fdb6f97
Last seen 1 day ago
United States

Hi all,

I am trying to follow this vignette but confused:
https://greenleaflab.github.io/chromVAR/.
I have 8 bam files from ATAC-seq data and converted to 8 bed files. However this example has 2 bam files and only one bed file.

peakfile <- "mypeaks.bed"
peaks <- getPeaks(peakfile)

bamfiles <- c("mybam1.bam","mybam2.bam")
fragment_counts <- getCounts(bamfiles, peaks, 
                              paired =  TRUE, 
                              by_rg = TRUE, 
                              format = "bam", 
                              colData = DataFrame(celltype = c("GM","K562")))

Would you suggest me the correct way to read data in this case? Why do we need both bam and bed file for the function getCounts()? Thank you so much.

chromVAR • 940 views
ADD COMMENT
1
Entering edit mode
@james-w-macdonald-5106
Last seen 22 hours ago
United States

Do note that chromVar is meant for sparse ATAC-Seq data, like what you get from scATAC-Seq. If you have conventional ATAC-Seq data, you might consider using DiffBind instead.

Anyway, the idea for any ATAC-Seq data analysis is to identify a set of peaks that might represent open chromatin in your samples, and then count up the number of reads that overlap those peaks. In general this means using MACS with all your BAM files at once to identify consensus peaks (those that are found in more than just one sample), and then using those results as the peaks. In which case you should have expected to only have one peak file and multiple BAM files.

1
Entering edit mode

It's fine, the purpose of DiffBind and chromVAR are different. In a nutshell, chromVAR matches a collection of motifs to open chromatin regions, and then calculates deviation scores for each motif based on the read counts per sample for the regions that harbor a certain motif. The output is a relatively high-dimensional overview whether certain motifs are generally more associated with opening or closing chromatin between samples/conditions. We used that before for bulk ATAC-seq and it worked reasonably well, pretty much confirming what you would see when scanning differential chromatin regions between conditions for motif enrichment, but more convenient and faster since you don't have to do all this testing and enrichment analysis.

Seconding the 2nd paragraph, it's one BED file with peaks, for example consensus peaks, and then one bam per sample, as described in the manual OP links. There is no instruction to ever convert BAM to BED, so please stick to the manual.

ADD REPLY
0
Entering edit mode

Thanks ATpoint! I have consensus file from nf-core/atac. The vignette used system.file() to read bam data in so I try to read my bam files into the tool but got error. Would you suggest the function to read data? For example, if I have 8 bam files for 4 conditions (2 biological replicates for each condition), how can I read them into? The vignette used example_counts a built in data which is a RangedSummarizedExperiment. Would you please tell me how to get this object with my data?

bamfile <- "path/to/bam_file"
fragment_counts <- getCounts(bamfile, 
                             peaks, 
                             paired = TRUE, 
                             by_rg = TRUE, 
                             format = "bam", 
                             colData = DataFrame(celltype = "GM"))

Error in value[3L] : failed to open BamFile: file(s) do not exist:
'path/to/bam_file'

If I put the path to bam file directly into the function, I get the same error:

  fragment_counts <- getCounts("path/to/bam_file", 
                                 peaks, 
                                 paired = TRUE, 
                                 by_rg = TRUE, 
                                 format = "bam", 
                                 colData = DataFrame(celltype = "GM"))

Error in value[3L] : failed to open BamFile: file(s) do not exist:
'path/to/bam_file'

I can read it now. Missed a forward slash in the file path. Thank you.

ADD REPLY
0
Entering edit mode

Thanks Jame for your help! I used DiffBind but I got an error: Help with error DiffBind

ADD REPLY

Login before adding your answer.

Traffic: 512 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6