Comparing ChIP-seq with published data using DiffBind
jabbari • 0
@jabbari-13869
Last seen 6.6 years ago

Hello, 

I am very new to DiffBind, so pardon me if this is a silly question.

I am trying to compare our ChIP-seq data with a published data set (GEO GSE322222), for which only BED files were provided. I am wondering if it is possible to use DiffBind when we don't have access to the BAM files (reads and control).

Thanks,

Hosna

diffbind • 753 views
Rory Stark ★ 5.1k
@rory-stark-5741
Last seen 9 weeks ago
Cambridge, UK

Hi Hosna-

If all you have are peak files, you can't do a quantitative analysis with DiffBind. You can look at overlaps (e.g. Venn diagrams) but not much more.

In this case, however, I think the raw (read) data are actually available. GSE322222 is a dataset I helped generate and analyze. In GEO, only the peak files are included at the top level of the dataset, but if you drill down to the record for each individual sample, the raw data are included. For example, the first sample is GSM798383, and you can download the data using this ID. The page for GSE322222 has a section titled "Samples (62)" with a "+More" button, which shows the links for all the individual samples.

Cheers-

Rory
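
As a rough illustration of the overlap-only comparison mentioned above, here is a minimal Python sketch that counts how many peaks in one BED-style peak set overlap at least one peak in another. It is not DiffBind and not the method used for the published dataset; the function names `parse_bed` and `count_overlapping` and the toy coordinates are made up for illustration, and it assumes both peak sets use the same reference genome build.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED-like lines into {chrom: sorted list of (start, end)}."""
    peaks = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        peaks[chrom].append((int(start), int(end)))
    for chrom in peaks:
        peaks[chrom].sort()
    return dict(peaks)


def count_overlapping(ours, theirs):
    """Count peaks in `ours` that overlap at least one peak in `theirs`."""
    hits = 0
    for chrom, intervals in ours.items():
        others = theirs.get(chrom, [])
        for start, end in intervals:
            # Half-open intervals overlap when each starts before the other ends.
            if any(s < end and start < e for s, e in others):
                hits += 1
    return hits


# Toy coordinates (hypothetical, not taken from any real sample).
ours = parse_bed(["chr1\t100\t200", "chr1\t500\t600", "chr2\t50\t80"])
published = parse_bed(["chr1\t150\t250", "chr2\t200\t300"])

shared = count_overlapping(ours, published)
total = sum(len(v) for v in ours.values())
print(f"{shared} of {total} peaks overlap a published peak")
```

The pairwise scan is quadratic per chromosome, which is fine for a quick check but slow for large peak sets; an interval tree or a sorted sweep would scale better. Counts like these are what a Venn diagram summarizes, and they say nothing about binding strength.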


Hi Rory, 

Thank you very much for the quick response!

Just to clarify: I am now assuming that files such as GSM798383_SLX-1201.250.s_4_SLX-1202.250.s_1_peaks.txt.gz contain differentially bound sites and that GSM798383_SLX-1201.250.s_4_SLX-1202.250.s_1_sw.peaks.txt.gz contains raw reads.

If this is the case, don't I need to have access to the control data to be able to use DiffBind with my data?

Thank you very much for your help, 

Hosna


No, GSM798383_SLX-1201.250.s_4_SLX-1202.250.s_1_peaks.txt.gz contains MACS peaks for this sample, and GSM798383_SLX-1201.250.s_4_SLX-1202.250.s_1_sw.peaks.txt.gz contains peaks from a different peak caller (SWEMBL). No differential peaks are included.
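
If you download many of these supplementary files, a small helper can label them by peak caller. The sketch below is Python and assumes the suffix convention seen in the two file names above (`_sw.peaks.txt.gz` for SWEMBL, `_peaks.txt.gz` for MACS) holds for the other samples too, which should be checked against the actual file listing. The function name `peak_caller_from_filename` is hypothetical.

```python
def peak_caller_from_filename(name):
    """Guess the peak caller from a supplementary file name.

    Assumption: the suffixes follow the two examples in this thread.
    """
    # Check the SWEMBL suffix explicitly before the more general MACS one.
    if name.endswith("_sw.peaks.txt.gz"):
        return "SWEMBL"
    if name.endswith("_peaks.txt.gz"):
        return "MACS"
    return "unknown"


files = [
    "GSM798383_SLX-1201.250.s_4_SLX-1202.250.s_1_peaks.txt.gz",
    "GSM798383_SLX-1201.250.s_4_SLX-1202.250.s_1_sw.peaks.txt.gz",
]
labels = {name: peak_caller_from_filename(name) for name in files}
for name, caller in labels.items():
    print(caller, name)
```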

To use DiffBind on this dataset, you would want to download and extract the raw reads from the "SRA Experiment" link for each sample. You'd need to re-align the data and re-call peaks (as the included peaks are for an older version of the reference genome). The Input controls are included among the 62 total samples, and each has an "SRA Experiment" link as well.

-Rory
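
Once the reads have been re-aligned and peaks re-called, DiffBind is usually driven by a sample sheet that lists, for each sample, its reads, its control, and its peak file. The Python sketch below only writes such a sheet as CSV. The column names follow the DiffBind sample sheet convention, but every path, the condition labels, the control IDs, and the `sample_row` helper are hypothetical placeholders, and the `PeakCaller` value should be checked against the DiffBind documentation for the peak format actually produced.

```python
import csv
import io

# Subset of the DiffBind sample sheet columns relevant here.
COLUMNS = ["SampleID", "Condition", "Replicate", "bamReads",
           "ControlID", "bamControl", "Peaks", "PeakCaller"]


def sample_row(sample_id, condition, replicate, control_id):
    """Build one sample sheet row; all paths are hypothetical placeholders."""
    return {
        "SampleID": sample_id,
        "Condition": condition,
        "Replicate": replicate,
        "bamReads": f"bam/{sample_id}.bam",        # re-aligned ChIP reads
        "ControlID": control_id,
        "bamControl": f"bam/{control_id}.bam",     # re-aligned Input control
        "Peaks": f"peaks/{sample_id}_peaks.bed",   # re-called peaks
        "PeakCaller": "bed",                       # assumed peak file format
    }


rows = [
    sample_row("ours_rep1", "ours", 1, "ours_input"),
    sample_row("GSM798383", "published", 1, "published_input"),
]

# Write to an in-memory buffer; swap in open("samples.csv", "w", newline="")
# to write a real file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)
sample_sheet_csv = buffer.getvalue()
print(sample_sheet_csv)
```

The resulting file would then be read on the R side by DiffBind. The point is that each ChIP sample is paired with its own Input control, which is why the control reads need to be downloaded and aligned as well.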

 
