Question: Path to file in Windows
8 days ago, jorrenkuster wrote:

Hi all,

I have a question about specifying the path to a file on Windows in R.

Currently I am using this path for a BAM file when counting reads with DiffBind:

bamReads = "U:/R/win-library/3.0/Data_ChIP_Experiment/aligned_S00D47H1-RUNX1.filtered.bam"

This does not work at the moment. The error is: Some read files could not be accessed. See warnings for details.

Does anyone have a suggestion?

Greetings, Jorren

written 8 days ago by jorrenkuster

Hi,

Does it warn about all files, or just some of them?  (The warning messages should tell you which files it can't find.)

If you post your sample sheet (or email it to me, email on the DiffBind home page) I can take a look to see if there are possible problems.

 - Gord

written 8 days ago by Gord Brown

Hi Gord,

It warns about all the files I want to use. I will post it there soon, thanks.

Jorren

written 8 days ago by jorrenkuster

If you copy the cell containing the file path from the sample sheet, and, in R, try

file.exists("...paste filename here...")

pasting the exact string from the sample sheet instead of ...paste filename here... can R find the file?  The idea is to test whether it's your path, some quirk of R, or a problem with DiffBind (entirely possible!).  (I don't have a Windows machine accessible, so can't easily test for Windows-related quirks.)
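A minimal sketch of that check in base R, using the path from the original question as the example input (substitute the exact string from your own sample sheet). The helper name check_path is hypothetical and not part of DiffBind; it only reports whether R can see the file and what the path resolves to.

```r
# Hypothetical helper (not part of DiffBind): report whether R can see a
# path and what it resolves to from the current working directory.
check_path <- function(path) {
  data.frame(
    path     = path,
    exists   = file.exists(path),
    # normalizePath() shows the absolute path R would use; mustWork = FALSE
    # avoids an error or warning when the file is missing.
    resolved = normalizePath(path, winslash = "/", mustWork = FALSE),
    stringsAsFactors = FALSE
  )
}

getwd()  # relative paths are resolved against this directory
check_path("U:/R/win-library/3.0/Data_ChIP_Experiment/aligned_S00D47H1-RUNX1.filtered.bam")
```

If exists is FALSE for the exact string in the sample sheet, the problem is probably the path itself rather than DiffBind.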

written 8 days ago by Gord Brown

If I run the command the output is as follows:

> file.exists("aligned_S00D47h1-RUNX1.filtered.bam")
[1] TRUE

 

written 8 days ago by jorrenkuster

Your original question suggested that the sample sheet had an absolute path for the files (U:/R/win-library/3.0/Data_ChIP_Experiment/aligned_S00D47H1-RUNX1.filtered.bam), but your example above is just the bare filename.  Whatever is in the sample sheet has to be either an absolute path (U:/R/win-library/...), or a path that identifies the file relative to the current directory when running DiffBind.  So... running R in the directory that you use when running DiffBind, try file.exists with the exact contents of the cell.  Do you have all the files in the same directory that you're running DiffBind in?  If so just the filename(s) in the sample sheet should be fine (without the full path).

We typically run DiffBind in a directory with the sample sheet, all the peaks in a subdirectory named "peaks", and all the BAM files in a subdirectory named "bam".  So the sample sheet contains paths like "bam/myfile.bam", relative to the directory R is running in.
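The layout described above can be checked in one pass before calling DiffBind. This is only a sketch, assuming the sample sheet has been read into a data frame with bamReads and Peaks columns holding paths relative to the directory R is running in. The helper name missing_files and the file name "samplesheet.csv" are assumptions for illustration, not part of DiffBind.

```r
# Hypothetical helper (not part of DiffBind): list every path in the given
# sample sheet columns that R cannot find from the directory `base_dir`.
missing_files <- function(sheet, base_dir = getwd(),
                          columns = c("bamReads", "Peaks")) {
  columns <- intersect(columns, names(sheet))
  paths <- unlist(lapply(sheet[columns], as.character), use.names = FALSE)
  paths <- paths[!is.na(paths) & nzchar(paths)]  # skip empty cells
  # Assumes the sheet holds paths relative to base_dir; absolute paths
  # should be tested directly with file.exists() instead.
  paths[!file.exists(file.path(base_dir, paths))]
}

# Usage, assuming the sheet is saved as "samplesheet.csv" (assumed name)
# in the directory R is running in:
# sheet <- read.csv("samplesheet.csv", stringsAsFactors = FALSE)
# missing_files(sheet)  # an empty result means every file was found
```

Any path it returns is one that file.exists cannot see from that directory, which is worth fixing before looking for a DiffBind problem.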

written 8 days ago by Gord Brown

This is the sample sheet I am using.

The sample sheet is located in the directory where I run DiffBind.

SampleID,Tissue,Factor,Condition,Replicate,bamReads,bamControl,Peaks,PeakCaller,Counts
S00D63H1,iPSC,RUNX1,DOX,1,bam/aligned_S00D63H1.filtered.bam,,peaks/aligned_S00D63H1.filtered.bam.calledpeaks_peaks.bed,macs,counts/aligned_S00D63H1.filtered.bam.count
S00D47H1-RUNX1,iPSC,RUNX1,DOX,1,bam/aligned_S00D47H1-RUNX1.filtered.bam,,peaks/aligned_S00D47H1-RUNX1.filtered.bam.calledpeaks_peaks.bed,macs,counts/aligned_S00D47H1-RUNX1.filtered.bam.count
S00D55H1-2006090-RUNX1,iPSC,RUNX1,DOX,1,bam/aligned_S00D55H1-2006090-RUNX1.filtered.bam,,peaks/aligned_S00D55H1-2006090-RUNX1.filtered.bam.calledpeaks_peaks.bed,macs,counts/aligned_S00D55H1-2006090-RUNX1.filtered.bam.count
S00XUNH1-RUN,iPSC,RUNX1,DOX,1,bam/aligned_S00XUNH1-RUNX1.filtered.bam,,peaks/aligned_S00XUNH1-RUNX1.filtered.bam.calledpeaks_peaks.bed,macs,counts/aligned_S00XUNH1-RUNX1.filtered.bam.count
SRR1536791,CD34+,RUNX1,,1,bam/aligned_SRR1536791.filtered.bam,,peaks/aligned_SRR1536791.filtered.bam.calledpeaks_peaks.bed,macs,counts/aligned_SRR1536791.filtered.bam.count
SRR772111,CD34+,RUNX1,,1,bam/aligned_SRR772111.filtered.bam,,peaks/aligned_SRR772111.filtered.bam.calledpeaks_peaks.bed,macs,counts/aligned_SRR772111.filtered.bam.count
written 7 days ago by jorrenkuster

A couple of points that may be relevant (or may not, but need to be fixed):

1) The Condition column should have a value for every row, even if the value is "Control".

2) You have the peak caller listed as "macs" but the peaks have a ".bed" suffix.  If you want to use bed files, set the PeakCaller to be "bed", or add a  "PeakFormat" column with "bed" as the value for each row.  MACS also produces a file with a ".xls" suffix (though it isn't an Excel file); that's the one that DiffBind is expecting if the PeakCaller is "macs". 

3) You probably don't want to set the "Counts" parameter; DiffBind is designed to do its own counting from the supplied BAM files.

Beyond that, please confirm that with R running in the same directory as the sample sheet,

file.exists("bam/aligned_S00D63H1.filtered.bam")

file.exists("peaks/aligned_S00D63H1.filtered.bam.calledpeaks_peaks.bed")

both return TRUE.  (Your previous example didn't include the "bam/" or "peaks/" path component.)
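Points 1 and 2 above can also be checked mechanically. The following is a rough sketch, assuming the sample sheet is available as a data frame with SampleID, Condition, Peaks and PeakCaller columns. The helper name check_sheet is hypothetical and not part of DiffBind, and it encodes only the two rules stated above (a blank Condition, and PeakCaller "macs" paired with a ".bed" file).

```r
# Hypothetical helper (not part of DiffBind): flag the two sample sheet
# problems described above.
check_sheet <- function(sheet) {
  condition <- as.character(sheet$Condition)
  blank_condition <- is.na(condition) | !nzchar(trimws(condition))
  # PeakCaller "macs" with a ".bed" peak file is the mismatch noted above.
  macs_with_bed <- tolower(sheet$PeakCaller) == "macs" &
    grepl("\\.bed$", sheet$Peaks, ignore.case = TRUE)
  data.frame(SampleID = sheet$SampleID,
             blank_condition = blank_condition,
             macs_with_bed = macs_with_bed,
             stringsAsFactors = FALSE)
}

# Two rows taken from the sample sheet in this thread
sheet <- data.frame(
  SampleID   = c("S00D63H1", "SRR1536791"),
  Condition  = c("DOX", ""),
  Peaks      = c("peaks/aligned_S00D63H1.filtered.bam.calledpeaks_peaks.bed",
                 "peaks/aligned_SRR1536791.filtered.bam.calledpeaks_peaks.bed"),
  PeakCaller = c("macs", "macs"),
  stringsAsFactors = FALSE
)
check_sheet(sheet)
```

Rows flagged in either column are the ones to edit in the sample sheet before loading it again.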

written 7 days ago by Gord Brown