jorrenkuster
Hi all,
I have a question about specifying a file path in R on Windows.
Currently I am using this path for a BAM file when counting reads with DiffBind:
bamReads = "U:/R/win-library/3.0/Data_ChIP_Experiment/aligned_S00D47H1-RUNX1.filtered.bam"
This does not work at the moment. The error is: Some read files could not be accessed. See warnings for details.
Does anyone have a suggestion?
Greetings, Jorren
Hi,
Does it warn about all files, or just some of them? (The warning messages should tell you which files it can't find.)
If you post your sample sheet (or email it to me, email on the DiffBind home page) I can take a look to see if there are possible problems.
- Gord
Hi Gord,
It warns about all the files I want to use. I will post it there soon, thanks.
Jorren
If you copy the cell containing the file path from the sample sheet and, in R, try
file.exists("...paste filename here...")
pasting the exact string from the sample sheet in place of "...paste filename here...", can R find the file? The idea is to test whether it's your path, some quirk of R, or a problem with DiffBind (entirely possible!). (I don't have a Windows machine accessible, so I can't easily test for Windows-related quirks.)
If I run the command, the output is as follows:
> file.exists("aligned_S00D47h1-RUNX1.filtered.bam")
[1] TRUE
Your original question suggested that the sample sheet had an absolute path for the files (U:/R/win-library/3.0/Data_ChIP_Experiment/aligned_S00D47H1-RUNX1.filtered.bam), but your example above is just the bare filename. Whatever is in the sample sheet has to be either an absolute path (U:/R/win-library/...), or a path that identifies the file relative to the current directory when running DiffBind. So, running R in the directory that you use when running DiffBind, try file.exists with the exact contents of the cell. Do you have all the files in the same directory that you're running DiffBind in? If so, just the filename(s) in the sample sheet should be fine (without the full path).
We typically run DiffBind in a directory with the sample sheet, all the peaks in a subdirectory named "peaks", and all the BAM files in a subdirectory named "bam". So the sample sheet contains paths like "bam/myfile.bam", relative to the directory R is running in.
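The difference between a bare filename, a relative path, and an absolute path can be seen in a small throwaway example. This is only a sketch: the directory name diffbind_demo and the file myfile.bam are hypothetical, and the empty file stands in for a real BAM file.

```r
# Build a throwaway project layout in a temporary directory (hypothetical names).
project <- file.path(tempdir(), "diffbind_demo")
dir.create(file.path(project, "bam"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(project, "bam", "myfile.bam"))

# Run R "in" the project directory, remembering where we came from.
old_wd <- setwd(project)

# A relative path is resolved against the current working directory.
rel_ok <- file.exists("bam/myfile.bam")

# The bare filename is looked up in the working directory itself,
# not in the bam/ subdirectory.
bare_ok <- file.exists("myfile.bam")

# Convert the relative path to an absolute one while we are still here.
abs_path <- normalizePath("bam/myfile.bam", winslash = "/")

# After leaving the project directory, only the absolute path still resolves.
setwd(old_wd)
abs_ok <- file.exists(abs_path)

print(c(relative = rel_ok, bare = bare_ok, absolute = abs_ok))
```

The same logic applies to the cells of the sample sheet: whatever string is in the cell is resolved from the directory R is running in, unless it is an absolute path.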
This is the sample sheet I am using.
The sample sheet is located in the directory where I run DiffBind.
A couple of points that may be relevant (or may not, but need to be fixed):
1) The Condition column should have a value for every row, even if the value is "Control".
2) You have the peak caller listed as "macs" but the peaks have a ".bed" suffix. If you want to use bed files, set the PeakCaller to be "bed", or add a "PeakFormat" column with "bed" as the value for each row. MACS also produces a file with a ".xls" suffix (though it isn't an Excel file); that's the one that DiffBind is expecting if the PeakCaller is "macs".
3) You probably don't want to set the "Counts" parameter; DiffBind is designed to do its own counting from the supplied BAM files.
Beyond that, please confirm that, with R running in the same directory as the sample sheet, file.exists called on a BAM path and on a peak path, each copied exactly from the sample sheet, both return TRUE. (Your previous example didn't include the "bam/" or "peaks/" path component.)
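The first two points can be checked mechanically before handing the sheet to DiffBind. The following is a rough sketch on a made-up sheet, not DiffBind's own validation: the sample IDs, file names, and the simple suffix rule are assumptions for illustration, and the column names SampleID and Peaks are assumed to match the sheet in use.

```r
# A made-up sample sheet with the kinds of problems described above.
sheet <- data.frame(
  SampleID   = c("s1", "s2", "s3"),
  Condition  = c("RUNX1", "", NA),
  Peaks      = c("peaks/s1.bed", "peaks/s2_peaks.xls", "peaks/s3.bed"),
  PeakCaller = c("macs", "macs", "bed"),
  stringsAsFactors = FALSE
)

# Point 1: rows whose Condition is missing or empty.
missing_condition <- is.na(sheet$Condition) | !nzchar(sheet$Condition)

# Point 2: rows that declare "macs" but point at a file with a .bed suffix.
is_bed_file <- grepl("\\.bed$", sheet$Peaks, ignore.case = TRUE)
caller_mismatch <- sheet$PeakCaller == "macs" & is_bed_file

# Show the offending rows, if any.
print(sheet[missing_condition, c("SampleID", "Condition")])
print(sheet[caller_mismatch, c("SampleID", "Peaks", "PeakCaller")])
```

With a real sheet, the data.frame call would be replaced by read.csv on the sample sheet file.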
I applied all of your recommendations, and both files return TRUE.
Unfortunately, the error remains the same.
I don't know what to suggest. Either your current working directory isn't what you think it is when you run R, or there is some mismatch between the paths in your sample sheet and the actual paths on disk. There is no other possibility that I can think of.
In an R session, type the following commands, and post the *complete* script, including the exact commands you typed and the exact output (*all* of it):
Otherwise perhaps you can find a local person who can give you some help.
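Both possibilities mentioned above (an unexpected working directory, or a mismatch between the sheet and the disk) can be examined in one table. This is a minimal sketch: diagnose_paths is a hypothetical helper, not part of DiffBind, and the example paths are invented.

```r
# Report, for each path copied from the sample sheet, whether it resolves
# from the current working directory. `paths` is a character vector.
diagnose_paths <- function(paths) {
  data.frame(
    path = paths,
    exists = file.exists(paths),
    # Leading or trailing whitespace is easy to miss in a spreadsheet cell.
    padded = grepl("^[[:space:]]|[[:space:]]$", paths),
    # Where R would look for the file; left unchanged if it cannot be resolved.
    resolved = normalizePath(paths, winslash = "/", mustWork = FALSE),
    stringsAsFactors = FALSE
  )
}

# Throwaway layout with one empty stand-in for a BAM file (hypothetical names).
project <- file.path(tempdir(), "diffbind_paths_demo")
dir.create(file.path(project, "bam"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(project, "bam", "sample1.bam"))

old_wd <- setwd(project)
cat("Working directory:", getwd(), "\n")
report <- diagnose_paths(c("bam/sample1.bam", "bam/sample1.bam ", "sample1.bam"))
print(report)
setwd(old_wd)
```

Any row with exists equal to FALSE, or padded equal to TRUE, is worth comparing character by character against the output of list.files.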
> list.files()
[1] "bam" "Examplesheet.csv" "Patient.csv" "peaks"
> list.files(path='bam')
[1] "aligned_S00D47H1-RUNX1.filtered.bam" "aligned_S00D55H1.filtered.bam" "aligned_S00D63H1.filtered.bam" "aligned_S00XUNH1-RUNX1.filtered.bam"
[5] "aligned_SRR1536791.filtered.bam" "aligned_SRR772111.filtered.bam"
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] DiffBind_1.8.5 GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7 BiocGenerics_0.8.0
loaded via a namespace (and not attached):
[1] amap_0.8-14 bitops_1.0-6 caTools_1.17.1 edgeR_3.4.2 gdata_2.13.3 gplots_2.16.0 gtools_3.4.2 KernSmooth_2.23-10
[9] limma_3.18.13 RColorBrewer_1.1-2 stats4_3.0.2 tools_3.0.2 zlibbioc_1.8.0
Hi,
Well, the paths look right. Your sessionInfo shows that your versions of R and DiffBind are close to 4 years out of date, though. Could you upgrade to R 3.3.3 or 3.4.0 and try again? We can't support versions of DiffBind that old.
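A quick way to see whether an installation is at least as new as the R version suggested here is sketched below. Treating 3.3.3 as a minimum is an assumption taken from this post, not an official requirement, and the DiffBind check is skipped when the package is not installed.

```r
# R version suggested in the thread, treated here as a minimum.
minimum_r <- "3.3.3"
r_is_recent <- getRversion() >= minimum_r
cat("R version:", as.character(getRversion()),
    "- at least", minimum_r, ":", r_is_recent, "\n")

# Only query DiffBind if it is installed, so the snippet runs anywhere.
if (requireNamespace("DiffBind", quietly = TRUE)) {
  cat("DiffBind version:", as.character(packageVersion("DiffBind")), "\n")
} else {
  cat("DiffBind is not installed in this library.\n")
}
```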
It works now, thanks a lot for your help
My guess would be the network path "//home2...."; you could try to map this (I googled for 'windows map network drive') to a standard Windows drive letter and use that.
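To find out whether any path in a sample sheet is a network (UNC) path of this kind rather than a drive-letter path, a check along these lines may help. It is only a sketch: is_unc_path is a hypothetical helper and the server and share names are invented.

```r
# TRUE for paths that start with two slashes or two backslashes,
# which is how Windows network (UNC) paths begin.
is_unc_path <- function(paths) {
  substr(paths, 1, 2) %in% c("//", "\\\\")
}

# Invented examples: two network paths and one mapped-drive path.
example_paths <- c(
  "//server/share/bam/sample1.bam",
  "\\\\server\\share\\bam\\sample1.bam",
  "U:/R/bam/sample1.bam"
)
unc_flags <- is_unc_path(example_paths)
print(data.frame(path = example_paths, unc = unc_flags))
```

Paths flagged as UNC are the ones that would change once the share is mapped to a drive letter.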