Question: txi <- tximport(files, type="salmon", tx2gene=tx2gene) Error in tximport(files, type = "salmon", tx2gene = tx2gene) : all(file.exists(files)) is not TRUE
7 weeks ago by
Leo K0
Leo K0 wrote:

Hello,

I am struggling to solve the following problem while trying to analyze my RNASeq data with DESeq2.

When trying to import files via tximport, I get the following error:

> txi <- tximport(files, type="salmon", tx2gene=tx2gene)
Error in tximport(files, type = "salmon", tx2gene = tx2gene) :
all(file.exists(files)) is not TRUE


Having read earlier posts by others, I checked my file path and directory; both are correct.

- all(file.exists(files)) returns [1] FALSE.
- all(file.exists("/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon")) returns [1] TRUE.
- file.exists(files) on its own returns [1] FALSE FALSE FALSE FALSE FALSE FALSE.
- file.exists("/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon") returns [1] TRUE.

When I run the command "txi <- tximport(files, type="salmon", tx2gene=tx2gene)" with the file path ("/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon") in place of files, I receive: Error in guessheader(datasource, tokenizer, locale) : Cannot read file /Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon: Invalid argument

The analysis had been working fine for me until now, and I have not changed or deleted anything. After the error kept occurring, I updated R, RStudio, Bioconductor and the packages, including tximport. The error remains.

Leo

modified 6 weeks ago • written 7 weeks ago by Leo K0

When I type in "files", it gives me the correct files:

> files
sample1 "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon/xx1/quant.sf"
sample2 "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon/xx2/quant.sf"


What do you mean by 'correct' in that context? Certainly not 'existing'?

> dirs <- dir(paste0(path.package("tximportData"), "/extdata/salmon/"), full.names = TRUE)
> dirs
[1] "C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188021"
[2] "C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188088"
[3] "C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188288"
[4] "C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188297"
[5] "C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188329"
[6] "C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188356"
> sapply(dirs, dir, pattern = "sf")
C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188021
"quant.sf.gz"
C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188088
"quant.sf.gz"
C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188288
"quant.sf.gz"
C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188297
"quant.sf.gz"
C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188329
"quant.sf.gz"
C:/Users/jmacdon/AppData/Roaming/R/win-library/3.5/tximportData/extdata/salmon/ERR188356
"quant.sf.gz"



There are not (in my version of tximportData) any subdirectories in the extdata/salmon directory called xx1, xx2, etc., nor are there any quant.sf files in those nonexistent directories. There are six ERRXXXXXX directories, each of which has a quant.sf.gz file, but you aren't pointing to those.

Thank you for your quick reply and your help! I deleted the ERRXXXXXX directories and replaced them with the xx1, xx2, xx3, etc. directories (renamed; in this case "G2_NULL_6h_IIRep_S9_R1_001.fastq.gz“quant“"), which contain quant.sf files.

> dirs <- dir(paste0(path.package("tximportData"), "/extdata/salmon/"), full.names = TRUE)
> dirs
[1] "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon//G2_CT_6h_IIIRep_S11_R1_001.fastq.gz“quant“"
[2] "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon//G2_CT_6h_IIRep_S4_R1_001.fastq.gz“quant“"
[3] "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon//G2_CT_6h_IRep_S1_R1_001.fastq.gz“quant“"
[4] "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon//G2_NULL_6h_IIIRep_S6_R1_001.fastq.gz“quant“"
[5] "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon//G2_NULL_6h_IIRep_S9_R1_001.fastq.gz“quant“"
[6] "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon//G2_NULL_6h_IRep_S2_R1_001.fastq.gz“quant“"

> sapply(dirs, dir, pattern = "sf")
Error in get(as.character(FUN), mode = "function", envir = envir) :


After running "dirs" I noticed that it prints a double slash (//) just before the replaced directory. Might this be the error? Or do I explicitly need quant.sf.gz files? I have been using quant.sf until now and it worked fine.

Add-on: I get the same error when I reinstall tximportData and run your commands with the preinstalled directories and files. The double slash (//) is shown as well.

modified 7 weeks ago by Martin Morgan ♦♦ 23k • written 7 weeks ago by Leo K0

On whether to use quant.sf or quant.sf.gz, you should use the name of the file that is on your machine.

At some point I updated tximportData to gzip the quant files. You don't have to do this with your own data.

If you want to look up whether the files are gzipped or not, or if they exist at a certain location on your machine, use list.files():
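A minimal sketch of such a check, run against a throwaway directory rather than real data. The `salmon_dir` layout and the sample names are made up for illustration; substitute the directory that holds your own salmon output.

```r
# Hypothetical layout: one sub-directory per sample, each holding a quant file.
salmon_dir <- file.path(tempdir(), "salmon_demo")
dir.create(file.path(salmon_dir, "sampleA"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(salmon_dir, "sampleB"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(salmon_dir, "sampleA", "quant.sf"))
file.create(file.path(salmon_dir, "sampleB", "quant.sf.gz"))

# List every quant file below salmon_dir, gzipped or not.
# The pattern is matched against the file name, not the whole path.
quant_files <- list.files(salmon_dir, pattern = "^quant\\.sf(\\.gz)?$",
                          recursive = TRUE, full.names = TRUE)
print(quant_files)
```

Whatever names come back (quant.sf or quant.sf.gz) are the names to use when building the vector passed to tximport.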

1. Don't put things in your R library directory like that. You are only asking to totally bork the underlying package. Files you are working on should go in a normal 'working' directory, not special directories that R set up as part of an installation.
2. The extra // won't affect things.
3. Why do your file names include something like "quant" at the end? Can you really have weird smart quotes like that on MacOS?
4. I don't know why you get that error, but do note that what you are doing with that sapply call doesn't make sense. Your dirs object is a character vector of file locations, and your sapply call is asking 'what files with "sf" in their name are in these directories?', which doesn't make any sense, given that those aren't directories. What you need to pass to tximport is a character vector of file names (including the path if they are not in the same directory, which +/- they should be).
5. In addition, the sapply call doesn't make any sense, because none of the files you are looking for have sf in the file name. The pattern argument to dir is intended to restrict the returned values to those files that match the pattern. If you don't have any files with sf in the name, you by definition will not get any file names returned.
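Points 1 and 4 can be sketched as follows: keep the salmon output in an ordinary project directory and build a named character vector of full paths to the quant files. The directory `base_dir` and the sample names are hypothetical, and the sketch creates empty stand-in files only so that it runs on its own.

```r
# Hypothetical project directory outside the R library; replace with your own.
base_dir <- file.path(tempdir(), "my_project", "salmon")
samples <- c("sample1", "sample2")   # made-up sample names

# Create stand-in quant.sf files so the sketch is self-contained.
for (s in samples) {
  dir.create(file.path(base_dir, s), recursive = TRUE, showWarnings = FALSE)
  file.create(file.path(base_dir, s, "quant.sf"))
}

# tximport wants paths to the quant files themselves, not to their directories.
files <- file.path(base_dir, samples, "quant.sf")
names(files) <- samples

# The condition from the error message in the question, checked up front.
stopifnot(all(file.exists(files)))

# With real salmon output and a tx2gene table you would then call:
# txi <- tximport::tximport(files, type = "salmon", tx2gene = tx2gene)
```

Because file.path() is vectorised over `samples`, adding a sample only means adding its name to that vector.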

I should also note that tximport is intended to read in files that have been generated by software like salmon, or kallisto, etc. The files you are looking to import are FASTQ files which are inputs for salmon, etc, not outputs.

There are so many things here that are wrong that it is hard to know where to start. At the bare minimum you need to peruse the tximport vignette, and you would probably be better served by finding someone local to help you, because you may be irretrievably lost.

Thank you very much for your detailed replies and help! Perhaps I should have explained the initial situation more extensively to prevent misunderstanding.

1. The files I have been using, and am currently using, are quant.sf files generated by salmon. The "quant" at the end of the directory name is an internal designation of ours, so that we know the file has already been processed by salmon. I removed the file name for this post.

2. Concerning "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/tximportData/extdata/salmon//G2_CT_6h_IIIRep_S11_R1_001.fastq.gz“quant“": those are directories, each containing my quant.sf file. I replaced the initially installed "ERR188..." directories with the above-mentioned directories, named with our own designation "G2_CT_6h_IIIRep_S11_R1_001.fastq.gz“quant“". Unfortunately I did not copy the name of the file; I copied only the directory, which is identical to the full file path minus the trailing quant.sf. My apologies if this led to misunderstanding.

As I mentioned above, importing the files via tximportData by replacing the ERR... directories had been working perfectly fine for me until now, and I was able to run the full RNA-seq procedure. I'll keep exploring where the mistake is and keep you updated.

Thank you, and sorry if any misunderstandings occurred.

Leo

Unless you are actually answering a question, please don't use the Add Answer box, because you aren't answering a question. As to your original question, please note that the help page for tximport says this:

files: a character vector of filenames for the transcript-level
abundances


And you are passing in a bunch of directory names, which are not the same thing.
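The distinction also explains the check in the question that returned TRUE: file.exists() reports TRUE for a directory as well as for a file. A small sketch, using a temporary directory and an empty stand-in file (both made up for illustration), shows how dir.exists() separates the two:

```r
# Temporary stand-ins: a directory and a file inside it.
d <- file.path(tempdir(), "dir_vs_file_demo")
dir.create(d, showWarnings = FALSE)
f <- file.path(d, "quant.sf")
file.create(f)

paths <- c(directory = d, file = f)

exists_at_all <- file.exists(paths)   # TRUE for both entries
is_directory  <- dir.exists(paths)    # TRUE only for the directory

# A path is usable as a tximport input only if it exists and is not a directory.
is_regular_file <- exists_at_all & !is_directory
print(is_regular_file)
```

So a TRUE from file.exists() on the salmon directory says nothing about whether the quant.sf paths inside it are right.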

Answer: txi <- tximport(files, type="salmon", tx2gene=tx2gene) Error in tximport(files,
7 weeks ago by
United States
James W. MacDonald 49k wrote:

What you are showing is

1. The files you are pointing to don't exist.
2. Files that are part of the tximportData package do exist.

These seem orthogonal to me. You state that when you check your file path and directory, both are correct, but you have good evidence that this is not true, because R says the files aren't there. So you need to go back and ensure that the files argument actually points to existing files (which I can assure you at this point it does not).
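One way to narrow this down is to ask R which entries of files are missing and then look at what is actually on disk near them. The sketch below uses a made-up two-element vector, one path that exists and one that does not; with real data you would use your own files vector.

```r
# Made-up example: sample1 exists, sample2 points into a directory that is absent.
existing <- tempfile(fileext = ".sf")
file.create(existing)
files <- c(sample1 = existing,
           sample2 = file.path(tempdir(), "no_such_sample", "quant.sf"))

# Which paths does R fail to find?
missing <- files[!file.exists(files)]
print(missing)

# Is the parent directory there at all, and what does the level above contain?
parent_ok <- dir.exists(dirname(missing))
print(parent_ok)
print(list.files(dirname(dirname(missing))))
```

Comparing the listing with the paths in `missing` usually shows the mismatch, such as a renamed directory or a quant.sf that is really quant.sf.gz.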