Question: Tximport with RSEM error
hs.lansdell10 wrote:

Hello! I am new to both RSEM and tximport, so I am unsure which of them is causing the issue. All of my RNA-seq samples are in separate SampleName.rsem.genes.results files in one directory. Based on the vignette, it almost seems like all the samples should be in one genes.results file? Either way, I attempted to gather the files and read them in with the code below:

library(tximport)

rsem.files=list.files(".","*.genes.results")

txi.rsem=tximport(rsem.files, type = "rsem")

And got the following error:

reading in files with read.delim (install 'readr' package for speed up)
1 Error: all(c(geneIdCol, abundanceCol, lengthCol) %in% names(raw)) is not TRUE

Thank you for any feedback!

modified 26 days ago by k.vitting.seerup10 • written 26 days ago by hs.lansdell10

k.vitting.seerup10 wrote:

I think what you need to supply to tximport is a vector of strings pointing to each of the isoform expression files you want to import.

If you want the gene expression, you should use the tx2gene option in the tximport() function (letting tximport sum up the gene expression).

Try running the example from the tximport vignette and see what the input needs to be.

modified 26 days ago • written 26 days ago by k.vitting.seerup10
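
A minimal sketch of building that vector of paths, for the case where every sample has its own .genes.results file in one directory. The directory and sample names here are made up for illustration, and the empty placeholder files exist only so the snippet runs on its own.

```r
# Hypothetical directory with one RSEM output file per sample
rsem_dir <- file.path(tempdir(), "rsem_demo")
dir.create(rsem_dir, showWarnings = FALSE)
file.create(file.path(rsem_dir, c("sampleA.rsem.genes.results",
                                  "sampleB.rsem.genes.results")))

# 'pattern' is a regular expression, not a shell glob, so escape the dots
# and anchor the match at the end of the file name
files <- list.files(rsem_dir, pattern = "\\.genes\\.results$",
                    full.names = TRUE)

# Name each path after its sample so the samples can be told apart later
names(files) <- sub("\\.rsem\\.genes\\.results$", "", basename(files))

print(files)
```

A vector like this is what would then be passed on, e.g. tximport(files, type = "rsem").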

Thanks for your thoughts. I started at the RSEM section and missed the step that builds the vector. It would be helpful to see the example inputs, so I tried running the vignette examples, but I cannot locate the 'extdata' directory anywhere within the package. When I run the command:

dir <- system.file("extdata", package = "tximportData")
list.files(dir)

I get:

character(0)

written 25 days ago by hs.lansdell10
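
One possible explanation for the empty result, offered as an assumption rather than a diagnosis: system.file() returns an empty string when the package cannot be found, and listing an empty path gives character(0). tximportData is a separate data package from tximport, so a quick check might look like this:

```r
dir <- system.file("extdata", package = "tximportData")

if (!nzchar(dir)) {
  # Empty string: tximportData was not found in the current library
  message("tximportData not found; it is installed separately from tximport")
} else {
  print(list.files(dir))
}
```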

Note this part of the sentence in the vignette when we talk about building the 'files' vector:

"First, we locate the directory containing the files. (Here we use system.file to locate the package directory, but for a typical use, we would just provide a path, e.g. "/path/to/dir".)"

written 25 days ago by Michael Love14k

I understand that and have done it. What I'm trying to understand now is what the samples.txt file should look like.

written 25 days ago by hs.lansdell10

Ok, one thing I'm not sure about: right now, are you trying to work with your own RSEM files, or re-run the vignette code?

tximport() can import the 'genes.results' files for RSEM by setting type="rsem".

If you want RSEM quantifications at the transcript level, you can set the columns below manually. We didn't implement a transcript-level preset for RSEM, just the gene-level import.

txi <- tximport(files, txOut=TRUE,
  txIdCol="transcript_id", abundanceCol="FPKM",
  countsCol="expected_count", lengthCol="effective_length")

 

written 25 days ago by Michael Love14k

I was trying to run the vignette code just to make sure I had formatted my own files correctly. I want to import my own genes.results files. I'm trying to determine whether a samples.txt file is needed for that, as the vignette states here:

RSEM sample.genes.results files can be imported by setting type to "rsem".

files <- file.path(dir, "rsem", samples$run, paste0(samples$run, ".genes.results"))

I thought the samples variable was made from a .txt file, as seen earlier in the vignette:

samples <- read.table(file.path(dir, "samples.txt"), header = TRUE)

Do I need a samples.txt file to run tximport on genes.results files? I tried just running the following:

txi.rsem=tximport(files, type = "rsem")

Here, files holds the paths to all my genes.results files.

I get the following error:

reading in files with read.delim (install 'readr' package for speed up)
1 Error: all(c(geneIdCol, abundanceCol, lengthCol) %in% names(raw)) is not TRUE

 

modified 25 days ago • written 25 days ago by hs.lansdell10
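
For what it is worth, the vignette line quoted above only uses samples$run to paste together file paths, so the sample table looks like a convenience for building 'files' rather than something tximport() reads itself. A sketch, assuming a hypothetical two-sample table and directory layout:

```r
# Hypothetical sample table; in the vignette this comes from read.table()
samples <- data.frame(run = c("sampleA", "sampleB"),
                      condition = c("control", "treated"),
                      stringsAsFactors = FALSE)

# Placeholder base directory holding one sub-directory per run
dir <- "/path/to/dir"

# Same construction as the vignette line: one path per row of 'samples'
files <- file.path(dir, "rsem", samples$run,
                   paste0(samples$run, ".genes.results"))
names(files) <- samples$run

print(files)
# Confirm the paths point at real files before calling tximport()
print(file.exists(files))
```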

Maybe you can do a bit of debugging. tximport() is looking for these column names in each of the files.

    geneIdCol <- "gene_id"
    abundanceCol <- "FPKM"
    countsCol <- "expected_count"
    lengthCol <- "effective_length"

Can you confirm that all the files in 'files' have these columns?

written 25 days ago by Michael Love14k
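
One way to run that check, sketched with two small made-up files written to a temporary directory (one with a header line, one without). The helper name missing_cols is hypothetical; the required column names are the ones listed above.

```r
required <- c("gene_id", "FPKM", "expected_count", "effective_length")

# Two toy files standing in for real RSEM output
good <- tempfile(fileext = ".genes.results")
bad  <- tempfile(fileext = ".genes.results")
writeLines(c("gene_id\tlength\teffective_length\texpected_count\tTPM\tFPKM",
             "geneA\t1000\t800\t10\t5\t4"), good)
writeLines(c("geneA\t10\t0.5\ttxA1,txA2",
             "geneB\t3\t0.1\ttxB1"), bad)  # no header line
files <- c(good = good, bad = bad)

# Hypothetical helper: which required columns are absent from a file's header?
missing_cols <- function(f) {
  header <- names(read.delim(f, nrows = 1, check.names = FALSE))
  setdiff(required, header)
}

print(lapply(files, missing_cols))
```

Any file that reports missing columns would trip the same all(... %in% names(raw)) check shown in the error message.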

My files have no column names. From what I can tell, my file format seems to be gene_id, some#, some#, and transcript_id. I reached out to the person who processed the files. Turns out they used RSEM v1.1.13.... which has different columns. I'll see if I can import it manually.

Thanks!

modified 25 days ago • written 25 days ago by hs.lansdell10
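
A possible starting point for the manual route, assuming the files are tab-delimited with no header and four columns as described. The column names below are placeholders chosen for this sketch, not names taken from the RSEM documentation, and the toy file is made up. The check in the error message still expects ID, abundance, and length columns, so what each numeric column actually holds would need to be confirmed first.

```r
# Toy headerless file standing in for an old-format genes.results file
f <- tempfile(fileext = ".genes.results")
writeLines(c("geneA\t10\t0.5\ttxA1,txA2",
             "geneB\t3\t0.1\ttxB1"), f)

# Read without a header and attach placeholder names for inspection
raw <- read.delim(f, header = FALSE, stringsAsFactors = FALSE,
                  col.names = c("gene_id", "value_1", "value_2",
                                "transcript_id"))

str(raw)
```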