Tximport Error - transcripts in quant files are not the same as tx2gene
Kayleigh • 0

I am receiving the below error when running tximport for DEseq2:

Error in .local(object, ...) :
  None of the transcripts in the quantification files are present
  in the first column of tx2gene. Check to see that you are using
  the same annotation for both.

Below is the code I am using:

samples <- read.table("samples.txt", header=TRUE)
files <- file.path(samples$Sample, "quant.sf")
file.exists(files)
txdb <- makeTxDbFromGFF(file="gencode.v44.long_noncoding_RNAs.gtf.gz")
k <- keys(txdb, keytype = "TXNAME")
tx2gene <- select(txdb, k, "GENEID", "TXNAME")

txi <- tximport(files, type="salmon", tx2gene=tx2gene, ignoreTxVersion=TRUE)

I have already confirmed that the annotation is the same for both and have also run with ignoreTxVersion = TRUE. Unfortunately the quant files are formatted like below and I can't find a simple solution to fix the issue. I know I somehow need to edit the name column to remove everything after the first | but I have over 200 quant files and I can't do it manually very efficiently. Does anyone have any suggestions??

Quant file format: 
Name    Length  EffectiveLength TPM     NumReads
ENST00000456328.2|ENSG00000290825.1|-|OTTHUMT00000362751.1|DDX11L2-202|DDX11L2|1657|    1657    1473.695    30.839887       199.500
ENST00000473358.1|ENSG00000243485.5|OTTHUMG00000000959.2|OTTHUMT00000002840.1|MIR1302-2HG-202|MIR1302-2HG|712|      712     434.000 0.000000        0.000

tx2gene format: 
             TXNAME            GENEID
1 ENST00000456328.2 ENSG00000290825.1
2 ENST00000473358.1 ENSG00000243485.5

Thank you!!

salmon tximport deseq2 • 537 views
@james-w-macdonald-5106
ignoreAfterBar: logical, whether to split the tx id on the '|'
          character to facilitate matching with the tx id in 'tx2gene'
          (default FALSE). if 'txOut=TRUE' it will strip the text after
          '|' on the rownames of the matrices
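
To illustrate what this option does, here is a minimal base R sketch using the two transcript names and the tx2gene rows posted in the question. The variable names (quant_names, stripped, matched_before, matched_after) are made up for the illustration, and the sub() call only mimics the idea of keeping the text before the first '|'; it is not tximport's internal code.

```r
# Transcript names as they appear in the quant.sf files (copied from the question)
quant_names <- c(
  "ENST00000456328.2|ENSG00000290825.1|-|OTTHUMT00000362751.1|DDX11L2-202|DDX11L2|1657|",
  "ENST00000473358.1|ENSG00000243485.5|OTTHUMG00000000959.2|OTTHUMT00000002840.1|MIR1302-2HG-202|MIR1302-2HG|712|"
)

# tx2gene rows as shown in the question
tx2gene <- data.frame(
  TXNAME = c("ENST00000456328.2", "ENST00000473358.1"),
  GENEID = c("ENSG00000290825.1", "ENSG00000243485.5"),
  stringsAsFactors = FALSE
)

# Keep only the text before the first '|' (illustration only)
stripped <- sub("\\|.*", "", quant_names)

# The full names do not match tx2gene; the stripped names do
matched_before <- quant_names %in% tx2gene$TXNAME
matched_after <- stripped %in% tx2gene$TXNAME

# With the real files, the equivalent would be to let tximport do the split:
# txi <- tximport(files, type = "salmon", tx2gene = tx2gene, ignoreAfterBar = TRUE)
```

Since the stripped names keep their version suffix and so do the tx2gene entries, the match works without touching the quant files themselves.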

Also, you don't want to use ignoreTxVersion, since both your tx2gene and your quant files have versions!


Also, if this is GENCODE, they can just use tximeta, which does everything for you.

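library(tximeta)
# tximeta expects a data.frame with 'files' and 'names' columns;
# the sample names here are assumed to come from samples$Sample in the question
coldata <- data.frame(files=files, names=samples$Sample)
se <- tximeta(coldata)
gse <- summarizeToGene(se)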