tximport command error
eliztran • 0
@eliztran-20727
Last seen 2.4 years ago

I'm pretty new to RNA-seq, and I have been following this tutorial to get my RNA-seq data into a DESeq2-compatible form (https://bioconductor.github.io/BiocWorkshops/rna-seq-data-analysis-with-deseq2.html). I generated my .sf files with Salmon. However, I get this error when I call tximport:

txi command:

txi = tximport(files, type="salmon", tx2gene=tx2gene, ignoreTxVersion= TRUE, ignoreAfterBar=TRUE)


error message:

"Error in summarizeToGene(txi, tx2gene, varReduce, ignoreTxVersion, ignoreAfterBar, : None of the transcripts in the quantification files are present in the first column of tx2gene. Check to see that you are using the same annotation for both. Example IDs (file): [ebi, ...] Example IDs (tx2gene): [ENST00000456328.2, ENST00000450305.2, ENST00000473358.1, ...] This can sometimes (not always) be fixed using 'ignoreTxVersion' or 'ignoreAfterBar'."


I've looked at previously answered questions, and the most common suggestion was to put ignoreTxVersion=TRUE or ignoreAfterBar = TRUE. I have tried both, but I am still getting the same error. Not too sure what to do. Any suggestions?

tx2gene txi error • 735 views

You need to show the results from head(tx2gene) as well as readr::read_tsv(files[1]), which may already point out to you what the problem is.
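The comparison suggested above can be sketched in base R. This is only an illustration: `check_id_overlap` is a hypothetical helper name, not part of tximport, and it assumes the quantification file is a tab-separated Salmon quant.sf with transcript IDs in its first column. Stripping a trailing ".<digits>" suffix is similar in spirit to what ignoreTxVersion is meant to address, but it is not tximport's own code.

```r
# Hypothetical helper (not part of tximport): compare the transcript IDs in
# one quantification file with the first column of tx2gene.
check_id_overlap <- function(quant_file, tx2gene) {
  # Assumes a tab-separated file with a header row and IDs in column 1
  quant <- read.delim(quant_file, stringsAsFactors = FALSE)
  file_ids <- as.character(quant[[1]])
  map_ids <- as.character(tx2gene[[1]])

  # Drop a trailing ".<digits>" version suffix, e.g. ENST00000456328.2
  strip_version <- function(x) sub("\\.[0-9]+$", "", x)

  list(
    n_rows = nrow(quant),                       # rows in the quant file
    example_file_ids = head(file_ids, 3),       # what the file contains
    example_map_ids = head(map_ids, 3),         # what tx2gene contains
    n_exact = sum(file_ids %in% map_ids),       # IDs matching as-is
    n_unversioned = sum(strip_version(file_ids) %in% strip_version(map_ids))
  )
}

# Usage, assuming `files` and `tx2gene` are the objects passed to tximport:
# check_id_overlap(files[1], tx2gene)
```

If both match counts are zero, or the file has only a handful of rows, the problem is probably in the quantification file itself rather than in the tximport arguments.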


# A tibble: 6 x 2
TXNAME            GENEID
<chr>             <chr>
1 ENST00000456328.2 ENSG00000223972.5
2 ENST00000450305.2 ENSG00000223972.5
3 ENST00000473358.1 ENSG00000243485.5
4 ENST00000469289.1 ENSG00000243485.5
5 ENST00000607096.1 ENSG00000284332.1
6 ENST00000606857.1 ENSG00000268020.3


Parsed with column specification:
cols(
Name = col_character(),
Length = col_double(),
EffectiveLength = col_double(),
TPM = col_double(),
NumReads = col_double()
)
# A tibble: 1 x 5
  Name          Length EffectiveLength   TPM NumReads
  <chr>          <dbl>           <dbl> <dbl>    <dbl>
1 ebi.ac.uk 1203996941      1203996928     0        0


Your transcript names in the file are not ENST...


Is there someone you can collaborate with, who is familiar with RNA-seq pipelines? Of course there are many tutorials online but it seems you are stuck at an early stage and you’d benefit from having someone looking over your shoulder.


Everyone in my lab is currently out, so I'm trying to work around that, but thank you for taking the time to respond! I appreciate it!

@james-w-macdonald-5106
Last seen 2 hours ago
United States

Well, your first file has one row, and the transcript in that file is called 'ebi.ac.uk', which obviously makes no sense. Here is an example of what you should see (note that this is some mouse data):

Parsed with column specification:
cols(
Name = col_character(),
Length = col_double(),
EffectiveLength = col_double(),
TPM = col_double(),
NumReads = col_double()
)
# A tibble: 107,188 x 5
   Name           Length EffectiveLength      TPM  NumReads
   <chr>           <dbl>           <dbl>    <dbl>     <dbl>
 1 NM_001001130.2   2218           1969.  3.73      65
 2 NM_001001144.3   4226           4571. 31.2     1264.
 3 NM_001001152.2   3488           2945.  1.07      27.9
 4 NM_001001160.3   6688           6846.  0.222     13.4
 5 NM_001001176.2   2602           2187.  3.72      72.1
 6 NM_001001177.2   1900            169   0          0
 7 NM_001001178.1   3992           3994.  0.0740     2.62
 8 NM_001001179.3   4698           5362.  0.00203    0.0963
 9 NM_001001180.2   3909           4113.  0.456     16.6
10 NM_001001181.3   1082            785. 92.3      642.
# … with 107,178 more rows


And my tx2gene has things like NM_001001130.2 in the first column and whatever Gene ID that corresponds to in the second column.

So it looks like something went sideways when you ran salmon, because you should have tens of thousands of rows, not one.
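One quick way to see whether every sample is affected is to count the rows in each quantification file before calling tximport. The following is a minimal base-R sketch, not part of Salmon or tximport: `count_quant_rows` is a hypothetical helper name, and it assumes `files` is the character vector of quant.sf paths you pass to tximport.

```r
# Hypothetical helper: number of data rows (transcripts) in each quant file.
# Returns NA for paths that do not exist.
count_quant_rows <- function(files) {
  vapply(files, function(f) {
    if (!file.exists(f)) return(NA_integer_)
    # Assumes a tab-separated file with one header row
    nrow(read.delim(f, stringsAsFactors = FALSE))
  }, integer(1))
}

# Usage, assuming `files` is the vector passed to tximport:
# n <- count_quant_rows(files)
# n[is.na(n) | n < 1000]   # 1000 is an arbitrary cutoff for "suspiciously few"
```

A file that reports a single row, as in this thread, suggests the Salmon index or the quantification step needs to be rerun rather than anything on the R side.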


Thank you! I'll go back and see what happened there.