tximport fails when RSEM output options produce files with additional columns?
Alexander • 0

I downloaded the RSEM-generated gene counts files from ENCODE, and am hoping to produce a normalized matrix. Unfortunately, tximport doesn't seem to be able to read the RSEM files from ENCODE, which include additional output columns.

The standard columns are described in the RSEM output format documentation.

ENCODE's RSEM-generated gene count file headers and example row:

gene_id transcript_id(s)    length  effective_length    expected_count  TPM FPKM    posterior_mean_count    posterior_standard_deviation_of_count   pme_TPM pme_FPKM    TPM_ci_lower_bound  TPM_ci_upper_bound  TPM_coefficient_of_quartile_variation   FPKM_ci_lower_bound FPKM_ci_upper_bound FPKM_coefficient_of_quartile_variation
ENSG00000000003.14  ENST00000373020.8,ENST00000494424.1,ENST00000496771.5,ENST00000612152.4,ENST00000614008.4   1745.64 1646.64 8.00    0.12    0.15    8.00    0.00    0.24    0.30    0.0992221   0.38994 0.218542    0.126724    0.498276    0.218431

Error message:

reading in files with read_tsv

1 
Warning message:
“Unnamed `col_types` should have the same length as `col_names`. Using smaller of the two.”
Warning message:
“59526 parsing failures.
row col  expected     actual                                                                           file
  1  -- 7 columns 17 columns '/Users/alex/Documents/AChroMap/data/raw/ENCODE/rna/downloads/ENCFF488ZHV.tsv'
  2  -- 7 columns 17 columns '/Users/alex/Documents/AChroMap/data/raw/ENCODE/rna/downloads/ENCFF488ZHV.tsv'
  3  -- 7 columns 17 columns '/Users/alex/Documents/AChroMap/data/raw/ENCODE/rna/downloads/ENCFF488ZHV.tsv'
  4  -- 7 columns 17 columns '/Users/alex/Documents/AChroMap/data/raw/ENCODE/rna/downloads/ENCFF488ZHV.tsv'
  5  -- 7 columns 17 columns '/Users/alex/Documents/AChroMap/data/raw/ENCODE/rna/downloads/ENCFF488ZHV.tsv'
... ... ......... .......... ..............................................................................
See problems(...) for more details.
”

The command was:

txi.rsem <- tximport(files, type = "rsem", txIn = FALSE, txOut = FALSE)

I'm new to R (coming from Python), so I would benefit most from a detailed answer. Many thanks.

Environment:

R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin20.4.0 (64-bit)
Running under: macOS Big Sur 11.3

Matrix products: default
BLAS:   /usr/local/Cellar/openblas/0.3.15_1/lib/libopenblasp-r0.3.15.dylib
LAPACK: /usr/local/Cellar/r/4.1.0/lib/R/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.14.0   tximportData_1.20.0 readr_1.4.0        
[4] tximport_1.20.0    

loaded via a namespace (and not attached):
 [1] magrittr_2.0.1    hms_1.1.0         uuid_0.1-4        R6_2.5.0         
 [5] rlang_0.4.11      fansi_0.5.0       tools_4.1.0       utf8_1.2.1       
 [9] htmltools_0.5.1.1 ellipsis_0.3.2    digest_0.6.27     tibble_3.1.2     
[13] lifecycle_1.0.0   crayon_1.4.1      IRdisplay_1.0     repr_1.1.3       
[17] base64enc_0.1-3   vctrs_0.3.8       IRkernel_1.2      evaluate_0.14    
[21] pbdZMQ_0.3-5      compiler_4.1.0    pillar_1.6.1      jsonlite_1.7.2   
[25] pkgconfig_2.0.3
swbarnes2 ★ 1.3k

You should always include the error message given. "It doesn't work" isn't specific enough to allow anyone to help you.

Have you tried removing all the columns after FPKM?

I've updated the post with the error message. I _have_ tried chopping off the latter columns, and it appears to work.
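
For anyone wanting to reproduce the column-chopping workaround, here is a minimal sketch in base R. The helper name `trim_rsem_genes` and the output location are my own choices for illustration, not part of tximport; the seven column names are the ones up to FPKM in the ENCODE header above.

```r
# Keep only the seven standard RSEM gene-level columns (gene_id ... FPKM)
# and write the result to a new tab-separated file.
trim_rsem_genes <- function(infile, outfile) {
  # check.names = FALSE keeps "transcript_id(s)" from being renamed
  tab <- read.delim(infile, check.names = FALSE, stringsAsFactors = FALSE)
  keep <- c("gene_id", "transcript_id(s)", "length", "effective_length",
            "expected_count", "TPM", "FPKM")
  # Fail early if the file does not look like an RSEM gene table
  stopifnot(all(keep %in% names(tab)))
  write.table(tab[, keep], outfile, sep = "\t", quote = FALSE,
              row.names = FALSE)
  invisible(outfile)
}

# Usage sketch, assuming `files` is the vector of ENCODE .tsv paths:
# trimmed <- file.path(tempdir(), basename(files))
# Map(trim_rsem_genes, files, trimmed)
# txi.rsem <- tximport(trimmed, type = "rsem", txIn = FALSE, txOut = FALSE)
```

This leaves the downloaded files untouched and writes the trimmed copies elsewhere, so the original ENCODE output can still be inspected later.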

@mikelove

If the files differ from what the original software produces by default, you essentially have a custom format. In that case you can set type="none" and manually specify these arguments:

geneIdCol, txIdCol, abundanceCol, countsCol, lengthCol

See ?tximport
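
A sketch of what that call might look like for the ENCODE files in the question. The column names come from the header posted above; using TPM as the abundance column is an assumption made here for illustration (FPKM is also present in the file), and `files` is assumed to be the same character vector of paths used in the original command. The call is guarded so that the snippet does nothing if `files` or the tximport package is not available.

```r
# Map tximport's column arguments onto the ENCODE/RSEM header names.
# abundanceCol = "TPM" is an illustrative choice; "FPKM" is also in the file.
cols <- c(
  geneIdCol    = "gene_id",
  txIdCol      = "transcript_id(s)",
  abundanceCol = "TPM",
  countsCol    = "expected_count",
  lengthCol    = "effective_length"
)

# Run only when `files` (character vector of .tsv paths) and tximport exist.
if (exists("files", mode = "character") &&
    requireNamespace("tximport", quietly = TRUE)) {
  txi.rsem <- tximport::tximport(
    files,
    type         = "none",   # custom format: no preset column names
    txIn         = FALSE,    # input is already gene-level
    txOut        = FALSE,
    geneIdCol    = cols[["geneIdCol"]],
    txIdCol      = cols[["txIdCol"]],
    abundanceCol = cols[["abundanceCol"]],
    countsCol    = cols[["countsCol"]],
    lengthCol    = cols[["lengthCol"]]
  )
}
```

With type="none" the extra posterior and confidence-interval columns are simply ignored, so the files do not need to be edited first.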
