Problem with Tximport
@user-24665

Hello everyone, I have used the tximport package for a long time to generate the tables I use for RNA-Seq. The code is quite simple. Here it is:

gene_id_import <- read.delim("gene_id_import.txt") # This is the file with the transcripts version and their respective gene names.
library(tximport)
filepath = "C:/ad4gy/MM"
# Pick the quant.sf file inside each "_quant2TRIMbias" folder
quant_dirs <- list.dirs(path = filepath, recursive = FALSE)
files <- sapply(quant_dirs[grep("_quant2TRIMbias", quant_dirs)], function(x) paste0(x, "/quant.sf"))
# tximport with length-scaled TPM values as output (scaled up to library size and average transcript length over samples)
txi <- tximport(files = files, type = "salmon", tx2gene = gene_id_import, countsFromAbundance = "lengthScaledTPM")

When I run this line, it now gives the error:

reading in files with read_tsv
Error in parse_con(txt, bigint_as_char) : 
  lexical error: invalid bytes in UTF8 string.
               "seqBias": [],     "": "–gcBias",     "auxDir": "aux_in
                     (right here) ------^

# Save the output to a file:
write.table(txi, file = "counts.txt", quote = TRUE, sep = "\t")

I have no idea what this error means or how to correct it. Besides, it happens on the last line of the code. I noticed the error started after I upgraded to R 4.0.3. Does anyone have any idea how to fix it? I appreciate all the help. Regards,


Are you sure that files contains just the quant.sf files? Have you checked the contents of files?
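A quick way to inspect `files` is sketched below. It is only an illustration: the temporary folder and the two sample folder names (`A_quant2TRIMbias`, `B_quant2TRIMbias`) are made up so the code runs anywhere. With real data, `filepath` would point at the project folder from the question.

```r
# Hypothetical stand-in for the real project folder: a temporary directory
# with two fake Salmon output folders, so the sketch runs anywhere.
filepath <- file.path(tempdir(), "MM_demo")
for (d in c("A_quant2TRIMbias", "B_quant2TRIMbias")) {
  dir.create(file.path(filepath, d), recursive = TRUE, showWarnings = FALSE)
  file.create(file.path(filepath, d, "quant.sf"))
}

# Same selection idea as in the question: keep only the quantification folders.
quant_dirs <- grep("_quant2TRIMbias",
                   list.dirs(path = filepath, recursive = FALSE),
                   value = TRUE)
files <- file.path(quant_dirs, "quant.sf")
names(files) <- basename(quant_dirs)

# Print the vector and confirm that every entry is an existing quant.sf file.
print(files)
stopifnot(all(basename(files) == "quant.sf"), all(file.exists(files)))
```

If `stopifnot()` fails here, the vector contains something other than existing `quant.sf` paths.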


Hi, I've checked the contents of files, and yes, the quant.sf files are there.

@mikelove

This error comes from jsonlite, which is used to read in the metadata of the Salmon quantification.

But tximport is working fine with R 4.0.3 here:

http://bioconductor.org/checkResults/release/bioc-LATEST/tximport/

What version of jsonlite do you have?


I've tried importing Salmon quants with GC bias correction using jsonlite 1.7.2 and the latest tximport and can't reproduce the error. Can you try:

dir <- system.file("extdata", package="oct4")
samples <- read.csv(file.path(dir,"coldata.csv"))
files <- file.path(dir,"quants", samples$names, "quant.sf.gz")
txi <- tximport(files, type="salmon", txOut=TRUE)

Also I can read in the command info file without issue:

jsonlite::fromJSON(file.path(dir,"/quants/SRX2236945/cmd_info.json"))

Can you include what that file (cmd_info.json) looks like, by running cat on it on the command line?
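The offending line can also be located from within R. The sketch below is an illustration under assumptions, not part of tximport or jsonlite: `invalid_utf8_lines()` is a hypothetical helper built on base R's `validUTF8()`, and the demo file with a stray `0x96` byte is invented to mimic the reported error (the actual byte in the affected files is not known).

```r
# Hypothetical helper: report the lines of a text file that are not valid
# UTF-8, together with the non-ASCII bytes (as hex) found on those lines.
invalid_utf8_lines <- function(path) {
  lines <- readLines(path, warn = FALSE)
  bad <- which(!validUTF8(lines))
  data.frame(
    line = bad,
    non_ascii_bytes = vapply(lines[bad], function(x) {
      b <- charToRaw(x)
      paste(as.character(b[as.integer(b) >= 128L]), collapse = " ")
    }, character(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Invented demo file: a small JSON fragment with one stray byte (0x96)
# in front of "gcBias". The byte value is an assumption for illustration.
demo <- tempfile(fileext = ".json")
writeBin(c(charToRaw('{\n  "seqBias": [],\n  "": "'),
           as.raw(0x96),
           charToRaw('gcBias",\n  "auxDir": "aux_info"\n}\n')),
         demo)

res <- invalid_utf8_lines(demo)
print(res)
```

With real data, the helper could be applied to `file.path(dirname(files), "cmd_info.json")`, since `cmd_info.json` sits in the same folder as `quant.sf`.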


sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] jsonlite_1.7.2 tximport_1.16.1

loaded via a namespace (and not attached):
[1] pillar_1.4.7 compiler_4.0.3 RColorBrewer_1.1-2 BiocManager_1.30.10 remotes_2.2.0 prettyunits_1.1.1
[7] tools_4.0.3 testthat_3.0.1 digest_0.6.27 pkgbuild_1.2.0 pkgload_1.1.0 memoise_1.1.0
[13] lifecycle_0.2.0 tibble_3.0.4 gtable_0.3.0 pkgconfig_2.0.3 rlang_0.4.9 cli_2.2.0
[19] rstudioapi_0.13 xfun_0.19 withr_2.4.0 dplyr_1.0.2 hms_1.0.0 fs_1.5.0
[25] desc_1.2.0 generics_0.1.0 vctrs_0.3.5 devtools_2.3.2 rprojroot_2.0.2 grid_4.0.3
[31] tidyselect_1.1.0 glue_1.4.2 R6_2.5.0 processx_3.4.5 fansi_0.4.1 sessioninfo_1.1.1
[37] pheatmap_1.0.12 readr_1.4.0 purrr_0.3.4 callr_3.5.1 magrittr_2.0.1 usethis_2.0.0
[43] scales_1.1.1 ps_1.5.0 ellipsis_0.3.1 assertthat_0.2.1 colorspace_2.0-0 tinytex_0.29
[49] munsell_0.5.0 crayon_1.3.4


I think you found the mistake. The character before gcBias is unknown. I've just deleted it in all the samples and now the code works again. Thank you so much for the help.

"salmon_version": "0.11.2", "index": "/scratch/ad4gy/GENOMES/salmon_index/Mus_musculus.GRCm38.p6.cdna.EYFP_INDEX", "libType": "IU", "mates1": "/scratch/ad4gy/Ren1cYFP_WT/N702_N505_CELL8_pairtrim/read_1.pairtrim.fq.gz", "mates2": "/scratch/ad4gy/Ren1cYFP_WT/N702_N505_CELL8_pairtrim/read_2.pairtrim.fq.gz", "threads": "8", "output": "/scratch/ad4gy/Ren1cYFP_WT/N702_N505_CELL8_quant2TRIMbias", "seqBias": [], "": "–gcBias", "auxDir": "aux_info"


I'm not sure why the character was printed in those JSON files, but not in the files I and others have...

If anyone else encounters this problem, more information about what version of Salmon / what OS may help to debug the origin of the character causing the jsonlite error.
