tximport error: all(txId == raw[[txIdCol]]) is not TRUE
@nicolettesipperly-16835

I am trying to use tximport so that I can eventually apply the edgeR TMM normalization method. My goal is one normalization factor per species. For each species I have multiple tissue samples (e.g., lung and kidney) and need to normalize across tissues. I have a gene-to-orthogroup map and a kallisto abundance file for each tissue, and I am treating each tissue as a sample. Because I understood that tximport needs the abundance files and the tx2gene map to be in the same order, I combined the abundance files across tissues. For example, the lung abundance file also lists the kidney transcripts, with 0 for all of their abundance values. I then sorted and filled in both abundance files and the tx2gene map so that all of them contain the same transcripts in the same order. I am still getting the error.

I followed the steps from:

https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html#Import_transcript-level_estimates

Thank you so much for your time!!




> library(readr)
> library(tximport)
> library(tximportData)
> tools:::.BioC_version_associated_with_R_version()
[1] ‘3.11’

> dir <- "/gpfs/scratch/nsipperly/RAPID/kallisto_1Nov2021/ZEROS/Mema"
> samples <- read.table(file.path(dir, "samples.txt"), header = TRUE)
> samples
       sample run
1 Mema_kidney   1
2   Mema_lung   1
> files <- file.path(dir, samples$sample, "abundance.tsv")
> names(files) <- paste0("sample", 1:2)
> all(file.exists(files))
[1] TRUE

> files
                                                                               sample1
"/gpfs/scratch/nsipperly/RAPID/kallisto_1Nov2021/ZEROS/Mema/Mema_kidney/abundance.tsv"
                                                                               sample2
  "/gpfs/scratch/nsipperly/RAPID/kallisto_1Nov2021/ZEROS/Mema/Mema_lung/abundance.tsv"

>
> tx2gene <- read_tsv("/gpfs/scratch/nsipperly/RAPID/kallisto_1Nov2021/ZEROS/Mema/MemaAlltissue2_tx2gene.sort.whole.tsv")

── Column specification ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
cols(
  TranscriptID = col_character(),
  Orthogroup = col_character()
)

> txi.kallisto.tsv <- tximport(files, type = "kallisto", tx2gene = tx2gene)
Note: importing `abundance.h5` is typically faster than `abundance.tsv`
reading in files with read_tsv
1 2 Error in tximport(files, type = "kallisto", tx2gene = tx2gene) :
  all(txId == raw[[txIdCol]]) is not TRUE
> traceback()
3: stop(simpleError(msg, call = if (p <- sys.parent(1L)) sys.call(p)))
2: stopifnot(all(txId == raw[[txIdCol]]))
1: tximport(files, type = "kallisto", tx2gene = tx2gene)


> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /gpfs/software/R-4.0.2/lib64/R/lib/libRblas.so
LAPACK: /gpfs/software/R-4.0.2/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] tximportData_1.16.0 tximport_1.16.1     readr_1.4.0

loaded via a namespace (and not attached):
 [1] crayon_1.4.1    R6_2.5.1        lifecycle_0.2.0 magrittr_2.0.1
 [5] pillar_1.4.7    rlang_0.4.11    cli_3.0.1       rstudioapi_0.13
 [9] vctrs_0.3.5     ellipsis_0.3.1  glue_1.4.2      hms_0.5.3
[13] compiler_4.0.2  pkgconfig_2.0.3 tibble_3.0.4


#I made some dummy files for the post -- as instructed by the posting guide :) 

#Mema_kidney/abundance.tsv

target_id   length  eff_length  est_counts  tpm
Mema_kidney_Transcript_1003.p1  2865    2589.74 409 4.35504
Mema_kidney_Transcript_100606.p1    255 39.3322 117433  82331.8
Mema_kidney_Transcript_1006.p2  597 322.318 272.241 23.2913
Mema_kidney_Transcript_100754.p1    258 40.1783 16  10.9813
Mema_kidney_Transcript_1007.p1  3048    2772.74 5229    52.0037
Mema_kidney_Transcript_1011.p2  528 253.775 1012    109.966
Mema_kidney_Transcript_1012.p1  1941    1665.74 919 15.2136
Mema_kidney_Transcript_1015.p1  2769    2493.74 889 9.83049
Mema_kidney_Transcript_1019.p1  534 259.732 1420    150.761
Mema_lung_Transcript_979.p1 0   0   0   0
Mema_lung_Transcript_98263.p1   0   0   0   0
Mema_lung_Transcript_9828.p2    0   0   0   0
Mema_lung_Transcript_983.p1 0   0   0   0
Mema_lung_Transcript_985.p2 0   0   0   0
Mema_lung_Transcript_991.p1 0   0   0   0
Mema_lung_Transcript_9938.p1    0   0   0   0
Mema_lung_Transcript_9959.p2    0   0   0   0
Mema_lung_Transcript_995.p1 0   0   0   0
Mema_lung_Transcript_996.p1 0   0   0   0

#Mema_lung/abundance.tsv

target_id   length  eff_length  est_counts  tpm
Mema_kidney_Transcript_1003.p1  0   0   0   0
Mema_kidney_Transcript_100606.p1    0   0   0   0
Mema_kidney_Transcript_1006.p2  0   0   0   0
Mema_kidney_Transcript_100754.p1    0   0   0   0
Mema_kidney_Transcript_1007.p1  0   0   0   0
Mema_kidney_Transcript_1011.p2  0   0   0   0
Mema_kidney_Transcript_1012.p1  0   0   0   0
Mema_kidney_Transcript_1015.p1  0   0   0   0
Mema_kidney_Transcript_1019.p1  0   0   0   0
Mema_lung_Transcript_979.p1 1995    1764.51 819 29.9187
Mema_lung_Transcript_98263.p1   264 84.4105 55  42
Mema_lung_Transcript_9828.p2    609 380.972 1161.26 196.48
Mema_lung_Transcript_983.p1 2904    2673.51 3513    84.6992
Mema_lung_Transcript_985.p2 366 160.345 27  10.8541
Mema_lung_Transcript_991.p1 828 597.667 1179    127.156
Mema_lung_Transcript_9938.p1    414 200.22  149 47.9691
Mema_lung_Transcript_9959.p2    288 100.964 16  10.215
Mema_lung_Transcript_995.p1 1455    1224.51 630 33.1636
Mema_lung_Transcript_996.p1 1395    1164.51 442 24.4659

#/gpfs/scratch/nsipperly/RAPID/kallisto_1Nov2021/ZEROS/Mema/MemaAlltissue2_tx2gene.sort.whole.tsv

TranscriptID    Orthogroup
Mema_kidney_Transcript_1003.p1  OG0010091
Mema_kidney_Transcript_100606.p1    OG0014354
Mema_kidney_Transcript_1006.p2  OG0002057
Mema_kidney_Transcript_100754.p1    OG0027137
Mema_kidney_Transcript_1007.p1  OG0009085
Mema_kidney_Transcript_1011.p2  OG0004785
Mema_kidney_Transcript_1012.p1  OG0007052
Mema_kidney_Transcript_1015.p1  OG0002164
Mema_kidney_Transcript_1019.p1  OG0003830
Mema_lung_Transcript_979.p1 OG0002798
Mema_lung_Transcript_98263.p1   OG0002166
Mema_lung_Transcript_9828.p2    OG0006178
Mema_lung_Transcript_983.p1 OG0008817
Mema_lung_Transcript_985.p2 OG0010503
Mema_lung_Transcript_991.p1 OG0006243
Mema_lung_Transcript_9938.p1    OG0013741
Mema_lung_Transcript_9959.p2    OG0001898
Mema_lung_Transcript_995.p1 OG0008495
Mema_lung_Transcript_996.p1 OG0010962
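
A quick sanity check for files like the ones above, as a minimal sketch rather than anything tximport itself provides: it tests whether every transcript ID in an abundance file has a row in the tx2gene map, and whether the map has duplicated IDs. The object names (`abundance_ids`, `tx2gene_toy`) are made up for illustration, and the IDs are a small subset copied from the dummy files. As far as I know, tximport matches tx2gene rows to transcripts by ID, so the row order of the map should not matter; treat that as an assumption and check the tximport documentation.

```r
# Toy stand-ins for the dummy files in the post (a subset of the IDs).
abundance_ids <- c("Mema_kidney_Transcript_1003.p1",
                   "Mema_kidney_Transcript_1006.p2",
                   "Mema_lung_Transcript_979.p1")

# Same transcripts as above, deliberately in a different row order.
tx2gene_toy <- data.frame(
  TranscriptID = c("Mema_lung_Transcript_979.p1",
                   "Mema_kidney_Transcript_1003.p1",
                   "Mema_kidney_Transcript_1006.p2"),
  Orthogroup   = c("OG0002798", "OG0010091", "OG0002057"),
  stringsAsFactors = FALSE
)

# Transcripts in the abundance file that have no row in the map.
missing_ids <- setdiff(abundance_ids, tx2gene_toy$TranscriptID)

# Transcript IDs that appear more than once in the map.
dup_ids <- tx2gene_toy$TranscriptID[duplicated(tx2gene_toy$TranscriptID)]

# TRUE when every abundance ID is covered by the map.
same_set <- length(missing_ids) == 0

# TRUE only when the IDs also appear in exactly the same row order.
same_order <- identical(abundance_ids, tx2gene_toy$TranscriptID)

print(list(missing = missing_ids, duplicated = dup_ids,
           same_set = same_set, same_order = same_order))
```

On real data, `abundance_ids` would come from the `target_id` column of an `abundance.tsv`, and `tx2gene_toy` would be the `tx2gene` table read in the session above.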
@mikelove

I think what this error

all(txId == raw[[txIdCol]]) is not TRUE

is saying is that some of the files were quantified against a different index than others. tximport can only import files that share an index.
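
One way to check this before calling tximport, as a rough sketch: read the `target_id` column of each `abundance.tsv` and compare it to the first file's. The helper `check_shared_index` is hypothetical (it is not a tximport function), and the toy files and IDs below are made up so the example runs on its own. Judging from the traceback, the failing check compares transcript IDs between files, so a `FALSE` here would point at the file that disagrees.

```r
# Hypothetical helper (not part of tximport): compare the transcript ID
# column of every abundance file against that of the first file.
check_shared_index <- function(files, id_col = "target_id") {
  ids <- lapply(files, function(f) {
    read.delim(f, stringsAsFactors = FALSE)[[id_col]]
  })
  ref <- ids[[1]]
  # identical() also catches differing lengths, which `==` would recycle over.
  vapply(ids, function(x) identical(x, ref), logical(1))
}

# Toy demonstration with two temporary files; the IDs are invented.
tmp <- tempfile()
dir.create(tmp)
write_toy <- function(name, ids) {
  path <- file.path(tmp, name)
  write.table(
    data.frame(target_id = ids, length = 100, eff_length = 50,
               est_counts = 1, tpm = 1),
    path, sep = "\t", quote = FALSE, row.names = FALSE
  )
  path
}

toy_files <- c(sample1 = write_toy("a.tsv", c("tx1", "tx2", "tx3")),
               sample2 = write_toy("b.tsv", c("tx1", "tx3", "tx2")))

# sample2 lists the same transcripts in a different order, so it is flagged.
result <- check_shared_index(toy_files)
print(result)
```

With the `files` vector from the question, `check_shared_index(files)` would return one logical per sample; any `FALSE` suggests that file was produced from a different index or has its rows in a different order.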

OK, I see. I will see what I can make work with that information. Thank you!

I realized I never posted an update: yes, that was the issue :)
