tximport reports $countsFromAbundance "no" when using EnsDb.Hsapiens.v86
TJ • 0
@a7167d98

Hi,

I have generated quant.sf files with salmon for six samples (two conditions, each in triplicate). I then followed the tximport workflow and am stuck at the tx2gene step below, because the result contains $countsFromAbundance [1] "no".

Many thanks in advance for your kind help and your time.

> dir <- "/Volumes/Mac_LJ_TJ/TJ/RNA/salmon/salmon_tutorial/quants"
> dir
[1] "/Volumes/Mac_LJ_TJ/TJ/RNA/salmon/salmon_tutorial/quants"
> samples <- read.table(file.path(dir, "samples.txt"), header = TRUE)
> samples

  samples                   treatment
1 sample1 WTCHG_823938_71575133_quant
2 sample2 WTCHG_823938_71585134_quant
3 sample3 WTCHG_823938_71595135_quant
4 sample4 WTCHG_823938_71605136_quant
5 sample5 WTCHG_823938_71615137_quant
6 sample6 WTCHG_823938_71625138_quant

> files <- file.path(dir, "salmon", samples$treatment, "quant.sf")
> names(files) <- paste0("sample", 1:6)
> all(file.exists(files))
[1] TRUE
> edb <- EnsDb.Hsapiens.v86
> txs <- transcripts(edb, return.type = "DataFrame")
> txi <- tximport(files, type = "salmon", tx2gene = txs, ignoreTxVersion = TRUE)

reading in files with read_tsv
1 2 3 4 5 6 
transcripts missing from tx2gene: 17113
summarizing abundance
summarizing counts
summarizing length

> head(txi$counts)
                               sample1  sample2  sample3 sample4  sample5  sample6
3prime_overlapping_ncRNA         0.000    1.052    0.000    0.00    0.000    0.000
antisense                     3623.785 4702.966 3690.267 4604.10 4791.529 4241.543
bidirectional_promoter_lncRNA   80.000   95.001   67.000   36.01   48.146   40.001
IG_C_gene                        1.000    0.000    1.000    0.00    3.000    0.000
IG_C_pseudogene                  0.000    2.000    0.000    5.00    2.000    2.000
IG_D_gene                        0.000    0.000    0.000    0.00    0.000    0.000
> 
> txi

$counts
                                        sample1      sample2      sample3      sample4      sample5
3prime_overlapping_ncRNA                  0.000        1.052        0.000        0.000        0.000
antisense                              3623.785     4702.966     3690.267     4604.100     4791.529
bidirectional_promoter_lncRNA            80.000       95.001       67.000       36.010       48.146
IG_C_gene                                 1.000        0.000        1.000        0.000        3.000
IG_C_pseudogene                           0.000        2.000        0.000        5.000        2.000
IG_D_gene                                 0.000        0.000        0.000        0.000        0.000
IG_J_gene                                 0.000        0.000        0.000        0.000        0.000
IG_J_pseudogene                           0.000        0.000        0.000        0.000        0.000
IG_pseudogene                             0.000        0.000        0.000        0.000        0.000
IG_V_gene                               205.441      217.536      202.516      203.137      234.523
IG_V_pseudogene                          24.000       27.316       26.035       30.000       13.002
lincRNA                               11653.474    15322.133    11335.241    12584.395    13595.131
non_stop_decay                        12568.506    15920.272    12464.009    11477.102    13866.972
nonsense_mediated_decay             1403931.042  1707027.271  1358093.215  1436845.152  1659518.433
polymorphic_pseudogene                   78.955       92.044       72.319       71.573       65.095
processed_pseudogene                  92082.091   126241.017    86318.116   110677.782   138921.071
processed_transcript                1096540.715  1303838.063  1040602.805  1159030.527  1288276.075
protein_coding                     52869867.637 67005440.158 51747759.212 54946608.147 64955809.746
pseudogene                                1.000       12.000        0.000        5.000        2.000
retained_intron                     2155812.260  2209586.729  1980159.016  2322248.306  2343544.031
rRNA                                    287.000      576.000      324.000     1320.000     1533.000
sense_intronic                         1990.255     2750.951     2047.189     2868.783     3277.976
sense_overlapping                      2038.084     3315.309     2158.184     2604.718     2920.475
TEC                                    3667.636     3851.902     3238.114     4082.728     3961.662
TR_C_gene                                 0.000        0.000        1.000        0.000        0.000
TR_D_gene                                 0.000        0.000        0.000        0.000        0.000
TR_J_gene                                 0.000        0.000        0.000        0.000        0.000
TR_J_pseudogene                           0.000        0.000        0.000        0.000        0.000
TR_V_gene                                25.000       19.000       20.000       21.000       28.000
TR_V_pseudogene                           9.000       13.000       13.000       10.000        6.000
transcribed_processed_pseudogene       9987.584    12754.142     9282.894    10549.953    12358.831
transcribed_unitary_pseudogene           66.438      147.927      146.285      203.033      124.679
transcribed_unprocessed_pseudogene    18345.868    20242.087    18016.170    19232.998    19849.919
unitary_pseudogene                      272.691      380.859      245.403      225.672      228.680
unprocessed_pseudogene                25769.595    24008.149    25430.889    28503.149    27337.853
                                        sample6
3prime_overlapping_ncRNA                  0.000
antisense                              4241.543
bidirectional_promoter_lncRNA            40.001
IG_C_gene                                 0.000
IG_C_pseudogene                           2.000
IG_D_gene                                 0.000
IG_J_gene                                 0.000
IG_J_pseudogene                           0.000
IG_pseudogene                             0.000
IG_V_gene                               195.168
IG_V_pseudogene                          15.968
lincRNA                               11456.067
non_stop_decay                        11973.694
nonsense_mediated_decay             1443531.471
polymorphic_pseudogene                   44.688
processed_pseudogene                 120897.648
processed_transcript                1109199.937
protein_coding                     56409697.402
pseudogene                                3.000
retained_intron                     1876370.615
rRNA                                   1539.000
sense_intronic                         2854.213
sense_overlapping                      2553.777
TEC                                    3149.544
TR_C_gene                                 2.000
TR_D_gene                                 0.000
TR_J_gene                                 0.000
TR_J_pseudogene                           0.000
TR_V_gene                                26.000
TR_V_pseudogene                           9.000
transcribed_processed_pseudogene      11716.976
transcribed_unitary_pseudogene          131.237
transcribed_unprocessed_pseudogene    17375.298
unitary_pseudogene                      149.448
unprocessed_pseudogene                20904.643


$countsFromAbundance
[1] "no"

> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tximport_1.18.0           EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.14.0         
 [4] AnnotationFilter_1.14.0   GenomicFeatures_1.42.1    AnnotationDbi_1.52.0     
 [7] Biobase_2.50.0            GenomicRanges_1.42.0      GenomeInfoDb_1.26.2      
[10] IRanges_2.24.1            S4Vectors_0.28.1          BiocGenerics_0.36.0      
[13] tximportData_1.18.0      

loaded via a namespace (and not attached):
 [1] MatrixGenerics_1.2.0        httr_1.4.2                  bit64_4.0.5                
 [4] jsonlite_1.7.2              assertthat_0.2.1            askpass_1.1                
 [7] BiocFileCache_1.14.0        blob_1.2.1                  GenomeInfoDbData_1.2.4     
[10] Rsamtools_2.6.0             progress_1.2.2              sessioninfo_1.1.1          
[13] pillar_1.4.7                RSQLite_2.2.3               lattice_0.20-41            
[16] glue_1.4.2                  XVector_0.30.0              Matrix_1.3-2               
[19] XML_3.99-0.5                pkgconfig_2.0.3             biomaRt_2.46.2             
[22] zlibbioc_1.36.0             purrr_0.3.4                 BiocParallel_1.24.1        
[25] tibble_3.0.6                openssl_1.4.3               generics_0.1.0             
[28] ellipsis_0.3.1              cachem_1.0.1                withr_2.4.1                
[31] SummarizedExperiment_1.20.0 lazyeval_0.2.2              cli_2.2.0                  
[34] magrittr_2.0.1              crayon_1.3.4                memoise_2.0.0              
[37] fansi_0.4.2                 xml2_1.3.2                  tools_4.0.3                
[40] prettyunits_1.1.1           hms_1.0.0                   lifecycle_0.2.0            
[43] matrixStats_0.57.0          stringr_1.4.0               DelayedArray_0.16.1        
[46] Biostrings_2.58.0           compiler_4.0.3              tinytex_0.29               
[49] rlang_0.4.10                grid_4.0.3                  RCurl_1.98-1.2             
[52] rstudioapi_0.13             rappdirs_0.3.2              bitops_1.0-6               
[55] DBI_1.1.1                   curl_4.3                    R6_2.5.0                   
[58] GenomicAlignments_1.26.0    dplyr_1.0.3                 rtracklayer_1.50.0         
[61] fastmap_1.1.0               bit_4.0.4                   ProtGenerics_1.22.0        
[64] readr_1.4.0                 stringi_1.5.3               Rcpp_1.0.6                 
[67] vctrs_0.3.6                 dbplyr_2.0.0                tidyselect_1.1.0           
[70] xfun_0.20

Note: in the txi output above, I removed $abundance and $length to save space in this post.

@mikelove

There is no error. See the man page for ?tximport and the information under Value, which is what the function returns.
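
For context, the `countsFromAbundance` element of the returned list records which setting of the `countsFromAbundance` argument was used, and "no" is the default, so it is not a failure message. Separately, the row names of `txi$counts` in the question are transcript biotypes (protein_coding, lincRNA, ...) rather than gene IDs. That would be consistent with tximport reading the first two columns of `txs` as transcript ID and gene ID while the second column actually holds the biotype. The sketch below illustrates the effect on a toy table in base R. All IDs and counts are made up, `txs_toy` and `tx_counts` are hypothetical names, and `rowsum()` only stands in for the idea of summing transcripts per group; it is not meant to reproduce tximport's internals.

```r
# Toy stand-in for the annotation table; every ID and value here is invented.
txs_toy <- data.frame(
  tx_id      = c("tx1", "tx2", "tx3", "tx4"),
  tx_biotype = c("protein_coding", "retained_intron", "protein_coding", "lincRNA"),
  gene_id    = c("geneA", "geneA", "geneB", "geneC"),
  stringsAsFactors = FALSE
)

# Hypothetical transcript-level counts for two samples (filled column by column).
tx_counts <- matrix(
  c(10, 5, 20, 1,
    12, 4, 18, 0),
  ncol = 2,
  dimnames = list(txs_toy$tx_id, c("sample1", "sample2"))
)

# Grouping by whatever sits in column 2 collapses transcripts by biotype.
by_biotype <- rowsum(tx_counts, group = txs_toy[[2]])

# Selecting transcript ID and gene ID explicitly, in that order,
# gives a two-column table of the shape tx2gene is documented to expect.
tx2gene_toy <- txs_toy[, c("tx_id", "gene_id")]
by_gene <- rowsum(tx_counts, group = tx2gene_toy[[2]])

print(rownames(by_biotype))
print(rownames(by_gene))
```

Applied to the question, and assuming `colnames(txs)` does contain `tx_id` and `gene_id`, the analogous step would be to pass something like `as.data.frame(txs[, c("tx_id", "gene_id")])` as `tx2gene` so that the counts are summarized per gene.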
