Hello, I'm trying to run tximport on my Salmon output. The tx2gene file I generated has 1,000+ genes, but when I run tximport it reports missing transcripts, and only the RNAs show up in the result (all the other genes are missing). I can't work out why tximport produces this output. How do I make the other genes visible? Thanks
library(GenomicFeatures)
gff_file <- "tn2-sequence.gff3"
file.exists(gff_file)
txdb <- makeTxDbFromGFF(gff_file)
keytypes(txdb)
columns(txdb)
# map transcript names to gene IDs
k <- keys(txdb, keytype="TXNAME")
tx_map <- AnnotationDbi::select(txdb, keys = k,
columns="GENEID", keytype = "TXNAME")
view(tx_map)
tx2gene <- tx_map
write.csv(tx2gene,file="tx2gene.csv",row.names = FALSE,quote=FALSE)
view (tx2gene)
# tx2gene has 1278 obs. of 2 variables
##load transcript abundances -------
library(tximport)
# sample_files: paths to the Salmon quant.sf files (definition not shown)
txi <- tximport(files = sample_files, type = "salmon",
                tx2gene = tx2gene, ignoreTxVersion = TRUE)
view(txi$counts)
# results
reading in files with read_tsv
1 2 3 4 5
removing duplicated transcript rows from tx2gene
transcripts missing from tx2gene: 1545
summarizing abundance
summarizing counts
summarizing length
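One way to see which IDs are behind the "transcripts missing from tx2gene" message is to compare the transcript names in a quant.sf file against the first column of tx2gene. The sketch below is only an illustration: it uses small made-up ID vectors in place of the real data, and `strip_version` is a hypothetical helper that roughly imitates what `ignoreTxVersion = TRUE` is documented to do (dropping the part of the ID after the "."). With real data, `quant_names` would be the `Name` column of one quant.sf file.

```r
# Toy stand-ins with hypothetical IDs; replace with the real objects.
# For real data, something like: quant_names <- read.delim(sample_files[1])$Name
quant_names <- c("rna-XM_001.1", "rna-XM_002.1", "gene-abc", "cds-XP_003.1")

tx2gene <- data.frame(
  TXNAME = c("rna-XM_001", "rna-XM_002", "rna-XR_004"),
  GENEID = c("gene1", "gene2", "gene4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop everything from the first "." onward,
# as an approximation of ignoreTxVersion = TRUE
strip_version <- function(x) sub("\\..*$", "", x)
quant_ids <- strip_version(quant_names)

# IDs quantified by Salmon that do / do not have a row in tx2gene
matched <- intersect(quant_ids, tx2gene$TXNAME)
missing <- setdiff(quant_ids, tx2gene$TXNAME)

print(matched)
print(missing)
```

Looking at the prefixes or patterns shared by the IDs in `missing` may show whether the names in the Salmon index and the names in the TxDb follow different conventions.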
sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.1
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.5.2 stringr_1.5.0 dplyr_1.0.10
[4] purrr_1.0.0 readr_2.1.3 tidyr_1.2.1
[7] tibble_3.1.8 ggplot2_3.4.0 tidyverse_1.3.2
[10] BiocManager_1.30.19 tximport_1.26.1 GenomicFeatures_1.50.3
[13] AnnotationDbi_1.60.0 Biobase_2.58.0 GenomicRanges_1.50.2
[16] GenomeInfoDb_1.34.6 IRanges_2.32.0 S4Vectors_0.36.1
[19] BiocGenerics_0.44.0 readxl_1.4.1 magrittr_2.0.3
loaded via a namespace (and not attached):
[1] bitops_1.0-7 matrixStats_0.63.0
[3] fs_1.5.2 lubridate_1.9.0
[5] bit64_4.0.5 filelock_1.0.2
[7] progress_1.2.2 httr_1.4.4
[9] tools_4.2.2 backports_1.4.1
[11] utf8_1.2.2 R6_2.5.1
[13] DBI_1.1.3 colorspace_2.0-3
[15] withr_2.5.0 tidyselect_1.2.0
[17] prettyunits_1.1.1 bit_4.0.5
[19] curl_4.3.3 compiler_4.2.2
[21] rvest_1.0.3 cli_3.5.0
[23] xml2_1.3.3 DelayedArray_0.24.0
[25] rtracklayer_1.58.0 scales_1.2.1
[27] rappdirs_0.3.3 digest_0.6.31
[29] Rsamtools_2.14.0 XVector_0.38.0
[31] pkgconfig_2.0.3 MatrixGenerics_1.10.0
[33] dbplyr_2.2.1 fastmap_1.1.0
[35] rlang_1.0.6 rstudioapi_0.14
[37] RSQLite_2.2.20 BiocIO_1.8.0
[39] generics_0.1.3 jsonlite_1.8.4
[41] vroom_1.6.0 BiocParallel_1.32.5
[43] googlesheets4_1.0.1 RCurl_1.98-1.9
[45] GenomeInfoDbData_1.2.9 Matrix_1.5-3
[47] Rcpp_1.0.9 munsell_0.5.0
[49] fansi_1.0.3 lifecycle_1.0.3
[51] stringi_1.7.8 yaml_2.3.6
[53] SummarizedExperiment_1.28.0 zlibbioc_1.44.0
[55] BiocFileCache_2.6.0 grid_4.2.2
[57] blob_1.2.3 parallel_4.2.2
[59] crayon_1.5.2 lattice_0.20-45
[61] Biostrings_2.66.0 haven_2.5.1
[63] hms_1.1.2 KEGGREST_1.38.0
[65] pillar_1.8.1 rjson_0.2.21
[67] codetools_0.2-18 biomaRt_2.54.0
[69] reprex_2.0.2 XML_3.99-0.13
[71] glue_1.6.2 modelr_0.1.10
[73] data.table_1.14.6 tzdb_0.3.0
[75] png_0.1-8 vctrs_0.5.1
[77] cellranger_1.1.0 gtable_0.3.1
[79] assertthat_0.2.1 cachem_1.0.6
[81] broom_1.0.2 restfulr_0.0.15
[83] googledrive_2.0.0 gargle_1.2.1
[85] GenomicAlignments_1.34.0 memoise_2.0.1
[87] timechange_0.1.1 ellipsis_0.3.2