Hello Bioc community,
(this have been issued first on GitHub)
I used Rsubread::featurescounts
to quantify some BAM files with a gtf file and
I would like to rerun the unassigned reads with a different gtf. For this reason I
used the option reportReads = "BAM"
to get the reads.
EDIT: feature counts code
nc_RNA_list[1] %>% map(~ featureCounts(files = bam_file,
annot.ext = gtf_files[str_detect(gtf_files,str_glue("_{.x}_"))],
isGTFAnnotationFile =TRUE,
GTF.featureType = "exon",
GTF.attrType.extra = c("gene_type", "sRNA_id", "dna"),
nthreads = 8,
useMetaFeatures = TRUE,
allowMultiOverlap = TRUE,
largestOverlap = TRUE,
fraction = TRUE,
reportReads = "BAM",
reportReadsPath = as.character(str_glue("feature_counts_res_new/{.x}/")),
strandSpecific = 0,
verbose = TRUE) %>% write_rds(str_glue("feature_counts_{todate}_{.x}.rds"))
)
When I tried to load them on R with plyranges::read_bam
which internally uses GenomicAlignments::readGAlignments()
I got this error:
Error in value[[3L]](cond) :
failed to open BamFile: failed to open SAM/BAM file
file: './feature_counts_res_new/rRNA/R1_001_trimmed.fq.gz_sorted.featureCounts.BAM'
I tried also with Rsamtools::indexBam(the_bam)
and got:
Error in FUN(X[[i]], ...) : 'filename' is not a BAM file
file: ./feature_counts_res_new/rRNA/R1_001_trimmed.fq.gz_sorted.featureCounts.BAM
I just need to extract the "XS:Z:Unassigned_NoFeatures" reads and save them to another bam file so as to rerun them.
Thank you for your time Konstantinos
sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods
[9] base
other attached packages:
[1] edgeR_3.28.1 limma_3.42.2 tximport_1.14.0 plyranges_1.6.10
[5] GenomicRanges_1.38.0 GenomeInfoDb_1.22.0 IRanges_2.20.2 S4Vectors_0.24.3
[9] BiocGenerics_0.32.0 data.table_1.12.8 forcats_0.5.0 stringr_1.4.0
[13] dplyr_0.8.4 purrr_0.3.3 readr_1.3.1 tidyr_1.0.2
[17] tibble_2.1.3 ggplot2_3.3.0 tidyverse_1.3.0
loaded via a namespace (and not attached):
[1] Biobase_2.46.0 httr_1.4.1 jsonlite_1.6.1
[4] modelr_0.1.6 assertthat_0.2.1 BiocManager_1.30.10
[7] GenomeInfoDbData_1.2.2 cellranger_1.1.0 Rsamtools_2.2.3
[10] pillar_1.4.3 backports_1.1.5 lattice_0.20-38
[13] glue_1.3.1 XVector_0.26.0 rvest_0.3.5
[16] colorspace_1.4-1 Matrix_1.2-18 XML_3.99-0.3
[19] pkgconfig_2.0.3 broom_0.5.5 haven_2.2.0
[22] zlibbioc_1.32.0 scales_1.1.0 BiocParallel_1.20.1
[25] generics_0.0.2 withr_2.1.2 SummarizedExperiment_1.16.1
[28] cli_2.0.2 magrittr_1.5 crayon_1.3.4
[31] readxl_1.3.1 fs_1.3.1 fansi_0.4.1
[34] nlme_3.1-142 xml2_1.2.2 tools_3.6.2
[37] hms_0.5.3 lifecycle_0.2.0 matrixStats_0.55.0
[40] munsell_0.5.0 reprex_0.3.0 locfit_1.5-9.1
[43] DelayedArray_0.12.2 Biostrings_2.54.0 compiler_3.6.2
[46] rlang_0.4.5 grid_3.6.2 RCurl_1.98-1.1
[49] rstudioapi_0.11 bitops_1.0-6 gtable_0.3.0
[52] DBI_1.1.0 R6_2.4.1 GenomicAlignments_1.22.1
[55] lubridate_1.7.4 rtracklayer_1.46.0 utf8_1.1.4
[58] stringi_1.4.6 Rcpp_1.0.3 vctrs_0.2.3
[61] dbplyr_1.4.2 tidyselect_1.0.0
You will need to show the Rsubread code you used as well.