HTSeqGenie run error
zh9118 • 0

Hi,

I am running HTSeqGenie on both macOS and Linux with the test TP53 samples. Both runs fail with an error while reading the FASTQ files; each parallel process seems to have trouble with the fastq.gz files. Could anyone help me with this, please?

The errors are below:

checkConfig.R/checkConfig.template: loading template config= /Library/Frameworks/R.framework/Versions/4.2/Resources/library/HTSeqGenie/config/default-config.txt 
possible qualities of filename=../data/H1993_TP53_subset2500_1.fastq.gz are: illumina1.8, GATK-rescaled
quality_encoding is not set! setting quality_encoding to illumina1.8 
2023-03-01 14:24:01 ERROR::tools.R/safeExecute: caught exception:
2023-03-01 14:24:01 ERROR::Error in sclapply(inext = inext, fun = funlog, max.parallel.jobs = nb.parallel.jobs, : tools.R/sclapply: error in chunkid=1: Error in file(file, "wb") : cannot open the connection


2023-03-01 14:24:01 ERROR::tools.R/safeExecute: traceback:
2023-03-01 14:24:01 ERROR::10: stop(paste("tools.R/sclapply: error in chunkid=", jnodes[i],  at tools.R#210
2023-03-01 14:24:01 ERROR::9: sclapply(inext = inext, fun = funlog, max.parallel.jobs = nb.parallel.jobs,  at tools.R#120
2023-03-01 14:24:01 ERROR::8: processChunks(FastQStreamer.getReads, preprocessReadsChunk, nb.parallel.jobs = nb.parallel.jobs) at preprocessReads.R#29
2023-03-01 14:24:01 ERROR::7: eval(expr, env)
2023-03-01 14:24:01 ERROR::6: try(eval(expr, env), silent = TRUE)
2023-03-01 14:24:01 ERROR::5: serialize(what, NULL, xdr = FALSE)
2023-03-01 14:24:01 ERROR::4: safeExecute({ at preprocessReads.R#17
2023-03-01 14:24:01 ERROR::3: preprocessReads() at runPipeline.R#69
2023-03-01 14:24:01 ERROR::2: runPipelineConfig(config_update = list(...)) at runPipeline.R#48
2023-03-01 14:24:01 ERROR::1: runPipeline(input_file = Fq.1, input_file2 = Fq.2, paired_ends = TRUE, 
Error in sclapply(inext = inext, fun = funlog, max.parallel.jobs = nb.parallel.jobs,  : 
  tools.R/sclapply: error in chunkid=1: Error in file(file, "wb") : cannot open the connection
In addition: Warning messages:
1: In system("gsnap", ignore.stderr = TRUE) : error in running command
2: In system("samtools", ignore.stderr = TRUE) : error in running command

My code is below:

library(HTSeqGenie)
library(gmapR)

Gencode.V43.GenomicFeatures <- "../Genome/Gencode/Gencode.v43/Gencode.v43.RData"
Gencode.V43.GenomicFeatures.rRNA <- "../Genome/rRNA/rRNA.Gencode.v43/rRNA.Gencode.v43.RData"

Sample.ID <- "test"
Fq.1 <- "../data/H1993_TP53_subset2500_1.fastq.gz"
Fq.2 <- "../data/H1993_TP53_subset2500_2.fastq.gz"

save_dir <- runPipeline(
  ## input
  input_file=Fq.1,
  input_file2=Fq.2,
  paired_ends=TRUE,
  # quality_encoding="illumina1.8",

  ## system
  num_cores = 4,

  ## output
  save_dir=paste("../analysis/", Sample.ID, sep=''),
  prepend_str=paste("../analysis/", Sample.ID, sep=''),
  overwrite_save_dir="erase",
  remove_processedfastq = F,
  remove_chunkdir = T,

  ## trim reads
  # trimReads.do = FALSE,
  # trimReads.length = NULL,
  # trimReads.trim5 = 0,

  ## Filter
  filterQuality.do = T,
  filterQuality.minQuality = 23,
  filterQuality.minFrac = 0.7,
  # filterQuality.minLength

  ## detect adapter contamination
  detectAdapterContam.do = T,
  detectAdapterContam.force_paired_end_adapter = F,

  ## detect ribosomal RNA
  detectRRNA.do = F,
  detectRRNA.rrna_genome = "../Genome/rRNA/rRNA.Gencode.v43/rRNA.Gencode.v43.RData",

  ## aligner
  path.gsnap_genomes="../Genome/Human/",
  alignReads.genome="GRCh38.p14",
  alignReads.additional_parameters="-M 2 -n 10 -B 2 -i 1 -N 1 -w 200000 -E 1 --pairmax-rna=200000 --clip-overlap",
  alignReads.sam_id = Sample.ID,

  ## gene model
  path.genomic_features = "../Genome/Gencode/",
  countGenomicFeatures.gfeatures = "Gencode.v43"
)

sessionInfo( )
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.4

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] HTSeqGenie_4.28.1                 GenomicFeatures_1.48.4            AnnotationDbi_1.58.0             
 [4] bambu_2.2.0                       BSgenome.Hsapiens.UCSC.hg38_1.4.4 BSgenome_1.64.0                  
 [7] rtracklayer_1.56.1                BiocManager_1.30.20               VariantAnnotation_1.42.1         
[10] ShortRead_1.54.0                  GenomicAlignments_1.32.1          SummarizedExperiment_1.26.1      
[13] Biobase_2.56.0                    MatrixGenerics_1.8.1              matrixStats_0.63.0               
[16] BiocParallel_1.30.4               gmapR_1.38.0                      Rsamtools_2.12.0                 
[19] Biostrings_2.64.1                 XVector_0.36.0                    GenomicRanges_1.48.0             
[22] GenomeInfoDb_1.32.4               IRanges_2.30.1                    S4Vectors_0.34.0                 
[25] BiocGenerics_0.42.0              

loaded via a namespace (and not attached):
 [1] bitops_1.0-7                            bit64_4.0.5                            
 [3] filelock_1.0.2                          RColorBrewer_1.1-3                     
 [5] progress_1.2.2                          httr_1.4.5                             
 [7] tools_4.2.0                             utf8_1.2.3                             
 [9] R6_2.5.1                                DBI_1.1.3                              
[11] tidyselect_1.2.0                        prettyunits_1.1.1                      
[13] bit_4.0.5                               curl_5.0.0                             
[15] compiler_4.2.0                          cli_3.6.0                              
[17] Cairo_1.6-0                             xml2_1.3.3                             
[19] DelayedArray_0.22.0                     VariantTools_1.38.0                    
[21] rappdirs_0.3.3                          stringr_1.5.0                          
[23] digest_0.6.31                           BSgenome.Hsapiens.UCSC.hg19_1.4.3      
[25] jpeg_0.1-10                             pkgconfig_2.0.3                        
[27] dbplyr_2.3.1                            fastmap_1.1.1                          
[29] rlang_1.0.6                             rstudioapi_0.14                        
[31] RSQLite_2.3.0                           TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[33] BiocIO_1.6.0                            generics_0.1.3                         
[35] hwriter_1.3.2.1                         jsonlite_1.8.4                         
[37] dplyr_1.1.0                             RCurl_1.98-1.10                        
[39] magrittr_2.0.3                          GenomeInfoDbData_1.2.8                 
[41] interp_1.1-3                            Matrix_1.5-3                           
[43] Rcpp_1.0.10                             fansi_1.0.4                            
[45] lifecycle_1.0.3                         stringi_1.7.12                         
[47] yaml_2.3.7                              zlibbioc_1.42.0                        
[49] org.Hs.eg.db_3.15.0                     BiocFileCache_2.4.0                    
[51] grid_4.2.0                              blob_1.2.3                             
[53] parallel_4.2.0                          crayon_1.5.2                           
[55] deldir_1.0-6                            lattice_0.20-45                        
[57] hms_1.1.2                               KEGGREST_1.36.3                        
[59] pillar_1.8.1                            rjson_0.2.21                           
[61] xgboost_1.7.3.1                         codetools_0.2-19                       
[63] biomaRt_2.52.0                          XML_3.99-0.13                          
[65] glue_1.6.2                              latticeExtra_0.6-30                    
[67] data.table_1.14.8                       png_0.1-8                              
[69] vctrs_0.5.2                             purrr_1.0.1                            
[71] tidyr_1.3.0                             cachem_1.0.7                           
[73] chipseq_1.46.0                          restfulr_0.0.15                        
[75] tibble_3.1.8                            memoise_2.0.1                          
[77] ellipsis_0.3.2
zh9118 • 0

It has been solved. The parameter "prepend_str" should be just a directory name rather than a path; otherwise the output files cannot be opened.
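
For anyone hitting the same "cannot open the connection" error, here is a minimal sketch of why a path in prepend_str can break things. It assumes (I have not checked this against the HTSeqGenie source) that the pipeline builds output file names by pasting prepend_str in front of a suffix inside save_dir. The helpers build_output_path and can_write and the suffix "chunk.RData" are hypothetical and only there for illustration.

```r
# Hypothetical helper: mimics a pipeline that prefixes output file names
# with prepend_str inside save_dir. This is NOT HTSeqGenie's actual code.
build_output_path <- function(save_dir, prepend_str, suffix) {
  file.path(save_dir, paste(prepend_str, suffix, sep = "."))
}

# Returns TRUE if a binary write connection to `path` can be opened,
# FALSE if file() fails (for example because a parent directory is missing).
can_write <- function(path) {
  tryCatch({
    con <- suppressWarnings(file(path, "wb"))
    close(con)
    TRUE
  }, error = function(e) FALSE)
}

# Use a scratch directory so the sketch does not touch real results.
save_dir <- file.path(tempdir(), "analysis_test")
dir.create(save_dir, recursive = TRUE, showWarnings = FALSE)

# A path-like prefix points into a sibling directory that does not exist.
bad <- build_output_path(save_dir, "../analysis/test", "chunk.RData")
# A plain name stays inside save_dir.
good <- build_output_path(save_dir, "test", "chunk.RData")

print(bad)
print(can_write(bad))   # the "analysis" directory is missing, so this fails
print(good)
print(can_write(good))  # save_dir exists, so the file can be created

# A simple guard: a plain name is unchanged by basename().
is_plain_name <- function(x) identical(basename(x), x)
print(is_plain_name("../analysis/test"))
print(is_plain_name("test"))
```

In the call from the question, this would presumably mean passing prepend_str=Sample.ID and keeping the path only in save_dir.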
