Stuck at "Read counting step", "summarizeOverlaps"
1
0
Entering edit mode
JunLVI ▴ 40
@junlvi-8996
Last seen 6.4 years ago
Japan

Hi, I was following the workflow (http://www.bioconductor.org/help/workflows/rnaseqGene/)

somehow I got stuck at "Read counting step". 

sampleTable <- read.csv("my_sampletable.csv",row.names=1)
filenames <- paste0(sampleTable$samplename,".bam")
file.exists(filenames)
library("Rsamtools")
bamfiles <- BamFileList(filenames, yieldSize=2000000)
gtffile <- ("mm10genes.gtf") 
library("GenomicFeatures")
(txdb <- makeTxDbFromGFF(gtffile, format="gtf", circ_seqs=character()))
(ebg <- exonsBy(txdb, by="gene"))
library("GenomicAlignments")
library("BiocParallel")
register(MulticoreParam()) # for the first try, I used "register(SerialParam())"
se <- summarizeOverlaps(features=ebg, reads=bamfiles,
                        mode="Union",
                        singleEnd=FALSE,
                        ignore.strand=TRUE,
                        fragments=TRUE )

Then,  I was stuck at this step. 

Any suggestion about what might go wrong? 

if any information is needed for the trouble shooting, please let me know. 

TKS!

deseq2 • 1.5k views
ADD COMMENT
0
Entering edit mode

What do you mean by "stuck"? Was there an error message? If you don't explain what problem you're having, there's not much we can do to help you.

ADD REPLY
0
Entering edit mode

Sorry Ryan, I am sorry that I did not explain clearly.
By "stuck", I meant
when I tried to create the SummarizedExperiment object with counts:
se <- summarizeOverlaps(features=ebg, reads=bamfiles,
                        mode="Union",
                        singleEnd=FALSE,
                        ignore.strand=TRUE,
                        fragments=TRUE )

Then the computer seemed to be running the call like forever. No error message. If i wait long enough (like overnight), I got broken pipe. I wish that explained. 

ADD REPLY
0
Entering edit mode

Can you (and Ryan) provide the output of the R command sessionInfo(). And also try a simple test of MulticoreParam(). I did the following

> library(BiocParallel)
> param = MulticoreParam()
> param
class: MulticoreParam
  bpisup: FALSE; bpnworkers: 6; bptasks: 0; bpjobname: BPJOB
  bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
  bptimeout: 2592000; bpprogressbar: FALSE
  bpRNGseed: 
  bplogdir: NA
  bpresultdir: NA
  cluster type: FORK
> bplapply(1:5, function(i) i, BPPARAM=param)
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3

[[4]]
[1] 4

[[5]]
[1] 5

> sessionInfo()
R Under development (unstable) (2017-01-14 r71969)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BiocParallel_1.9.4   BiocInstaller_1.25.3

loaded via a namespace (and not attached):
[1] compiler_3.4.0 tools_3.4.0    parallel_3.4.0
ADD REPLY
0
Entering edit mode

> library(BiocParallel)

> param = MulticoreParam()

> param

class: MulticoreParam 

  bpjobname:BPJOB; bpworkers:158; bptasks:0; bptimeout:Inf; bpRNGseed:; bpisup:FALSE

  bplog:FALSE; bpthreshold:INFO; bplogdir:NA

  bpstopOnError:FALSE; bpprogressbar:FALSE

  bpresultdir:NA

cluster type: FORK 

> bplapply(1:5, function(i) i, BPPARAM=param)

[[1]]

[1] 1

 

[[2]]

[1] 2

 

[[3]]

[1] 3

 

[[4]]

[1] 4

 

[[5]]

[1] 5

 

> sessionInfo()

R version 3.2.3 (2015-12-10)

Platform: x86_64-pc-linux-gnu (64-bit)

Running under: Red Hat Enterprise Linux Server release 6.7 (Santiago)

 

locale:

 [1] LC_CTYPE=ja_JP.UTF-8       LC_NUMERIC=C              

 [3] LC_TIME=ja_JP.UTF-8        LC_COLLATE=ja_JP.UTF-8    

 [5] LC_MONETARY=ja_JP.UTF-8    LC_MESSAGES=ja_JP.UTF-8   

 [7] LC_PAPER=ja_JP.UTF-8       LC_NAME=C                 

 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

[11] LC_MEASUREMENT=ja_JP.UTF-8 LC_IDENTIFICATION=C       

 

attached base packages:

[1] parallel  stats     graphics  grDevices utils     datasets  methods  

[8] base     

 

other attached packages:

[1] BiocParallel_1.4.3

 

loaded via a namespace (and not attached):

[1] futile.logger_1.4.3  lambda.r_1.1.9       futile.options_1.0.0

>

ADD REPLY
0
Entering edit mode
@ryan-c-thompson-5618
Last seen 12 weeks ago
Icahn School of Medicine at Mount Sinai…

Based on your comment, I think you're seeing the same problem that I sometimes see when using MulticoreParam, where it causes R to hang forever. (Bioc maintainers: I only see this sometimes and haven't figured out how to reproduce it yet.) My suggestion would be to try another parallel param, such as DoparParam():

library(parallel)
library(doParallel)
registerDoParallel(cores=detectCores())
library(BiocParallel)
register(DoparParam())

Then you should get parallel read counting.

ADD COMMENT
0
Entering edit mode

Hi Ryan, Thanks for your reply. I retried MulticoreParam, somehow it worked. (why? I do not know). I will try DoparParam when MulticoreParam do not work. TKS!

 

ADD REPLY

Login before adding your answer.

Traffic: 612 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6