Search
Question: GAGE pathway analysis (R/Bioconductor): summarizeOverlaps error
0
gravatar for user31888
2.0 years ago by
user3188830
United States
user3188830 wrote:

I am trying to use the GAGE pathway analysis tutorial with 3 samples of paired-end RNA-Seq reads (placed in a "tophat_all" directory) that I mapped with TopHat (.bam files have been indexed).

An error occurred at the read count step:

> library(TxDb.Hsapiens.UCSC.hg19.knownGene)

> exByGn <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, "gene")

> library(GenomicAlignments)

> fls <- list.files("tophat_all/", pattern="bam$", full.names =T)

> bamfls <- BamFileList(fls)

> flag <- scanBamFlag(isSecondaryAlignment=FALSE, isProperPair=TRUE)

> param <- ScanBamParam(flag=flag)

> gnCnt <- summarizeOverlaps(exByGn, bamfls, mode="Union", ignore.strand=TRUE, single.end=FALSE, param=param)

Error in validObject(.Object) : 
  invalid class "SummarizedExperiment0" object: 'assays' nrow differs from 'mcols' nrow

 

When I tried the same with only one sample I get a different error:

> gnCnt <- summarizeOverlaps(exByGn, bamfls, mode="Union", ignore.strand=TRUE, single.end=FALSE, param=param)

Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x),  : 
  'data' must be of a vector type, was 'NULL'

Could someone explain me the error?

Does the error come from my reads (potential unpaired reads, different length...)?

Thanks !

Note:

(1) I am using the last version of R, Biconductor and its packages. Could it come from the newer versions of the packages that would not fit the tutorial anymore?

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] GenomicAlignments_1.6.1                
 [2] Rsamtools_1.22.0                       
 [3] Biostrings_2.38.0                      
 [4] XVector_0.10.0                         
 [5] SummarizedExperiment_1.0.1             
 [6] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
 [7] GenomicFeatures_1.22.4                 
 [8] AnnotationDbi_1.32.0                   
 [9] Biobase_2.30.0                         
[10] GenomicRanges_1.22.1                   
[11] GenomeInfoDb_1.6.1                     
[12] IRanges_2.4.1                          
[13] S4Vectors_0.8.2                        
[14] BiocGenerics_0.16.1                    

loaded via a namespace (and not attached):
 [1] zlibbioc_1.16.0      BiocParallel_1.4.0   tools_3.2.2         
 [4] DBI_0.3.1            lambda.r_1.1.7       futile.logger_1.4.1 
 [7] rtracklayer_1.30.1   futile.options_1.0.0 bitops_1.0-6        
[10] RCurl_1.95-4.7       biomaRt_2.26.0       RSQLite_1.0.0       
[13] XML_3.98-1.3

(2) And the traceback after the error occurred:

> traceback()
26: stop(msg, ": ", errors, domain = NA)
25: validObject(.Object)
24: initialize(value, ...)
23: initialize(value, ...)
22: new("SummarizedExperiment0", NAMES = names, elementMetadata = rowData, 
        colData = colData, assays = assays, metadata = as.list(metadata))
21: new_SummarizedExperiment0(from@assays, names(from@rowRanges), 
        mcols(from@rowRanges), from@colData, from@metadata)
20: asMethod(object)
19: as(object, superClass)
18: slot(x, "assays")
17: is(slot(x, "assays"), "Assays")
16: .valid.SummarizedExperiment0.assays_current(x)
15: valid.func(object)
14: validityMethod(as(object, superClass))
13: anyStrings(validityMethod(as(object, superClass)))
12: validObject(.Object)
11: initialize(value, ...)
10: initialize(value, ...)
9: new("RangedSummarizedExperiment", rowRanges = rowRanges, colData = colData, 
       assays = assays, elementMetadata = elementMetadata, metadata = as.list(metadata))
8: .new_RangedSummarizedExperiment(assays, rowRanges, colData, metadata)
7: .local(assays, ...)
6: SummarizedExperiment(assays = SimpleList(counts = counts), rowRanges = features, 
       colData = colData)
5: SummarizedExperiment(assays = SimpleList(counts = counts), rowRanges = features, 
       colData = colData)
4: .dispatchBamFiles(features, reads, mode, match.arg(algorithm), 
       ignore.strand, inter.feature = inter.feature, singleEnd = singleEnd, 
       fragments = fragments, param = param, preprocess.reads = preprocess.reads, 
       ...)
3: .local(features, reads, mode, algorithm, ignore.strand, ...)
2: summarizeOverlaps(exByGn, bamfls, mode = "Union", ignore.strand = TRUE, 
       single.end = FALSE, param = param)
1: summarizeOverlaps(exByGn, bamfls, mode = "Union", ignore.strand = TRUE, 
       single.end = FALSE, param = param)
ADD COMMENTlink modified 2.0 years ago by Valerie Obenchain ♦♦ 6.4k • written 2.0 years ago by user3188830
2
gravatar for Valerie Obenchain
2.0 years ago by
Valerie Obenchain ♦♦ 6.4k
United States
Valerie Obenchain ♦♦ 6.4k wrote:

Hi,

The argument should be 'singleEnd', not 'single.end'. The failure was occurring in bplappy() but the error message was not propagated due to how BiocParallel was handling errors, specifically the behavior of 'stop.on.error'.

Error handling has been changed in devel (BiocParallel 1.5.1) to throw an error when 'stop.on.error = TRUE' (default). See ?SnowParm or ?MulticoreParam for details.

With BiocParallel >= 1.5.1 you'll see this error message:

> summarizeOverlaps(exByGn, bamfls, mode="Union", 
+     ignore.strand=TRUE, single.end=FALSE, 
+     param=param)
ERROR [2015-11-18 08:41:29] unused argument (single.end = FALSE) 
error in task  1
Error in bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = BPPARAM) : 
  error in bplapply(): list(message = "unused argument (single.end = FALSE)", call = FUN(...))

Valerie

ADD COMMENTlink written 2.0 years ago by Valerie Obenchain ♦♦ 6.4k

Thanks Valerie for your reply! I have already tried 'singleEnd' yesterday and it returned the same error message as in my original post. Today, I have tried again and the script seems to work (started 2 hours ago and still running; which never happened before). Except checking the version of my Biocparallel and the man, I didn't do anything else compared to yesterday...strange. Anyway, fingers crossed ! Thanks !

ADD REPLYlink written 2.0 years ago by user3188830

If your bam files are large you can use BamFileList(..., yieldSize = 100000) to iterate through in chunks. Also try passing 

BPPARAM = SerialParam()

to the summarizeOverlaps() call. This eliminates parallel evaluation and often makes it easier to identify where things are going wrong. Let me know if you still have problems and I can take a look at one of your files off-line.

Valerie

ADD REPLYlink written 2.0 years ago by Valerie Obenchain ♦♦ 6.4k

Thanks !

All the steps after works until calling GAGE package. Actually I cannot even install GAGE (and its dependencies 'png' and 'KEGGREST').

See Cannot install GAGE (and its dependencies): libpng16.so.16 not found

ADD REPLYlink modified 24 months ago • written 24 months ago by user3188830

If your original question was about summarizeOverlaps, and this has been solved, and now you have a new question about libpng, use the 'ASK QUESTION' link to ask a new question. Be sure to be more specific about the flavor of linux you are using.

ADD REPLYlink modified 24 months ago • written 24 months ago by Martin Morgan ♦♦ 20k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 131 users visited in the last hour