Question: SummarizedExperiment Error! expressionSet data structure
yifangt (Canada/NRC) wrote, 19 months ago:

Hello,

I was trying an RNA-seq pipeline following this link.

The sample data (hammer) worked without problems, but my own data did not. This is the error message:

se <- SummarizedExperiment(exprs(barley.eset))

Error in if (is.null(nms) && 0L != ncol(assays[[1]])) stop("'SummarizedExperiment' assay colnames must not be NULL") :
  missing value where TRUE/FALSE needed

It seems to me my pData, fData are fine for my eSet data as:

> barley.eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 47748 features, 15 samples
  element names: exprs
protocolData: none
phenoData
  sampleNames: S68_2 S68_3 ... SBow5 (15 total)
  varLabels: sample.id num.tech.reps group
  varMetadata: labelDescription
featureData
  featureNames: XLOC_000001 XLOC_000002 ... XLOC_047748 (47748 total)
  fvarLabels: gene
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:  
> pData
An object of class 'AnnotatedDataFrame'
  rowNames: S68_2 S68_3 ... SBow5 (15 total)
  varLabels: sample.id num.tech.reps group
  varMetadata: labelDescription
> fData
An object of class 'AnnotatedDataFrame'
  rowNames: XLOC_000001 XLOC_000002 ... XLOC_047748 (47748 total)
  varLabels: gene
  varMetadata: labelDescription
> DataFrame(fData(barley.eset))
DataFrame with 47748 rows and 1 column
                   gene
               <factor>
XLOC_000001 XLOC_000001
XLOC_000002 XLOC_000002
XLOC_000003 XLOC_000003
XLOC_000004 XLOC_000004
XLOC_000005 XLOC_000005
...                 ...
XLOC_047744 XLOC_047744
XLOC_047745 XLOC_047745
XLOC_047746 XLOC_047746
XLOC_047747 XLOC_047747
XLOC_047748 XLOC_047748
>

Does anyone have a clue? Any help would be greatly appreciated!

EDIT:

Thanks Martin!

Here are the two pieces of information you asked for:

> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
[1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C               LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8     LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8    LC_PAPER=en_CA.UTF-8       LC_NAME=C              
[9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ALL_1.12.0                 DESeq2_1.10.1              RcppArmadillo_0.6.600.4.0  Rcpp_0.12.4                SummarizedExperiment_1.0.2 Biobase_2.30.0             GenomicRanges_1.22.4       GenomeInfoDb_1.6.3     
[9] IRanges_2.4.8              S4Vectors_0.8.11           BiocGenerics_0.16.1       

loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-2   futile.logger_1.4.1  plyr_1.8.3           XVector_0.10.0       tools_3.2.3          futile.options_1.0.0 zlibbioc_1.16.0      rpart_4.1-10         RSQLite_1.0.0        annotate_1.48.0      gtable_0.2.0     
[12] lattice_0.20-33      DBI_0.3.1            gridExtra_2.2.1      genefilter_1.52.1    cluster_2.0.3        locfit_1.5-9.1       grid_3.2.3           nnet_7.3-12          AnnotationDbi_1.32.3 XML_3.98-1.4         survival_2.38-3  
[23] BiocParallel_1.4.3   foreign_0.8-66       latticeExtra_0.6-28  Formula_1.2-1        geneplotter_1.48.0   ggplot2_2.1.0        lambda.r_1.1.7       Hmisc_3.17-3         scales_0.4.0         splines_3.2.3        colorspace_1.2-6 
[34] xtable_1.8-2         acepack_1.3-3.3      munsell_0.4.3     

What I am trying to do is create an ExpressionSet object (eSet) for the next analysis step, but I got stuck here.

Yifangt

And here are the two pieces of information, along with the debugging output you asked for:

> colnames(exprs(barley.eset))
[1] "S68_2" "S68_3" "S68_4" "S69_4" "S69_5" "S69_p" "S70_4" "S70_5" "S70_p" "S95_2" "S95_3" "S95_4" "SBow1" "SBow3" "SBow5"

>  ncol(exprs(barley.eset))
[1] 15

> head(exprs(barley.eset))
            S68_2 S68_3 S68_4 S69_4 S69_5 S69_p S70_4 S70_5 S70_p S95_2 S95_3 S95_4 SBow1 SBow3 SBow5
XLOC_000001   241    72    59    94   101   332   117    71   314   124   178   107   172   117    98
XLOC_000002     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
XLOC_000003    17     8     8     5     8    70     4     0    44     5     6    14     7    10    22
XLOC_000004  1900   562  2738  3655  2697  1976  4190  2330  1936  3345  4187  5722  2209  3895  1440
XLOC_000005   211    71     0   159   164   126   177    49   122   179   185   158   176    38   120
XLOC_000006     1     1     0     7     2     0     5     0     0     1     4     5     0     1     3

> class(exprs(barley.eset))
[1] "data.frame"

> is.null(colnames(exprs(barley.eset))) && ncol(exprs(barley.eset))
[1] FALSE

> options(error=recover)
> se <- SummarizedExperiment(exprs(barley.eset))
Error in if (is.null(nms) && 0L != ncol(assays[[1]])) stop("'SummarizedExperiment' assay colnames must not be NULL") :
  missing value where TRUE/FALSE needed

Enter a frame number, or 0 to exit   

1: SummarizedExperiment(exprs(barley.eset))
2: SummarizedExperiment(exprs(barley.eset))
3: SummarizedExperiment(do.call(SimpleList, assays), ...)
4: SummarizedExperiment(do.call(SimpleList, assays), ...)
5: .local(assays, ...)

Selection: 5
Called from: (function ()
{
    if (.isMethodsDispatchOn()) {
        tState <- tracingState(FALSE)
        on.exit(tracingState(tState))
    }
    calls <- sys.calls()
    from <- 0L
    n <- length(calls)
    if (identical(sys.function(n), recover))
        n <- n - 1L
    for (i in rev(seq_len(n))) {
        calli <- calls[[i]]
        fname <- calli[[1L]]
        if (!is.na(match(deparse(fname)[1L], c("methods::.doTrace",
            ".doTrace")))) {
            from <- i - 1L
            break
        }
    }
    if (from == 0L)
        for (i in rev(seq_len(n))) {
            calli <- calls[[i]]
            fname <- calli[[1L]]
            if (!is.name(fname) || is.na(match(as.character(fname),
                c("recover", "stop", "Stop")))) {
                from <- i
                break
            }
        }
    if (from > 0L) {
        if (!interactive()) {
            try(dump.frames())
            cat(gettext("recover called non-interactively; frames dumped, use debugger() to view\n"))
            return(NULL)
        }
        else if (identical(getOption("show.error.messages"),
            FALSE))
            return(NULL)
        calls <- limitedLabels(calls[1L:from])
        repeat {
            which <- menu(calls, title = "\nEnter a frame number, or 0 to exit  ")
            if (which)
                eval(substitute(browser(skipCalls = skip), list(skip = 7 -
                  which)), envir = sys.frame(which))
            else break
        }
    }
    else cat(gettext("No suitable frames for recover()\n"))
})()
Browse[1]> nms
NULL
Browse[1]> assays[1]
List of length 1
names(1): S68_2
Browse[1]> assays
List of length 15
names(15): S68_2 S68_3 S68_4 S69_4 S69_5 S69_p S70_4 S70_5 S70_p S95_2 S95_3 S95_4 SBow1 SBow3 SBow5
Browse[1]>

Any more clues? Thanks again!

modified 19 months ago by Martin Morgan • written 19 months ago by yifangt

Are you using a current version of R / Bioconductor? Please update your question to include the output of sessionInfo(). Also, what does colnames(exprs(barley.eset)) say? Or add head(exprs(barley.eset)) to your question. The error message is weird, implying either that is.null(nms) is NA (but I think base is.null never returns NA) or that 0L != ncol(assays[[1]]) is NA, but I don't know how base ncol() would generate an NA. That would translate into is.null(colnames(exprs(barley.eset))) && ncol(exprs(barley.eset)); what are these values?

written 19 months ago by Martin Morgan

Please EDIT your original question rather than adding an 'answer'.  I don't see the output of head(exprs(barley.eset)). Maybe also class(exprs(barley.eset)). You could also try

options(error=recover)
se <- SummarizedExperiment(exprs(barley.eset))

You'll get output like

Enter a frame number, or 0 to exit   

1: SummarizedExperiment(matrix(0, 1, 1))
2: SummarizedExperiment(matrix(0, 1, 1))
3: SummarizedExperiment(SimpleList(assays), ...)
4: SummarizedExperiment(SimpleList(assays), ...)
5: .local(assays, ...)

Selection: 

Enter 5 as the selection, then print out nms, assays[[1]], and assays.

 

written 19 months ago by Martin Morgan
Martin Morgan (United States) wrote, 19 months ago:

Thanks for the updated information. Try SummarizedExperiment(as.matrix(exprs(barley.eset))).

This

> class(exprs(barley.eset))
[1] "data.frame"

is very unusual. Normally an ExpressionSet should not contain a data.frame for 'exprs()' (usually it is a matrix), and I was unable to create an ExpressionSet containing a data.frame using some standard approaches. How was barley.eset created? Reply with a COMMENT to this answer.
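One possible reading of the debug output, sketched in base R only: the trace shows the data.frame being passed through do.call(SimpleList, assays), and `assays` then prints as a list of length 15. A data.frame is a list of columns, so the first "assay" would be a single sample column (a plain vector) rather than a table. The toy values and the use of `list` in place of SimpleList are assumptions made for illustration.

```r
# Toy stand-in for the count table (values are made up for illustration).
counts_df <- data.frame(S1 = c(241L, 0L, 17L), S2 = c(72L, 0L, 8L),
                        row.names = c("g1", "g2", "g3"))

# A data.frame is a list of columns, so splatting it into a list
# constructor yields one element per column, not a single assay.
# (`list` stands in for SimpleList here; that substitution is an assumption.)
as_list <- do.call(list, counts_df)
length(as_list)            # one element per sample column

first <- as_list[[1]]      # a plain integer vector
is.null(colnames(first))   # vectors have no colnames
is.null(ncol(first))       # ncol() of a vector is NULL, not a number
cmp <- 0L != ncol(first)   # comparing against NULL gives a zero-length logical
length(cmp)

# After conversion the table is a single two-dimensional object.
counts_mat <- as.matrix(counts_df)
ncol(counts_mat)
colnames(counts_mat)
```

A zero-length condition like `cmp` cannot be evaluated by `if` as TRUE or FALSE, which would be consistent with the "missing value where TRUE/FALSE needed" message under the R version shown in the question.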

modified 19 months ago • written 19 months ago by Martin Morgan

Thanks Martin!
Here is my script to create barley.eset:

> library(Biobase); library(GenomicRanges)
> library(SummarizedExperiment)
> barleydata <- read.table("../20160323_rerun/results/all_featureCounts_by_gene.txt")

$ head all_featureCounts_by_gene.txt #This was done under OS console, of course.
S68_2    S68_3    S68_4    S69_4    S69_5    S69_p    S70_4    S70_5    S70_p    S95_2    S95_3    S95_4    SBow1    SBow3    SBow5
XLOC_000002    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
XLOC_000006    1    1    0    7    2    0    5    0    0    1    4    5    0    1    3
XLOC_000007    3    1    0    3    1    0    3    0    0    0    8    0    3    6    5
XLOC_000008    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
> sample.id <- NULL
> for(i in 1:15){
   sample.id[i] <- names(barleydata)[i]
   }

#mutant vs WT group
> group <- c(rep("WT",12),rep("mu",3))

> num.tech.reps <- rep(c(rep(1:3)), times = 5)

#phenodata
> pData <- new("AnnotatedDataFrame",data=data.frame(sample.id,num.tech.reps,group))

#featuredata
> fData <- new("AnnotatedDataFrame",data=data.frame(gene=rownames(barleydata)))
> sampleNames(pData) <- names(barleydata) <- sample.id
> featureNames(fData) <- rownames(barleydata)

# eset
> barley.eset <- new("ExpressionSet",
            assayData = assayDataNew(exprs=barleydata),
            phenoData=pData,featureData=fData)

I hope the data.frame is the problem. Again, I would appreciate your help correcting the script.

modified 19 months ago • written 19 months ago by yifangt

I think you are right that the problem is the "data.frame" class of the count table. I changed the line that creates the ExpressionSet object to:

barley.eset <- new("ExpressionSet", assayData=assayDataNew(exprs=as.matrix(barleydata)), phenoData=pData,featureData=fData)

and then SummarizedExperiment(exprs(barley.eset)) ran without error. This seems to be one solution to the problem, but I am not sure I am on the right track, since it looks like a forced conversion to me. Please let me know if there is another way to meet your package's requirements. Thanks a lot!
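On the "forced conversion" concern, a small base R check can show what as.matrix() does to a table of this shape. This is only a sketch: the inline text is a hypothetical miniature of the featureCounts file, and the object names `toy` and `toy_mat` are made up.

```r
# Hypothetical miniature of the count file. The header row has one field
# fewer than the data rows, so read.table() uses the first column as row names.
txt <- "S68_2 S68_3 SBow5
XLOC_000006 1 1 3
XLOC_000007 3 1 5"
toy <- read.table(text = txt)
class(toy)                    # read.table() always returns a data.frame

toy_mat <- as.matrix(toy)

# Checks that only the container changed, not the contents.
is.matrix(toy_mat)
storage.mode(toy_mat)         # stays integer when every column is integer
identical(rownames(toy_mat), rownames(toy))
identical(colnames(toy_mat), colnames(toy))
same_values <- all(vapply(seq_along(toy), function(j) {
  identical(unname(toy_mat[, j]), toy[[j]])
}, logical(1)))
same_values
```

If the table contained a non-numeric column, as.matrix() would instead produce a character matrix, so checking storage.mode() after conversion is a cheap safeguard.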

modified 19 months ago • written 19 months ago by yifangt

Yes, absolutely, that is the right thing to do. A slightly more 'modern' way to do this is ExpressionSet(as.matrix(barleydata), pData, fData). Likewise AnnotatedDataFrame(data.frame(gene=rownames(barleydata))). Also remember that R is vectorized, so

for(i in 1:15){
   sample.id[i] <- names(barleydata)[i]
   }

is just

sample.id <- names(barleydata)
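
The same idea extends to the other sample annotations in the script. A minimal base R sketch, assuming a placeholder table with 15 columns (the column names S1..S15 and the name `pheno` are made up here):

```r
# Placeholder count table with 15 sample columns (names are hypothetical).
barleydata <- as.data.frame(matrix(0L, nrow = 2, ncol = 15))
names(barleydata) <- paste0("S", 1:15)

sample.id     <- names(barleydata)                 # replaces the for loop
num.tech.reps <- rep(1:3, times = 5)               # same as rep(c(rep(1:3)), times = 5)
group         <- rep(c("WT", "mu"), times = c(12, 3))

# One row per sample, with row names matching the count table's columns.
pheno <- data.frame(sample.id, num.tech.reps, group, row.names = sample.id)
head(pheno, 3)
```

A data.frame built this way has row names that match the column names of the count table, which is what the sampleNames(pData) assignment in the original script was arranging by hand.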
written 19 months ago by Martin Morgan

Thanks a lot!

written 19 months ago by yifangt
yifangt (Canada/NRC) wrote, 19 months ago:

Added to original post
 

modified 19 months ago • written 19 months ago by yifangt