Question: DESeqDataSetFromMatrix giving error: "argument must be coercible to non-negative integer"
3.2 years ago, ErickF wrote:

Hi, I'm trying to run DESeq2. As a test I'm using RNA-seq data from 8 samples. My countdata, coldata, and rowdata objects look (to me) correctly formatted: the dimensions and lengths match, the count data is correct, etc. But when I run DESeqDataSetFromMatrix() I get this error:

> ddsFull <- DESeqDataSetFromMatrix(countData = countdata,
+    colData = coldata, rowData = rowdata, design = ~ type + sex)
Error in seq_len(length(idx) - 1) : 
  argument must be coercible to non-negative integer
In addition: Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
  58 duplicate rownames were renamed by adding numbers

 

Here is the (detailed) step-by-step. First I generate my SE object (works without problems):

> ex3 <- summarizeOverlaps(features=grl, reads=bamLst, ignore.strand=T, singleEnd=T)
> class(ex3)
[1] "RangedSummarizedExperiment"
attr(,"package")
[1] "SummarizedExperiment"

Then I create the countdata, coldata, rowdata objects (without problems):

> countdata <- assay(ex3)
> coldata <- colData(ex3)
> rowdata <- rowRanges(ex3)
> class(coldata)
[1] "DataFrame"
attr(,"package")
[1] "S4Vectors"
> class(rowdata)
[1] "GRangesList"
attr(,"package")
[1] "GenomicRanges"
> class(countdata)
[1] "matrix"
> length(rowdata)
[1] 24943
> dim(coldata)
[1] 8   6
> dim(countdata)
[1] 24943   8

> head(countdata)
         OM_003  OM_005  OM_014  OM_023
A1BG        259      69     116      69
NAT2          6      11       0       0
ADA        1785     396     964     441
CDH2        119      52      35      45 ...

> head(rowdata)
GRangesList object of length 6:
$A1BG 
GRanges object with 15 ranges and 2 metadata columns:
       seqnames               ranges strand |   exon_id   exon_name
          <Rle>            <IRanges>  <Rle> | <integer> <character>
   [1]    chr19 [58346806, 58347029]      - |    264625        <NA>
   [2]    chr19 [58347353, 58347640]      - |    264626        <NA> ...

> head(coldata)
DataFrame with 6 rows and 6 columns
            type      sex   status    height   weight     tech
           <factor> <factor> <factor> <numeric> <numeric> <factor>
OM_003       AA        F     yes      15.9     36.67        2
OM_005       AA        M     no       10.5     83.35        1
OM_014       BB        F     yes      14.3     31.22        7 ...

And then the error:

> ddsFull <- DESeqDataSetFromMatrix(countData = countdata,
+    colData = coldata, rowData = rowdata, design = ~ type + sex)
Error in seq_len(length(idx) - 1) : 
  argument must be coercible to non-negative integer
In addition: Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
  58 duplicate rownames were renamed by adding numbers

The traceback():

7: eval(expr, envir, enclos)
6: eval(quote(list(...)), env)
5: eval(quote(list(...)), env)
4: standardGeneric("paste")
3: paste(rnms[idx[-1]], c(seq_len(length(idx) - 1)), sep = ".")
2: DESeqDataSet(se, design = design, ignoreRank)
1: DESeqDataSetFromMatrix(countData = countdata, colData = coldata, 
       rowData = rowdata, design = ~type + sex)

Any thoughts? The count data is correct (zeros and positive integers, no negative counts), colData is correctly formatted, and rowData seems correct as well. I am not sure what paste(rnms[idx[-1]], c(seq_len(length(idx) - 1)), sep = ".") means, but maybe that is where the error is generated?
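One way to investigate the failing line in the traceback is to inspect the row names directly. The sketch below uses a made-up five-row matrix (not the poster's data) and a hypothetical function, rename_dups, that is modeled on the paste(...) call shown in the traceback rather than copied from DESeq2's source. It assumes the row names contain NA entries, which would make length(idx) - 1 equal to -1 and could produce the same seq_len message.

```r
# Toy count matrix; the row names are invented for illustration.
countdata <- matrix(1:10, nrow = 5,
                    dimnames = list(c("A1BG", NA, "ADA", NA, "ADA"),
                                    c("OM_003", "OM_005")))
rnms <- rownames(countdata)

# Quick checks worth running on real row names.
n_missing <- sum(is.na(rnms))        # row names that are NA
n_dup     <- sum(duplicated(rnms))   # repeated row names (repeated NAs count too)

# Hypothetical renaming loop, modeled on the call in the traceback.
rename_dups <- function(rnms) {
  for (rn in unique(rnms[duplicated(rnms)])) {
    idx <- which(rnms == rn)         # integer(0) when rn is NA
    rnms[idx[-1]] <- paste(rnms[idx[-1]], seq_len(length(idx) - 1), sep = ".")
  }
  rnms
}

# With NA row names the loop fails; without them it renames the duplicates.
err_msg <- tryCatch(rename_dups(rnms), error = function(e) conditionMessage(e))
renamed <- rename_dups(rnms[!is.na(rnms)])
print(err_msg)
print(renamed)
```

If n_missing is greater than zero on the real data, dropping or relabeling those rows before building the dataset would be a reasonable first thing to try.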

 

Answer: DESeqDataSetFromMatrix giving error: "argument must be coercible to non-negative integer"
3.2 years ago, ErickF wrote:

Figured it out!

The problem: I had replaced the original rownames in ex3 (the SE object), changing them from Entrez IDs (which mean nothing to me) to gene symbols (which at least make some sense). I didn't think changing the rownames would matter so long as they matched rowdata, but that appears to have been the problem.

Solution: I re-generated the SE object (took a while!) without replacing the rownames, and now DESeqDataSetFromMatrix works without a problem. I was able to run the DESeq2 pipeline through to results, and I can then add the gene symbols/names to the results data frame.
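The last step, attaching gene symbols to the results after the analysis, could look roughly like the sketch below. The results table and the symbols lookup are made-up stand-ins (a real lookup would come from an annotation resource), and the column name symbol is an arbitrary choice.

```r
# Stand-in for a results table whose row names are Entrez IDs.
res <- data.frame(log2FoldChange = c(1.2, -0.4, 0.1),
                  padj = c(0.01, 0.30, 0.95),
                  row.names = c("1", "10", "99999999"))

# Illustrative Entrez ID -> gene symbol lookup (hypothetical and incomplete).
symbols <- c("1" = "A1BG", "10" = "NAT2", "100" = "ADA")

# Add symbols as a column; IDs without a match become NA in the column,
# while the row names themselves stay unique and untouched.
res$symbol <- unname(symbols[rownames(res)])
print(res)
```

Keeping the symbols in a column rather than in the row names means that missing or repeated symbols do not affect how rows are identified.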


Gene names are not necessarily syntactically valid row names. You can use make.names to convert row names into syntactically valid names, but then the gene names are altered for downstream analyses.

A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number.
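As a small illustration of the make.names suggestion, the sketch below converts a few example names (chosen for illustration, not taken from the poster's data) and keeps a lookup table so the original names can be recovered later. The name_map data frame is a hypothetical convenience, not part of any package.

```r
# Example row names; the second and fourth are not syntactically valid,
# and the third repeats the first.
gene_names <- c("A1BG", "1-Mar", "A1BG", "HLA-A")

# unique = TRUE also disambiguates repeated names.
valid_names <- make.names(gene_names, unique = TRUE)
print(valid_names)

# Keep the original names alongside the converted ones.
name_map <- data.frame(original = gene_names, valid = valid_names)
print(name_map)
```

Carrying name_map through the analysis is one way to limit the downside mentioned above, since the altered names can be translated back at the end.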

Reply written 2.6 years ago by tapa7410
Answer: DESeqDataSetFromMatrix giving error: "argument must be coercible to non-negative integer"
3.2 years ago, Thomas Carroll (United States/New York/The Rockefeller University) wrote:

Hi,

As a thought, the manual (SummarizedExperiment 1.2.2) specifies that rowData accepts a DataFrame and rowRanges a GRangesList (as you supply). If you try 'rowRanges = rowdata' instead of 'rowData = rowdata', do you see the same error?

tom


Hi Tom, thanks. I tried rowRanges = rowdata but got the same error. I then tried defining rowdata with rowdata <- rowData(ex3) (instead of rowRanges), and after that I omitted rowdata altogether, but I still got the same error.

The only difference I can find between my data and the example data (parathyroidGenesSE) is that the class of my colData is:

[1] "DataFrame"
attr(,"package")
[1] "S4Vectors"

Whereas class(colData(parathyroidGenesSE)) says:

[1] "DataFrame"
attr(,"package")
[1] "IRanges"

I have no idea if this is the root of the problem...

Reply written 3.2 years ago by ErickF