Error in designAndArgChecker(object, betaPrior) : variables in the design formula cannot have NA values
rex.burger ▴ 10
@rexburger-11187
Last seen 4.6 years ago

Hello everyone,

I have a specific problem with DESeq2. I searched Google, the Bioconductor website, and other sites for this error message but could not find anything similar, so I am writing to ask whether you can suggest a solution.


I am trying to analyze my data and compare the expression of small RNAs between two stages. Here is the code and the error that I get.

> temp=read.csv("results.csv",header=TRUE, sep=",")
> head(temp)
        X.miRNA Sample.1 Sample.2 Sample.3 Sample.4 Sample.5 Sample.6 Sample.7 Sample.8
1 aae-bantam-5p       0       0       0       0       0       0       0       0
2 aae-bantam-3p       0       3       0       0       0       1       0       0
3     aae-let-7    6732     759      11      12      14      10      22      59
4     aae-miR-1      72      11       3       0       2       0      24    1360
5    aae-miR-10    1136     246      24      19       8      25    1095   70971
6   aae-miR-100    4181     300      24      19       9      32      38   31840
> rownames(temp) <- make.names(temp[,1], unique=TRUE)
> head(temp)
                    X.miRNA Sample.1 Sample.2 Sample.3 Sample.4 Sample.5 Sample.6
aae.bantam.5p aae-bantam-5p       0       0       0       0       0       0
aae.bantam.3p aae-bantam-3p       0       3       0       0       0       1
aae.let.7         aae-let-7    6732     759      11      12      14      10
aae.miR.1         aae-miR-1      72      11       3       0       2       0
aae.miR.10       aae-miR-10    1136     246      24      19       8      25
aae.miR.100     aae-miR-100    4181     300      24      19       9      32
              Sample.7 Sample.8
aae.bantam.5p       0       0
aae.bantam.3p       0       0
aae.let.7          22      59
aae.miR.1          24    1360
aae.miR.10       1095   70971
aae.miR.100        38   31840
> countdata <- temp [,-c(1)]
> head(countdata)
              Sample.1 Sample.2 Sample.3 Sample.4 Sample.5 Sample.6 Sample.7 Sample.8
aae.bantam.5p       0       0       0       0       0       0       0       0
aae.bantam.3p       0       3       0       0       0       1       0       0
aae.let.7        6732     759      11      12      14      10      22      59
aae.miR.1          72      11       3       0       2       0      24    1360
aae.miR.10       1136     246      24      19       8      25    1095   70971
aae.miR.100      4181     300      24      19       9      32      38   31840
> condition <- factor(c('Stage1', ' Stage 2', ' Stage3', ' Stage 4', ' Stage 5', ' Stage 6', ' Stage 7', ' Stage 8'))
> coldata <- data.frame(row.names = colnames(countdata), condition)
> coldata
                      condition
Sample.1           Stage1
Sample.2           Stage2
Sample.3           Stage3
Sample.4           Stage4
Sample.5           Stage5
Sample.6           Stage6
Sample.7           Stage7
Sample.8           Stage8
> ddsprep <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design =~condition)
> colData(ddsprep)$condition<-factor(colData(ddsprep)$condition,levels=c("Stage1"," Stage2"))
> dds <- DESeq(ddsprep)
Error in designAndArgChecker(object, betaPrior) : 
  variables in the design formula cannot have NA values

ddsprep$condition

condition
<factor>
Stage1
Stage2
NA
NA
NA
NA
NA
NA

As you can see, there are NAs, and that is evidently why the error appears. I then tried to subset my samples, following the instructions in the manual, to get rid of the error above, but I got a similar error.

> dds <- ddsprep[ , ddsprep$condition == " Stage1"," Stage 2"]
Error: subscript contains NAs

It seems that NAs are not accepted when you finally run DESeq2, but I don't know how to remove them (I tried is.na(), but I get the error: is.na() applied to non-(list or vector) of type 'S4'), and I don't know how to subset the samples for my analysis. Could you please suggest a solution? I just want to compare Stage 1 and Stage 2. I am a biologist with very little knowledge of R, so this might seem like a stupid question, but I really appreciate your time and patience.
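A minimal base-R sketch of where NAs like these can come from, assuming the labels are Stage1 to Stage8 with one sample each. The vector `stage` is illustrative and is not the poster's actual object. Re-creating a factor with a `levels` argument that lists only some of the values turns every unlisted value into NA, and a level that differs only by whitespace does not match either.

```r
# Illustrative labels, one per sample (assumed: eight stages, no replicates)
stage <- factor(paste0("Stage", 1:8))

# Restricting `levels` silently converts every unlisted value to NA
restricted <- factor(stage, levels = c("Stage1", "Stage2"))
print(restricted)

# A level that differs only by a leading space matches nothing
spaced <- factor(stage, levels = c("Stage1", " Stage2"))
print(spaced)

# is.na() works on the factor column itself (not on the whole S4 object)
na_count <- sum(is.na(restricted))
print(na_count)
```

In the DESeq2 setting the analogous check would be is.na() applied to the condition column of the dataset rather than to the dataset object, which is an S4 object.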

 

P.S. Stages 1 to 8 do not have replicates. The main purpose of this experiment is an exploratory, hypothesis-generating analysis.

Thanks for your help in advance

@mikelove

You can use the following:

dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design =~condition)

dds <- dds[, dds$condition %in% c("Stage1","Stage2") ]

dds <- DESeq(dds)
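A small base-R sketch of why `%in%` is the safer choice for this kind of subsetting, assuming a condition column that contains NAs as in the question. The vector `condition` here is illustrative. A comparison with `==` propagates NA into the logical index, whereas `%in%` always returns TRUE or FALSE.

```r
# Illustrative condition column containing NAs, as printed in the question
condition <- factor(c("Stage1", "Stage2", NA, NA, NA, NA, NA, NA))

# `==` propagates NA, which leads to "subscript contains NAs" when indexing
eq_mask <- condition == "Stage1"
print(eq_mask)

# `%in%` never returns NA, so it is safe to use as a column subscript
in_mask <- condition %in% c("Stage1", "Stage2")
print(in_mask)

# Indices of the samples that would be kept
keep <- which(in_mask)
print(keep)
```

`%in%` also accepts several values at once, which `==` does not do in the way the question's subsetting attempt intended.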

 


I have 36 samples (14 in condition 1, 17 in condition 2, 4 in condition 3, and 1 in condition 4). All conditions are subtypes of a disease, and I am analysing the internal changes within the subtypes. I did the same as above for conditions 1 and 2, but I am getting this error.

ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ condition)
dds <- dds[, dds$condition %in% c("Immature","Cortical") ]
dds <- DESeq(dds)
Error in designAndArgChecker(object, betaPrior) : full model matrix is less than full rank


Try running factor() on the column in colData after subset and before DESeq.

dds$xyz <- ...
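A base-R sketch of what is likely going on, under the assumption that the subset still carries the factor levels of the removed subtypes. "Immature" and "Cortical" come from the post above; "OtherA" and "OtherB" are hypothetical placeholders for the other two subtypes. Empty levels produce all-zero columns in the design matrix, which is one way to end up with a matrix that is not full rank.

```r
# "Immature" and "Cortical" are from the thread; "OtherA" and "OtherB"
# are hypothetical placeholders for the remaining two subtypes
condition <- factor(c("Immature", "Immature", "Cortical", "Cortical",
                      "OtherA", "OtherB"))

# Subsetting keeps all four levels, even though two are now empty
sub <- condition[condition %in% c("Immature", "Cortical")]
print(levels(sub))

# Empty levels give all-zero columns in the design matrix (not full rank)
mm_bad <- model.matrix(~ sub)
print(qr(mm_bad)$rank < ncol(mm_bad))

# Re-creating the factor (or calling droplevels()) removes the empty levels
fixed <- factor(sub, levels = c("Immature", "Cortical"))
mm_ok <- model.matrix(~ fixed)
print(qr(mm_ok)$rank == ncol(mm_ok))
```

Passing `levels` explicitly also fixes the reference level (the first one listed), whereas droplevels() keeps the existing level order.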


Thanks, it's working fine.

dds$condition <- factor(dds$condition, levels = c("Immature","Cortical"))


Is it possible to analyse all four conditions at once in DESeq2? In Cuffdiff, for example, we can supply multiple conditions and the results are saved in a single file (q1, q2, q3, ...). Please comment.


Nope. All the functionality is covered in the vignette.
