Question

[sva] Error in density.default(x, adjust = adj) : 'x' contains missing values

3


anton.kratz ▴ 60

@antonkratz-8836

Last seen 22 months ago

Japan, Tokyo, The Systems Biology Insti…

When running the sva function, I get an error "Error in density.default(x, adjust = adj) : 'x' contains missing values".

I have a data frame with raw RNA-seq counts with sample names in columns and gene names in rows, and another data frame describing the condition of each sample. There is only one condition column and it has 8 possible states (I cannot post the actual dataset). Here is my code:

library(sva)

mycoldata <- read.delim("contrast.txt", header = TRUE, sep = "\t")

mycountdata <- read.delim("expr_table.tsv", header = TRUE, sep = "\t")
mycountdata$checksum <- NULL
mycountdata$score <- NULL
df <- data.matrix(mycountdata)

# build the FULL model matrix
mod = model.matrix(~as.factor(condition), data=mycoldata)

# build the NULL model matrix
mod0 = model.matrix(~1,data=mycoldata)

n.sv = num.sv(df,mod,method="leek")
n.sv

# estimate the surrogate variables
svobj = sva(df,mod,mod0,n.sv=n.sv)

n.sv for this data set is 14. When I run the last command, I get the following error message:

> svobj = sva(df,mod,mod0,n.sv=n.sv)
Number of significant surrogate variables is:  14 
Iteration (out of 5 ):Error in density.default(x, adjust = adj) : 'x' contains missing values
In addition: Warning message:
In pf(fstats, df1 = (df1 - df0), df2 = (n - df1)) : NaNs produced

I found some posts describing what looks to be the same error [1][2][3]. However, none of them seems to have led to an accepted explanation of the error or a way to resolve it. The workarounds described were reducing n.sv until it works and removing genes with low counts until it works.

[1] https://www.biostars.org/p/198820/

[2] https://support.bioconductor.org/p/78142/

[3] https://stackoverflow.com/questions/43101585/error-when-generating-the-sva-object-using-package-sva-in-r

sva • 13k views

ADD COMMENT • link updated 17 months ago by ndphuc1605 • 0 • written 8.3 years ago by anton.kratz ▴ 60

1


I had the same error, and removing low-count genes from the counts table (as suggested in the sva tutorial, https://bioconductor.org/packages/release/bioc/vignettes/sva/inst/doc/sva.pdf) worked for me.
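A minimal sketch of what such a low-count filter could look like in base R. The toy matrix, the object names (counts, min_mean, filtered) and the threshold of 1 are assumptions for illustration only; they are not taken from the original post or from the sva tutorial, and a sensible cutoff depends on the dataset.

```r
# Toy count matrix: genes in rows, samples in columns (made-up values).
counts <- matrix(
  c(  0,   0,   0,   0,
      1,   0,   2,   0,
     50,  60,  55,  70,
    200, 180, 220, 210),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:4))
)

# Hypothetical threshold: keep genes whose mean count across samples exceeds 1.
min_mean <- 1
keep <- rowMeans(counts) > min_mean

# drop = FALSE keeps the result a matrix even if only one gene survives.
filtered <- counts[keep, , drop = FALSE]

# Genes that survive the filter.
rownames(filtered)
```

The filtered matrix would then be passed to num.sv and sva in place of the unfiltered one.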

ADD REPLY • link 7.0 years ago rodrigo.duarte88 ▴ 40

score 0 · Answer 1 · 2019-02-05

0


Benjamin_201314 • 0

@benjamin_201314-19745

Last seen 7.0 years ago

I had a similar problem. I found that my factor variables were read in as characters (which doesn't matter for some packages, but does for this one). Once I converted all character variables to factors, the function worked.
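One way the conversion could be done in base R, as a sketch. The data frame coldata and its columns are made up for illustration; in the question the sample table is called mycoldata and its condition column is already wrapped in as.factor() inside model.matrix.

```r
# Toy sample table (values are made up).
coldata <- data.frame(
  sample    = c("s1", "s2", "s3", "s4"),
  condition = c("ctrl", "ctrl", "treated", "treated"),
  stringsAsFactors = FALSE
)

# Flag the character columns, but leave the sample identifier as text.
is_char <- vapply(coldata, is.character, logical(1))
is_char["sample"] <- FALSE

# Convert the flagged columns to factors in place.
coldata[is_char] <- lapply(coldata[is_char], factor)

# Inspect the resulting column types.
str(coldata)
```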

ADD COMMENT • link 7.0 years ago Benjamin_201314 • 0

score 0 · Answer 2 · 2022-05-27

0


wahaha • 0

@19953279

Last seen 3.6 years ago

Taiwan

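I found the reason: some genes have an expression level of 0 in all samples. You can add this code to check the data:

# mean expression of each gene (row) across all samples
exp_mean <- rowMeans(exp)

# names of the genes whose mean is 0 (for counts: 0 in every sample)
filter_gene <- names(exp_mean)[exp_mean == 0]

# expression matrix without those genes
exp_filtered <- exp[!(rownames(exp) %in% filter_gene), , drop = FALSE]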

You can try to remove these genes.

ADD COMMENT • link 3.7 years ago wahaha • 0

score 0 · Answer 3 · 2024-07-11

0


vgrozd • 0

@0cd0f415

Last seen 18 months ago

Germany

This same error is also returned when the number of requested SVs is too high, for example when specifying it manually with n.sv=....
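A rough sanity check for this case, as a sketch. It assumes that in the warning quoted in the question, pf(fstats, df1 = (df1 - df0), df2 = (n - df1)), n is the number of samples and df1 is the number of columns of the full model plus the surrogate variables, so that n - df1 must stay positive. That reading is an assumption, not something confirmed in this thread. The helper max_n_sv and the sample count of 20 are hypothetical; the question does not state how many samples there are.

```r
# Hypothetical helper: largest n.sv that keeps n - df1 positive,
# assuming df1 = ncol(mod) + n.sv.
max_n_sv <- function(n_samples, n_mod_cols) {
  max(n_samples - n_mod_cols - 1, 0)
}

# Toy design: 20 samples spread over 8 conditions,
# which gives an intercept plus 7 indicator columns.
condition <- factor(rep(paste0("c", 1:8), length.out = 20))
mod <- model.matrix(~condition)

n_samples <- length(condition)
limit <- max_n_sv(n_samples, ncol(mod))
limit

# Compare the requested number of surrogate variables against the limit.
requested <- 14
requested <= limit
```

Under these assumptions, a request of 14 surrogate variables would exceed the limit for this toy design, which would be consistent with the NaNs in the warning.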

ADD COMMENT • link 18 months ago vgrozd • 0

score 0 · Answer 4 · 2024-08-16

0


ndphuc1605 • 0

@da91a904

Last seen 17 months ago

Vietnam

I had the same problem. When I did not set the 'n.sv' parameter, as in svobj = sva(df,mod,mod0), the code worked and n.sv was calculated automatically during the run.

However, I do not know whether my run was correct.

I would appreciate a reply to cross-check this.

ADD COMMENT • link 17 months ago ndphuc1605 • 0