NA values in the p-value column from DESeq2 (reopened)
@8e20af93
Last seen 3.0 years ago
Colombia

Hello, how are you? I am reopening this question because of the following:

I am doing a differential expression exercise using the HISAT2, StringTie and DESeq2 workflow. Finally, I use the Python script prepDE.py, recommended in the StringTie manual, to extract the counts.

So far so good: I have a count matrix with genes in rows and samples (controls and patients) in columns. However, when I test for differential expression in DESeq2 with nbinomWaldTest, some entries in the p-value column come back as NA. Reading the forums to find out why these NA values appear, I found the following explanation:

  • If within a row, all samples have zero counts, the baseMean column will be zero, and the log2 fold change estimates, p-value, and adjusted p-value will be set to NA.
  • If a row contains a sample with an extreme count outlier, the p-value and the adjusted p-value will be set to NA. These outliers are detected by Cook's distance.
  • If a row is filtered by independent automatic filtering, having a low mean normalized count, only the adjusted p-value will be set to NA.
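The three cases above can be told apart from the results table itself, using only the baseMean, pvalue and padj columns. The following is a minimal base R sketch, not part of DESeq2: the helper name classify_na and the toy table are hypothetical, and the "cooks_outlier" label is only the most likely explanation for a missing p-value on a row with non-zero baseMean.

```r
# Hypothetical helper: label each row of a DESeq2-style results table
# with the most likely reason for its NA values.
classify_na <- function(res) {
  res <- as.data.frame(res)
  reason <- rep("none", nrow(res))
  # Only padj is missing: row was removed by independent filtering
  reason[!is.na(res$pvalue) & is.na(res$padj)] <- "independent_filtering"
  # pvalue is missing although the gene has counts: likely a Cook's outlier
  reason[is.na(res$pvalue) & res$baseMean > 0] <- "cooks_outlier"
  # baseMean is zero: every sample has zero counts for this gene
  reason[res$baseMean == 0] <- "all_zero"
  factor(reason,
         levels = c("none", "all_zero", "cooks_outlier", "independent_filtering"))
}

# Toy table (made-up numbers) with one row per case
toy <- data.frame(
  baseMean       = c(0, 150, 3, 80),
  log2FoldChange = c(NA, 1.2, 0.4, -0.9),
  pvalue         = c(NA, NA, 0.30, 0.01),
  padj           = c(NA, NA, NA, 0.04)
)
table(classify_na(toy))
```

On a real analysis, the same call would be classify_na(res), where res is the object returned by results().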

The suggestion is to turn these filters off as follows:

res <- results(dds, cooksCutoff = FALSE, independentFiltering = FALSE)

However, even after doing so I still get NA entries. I really don't know what I'm doing wrong, and I hope someone can help me.

Here is the script that I used.

library("DESeq2")
setwd("C:/Users/ADMIN/Desktop/tvt/")
expression_data <- read.table("C:/Users/ADMIN/Desktop/tvt/gene_count_matrixv2.csv", row.names = "gene_id", header = TRUE, sep = ";", stringsAsFactors = FALSE)
expression_data$X <- NULL
dim(expression_data)
summary(expression_data)
apply(expression_data, 2, sum)
mx = apply( expression_data, 1, max )
expression_data = expression_data[ mx > 227, ]
condition <-factor(c("control","control","paciente","paciente","paciente","paciente","paciente","paciente","paciente","paciente","paciente","paciente"),c("control","paciente"))
col_data = data.frame(condition)
dds = DESeqDataSetFromMatrix(expression_data, col_data, ~condition)
dds = estimateSizeFactors(dds)
dds = nbinomWaldTest(dds)
dds <- DESeq(dds, minReplicatesForReplace=Inf)
res <- results(dds, cooksCutoff=FALSE, independentFiltering =FALSE)
res = results(dds)
head(res)
res$padj = ifelse(is.na(res$padj), 0.1, res$padj)
@8e20af93

I think my mistake comes from this line ... (O.o)

dds = nbinomWaldTest(dds)
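Two things are visible in the script in the question: DESeq() already performs size factor estimation, dispersion estimation and the Wald test, so the separate estimateSizeFactors() and nbinomWaldTest() calls are not needed; and res is assigned twice, so the later results(dds) call with default arguments overwrites the table produced with the filters turned off. A minimal sketch of a single-pass version follows. It runs on simulated counts from makeExampleDESeqDataSet() as a stand-in for the real matrix (an assumption made only so the example is self-contained), and the names res_default and res_nofilter are illustrative.

```r
library(DESeq2)

# Simulated stand-in for the real count matrix (assumption: 12 samples)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# One call: size factors, dispersions and Wald test
dds <- DESeq(dds, minReplicatesForReplace = Inf)

# Keep the two result tables under different names so that
# neither one overwrites the other
res_default  <- results(dds)
res_nofilter <- results(dds, cooksCutoff = FALSE, independentFiltering = FALSE)

# Compare how many adjusted p-values are NA in each table
sum(is.na(res_default$padj))
sum(is.na(res_nofilter$padj))
```

With both filters off, any NA that remains should correspond to genes whose counts are zero in every sample, which can be checked with rowSums(counts(dds)) == 0.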
