Question: what should I do with many zero counts from Salmon quantification
lkianmehr wrote:

I have quantified my RNA-seq samples with Salmon. 2 groups are wild-type and 4 groups are Dnmt2 knockouts. I've put them all in one dataset for DE analysis. A box plot of their normalized counts shows that the median of the knockout samples is zero and that a maximum of 20 reads is assigned to each transcript. Before performing DE with DESeq2, I have 2 questions: 1- Should zero values be removed beforehand? 2- What minimum count should be required for DE?

Thanks in advance.

normalization deseq2 R salmon
written 12 weeks ago by lkianmehr
Answer: what should I do with many zero counts from Salmon quantification
Michael Love wrote:

The DESeq2 vignette has some minimal filtering code you can take a look at. There is obviously no point running a DE method on a gene whose counts are all 0, and you can additionally filter out genes that have very small counts in all samples, because these don't have enough precision for estimating the LFC. A common rule is, for example, a count of 10 or more in at least 3 samples. However, it does depend a bit on the dataset: for example, UMI-deduplicated data can have counts < 10 that nevertheless give some precision for estimating the LFCs.
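
The rule above can be sketched in base R on a toy matrix. The matrix `cts`, its gene and sample names, and the thresholds `min_count` and `min_samples` are illustrative assumptions, not values from this thread. On a real DESeqDataSet the same logical vector would be computed from `counts(dds)` and applied as `dds[keep, ]`.

```r
# Toy count matrix: 4 genes x 6 samples (hypothetical values for illustration).
cts <- matrix(
  c(  0,   0,   0,  0, 0, 0,   # all zeros: nothing to test
      3,   0,   5,  1, 0, 2,   # small counts in every sample
     12,  15,   0, 11, 0, 4,   # >= 10 in exactly 3 samples
    250, 310, 190,  0, 5, 8),  # >= 10 in 3 samples, clearly expressed
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:6))
)

min_count   <- 10  # assumed threshold; a lower value may suit UMI data
min_samples <- 3   # assumed minimum number of samples reaching min_count

# Keep genes with at least min_count reads in at least min_samples samples.
keep <- rowSums(cts >= min_count) >= min_samples
filtered <- cts[keep, , drop = FALSE]

print(rownames(filtered))
```

Genes that are zero everywhere fail this test automatically, so no separate zero-removal step is needed.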

It is very suspicious that the maximum transcript count for a sample is 20. It is typically in the 100,000+ range for standard bulk RNA-seq of human or mouse. That there are many zeros is typical and expected, because cell types or tissues express only a subset of the transcripts in the genome.

written 12 weeks ago by Michael Love

Sorry, I think I made a mistake: I calculated log2(1 + counts) and made a box plot of that. Its y-axis runs from 0 to 20. What does that mean?

written 12 weeks ago by lkianmehr

A log2 value of 20 is typical. There is no problem with this data, or with anything you've described above.

Perhaps you can look at the vignette and workflow so you get an idea of what typical RNA-seq count datasets look like.
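
As a quick check of the arithmetic, the log2(1 + count) transform can be inverted to see what raw count a value of 20 on that axis corresponds to. The variable names below are illustrative only.

```r
# Largest value on the log2(1 + count) axis, as reported in the thread.
log_max <- 20

# Invert the transform: count = 2^value - 1.
max_count <- 2^log_max - 1
print(max_count)  # 1048575, i.e. about a million reads

# For comparison, a raw count of 20 would sit much lower on that axis.
log2_of_20 <- log2(1 + 20)
print(log2_of_20)
```

A maximum of roughly a million reads for the most abundant transcript is consistent with the 100,000+ range mentioned earlier for bulk RNA-seq.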

written 12 weeks ago by Michael Love

Excuse me, I tried to make an MA plot of res with plotMA, following the DESeq2 vignette, but it produces a plot of expression log ratio against log expression, not LFC against mean of normalized counts. When I run

plotMA(ddsTxi, alpha= 0.1, main = "", xlab = "mean of normalized counts", ylim, mle = TRUE)

I get this error:

Error in as.vector(x) : no method for coercing this S4 class to a vector

Should the object be converted to a data.frame?

written 12 weeks ago by lkianmehr

What is class(ddsTxi)?

If it is a DESeqDataSet or DESeqResults object it should work.

written 12 weeks ago by Michael Love

Yes, it's a DESeqDataSet, but it doesn't work!

written 12 weeks ago by lkianmehr

Can you try DESeq2::plotMA()? Maybe you are using another package that masks our plotMA method.
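
Masking can be illustrated with base R alone. The sketch below attaches a stand-in environment (the name `demo_mask` is hypothetical) that defines its own `filter()`, much as a later-loaded package could define its own `plotMA()`. Judging by the axis labels described, limma might be the masking package here, but the thread does not confirm that.

```r
# Attach a stand-in "package" that defines its own filter().
# warn.conflicts = FALSE hides the usual "object is masked" message.
attach(list(filter = function(...) "masked version"),
       name = "demo_mask", warn.conflicts = FALSE)

# find() lists every attached environment that defines the name;
# the first entry is the one a bare call resolves to.
hits <- find("filter")
print(hits)

masked_result   <- filter(1:3, 1)         # resolves to the masking definition
explicit_result <- stats::filter(1:3, 1)  # namespace prefix bypasses the mask

print(masked_result)
print(as.numeric(explicit_result))

# Clean up the search path.
detach("demo_mask", character.only = TRUE)
```

With DESeq2 loaded, `find("plotMA")` would show in the same way which attached packages define `plotMA`, and the `DESeq2::` prefix selects the DESeq2 version regardless of load order.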

written 12 weeks ago by Michael Love

Exactly, it works. Thanks a lot.

written 12 weeks ago by lkianmehr

Excuse me, I would appreciate help with making a heatmap of the count matrix, which I did according to the DESeq2 vignette:

select <- order(rowMeans(counts(ddstxi,normalized=TRUE)), decreasing=TRUE)[1:20]

df <- as.data.frame(colData(ddstxi)[,c("group")])

pheatmap(assay(ntd)[select,], cluster_rows=FALSE, show_rownames=FALSE, cluster_cols=FALSE, annotation_col=df)

but it fails with this error:

Error in check.length("fill") : 'gpar' element 'fill' must not be length 0

written 12 weeks ago by lkianmehr

Not sure about that error. It's coming from pheatmap, not DESeq2, so check what you are inputting and check the help files for that package.
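
One input worth checking, as a possibility the thread does not confirm, is the row names of the annotation data frame. Selecting a single column without `drop = FALSE` returns a bare vector, so the rebuilt data frame no longer carries the sample names. The sketch below shows this with a plain data.frame standing in for `colData(ddstxi)`; the sample names and group labels are hypothetical.

```r
# Stand-in for colData(ddstxi): sample names live in the row names.
# (A real colData is an S4Vectors DataFrame, not a base data.frame.)
sample_names <- c("wt1", "wt2", "ko1", "ko2", "ko3", "ko4")
col_data <- data.frame(
  group = factor(c("wt", "wt", "ko", "ko", "ko", "ko")),
  row.names = sample_names
)

# Single-column selection drops to a vector, so the sample names are lost
# and the new data frame gets automatic row names "1" to "6".
df_bad <- as.data.frame(col_data[, c("group")])

# drop = FALSE keeps the data frame structure and its row names.
df_good <- col_data[, "group", drop = FALSE]

print(rownames(df_bad))
print(rownames(df_good))
```

pheatmap matches the rows of `annotation_col` to the matrix columns by name, so an annotation data frame whose row names are not the sample names cannot be matched.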

written 12 weeks ago by Michael Love