Run DESeq with only a handful of genes
@xinlianzhang25-10779

Hi! I want to run an analysis in an unusual setting: running DESeq on only a few genes, which the method does not normally assume. When I do, I run into the following problem.

Here is my sample code.


  dds <- makeExampleDESeqDataSet()
  dds <- dds[1:3,]
  dds <- estimateSizeFactors(dds)
  dds <- estimateDispersions(dds)
  dds <- nbinomWaldTest(dds)
  res <- results(dds)
 

For "dds <- estimateDispersions(dds)", I keep getting the following message.

gene-wise dispersion estimates
mean-dispersion relationship
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
Error in lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth,  :
  newsplit: out of vertex space
In addition: There were 17 warnings (use warnings() to see them)

 

Sometimes that line succeeds, but then the line "dds <- nbinomWaldTest(dds)" fails with

Error in nbinomWaldTest(dds) :
  testing requires dispersion estimates, first call estimateDispersions()
 

Any help will be appreciated. Thanks!

 

Xinlian


How many genes is a small number? Do you actually have data for all genes? If so, I'd probably try estimating dispersions genome-wide first and then subsetting to your genes of interest.


Yes, Ryan is right. For each gene, DESeq2 uses the data from all other genes to improve the estimates of dispersion, even if there is only a small number of replicate experiments. (This is an instance of what's called an empirical Bayes approach; it can only work with sufficient numbers of genes, which in a sense make up for the lack of replication.)

There is no harm in running DESeq2 on all genes and subsetting afterwards. If needed, you can restrict the multiple-testing adjustment to the subset (stats::p.adjust).
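A minimal sketch of that workflow, assuming simulated data from makeExampleDESeqDataSet() in place of a real experiment. The genes_of_interest vector and the padj_subset column name are hypothetical and chosen only for illustration; the first three genes stand in for whatever subset you care about.

```r
library(DESeq2)

set.seed(1)
# Simulated counts standing in for a real data set: 1000 genes, 12 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Fit on ALL genes so the dispersion trend and shrinkage see the full data
dds <- DESeq(dds)
res <- results(dds)

# Hypothetical genes of interest (here simply the first three)
genes_of_interest <- rownames(dds)[1:3]
res_sub <- res[genes_of_interest, ]

# Re-adjust the p-values within the subset only (Benjamini-Hochberg);
# padj_subset is an illustrative column name, not one DESeq2 creates
res_sub$padj_subset <- p.adjust(res_sub$pvalue, method = "BH")
res_sub
```

The padj column that results() returns is still adjusted over all genes that passed independent filtering, so padj_subset is the one to read if only the subset was ever going to be tested.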

@xinlianzhang25-10779

My actual problem is that I want to do a DE analysis on the exons within each gene, so what I am doing is treating each exon as a gene. I know this sounds silly, but for the exons in a gene I would still need to adjust for library size and dispersion. That is why I wanted to give it a try.


There is a package designed specifically for this:

http://bioconductor.org/packages/DEXSeq

You should use this instead.


That looks great! Thanks for letting me know!!!

 

 
