Question

Seurat/DESeq2 - single cell RNA-seq differential expression

Clara ▴ 10

@deut2016-16915

Last seen 3.1 years ago

Germany

Hi all,

I am approaching the analysis of single-cell RNA-seq data.

I have seen that the Seurat package offers the option in FindMarkers (or via the function DESeq2DETest) to use DESeq2 to analyze differential expression between two groups of cells.

Assuming I have group A containing n_A cells and group B containing n_B cells, is the result of the analysis identical to running DESeq2 on the raw counts of each gene in n_A versus n_B samples? And is there a way to speed up the analysis when n_A and n_B are on the order of a few thousand cells?

In addition, I have a 'technical' question:

When I have a count table in the form of a data.frame (for example, read with read.table from a text file), is it necessary to coerce it to a matrix, e.g. cts <- as.matrix(cts), before providing it as input to DESeqDataSetFromMatrix? It seems to me that I can just provide the count table as a data frame.

Thanks,

Claire

seurat deseq2 • 13k views

updated 6.5 years ago by Michael Love 43k • written 6.5 years ago by Clara ▴ 10

score 0 · Answer 1 · 2018-08-14

Michael Love 43k

@mikelove

Last seen 23 hours ago

United States

As to speed, DESeq2 will be a lot slower than, say, a linear model when you have thousands of replicates. The software wasn’t optimized for this use case. According to a recent paper from Soneson and Robinson, you can use the Wilcoxon test effectively here for DE.
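For illustration, a per-gene Wilcoxon rank-sum comparison between two groups of cells can be sketched in base R. This is only a minimal sketch on simulated data, not the procedure from the paper or from Seurat: the toy counts, the group sizes, the library-size normalization to counts per 10,000, and all variable names are assumptions made for the example.

```r
# Toy data (assumed for illustration): genes in rows, cells in columns.
set.seed(1)
n_genes <- 50
n_A <- 200
n_B <- 300
counts <- matrix(
  rnbinom(n_genes * (n_A + n_B), mu = 5, size = 0.5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), NULL)
)
group <- factor(rep(c("A", "B"), times = c(n_A, n_B)))

# Raise the expression of the first five genes in group B.
counts[1:5, group == "B"] <- rnbinom(5 * n_B, mu = 20, size = 0.5)

# Simple library-size normalization followed by log1p (an assumed choice).
lib_size <- colSums(counts)
norm <- log1p(t(t(counts) / lib_size) * 1e4)

# One Wilcoxon rank-sum test per gene; exact = FALSE because of ties.
p_values <- apply(norm, 1, function(x) {
  wilcox.test(x[group == "A"], x[group == "B"], exact = FALSE)$p.value
})

# Benjamini-Hochberg adjustment across genes.
p_adj <- p.adjust(p_values, method = "BH")
res <- data.frame(gene = rownames(norm), p_value = p_values, p_adj = p_adj)
head(res[order(res$p_adj), ])
```

Because each gene is tested independently with a rank-based test, the cost grows roughly linearly with the number of genes, with no per-gene model fitting involved.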

If you want to compare hundreds of cells with DESeq2, please use the zinbwave integration pipeline:

http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#recommendations-for-single-cell-analysis

Finally, if you provide a data frame, I believe DESeq2 just does the conversion to a matrix internally.
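If you prefer to do the conversion explicitly, a base-R sketch looks like the following. The data frames here are made-up examples; the point is that as.matrix gives a numeric matrix only when every column is numeric, so gene identifiers should be row names and not a column.

```r
# Hypothetical count table, as it might look after read.table(..., row.names = 1).
cts_df <- data.frame(
  sampleA = c(10L, 0L, 3L),
  sampleB = c(7L, 2L, 5L),
  row.names = c("gene1", "gene2", "gene3")
)

# Explicit coercion: all columns are numeric, so the result is a numeric matrix.
cts <- as.matrix(cts_df)
stopifnot(is.matrix(cts), is.numeric(cts))

# Pitfall: a gene-name column turns the whole matrix into character.
bad_df <- data.frame(gene = c("gene1", "gene2"), sampleA = c(10L, 0L))
is.character(as.matrix(bad_df))  # TRUE
```

Checking is.numeric(cts) before building the dataset is a cheap way to catch a stray text column early.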

6.5 years ago • Michael Love 43k

Thank you! I had seen that DESeq2 was employed in this paper by Soneson & Robinson, https://www.nature.com/articles/nmeth.4612, but the number of cells was much lower there. In addition, the input used was transcripts per million, while I think Seurat uses raw counts. In any case, I will look at the recommendations you point out, which I had missed.

Regarding the internal matrix conversion, it was probably already present in DESeq2 versions from years ago, since I have seen a data frame used as input in some old code as well.

6.5 years ago • Clara ▴ 10