Dispersion estimates with DESeq2
Julia ▴ 10
Last seen 3 days ago
United Kingdom

Essentially, due to low input RNA (an issue with the sequencer), there is a high number of zeros in the gene count matrix from featureCounts, and this gives a poor dispersion estimate (shown below, with ~60,000 genes).
[Image: dispersion estimate plot, featureCounts, ~60,000 genes]

To deal with this, I filtered out the lowly expressed genes (with idx <- rowSums( counts(dds, normalized=TRUE) >= 5 ) >= 3), which gives a slightly better fit (now using ~7,000 genes).
[Image: dispersion estimate plot after filtering, ~7,000 genes]
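
To make the filter above concrete, here is a minimal base-R sketch of what that expression selects. The matrix norm_counts and its values are made up for illustration and stand in for counts(dds, normalized=TRUE); nothing here comes from the actual data set.

```r
# Toy matrix standing in for counts(dds, normalized = TRUE); values are invented.
norm_counts <- matrix(
  c( 0,  0,  1,  0,  0,  2,   # geneA: never reaches 5
    12,  0,  7,  0,  9,  0,   # geneB: 3 samples at or above 5
     5,  5,  4,  4,  0,  0,   # geneC: only 2 samples at or above 5
    30, 25, 41, 18, 22, 35),  # geneD: all 6 samples at or above 5
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]), paste0("s", 1:6))
)

# TRUE where a gene has a normalized count of at least 5 in at least 3 samples
idx <- rowSums(norm_counts >= 5) >= 3

# Keep only the genes that pass the filter
filtered <- norm_counts[idx, , drop = FALSE]
print(idx)
print(rownames(filtered))
```

On a real DESeqDataSet, normalized counts are only available once size factors have been estimated, and the same logical vector would typically be used to subset the object (dds <- dds[idx, ]) before re-estimating dispersions.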

Alternatively, using HTSeq to assign features and then removing any genes where all counts were 0 (leaving ~14,000 genes), the dispersion estimate looks like this:
[Image: dispersion estimate plot, HTSeq, ~14,000 genes]

This looks odd to me, as the trend goes up and then down, although the fit looks very good. Essentially, I don't know whether filtering out the lowly expressed genes or using HTSeq rather than featureCounts is the better way of handling the data. I get differences in the PCAs (outliers etc.) and in the differential gene expression results (with some overlap).

Many thanks for any help or advice!

dispersionestimates dispersion DESeq2 estimates deseq2 • 116 views

The second plot seems fine; just change the lower limit of the y-axis to 1e-3 so you can see the data better.
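
One reading of this suggestion is to use the ymin argument of plotDispEsts(). The sketch below assumes that reading and runs on simulated data from makeExampleDESeqDataSet(), which stands in for the real dds; the gene and sample numbers are arbitrary.

```r
library(DESeq2)

# Simulated data only; this is a stand-in for the real dds object.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Dispersion estimation needs size factors first.
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)

# Default plot, then the same plot with the lower y bound set to 1e-3.
# Gene-wise estimates below ymin are drawn as triangles at the bound.
plotDispEsts(dds)
plotDispEsts(dds, ymin = 1e-3)
```

Raising the lower bound only changes how the plot is drawn; it does not alter the fitted dispersions.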

