Question

Running DESeq2 on viral infection without control/host data

f99942 • 0

@13d60551

Last seen 2.4 years ago

Austria

Hi,

I have a raw count data set from a viral infection, produced with featureCounts. It consists of 3 time points and a virus-free control, with 3 replicates each.

Normalization after running template_script_DESeq2.r: [image not available]

It does not look like the distributions across samples are stabilized. Apparently there were some problems with the controls, especially replicate A. Since I am working only on the viral side of the replication cycle, I have just changed condRef in the template script from "control" to one of the groups (a time point), thus dropping the controls. Normalization improved!

Is this the correct/intended way to do that?

Thank you!

DESeq2 viral controls • 1.2k views

ADD COMMENT • link 2.5 years ago • updated 2.4 years ago f99942 • 0

score 0 · Answer 1 · 2022-10-17

swbarnes2 ★ 1.4k

@swbarnes2-14086

Last seen 2 days ago

San Diego

I think you might have to drop your controls. Are those counts of viral genes alone, and is that why the controls have almost none? If so, I think you might have to include host genes in the normalization, if not in the whole analysis.

ADD COMMENT • link 2.5 years ago swbarnes2 ★ 1.4k

Yes, these counts are viral genes alone. There is no host genome available yet...

ADD REPLY • link 2.5 years ago f99942 • 0

I agree that you may have to drop the controls.

ADD REPLY • link 2.4 years ago Michael Love 43k

So isn't it correct for the controls to have almost no reads? Why would you want to do anything to pretend that they have counts comparable to the infected samples? The premise of size normalization, that some genes with median expression are unchanging, looks wrong for the viral genes alone, even in the non-control samples. Did your prep collect any host RNA? If so, I'm not sure it's right to proceed by totally ignoring that. Including it would probably make normalization work.
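
To illustrate the point about the normalization premise, here is a minimal plain-Python sketch of the median-of-ratios idea behind size factors. It is not DESeq2's own code, and the sample names and counts in `toy` are made up for illustration. It assumes genes with a zero count in any sample are left out of the calculation, so a near-empty control both shrinks the set of usable genes and ends up with a tiny size factor of its own.

```python
import math
from statistics import median


def usable_genes(counts):
    """Indices of genes with a positive count in every sample."""
    n_genes = len(next(iter(counts.values())))
    return [g for g in range(n_genes)
            if all(sample[g] > 0 for sample in counts.values())]


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of gene counts (same gene order).
    """
    genes = usable_genes(counts)
    if not genes:
        raise ValueError("no gene has a positive count in every sample")
    n_samples = len(counts)
    # Log of the per-gene geometric mean across samples.
    log_geo_mean = {
        g: sum(math.log(sample[g]) for sample in counts.values()) / n_samples
        for g in genes
    }
    # Size factor = median ratio of a sample's counts to the geometric means.
    return {
        name: math.exp(median(math.log(sample[g]) - log_geo_mean[g]
                              for g in genes))
        for name, sample in counts.items()
    }


# Hypothetical viral-gene counts: two infected time points and one control.
toy = {
    "t1": [100, 200, 300, 400, 500],
    "t2": [200, 400, 600, 800, 1000],
    "control": [0, 1, 0, 2, 0],
}
infected_only = {name: c for name, c in toy.items() if name != "control"}

with_ctrl = size_factors(toy)
without_ctrl = size_factors(infected_only)

print("usable genes with control:", len(usable_genes(toy)))
print("usable genes without control:", len(usable_genes(infected_only)))
print("size factors with control:", with_ctrl)
print("size factors without control:", without_ctrl)
```

In this toy case the control's size factor is far below 1, so dividing by it would inflate the control's handful of reads to look comparable to the infected samples, which is the distortion described above.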

ADD REPLY • link 2.4 years ago swbarnes2 ★ 1.4k

Yes, having almost no viral reads in the virus-free control is what one would expect. Since including the controls disturbs the normalization, and the biological question (for now) is the expression dynamics of the viral side, it should be fine to drop them. The data I have produced this way make more sense in the context of the promoter and proteomics data...

ADD REPLY • link 2.4 years ago f99942 • 0