We have 4 control lung samples and 4 tumour lung samples from one mammalian species. In the process of taking tumour tissue, some normal tissue is also taken (don't ask; taking a tumour sample from lung is quite a difficult process). We have virus counts in the tumour samples (56500, 35440, 175788, 101192) and very few in the control samples (49, 64, 116, 38; these are due to endogenous virus in the host genome).

Since normal tissue is present in the tumour samples, the gene counts in the tumour samples may be affected by this normal tissue contamination. I am looking for a way to normalise the HTSeq counts. One approach that came to mind was to include the virus counts along with the host gene counts when running DESeq2 or edgeR, but the results were not much different from those obtained with host counts only.

Is there any other way to normalise these data, using the control samples, to remove the effect of the normal tissue present in the tumour samples? Or is there any way we can use the virus counts to normalise the host gene counts?

Thank you. 

I don't have any specific advice, or a clear idea of what the comparison of interest is and how it could be achieved.

Thank you, Michael. I understand this is a somewhat unusual situation. I will just keep using the standard normalisation method with the host and virus gene counts together.

Do you have an estimate of the normal fraction of the tumour samples from a clinician? If the fraction is about the same for all tumour samples, then it's best to do no normalisation. Fold changes would be under-estimated but would be in the correct order.
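
To illustrate the point about under-estimated fold changes, here is a rough sketch in Python. It assumes a simple linear mixing model, where the expression observed in a tumour sample is a weighted average of pure tumour and pure normal expression. That model, the function name `observed_fold_change`, and all the numbers are hypothetical and only for illustration; they are not taken from the data in this thread.

```python
def observed_fold_change(true_fc, normal_fraction):
    """Fold change seen when a tumour sample contains some normal tissue.

    Assumed model (illustrative only): with normal expression scaled to 1,
    observed = (1 - f) * true_fc + f * 1, where f is the normal fraction.
    """
    if not 0.0 <= normal_fraction <= 1.0:
        raise ValueError("normal_fraction must be between 0 and 1")
    return (1.0 - normal_fraction) * true_fc + normal_fraction


# Hypothetical true tumour-vs-normal fold changes for five genes.
true_fcs = [0.25, 0.5, 2.0, 4.0, 8.0]

# Case 1: the same (hypothetical) normal fraction in every tumour sample.
same_fraction = 0.5
observed = [observed_fold_change(fc, same_fraction) for fc in true_fcs]
for true_fc, obs_fc in zip(true_fcs, observed):
    print(f"true FC {true_fc:>5}: observed FC {obs_fc:.3f}")

# Case 2: one gene with true FC 4, but different normal fractions per sample.
per_sample_fractions = [0.2, 0.6]
per_sample = [observed_fold_change(4.0, f) for f in per_sample_fractions]
for f, obs_fc in zip(per_sample_fractions, per_sample):
    print(f"normal fraction {f}: observed FC {obs_fc:.3f}")
```

Under this assumed model, every observed fold change is pulled towards 1 but the genes keep their ranking when the fraction is constant. When the fraction varies between samples, the same true fold change shows up as a different observed value in each sample, which is why the equal-fraction condition matters.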

Thank you, Dario. The fraction is not the same in all samples.

Yes, you are right: the fold changes do appear quite low.

