We have 4 control lung samples and 4 tumour lung samples from one mammalian species. When the tumour tissue is taken, some normal tissue is taken along with it (don't ask, it is quite a difficult process to take a tumour sample from the lung). The tumour samples have high virus counts (56500, 35440, 175788, 101192), while the control samples have very few (49, 64, 116, 38; these are due to endogenous virus in the host genome).
Since normal tissue is present in the tumour samples, the gene counts in the tumour samples may be affected by this normal tissue contamination. I am looking for a way to normalise the HTSeq counts to account for it. One idea that came to mind was to include the virus counts alongside the host gene counts when running DESeq2 or edgeR, but the results were not much different from those obtained using host counts alone.
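My guess at why adding the virus counts changed so little is that the default normalisation is driven by the bulk of the genes, so one extra row can hardly move it. Below is a minimal Python sketch of that idea, assuming median-of-ratios size factors (the default scheme in DESeq2; edgeR's TMM is a different method but is similarly robust to a few extreme rows). The host counts are synthetic placeholders (2000 hypothetical genes), not my real data; only the virus row uses the counts quoted above, ordered as 4 controls followed by 4 tumours.

```python
import numpy as np


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Keep only rows with non-zero counts in every sample so the logs are finite.
    expressed = (counts > 0).all(axis=1)
    log_counts = np.log(counts[expressed])
    # Per-gene log geometric mean across samples acts as the reference sample.
    log_ref = log_counts.mean(axis=1, keepdims=True)
    # Size factor per sample: median over genes of count / reference.
    return np.exp(np.median(log_counts - log_ref, axis=0))


rng = np.random.default_rng(0)

# ASSUMPTION: synthetic host counts, 2000 hypothetical genes x 8 samples.
gene_means = rng.gamma(shape=2.0, scale=100.0, size=(2000, 1))
host = rng.poisson(lam=gene_means, size=(2000, 8))

# Virus counts from the post: 4 control samples, then 4 tumour samples.
virus = np.array([[49, 64, 116, 38, 56500, 35440, 175788, 101192]])

sf_host = size_factors(host)
sf_both = size_factors(np.vstack([host, virus]))

# Largest relative change in any sample's size factor after adding the virus row.
max_rel_change = float(np.max(np.abs(sf_both / sf_host - 1.0)))
print("size factors, host only:   ", np.round(sf_host, 4))
print("size factors, host + virus:", np.round(sf_both, 4))
print("max relative change:", max_rel_change)
```

Because the size factor is a median over thousands of genes, a single added row shifts it by at most one rank position, which would explain why the DESeq2 and edgeR results barely changed. If this is right, the virus counts would have to enter the analysis in some other way than as one more feature in the count matrix.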
Is there any other way to normalise these data, using the control samples, to remove the effect of the normal tissue present in the tumour samples? Or is there a way to use the virus counts to normalise the host gene counts?