Since the ssGSEA/GSVA algorithms work by determining how much more highly expressed the genes in our gene list are compared to all other genes within the same sample, should we remove genes with 0 counts from each individual sample before running the algorithm?
Say gene x is expressed in sample 1 but not in sample 2: should we omit it from sample 2's calculations but keep it for sample 1 (i.e. replace every 0 with NA)?
In theory, if we have two samples with exactly the same expression of our genes of interest, but sample 1 has 1000 non-zero genes while sample 2 has 500 non-zero genes and 500 zero-count genes, keeping the zeros would give both samples the same score, even though sample 2 clearly behaves differently.
Should we remove these zero-count genes?
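
To make the scenario concrete, here is a rough Python sketch of it. Note the assumptions: `simple_ssgsea` is a hypothetical, simplified rank-based score in the spirit of ssGSEA (a rank-weighted running sum of hits minus misses), not the implementation in the GSVA package, and the toy data are constructed so that the gene-set genes sit at the top of the ranking in both samples. `alpha=0.25` is just an illustrative weighting exponent.

```python
import numpy as np

def simple_ssgsea(expr, in_set, alpha=0.25):
    """Simplified ssGSEA-style score for one sample (illustrative only)."""
    expr = np.asarray(expr, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)
    keep = ~np.isnan(expr)                    # genes set to NA are dropped
    expr, in_set = expr[keep], in_set[keep]
    n = expr.size
    order = np.argsort(-expr, kind="stable")  # highest expression first
    hits = in_set[order]
    ranks = np.arange(n, 0, -1, dtype=float)  # top gene gets rank n
    w = np.where(hits, ranks ** alpha, 0.0)   # rank-weighted hits
    p_hit = np.cumsum(w) / w.sum()
    p_miss = np.cumsum(~hits) / (~hits).sum()
    return float(np.sum(p_hit - p_miss))      # sum of running differences

# Toy data (assumed): 10 gene-set genes with identical expression in both samples.
n_set = 10
set_expr = np.full(n_set, 100.0)
others_1 = np.linspace(1.0, 50.0, 990)                                   # all non-zero
others_2 = np.concatenate([np.linspace(1.0, 50.0, 490), np.zeros(500)])  # 500 zeros
sample_1 = np.concatenate([set_expr, others_1])
sample_2 = np.concatenate([set_expr, others_2])
in_set = np.arange(1000) < n_set

score_1 = simple_ssgsea(sample_1, in_set)
score_2_keep = simple_ssgsea(sample_2, in_set)            # zeros kept
sample_2_na = np.where(sample_2 == 0, np.nan, sample_2)   # zeros -> NA
score_2_na = simple_ssgsea(sample_2_na, in_set)           # zeros removed

print(score_1, score_2_keep, score_2_na)
```

In this toy setup, keeping the zeros gives both samples the same score because the set genes hold the same ranks in each, while replacing zeros with NA changes sample 2's score. Whether that change is desirable is exactly the question.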