Your observation aligns with the expected behavior of Mutect2 in tumor-only mode, where the panel of normals (PON) integrates into the variant calling model rather than functioning solely as a post-calling filter. The PON provides site-specific estimates of artifact probabilities based on alternative allele counts observed across your five normal samples. This informs the somatic likelihood calculations during the active region determination and local assembly steps.
Without a PON, Mutect2 relies on a default uniform prior for artifact probabilities (typically around 10^-6 per base). This can lead to conservative calling at certain sites, where borderline evidence in the tumor sample fails to exceed the emission threshold. In contrast, when using the PON, sites with minimal or no alternative alleles in the normals receive a lower artifact prior, enabling Mutect2 to call variants with lower allele fractions or weaker support that might otherwise be dismissed under the default model.
The nine additional variants in your Mutect2withPON results likely represent such cases: true somatic events or low-level signals that become detectable because of the refined artifact modeling. Conversely, the four variants unique to Mutect2withoutPON may be artifacts that the PON correctly downweights, which prevents their emission.
The discrepancy between the two call sets arises because Mutect2 does not simply subtract PON variants; instead, the PON modulates the probabilistic framework, potentially expanding the call set at low-artifact sites while suppressing it at high-artifact ones. For this reason, the assumption that the with-PON calls must be a strict subset of the without-PON calls does not hold. To investigate further, compare the allele depths and qualities of the discordant variants using commands like:
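# needs a bgzipped, indexed VCF (bcftools index <vcf_file>); prints FILTER plus per-sample AD and AF
bcftools query -r <chrom>:<pos> -f '%CHROM\t%POS\t%REF\t%ALT\t%FILTER[\t%AD\t%AF]\n' <vcf_file>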
or annotate with tools such as VEP for functional insights.
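If you would rather list all discordant sites at once instead of querying them one by one, something like the following may help. It is only a minimal sketch in plain Python (no pysam or cyvcf2), and it makes several assumptions: each VCF is single-sample (tumor-only), the FORMAT column carries AD, AF and DP as Mutect2 normally writes them, and variants are matched on exact CHROM/POS/REF/ALT without any normalization. The function names `read_calls` and `discordant` are made up for this example.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF in text mode."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def read_calls(path, pass_only=False):
    """Return {(chrom, pos, ref, alt): {...}} for the first sample column.

    Assumption: single-sample VCF. AD is kept as the raw string
    (ref depth first, then one value per ALT allele).
    """
    calls = {}
    with open_vcf(path) as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta and header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 10:
                continue  # no FORMAT/sample columns to read
            chrom, pos, ref, alts, flt = (
                fields[0], int(fields[1]), fields[3], fields[4], fields[6]
            )
            if pass_only and flt not in ("PASS", "."):
                continue
            sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
            for alt in alts.split(","):
                calls[(chrom, pos, ref, alt)] = {
                    "FILTER": flt,
                    "AD": sample.get("AD", "."),
                    "AF": sample.get("AF", "."),
                    "DP": sample.get("DP", "."),
                }
    return calls


def discordant(with_pon, without_pon):
    """Return (keys only in with_pon, keys only in without_pon), sorted."""
    only_with = sorted(set(with_pon) - set(without_pon))
    only_without = sorted(set(without_pon) - set(with_pon))
    return only_with, only_without
```

To use it, call `read_calls` on each of your two output VCFs (for example the filtered Mutect2withPON and Mutect2withoutPON files), pass both dictionaries to `discordant`, and print the stored FILTER, AD, AF and DP for every key returned. Setting `pass_only=True` restricts the comparison to records whose FILTER is PASS, which is usually what you want after FilterMutectCalls. Because matching is on the exact REF/ALT representation, indels written differently in the two files will show up as discordant unless you normalize first.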
Kevin