I found out why my somatic variants are being filtered out by PureCN. It is filtering on the value in the QUAL column of the VCF. My germline variants have a value in the QUAL column, but the somatic variants (in the same VCF file) have "." (missing value) for QUAL. It seems to me they should not be filtered out if they are ".". Couldn't there be a mixture of QUAL, BQ, and MBQ fields for different variants in a VCF file? Shouldn't the filtering code check each one on a variant-by-variant basis?
Yes, they do.
I'm considering switching the way I make the VCF. Mutect2 v4 now has an option to emit germline sites, so I can redo my pipeline and rerun Mutect2 on all the samples. That should give me a more reliable and more consistent VCF file.
In the meantime, I'm just going to set the QUAL field to ".". I had already removed MBQ from the somatic calls.
Looking at your code for filtering, I would think it wouldn't be hard to account for "." in fields, like this perhaps:
Or, if you want to make sure you use only one of the three for any one variant:
These assume "." is translated to NA.
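To make the two ideas concrete, here is a minimal sketch in Python rather than R. It is not PureCN's actual filtering code: the function names, the threshold of 25, the dict layout, and the QUAL/BQ/MBQ precedence order are all assumptions for illustration. A missing value ("." in the VCF) is represented as `None`, playing the role of NA.

```python
from typing import Mapping, Optional

# Hypothetical threshold and field order, chosen only for illustration.
MIN_QUALITY = 25.0
QUALITY_FIELDS = ("QUAL", "BQ", "MBQ")

Variant = Mapping[str, Optional[float]]


def passes_all_present(variant: Variant, min_quality: float = MIN_QUALITY) -> bool:
    """Keep the variant unless a field that is present falls below the threshold.

    Missing fields (None) are ignored rather than treated as failures.
    """
    return all(
        variant.get(field) is None or variant[field] >= min_quality
        for field in QUALITY_FIELDS
    )


def passes_first_present(variant: Variant, min_quality: float = MIN_QUALITY) -> bool:
    """Judge the variant on the first quality field that is present.

    If none of the fields is present, there is nothing to filter on,
    so the variant is kept.
    """
    for field in QUALITY_FIELDS:
        value = variant.get(field)
        if value is not None:
            return value >= min_quality
    return True


germline = {"QUAL": 50.0, "BQ": None, "MBQ": None}
somatic = {"QUAL": None, "BQ": None, "MBQ": None}  # QUAL was "." in the VCF
mixed = {"QUAL": 50.0, "BQ": None, "MBQ": 10.0}

kept_all = [passes_all_present(v) for v in (germline, somatic, mixed)]
kept_first = [passes_first_present(v) for v in (germline, somatic, mixed)]
```

The two versions differ only on variants like `mixed`: the first rejects it because one present field is too low, the second keeps it because only the first present field (QUAL here) is consulted. Either way, a somatic call with every field missing is kept instead of being dropped.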