In the VCF file of mutations, is it correct that the file should contain not only germline mutations that are SNPs and marked with DB and POP_AF and rs ID, but should ALSO contain germline mutations that are NOT SNPs and therefore lack DB, POP_AF, and an rs ID?


It shouldn't matter much whether you include them or not. But sure, if your VCF contains lots of artifacts and you think that most of the private germline variants are artifacts or of low or uncertain quality, you can certainly throw them out. If your matched normal has decent coverage, most of these artifacts should be removed automatically, because most should have allelic fractions significantly different from 0.5.
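To illustrate the allelic-fraction argument, here is a minimal sketch of how one could check whether a variant's read counts in the matched normal are consistent with a heterozygous germline site (expected fraction 0.5). This is not PureCN's internal logic; the function names and the `alpha` threshold are hypothetical, and homozygous germline variants (fraction near 1.0) are deliberately not handled.

```python
from math import comb


def binom_two_sided_p(alt_reads, depth):
    """Exact two-sided binomial p-value for observing alt_reads out of depth
    reads when the true allelic fraction is 0.5."""
    if depth == 0:
        return 1.0  # no reads, no evidence either way
    # Probability of each possible alt-read count under fraction 0.5.
    pmf = [comb(depth, k) / 2 ** depth for k in range(depth + 1)]
    observed = pmf[alt_reads]
    # Sum the outcomes that are at least as unlikely as the observed one.
    tail = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, tail)


def looks_heterozygous(alt_reads, depth, alpha=0.001):
    """Hypothetical helper: True if the counts are compatible with 0.5.
    alpha is an arbitrary illustrative cutoff, not a recommended value."""
    return binom_two_sided_p(alt_reads, depth) >= alpha
```

With decent coverage, a call at 10 alt reads out of 100 is clearly incompatible with 0.5, while 48 out of 100 is not. At low coverage the test has little power, which is why the coverage of the matched normal matters here.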

Again, PureCN was mainly written for the tumor-only setting, where we want to know whether these private variants are somatic, germline, or artifacts. In a matched tumor/normal setting, there is probably some minor benefit in keeping only SNPs you know behave well (and keeping high-quality somatic calls, of course!).
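If you do want to drop the private germline variants, a rough sketch of such a filter could look like the following. It works on plain VCF text lines and keeps header lines, records carrying the `DB` flag, and passing somatic calls. The assumption that somatic calls carry a `SOMATIC` INFO flag and `PASS` in the FILTER column depends on your caller, so adjust that to your own VCF; the function names are made up for this example.

```python
def info_keys(info_field):
    """Return the set of INFO keys (flags and key=value entries)."""
    if info_field == ".":
        return set()
    return {entry.split("=", 1)[0] for entry in info_field.split(";") if entry}


def keep_record(line, somatic_flag="SOMATIC"):
    """Keep headers, known SNPs (DB flag) and passing somatic calls.
    somatic_flag is an assumption about how the caller marks somatic calls."""
    if line.startswith("#"):
        return True
    fields = line.rstrip("\n").split("\t")
    filter_column, info_column = fields[6], fields[7]
    keys = info_keys(info_column)
    is_known_snp = "DB" in keys
    is_somatic_call = somatic_flag in keys and filter_column == "PASS"
    return is_known_snp or is_somatic_call


def filter_vcf_lines(lines, somatic_flag="SOMATIC"):
    """Return only the lines that keep_record accepts, in original order."""
    return [line for line in lines if keep_record(line, somatic_flag)]
```

Private germline variants (no `DB`, no somatic flag) are dropped by this filter, which is only worth doing in the matched tumor/normal setting described above.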


