Question: Dx.R somatic_ontarget column
12 months ago by
twtoal0 wrote:

In the Dx.R output file for a sample I'm looking at, the column "somatic_ontarget" is 0 in 119 entries and 1 in 2 entries. What is this column? Is it the number of somatic variants that were found to lie within the Dx.R target areas?

purecn • 188 views
Answer: Dx.R somatic_ontarget column
12 months ago by
markus.riester110 wrote:

Yes. Have a look at section 10.4 and Table 5 for a description of the output.

If almost all files show no somatic mutations, then there is most likely an issue with the VCFs.

Check the predictSomatic output (Sampleinfo_variants.csv). If the somatic variants are missing, then they were removed by the QC filters (see the log file for details). If they are assigned a low prior probability of being somatic, the somatic annotation in your VCF is wrong. If they are all classified as germline despite being clearly somatic and labeled as such, then something is very wrong with the setup. If they show up properly in the predictSomatic output, then they are being filtered out in Dx.R, for example by a wrong CallableLoci file.
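The checks above can be sketched as a small triage script. This is only an illustration in Python, not part of PureCN: the function name `triage`, the outcome labels, and the 0.1 cutoff are made up for the example, and the column names (`chr`, `start`, `prior.somatic`, `ML.SOMATIC`) are assumptions about the variants CSV that should be checked against your own file header.

```python
import csv
import io

# Hypothetical outcome labels mapped to the explanations given above.
HINTS = {
    "missing": "variants absent from the output: removed by QC filters, see the log",
    "low_prior": "low prior probability of being somatic: check the VCF annotation",
    "germline": "high prior but classified germline: check the overall setup",
    "called_somatic": "called somatic here: look at Dx.R filtering (e.g. CallableLoci)",
}

def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header names."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def triage(rows, expected_somatic, key_cols=("chr", "start"),
           prior_col="prior.somatic", call_col="ML.SOMATIC", prior_cutoff=0.1):
    """Classify what happened to variants you know are somatic.

    expected_somatic: iterable of string tuples matching key_cols,
    e.g. [("chr1", "100")]. All column names are assumptions.
    """
    by_key = {tuple(row[c] for c in key_cols): row for row in rows}
    found = [by_key[k] for k in expected_somatic if k in by_key]
    if not found:
        return "missing"
    likely = [r for r in found if float(r[prior_col]) > prior_cutoff]
    if not likely:
        return "low_prior"
    called = [r for r in likely if r[call_col].strip().upper() == "TRUE"]
    if not called:
        return "germline"
    return "called_somatic"
```

Pick a handful of variants that are clearly somatic in the VCF, pass their coordinates as `expected_somatic`, and look up the returned label in `HINTS`. The sketch assumes the prior column is numeric in every row; missing values such as `NA` would need to be handled first.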

I would recommend Mutect 1.1.7 for test runs. 



Thanks, very helpful.

I didn't find Table 5 because I was searching for "somatic_ontarget" instead of "somatic.ontarget". I had changed the dots to underscores in the headers when I aggregated the files across all my samples (I hate dots in column names, but it was probably a bad thing to do).
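One way to avoid that mismatch is to aggregate with the original header names and keep the underscore spellings only as aliases. A minimal Python sketch, assuming every per-sample file has the same header; the function names and the `source_sample` column are hypothetical and chosen just for this example:

```python
import csv
import io

def aggregate_csv(texts_by_sample):
    """Merge per-sample CSV texts, keeping the original column names.

    texts_by_sample: dict mapping a sample name to the CSV text of its file.
    Returns (header, rows); each row is a dict with an extra 'source_sample' key.
    """
    header = None
    merged = []
    for sample, text in texts_by_sample.items():
        reader = csv.DictReader(io.StringIO(text))
        fields = list(reader.fieldnames or [])
        if header is None:
            header = fields
        elif fields != header:
            raise ValueError(f"header of {sample} differs from the first file")
        for row in reader:
            merged.append({"source_sample": sample, **row})
    return ["source_sample"] + (header or []), merged

def alias_map(header):
    """Map underscore spellings back to the original dotted names."""
    return {name.replace(".", "_"): name for name in header}
```

With this, `alias_map(header)["somatic_ontarget"]` gives back `"somatic.ontarget"`, which is the spelling to search for in the documentation.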

The "somatic.all" column is the same: almost all 0. In the variants.csv files, only 198 variants (across about 130 samples) have prior_somatic > 0.1. As far as I can tell, my VCF files annotate somatic variants properly: they have the INFO/SOMATIC flag. Some somatic variants are also flagged DB if they are in the gnomAD database.

I found that the INFO field had ".;SOMATIC" instead of just "SOMATIC".  I fixed that problem, but it made no difference.
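For reference, this kind of INFO check can be sketched in plain Python without a VCF library. It only assumes the standard tab-separated VCF layout with INFO as the eighth column; the function names and the counter keys are made up for the example:

```python
def clean_info(info):
    """Drop stray '.' entries from an INFO string such as '.;SOMATIC'."""
    entries = [e for e in info.split(";") if e and e != "."]
    return ";".join(entries) if entries else "."

def summarize_info_flags(vcf_lines):
    """Count records carrying the SOMATIC and DB flags, and malformed INFO fields."""
    counts = {"records": 0, "somatic": 0, "db": 0, "somatic_and_db": 0, "stray_dot": 0}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        # INFO entries are 'KEY' flags or 'KEY=value' pairs separated by ';'
        keys = [entry.split("=", 1)[0] for entry in fields[7].split(";")]
        counts["records"] += 1
        somatic = "SOMATIC" in keys
        db = "DB" in keys
        counts["somatic"] += somatic
        counts["db"] += db
        counts["somatic_and_db"] += somatic and db
        counts["stray_dot"] += "." in keys and len(keys) > 1
    return counts
```

Running `summarize_info_flags(open(path))` on an uncompressed VCF shows how many records carry SOMATIC, how many of those are also flagged DB, and whether any `.;` prefixes remain.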

I don't see anything in the log output that would make me think PureCN had encountered a problem with somatic variants.

Is there a place in the code you could point me to where I can look to see what is going on?


modified 12 months ago • written 12 months ago by twtoal0