Dx.R somatic_ontarget column
twtoal ▴ 10
@twtoal-15473
Last seen 5 months ago
United States

In the Dx.R output file for a sample I'm looking at, the column "somatic_ontarget" is 0 in 119 entries and 1 in 2 entries. What is this column? It sounds like it should be a count of the somatic variants that were found to lie within the Dx.R target regions. Is that right?

PureCN • 1.1k views
@markusriester-9875
Last seen 2.5 years ago
United States

Yes. Have a look at section 10.4 and Table 5 for a description of the output.

If there are no somatic mutations in almost all files, then there is most likely an issue with the VCFs.  

Check the predictSomatic output (Sampleinfo_variants.csv). If the somatic variants are missing, they were removed by the QC filters (see the log file for details). If they are assigned a low prior probability of being somatic, the somatic annotation in your VCF is wrong. If they are all classified as germline despite being clearly somatic and labeled as such, then something is very wrong with the setup. If they show up properly in the predictSomatic output, then you are filtering them out in Dx.R, for example by using the wrong CallableLoci file.
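One way to start on these checks is to tally, per sample, how many variants in the predictSomatic output carry a non-trivial somatic prior. The following is a minimal Python sketch, not part of PureCN: the column names `Sampleid` and `prior.somatic` and the 0.1 cutoff are assumptions to adapt to the actual file, and the inline rows are made up for illustration.

```python
import csv
import io
from collections import Counter

def count_likely_somatic(csv_text, sample_col="Sampleid",
                         prior_col="prior.somatic", cutoff=0.1):
    """Return {sample: (rows with prior > cutoff, total rows)}."""
    totals = Counter()
    above = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        sample = row[sample_col]
        totals[sample] += 1
        try:
            prior = float(row[prior_col])
        except ValueError:
            continue  # skip "NA" or empty priors
        if prior > cutoff:
            above[sample] += 1
    return {s: (above[s], totals[s]) for s in totals}

# Made-up rows for illustration only; column names are assumptions.
example = """Sampleid,chr,start,prior.somatic
S1,chr1,100,0.999
S1,chr1,200,0.0001
S2,chr2,300,0.0001
S2,chr2,400,NA
"""

summary = count_likely_somatic(example)
for sample, (n_above, n_total) in sorted(summary.items()):
    print(f"{sample}: {n_above} of {n_total} variants with prior > 0.1")
```

If nearly every sample reports zero variants above the cutoff, that points to the VCF annotation rather than to the Dx.R filtering step.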

I would recommend Mutect 1.1.7 for test runs. 

 


Thanks, very helpful.

I didn't find Table 5 because I was searching for "somatic_ontarget" instead of "somatic.ontarget". I had changed the dots to underscores in the headers when I aggregated the files across all my samples (I hate dots in column names, but it was probably a bad thing to do).
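If headers are renamed during aggregation, keeping a lookup back to the original names makes it easier to search the documentation later. A small Python sketch of that idea follows; the helper `underscore_headers` is hypothetical, and of the example headers only `somatic.ontarget` and `somatic.all` come from this thread.

```python
def underscore_headers(headers):
    """Replace dots with underscores and return (new_headers, new -> original)."""
    mapping = {}
    for name in headers:
        new = name.replace(".", "_")
        if new in mapping:
            # Two different originals would collapse to the same new name.
            raise ValueError(f"{name!r} and {mapping[new]!r} both map to {new!r}")
        mapping[new] = name
    return list(mapping), mapping

# Illustrative header list; "Sampleid" is an assumed column name.
headers = ["Sampleid", "somatic.ontarget", "somatic.all"]
renamed, original_name = underscore_headers(headers)
print(renamed)
print(original_name["somatic_ontarget"])
```

The second print gives the dotted name to search for in the vignette.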

The "somatic.all" column is the same, almost all 0. In the variants.csv file, only 198 variants (across about 130 samples) have prior_somatic > 0.1. As far as I can tell, my VCF files properly annotate somatic variants: they have the INFO/SOMATIC flag. Some somatic variants are also flagged DB if they are in the gnomAD database.

I found that the INFO field had ".;SOMATIC" instead of just "SOMATIC".  I fixed that problem, but it made no difference.
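To check for this kind of INFO problem across a whole file, the field can be split on semicolons and inspected key by key. The sketch below is plain Python and says nothing about how PureCN itself parses the VCF; `info_keys` and `check_somatic_info` are hypothetical helper names.

```python
def info_keys(info):
    """Split a VCF INFO string into its keys, dropping any '=value' part."""
    return [entry.split("=", 1)[0] for entry in info.split(";") if entry != ""]

def check_somatic_info(info):
    """Report whether SOMATIC and DB flags are present and whether a '.' is mixed in."""
    keys = info_keys(info)
    return {
        "has_somatic": "SOMATIC" in keys,
        "has_db": "DB" in keys,
        # A lone "." means an empty INFO field; alongside other entries it is malformed.
        "stray_dot": "." in keys and len(keys) > 1,
    }

for info in [".;SOMATIC", "SOMATIC", "SOMATIC;DB", "."]:
    print(info, check_somatic_info(info))
```

Running this over the INFO column of each VCF would show how many records still carry a stray "." and how many somatic records are also flagged DB.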

I don't see anything in the log output that would make me think PureCN had encountered a problem with somatic variants.

Is there a place in the code you could point me to where I can see what is going on?

 
