Corrupt data in Sex column of .csv file
twtoal • 0
@twtoal-15473
United States

The PureCN .csv output file for one of my samples has something odd in its "Sex" column. Here are the contents of the file:

"Sampleid","Purity","Ploidy","Sex","Contamination","Flagged","Failed","Curated","Comment"
"I_26837_N1_T1",0.27,2.00260700313316,"Coverage: ? VCF: M",0,TRUE,FALSE,FALSE,"LOW PURITY;EXCESSIVE LOH"


This is for PureCN version 1.16.0.

PureCN • 49 views

I see that the user guide says:

Sex mismatch of coverage and VCF: If the panel contains baits for chromosome Y, then the interval file was probably generated without mappability file (Section 2.2). Cross-sample contamination (Section 10.5) can also cause sex mismatches.


That doesn't say the Sex column will contain what I observe, yet the appearance of "Coverage" and "VCF" in that value makes me think this might be the cause.

@markusriester-9875
United States

PureCN does two independent checks for sample sex, one based on coverage and one based on the VCF. If they agree, you see either "F" or "M". If not, you see a string like the one in your file. "?" means the test was inconclusive; for coverage, that means some chrY coverage was found, but less than expected. The most common reason for a conflict here is cross-sample contamination, but not necessarily: loss of sex chromosomes is not too uncommon in older patients, and I see it in every couple of datasets.

Sub-optimal setups can also cause issues, such as non-uniquely mapped reads being included in this test, or assays that have little to no chrY baits. But if all other samples are fine, you should be OK in this regard.
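
For anyone who wants to screen a batch of these .csv files for such conflicts, here is a minimal Python sketch. It is not part of PureCN. It assumes that a conflicting Sex value always looks like "Coverage: X VCF: Y", which is inferred only from the single example in this thread, and the helper name `parse_sex` is made up for illustration.

```python
import csv
import io
import re

# Assumed layout of a conflicting value, inferred from the one example
# in this thread ("Coverage: ? VCF: M"); other layouts are not handled.
CONFLICT_RE = re.compile(r"^Coverage:\s*(\S+)\s+VCF:\s*(\S+)$")


def parse_sex(value):
    """Return (coverage_call, vcf_call, agree) for a Sex column value.

    Hypothetical helper, not a PureCN function.
    """
    value = value.strip()
    match = CONFLICT_RE.match(value)
    if match:
        coverage_call, vcf_call = match.groups()
        return coverage_call, vcf_call, False
    # A single value such as "F" or "M" means both checks agreed.
    return value, value, True


# The file contents quoted in the question above.
sample_csv = '''"Sampleid","Purity","Ploidy","Sex","Contamination","Flagged","Failed","Curated","Comment"
"I_26837_N1_T1",0.27,2.00260700313316,"Coverage: ? VCF: M",0,TRUE,FALSE,FALSE,"LOW PURITY;EXCESSIVE LOH"
'''

rows = list(csv.DictReader(io.StringIO(sample_csv)))
for row in rows:
    coverage_call, vcf_call, agree = parse_sex(row["Sex"])
    if not agree:
        print(f'{row["Sampleid"]}: coverage={coverage_call}, VCF={vcf_call}')
```

A "?" in either position marks that check as inconclusive, so a flagged row is a prompt to look at contamination and chrY coverage for that sample, not proof of a problem.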


Thanks.