Corrupt data in Sex column of .csv file
twtoal ▴ 10
@twtoal-15473

The PureCN .csv output file for one of my samples has something odd in its "Sex" column. Here are the contents of the file:

"Sampleid","Purity","Ploidy","Sex","Contamination","Flagged","Failed","Curated","Comment"
"I_26837_N1_T1",0.27,2.00260700313316,"Coverage: ? VCF: M",0,TRUE,FALSE,FALSE,"LOW PURITY;EXCESSIVE LOH"

This is for PureCN version 1.16.0.

PureCN • 826 views

I see that the user guide says:

Sex mismatch of coverage and VCF: If the panel contains baits for chromosome Y, then the interval file was probably generated without mappability file (Section 2.2). Cross-sample contamination (Section 10.5) can also cause sex mismatches.

That doesn't say the Sex column will contain what I observe, yet the appearance of "Coverage" and "VCF" in that value makes me think this might be the cause.

@markusriester-9875

PureCN runs two independent checks for sample sex, one based on coverage and one based on the VCF. If they agree, you see either "F" or "M". If they disagree, you see a string like the one in your file. "?" means the test was inconclusive; for coverage, that means some chrY coverage, but less than expected. The most common reason for a conflict here is cross-sample contamination, but it is not the only one. Loss of sex chromosomes is not too uncommon in older patients; I see it in every couple of datasets.

Sub-optimal setups can also cause issues, such as non-uniquely mapped reads being included in this test, or assays that have few or no chrY baits. But if all your other samples are fine, you should be OK in this regard.
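
If you want to screen many result files for this kind of conflict, a minimal Python sketch along these lines may help. It is not part of PureCN: the helper `parse_sex` is hypothetical, and the "Coverage: ... VCF: ..." pattern is assumed from the single example in this thread, so other PureCN versions may format the field differently.

```python
import csv
import io
import re

# Assumed layout of a conflicting call, taken from the one example in this thread.
CONFLICT = re.compile(r"^Coverage:\s*(\S+)\s+VCF:\s*(\S+)$")


def parse_sex(field):
    """Return (coverage_call, vcf_call) for the Sex field of a PureCN .csv row.

    Hypothetical helper, not a PureCN function.
    """
    value = field.strip()
    match = CONFLICT.match(value)
    if match:
        # The two checks disagreed, so each call is reported separately.
        return match.group(1), match.group(2)
    # A plain value such as "F" or "M" means both checks agreed.
    return value, value


# The file contents posted in the question.
csv_text = '''"Sampleid","Purity","Ploidy","Sex","Contamination","Flagged","Failed","Curated","Comment"
"I_26837_N1_T1",0.27,2.00260700313316,"Coverage: ? VCF: M",0,TRUE,FALSE,FALSE,"LOW PURITY;EXCESSIVE LOH"
'''

for row in csv.DictReader(io.StringIO(csv_text)):
    coverage_call, vcf_call = parse_sex(row["Sex"])
    if coverage_call != vcf_call:
        print(row["Sampleid"], "coverage:", coverage_call, "VCF:", vcf_call)
```

Samples that print here are the ones worth checking for contamination or loss of a sex chromosome; samples with a plain "F" or "M" are skipped.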


Thanks.
