Assigning DUP/DEL p-value to CNV segment
@twtoal-15473

Does the following seem like a reasonable way to assign a duplication or deletion p-value to each segment in the dnacopy.seg file?

1. Use PureCN's readCurationFile to read the .rds file into object RDS, then retrieve the log ratios of the marks in RDS$input$log.ratio

2. Compute the adjusted copy ratio of each mark by taking 2^log.ratio, then applying the purity/ploidy adjustment equation to the result, using purity and ploidy from the curation file.

3. Use a one-sample Wilcoxon signed-rank test to test whether the resulting copy ratios in each segment are significantly greater than, or less than, 1.
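For concreteness, steps 2 and 3 could be sketched roughly as below. This is only an illustration in plain Python, not PureCN code. The mixture model used for the purity/ploidy adjustment is an assumption (the question does not spell out the equation), the function names `adjusted_copy_ratio` and `signed_rank_p` are made up for this sketch, and the p-value uses a normal approximation rather than an exact test. The log ratios and the purity/ploidy values are toy numbers.

```python
import math
from statistics import NormalDist


def adjusted_copy_ratio(log_ratio, purity, ploidy):
    """Purity/ploidy-adjusted copy ratio: tumor copies divided by tumor ploidy.

    Assumed mixture model (not taken from the thread):
    observed = (purity*C + 2*(1-purity)) / (purity*ploidy + 2*(1-purity))
    """
    observed = 2.0 ** log_ratio
    normal_part = 2.0 * (1.0 - purity)
    tumor_copies = (observed * (purity * ploidy + normal_part) - normal_part) / purity
    return tumor_copies / ploidy


def signed_rank_p(values, mu=1.0, alternative="greater"):
    """One-sided Wilcoxon signed-rank p-value via the normal approximation."""
    diffs = [v - mu for v in values if v != mu]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        return 1.0
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    z = (w_plus - mean) / math.sqrt(var)
    cdf = NormalDist().cdf(z)
    return 1.0 - cdf if alternative == "greater" else cdf


# Toy log ratios for the marks of one segment (made-up numbers).
segment_log_ratios = [0.30, 0.25, 0.35, 0.28, 0.40, 0.22, 0.31, 0.27]
purity, ploidy = 0.6, 2.0  # would come from the curation file

ratios = [adjusted_copy_ratio(lr, purity, ploidy) for lr in segment_log_ratios]
p_dup = signed_rank_p(ratios, mu=1.0, alternative="greater")
p_del = signed_rank_p(ratios, mu=1.0, alternative="less")
print("duplication p:", p_dup)
print("deletion p:", p_del)
```

In R, the test itself would be `wilcox.test(x, mu = 1, alternative = "greater")` (or `"less"`), applied to the adjusted ratios of one segment at a time.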

 

 

@markusriester-9875

Yes, you want the coverage, not the variant log-ratios. Usually fewer than 15% of exons (num.marks in DNAcopy output) have variants, so you would ignore most of the information.

You can probably use something like voom to compare the tumor coverage against all normals in the database. This will incorporate the variance in the pool of normals when calculating p-values. But I don't think this will be that useful: it is probably not sensitive enough for very low purities, and for higher purities pretty much everything PureCN calls should be significant. Still, it is worth a try if you really need p-values. Let me know if you are happy with the results.

You can use the readCoverageFile function to load the coverage files of tumor and normal, build a matrix of counts and use something like voom.
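To make the idea of using the variance in the pool of normals concrete, here is a deliberately simplified stand-in. It is not voom and not PureCN code: it converts counts to log2 counts-per-million, averages them over the bins of one segment, and compares the tumor value with the same statistic in each normal using a plain z-score. The function names, the 0.5 pseudocount and the toy counts are all assumptions made for this sketch, and with only a handful of normals the normal-distribution p-value will be too optimistic.

```python
import math
from statistics import NormalDist, mean, stdev


def log2_cpm(counts):
    """log2 counts-per-million for one sample, with a small pseudocount."""
    total = sum(counts)
    return [math.log2((c + 0.5) / (total + 1.0) * 1e6) for c in counts]


def segment_z_test(tumor_counts, normal_counts, bins):
    """Compare the tumor's mean log2-CPM over `bins` with each normal's value.

    Returns (z, p_gain, p_loss). The spread of the normals supplies the variance.
    """
    tumor = log2_cpm(tumor_counts)
    tumor_stat = mean(tumor[i] for i in bins)
    normal_stats = []
    for counts in normal_counts:
        sample = log2_cpm(counts)
        normal_stats.append(mean(sample[i] for i in bins))
    z = (tumor_stat - mean(normal_stats)) / stdev(normal_stats)
    p_gain = 1.0 - NormalDist().cdf(z)
    return z, p_gain, 1.0 - p_gain


# Toy count matrix: 6 bins, 4 normals, and a tumor with bins 2 and 3 amplified.
normals = [
    [100, 110, 90, 105, 95, 100],
    [98, 112, 93, 101, 97, 99],
    [103, 108, 88, 107, 94, 102],
    [101, 109, 91, 103, 96, 100],
]
tumor = [100, 110, 180, 210, 95, 100]

z, p_gain, p_loss = segment_z_test(tumor, normals, bins=[2, 3])
print("z:", z, "gain p:", p_gain, "loss p:", p_loss)
```

A real analysis would use voom (or a similar method) on the full count matrix, which also models how the variance depends on coverage. This sketch ignores that.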

 

 


Sorry, my edit to my question crossed with your response. Could you re-read my question and adjust your response?


This probably works well for larger segments, but you assume a constant variance and ignore the fold-change. Using something like voom will incorporate the variance observed in the pool of normals and the variance due to coverage. It might be OK, since the log-ratios are cleaned of most of the noise we can get rid of, but I'm not sure this rank-based approach will work in short segments, where the p-value would be most useful. A pure tumor vs. normal coverage p-value also does not account for purity and ploidy.
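One way to see the short-segment problem: in an exact one-sided signed-rank test with n untied, nonzero differences, the most extreme outcome (all marks on the same side of 1) has probability 1/2^n under the null, so no segment with n marks can get a p-value below that. A minimal sketch, with a made-up function name:

```python
def min_one_sided_p(num_marks):
    """Smallest attainable one-sided exact signed-rank p-value for num_marks
    untied, nonzero differences: all signs agree with probability 0.5**n."""
    return 0.5 ** num_marks


for n in (3, 5, 10, 20):
    print(n, min_one_sided_p(n))
```

So a segment needs at least 5 marks before it can reach p < 0.05 at all, and more once you correct for testing many segments.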

Since this is not really a PureCN question, look around, you might also find something in the germline literature or ask in a broader forum like Biostars. 
