#### Posts by markus.riester

... Can you post the log-file? No variants to plot means that PureCN found no germline SNPs to plot.  ...
written 24 days ago by markus.riester90
... I'll fix the crash, but the problem is very likely your input data. The warning means that for some of the variants in your VCF, PureCN could not find any neighboring SNPs in the mapping database. There is currently no size restriction, so the neighborhood is the whole chromosome. This means either your ...
written 27 days ago by markus.riester90
... Ok. What exactly are the paired tumor/normal somatic counts? Is this straight out of Mutect without any filtering? In other words, how many of the true somatic counts are present in the predictSomatic output and how many of those are predicted to be germline? PureCN filters by coverage, allelic fra ...
written 5 weeks ago by markus.riester90
... This looks good, with the exception that your VCF contains a huge number of artifacts. Only 13% are annotated as being in dbSNP and after variant filtering still less than 50% are in dbSNP. I get >85% even in MSI-H samples with Mutect 1.1.7. Is this Mutect 1.1.7 or Mutect2 in GATK4? If M1, is th ...
written 5 weeks ago by markus.riester90
... Hi KJ, Can you add the output of the log-file (in case you used the package directly, not the PureCN.R script, you need to specify a log.file in runAbsoluteCN)? Do you use a mappability file in IntervalFile.R? What cutoffs are you using for predicted somatic? Are you including known germline sites ...
written 6 weeks ago by markus.riester90
... One of the first is chr1:10625-10704. Only inaccessible regions (Ns in the FASTA) like centromeres and telomeres should be 0. In my file generated by the old GEM it is like this below. My guess is the new GEM requires a minimal mappability. The smallest score I see in your file is 0.166667. > s ...
written 7 weeks ago by markus.riester90
... Your mappability file has a lot of regions with 0 mappability that are properly sequenced. When you generated the mappability file, did you maybe use a FASTA file in which repeats were masked?   The problem is that you end up with only small regions between these masked regions, way smaller than 20 ...
written 7 weeks ago by markus.riester90
... Not sure, I looked into this more than 2 years ago. I used the following and believe it's correct: rds <- readRDS("Sampleid.rds"); r <- rds$results[[1]]; r$seg$seg.mean.adjusted <- r$seg$seg.mean/r$purity - 2*(1-r$purity)/(r$purity*r$ploidy) I haven't used it much though because I found ...
written 7 weeks ago by markus.riester90
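
The adjustment quoted in the post above can be tried on a toy segment table without a PureCN result file. This is only a minimal sketch: the helper name `adjust_seg_mean` and all numbers are hypothetical, and the formula is copied from the post as written, which its author describes as something he believes to be correct rather than as a verified method.

```r
# Hypothetical helper: applies the formula from the post verbatim.
# seg_mean: segment log-ratios; purity and ploidy: scalars in the same
# sense as r$purity and r$ploidy in the post.
adjust_seg_mean <- function(seg_mean, purity, ploidy) {
  seg_mean / purity - 2 * (1 - purity) / (purity * ploidy)
}

# Toy segment table (made-up values, not PureCN output).
seg <- data.frame(seg.mean = c(-0.5, 0, 0.4))
seg$seg.mean.adjusted <- adjust_seg_mean(seg$seg.mean, purity = 0.6, ploidy = 2)
print(seg)
```

At a purity of 1 the second term vanishes and the values are returned unchanged, which is a quick sanity check on the implementation.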
... Thanks, I'll have a look at your mappability file in the next 1-2 days. Have a look at the CNVkit paper (http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004873), which explains it nicely. The idea is to tile the genome into off-target bins and then just count the random r ...
written 7 weeks ago by markus.riester90
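
The tiling step mentioned in the post above can be sketched in base R. This is an illustrative toy, not the CNVkit or PureCN implementation: the function name `tile_off_target`, the single-chromosome setup, and all coordinates are assumptions, and targets are assumed to be sorted and non-overlapping. Filtering by mappability or minimum bin size is not shown.

```r
# Hypothetical sketch: tile the gaps between targets into off-target bins.
# Coordinates are 1-based and inclusive.
tile_off_target <- function(chr_length, targets, bin_size) {
  # Gaps before the first target, between targets, and after the last one.
  gap_start <- c(1, targets$end + 1)
  gap_end   <- c(targets$start - 1, chr_length)
  keep <- gap_end >= gap_start          # drop empty gaps
  gap_start <- gap_start[keep]
  gap_end   <- gap_end[keep]
  bins <- lapply(seq_along(gap_start), function(i) {
    starts <- seq(gap_start[i], gap_end[i], by = bin_size)
    # The last bin of each gap is truncated at the gap boundary.
    data.frame(start = starts, end = pmin(starts + bin_size - 1, gap_end[i]))
  })
  do.call(rbind, bins)
}

# Made-up example: two targets on a 1000 bp chromosome.
targets <- data.frame(start = c(101, 401), end = c(200, 500))
bins <- tile_off_target(chr_length = 1000, targets = targets, bin_size = 150)
print(bins)
```

Every base outside the targets ends up in exactly one bin, so the bin widths sum to the chromosome length minus the total target width.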
... All the log-ratios are standard log2 tumor vs normal coverage (of course after normalization for total sequencing coverage). This is exactly what you would get from any other copy number tool that does not do any purity/ploidy adjustment, such as CNVkit or GATK4. So no purity adjustment. If you need purity ...
written 7 weeks ago by markus.riester90
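
The log-ratio described in the post above can be illustrated with a few lines of R. This is a minimal sketch with made-up per-interval counts and a hypothetical helper `log_ratio`; it only normalizes by total coverage and ignores any further corrections a real tool may apply.

```r
# Hypothetical per-interval read counts for a tumor and a matched normal.
tumor  <- c(120, 240, 60, 180)
normal <- c(100, 100, 100, 100)

# Scale each sample by its total coverage, then take the log2 ratio.
log_ratio <- function(tumor, normal) {
  log2((tumor / sum(tumor)) / (normal / sum(normal)))
}

print(round(log_ratio(tumor, normal), 3))
```

Because each sample is divided by its own total, sequencing the tumor twice as deep leaves the log-ratios unchanged, and identical samples give a log-ratio of 0 everywhere.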

