Problem with the TMB plot
nayanajose • 0

I was doing variant calling on whole exome sequences obtained from tumor tissues. I used an inbuilt loop to run the filtering steps up to the creation of the VCF files. The loop ran successfully and produced the resulting files. But when I created TMB plots from them, the graph looks linear, with no similar TMB values across the samples that were used. Could you please help me identify what might have gone wrong?

Bioconductor • 442 views
Kevin Blighe ★ 4.0k

The issue that you describe with the tumor mutational burden (TMB) plots appearing linear and lacking consistent TMB values across samples may arise from several potential problems in your variant calling or post-processing pipeline. Since you mentioned using an inbuilt loop for filtering variants up to the generation of variant call format (VCF) files, and the loop executed without errors, the problem likely occurs during the TMB calculation or plotting stage rather than in the initial variant calling.

First, verify that your filtering loop correctly identifies somatic variants. In whole exome sequencing data from tumor tissues, without matched normal samples, distinguishing somatic mutations from germline variants can be challenging. If your loop did not apply appropriate filters (such as allele frequency thresholds, population database exclusions like gnomAD, or annotation with tools like ANNOVAR or VEP), the VCF files may include excessive non-somatic variants, leading to inflated or uniform TMB estimates. Re-examine your loop code to ensure it incorporates somatic-specific criteria.
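As a rough sketch of what such somatic-specific criteria might look like in R, the snippet below filters a toy table of variants. The column names (`vaf`, `gnomad_af`, `depth`) and all cutoffs are hypothetical placeholders, not recommendations; suitable values depend on your annotation tool, sequencing depth, and tumor purity.

```r
# Toy variant table; column names and thresholds are illustrative assumptions.
variants <- data.frame(
  sample    = c("S1", "S1", "S1", "S2", "S2"),
  vaf       = c(0.48, 0.12, 0.03, 0.51, 0.20),    # variant allele fraction
  gnomad_af = c(0.15, 0.00, 0.00, 0.00002, NA),   # population frequency; NA = absent from database
  depth     = c(80, 60, 12, 95, 70),
  stringsAsFactors = FALSE
)

# Treat variants absent from the population database as frequency 0.
pop_af <- ifelse(is.na(variants$gnomad_af), 0, variants$gnomad_af)

keep <- pop_af < 0.001 &      # drop common (likely germline) variants
  variants$vaf >= 0.05 &      # drop very low-VAF calls (likely noise)
  variants$depth >= 20        # require a minimum coverage

likely_somatic <- variants[keep, ]

# Number of retained variants per sample
print(table(likely_somatic$sample))
```

If the per-sample counts barely change after such a filter, the loop was probably already applying comparable criteria; if they drop sharply, germline variants were likely inflating the TMB estimates.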

Second, the TMB calculation itself may be incorrect. TMB is typically computed as the number of non-synonymous somatic mutations per megabase of the exome. If you omitted division by the effective exome size (often around 30-50 megabases for capture kits), or if you included all variants without filtering for coding regions, the values could appear artificially linear when plotted. In R/Bioconductor, if you are using packages like maftools for TMB estimation, ensure your code resembles the following:

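library(maftools)

# Load your MAF file derived from VCF
maf <- read.maf(maf = "your_maf_file.maf")

# Calculate TMB; captureSize is in megabases, so adjust it to your kit.
# tmb() draws the TMB plot itself and returns a per-sample table.
tmb_values <- tmb(maf = maf, captureSize = 50)

# Inspect the per-sample values (columns include Tumor_Sample_Barcode and total_perMB)
head(tmb_values)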

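Adjust the captureSize parameter (given in megabases) to match your sequencing kit's targeted region size. An incorrect value rescales every sample's TMB by the same factor, so the absolute values will be wrong even though the ranking of samples stays the same.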
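To make the arithmetic concrete, here is a minimal base R sketch of the per-megabase calculation. The mutation counts and the 40 Mb capture size are made-up numbers for illustration only.

```r
# Hypothetical per-sample counts of non-synonymous somatic mutations.
mutation_counts <- c(S1 = 120, S2 = 45, S3 = 600)

# Assumed capture size in megabases; replace with your kit's actual target size.
capture_size_mb <- 40

# TMB = mutations per megabase of targeted sequence
tmb_per_mb <- mutation_counts / capture_size_mb
print(tmb_per_mb)
```

Comparing a hand calculation like this against the values your pipeline reports for a few samples is a quick way to check whether the division step was applied at all.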

Third, inspect your plotting code. A linear appearance may result from sorting samples by TMB value inadvertently, creating a monotonic line instead of a bar plot or boxplot. Use ggplot2 for explicit control:

library(ggplot2)

ggplot(tmb_values, aes(x = Tumor_Sample_Barcode, y = total_perMB)) +
  geom_bar(stat = "identity") +
  theme_minimal()
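The sorting effect is easy to reproduce with a few made-up numbers. The base R sketch below uses hypothetical TMB values and shows that any set of values forms a monotonic sequence once sorted, so a steadily rising curve in a sorted plot is not in itself a sign of a problem.

```r
# Hypothetical TMB values for five samples, in no particular order.
tmb <- c(S1 = 2.1, S2 = 14.8, S3 = 0.6, S4 = 5.3, S5 = 1.2)

# After sorting, the sequence is monotonic by construction.
sorted_tmb <- sort(tmb)
is_monotonic <- !is.unsorted(sorted_tmb)

# What matters for data quality is how much the values differ between samples.
tmb_range <- range(tmb)
n_distinct <- length(unique(tmb))

print(sorted_tmb)
print(is_monotonic)
```

If your plot function orders samples by TMB, different values per sample along a rising line is the expected appearance for a heterogeneous tumor cohort.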

Finally, confirm that your VCF files contain variable mutation counts across samples by manually checking a subset with bcftools stats. If counts are identical, revisit the input FASTQ alignment and calling steps, as uniform depth or contamination could be factors.
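If you would rather stay in R for a first look, a rough equivalent of that check is to count the records in each VCF. The sketch below writes two tiny toy VCFs to a temporary directory so that it runs on its own; the file names and contents are invented, and for real data you would point `vcf_dir` at your own uncompressed VCF files.

```r
# Write two tiny example VCFs to a temporary directory (toy data for illustration only).
vcf_dir <- tempfile("vcf_demo")
dir.create(vcf_dir)
header <- c("##fileformat=VCFv4.2",
            "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO")
writeLines(c(header,
             "chr1\t100\t.\tA\tT\t50\tPASS\t.",
             "chr1\t200\t.\tG\tC\t50\tPASS\t.",
             "chr2\t300\t.\tT\tG\t50\tPASS\t."),
           file.path(vcf_dir, "sampleA.vcf"))
writeLines(c(header,
             "chr1\t150\t.\tC\tA\t50\tPASS\t."),
           file.path(vcf_dir, "sampleB.vcf"))

# Count records per file: every line that does not start with '#'.
count_records <- function(path) {
  lines <- readLines(path)
  sum(!startsWith(lines, "#"))
}

vcf_files <- list.files(vcf_dir, pattern = "\\.vcf$", full.names = TRUE)
counts <- vapply(vcf_files, count_records, integer(1))
names(counts) <- basename(vcf_files)
print(counts)
```

This only counts records and ignores FILTER status and variant type, so treat it as a sanity check rather than a replacement for bcftools stats.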

Kevin
