Supplemental Code for Somatic Variant Detection and Filtering

This directory contains the scripts used to identify and filter the tumor-specific somatic mutations described in the manuscript.

1. 01_mutect2_pipeline.sh
   - Runs GATK Mutect2 in tumor-normal mode.
   - Does not use a germline resource (no --germline-resource is supplied).
   - Applies additional hard filters based on total depth and ALT read counts.
   - Based on Lange et al. 2020.
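
The depth and ALT-read hard filter from 01_mutect2_pipeline.sh could look roughly like the sketch below. It is an illustration, not the script itself: the function name hard_filter, the variables MIN_DP, MIN_ALT and TUMOR_COL, and the threshold values are all assumptions, since the actual cutoffs are not stated in this file. It also assumes an uncompressed VCF whose FORMAT column includes AD and DP, with only the first ALT allele considered.

```shell
#!/usr/bin/env bash
# Hypothetical thresholds; the real cutoffs are not given in this README.
MIN_DP=${MIN_DP:-10}        # minimum total depth in the tumor sample
MIN_ALT=${MIN_ALT:-3}       # minimum number of ALT-supporting reads
TUMOR_COL=${TUMOR_COL:-10}  # VCF column holding the tumor sample (assumed)

# hard_filter: read an uncompressed VCF on stdin, write kept records to stdout.
hard_filter() {
  awk -F'\t' -v min_dp="$MIN_DP" -v min_alt="$MIN_ALT" -v col="$TUMOR_COL" '
    /^#/ { print; next }               # pass header lines through unchanged
    {
      n = split($9, keys, ":")         # FORMAT keys, e.g. GT:AD:DP
      split($col, vals, ":")           # matching values for the tumor sample
      dp = -1; alt = -1                # records lacking DP or AD are dropped
      for (i = 1; i <= n; i++) {
        if (keys[i] == "DP") dp = vals[i] + 0
        if (keys[i] == "AD") { split(vals[i], ad, ","); alt = ad[2] + 0 }
      }
      if (dp >= min_dp && alt >= min_alt) print
    }'
}
```

Usage might be `hard_filter < mutect2.filtered.vcf > mutect2.hard.vcf`, where both file names are placeholders. Because sample column order in a tumor-normal VCF is not guaranteed, TUMOR_COL should be checked against the #CHROM header line before use.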

2. 02_octopus_pipeline.sh
   - Runs Octopus for haplotype-based variant calling.
   - Used in parallel with Mutect2 to identify somatic variants.

3. 03_postprocess_vcf.sh
   - Tags each variant with additional annotations.
   - Filters out variants on chrX, chrY, and chrM, variants listed in dbSNP, and variants shared between tumors.
   - Computes a MutID of the form CHR_POS_REF>ALT for each variant and uses it to count unique mutations across samples.
   - Variants that meet all criteria are tagged "PASS" for downstream analysis.
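
As a rough illustration of the MutID step, the sketch below builds CHR_POS_REF>ALT identifiers and lists those that occur in more than one tumor. The function names mut_ids and shared_mut_ids are hypothetical and are not taken from 03_postprocess_vcf.sh. It assumes uncompressed, one-file-per-tumor VCFs with a single ALT allele per record.

```shell
#!/usr/bin/env bash
# mut_ids: print one MutID (CHR_POS_REF>ALT) per record of a VCF read from stdin.
mut_ids() {
  awk -F'\t' '!/^#/ { print $1 "_" $2 "_" $4 ">" $5 }'
}

# shared_mut_ids: given several per-tumor VCF files, print each MutID that
# appears in more than one of them.
shared_mut_ids() {
  local vcf
  for vcf in "$@"; do
    mut_ids < "$vcf" | sort -u   # count each MutID at most once per tumor
  done | sort | uniq -d          # keep only MutIDs seen in two or more tumors
}
```

For example, `shared_mut_ids tumorA.vcf tumorB.vcf > shared_ids.txt` (placeholder file names) would produce a list that a later step could use to exclude shared variants. Multi-allelic records would need to be split beforehand for the identifiers to be comparable.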

All scripts are written in Bash and are intended to run reproducibly across systems.
