Recently, many tumor sequencing studies have inferred and reported on mutational signatures, short nucleotide patterns at which particular somatic base substitutions appear more often. A number of signatures reflect biological processes in the …
We propose a hierarchical latent Dirichlet allocation model (HiLDA) for characterizing somatic mutation data in cancer. The method allows us to infer mutational patterns and their relative frequencies in a set of tumor mutational catalogs and to compare the estimated frequencies between tumor sets.