The 'Context' visualization, otherwise known as the 'Lego plot' of mutational frequencies, describes the distribution of mutations across batch samples in a given region. Base substitutions are divided into six types to represent the six possible base changes (each type is shown in a different color, as indicated in the “Mutation Type” legend). Substitutions of each type are further subdivided by the 16 possible combinations of flanking nucleotides surrounding the mutated base, as listed in the “Trinucleotide Context” table. The pie chart shows the percentage of each mutation type across the batch samples. To visualize data, upload a BGZ file in the required format and use the sidebar options to customize the display.
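As an illustration of the classification above, the following Python sketch enumerates the 6 x 16 = 96 bins of the plot. The substitution labels follow the common pyrimidine-reference convention (C>A, C>G, and so on), which is an assumption here; the labels and ordering used in the actual “Mutation Type” legend may differ. The function name lego_categories is hypothetical.

```python
from itertools import product

BASES = "ACGT"
# Assumed labels: six substitution types written from the pyrimidine (C or T) base.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def lego_categories():
    """Return (substitution, trinucleotide context) pairs: 6 types x 16 flanks."""
    categories = []
    for sub in SUBSTITUTIONS:
        ref = sub[0]  # the mutated (reference) base sits in the middle of the context
        for five_prime, three_prime in product(BASES, repeat=2):
            categories.append((sub, five_prime + ref + three_prime))
    return categories


categories = lego_categories()
print(len(categories))  # 96
```

Each pair corresponds to one bin: the substitution type selects the color, and the trinucleotide context selects the position within that color group.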
SNV tsv.bgz File
The uploaded BGZ file must match the required format as specified below.
Check the official demo input here.
The file can be sorted and compressed from a TSV file with the following format:
- The # prefix is mandatory to indicate the header line.
- The chromosome column and position respectively stand for the chromosome and position of the mutation.
- context stands for the trinucleotide context of the mutation.
- The reference allele column and alt_allele respectively stand for the base before and after the mutation.
- tumor_f is optional. Users can filter out mutations whose tumor_f value is lower than a custom threshold.
The TSV file must be sorted by chromosome and position, and compressed by bgzip tools for tabix indexing to support fast data processing at the backend of Oviz-Bio.
For the example TSV file, run the following command in a Linux terminal (bgzip must be installed):
(head -n 1 SNV_Context_demo_MutList.tsv; sed -n '2,$p' SNV_Context_demo_MutList.tsv | sort -k1,1 -k2,2n) | bgzip -c > SNV_Context_demo.tsv.bgz
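Before compressing and uploading, it may help to confirm that the TSV rows are ordered in a way that can be indexed. The Python sketch below is one possible check and is not part of Oviz-Bio. It assumes that the first two tab-separated columns hold the chromosome and the position (as the sort keys in the command above suggest), skips lines starting with #, and accepts the input when each chromosome forms one contiguous block with non-decreasing positions. This is looser than reproducing the exact output of sort. The function name and the sample rows are hypothetical.

```python
def check_sorted(lines):
    """Return True if rows are grouped by chromosome with non-decreasing positions."""
    seen = set()      # chromosomes whose block has already started
    current = None    # chromosome of the previous data row
    last_pos = None   # position of the previous data row
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header line
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        if chrom != current:
            if chrom in seen:
                return False  # this chromosome appeared earlier: block is split
            seen.add(chrom)
            current = chrom
        elif pos < last_pos:
            return False      # position went backwards within a chromosome
        last_pos = pos
    return True


# Placeholder header and rows, for illustration only.
rows = ["#col1\tcol2\n", "chr1\t5\n", "chr1\t10\n", "chr2\t1\n"]
print(check_sorted(rows))  # True
```

To check a real file, pass an open file handle instead of the list, for example check_sorted(open("SNV_Context_demo_MutList.tsv")).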
Custom Bed File (optional)
A custom bed file allows users to replace the default region used on our website, which is the whole genome. Mutations that fall outside the custom region are filtered out during the calculation. Note that the uploaded bed file is applied only when you choose the custom bed option in the Settings section of the sidebar.
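The region filter described above can be pictured with the following Python sketch. It is only an illustration under stated assumptions: regions are given as (chromosome, start, end) rows using the usual BED convention of a 0-based start and an exclusive end, mutation positions are 1-based, and all function names are hypothetical. The column layout actually required for the custom bed file is the one defined by the demo input, not by this sketch.

```python
from collections import defaultdict


def load_regions(bed_rows):
    """Group (chrom, start, end) rows by chromosome; start is 0-based, end exclusive."""
    regions = defaultdict(list)
    for chrom, start, end in bed_rows:
        regions[chrom].append((int(start), int(end)))
    return regions


def in_region(regions, chrom, pos):
    """Return True if the 1-based position falls inside any interval on chrom."""
    return any(start < pos <= end for start, end in regions.get(chrom, []))


def filter_mutations(mutations, regions):
    """Keep only the (chrom, pos) mutations located inside the custom region."""
    return [(c, p) for c, p in mutations if in_region(regions, c, p)]


regions = load_regions([("chr1", "100", "200"), ("chr2", "0", "50")])
mutations = [("chr1", 100), ("chr1", 101), ("chr1", 200), ("chr2", 50), ("chr3", 10)]
kept = filter_mutations(mutations, regions)
print(kept)  # [('chr1', 101), ('chr1', 200), ('chr2', 50)]
```

The linear scan over intervals keeps the sketch short; a large region file would normally call for sorted intervals and a binary search.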
The uploaded TSV file must match the required format as specified below. Check the official demo input here.
The header should use the following format:
There are three types of interactions: Tooltips, Transparent Bins and Download.
- Tooltips
A tooltip shows key information about the object under the mouse pointer.
- bin: A tooltip is drawn when the mouse lands on the bottom of a bin. The displayed information includes the mutation type and the trinucleotide context represented by this bin as well as the exact height of the bin.
- pie: A tooltip is drawn when the mouse moves to a certain arc of the pie chart and shows the exact percentage of the corresponding mutation type among all mutations.
- Transparent Bins
When the mouse lands on the bottom of a bin, the bottom is highlighted and all bins in front of it become transparent so that the view is not obstructed.
- Download
One SVG file is generated when the 'Download' button is clicked. Only the default Dark Theme is available at this point.
The sidebar provides diverse options to fine-tune the display, namely managing files, adding labels on bins, choosing base regions and setting the measurements of mutations.
- Manage Files: a checklist of previously uploaded files, from which you can delete or download the selected files.
- Upload: upload the batch sample BGZ file and the optional bed TSV file. Note that a duplicate file name triggers an alert and the file is given a random suffix.
- Choose: choose files uploaded previously. Note that this function is ONLY available to registered users (each account has a limited amount of storage).
- File Sets: save a batch sample file and a bed file together as a file set. Users can also apply any one of the saved file sets.
- Label Buttons: choose a symbol (any button except the first and the last) to enter labeling mode. In this mode, when you click a bin, the corresponding symbol is added to the bin as an annotation.
- Off Button: click this button to exit from labeling mode. Re-click to re-enter labeling mode with the last-used symbol.
- Remove Button: click to enter removing mode. In this mode, when you click a bin that carries a symbol, that symbol is removed.
- Interval: choose between the default region and the custom region. If no bed file is provided, the only option is the default whole genome region.
- Y axis: provides three measurements of the mutation count, namely the raw number of mutations, mutations per Mb and the percentage among all mutations.
- Filter by tumor_f: choose the comparison method and the threshold for filtering mutations.
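The last two settings can be sketched in Python as follows. This is a minimal illustration, not the Oviz-Bio implementation: the set of comparison methods, the dictionary layout of a mutation record, the region size passed in Mb, and all function names are assumptions made for the example.

```python
import operator

# Assumed comparison methods; the options offered in the sidebar may differ.
COMPARATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
}


def filter_by_tumor_f(mutations, method, threshold):
    """Keep mutations whose tumor_f satisfies `tumor_f <method> threshold`."""
    keep = COMPARATORS[method]
    return [m for m in mutations if keep(m["tumor_f"], threshold)]


def measurements(count, total, region_mb):
    """Express one bin's mutation count in the three Y-axis measurements."""
    return {
        "count": count,                        # raw number of mutations in the bin
        "per_mb": count / region_mb,           # mutations per megabase of the region
        "percentage": 100.0 * count / total,   # share among all mutations
    }


mutations = [{"tumor_f": 0.05}, {"tumor_f": 0.10}, {"tumor_f": 0.30}]
kept = filter_by_tumor_f(mutations, ">=", 0.1)
print(len(kept))  # 2

print(measurements(count=3, total=12, region_mb=1.5))
# {'count': 3, 'per_mb': 2.0, 'percentage': 25.0}
```

The three measurements differ only in scaling, so switching between them changes the bin heights on the Y axis but not their relative order.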
Manual version=1.2, written by Miss. Li Shiying and Dr. JIA Wenlong on 2019-12-27.
- Dulak, A. M., Stojanov, P., Peng, S., Lawrence, M. S., Fox, C., Stewart, C., ..., Getz, G. and Bass, A. J. (2013). Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity. Nature Genetics, 45(5), 478. (PMID: 23525077, See Figure 1A)
- Dai, J. Y., Wang, X., Buas, M. F., Zhang, C., Ma, J., Wei, B., ... & Loeb, K. R. (2018). Whole-genome sequencing of esophageal adenocarcinoma in Chinese patients reveals distinct mutational signatures and genomic alterations. Communications Biology, 1(1), 174. (PMID: 30374464, See Figure 1c)
- Vinayanuwattikun, C., Le Calvez-Kelm, F., Abedi-Ardekani, B., Zaridze, D., Mukeria, A., Voegele, C., ... & Byrnes, G. (2016). Elucidating genomic characteristics of lung cancer progression from in situ to invasive adenocarcinoma. Scientific Reports, 6, 31628. (PMID: 27545006, See Figure 1a-c)
- Li, X. C., Wang, M. Y., Yang, M., Dai, H. J., Zhang, B. F., Wang, W., ... & Zhang, W. (2018). A mutational signature associated with alcohol consumption and prognostically significantly mutated driver genes in esophageal squamous cell carcinoma. Annals of Oncology, 29(4), 938-944. (PMID: 29351612, See Figure 2A)
- Li, J., Yan, S., Liu, Z., Zhou, Y., Pan, Y., Yuan, W., ... & Cai, H. (2018). Multiregional sequencing reveals genomic alterations and clonal dynamics in primary malignant melanoma of the esophagus. Cancer Research, 78(2), 338-347. (PMID: 28972077, See Figure 2A)
- Chang, J., Tan, W., Ling, Z., Xi, R., Shao, M., Chen, M., ... & Xia, Y. (2017). Genomic analysis of oesophageal squamous-cell carcinoma identifies alcohol drinking-related mutation signature and genomic alterations. Nature Communications, 8, 15290. (PMID: 28548104, See Figure 1a)
- refine filter option notes.
- center the vertical axis label.
- show percentages in the pie-chart.
- fix 'Off' and 'Remove' buttons (Labels, Sidebar) being able to get chosen (highlighted) simultaneously.
- fix filtering with custom bed file.
- fix 'apply' button (Choose tab, Files, Sidebar) not active when choosing another file.
- fix legend 'C>T or G> A' to be 'C>T or G>A'.
- fix label not becoming transparent with the bin when the mouse moves.
- change titles 'Interval' and 'Y axis' (Settings, Sidebar) to 'Region' and 'Measurement of counts' respectively.
- change input file type from TSV to BGZ.
- add filtering by tumor frequency in Sidebar.
- add light theme.
- initial functions implemented.