Download a zip of the demo file set and check the official demo inputs.

Introduction

The 'Group Samples' visualization is an interactive, extended version of the output plot from 'GISTIC', a bioinformatics tool for identifying regions of the genome that are significantly amplified or deleted across a set of samples. Each aberration is assigned a G-score that considers the amplitude of the aberration as well as the frequency of its occurrence across samples. The G-scores are drawn as the red (amplification) and blue (deletion) lines on the plot. False Discovery Rate q-values are then calculated for the aberrant regions, and regions with q-values below a user-defined threshold (shown as the two green lines) are considered significant.

The “wide peaks”, determined using a leave-one-out algorithm to allow for errors in the boundaries in a single sample, are shown by the text tags extending from the line plot. The “wide peak” boundaries are more robust for identifying the most likely gene targets in the region. The genes found in each “wide peak” region are also listed in a gene box. To visualize data, upload three TSV files in the required format and use the sidebar options to customize the display.

Group Samples Data (TSV files)

The uploaded TSV files must match the required format as specified below.

Scores

Check the official demo input here. Users can directly use the scores output from the 'GISTIC' tool as the input file.

  • header
    The first row contains eight column headings, which must be identical to those listed below:
    • Type: Aberration type, which is specified as Amp or Del (amplification or deletion).
    • chromosome: Chromosome.
    • Start: Location of the first base pair in the aberrant region.
    • End: Location of the last base pair in the aberrant region.
    • q-value: False Discovery Rate q-values for the aberrant regions (q-values below a user-defined threshold are considered significant).
    • G-score: G-score that considers the amplitude of the aberration as well as the frequency of its occurrence across samples.
    • average amplitude: Average amplitudes among aberrant samples.
    • frequency: Frequency of aberration across the genome for both amplifications and deletions.
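
Because the headings must match exactly, it can help to check a scores file before uploading it. The following is a minimal sketch, not part of the tool itself: the function name `check_scores_header` is hypothetical, and the expected spelling and case of each heading are assumed to be exactly as written in the list above.

```python
import csv
import io

# Column headings as listed in this manual; exact spelling and case are assumed.
EXPECTED_HEADER = [
    "Type", "chromosome", "Start", "End",
    "q-value", "G-score", "average amplitude", "frequency",
]


def check_scores_header(handle):
    """Return a list of problems found in the first row of a scores TSV."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    problems = []
    if len(header) != len(EXPECTED_HEADER):
        problems.append(
            f"expected {len(EXPECTED_HEADER)} columns, found {len(header)}"
        )
    # Compare each heading position by position and report mismatches.
    for position, (found, wanted) in enumerate(zip(header, EXPECTED_HEADER), start=1):
        if found != wanted:
            problems.append(f"column {position}: expected {wanted!r}, found {found!r}")
    return problems
```

An empty result means the first row matches the eight headings. To check a file on disk, pass an open handle, for example `check_scores_header(open(path, newline=""))`.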

amp_genes.conf_90

Check the official demo input here. Users can directly use the amp_genes.conf_90 output from the 'GISTIC' tool as the input file. The amp genes file contains one column for each amplification peak identified in the GISTIC analysis. The first four rows are the cytoband, q-value, residual q-value and wide peak boundaries. The remaining rows list the genes contained in each wide peak. For peaks that contain no genes, the nearest gene is listed in brackets.

del_genes.conf_90

Check the official demo input here. Users can directly use the del_genes.conf_90 output from the 'GISTIC' tool as the input file. The del genes file contains one column for each deletion peak identified in the GISTIC analysis. The format of the del genes file is identical to that of the amp genes file.
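
The column-per-peak layout shared by the amp genes and del genes files can be read as in the sketch below. This is an illustration only, under two assumptions that are not stated in this manual: that the first column holds row labels rather than a peak, and that empty trailing columns may be present and should be skipped. The function name `parse_peak_genes` and the dictionary keys are hypothetical.

```python
import csv
import io


def parse_peak_genes(handle):
    """Parse a GISTIC-style genes file into one dict per peak (column)."""
    rows = list(csv.reader(handle, delimiter="\t"))
    if len(rows) < 4:
        raise ValueError("expected at least four header rows")
    width = max(len(row) for row in rows)
    # Pad short rows so every row can be indexed by column.
    rows = [row + [""] * (width - len(row)) for row in rows]
    peaks = []
    # Column 0 is assumed to hold row labels, so peaks start at column 1.
    for col in range(1, width):
        cytoband = rows[0][col].strip()
        if not cytoband:
            continue  # skip empty trailing columns
        # Rows after the first four list the genes of this peak.
        genes = [row[col].strip() for row in rows[4:] if row[col].strip()]
        peaks.append({
            "cytoband": cytoband,
            "q_value": float(rows[1][col]),
            "residual_q_value": float(rows[2][col]),
            "wide_peak": rows[3][col].strip(),
            "genes": genes,
        })
    return peaks
```

Gene names are kept exactly as written, so a nearest gene listed in brackets for a peak with no genes stays bracketed and can be told apart from genes inside the peak.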

Display Interactions

There are two types of interactions: External Link and Download.

  • External Link
    Each gene name listed in the gene boxes is linked to the corresponding result on the GeneCards search page.
  • Download
    One SVG file will be generated when the 'Download' button is clicked. Only the default Dark Theme is available at this point.

Sidebar Functions

The sidebar provides options to fine-tune the display, such as managing files, viewing error genes in amplification or deletion, and setting the q-value threshold.

  • Files
    • Manage Files: a checklist of previously uploaded TSV files, which can be deleted or downloaded.
    • Upload: upload the three files. Note that a duplicate file name triggers an alert, and the file is given a random suffix.
    • Choose: choose from previously uploaded files. Note that this function is ONLY available to registered users (each account has a limited amount of storage).
    • File Sets: save the three files together as a file set. Users can also apply any previously saved file set.
  • Errors
    Error genes are genes that appear in the uploaded amp_genes.conf_90 or del_genes.conf_90 file but are not found in our database.
    • Error genes in Amp: list of error genes in amplification.
    • Error genes in Del: list of error genes in deletion.
  • Setting
    • q-value: set the q-value threshold to a value between 0.1 and 0.9.
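
Two of the sidebar behaviours described above, the error gene lists and the q-value threshold, can be sketched as follows. This is a hedged illustration, not the tool's implementation: the function names, the `known_genes` set standing in for the database, and the dictionary key "q-value" are all hypothetical. It also assumes the q-value column holds raw q-values; if a scores file stores -log10(q-value) instead, the values would need converting first.

```python
def find_error_genes(peak_genes, known_genes):
    """Return genes from a peak file that are absent from the gene database.

    `known_genes` is a hypothetical stand-in for the tool's database.
    Order of first appearance is preserved and duplicates are dropped.
    """
    seen = set()
    errors = []
    for gene in peak_genes:
        if gene not in known_genes and gene not in seen:
            seen.add(gene)
            errors.append(gene)
    return errors


def significant_regions(regions, q_threshold):
    """Keep regions whose q-value is strictly below the threshold.

    The 0.1 to 0.9 range mirrors the sidebar setting described above.
    """
    if not 0.1 <= q_threshold <= 0.9:
        raise ValueError("q-value threshold must be between 0.1 and 0.9")
    return [region for region in regions if region["q-value"] < q_threshold]
```

For example, `find_error_genes(genes_from_amp_file, known_genes)` would give the list shown under "Error genes in Amp", and the same call on the del genes gives the list for "Error genes in Del".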

Manual version=1.0, written by Miss. Li Shiying on 2019-12-19.

  1. MERMEL, C. H., SCHUMACHER, S. E., HILL, B., MEYERSON, M. L., BEROUKHIM, R. and GETZ, G. (2011). GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biology, 12(4): R41. (PMID: 21527027, See Figure 3)
  2. Cancer Genome Atlas Research Network, et al. (2017). Integrated genomic and molecular characterization of cervical cancer. Nature, 543(7645), 378. (PMID: 28112728, See Extended Data Figure 3b-c)
  3. Chang, J., Tan, W., Ling, Z., Xi, R., Shao, M., Chen, M., ... & Xia, Y. (2017). Genomic analysis of oesophageal squamous-cell carcinoma identifies alcohol drinking-related mutation signature and genomic alterations. Nature Communications, 8, 15290. (PMID: 28548104, See Figure 3b)
  4. Witkiewicz, A. K., McMillan, E. A., Balaji, U., Baek, G., Lin, W. C., Mansour, J., ... & Choti, M. A. (2015). Whole-exome sequencing of pancreatic cancer defines genetic diversity and therapeutic targets. Nature Communications, 6, 6744. (PMID: 25855536, See Figure 2d)
  5. Zhu, B., Chen, S., Wang, H., Yin, C., Han, C., Peng, C., ... & Lian, C. G. (2018). The protective role of DOT1L in UV-induced melanomagenesis. Nature Communications, 9(1), 259. (PMID: 29343685, See Figure 1a)
  6. Liu, X. S., Genet, M. D., Haines, J. E., Mehanna, E. K., Wu, S., Chen, H. I. H., ... & Fisher, D. E. (2015). ZBTB7A suppresses melanoma metastasis by transcriptionally repressing MCAM. Molecular Cancer Research, 13(8), 1206-1217. (PMID: 25995384, See Figure 1A)
  7. Zhou, R., Shi, C., Tao, W., Li, J., Wu, J., Han, Y., ... & Wang, L. (2019). Analysis of mucosal melanoma whole-genome landscapes reveals clinically relevant genomic aberrations. Clinical Cancer Research, 25(12), 3548-3560. (PMID: 30782616, See Figure 4B)

Version

v1.0.2 (2020-02-21)

Developer

Mr. LI Hechen (GitHub)
Miss. LI Shiying (GitHub)

Designer

Dr. JIA Wenlong (Scholar, ORCID, GitHub)

Updates

v1.0.2

  • fix overlapped gene blocks.
  • fix chr-name display in Light Theme.
  • options to reset axis value range.

v1.0.1

  • refine axis labels.
  • add tooltip and marks on contracted gene block.
  • mark gene name not matched in database.
  • highlight gene block and cytoband name.
  • option to show single chromosome.
  • option to exchange G-score and Q-value plot.
  • refine genecards database link.
  • add horizontal scroll bar.
  • reset thresholds of G-score and Q-value.
  • optimize arrangements of gene blocks.

v1.0.0

  • initial functions implemented.