Thursday, August 3 • 12:20 - 12:40
Concentrate – graphical tool for feature scale genomic data analysis


Numerous tools for genomic data visualization exist today, both web-based (UCSC Genome Browser, Ensembl Genome Browser) and standalone (Integrative Genomics Viewer, Artemis, Tablet, etc.). All of them share the same principle: genomic data is displayed in the linear coordinates of a reference genome, and the user navigates with pan and zoom controls. This classical approach is the most straightforward and works well for low-level sequencing data (e.g. BAM files), but it cannot readily support the scenarios that arise when working with genome annotation data and interpreting sequencing results. Genomic data is sparse: the elements of interest to a researcher (e.g. potentially disease-causing variants) may be separated by thousands or millions of bases, or located on different contigs. Often the exact location of an annotation feature matters less than the presence and number of such features (as in the search for monogenic recessive disorders in sequencing data, where the presence of two pathogenic variants indicates potential disease) or the relations between features, such as intersections of genetic variants with protein functional sites. Such properties are hard to discover with pan and zoom and difficult to visualize on a linear scale. We propose a different approach to genomic data visualization that uses the element interaction event as the unit of graphical scale. An element that does not interact (overlap, cover, etc.) with any other element has a visual size of one unit, and every interaction site adds one more unit to its size. To preserve information about the physical size of the displayed objects, a physical size scale is shown that stretches and shrinks according to the visual sizes of the elements. This makes efficient use of screen space in terms of object density and makes interaction events between elements straightforward to detect. We implement this principle in Concentrate, an open-source application for visualizing genomic data.
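The sizing rule described above (one unit per element, plus one unit per interaction) can be illustrated with a minimal Python sketch. It is not Concentrate's actual code: the `Feature` class, the `interacts` and `visual_sizes` helpers, the half-open coordinate convention, the choice to treat plain interval overlap on the same contig as the only kind of interaction, and the sample coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical annotation feature on a half-open interval [start, end)."""
    name: str
    contig: str
    start: int
    end: int


def interacts(a: Feature, b: Feature) -> bool:
    """Assumed interaction test: same contig and overlapping intervals."""
    return a.contig == b.contig and a.start < b.end and b.start < a.end


def visual_sizes(features: list[Feature]) -> dict[str, int]:
    """Each feature gets 1 unit plus 1 unit per other feature it interacts with."""
    sizes = {}
    for f in features:
        interactions = sum(1 for g in features if g is not f and interacts(f, g))
        sizes[f.name] = 1 + interactions
    return sizes


# Made-up coordinates, chosen only to show the rule.
features = [
    Feature("exon_1", "chr7", 100, 200),
    Feature("variant_a", "chr7", 150, 151),
    Feature("variant_b", "chr7", 180, 181),
    Feature("variant_c", "chr7", 5_000_000, 5_000_001),
]
sizes = visual_sizes(features)
for name, size in sizes.items():
    print(name, size)
```

In this toy input the exon overlaps two variants and is drawn three units wide, while the isolated variant five million bases away still takes only one unit, so both fit on one screen.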
In addition to interaction-based scaling, Concentrate provides intra- and inter-track filtering with data-type-based attribute discovery, and rich element interaction rules based on logical operators. For example, it can restrict the displayed genetic variants to those that have a frequency below 0.05 or are annotated as pathogenic and intersect exons of the CFTR gene, where the gene data comes from a BED file and the variant data from an unrelated VCF file. Such capabilities enable deep data analysis within the visualization software, without external filtering and region manipulation tools such as vcftools or bedtools, which is currently not possible in other genome browsers. Concentrate is written in Java and JavaScript and distributed as a single JAR file that runs on any machine with a modern web browser and Java installed. It is based on a client-server architecture and can be run either locally or as a service. The source code is licensed under the GNU Affero GPLv3 and is available on GitHub.
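The CFTR filter from the example above can be written out as a short Python sketch over in-memory records. It does not reproduce Concentrate's filter syntax or API, which the abstract does not show. The `Exon` and `Variant` classes, the `select` function, the attribute names, and all coordinates are hypothetical, and the grouping "(rare OR pathogenic) AND in a CFTR exon" is one assumed reading of the sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    """Hypothetical exon record, as might be loaded from a BED file."""
    gene: str
    contig: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record, as might be loaded from a VCF file."""
    contig: str
    pos: int  # 0-based position
    frequency: float
    significance: str


def in_any_exon(variant: Variant, exons: list[Exon]) -> bool:
    """True if the variant position falls inside at least one of the exons."""
    return any(
        e.contig == variant.contig and e.start <= variant.pos < e.end
        for e in exons
    )


def select(variants, exons, gene="CFTR", max_freq=0.05):
    """Assumed grouping: (rare OR pathogenic) AND inside an exon of `gene`."""
    gene_exons = [e for e in exons if e.gene == gene]
    return [
        v for v in variants
        if (v.frequency < max_freq or v.significance == "pathogenic")
        and in_any_exon(v, gene_exons)
    ]


# Made-up records; the coordinates are not real CFTR positions.
exons = [
    Exon("CFTR", "chr7", 1000, 1100),
    Exon("CFTR", "chr7", 2000, 2150),
    Exon("OTHER", "chr7", 3000, 3100),
]
variants = [
    Variant("chr7", 1050, 0.01, "benign"),      # rare, in a CFTR exon
    Variant("chr7", 1075, 0.30, "pathogenic"),  # pathogenic, in a CFTR exon
    Variant("chr7", 1090, 0.30, "benign"),      # common and benign
    Variant("chr7", 1500, 0.01, "pathogenic"),  # outside every exon
    Variant("chr7", 3050, 0.01, "pathogenic"),  # exon of another gene
]
selected = select(variants, exons)
print([v.pos for v in selected])
```

The two inputs stay independent until the final predicate, mirroring the abstract's point that the exon and variant data come from unrelated files.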

Speakers

Anton Bragin

Head of Bioinformatics Department, Parseq Lab



Thursday August 3, 2017 12:20 - 12:40
Graduate School of Management Building, room 309 Volkhovskiy Pereulok, 3, St. Petersburg, Russia
