Posts

Showing posts from December, 2017

Gene interactions - Genes don't work in isolation, and your gene lists shouldn't either

Genes don't work in isolation, and your gene lists shouldn't either. Researchers often have a list of known genes to look for when prioritizing variants based on the patient's clinical diagnosis. But what should you do if no candidate variants can be found based solely on your gene list? There are many approaches, but one option is to expand the gene list based on known gene interactions and pathways. We rely on BioGRID, a highly curated database, to expand the gene list to include the network of genes known to interact either directly or through protein-protein interactions. To use this new feature, tick the new 'Gene interactions' checkbox to expand your gene-based search in this way.
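As a rough illustration of the idea, the sketch below expands a gene list by one hop through a set of interaction pairs. It is not the pipeline's implementation: the function name, the one-hop limit and the placeholder gene names are all assumptions made for the example, and real pairs would come from a BioGRID download rather than a hard-coded list.

```python
def expand_gene_list(genes, interactions):
    """Return the input genes plus their direct interaction partners (one hop)."""
    # Build an undirected lookup: gene -> set of partner genes.
    partners = {}
    for a, b in interactions:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)

    expanded = set(genes)
    for gene in genes:
        expanded |= partners.get(gene, set())
    return sorted(expanded)


# Hypothetical interaction pairs, for illustration only.
interactions = [
    ("GENE_A", "GENE_B"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
]

print(expand_gene_list(["GENE_A"], interactions))
```

Here GENE_C is not added, because it only interacts with GENE_B and the sketch stops after one hop. Going further out would grow the list quickly, so the depth is a design choice.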

Search profiles

Users can now create their own search profiles to store commonly used search filters, rather than choosing the same options again for every search. One example is to include your gene lists in a search profile. Search profiles are associated with the user only and are not shared.

BAI - BAM index files downloadable

The BAM index files, known as BAI files, are now available for download along with the BAM file. This is particularly useful when using IGV on your desktop.

New search filters: gnomAD frequency and INDEL ExAC frequencies

The bioinformatics pipeline has been updated to include gnomAD frequencies and INDEL ExAC frequencies. Any new data generated from Dec 2017 onwards will have these new fields. However, previously analyzed datasets will not have them and will have to be reanalyzed if you want these fields populated. To go along with these new fields, we've added a new gnomAD frequency filter to our search page.
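Because older datasets lack the new fields, any frequency filter has to decide what to do with variants that have no gnomAD value. The sketch below makes that choice explicit. The field name `gnomad_freq`, the function name and the `keep_missing` flag are hypothetical, and how the actual search page treats missing values is not stated here.

```python
def passes_frequency_filter(variant, max_freq, keep_missing=True):
    """Keep variants at or below max_freq; missing values follow keep_missing."""
    freq = variant.get("gnomad_freq")  # hypothetical field name
    if freq is None:
        # Datasets analyzed before the pipeline update have no gnomAD value.
        return keep_missing
    return freq <= max_freq


# Made-up records, for illustration only.
variants = [
    {"id": "a", "gnomad_freq": 0.001},
    {"id": "b", "gnomad_freq": 0.2},
    {"id": "c"},  # no gnomAD annotation
]

rare = [v["id"] for v in variants if passes_frequency_filter(v, 0.01)]
print(rare)
```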

Exon coverage search

The sequencing and alignment process isn't perfect, and there are often regions of poor coverage as a result of the pipeline analysis. Previously we made the coverage reports available for download as part of our datasets as 'exonReports'. We've taken it a step further by allowing users to search through these coverage reports by gene, patient ID and coverage type (NO_COVERAGE, POOR_COVERAGE, PARTIAL_COVERAGE). To use this new feature, choose 'Search exon coverage' in the menus. Furthermore, we added a new tab that displays exon coverage alongside the variant search results. The tab will only have results if users search by Patient ID and Gene. This way users can browse variants and the coverage results side by side, providing a broader view of the quality of the variants being presented. In particular, this will be useful for difficult-to-diagnose patients for whom no causal variants have been identified, where potentially disease-causing variants m

CACPIC Frequencies exportable

Chinese frequencies calculated from our healthy Chinese controls are now exportable to Excel as an optional column. We've made it optional because these frequencies are calculated at runtime during the export process and can delay completion of the export. If you are not interested in the CACPIC frequencies, leave the checkbox unticked.

Supplementary information available for download

As part of our datasets for download, we've added some supplementary information (generated as TXT files by the pipeline) to go along with the VCF and BAM files. The files contain information such as the cutoffs used to qualify variants as a PASS, including read depth cutoffs, median quality score cutoffs and so on. The files also contain general statistics: the total number of variants passed, the proportion of variants that are exonic or at splice sites, the number of distinct genes, and the averages of read depth and median quality. Another file, called 'readReport.summary', includes information about how many reads were paired, mispaired, aligned, and unaligned. The supplementary files can be found under the 'Datasets' section.

Measuring Variant Conservation with GERP Score and SiPhy

The more annotations, the better! We've recently added two new annotations that measure variant conservation to assist with variant prioritisation: GERP scores and SiPhy. GERP stands for Genomic Evolutionary Rate Profiling. Conceptually, GERP is a method for identifying slowly evolving regions in a multiple sequence alignment, defined as ‘constrained elements’. "Constrained elements are identified by comparing the observed to the expected rates of evolution for each window, and defining all those regions whose collective observed rates of evolution are significantly lower than would be expected under a null model." More simply, it is a score that measures the conservation of each nucleotide in a multi-species alignment, ranging from -12.3 to 6.17, with 6.17 being the most conserved. Positive scores (fewer substitutions observed than expected) indicate that a site is under evolutionary constraint. Negative scores may be weak evidence of accelerated rates of

Allele Frequency filtering

The pipeline does the best it can to assign each variant a zygosity based on the allele frequencies and counts. Usually the cutoff is around 90%. However, below this threshold, the zygosity becomes less clear. Hence we now allow users to filter by allele frequency, which is particularly useful in cases where zygosity is unclear. Users can filter on the VARIABLE allele frequency as well as the REFERENCE allele frequency.
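To make the filter concrete, here is a minimal sketch that derives a per-sample allele frequency from read counts and keeps the variants falling inside a chosen range. The field names `ref_count` and `alt_count`, the function names and the example counts are assumptions for illustration, not the pipeline's actual schema.

```python
def allele_frequency(ref_count, alt_count):
    """Fraction of reads supporting the variant allele at a site."""
    total = ref_count + alt_count
    return alt_count / total if total else 0.0


def filter_by_allele_frequency(variants, min_freq=0.0, max_freq=1.0):
    """Keep variants whose variant allele frequency lies within [min_freq, max_freq]."""
    return [
        v for v in variants
        if min_freq <= allele_frequency(v["ref_count"], v["alt_count"]) <= max_freq
    ]


# Made-up read counts, for illustration only.
variants = [
    {"id": "v1", "ref_count": 2, "alt_count": 18},   # 0.9
    {"id": "v2", "ref_count": 10, "alt_count": 10},  # 0.5
    {"id": "v3", "ref_count": 14, "alt_count": 6},   # 0.3
]

kept = filter_by_allele_frequency(variants, min_freq=0.4, max_freq=0.8)
print([v["id"] for v in kept])
```

The reference allele frequency is the complement of this value when only two alleles are counted, so the same approach works for filtering on either one.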

gnomAD frequencies based on ethnicity

Previously the gnomAD frequency shown reflected only the European frequency. We now display the gnomAD frequencies for all ethnicities under the 'Latest annotations' tab.

Gene synonym searching

Previously, searching by gene required an exact match on the gene name provided, without considering how gene names evolve over time. Genes are often given synonyms, and old names are sometimes replaced with new ones. With this change, users have the option to expand gene name searching to include synonyms. The expanded list of synonyms is shown on the results page. Our testing has shown that this can greatly affect the results returned.
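A minimal sketch of synonym expansion, assuming synonyms are available as groups of equivalent names. The function name, the case-insensitive matching and the placeholder gene names are assumptions for the example and do not describe how the search is actually implemented.

```python
def expand_with_synonyms(query, synonym_groups):
    """Return the query name plus every name sharing a synonym group with it."""
    q = query.upper()
    names = {q}
    for group in synonym_groups:
        upper_group = {name.upper() for name in group}
        if q in upper_group:
            names |= upper_group
    return sorted(names)


# Hypothetical synonym groups, for illustration only.
synonym_groups = [
    {"NEWNAME1", "OLDNAME1", "ALIAS1"},
    {"NEWNAME2", "OLDNAME2"},
]

print(expand_with_synonyms("oldname1", synonym_groups))
```

Searching on the whole returned list, rather than the single name typed in, is what lets variants annotated under an older or alternative gene name show up.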