Using trusted reference data sources to improve diagnostic yield and interpret genomic data faster

8/4/2020

Ever since the Human Genome Project was completed 20 years ago, it has widely been believed that the integration of genomic data into healthcare could revolutionize medicine, supporting personalized disease diagnosis and treatment to transform lives.

We are on the cusp of that reality, where insights from genetic data can be used to support better health outcomes. However, identifying the causative variant of a rare genetic disease is often like searching for a needle in a haystack, with Next Generation Sequencing (NGS) yielding up to five million variants per patient.

Using appropriate reference data sources helps to improve diagnostic yield and can support faster interpretation of genomic data. To understand the importance and impact of genetic variants, curated datasets help to identify, compare and cross-reference genotypes of different genetic diseases in clinical cases.

To ensure variants can be interpreted against the most up-to-date research, Congenica has partnered with key database providers to give its users access to the latest genotype and phenotype data. These include Mastermind® by Genomenon, DECIPHER, ClinVar, OMIM and many more.

Mastermind® by Genomenon is a Genomic Search Engine which identifies and indexes the world’s most comprehensive collection of genomic evidence and makes it searchable. Users can search all indexed scientific literature by disease, gene, variant, keyword, phenotype, therapy and more. Mastermind also prioritizes the data – sorting it to bring the most relevant articles to the top of your results. Through our partnership with Genomenon, Congenica users have access to the full Mastermind database.[1]


Mastermind has over seven and a half million high-yield full-text genomic articles, prioritized from the more than 30 million articles available in the literature. Approximately 10,000 new articles are added per week, along with hundreds of thousands of supplemental data sets per quarter. Mastermind is the only tool keeping pace with the cadence of new articles being published globally. This coverage is designed to give clinicians and scientists the most up-to-date information and to greatly reduce the time it takes to find relevant results and supporting evidence.

DECIPHER is a database of genomic variant data gathered from analyses of patient DNA focused on chromosomal abnormalities with submicroscopic pathogenic deletions and duplications, as well as sequence variants. Data on DECIPHER includes over 37,000 curated SNVs (single nucleotide variants) and CNVs (copy number variants) and is shared and updated by a global community. Congenica fully integrates DECIPHER SNV and CNV curated data as Curated Variant Lists.[2]

The integration of DECIPHER data into Congenica as a Curated Variant List expands the set of disease-related and evidence-based resources available in the platform. This reduces the chances of overlooking potential disease-relevant variants, providing greater opportunity for quicker decision-making during analysis.


Additional reference data to aid genomic interpretation

Congenica provides access to the widest range of genes, variants and reference data sources available, all in one place.[2] These include over 35 Reference Datasets, Automatic ACMG Score-based pathogenicity calculations, and publication search capabilities to automatically identify supporting evidence in the scientific literature. Each of these databases provides its own benefits and offers unique information to help users streamline their analysis.


| Reference data type | | | |
|---|---|---|---|
| Biological annotation | | Exclude intergenic variants | |
| | | Determine the most detrimental effect per variant | Ensembl, RefSeq |
| | Protein domain-centric | Variant mapping to protein structure & functional domain | Pfam, UniProt |
| | | Identify evolutionarily constrained nucleotides | GERP, Species conservation track in genome browser |
| | Splicing motifs | Variant mapping to known splice junctions per transcript | MAST, Ensembl |
| | | Pinpoint genes intolerant to loss-of-function variation | |
| | | Explore the sequence uniqueness of genomic regions | UCSC ENCODE Mappability Track |
| Frequency annotation | Population data | Distinguish rare candidate variants from benign variation | |
| Disease-related annotation | Gene-disease linkage | Review per-gene disease and phenotype information | |
| | Virtual gene panels | Narrow search to variants in clinically relevant genes | |
| Evidence-based annotation | Variant repositories | Identify previously reported known variants | |
| | Literature searches | Identify variants previously cited in the literature | Mastermind CVR |
| Phenotype annotation | Phenotype ontology | Standardize vocabulary to describe patient phenotype and prioritize variants using Exomiser | |

*List not exhaustive
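
To make the annotation categories above more concrete, the following Python sketch shows how reference data of this kind might be combined to narrow a variant list. It is illustrative only and is not Congenica's implementation: the `Variant` fields, the `shortlist` helper, the 0.01 frequency threshold and the gene names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    """Hypothetical, pre-annotated variant record (not a Congenica data model)."""
    name: str
    gene: Optional[str]    # None for intergenic variants (biological annotation)
    population_af: float   # allele frequency from population data (frequency annotation)


def shortlist(variants, gene_panel, max_af=0.01):
    """Keep genic, rare variants that fall within a virtual gene panel.

    max_af is an assumed rarity threshold chosen only for illustration.
    """
    return [
        v for v in variants
        if v.gene is not None            # exclude intergenic variants
        and v.population_af <= max_af    # keep rare candidates, drop common variation
        and v.gene in gene_panel         # narrow to clinically relevant genes
    ]


variants = [
    Variant("var1", None, 0.0001),       # intergenic: excluded
    Variant("var2", "GENE_A", 0.25),     # common in the population: excluded
    Variant("var3", "GENE_A", 0.0002),   # rare and in the panel: kept
    Variant("var4", "GENE_B", 0.0001),   # rare but outside the panel: excluded
]

candidates = shortlist(variants, gene_panel={"GENE_A"})
print([v.name for v in candidates])  # prints ['var3']
```

Each condition in the filter corresponds to one row of the table, which is why having the underlying reference data in one place simplifies this kind of narrowing.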

The inclusion of such a wide range of rich data sets in the Congenica platform enables healthcare professionals around the world to analyze and interpret more cases with greater efficiency, increased diagnostic yield and greater confidence in their decisions. With these datasets integrated in one place, you no longer need to search multiple resources for variant information, which saves time during genomic data interpretation.





[1] Congenica Partners with Genomenon to Integrate Mastermind into Congenica Clinical Platform. https://www.genomenon.com/congenica-partnership-genomenon-integrate-mastermind/

[2] Congenica Integrates Powerful New Curated Data. https://www.genomenon.com/congenica-integrates-powerful-new-curated-data/