GPF Genotype Browser Columns

Preview columns

Column Field Description
family familyId Family ID
study Study name
variant location The position of the variant in a 1-based coordinate system of the hg19 reference assembly.
variant Description of the variant: sub(R->A) stands for substitution of the reference allele R to an alternative allele A; ins(seq) stands for insertion of the provided sequence (“seq”); and del(N) stands for deletion of N nucleotides.
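
The variant notation described above can be parsed mechanically. The following is a minimal sketch, not part of GPF itself: the function name parse_variant and the returned dictionary keys are hypothetical, and it assumes alleles and inserted sequences are written with the letters A, C, G, T and N only.

```python
import re

# Patterns for the three variant descriptions named in the table.
# Assumption: alleles and inserted sequences use only A, C, G, T, N.
_SUB = re.compile(r"^sub\(([ACGTN]+)->([ACGTN]+)\)$")
_INS = re.compile(r"^ins\(([ACGTN]+)\)$")
_DEL = re.compile(r"^del\((\d+)\)$")


def parse_variant(text):
    """Return a dict describing a variant string such as 'sub(C->T)'."""
    text = text.strip()
    m = _SUB.match(text)
    if m:
        # Substitution of reference allele R by alternative allele A.
        return {"type": "sub", "ref": m.group(1), "alt": m.group(2)}
    m = _INS.match(text)
    if m:
        # Insertion of the given sequence.
        return {"type": "ins", "seq": m.group(1)}
    m = _DEL.match(text)
    if m:
        # Deletion of N nucleotides.
        return {"type": "del", "length": int(m.group(1))}
    raise ValueError(f"unrecognized variant description: {text!r}")


if __name__ == "__main__":
    for example in ("sub(C->T)", "ins(AGG)", "del(4)"):
        print(example, parse_variant(example))
```

Values that match none of the three forms raise ValueError, so unexpected notations surface early instead of being silently skipped.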

Download columns

Field Description
familyId Family ID
study Study name
phenotype Study phenotype
location The position of the variant in a 1-based coordinate system of the hg19 reference assembly.
variant Description of the variant: sub(R->A) stands for substitution of the reference allele R to an alternative allele A; ins(seq) stands for insertion of the provided sequence (“seq”); and del(N) stands for deletion of N nucleotides.
family genotype The best state according to the Multinomial Model (Experimental Procedures). The format of the column is “momR dadR autR sibR/momA dadA autA sibA” where (for example) momR stands for the number of copies of the reference allele in the mother’s genotype and autA stands for the number of copies of the alternative allele in the genotype of the affected child.
family structure  
from parent Shows the parental haplotypes giving rise to de novo variants when they could be identified.
in child Shows the affected status and gender of the child in which the de novo variant was observed. The two children are listed when the de novo variant is shared by both.
count The observed number of reads supporting the different alleles at a given location. The format is <reference allele counts>/<alternative allele counts>/<other allele counts> and the order of individuals is <mom> <dad> <proband> and <sibling>. For example, “10 12 5 20/1 0 8 0/0 0 0 1” indicates that there were 10 reads supporting the reference allele in the mother, there were 8 reads supporting the alternative allele in the proband, and there was 1 read with a non-reference allele in the unaffected sibling.
alt alleles  
parents called Count of independent parents tested for this variant
worst effect type The most severe effect the variant has on genes.
genes The list of genes affected by the variant and the most severe effect for every gene. The format is <gene 1>:<effect on gene 1>|<gene 2>:<effect on gene 2>|.
all effects  
effect details Details of variant effects on each affected isoform. The format is: <isoform 1 of gene 1>; <isoform 2 of gene 1>|<isoform 1 of gene 2>; <isoform 2 of gene 2>|… The amino acid change and the position of the amino acid within the protein are shown.
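
The "family genotype", "count" and "genes" columns described above are compact strings, and a short parser may make their layout easier to see. This is a sketch under stated assumptions rather than GPF code: the function names, the member labels and the allele labels ("ref", "alt", "other") are hypothetical, a family of four listed in the order mom, dad, proband, sibling is assumed, and the gene names and effect labels in the usage lines are illustrative values only.

```python
def parse_per_allele_table(text, members=("mom", "dad", "proband", "sibling")):
    """Parse 'r r r r/a a a a[/o o o o]' into {allele: {member: int}}.

    Works for both the 'family genotype' column (two groups: reference
    and alternative allele copies) and the 'count' column (three groups:
    reference, alternative and other allele read counts).
    """
    allele_names = ("ref", "alt", "other")
    result = {}
    for name, group in zip(allele_names, text.strip().split("/")):
        values = [int(v) for v in group.split()]
        result[name] = dict(zip(members, values))
    return result


def parse_genes(text):
    """Parse '<gene>:<effect>|<gene>:<effect>|' into (gene, effect) pairs."""
    pairs = []
    for item in text.strip().split("|"):
        if item:  # tolerate a trailing '|' separator
            gene, effect = item.split(":", 1)
            pairs.append((gene, effect))
    return pairs


if __name__ == "__main__":
    # The worked example from the 'count' description.
    counts = parse_per_allele_table("10 12 5 20/1 0 8 0/0 0 0 1")
    print(counts["ref"]["mom"], counts["alt"]["proband"], counts["other"]["sibling"])

    # Illustrative gene names and effect labels, not taken from real data.
    print(parse_genes("GENE1:missense|GENE2:synonymous|"))
```

With the example string from the "count" description, the parsed values reproduce the reading given in the text: 10 reference reads in the mother, 8 alternative reads in the proband, and 1 other-allele read in the sibling.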

Gene Weights

LGD rank

RVIS rank

pLI rank

Genomic Scores

Field Description
phyloP100

Link: http://hgdownload.cse.ucsc.edu/goldenpath/hg19/phyloP100way/. Conservation scoring by phyloP (phylogenetic p-values) from the PHAST package (http://compgen.bscb.cornell.edu/phast/) for multiple alignments of 99 vertebrate genomes to the human genome.

phyloP100
phyloP46_vertebrates

Link: http://hgdownload.cse.ucsc.edu/goldenpath/hg19/phyloP46way/. Conservation scoring by phyloP (phylogenetic p-values) from the PHAST package (http://compgen.bscb.cornell.edu/phast/) for multiple alignments of 45 vertebrate genomes to the human genome, plus alternate sets of scores for the primate species and the placental mammal species in the alignments.

phyloP46_vertebrates
phyloP46_placentals

Alternate set of phyloP46_vertebrates scores for the placental mammal subset of species in the alignments.

phyloP46_placentals
phyloP46_primates

Alternate set of phyloP46_vertebrates scores for the primates subset of species in the alignments.

phyloP46_primates
phastCons100

Link: http://hgdownload.cse.ucsc.edu/goldenpath/hg19/phastCons100way/. Compressed phastCons scores for multiple alignments of 99 vertebrate genomes to the human genome. PhastCons is a program for identifying evolutionarily conserved elements in a multiple alignment, given a phylogenetic tree.

phastCons100
phastCons46_vertebrates

Link: http://hgdownload.cse.ucsc.edu/goldenpath/hg19/phastCons46way/. Compressed phastCons scores for multiple alignments of 45 vertebrate genomes to the human genome, plus an alternate set of scores for the primates subset of species in the alignments, and an alternate set of scores for the placental mammal subset of species in the alignments. PhastCons is a program for identifying evolutionarily conserved elements in a multiple alignment, given a phylogenetic tree.

phastCons46_vertebrates
phastCons46_placentals

Alternate set of phastCons46_vertebrates scores for the placental mammal subset of species in the alignments.

phastCons46_placentals
phastCons46_primates

Alternate set of phastCons46_vertebrates scores for the primates subset of species in the alignments.

phastCons46_primates
CADD_raw

Link: https://cadd.gs.washington.edu/download ; Raw scores have relative meaning only: a higher raw score indicates that a variant is more likely to be simulated (or “not observed”) and therefore more likely to have deleterious effects. Scaled scores are PHRED-like (-10*log10(rank/total)) C-scores that rank a variant relative to all possible substitutions of the human genome (8.6x10^9).

CADD raw
CADD_phred

Link: https://cadd.gs.washington.edu/download ; Raw scores have relative meaning only: a higher raw score indicates that a variant is more likely to be simulated (or “not observed”) and therefore more likely to have deleterious effects. Scaled scores are PHRED-like (-10*log10(rank/total)) C-scores that rank a variant relative to all possible substitutions of the human genome (8.6x10^9).

CADD phred
Linsight

Linsight scores for prediction of deleterious noncoding variants

Linsight
FitCons i6 merged

Link: http://compgen.cshl.edu/fitCons/0downloads/tracks/i6/scores/. Indicates the fraction of genomic positions evincing a particular pattern (or “fingerprint”) of functional assay results, that are under selective pressure. Score ranges from 0.0 to 1.0. A lower score indicates higher confidence.

FitCons-i6-merged
Brain Angular Gyrus

FitCons2 Scores for E067-Brain Angular Gyrus score-Roadmap Epigenomics DHS regions

FitCons2 E067-Brain Angular Gyrus
Brain Anterior Caudate

Scores for E068-Brain Anterior Caudate score-Roadmap Epigenomics DHS regions

FitCons2 E068-Brain Anterior Caudate
Brain Cingulate Gyrus

Scores for E069-Brain Cingulate Gyrus score-Roadmap Epigenomics DHS regions

FitCons2 E069-Brain Cingulate Gyrus
Brain Germinal Matrix

Scores for E070-Brain Germinal Matrix score-Roadmap Epigenomics DHS regions

FitCons2 E070-Brain Germinal Matrix
Brain Hippocampus Middle

Scores for E071-Brain Hippocampus Middle score-Roadmap Epigenomics DHS regions

FitCons2 E071-Brain Hippocampus Middle
Brain Inferior Temporal Lobe

Scores for E072-Brain Inferior Temporal Lobe score-Roadmap Epigenomics DHS regions

FitCons2 E072-Brain Inferior Temporal Lobe
Brain Dorsolateral Prefrontal Cortex

Scores for E073-Brain Dorsolateral Prefrontal Cortex score-Roadmap Epigenomics DHS regions

FitCons2 E073-Brain Dorsolateral Prefrontal Cortex
Brain Substantia Nigra

Scores for E074-Brain Substantia Nigra score-Roadmap Epigenomics DHS regions

FitCons2 E074-Brain Substantia Nigra
Fetal Brain Male

Scores for E081-Fetal Brain Male score-Roadmap Epigenomics DHS regions

FitCons2 E081-Fetal Brain Male
Fetal Brain Female

Scores for E082-Fetal Brain Female score-Roadmap Epigenomics DHS regions

FitCons2 E082-Fetal Brain Female
SSC Frequency

SSC Frequency

SSC Frequency
genome gnomAD AC Allele counts for the genome-only subset of gnomAD v2.1.
genome gnomAD AN Allele numbers for the genome-only subset of gnomAD v2.1.
genome gnomAD AF

Allele frequencies for the genome-only subset of gnomAD v2.1. gnomAD v2.1 comprises a total of 16 million SNVs and 1.2 million indels from 125,748 exomes, and 229 million SNVs and 33 million indels from 15,708 genomes. (Cited from https://macarthurlab.org/2018/10/17/gnomad-v2-1/)

“The raw counts (ac and an) refer to the total number of chromosomes with this allele, and total that were able to be called (whether reference or alternate), respectively. Thus, the allele frequency is ac/an.” (Cited from https://macarthurlab.org/2016/03/17/reproduce-all-the-figures-a-users-guide-to-exac-part-2/)

“Deleterious variants are expected to have lower allele frequencies than neutral ones, due to negative selection.” (Cited from the ExAC paper, p.10, ‘Inferring variant deleteriousness and gene constraint’)

A total of 15,708 genomes. (Cited from https://gnomad.broadinstitute.org/faq)

genome gnomAD allele frequency
genome gnomAD AF percent

Allele frequencies for the genome-only subset of gnomAD v2.1, as a percentage (i.e. multiplied by 100.0).

genome gnomAD allele frequency percent
genome gnomAD controls AC Controls-only allele counts for the genome-only subset of gnomAD v2.1. (Only samples from individuals who were not selected as a case in a case/control study of common disease.)
genome gnomAD controls AN Controls-only allele numbers for the genome-only subset of gnomAD v2.1. (Only samples from individuals who were not selected as a case in a case/control study of common disease.)
genome gnomAD controls AF

Controls-only allele frequencies for the genome-only subset of gnomAD v2.1. (Only samples from individuals who were not selected as a case in a case/control study of common disease.)

controls genome gnomAD allele frequency
genome gnomAD controls AF percent

Controls-only allele frequencies for the genome-only subset of gnomAD v2.1, as a percentage (i.e. multiplied by 100.0). (Only samples from individuals who were not selected as a case in a case/control study of common disease.)

controls genome gnomAD allele frequency percent
genome gnomAD non-neuro AC Non-neuro allele counts for the genome-only subset of gnomAD v2.1. (Only samples from individuals who were not ascertained for having a neurological condition in a neurological case/control study)
genome gnomAD non-neuro AN Non-neuro allele numbers for the genome-only subset of gnomAD v2.1. (Only samples from individuals who were not ascertained for having a neurological condition in a neurological case/control study)
genome gnomAD non-neuro AF

Non-neuro allele frequencies for the genome-only subset of gnomAD v2.1. (Only samples from individuals who were not ascertained for having a neurological condition in a neurological case/control study)

non-neuro genome gnomAD allele frequency
genome gnomAD non-neuro AF percent

Non-neuro allele frequencies for the genome-only subset of gnomAD v2.1, as a percentage (i.e. multiplied by 100.0). (Only samples from individuals who were not ascertained for having a neurological condition in a neurological case/control study.)

non-neuro genome gnomAD allele frequency percent
exome gnomAD AC Allele counts for the exome-only subset of gnomAD v2.1.
exome gnomAD AN Allele numbers for the exome-only subset of gnomAD v2.1.
exome gnomAD AF

Allele frequencies for the exome-only subset of gnomAD v2.1.

A total of 125,748 exomes. (Cited from https://gnomad.broadinstitute.org/faq)

exome gnomAD allele frequency
exome gnomAD AF percent

Allele frequencies for the exome-only subset of gnomAD v2.1, as a percentage (i.e. multiplied by 100.0).

exome gnomAD allele frequency percent
exome gnomAD controls AC Controls-only allele counts for the exome-only subset of gnomAD v2.1. (Only samples from individuals who were not selected as a case in a case/control study of common disease.)
exome gnomAD controls AN Controls-only allele numbers for the exome-only subset of gnomAD v2.1. (Only samples from individuals who were not selected as a case in a case/control study of common disease.)
exome gnomAD controls AF

Controls-only allele frequencies for the exome-only subset of gnomAD v2.1. (Only samples from individuals who were not selected as a case in a case/control study of common disease.)

controls exome gnomAD allele frequency
exome gnomAD controls AF percent

Controls-only allele frequencies for the exome-only subset of gnomAD v2.1, as a percentage (i.e. multiplied by 100.0). (Only samples from individuals who were not selected as a case in a case/control study of common disease.)

controls exome gnomAD allele frequency percent
exome gnomAD non-neuro AC Non-neuro allele counts for the exome-only subset of gnomAD v2.1. (Only samples from individuals who were not ascertained for having a neurological condition in a neurological case/control study)
exome gnomAD non-neuro AN Non-neuro allele numbers for the exome-only subset of gnomAD v2.1. (Only samples from individuals who were not ascertained for having a neurological condition in a neurological case/control study)
exome gnomAD non-neuro AF

Non-neuro allele frequencies for the exome-only subset of gnomAD v2.1. (Only samples from individuals who were not ascertained for having a neurological condition in a neurological case/control study)

non-neuro exome gnomAD allele frequency
exome gnomAD non-neuro AF percent

Non-neuro allele frequencies for the exome-only subset of gnomAD v2.1, as a percentage (i.e. multiplied by 100.0). (Only samples from individuals who were not ascertained for having a neurological condition in a neurological case/control study.)

non-neuro exome gnomAD allele frequency percent
MPC

MPC - Missense badness, PolyPhen-2, and Constraint

Downloaded from: MPC download link

MPC
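
Two of the definitions in the Genomic Scores table reduce to simple arithmetic: the PHRED-like scaling quoted for CADD_phred, -10*log10(rank/total), and the gnomAD allele frequency, ac/an, with its percent variant. The sketch below only illustrates those formulas and is not how the scores in the browser are produced. The function names are hypothetical, and it assumes that rank 1 denotes the variant with the highest raw score and that a site with an allele number of 0 has no defined frequency.

```python
import math


def phred_scaled(rank, total):
    """PHRED-like score: -10 * log10(rank / total).

    Assumption: rank 1 is the most extreme (highest raw score) variant
    among `total` ranked substitutions.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and total")
    return -10.0 * math.log10(rank / total)


def allele_frequency(ac, an):
    """Allele frequency as ac / an; None when no chromosomes were called."""
    if an == 0:
        return None
    return ac / an


def allele_frequency_percent(ac, an):
    """Allele frequency expressed as a percentage (multiplied by 100.0)."""
    af = allele_frequency(ac, an)
    return None if af is None else af * 100.0


if __name__ == "__main__":
    # A variant ranked in the top 1% of all substitutions.
    print(phred_scaled(10, 1000))
    # 5 alternate alleles among 1000 called chromosomes.
    print(allele_frequency(5, 1000), allele_frequency_percent(5, 1000))
```

Under this scaling, a variant in the top 10% of ranked substitutions scores about 10 and one in the top 1% scores about 20, which is why the scaled score is easier to compare across variants than the raw score.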