We’ve analyzed over 1,000 more phenotypes using the UK Biobank’s latest data


Two weeks ago, we became the first research group to post an analysis of the newly released UK Biobank exome sequence data on 50,000 individuals. We have since expanded the list of phenotypes analyzed to include the blood biomarker data and the brain, heart, and abdominal imaging data that the UK Biobank recently made available. The total is now 2,742 phenotypes, including 29 blood biomarkers and 908 imaging phenotypes.
Among the new results, the blood biomarkers are definitely the stars of the show, driving more than 40 new statistically significant associations, including the expected associations between loss-of-function variants in PCSK9 and LDL levels. We additionally find that rare variants in STAB1, a transmembrane receptor thought to play a role in angiogenesis, are associated with MRI measurements in several brain structures. The strongest of these signals is for an MRI measurement reflecting blood flow in the putamen (Median T2star).
We have also made several improvements to our interactive search tool, where you can explore the results yourself. In addition to the new phenotypes, clicking on a gene now shows the individual rare variants driving its association with a trait, so you can dive right into the data underlying any association of interest. You can also download the full result dataset and a summary of the top hits here.

Blood biomarkers

| Gene | Trait(s) | p-value | Freq | Model |
|---|---|---|---|---|
| ALPL | Alkaline phosphatase; Phosphate | 2.30E-168 | 0.33% | coding; LoF |
| SLC22A12 | Urate | 1.10E-106 | 0.41% | coding |
| GOT1 | Aspartate aminotransferase | 7.90E-68 | 0.12% | coding; LoF |
| CST3 | Cystatin C | 1.70E-48 | 0.03% | LoF; coding |
| ABCA1 | Apolipoprotein A; HDL cholesterol | 7.00E-38 | 0.60% | coding; LoF |
| GPLD1 | Alkaline phosphatase | 2.70E-33 | 0.42% | coding; LoF |
| SHBG | SHBG | 1.50E-32 | 0.12% | LoF; coding |
| GPT | Alanine aminotransferase | 6.30E-27 | 0.16% | coding |
| APOB | LDL direct; Cholesterol; Apolipoprotein B; Triglycerides | 1.10E-26 | 0.12% | LoF |
| PCSK9 | LDL direct; Apolipoprotein B; Cholesterol | 1.10E-25 | 0.27% | coding; LoF |
| SLC2A9 | Urate | 7.20E-19 | 0.18% | coding |
| ALB | Albumin | 2.80E-18 | 0.02% | LoF; coding |
| ASGR1 | Alkaline phosphatase | 2.40E-16 | 0.16% | coding; LoF |
| ABCB11 | Alkaline phosphatase | 2.00E-13 | 0.23% | coding |
| LCAT | HDL cholesterol; Apolipoprotein A | 5.90E-13 | 0.12% | coding |
| GCK | Glycated haemoglobin; Glucose | 6.90E-13 | 0.07% | coding |
| APOA5 | Triglycerides | 9.00E-13 | 0.06% | LoF |
| ANGPTL3 | Apolipoprotein A; Cholesterol; Triglycerides | 1.40E-12 | 0.22% | coding; LoF |
| CRP | C-reactive protein | 3.20E-10 | 0.09% | coding |
| LIPC | Apolipoprotein A; HDL cholesterol | 4.20E-10 | 0.35% | coding |
| CETP | HDL cholesterol | 7.50E-10 | 0.05% | LoF; coding |
| UGT1A1 | Total bilirubin; Direct bilirubin | 6.40E-09 | 0.08% | coding |
| SCARB1 | HDL cholesterol | 8.10E-09 | 0.20% | coding |
| UGT1A6 | Total bilirubin | 1.80E-08 | 0.34% | coding |
| SLCO1B3 | Total bilirubin | 2.70E-08 | 0.62% | coding |
| FCGRT | Albumin | 8.80E-08 | 0.08% | coding |
| VTN | Cystatin C | 9.50E-08 | 0.02% | LoF |
| TBX10 | Phosphate | 1.20E-07 | 0.05% | LoF |
| NR1H3 | HDL cholesterol | 4.30E-07 | 0.24% | coding |


Imaging

| Gene | Trait(s) | p-value | Freq | Model |
|---|---|---|---|---|
| STAB1 | Median T2star in putamen, caudate | 1.40E-13 | 0.18% | LoF; coding |
| MADCAM1 | Volume of grey matter in Crus I Cerebellum | 1.60E-07 | 0.10% | coding |
| FAM177A1 | Mean FA in cingulum hippocampus on FA skeleton | 4.10E-07 | 0.06% | coding |
| BARHL2 | Weighted-mean MO in tract middle cerebellar peduncle | 4.50E-07 | 0.09% | coding |
| ZRANB3 | Volume of grey matter in Inferior Temporal Gyrus, posterior division | 5.60E-07 | 0.32% | coding |
| RECQL | Weighted-mean ICVF in tract cingulate gyrus part of cingulum | 7.50E-07 | 0.30% | coding |
| PEX2 | Mean FA in corticospinal tract on FA skeleton | 8.20E-07 | 0.10% | coding |


The methods we used were the same as those previously reported, but we now also make some extra association information available. Our main analysis of case-control traits requires at least 10 carriers expected in the minimum group based on the overall carrier frequency. Associations that did not meet this criterion, but had at least 10 carriers in the smallest group, are also of interest, and they are now highlighted in the downloadable summary file of our top associations. They should be interpreted with caution, as they are more prone to false positives than the associations in the more stringent tables shown above. Also note that >50% of these associations were explained by a single variant. That was true of <10% of the associations that met our stringent criteria, and it is not the type of association that gene-based collapsing models are designed to identify. With these caveats, we show below the associations with p < 1×10⁻⁹.
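To make the two tiers described above concrete, here is a minimal Python sketch of one way to read the carrier-count criteria. It is an illustration, not our pipeline code: the function names, the tier labels, and the assumption that "expected carriers" means the overall carrier frequency multiplied by the size of the smaller group are all ours for the purpose of this example.

```python
def expected_carriers_in_smaller_group(n_cases: int, n_controls: int,
                                       n_carriers: int) -> float:
    """Carriers expected in the smaller group if carriers were spread
    across cases and controls in proportion to group size (assumed reading)."""
    total = n_cases + n_controls
    if total <= 0:
        raise ValueError("need at least one individual")
    carrier_freq = n_carriers / total          # overall carrier frequency
    return carrier_freq * min(n_cases, n_controls)


def association_tier(n_cases: int, n_controls: int,
                     carriers_in_cases: int, carriers_in_controls: int,
                     min_carriers: int = 10) -> str:
    """Return a hypothetical tier label for one gene-trait test."""
    n_carriers = carriers_in_cases + carriers_in_controls
    expected = expected_carriers_in_smaller_group(n_cases, n_controls, n_carriers)
    if expected >= min_carriers:
        return "main"                          # meets the stringent criterion
    # Otherwise fall back to the observed carrier count in the smaller group.
    observed = carriers_in_cases if n_cases <= n_controls else carriers_in_controls
    if observed >= min_carriers:
        return "low_expected"                  # reported, but interpret with caution
    return "excluded"


# Example: 1,000 cases, 49,000 controls, 12 case carriers, 100 control carriers.
# About 2.24 carriers are expected among cases, yet 12 were observed.
print(association_tier(1000, 49000, 12, 100))
```

Under this reading, a gene lands in the second tier exactly when carriers are much more concentrated in the smaller group than the overall frequency predicts, which is why such results can be driven by a handful of individuals.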

Low expected case carriers

This research has been conducted using the UK Biobank Resource under Application Number 40436.
