Professor Inga Prokopenko
Publications
Cardiovascular disease (CVD) represents the most common and lethal chronic disease worldwide. Lipid levels are the strongest risk factors for CVD, as demonstrated by the fact that lipid-lowering statin therapy is widely used to prevent CVD. The role of the KIF6 gene in response to statin therapy is controversial, and the biological mechanism through which it may act is still unknown. We investigated the role of KIF6 locus variants alone and their interaction with the well-established lipid locus at HMGCR in the variability of metabolic traits and in response to statin therapy in an Italian sample. We genotyped two intronic variants (rs20455, rs9462535) and a coding variant (rs9471077) within the KIF6 gene, as well as two non-coding variants (rs3761740 and rs3846662) at HMGCR. We tested the association of these SNPs with 19 cardiometabolic phenotypes and lipid-lowering therapy response in a sample of 1645 individuals from the Brisighella cohort (BC). The established variant rs3846662 (Willer et al, Nat Gen 2013) at HMGCR is associated (P = 8.5 x 10(-4)) with LDL cholesterol (LDL-C) in BC. We did not find any significant association of KIF6 variants with response to statin therapy. We observed locus-wide significant associations at KIF6 between rs9471077 and APOB levels and between rs20455 and HDL-C (P < 0.001). rs3761740 at HMGCR showed an effect on systolic and diastolic blood pressure (SBP/DBP, P < 0.007), which, however, was not significant after multiple testing correction. This is the first genetic study reported for the Brisighella cohort, and it confirms the association with LDL-C at the HMGCR locus. We noticed an effect of KIF6 variants on APOB and HDL-C, but did not observe any effect on response to statin therapy. The study sample is relatively small for discovering common-variant effects, and these associations might still be due to chance; therefore, we are seeking replication in additional cohorts. These findings, if confirmed, might contribute to the development of approaches for stratified patient care.
Abdominal aortic aneurysm (AAA) is a common cause of morbidity and mortality and has a significant heritability. We carried out a genome-wide association discovery study of 1866 patients with AAA and 5435 controls and replication of promising signals (lead SNP with a p value < 1 x 10(-5)) in 2871 additional cases and 32,687 controls and performed further follow-up in 1491 AAA and 11,060 controls. In the discovery study, nine loci demonstrated association with AAA (p < 1 x 10(-5)). In the replication sample, the lead SNP at one of these loci, rs1466535, located within intron 1 of low-density-lipoprotein receptor-related protein 1 (LRP1) demonstrated significant association (p = 0.0042). We confirmed the association of rs1466535 and AAA in our follow-up study (p = 0.035). In a combined analysis (6228 AAA and 49182 controls), rs1466535 had a consistent effect size and direction in all sample sets (combined p = 4.52 x 10(-10), odds ratio 1.15 [1.10-1.21]). No associations were seen for either rs1466535 or the 12q13.3 locus in independent association studies of coronary artery disease, blood pressure, diabetes, or hyperlipidaemia, suggesting that this locus is specific to AAA. Gene-expression studies demonstrated a trend toward increased LRP1 expression for the rs1466535 CC genotype in arterial tissues; there was a significant (p = 0.029) 1.19-fold (1.04-1.36) increase in LRP1 expression in CC homozygotes compared to TT homozygotes in aortic adventitia. Functional studies demonstrated that rs1466535 might alter a SREBP-1 binding site and influence enhancer activity at the locus. In conclusion, this study has identified a biologically plausible genetic variant associated specifically with AAA, and we suggest that this variant has a possible functional role in LRP1 expression.
OBJECTIVE-Many genetic variants have been associated with glucose homeostasis and type 2 diabetes in genome-wide association studies. Zinc is an essential micronutrient that is important for beta-cell function and glucose homeostasis. We tested the hypothesis that zinc intake could influence the glucose-raising effect of specific variants. RESEARCH DESIGN AND METHODS-We conducted a 14-cohort meta-analysis to assess the interaction of 20 genetic variants known to be related to glycemic traits and zinc metabolism with dietary zinc intake (food sources) and a 5-cohort meta-analysis to assess the interaction with total zinc intake (food sources and supplements) on fasting glucose levels among individuals of European ancestry without diabetes. RESULTS-We observed a significant association of total zinc intake with lower fasting glucose levels (beta-coefficient +/- SE per 1 mg/day of zinc intake: -0.0012 +/- 0.0003 mmol/L, summary P value = 0.0003), while the association of dietary zinc intake was not significant. We identified a nominally significant interaction between total zinc intake and the SLC30A8 rs11558471 variant on fasting glucose levels (beta-coefficient +/- SE per A allele for 1 mg/day of greater total zinc intake: -0.0017 +/- 0.0006 mmol/L, summary interaction P value = 0.005); this result suggests a stronger inverse association between total zinc intake and fasting glucose in individuals carrying the glucose-raising A allele compared with individuals who do not carry it. None of the other interaction tests were statistically significant. CONCLUSIONS-Our results suggest that higher total zinc intake may attenuate the glucose-raising effect of the rs11558471 SLC30A8 (zinc transporter) variant. Our findings also support evidence for the association of higher total zinc intake with lower fasting glucose levels. Diabetes 60:2407-2416, 2011
To identify novel coding association signals and facilitate characterization of mechanisms influencing glycemic traits and type 2 diabetes risk, we analyzed 109,215 variants derived from exome array genotyping together with an additional 390,225 variants from exome sequence in up to 39,339 normoglycemic individuals from five ancestry groups. We identified a novel association between the coding variant (p.Pro50Thr) in AKT2 and fasting plasma insulin (FI), a gene in which rare fully penetrant mutations are causal for monogenic glycemic disorders. The low-frequency allele is associated with a 12% increase in FI levels. This variant is present at 1.1% frequency in Finns but virtually absent in individuals from other ancestries. Carriers of the FI-increasing allele had increased 2-h insulin values, decreased insulin sensitivity, and increased risk of type 2 diabetes (odds ratio 1.05). In cellular studies, the AKT2-Thr50 protein exhibited a partial loss of function. We extend the allelic spectrum for coding variants in AKT2 associated with disorders of glucose homeostasis and demonstrate bidirectional effects of variants within the pleckstrin homology domain of AKT2.
Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for (i) organizational aspects of GWAMAs, and for (ii) QC at the study file level, the meta-level across studies and the meta-analysis output level. Real-world examples highlight issues experienced and solutions developed by the GIANT Consortium that has conducted meta-analyses including data from 125 studies comprising more than 330,000 individuals. We provide a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. We also include details for the use of a powerful and flexible software package called EasyQC. Precise timings will be greatly influenced by consortium size. For consortia of comparable size to the GIANT Consortium, this protocol takes a minimum of about 10 months to complete.
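As a rough illustration of what QC at the study file level can involve, the sketch below screens rows of a summary-statistics file for implausible values. It is not EasyQC and does not reproduce its interface: the field names (`effect_allele`, `other_allele`, `eaf`, `beta`, `se`, `p`) and the specific checks are assumptions made for this example, and the allele check assumes SNPs only.

```python
import math

# Assumption: SNP-only data, so indel allele codes would be flagged by this sketch.
VALID_ALLELES = {"A", "C", "G", "T"}


def check_record(rec):
    """Return a list of problems found in one summary-statistics row (empty if clean)."""
    problems = []
    if rec["effect_allele"] not in VALID_ALLELES or rec["other_allele"] not in VALID_ALLELES:
        problems.append("invalid allele code")
    elif rec["effect_allele"] == rec["other_allele"]:
        problems.append("identical alleles")
    # Chained comparisons are False for NaN, so missing values are flagged too.
    if not 0.0 <= rec["eaf"] <= 1.0:
        problems.append("allele frequency outside [0, 1]")
    if not math.isfinite(rec["beta"]):
        problems.append("non-finite effect estimate")
    if not (math.isfinite(rec["se"]) and rec["se"] > 0.0):
        problems.append("non-positive or non-finite standard error")
    if not 0.0 <= rec["p"] <= 1.0:
        problems.append("P value outside [0, 1]")
    return problems


def qc_study_file(records):
    """Split rows into clean ones and a count of each problem type."""
    clean, problem_counts = [], {}
    for rec in records:
        problems = check_record(rec)
        if not problems:
            clean.append(rec)
        for problem in problems:
            problem_counts[problem] = problem_counts.get(problem, 0) + 1
    return clean, problem_counts
```

Counting problems per type, rather than only dropping rows, makes it easier to spot a study file with a systematic issue, such as a mislabelled column.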
Genome-wide association studies have facilitated the discovery of thousands of loci for hundreds of phenotypes. However, the issue of missing heritability remains unsolved for most complex traits. Locus discovery could be enhanced with both improved power through multi-phenotype analysis (MPA) and use of a wider allele frequency range, including rare variants (RVs). MPA methods for single-variant association have been proposed, but given their low power for RVs, more efficient approaches are required. We propose multi-phenotype analysis of rare variants (MARV), a burden test-based method for RVs extended to the joint analysis of multiple phenotypes through a powerful reverse regression technique. Specifically, MARV models the proportion of RVs at which minor alleles are carried by individuals within a genomic region as a linear combination of multiple phenotypes, which can be both binary and continuous, and the method accommodates directly the genotyped and imputed data. The full model, including all phenotypes, is tested for association for discovery, and a more thorough dissection of the phenotype combinations for any set of RVs is also enabled. We show, via simulations, that the type I error rate is well controlled under various correlations between two continuous phenotypes, and that the method outperforms a univariate burden test in all considered scenarios. Application of MARV to 4876 individuals from the Northern Finland Birth Cohort 1966 for triglycerides, high- and low-density lipoprotein cholesterols highlights known loci with stronger signals of association than those observed in univariate RV analyses and suggests novel RV effects for these lipid traits.
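The reverse regression idea described above can be sketched in a few lines: compute, for each individual, the proportion of rare variants in a region at which a minor allele is carried, then regress that proportion on the phenotypes. This is a minimal illustration, not the MARV software. The MAF threshold, the use of hard-called genotypes only, the plain least-squares fit and the overall F statistic are all assumptions of the sketch, and the function names are hypothetical.

```python
def rare_variant_burden(genotypes, maf_threshold=0.05):
    """Per-individual proportion of rare variants carrying at least one minor allele.

    genotypes: list of rows (one per individual) of minor-allele counts 0/1/2.
    maf_threshold is an assumed cut-off for calling a variant rare.
    """
    n = len(genotypes)
    n_variants = len(genotypes[0])
    # Minor-allele frequency of each variant in this sample.
    maf = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(n_variants)]
    rare = [j for j in range(n_variants) if 0.0 < maf[j] < maf_threshold]
    if not rare:
        raise ValueError("no rare variants in this region")
    return [sum(1 for j in rare if row[j] > 0) / len(rare) for row in genotypes]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * m
    for i in reversed(range(m)):
        x[i] = (aug[i][m] - sum(aug[i][c] * x[c] for c in range(i + 1, m))) / aug[i][i]
    return x


def reverse_regression(burden, phenotypes):
    """Regress the burden on all phenotypes; return (coefficients, F statistic).

    phenotypes: list of rows (one per individual), one column per phenotype.
    The first coefficient is the intercept.
    """
    n, k = len(burden), len(phenotypes[0])
    design = [[1.0] + list(row) for row in phenotypes]
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(design[i][r] * design[i][c] for i in range(n)) for c in range(k + 1)]
           for r in range(k + 1)]
    xty = [sum(design[i][r] * burden[i] for i in range(n)) for r in range(k + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    rss = sum((y - f) ** 2 for y, f in zip(burden, fitted))
    mean_y = sum(burden) / n
    tss = sum((y - mean_y) ** 2 for y in burden)
    # Overall F statistic for the full model against the intercept-only model.
    f_stat = ((tss - rss) / k) / (rss / (n - k - 1))
    return beta, f_stat
```

Putting the genotype summary on the left-hand side is what lets binary and continuous phenotypes enter the same model as ordinary predictors.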
Type 2 diabetes (T2D) is more prevalent in African Americans than in Europeans. However, little is known about the genetic risk in African Americans despite the recent identification of more than 70 T2D loci primarily by genome-wide association studies (GWAS) in individuals of European ancestry. In order to investigate the genetic architecture of T2D in African Americans, the MEta-analysis of type 2 DIabetes in African Americans (MEDIA) Consortium examined 17 GWAS on T2D comprising 8,284 cases and 15,543 controls in African Americans in stage 1 analysis. Single nucleotide polymorphism (SNP) association analysis was conducted in each study under the additive model after adjustment for age, sex, study site, and principal components. Meta-analysis of approximately 2.6 million genotyped and imputed SNPs in all studies was conducted using an inverse variance-weighted fixed effect model. Replications were performed to follow up 21 loci in up to 6,061 cases and 5,483 controls in African Americans, and 8,130 cases and 38,987 controls of European ancestry. We identified three known loci (TCF7L2, HMGA2 and KCNQ1) and two novel loci (HLA-B and INS-IGF2) at genome-wide significance (4.15 x 10(-94) < P < 5 x 10(-8), odds ratio (OR) = 1.09 to 1.36). Fine-mapping revealed that 88 of 158 previously identified T2D or glucose homeostasis loci demonstrated nominal to highly significant association (2.2 x 10(-23) < locus-wide P
To identify genetic variants associated with head circumference in infancy, we performed a meta-analysis of seven genome-wide association studies (GWAS) (N = 10,768 individuals of European ancestry enrolled in pregnancy and/or birth cohorts) and followed up three lead signals in six replication studies (combined N = 19,089). rs7980687 on chromosome 12q24 (P = 8.1 x 10(-9)) and rs1042725 on chromosome 12q15 (P = 2.8 x 10(-10)) were robustly associated with head circumference in infancy. Although these loci have previously been associated with adult height(1), their effects on infant head circumference were largely independent of height (P = 3.8 x 10(-7) for rs7980687 and P = 1.3 x 10(-7) for rs1042725 after adjustment for infant height). A third signal, rs11655470 on chromosome 17q21, showed suggestive evidence of association with head circumference (P = 3.9 x 10(-6)). SNPs correlated to the 17q21 signal have shown genome-wide association with adult intracranial volume(2), Parkinson's disease and other neurodegenerative diseases(3-5), indicating that a common genetic variant in this region might link early brain growth with neurological disease in later life.
To identify common variants influencing body mass index (BMI), we analyzed genome-wide association data from 16,876 individuals of European descent. After previously reported variants in FTO, the strongest association signal (rs17782313, P = 2.9 x 10(-6)) mapped 188 kb downstream of MC4R (melanocortin-4 receptor), mutations of which are the leading cause of monogenic severe childhood-onset obesity. We confirmed the BMI association in 60,352 adults (per-allele effect = 0.05 Z-score units; P = 2.8 x 10(-15)) and 5,988 children aged 7-11 (0.13 Z-score units; P = 1.5 x 10(-8)). In case-control analyses (n = 10,583), the odds for severe childhood obesity reached 1.30 (P = 8.0 x 10(-11)). Furthermore, we observed overtransmission of the risk allele to obese offspring in 660 families (P (pedigree disequilibrium test average; PDT-avg) = 2.4 x 10(-4)). The SNP location and patterns of phenotypic associations are consistent with effects mediated through altered MC4R function. Our findings establish that common variants near MC4R influence fat mass, weight and obesity risk at the population level and reinforce the need for large-scale data integration to identify variants influencing continuous biomedical traits.
Reference panels from the 1000 Genomes (1000G) Project Consortium provide near complete coverage of common and low-frequency genetic variation with minor allele frequency ≥0.5% across European ancestry populations. Within the European Network for Genetic and Genomic Epidemiology (ENGAGE) Consortium, we have undertaken the first large-scale meta-analysis of genome-wide association studies (GWAS), supplemented by 1000G imputation, for four quantitative glycaemic and obesity-related traits, in up to 87,048 individuals of European ancestry. We identified two loci for body mass index (BMI) at genome-wide significance, and two for fasting glucose (FG), none of which has been previously reported in larger meta-analysis efforts to combine GWAS of European ancestry. Through conditional analysis, we also detected multiple distinct signals of association mapping to established loci for waist-hip ratio adjusted for BMI (RSPO3) and FG (GCK and G6PC2). The index variant for one association signal at the G6PC2 locus is a low-frequency coding allele, H177Y, which has recently been demonstrated to have a functional role in glucose regulation. Fine-mapping analyses revealed that the non-coding variants most likely to drive association signals at established and novel loci were enriched for overlap with enhancer elements, which for FG mapped to promoter and transcription factor binding sites in pancreatic islets, in particular. Our study demonstrates that 1000G imputation and genetic fine-mapping of common and low-frequency variant association signals at GWAS loci, integrated with genomic annotation in relevant tissues, can provide insight into the functional and regulatory mechanisms through which their effects on glycaemic and obesity-related traits are mediated.
Endometriosis is a chronic inflammatory condition in women that results in pelvic pain and subfertility, and has been associated with decreased body mass index (BMI). Genetic variants contributing to the heritable component have started to emerge from genome-wide association studies (GWAS), although the majority remain unknown. Unexpectedly, we observed an intergenic locus on 7p15.2 that was genome-wide significantly associated with both endometriosis and fat distribution (waist-to-hip ratio adjusted for BMI; WHRadjBMI) in an independent meta-GWAS of European ancestry individuals. This led us to investigate the potential overlap in genetic variants underlying the aetiology of endometriosis, WHRadjBMI and BMI using GWAS data. Our analyses demonstrated significant enrichment of common variants between fat distribution and endometriosis (P = 3.7 x 10(-3)), which was stronger when we restricted the investigation to more severe (Stage B) cases (P = 4.5 x 10(-4)). However, no genetic enrichment was observed between endometriosis and BMI (P = 0.79). In addition to 7p15.2, we identify four more variants with statistically significant evidence of involvement in both endometriosis and WHRadjBMI (in/near KIFAP3, CAB39L, WNT4, GRB14); two of these, KIFAP3 and CAB39L, are novel associations for both traits. KIFAP3, WNT4 and 7p15.2 are associated with the WNT signalling pathway; formal pathway analysis confirmed a statistically significant (P = 6.41 x 10(-4)) overrepresentation of shared associations in developmental processes/WNT signalling between the two traits. Our results demonstrate an example of potential biological pleiotropy that was hitherto unknown, and represent an opportunity for functional follow-up of loci and further cross-phenotype comparisons to assess how fat distribution and endometriosis pathogenesis research fields can inform each other.
Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence the phenotype. Genome-wide association (GWA) studies have identified more than 600 variants associated with human traits(1), but these typically explain small fractions of phenotypic variation, raising questions about the use of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait(2,3). The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P = 0.016) and that underlie skeletal growth defects (P
OBJECTIVE-Glycated hemoglobin (HbA(1c)), used to monitor and diagnose diabetes, is influenced by average glycemia over a 2- to 3-month period. Genetic factors affecting expression, turnover, and abnormal glycation of hemoglobin could also be associated with increased levels of HbA(1c). We aimed to identify such genetic factors and investigate the extent to which they influence diabetes classification based on HbA(1c) levels. RESEARCH DESIGN AND METHODS-We studied associations with HbA(1c) in up to 46,368 nondiabetic adults of European descent from 23 genome-wide association studies (GWAS) and 8 cohorts with de novo genotyped single nucleotide polymorphisms (SNPs). We combined studies using inverse-variance meta-analysis and tested mediation by glycemia using conditional analyses. We estimated the global effect of HbA(1c) loci using a multilocus risk score, and used net reclassification to estimate genetic effects on diabetes screening. RESULTS-Ten loci reached genome-wide significant association with HbA(1c), including six new loci near FN3K (lead SNP/P value, rs1046896/P = 1.6 x 10(-26)), HFE (rs1800562/P = 2.6 x 10(-20)), TMPRSS6 (rs855791/P = 2.7 x 10(-14)), ANK1 (rs4737009/P = 6.1 x 10(-12)), SPTA1 (rs2779116/P = 2.8 x 10(-9)) and ATP11A/TUBGCP3 (rs7998202/P = 5.2 x 10(-9)), and four known HbA(1c) loci: HK1 (rs16926246/P = 3.1 x 10(-54)), MTNR1B (rs1387153/P = 4.0 x 10(-11)), GCK (rs1799884/P = 1.5 x 10(-20)) and G6PC2/ABCB11 (rs552976/P = 8.2 x 10(-18)). We show that associations with HbA(1c) are partly a function of hyperglycemia associated with 3 of the 10 loci (GCK, G6PC2 and MTNR1B). The seven nonglycemic loci accounted for a 0.19 (%HbA(1c)) difference between the extreme 10% tails of the risk score, and would reclassify approximately 2% of a general white population screened for diabetes with HbA(1c). CONCLUSIONS-GWAS identified 10 genetic loci reproducibly associated with HbA(1c). Six are novel and seven map to loci where rarer variants cause hereditary anemias and iron storage disorders. Common variants at these loci likely influence HbA(1c) levels via erythrocyte biology, and confer a small but detectable reclassification of diabetes diagnosis by HbA(1c). Diabetes 59:3229-3239, 2010
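Inverse-variance meta-analysis, mentioned in the methods above, pools per-study effect estimates by weighting each one by the reciprocal of its squared standard error. The following is a minimal fixed-effect sketch under that standard definition. It is not the pipeline used in the study, the function name is hypothetical, and the two-sided P value assumes a normal approximation for the pooled estimate.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas: per-study effect estimates for one variant (same allele, same units).
    ses:   their standard errors.
    Returns (pooled beta, pooled SE, Z score, two-sided P value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided normal tail probability: P = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value
```

For example, `fixed_effect_meta([0.10, 0.20], [0.1, 0.1])` weights the two studies equally and returns a pooled estimate of 0.15 with a standard error smaller than either input.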
The maintenance of normal body weight is disrupted in patients with anorexia nervosa (AN) for prolonged periods of time. Prior to the onset of AN, premorbid body mass index (BMI) spans the entire range from underweight to obese. After recovery, patients have reduced rates of overweight and obesity. As such, loci involved in body weight regulation may also be relevant for AN and vice versa. Our primary analysis comprised a cross-trait analysis of the 1000 single-nucleotide polymorphisms (SNPs) with the lowest P-values in a genome-wide association meta-analysis (GWAMA) of AN (GCAN) for evidence of association in the largest published GWAMA for BMI (GIANT). Subsequently we performed sex-stratified analyses for these 1000 SNPs. Functional ex vivo studies on four genes ensued. Lastly, a look-up of GWAMA-derived BMI-related loci was performed in the AN GWAMA. We detected significant associations (P-values
Stature is a classical and highly heritable complex trait, with 80%–90% of variation explained by genetic factors. In recent years, genome-wide association studies (GWAS) have successfully identified many common additive variants influencing human height; however, little attention has been given to the potential role of recessive genetic effects. Here, we investigated genome-wide recessive effects by an analysis of inbreeding depression on adult height in over 35,000 people from 21 different population samples. We found a highly significant inverse association between height and genome-wide homozygosity, equivalent to a height reduction of up to 3 cm in the offspring of first cousins compared with the offspring of unrelated individuals, an effect which remained after controlling for the effects of socio-economic status, an important confounder (χ² = 83.89, df = 1; p = 5.2 x 10(-20)). There was, however, a high degree of heterogeneity among populations: whereas the direction of the effect was consistent across most population samples, the effect size differed significantly among populations. It is likely that this reflects true biological heterogeneity: whether or not an effect can be observed will depend on both the variance in homozygosity in the population and the chance inheritance of individual recessive genotypes. These results predict that multiple, rare, recessive variants influence human height. Although this exploratory work focuses on height alone, the methodology developed is generally applicable to heritable quantitative traits (QT), paving the way for an investigation into inbreeding effects, and therefore genetic architecture, on a range of QT of biomedical importance.
Studies investigating the extent to which genetics influences human characteristics such as height have concentrated mainly on common variants of genes, where having one or two copies of a given variant influences the trait or risk of disease. This study explores whether a different type of genetic variant might also be important. We investigate the role of recessive genetic variants, where two identical copies of a variant are required to have an effect. By measuring genome-wide homozygosity (the phenomenon of inheriting two identical copies at a given point of the genome) in 35,000 individuals from 21 European populations, and by comparing this to individual height, we found that the more homozygous the genome, the shorter the individual. The offspring of first cousins (who have increased homozygosity) were predicted to be up to 3 cm shorter on average than the offspring of unrelated parents. Height is influenced by the combined effect of many recessive variants dispersed across the genome. This may also be true for other human characteristics and diseases, opening up a new way to understand how genetic variation influences our health.
As custom arrays are cheaper than generic GWAS arrays, larger sample size is achievable for gene discovery. Custom arrays can tag more variants through denser genotyping of SNPs at associated loci, but at the cost of losing genome-wide coverage. Balancing this trade-off is important for optimizing experimental designs. We quantified both the gain in captured SNP-heritability at known candidate regions and the loss due to imperfect genome-wide coverage for inflammatory bowel disease using immunochip (iChip) and imputed GWAS data on 61,251 and 38,550 samples, respectively. For Crohn's disease (CD), the iChip and GWAS data explained 19 and 26% of variation in liability, respectively, and SNPs in the densely genotyped iChip regions explained 13% of the SNP-heritability for both the iChip and GWAS data. For ulcerative colitis (UC), the iChip and GWAS data explained 15 and 19% of variation in liability, respectively, and the dense iChip regions explained 10 and 9% of the SNP-heritability in the iChip and the GWAS data. From bivariate analyses, estimates of the genetic correlation in risk between CD and UC were 0.75 (SE 0.017) and 0.62 (SE 0.042) for the iChip and GWAS data, respectively. We also quantified the SNP-heritability of genomic regions that did or did not contain the previous 163 GWAS hits for CD and UC, and SNP-heritability of the overlapping loci between the densely genotyped iChip regions and the 163 GWAS hits. For both diseases, over different genomic partitioning, the densely genotyped regions on the iChip tagged at least as much variation in liability as the corresponding regions in the GWAS data; however, a certain amount of tagged SNP-heritability in the GWAS data was lost using the iChip due to the low coverage at unselected regions. These results imply that custom arrays with a GWAS backbone will facilitate more gene discovery, both at associated and novel loci.
Multiple genetic variants have been associated with adult obesity and a few with severe obesity in childhood; however, less progress has been made in establishing genetic influences on common early-onset obesity. We performed a North American, Australian and European collaborative meta-analysis of 14 studies consisting of 5,530 cases (>= 95th percentile of body mass index (BMI)) and 8,318 controls (
OBJECTIVE-Recent genome-wide association studies have revealed loci associated with glucose and insulin-related traits. We aimed to characterize 19 such loci using detailed measures of insulin processing, secretion, and sensitivity to help elucidate their role in regulation of glucose control, insulin secretion and/or action. RESEARCH DESIGN AND METHODS-We investigated associations of loci identified by the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) with circulating proinsulin, measures of insulin secretion and sensitivity from oral glucose tolerance tests (OGTTs), euglycemic clamps, insulin suppression tests, or frequently sampled intravenous glucose tolerance tests in nondiabetic humans (n = 29,084). RESULTS-The glucose-raising allele in MADD was associated with abnormal insulin processing (a dramatic effect on higher proinsulin levels, but no association with insulinogenic index) at extremely persuasive levels of statistical significance (P = 2.1 x 10(-71)). Defects in insulin processing and insulin secretion were seen in glucose-raising allele carriers at TCF7L2, SLC30A8, GIPR, and C2CD4B. Abnormalities in early insulin secretion were suggested in glucose-raising allele carriers at MTNR1B, GCK, FADS1, DGKB, and PROX1 (lower insulinogenic index; no association with proinsulin or insulin sensitivity). Two loci previously associated with fasting insulin (GCKR and IGF1) were associated with OGTT-derived insulin sensitivity indices in a consistent direction. CONCLUSIONS-Genetic loci identified through their effect on hyperglycemia and/or hyperinsulinemia demonstrate considerable heterogeneity in associations with measures of insulin processing, secretion, and sensitivity. Our findings emphasize the importance of detailed physiological characterization of such loci for improved understanding of pathways associated with alterations in glucose homeostasis and eventually type 2 diabetes. Diabetes 59:1266-1275, 2010
The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.
To characterize type 2 diabetes (T2D)-associated variation across the allele frequency spectrum, we conducted a meta-analysis of genome-wide association data from 26,676 T2D case and 132,532 control subjects of European ancestry after imputation using the 1000 Genomes multiethnic reference panel. Promising association signals were followed up in additional data sets (of 14,545 or 7,397 T2D case and 38,994 or 71,604 control subjects). We identified 13 novel T2D-associated loci (P < 5 x 10(-8)), including variants near the GLP2R, GIP, and HLA-DQA1 genes. Our analysis brought the total number of independent T2D associations to 128 distinct signals at 113 loci. Despite substantially increased sample size and more complete coverage of low-frequency variation, all novel associations were driven by common single nucleotide variants. Credible sets of potentially causal variants were generally larger than those based on imputation with earlier reference panels, consistent with resolution of causal signals to common risk haplotypes. Stratification of T2D-associated loci based on T2D-related quantitative trait associations revealed tissue-specific enrichment of regulatory annotations in pancreatic islet enhancers for loci influencing insulin secretion and in adipocytes, monocytes, and hepatocytes for insulin action-associated loci. These findings highlight the predominant role played by common variants of modest effect and the diversity of biological mechanisms influencing T2D pathophysiology.
To extend understanding of the genetic architecture and molecular basis of type 2 diabetes (T2D), we conducted a meta-analysis of genetic variants on the Metabochip, including 34,840 cases and 114,981 controls, overwhelmingly of European descent. We identified ten previously unreported T2D susceptibility loci, including two showing sex-differentiated association. Genomewide analyses of these data are consistent with a long tail of additional common variant loci explaining much of the variation in susceptibility to T2D. Exploration of the enlarged set of susceptibility loci implicates several processes, including CREBBP-related transcription, adipocytokine signaling and cell cycle regulation, in diabetes pathogenesis.
Elevated serum urate concentrations can cause gout, a prevalent and painful inflammatory arthritis. By combining data from >140,000 individuals of European ancestry within the Global Urate Genetics Consortium (GUGC), we identified and replicated 28 genome-wide significant loci in association with serum urate concentrations (18 new regions in or near TRIM46, INHBB, SFMBT1, TMEM171, VEGFA, BAZ1B, PRKAG2, STC1, HNF4G, A1CF, ATXN2, UBE2Q2, IGF1R, NFAT5, MAF, HLF, ACVR1B-ACVRL1 and B3GNT4). Associations for many of the loci were of similar magnitude in individuals of non-European ancestry. We further characterized these loci for associations with gout, transcript expression and the fractional excretion of urate. Network analyses implicate the inhibins-activins signaling pathways and glucose metabolism in systemic urate control. New candidate genes for serum urate concentration highlight the importance of metabolic control of urate production and excretion, which may have implications for the treatment and prevention of gout.
We recently identified 68 genomic loci where common sequence variants are associated with platelet count and volume. Platelets are formed in the bone marrow by megakaryocytes, which are derived from hematopoietic stem cells by a process mainly controlled by transcription factors. The homeobox transcription factor MEIS1 is uniquely transcribed in megakaryocytes and not in the other lineage-committed blood cells. By ChIP-seq, we show that 5 of the 68 loci pinpoint a MEIS1 binding event within a group of 252 MK-overexpressed genes. In one such locus in DNM3, regulating platelet volume, the MEIS1 binding site falls within a region acting as an alternative promoter that is solely used in megakaryocytes, where allelic variation dictates different levels of a shorter transcript. The importance of dynamin activity to the latter stages of thrombopoiesis was confirmed by the observation that the inhibitor Dynasore reduced murine proplatelet formation in vitro. (Blood. 2012;120(24):4859-4868)
Genome-wide association studies have enabled identification of thousands of loci for hundreds of traits. Yet, for most human traits a substantial part of the estimated heritability is unexplained. This and recent advances in technology to produce high-dimensional data cost-effectively have led to method development beyond standard common variant analysis, including single-phenotype rare variant and multi-phenotype common variant analysis, with the latter increasing power for locus discovery and providing suggestions of pleiotropic effects. However, there are currently no optimal methods and tools for the combined analysis of rare variants and multiple phenotypes. We propose a user-friendly software tool MARV for Multi-phenotype Analysis of Rare Variants. The tool is based on a method that collapses rare variants within a genomic region and models the proportion of minor alleles in the rare variants on a linear combination of multiple phenotypes. MARV provides analyses of all phenotype combinations within one run and calculates the Bayesian Information Criterion to facilitate model selection. The running time increases with the size of the genetic data while the number of phenotypes to analyse has little effect both on running time and required memory. We illustrate the use of MARV with analysis of triglycerides (TG), fasting insulin (FI) and waist-to-hip ratio (WHR) in 4,721 individuals from the Northern Finland Birth Cohort 1966. The analysis suggests novel multi-phenotype effects for these metabolic traits at APOA5 and ZNF259, and at ZNF259 provides stronger support for association (P = 1.8 × 10 ) than observed in single phenotype rare variant analyses (P = 6.5 × 10 and P = 0.27). MARV is a computationally efficient, flexible and user-friendly software tool allowing rapid identification of rare variant effects on multiple phenotypes, thus paving the way for novel discoveries and insights into biology of complex traits.
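The collapsing idea described in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the MARV implementation: the function names, the `maf_cutoff` value, and the use of ordinary least squares with a Gaussian BIC (up to an additive constant) are all hypothetical choices made for the example. The sketch computes, for each individual, the proportion of minor alleles carried at rare variants in a region, regresses that proportion on every non-empty combination of phenotypes, and reports a BIC per model.

```python
import itertools
import math


def rare_allele_proportion(genotypes, mafs, maf_cutoff=0.01):
    """Collapse a region: per-individual proportion of minor alleles at rare variants.

    genotypes[i][j] is the minor-allele count (0, 1 or 2) of individual i at variant j.
    maf_cutoff is an illustrative threshold, not a value taken from the abstract.
    Assumes at least one variant falls below the cutoff.
    """
    rare = [j for j, maf in enumerate(mafs) if maf < maf_cutoff]
    return [sum(row[j] for j in rare) / (2 * len(rare)) for row in genotypes]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols_bic(y, columns):
    """Least-squares fit of y on an intercept plus the given columns; returns (beta, BIC)."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    rss = sum((y[i] - sum(design[i][a] * beta[a] for a in range(p))) ** 2 for i in range(n))
    # Gaussian BIC up to an additive constant; requires rss > 0.
    return beta, n * math.log(rss / n) + p * math.log(n)


def all_phenotype_models(burden, phenotypes):
    """Fit every non-empty phenotype combination; phenotypes maps name -> list of values."""
    results = {}
    names = sorted(phenotypes)
    for size in range(1, len(names) + 1):
        for combo in itertools.combinations(names, size):
            _, bic = ols_bic(burden, [phenotypes[name] for name in combo])
            results[combo] = bic
    return results
```

With three phenotypes such as TG, FI and WHR, `all_phenotype_models` returns seven BIC values, one per phenotype combination, and the lowest value would point to the preferred model under this simplified criterion.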
Body fat distribution, particularly centralized obesity, is associated with metabolic risk above and beyond total adiposity. We performed genome-wide association of abdominal adipose depots quantified using computed tomography (CT) to uncover novel loci for body fat distribution among participants of European ancestry. Subcutaneous and visceral fat were quantified in 5,560 women and 4,997 men from 4 population-based studies. Genome-wide genotyping was performed using standard arrays and imputed to ∼2.5 million HapMap SNPs. Each study performed a genome-wide association analysis of subcutaneous adipose tissue (SAT), visceral adipose tissue (VAT), VAT adjusted for body mass index, and VAT/SAT ratio (a metric of the propensity to store fat viscerally as compared to subcutaneously) in the overall sample and in women and men separately. A weighted z-score meta-analysis was conducted. For the VAT/SAT ratio, our most significant association was rs11118316 at the LYPLAL1 gene (p = 3.1 × 10^-9), previously identified in association with waist–hip ratio. For SAT, the most significant SNP was in the FTO gene (p = 5.9 × 10^-8). Given the known gender differences in body fat distribution, we performed sex-specific analyses. Our most significant finding was for VAT in women, rs1659258 near THNSL2 (p = 1.6 × 10^-8), but not men (p = 0.75). Validation of this SNP in the GIANT consortium data demonstrated a similar sex-specific pattern, with observed significance in women (p = 0.006) but not men (p = 0.24) for BMI and waist circumference (p = 0.04 [women], p = 0.49 [men]). Finally, we interrogated our data for the 14 recently published loci for body fat distribution (measured by waist–hip ratio adjusted for BMI); associations were observed at 7 of these loci. In contrast, we observed associations at only 7/32 loci previously identified in association with BMI; the majority of overlap was observed with SAT. 
Genome-wide association for visceral and subcutaneous fat revealed a SNP for VAT in women. More refined phenotypes for body composition and fat distribution can detect new loci not previously uncovered in large-scale GWAS of anthropometric traits. Body fat distribution, particularly centralized obesity, is associated with metabolic risk above and beyond total adiposity. We performed genome-wide association of abdominal adipose depots quantified using computed tomography (CT) to uncover novel loci for body fat distribution among participants of European ancestry. We quantified subcutaneous and visceral fat in more than 10,000 women and men who also had genome-wide association data available. Given the known gender differences in body fat distribution, we performed sex-specific analyses. Our most significant finding was for VAT in women, near the THNSL2 gene. These findings were not observed in men. We also interrogated our data for the 14 recently published loci for body fat distribution (measured by waist–hip ratio adjusted for BMI); associations were observed for 7 of these loci, most notably for VAT/SAT ratio. We conclude that genome-wide association for visceral and subcutaneous fat revealed a SNP for VAT in women. More refined phenotypes for body composition and fat distribution can detect new loci not uncovered in large-scale GWAS of anthropometric traits.
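A weighted z-score meta-analysis of the kind mentioned in the abstract above can be sketched as follows. The abstract does not state the weights used, so the square-root-of-sample-size weighting here is an assumption (a common convention), and the function name is hypothetical. The combined statistic is the weighted sum of per-study z-scores divided by the square root of the sum of squared weights.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-study z-scores using sqrt(sample size) weights (assumed weighting)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z_meta = numerator / denominator
    # Two-sided p-value under a standard normal null distribution.
    p_two_sided = math.erfc(abs(z_meta) / math.sqrt(2))
    return z_meta, p_two_sided
```

Two equally sized studies that each report z = 2 combine to z = 2√2, while effects of equal size in opposite directions cancel to z = 0, which is why the sign of each z-score must be aligned to the same allele before combining.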
The phenotypic effect of some single nucleotide polymorphisms (SNPs) depends on their parental origin. We present a novel approach to detect parent-of-origin effects (POEs) in genome-wide genotype data of unrelated individuals. The method exploits increased phenotypic variance in the heterozygous genotype group relative to the homozygous groups. We applied the method to >56,000 unrelated individuals to search for POEs influencing body mass index (BMI). Six lead SNPs were carried forward for replication in five family-based studies (of ~4,000 trios). Two SNPs replicated: the paternal rs2471083-C allele (located near the imprinted KCNK9 gene) and the paternal rs3091869-T allele (located near the SLC2A10 gene) increased BMI equally (beta = 0.11 (SD), P < 0.0027) compared to the respective maternal alleles. Real-time PCR experiments of lymphoblastoid cell lines from the CEPH families showed that expression of both genes was dependent on parental origin of the SNP alleles (P < 0.01). Our scheme opens new opportunities to exploit GWAS data of unrelated individuals to identify POEs and demonstrates that they play an important role in adult obesity.
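The variance signal that the abstract above relies on can be illustrated with a crude descriptive statistic. If the paternally and maternally inherited copies of an allele have different effects, heterozygotes form a mixture of two subgroups with different means, which inflates their phenotypic variance relative to homozygotes. The sketch below only computes the ratio of heterozygote variance to pooled homozygote variance; it is not the test statistic used in the study, and the function name and the 0/1/2 genotype coding are assumptions.

```python
from statistics import variance


def heterozygote_variance_ratio(phenotypes, genotypes):
    """Ratio of heterozygote phenotype variance to pooled homozygote variance.

    genotypes are coded 0, 1, 2 (copies of one allele); 1 marks heterozygotes.
    """
    groups = {0: [], 1: [], 2: []}
    for value, genotype in zip(phenotypes, genotypes):
        groups[genotype].append(value)
    # Pool the within-group variances of the two homozygous groups by degrees of freedom.
    hom_groups = [g for g in (groups[0], groups[2]) if len(g) > 1]
    pooled_df = sum(len(g) - 1 for g in hom_groups)
    pooled_var = sum((len(g) - 1) * variance(g) for g in hom_groups) / pooled_df
    return variance(groups[1]) / pooled_var
```

A ratio well above 1 is only suggestive: a formal analysis would need a proper variance-heterogeneity test and would have to rule out other causes of unequal variances.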
Background. Pulmonary arterial hypertension (PAH) is a rare disorder leading to premature death. Rare genetic variants contribute to disease etiology but the contribution of common genetic variation to disease risk and outcome remains poorly characterized. Methods. We performed two separate genome-wide association studies of PAH using data across 11,744 European-ancestry individuals (including 2,085 patients), one with genotypes from 5,895 whole genome sequences and another with genotyping array data from 5,849 further samples. Cross-validation of loci reaching genome-wide significance was sought by meta-analysis. We functionally annotated associated variants and tested associations with duration of survival. Findings. A locus at HLA-DPA1/DPB1 within the class II major histocompatibility (MHC) region and a second near SOX17 were significantly associated with PAH. The SOX17 locus contained two independent signals associated with PAH. Functional and epigenomic data indicate that the risk variants near SOX17 alter gene regulation via an enhancer active in endothelial cells. PAH risk variants determined haplotype-specific enhancer activity and CRISPR-inhibition of the enhancer reduced SOX17 expression. Analysis of median survival showed that PAH patients with two copies of the HLA-DPA1/DPB1 risk variant had a two-fold difference (>16 years versus 8 years), compared to patients homozygous for the alternative allele. Interpretation. We have found that common genetic variation at loci in HLA-DPA1/DPB1 and an enhancer near SOX17 are associated with PAH. Impairment of Sox17 function may be more common in PAH than suggested by rare mutations in SOX17. Allelic variation at HLA-DPB1 stratifies PAH patients for survival following diagnosis, with implications for future therapeutic trial design.
Funding. UK NIHR, BHF, UK MRC, Dinosaur Trust, NIH/NHLBI, ERS, EMBO, Wellcome Trust, EU, AHA, ACClinPharm, Netherlands CVRI, Dutch Heart Foundation, Dutch Federation of UMC, Netherlands OHRD and RNAS, German DFG, German BMBF, APH Paris, Inserm, Université Paris-Sud, and French ANR.
Sex hormone-binding globulin (SHBG) is a glycoprotein responsible for the transport and biologic availability of sex steroid hormones, primarily testosterone and estradiol. SHBG has been associated with chronic diseases including type 2 diabetes (T2D) and with hormone-sensitive cancers such as breast and prostate cancer. We performed a genome-wide association study (GWAS) meta-analysis of 21,791 individuals from 10 epidemiologic studies and validated these findings in 7,046 individuals in an additional six studies. We identified twelve genomic regions (SNPs) associated with circulating SHBG concentrations. Loci near the identified SNPs included SHBG (rs12150660, 17p13.1, p = 1.8 × 10^-106), PRMT6 (rs17496332, 1p13.3, p = 1.4 × 10^-11), GCKR (rs780093, 2p23.3, p = 2.2 × 10^-16), ZBTB10 (rs440837, 8q21.13, p = 3.4 × 10^-9), JMJD1C (rs7910927, 10q21.3, p = 6.1 × 10^-35), SLCO1B1 (rs4149056, 12p12.1, p = 1.9 × 10^-8), NR2F2 (rs8023580, 15q26.2, p = 8.3 × 10^-12), ZNF652 (rs2411984, 17q21.32, p = 3.5 × 10^-14), TDGF3 (rs1573036, Xq22.3, p = 4.1 × 10^-14), LHCGR (rs10454142, 2p16.3, p = 1.3 × 10^-7), BAIAP2L1 (rs3779195, 7q21.3, p = 2.7 × 10^-8), and UGT2B15 (rs293428, 4q13.2, p = 5.5 × 10^-6). These genes encompass multiple biologic pathways, including hepatic function, lipid metabolism, carbohydrate metabolism and T2D, androgen and estrogen receptor function, epigenetic effects, and the biology of sex steroid hormone-responsive cancers including breast and prostate cancer. We found evidence of sex-differentiated genetic influences on SHBG. In a sex-specific GWAS, the locus 4q13.2-UGT2B15 was significant in men only (men p = 2.5 × 10^-8, women p = 0.66, heterogeneity p = 0.003). Additionally, three loci showed strong sex-differentiated effects: 17p13.1-SHBG and Xq22.3-TDGF3 were stronger in men, whereas 8q21.12-ZBTB10 was stronger in women. 
Conditional analyses identified additional signals at the SHBG gene that together almost double the proportion of variance explained at the locus. Using an independent study of 1,129 individuals, all SNPs identified in the overall or sex-differentiated or conditional analyses explained ∼15.6% and ∼8.4% of the genetic variation of SHBG concentrations in men and women, respectively. The evidence for sex-differentiated effects and allelic heterogeneity highlights the importance of considering these features when estimating complex trait variance. Sex hormone-binding globulin (SHBG) is the key protein responsible for binding and transporting the sex steroid hormones, testosterone and estradiol, in the circulatory system. SHBG regulates their bioavailability and therefore their effects in the body. SHBG has been linked to chronic diseases including type 2 diabetes and to hormone-sensitive cancers such as breast and prostate cancer. SHBG concentrations are approximately 50% heritable in family studies, suggesting SHBG concentrations are under significant genetic control; yet, little is known about the specific genes that influence SHBG. We conducted a large study of the association of SHBG concentrations with markers in the human genome in ∼22,000 white men and women to determine which loci influence SHBG concentrations. Genes near the identified genomic markers in addition to the SHBG protein coding gene included PRMT6, GCKR, ZBTB10, JMJD1C, SLCO1B1, NR2F2, ZNF652, TDGF3, LHCGR, BAIAP2L1, and UGT2B15. These genes represent a wide range of biologic pathways that may relate to SHBG function and sex steroid hormone biology, including liver function, lipid metabolism, carbohydrate metabolism and type 2 diabetes, and the development and progression of sex steroid hormone-responsive cancers.
The genetic architecture of human reproductive behavior, namely age at first birth (AFB) and number of children ever born (NEB), has a strong relationship with fitness, human development, infertility and risk of neuropsychiatric disorders. However, very few genetic loci have been identified, and the underlying mechanisms of AFB and NEB are poorly understood. We report a large genome-wide association study of both sexes including 251,151 individuals for AFB and 343,072 individuals for NEB. We identified 12 independent loci that are significantly associated with AFB and/or NEB in a SNP-based genome-wide association study and 4 additional loci associated in a gene-based effort. These loci harbor genes that are likely to have a role, either directly or by affecting non-local gene expression, in human reproduction and infertility, thereby increasing understanding of these complex traits.
Chemerin is an adipokine proposed to link obesity and chronic inflammation of adipose tissue. Genetic factors determining chemerin release from adipose tissue are yet unknown. We conducted a meta-analysis of genome-wide association studies (GWAS) for serum chemerin in three independent cohorts from Europe: Sorbs and KORA from Germany and PPP-Botnia from Finland (total N = 2,791). In addition, we measured mRNA expression of genes within the associated loci in peripheral mononuclear cells by micro-arrays, and within adipose tissue by quantitative RT-PCR and performed mRNA expression quantitative trait and expression-chemerin association studies to functionally substantiate our loci. Heritability estimate of circulating chemerin levels was 16.2% in the Sorbs cohort. Thirty single nucleotide polymorphisms (SNPs) at chromosome 7 within the retinoic acid receptor responder 2 (RARRES2)/Leucine Rich Repeat Containing (LRRC61) locus reached genome-wide significance (p
The growth hormone/insulin-like growth factor (IGF) axis can be manipulated in animal models to promote longevity, and IGF-related proteins including IGF-I and IGF-binding protein-3 (IGFBP-3) have also been implicated in risk of human diseases including cardiovascular diseases, diabetes, and cancer. Through genomewide association study of up to 30 884 adults of European ancestry from 21 studies, we confirmed and extended the list of previously identified loci associated with circulating IGF-I and IGFBP-3 concentrations (IGF1, IGFBP3, GCKR, TNS3, GHSR, FOXO3, ASXL2, NUBP2/IGFALS, SORCS2, and CELSR2). Significant sex interactions, which were characterized by different genotype-phenotype associations between men and women, were found only for associations of IGFBP-3 concentrations with SNPs at the loci IGFBP3 and SORCS2. Analyses of SNPs, gene expression, and protein levels suggested that interplay between IGFBP3 and genes within the NUBP2 locus (IGFALS and HAGH) may affect circulating IGF-I and IGFBP-3 concentrations. The IGF-I-decreasing allele of SNP rs934073, which is an eQTL of ASXL2, was associated with lower adiposity and higher likelihood of survival beyond 90 years. The known longevity-associated variant rs2153960 (FOXO3) was observed to be a genomewide significant SNP for IGF-I concentrations. Bioinformatics analysis suggested enrichment of putative regulatory elements among these IGF-I- and IGFBP-3-associated loci, particularly of rs646776 at CELSR2. In conclusion, this study identified several loci associated with circulating IGF-I and IGFBP-3 concentrations and provides clues to the potential role of the IGF axis in mediating effects of known (FOXO3) and novel (ASXL2) longevity-associated loci.
We have investigated the evidence for positive selection in samples of African, European, and East Asian ancestry at 65 loci associated with susceptibility to type 2 diabetes (T2D) previously identified through genome-wide association studies. Selection early in human evolutionary history is predicted to lead to ancestral risk alleles shared between populations, whereas late selection would result in population-specific signals at derived risk alleles. By using a wide variety of tests based on the site frequency spectrum, haplotype structure, and population differentiation, we found no global signal of enrichment for positive selection when we considered all T2D risk loci collectively. However, in a locus-by-locus analysis, we found nominal evidence for positive selection at 14 of the loci. Selection favored the protective and risk alleles in similar proportions, rather than the risk alleles specifically as predicted by the thrifty gene hypothesis, and may not be related to influence on diabetes. Overall, we conclude that past positive selection has not been a powerful influence driving the prevalence of T2D risk alleles.
Using a genome-wide screen of 9.6 million genetic variants achieved through 1000 Genomes Project imputation in 62,166 samples, we identify association to lipid traits in 93 loci, including 79 previously identified loci with new lead SNPs and 10 new loci, 15 loci with a low-frequency lead SNP and 10 loci with a missense lead SNP, and 2 loci with an accumulation of rare variants. In six loci, SNPs with established function in lipid genetics (CELSR2, GCKR, LIPC and APOE) or candidate missense mutations with predicted damaging function (CD300LG and TM6SF2) explained the locus associations. The low-frequency variants increased the proportion of variance explained, particularly for low-density lipoprotein cholesterol and total cholesterol. Altogether, our results highlight the impact of low-frequency variants in complex traits and show that imputation offers a cost-effective alternative to resequencing.
Autoimmune thyroid disease (AITD), including Graves' disease (GD) and Hashimoto's thyroiditis (HT), is one of the most common of the immune-mediated diseases. To further investigate the genetic determinants of AITD, we conducted an association study using a custom-made single-nucleotide polymorphism (SNP) array, the ImmunoChip. The SNP array contains all known and genotype-able SNPs across 186 distinct susceptibility loci associated with one or more immune-mediated diseases. After stringent quality control, we analysed 103 875 common SNPs (minor allele frequency > 0.05) in 2285 GD and 462 HT patients and 9364 controls. We found evidence for seven new AITD risk loci (P < 1.12 × 10^-6; a permutation test derived significance threshold), five at locations previously associated and two at locations awaiting confirmation, with other immune-mediated diseases.
A wealth of biospecimen samples are stored in modern globally distributed biobanks. Biomedical researchers worldwide need to be able to combine the available resources to improve the power of large-scale studies. A prerequisite for this effort is to be able to search and access phenotypic, clinical and other information about samples that are currently stored at biobanks in an integrated manner. However, privacy issues together with heterogeneous information systems and the lack of agreed-upon vocabularies have made specimen searching across multiple biobanks extremely challenging. We describe three case studies where we have linked samples and sample descriptions in order to facilitate global searching of available samples for research. The use cases include the ENGAGE (European Network for Genetic and Genomic Epidemiology) consortium comprising at least 39 cohorts, the SUMMIT (surrogate markers for micro- and macro-vascular hard endpoints for innovative diabetes tools) consortium and a pilot for data integration between a Swedish clinical health registry and a biobank. We used the Sample avAILability (SAIL) method for data linking: we first created harmonised variables and then annotated, and made searchable, information on the number of specimens available in individual biobanks for various phenotypic categories. By operating on this categorised availability data we sidestep many obstacles related to privacy that arise when handling real values. We show that harmonised and annotated records about data availability across disparate biomedical archives provide a key methodological advance in pre-analysis exchange of information between biobanks, that is, during the project planning phase.
Anaemia is a chief determinant of global ill health, contributing to cognitive impairment, growth retardation and impaired physical capacity. To understand further the genetic factors influencing red blood cells, we carried out a genome-wide association study of haemoglobin concentration and related parameters in up to 135,367 individuals. Here we identify 75 independent genetic loci associated with one or more red blood cell phenotypes at P < 10^-8, which together explain 4-9% of the phenotypic variance per trait. Using expression quantitative trait loci and bioinformatic strategies, we identify 121 candidate genes enriched in functions relevant to red blood cell biology. The candidate genes are expressed preferentially in red blood cell precursors, and 43 have haematopoietic phenotypes in Mus musculus or Drosophila melanogaster. Through open-chromatin and coding-variant analyses we identify potential causal genetic variants at 41 loci. Our findings provide extensive new insights into genetic mechanisms and biological pathways controlling red blood cell formation and function.
Background. Genetic epidemiology data suggest that younger age of onset is associated with family history (FH) of depression. The present study tested whether the presence of FH for depression or anxiety in first-degree relatives determines younger age of onset for depression. Method. A sample of 1022 cases with recurrent major depressive disorder (MDD) was recruited at the Max Planck Institute and at two affiliated hospitals. Patients were assessed using the Schedules for Clinical Assessment in Neuropsychiatry and questionnaires including demographics, medical history, questions on the use of alcohol and tobacco, personality traits and life events. Survival analysis and the Cox proportional hazard model were used to determine whether FH of depression signals earlier age of onset of depression. Results. Patients who reported positive FH had a significantly earlier age of onset than patients who did not report FH of depression (log-rank =48, df = 1, p < 0.0001). The magnitude of association of FH varies by age of onset, with the largest estimate for MDD onset before age 20 years (hazard ratio = 2.2, p = 0.0009), whereas FH is not associated with MDD for onset after age 50 years (hazard ratio = 0.89, p = 0.5). The presence of feelings of guilt, anxiety symptoms and functional impairment due to depressive symptoms appear to characterize individuals with positive FH of depression. Conclusions. FH of depression contributes to the onset of depression at a younger age and may affect the clinical features of the illness.
Peroxisome proliferator-activated receptor gamma (PPARG) is a master transcriptional regulator of adipocyte differentiation and a canonical target of antidiabetic thiazolidinedione medications. In rare families, loss-of-function (LOF) mutations in PPARG are known to cosegregate with lipodystrophy and insulin resistance; in the general population, the common P12A variant is associated with a decreased risk of type 2 diabetes (T2D). Whether and how rare variants in PPARG and defects in adipocyte differentiation influence risk of T2D in the general population remains undetermined. By sequencing PPARG in 19,752 T2D cases and controls drawn from multiple studies and ethnic groups, we identified 49 previously unidentified, nonsynonymous PPARG variants (MAF < 0.5%). Considered in aggregate (with or without computational prediction of functional consequence), these rare variants showed no association with T2D (OR = 1.35; P = 0.17). The function of the 49 variants was experimentally tested in a novel high-throughput human adipocyte differentiation assay, and nine were found to have reduced activity in the assay. Carrying any of these nine LOF variants was associated with a substantial increase in risk of T2D (OR = 7.22; P = 0.005). The combination of large-scale DNA sequencing and functional testing in the laboratory reveals that approximately 1 in 1,000 individuals carries a variant in PPARG that reduces function in a human adipocyte differentiation assay and is associated with a substantial risk of T2D.
Metabolic syndrome (MetS) has become a health and financial burden worldwide. The MetS definition captures clustering of risk factors that predict higher risk for diabetes mellitus and cardiovascular disease. Our study hypothesis is that, in addition to genes influencing individual MetS risk factors, genetic variants exist that influence MetS and inflammatory markers, forming a predisposing MetS genetic network. To test this hypothesis a staged approach was undertaken. (a) We analyzed 17 metabolic and inflammatory traits in more than 85,500 participants from 14 large epidemiological studies within the Cross Consortia Pleiotropy Group. Individuals classified with MetS (NCEP definition), versus those without, showed on average significantly different levels for most inflammatory markers studied. (b) Paired average correlations between 8 metabolic traits and 9 inflammatory markers from the same studies as above, estimated with two methods, and factor analyses on large simulated data, helped in identifying 8 combinations of traits for follow-up in meta-analyses, out of 130,305 possible combinations between metabolic traits and inflammatory markers studied. (c) We performed correlated meta-analyses for 8 metabolic traits and 6 inflammatory markers by using existing GWAS published genetic summary results, with about 2.5 million SNPs from twelve, predominantly the largest, GWAS consortia. These analyses yielded 130 unique SNPs/genes with pleiotropic associations (a SNP/gene associating with at least one metabolic trait and one inflammatory marker). Of these, twenty-five variants (seven loci newly reported) are proposed as MetS candidates. They map to genes MACF1, KIAA0754, GCKR, GRB14, COBLL1, LOC646736-IRS1, SLC39A8, NELFE, SKIV2L, STK19, TFAP2B, BAZ1B, BCL7B, TBL2, MLXIPL, LPL, TRIB1, ATXN2, HECTD4, PTPN11, ZNF664, PDXDC1, FTO, MC4R and TOMM40. 
Based on large data evidence, we conclude that inflammation is a feature of MetS and several gene variants show pleiotropic genetic associations across phenotypes and might explain a part of MetS correlated genetic architecture. These findings warrant further functional investigation. (C) 2014 Elsevier Inc. All rights reserved.
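The figure of 130,305 possible trait combinations quoted in the abstract above is consistent with counting every pairing of a non-empty subset of the 8 metabolic traits with a non-empty subset of the 9 inflammatory markers, since (2^8 − 1) × (2^9 − 1) = 255 × 511 = 130,305. The abstract does not state how the number was derived, so this reading is an assumption; the function names below are hypothetical, and the brute-force enumerator is included only to confirm the closed-form count on small inputs.

```python
from itertools import combinations


def count_trait_combinations(n_metabolic, n_inflammatory):
    """Number of pairs of non-empty subsets, one drawn from each trait group."""
    return (2 ** n_metabolic - 1) * (2 ** n_inflammatory - 1)


def enumerate_trait_combinations(metabolic, inflammatory):
    """Explicitly list every (metabolic subset, inflammatory subset) pair."""

    def non_empty_subsets(items):
        return [c for k in range(1, len(items) + 1) for c in combinations(items, k)]

    return [(m, i) for m in non_empty_subsets(metabolic) for i in non_empty_subsets(inflammatory)]
```

Calling `count_trait_combinations(8, 9)` reproduces the quoted 130,305, which shows how quickly the search space grows and why the correlation and factor analyses were needed to narrow it to 8 combinations.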
White blood cell (WBC) count is a common clinical measure from complete blood count assays, and it varies widely among healthy individuals. Total WBC count and its constituent subtypes have been shown to be moderately heritable, with the heritability estimates varying across cell types. We studied 19,509 subjects from seven cohorts in a discovery analysis, and 11,823 subjects from ten cohorts for replication analyses, to determine genetic factors influencing variability within the normal hematological range for total WBC count and five WBC subtype measures. Cohort-specific data were supplied by the CHARGE, HaemGen, and INGI consortia, as well as independent collaborative studies. We identified and replicated ten associations with total WBC count and five WBC subtypes at seven different genomic loci (total WBC count: 6p21 in the HLA region, 17q21 near ORMDL3, and CSF3; neutrophil count: 17q21; basophil count: 3p21 near RPN1 and C3orf27; lymphocyte count: 6p21, 19p13 at EPS15L1; monocyte count: 2q31 at ITGA4, 3q21, 8q24 an intergenic region, 9q31 near EDG2), including three previously reported associations and seven novel associations. To investigate functional relationships among variants contributing to variability in the six WBC traits, we utilized gene expression- and pathway-based analyses. We implemented gene-clustering algorithms to evaluate functional connectivity among implicated loci and showed functional relationships across cell types. Gene expression data from whole blood was utilized to show that significant biological consequences can be extracted from our genome-wide analyses, with effect estimates for significant loci from the meta-analyses being highly correlated with the proximal gene expression. 
In addition, collaborative efforts between the groups contributing to this study and related studies conducted by the COGENT and RIKEN groups allowed for the examination of effect homogeneity for genome-wide significant associations across populations of diverse ancestral backgrounds.
In a GWAS meta-analysis, variants in the growth factor receptor-bound protein 10 (GRB10) gene were associated with reduced glucose-stimulated insulin secretion and increased risk of type 2 diabetes (T2D) if inherited from the father, but inexplicably with reduced fasting glucose when inherited from the mother. GRB10 is a negative regulator of insulin signaling and is imprinted in a parent-of-origin fashion in different tissues. GRB10 knock-down in human pancreatic islets showed reduced insulin and glucagon secretion, which together with changes in insulin sensitivity may explain the paradoxical reduction of glucose despite a decrease in insulin secretion. Together, these findings suggest that tissue-specific methylation and possibly imprinting of GRB10 can influence glucose metabolism and contribute to T2D pathogenesis. The data also emphasize the need in genetic studies to consider whether risk alleles are inherited from the mother or the father.
Little is known about genes regulating male puberty. Further, while many identified pubertal timing variants associate with age at menarche, a late manifestation of puberty, and body mass, little is known about these variants' relationship to pubertal initiation or tempo. To address these questions, we performed genome-wide association meta-analysis in over 11 000 European samples with data on early pubertal traits, male genital and female breast development, measured by the Tanner scale. We report the first genome-wide significant locus for male sexual development upstream of myocardin-like 2 (MKL2) (P = 8.9 × 10^-9), a menarche locus tagging a developmental pathway linking earlier puberty with reduced pubertal growth (P = 4.6 × 10^-5) and short adult stature (P = 7.5 × 10^-6) in both males and females. Furthermore, our results indicate that a proportion of menarche loci are important for pubertal initiation in both sexes. Consistent with epidemiological correlations between increased prepubertal body mass and earlier pubertal timing in girls, body mass index (BMI)-increasing alleles correlated with earlier breast development. In boys, some BMI-increasing alleles associated with earlier, and others with delayed, sexual development; these genetic results mimic the controversy in epidemiological studies, some of which show opposing correlations between prepubertal BMI and male puberty. Our results contribute to our understanding of the pubertal initiation program in both sexes and indicate that although mechanisms regulating pubertal onset in males and females may largely be shared, the relationship between body mass and pubertal timing in boys may be complex and requires further genetic studies.
Recent genome-wide association studies have described many loci implicated in type 2 diabetes (T2D) pathophysiology and beta-cell dysfunction but have contributed little to the understanding of the genetic basis of insulin resistance. We hypothesized that genes implicated in insulin resistance pathways might be uncovered by accounting for differences in body mass index (BMI) and potential interactions between BMI and genetic variants. We applied a joint meta-analysis approach to test associations with fasting insulin and glucose on a genome-wide scale. We present six previously unknown loci associated with fasting insulin at P < 5 x 10(-8) in combined discovery and follow-up analyses of 52 studies comprising up to 96,496 non-diabetic individuals. Risk variants were associated with higher triglyceride and lower high-density lipoprotein (HDL) cholesterol levels, suggesting a role for these loci in insulin resistance pathways. The discovery of these loci will aid further characterization of the role of insulin resistance in T2D pathophysiology.
Progranulin is a secreted protein with important functions in processes including immune and inflammatory response, metabolism and embryonic development. The present study aimed at identification of genetic factors determining progranulin concentrations. We conducted a genome-wide association meta-analysis for serum progranulin in three independent cohorts from Europe: Sorbs (N = 848) and KORA (N = 1628) from Germany and PPP-Botnia (N = 335) from Finland (total N = 2811). Single nucleotide polymorphisms (SNPs) associated with progranulin levels were replicated in two additional German cohorts: LIFE-Heart Study (Leipzig; N = 967) and Metabolic Syndrome Berlin Potsdam (Berlin cohort; N = 833). We measured mRNA expression of genes in peripheral blood mononuclear cells (PBMC) by micro-arrays and performed mRNA expression quantitative trait and expression-progranulin association studies to functionally substantiate identified loci. Finally, we conducted siRNA silencing experiments in vitro to validate potential candidate genes within the associated loci. Heritability of circulating progranulin levels was estimated at 31.8% and 26.1% in the Sorbs and LIFE-Heart cohort, respectively. SNPs at three loci reached study-wide significance (rs660240 in CELSR2-PSRC1-MYBPHL-SORT1, rs4747197 in CDH23-PSAP and rs5848 in GRN) explaining 19.4%/15.0% of the variance and 61%/57% of total heritability in the Sorbs/LIFE-Heart Study. The strongest evidence for association was at rs660240 (P = 5.75 × 10(-50)), which was also associated with mRNA expression of PSRC1 in PBMC (P = 1.51 × 10(-21)). Psrc1 knockdown in murine preadipocytes led to a consecutive 30% reduction in progranulin secretion. In conclusion, the present meta-GWAS combined with mRNA expression identified three loci associated with progranulin and supports the role of PSRC1 in the regulation of progranulin secretion.
Obesity is heritable and predisposes to many diseases. To understand the genetic basis of obesity better, here we conduct a genome-wide association study and Metabochip meta-analysis of body mass index (BMI), a measure commonly used to define obesity and assess adiposity, in up to 339,224 individuals. This analysis identifies 97 BMI-associated loci (P < 5 x 10(-8)), 56 of which are novel. Five loci demonstrate clear evidence of several independent association signals, and many loci have significant effects on other metabolic phenotypes. The 97 loci account for ~2.7% of BMI variation, and genome-wide estimates suggest that common variation accounts for >20% of BMI variation. Pathway analyses provide strong support for a role of the central nervous system in obesity susceptibility and implicate new genes and pathways, including those related to synaptic function, glutamate signalling, insulin secretion/action, energy metabolism, lipid biology and adipogenesis.
Nonalcoholic fatty liver disease (NAFLD) clusters in families, but the only known common genetic variants influencing risk are near PNPLA3. We sought to identify additional genetic variants influencing NAFLD using genome-wide association (GWA) analysis of computed tomography (CT) measured hepatic steatosis, a non-invasive measure of NAFLD, in large population based samples. Using variance components methods, we show that CT hepatic steatosis is heritable (∼26%–27%) in family-based Amish, Family Heart, and Framingham Heart Studies (n = 880 to 3,070). By carrying out a fixed-effects meta-analysis of genome-wide association (GWA) results between CT hepatic steatosis and ∼2.4 million imputed or genotyped SNPs in 7,176 individuals from the Old Order Amish, Age, Gene/Environment Susceptibility-Reykjavik study (AGES), Family Heart, and Framingham Heart Studies, we identify variants associated at genome-wide significant levels ( p
One goal of human genetics is to understand the genetic basis of disease, a challenge for diseases of complex inheritance because risk alleles are few relative to the vast set of benign variants. Risk variants are often sought by association studies in which allele frequencies in case subjects are contrasted with those from population-based samples used as control subjects. In an ideal world we would know population-level allele frequencies, releasing researchers to focus on case subjects. We argue this ideal is possible, at least theoretically, and we outline a path to achieving it in reality. If such a resource were to exist, it would yield ample savings and would facilitate the effective use of data repositories by removing administrative and technical barriers. We call this concept the Universal Control Repository Network (UNICORN), a means to perform association analyses without necessitating direct access to individual-level control data. Our approach to UNICORN uses existing genetic resources and various statistical tools to analyze these data, including hierarchical clustering with spectral analysis of ancestry; and empirical Bayesian analysis along with Gaussian spatial processes to estimate ancestry-specific allele frequencies. We demonstrate our approach using tens of thousands of control subjects from studies of Crohn disease, showing how it controls false positives, provides power similar to that achieved when all control data are directly accessible, and enhances power when control data are limiting or even imperfectly matched ancestrally. These results highlight how UNICORN can enable reliable, powerful, and convenient genetic association analyses without access to the individual-level data.
Common variants at only two loci, FTO and MC4R, have been reproducibly associated with body mass index (BMI) in humans. To identify additional loci, we conducted meta-analysis of 15 genome-wide association studies for BMI (n > 32,000) and followed up top signals in 14 additional cohorts (n > 59,000). We strongly confirm FTO and MC4R and identify six additional loci (P < 5 x 10(-8)): TMEM18, KCTD15, GNPDA2, SH2B1, MTCH2 and NEGR1 (where a 45-kb deletion polymorphism is a candidate causal variant). Several of the likely causal genes are highly expressed or known to act in the central nervous system (CNS), emphasizing, as in rare monogenic forms of obesity, the role of the CNS in predisposition to obesity.
Homozygosity has long been associated with rare, often devastating, Mendelian disorders(1), and Darwin was one of the first to recognize that inbreeding reduces evolutionary fitness(2). However, the effect of the more distant parental relatedness that is common in modern human populations is less well understood. Genomic data now allow us to investigate the effects of homozygosity on traits of public health importance by observing contiguous homozygous segments (runs of homozygosity), which are inferred to be homozygous along their complete length. Given the low levels of genome-wide homozygosity prevalent in most human populations, information is required on very large numbers of people to provide sufficient power(3,4). Here we use runs of homozygosity to study 16 health-related quantitative traits in 354,224 individuals from 102 cohorts, and find statistically significant associations between summed runs of homozygosity and four complex traits: height, forced expiratory lung volume in one second, general cognitive ability and educational attainment (P < 1 x 10(-300), 2.1 x 10(-6), 2.5 x 10(-10) and 1.8 x 10(-10), respectively). In each case, increased homozygosity was associated with decreased trait value, equivalent to the offspring of first cousins being 1.2 cm shorter and having 10 months' less education. Similar effect sizes were found across four continental groups and populations with different degrees of genome-wide homozygosity, providing evidence that homozygosity, rather than confounding, directly contributes to phenotypic variance. Contrary to earlier reports in substantially smaller samples(5,6), no evidence was seen of an influence of genome-wide homozygosity on blood pressure and low density lipoprotein cholesterol, or ten other cardio-metabolic traits. 
Since directional dominance is predicted for traits under directional evolutionary selection(7), this study provides evidence that increased stature and cognitive function have been positively selected in human evolution, whereas many important risk factors for late-onset complex diseases may not have been.
Smoking is a common risk factor for many diseases(1). We conducted genome-wide association meta-analyses for the number of cigarettes smoked per day (CPD) in smokers (n = 31,266) and smoking initiation (n = 46,481) using samples from the ENGAGE Consortium. In a second stage, we tested selected SNPs with in silico replication in the Tobacco and Genetics (TAG) and Glaxo Smith Kline (Ox-GSK) consortia cohorts (n = 45,691 smokers) and assessed some of those in a third sample of European ancestry (n = 9,040). Variants in three genomic regions associated with CPD (P < 5 x 10(-8)), including previously identified SNPs at 15q25 represented by rs1051730[A] (effect size = 0.80 CPD, P = 2.4 x 10(-69)), and SNPs at 19q13 and 8p11, represented by rs4105144[C] (effect size = 0.39 CPD, P = 2.2 x 10(-12)) and rs6474412[T] (effect size = 0.29 CPD, P = 1.4 x 10(-8)), respectively. Among the genes at the two newly associated loci are genes encoding nicotine-metabolizing enzymes (CYP2A6 and CYP2B6) and nicotinic acetylcholine receptor subunits (CHRNB3 and CHRNA6), all of which have been highlighted in previous studies of smoking and nicotine dependence(2-4). Nominal associations with lung cancer were observed at both 8p11 (rs6474412[T], odds ratio (OR) = 1.09, P = 0.04) and 19q13 (rs4105144[C], OR = 1.12, P = 0.0006).
The pubertal height growth spurt is a distinctive feature of childhood growth reflecting both the central onset of puberty and local growth factors. Although little is known about the underlying genetics, growth variability during puberty correlates with adult risks for hormone-dependent cancer and adverse cardiometabolic health. The only gene so far associated with pubertal height growth, LIN28B, pleiotropically influences childhood growth, puberty and cancer progression, pointing to shared underlying mechanisms. To discover genetic loci influencing pubertal height and growth and to place them in context of overall growth and maturation, we performed genome-wide association meta-analyses in 18 737 European samples utilizing longitudinally collected height measurements. We found significant associations (P < 1.67 × 10(-8)) at 10 loci, including LIN28B. Five loci associated with pubertal timing, all impacting multiple aspects of growth. In particular, a novel variant correlated with expression of MAPK3, and associated both with increased prepubertal growth and earlier menarche. Another variant near ADCY3-POMC associated with increased body mass index, reduced pubertal growth and earlier puberty. Whereas epidemiological correlations suggest that early puberty marks a pathway from rapid prepubertal growth to reduced final height and adult obesity, our study shows that individual loci associating with pubertal growth have variable longitudinal growth patterns that may differ from epidemiological observations. Overall, this study uncovers part of the complex genetic architecture linking pubertal height growth, the timing of puberty and childhood obesity and provides new information to pinpoint processes linking these traits.
Adiposity, as indicated by body mass index (BMI), has been associated with risk of cardiovascular diseases in epidemiological studies. We aimed to investigate if these associations are causal, using Mendelian randomization (MR) methods. The associations of BMI with cardiovascular outcomes [coronary heart disease (CHD), heart failure and ischaemic stroke], and associations of a genetic score (32 BMI single nucleotide polymorphisms) with BMI and cardiovascular outcomes were examined in up to 22,193 individuals with 3062 incident cardiovascular events from nine prospective follow-up studies within the ENGAGE consortium. We used random-effects meta-analysis in an MR framework to provide causal estimates of the effect of adiposity on cardiovascular outcomes. There was a strong association between BMI and incident CHD (HR = 1.20 per SD-increase of BMI, 95% CI, 1.12-1.28, P = 1.9 x 10(-7)), heart failure (HR = 1.47, 95% CI, 1.35-1.60, P = 9 x 10(-19)) and ischaemic stroke (HR = 1.15, 95% CI, 1.06-1.24, P = 0.0008) in observational analyses. The genetic score was robustly associated with BMI (β = 0.030 SD-increase of BMI per additional allele, 95% CI, 0.028-0.033, P = 3 x 10(-107)). Analyses indicated a causal effect of adiposity on development of heart failure (HR = 1.93 per SD-increase of BMI, 95% CI, 1.12-3.30, P = 0.017) and ischaemic stroke (HR = 1.83, 95% CI, 1.05-3.20, P = 0.034). Additional cross-sectional analyses using both ENGAGE and CARDIoGRAMplusC4D data showed a causal effect of adiposity on CHD. Using MR methods, we provide support for the hypothesis that adiposity causes CHD, heart failure and, previously not demonstrated, ischaemic stroke.
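As a rough illustration of the Mendelian randomization logic in the abstract above, the following Python sketch computes a simple Wald-ratio estimate: the score-outcome association divided by the score-exposure association. This is a simplified stand-in, not the random-effects meta-analysis the study reports. The function names and any input values other than those quoted in the abstract are hypothetical, and the standard error is a first-order approximation that ignores uncertainty in the score-BMI association.

```python
import math


def wald_ratio(beta_outcome, se_outcome, beta_exposure):
    """Wald-ratio causal estimate on the log-hazard scale.

    beta_outcome:  log hazard ratio of the outcome per score allele (assumed input)
    se_outcome:    standard error of beta_outcome (assumed input)
    beta_exposure: SD change in BMI per score allele (0.030 in the abstract)
    """
    if beta_exposure == 0:
        raise ValueError("the genetic score must be associated with the exposure")
    estimate = beta_outcome / beta_exposure
    # First-order approximation: treats beta_exposure as known without error.
    se = abs(se_outcome / beta_exposure)
    return estimate, se


def hazard_ratio_with_ci(log_hr, se, z=1.96):
    """Convert a log hazard ratio and its SE into an HR with a 95% CI."""
    return (
        math.exp(log_hr),
        math.exp(log_hr - z * se),
        math.exp(log_hr + z * se),
    )


if __name__ == "__main__":
    # Hypothetical per-allele outcome association, chosen only for illustration.
    log_hr, se = wald_ratio(beta_outcome=0.02, se_outcome=0.008, beta_exposure=0.030)
    hr, lower, upper = hazard_ratio_with_ci(log_hr, se)
    print(f"HR per SD of BMI: {hr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Because the score-BMI effect is small (0.030 SD per allele), even modest uncertainty in the score-outcome association is inflated by the division, which is consistent with the wide confidence intervals of the causal estimates quoted above.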
OBJECTIVE-Linkage of the chromosome 1q21-25 region to type 2 diabetes has been demonstrated in multiple ethnic groups. We performed common variant fine-mapping across a 23-Mb interval in a multiethnic sample to search for variants responsible for this linkage signal. RESEARCH DESIGN AND METHODS-In all, 5,290 single nucleotide polymorphisms (SNPs) were successfully genotyped in 3,179 type 2 diabetes case and control subjects from eight populations with evidence of 1q linkage. Samples were ascertained using strategies designed to enhance power to detect variants causal for 1q linkage. After imputation, we estimate ~80% coverage of common variation across the region (r(2) > 0.8, Europeans). Association signals of interest were evaluated through in silico replication and de novo genotyping in ~8,500 case subjects and 12,400 control subjects. RESULTS-Association mapping of the 23-Mb region identified two strong signals, both of which were restricted to the subset of European-descent samples. The first mapped to the NOS1AP (CAPON) gene region (lead SNP: rs7538490, odds ratio 1.38 [95% CI 1.21-1.57], P = 1.4 x 10(-6), in 999 case subjects and 1,190 control subjects); the second mapped within an extensive region of linkage disequilibrium that includes the ASH1L and PKLR genes (lead SNP: rs11264371, odds ratio 1.48 [1.18-1.76], P = 1.0 x 10(-5), under a dominant model). However, there was no evidence for association at either signal on replication, and, across all data (>24,000 subjects), there was no indication that these variants were causally related to type 2 diabetes status. CONCLUSIONS-Detailed fine-mapping of the 23-Mb region of replicated linkage has failed to identify common variant signals contributing to the observed signal. Future studies should focus on identification of causal alleles of lower frequency and higher penetrance. Diabetes 58:1704-1709, 2009
Reduced glomerular filtration rate defines chronic kidney disease and is associated with cardiovascular and all-cause mortality. We conducted a meta-analysis of genome-wide association studies for estimated glomerular filtration rate (eGFR), combining data across 133,413 individuals with replication in up to 42,166 individuals. We identify 24 new and confirm 29 previously identified loci. Of these 53 loci, 19 associate with eGFR among individuals with diabetes. Using bioinformatics, we show that identified genes at eGFR loci are enriched for expression in kidney tissues and in pathways relevant for kidney development and transmembrane transporter activity, kidney structure, and regulation of glucose metabolism. Chromatin state mapping and DNase I hypersensitivity analyses across adult tissues demonstrate preferential mapping of associated variants to regulatory regions in kidney but not extra-renal tissues. These findings suggest that genetic determinants of eGFR are mediated largely through direct effects within the kidney and highlight important cell types and biological pathways.
To identify genetic variants associated with birth weight, we meta-analyzed six genome-wide association (GWA) studies (n = 10,623 Europeans from pregnancy/birth cohorts) and followed up two lead signals in 13 replication studies (n = 27,591). rs900400 near LEKR1 and CCNL1 (P = 2 x 10(-35)) and rs9883204 in ADCY5 (P = 7 x 10(-15)) were robustly associated with birth weight. Correlated SNPs in ADCY5 were recently implicated in regulation of glucose levels and susceptibility to type 2 diabetes(1), providing evidence that the well-described association between lower birth weight and subsequent type 2 diabetes(2,3) has a genetic component, distinct from the proposed role of programming by maternal nutrition. Using data from both SNPs, we found that the 9% of Europeans carrying four birth weight-lowering alleles were, on average, 113 g (95% CI 89-137 g) lighter at birth than the 24% with zero or one alleles (P-trend = 7 x 10(-30)). The impact on birth weight is similar to that of a mother smoking 4-5 cigarettes per day in the third trimester of pregnancy(4).
Coffee, a major dietary source of caffeine, is among the most widely consumed beverages in the world and has received considerable attention regarding health risks and benefits. We conducted a genome-wide (GW) meta-analysis of predominately regular-type coffee consumption (cups per day) among up to 91,462 coffee consumers of European ancestry with top single-nucleotide polymorphisms (SNPs) followed-up in ~30 062 and 7964 coffee consumers of European and African-American ancestry, respectively. Studies from both stages were combined in a trans-ethnic meta-analysis. Confirmed loci were examined for putative functional and biological relevance. Eight loci, including six novel loci, met GW significance (log10Bayes factor (BF)>5.64) with per-allele effect sizes of 0.03-0.14 cups per day. Six are located in or near genes potentially involved in pharmacokinetics (ABCG2, AHR, POR and CYP1A2) and pharmacodynamics (BDNF and SLC6A4) of caffeine. Two map to GCKR and MLXIPL genes related to metabolic traits but lacking known roles in coffee consumption. Enhancer and promoter histone marks populate the regions of many confirmed loci and several potential regulatory SNPs are highly correlated with the lead SNP of each. SNP alleles near GCKR, MLXIPL, BDNF and CYP1A2 that were associated with higher coffee consumption have previously been associated with smoking initiation, higher adiposity and fasting insulin and glucose but lower blood pressure and favorable lipid, inflammatory and liver enzyme profiles (P
Common diseases such as type 2 diabetes are phenotypically heterogeneous. Obesity is a major risk factor for type 2 diabetes, but patients vary appreciably in body mass index. We hypothesized that the genetic predisposition to the disease may be different in lean (BMI= 30 Kg/m(2)). We performed two case-control genome-wide studies using two accepted cut-offs for defining individuals as overweight or obese. We used 2,112 lean type 2 diabetes cases (BMI= 30 kg/m(2)), and 54,412 un-stratified controls. Replication was performed in 2,881 lean cases or 8,702 obese cases, and 18,957 un-stratified controls. To assess the effects of known signals, we tested the individual and combined effects of SNPs representing 36 type 2 diabetes loci. After combining data from discovery and replication datasets, we identified two signals not previously reported in Europeans. A variant (rs8090011) in the LAMA1 gene was associated with type 2 diabetes in lean cases (P = 8.4 x 10(-9), OR = 1.13 [95% CI 1.09-1.18]), and this association was stronger than that in obese cases (P = 0.04, OR = 1.03 [95% CI 1.00-1.06]). A variant in HMG20A (previously identified in South Asians but not Europeans) was associated with type 2 diabetes in obese cases (P = 1.3 x 10(-8), OR = 1.11 [95% CI 1.07-1.15]), although this association was not significantly stronger than that in lean cases (P = 0.02, OR = 1.09 [95% CI 1.02-1.17]). For 36 known type 2 diabetes loci, 29 had a larger odds ratio in the lean compared to obese (binomial P = 0.0002). In the lean analysis, we observed a weighted per-risk allele OR = 1.13 [95% CI 1.10-1.17], P = 3.2 x 10(-14). This was larger than the same model fitted in the obese analysis where the OR = 1.06 [95% CI 1.05-1.08], P = 2.2 x 10(-16). This study provides evidence that stratification of type 2 diabetes cases by BMI may help identify additional risk variants and that lean cases may have a stronger genetic predisposition to type 2 diabetes.
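The binomial P value quoted above (29 of 36 known loci with a larger odds ratio in lean cases) can be checked with a short exact calculation. The Python sketch below assumes a one-sided exact binomial (sign) test against a null probability of 0.5. The abstract does not state the sidedness, so that choice is an assumption, and the function name is illustrative.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """Exact probability of observing at least `successes` out of `trials`
    when each trial succeeds independently with probability `p`."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    # 29 of 36 loci had a larger odds ratio in lean than in obese cases.
    p_one_sided = binomial_upper_tail(29, 36)
    print(f"one-sided binomial P = {p_one_sided:.1e}")
```

Under these assumptions the one-sided tail probability is about 1.6 x 10(-4), which rounds to the reported 0.0002; a two-sided test would give roughly twice that value.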
Interindividual variation in mean leukocyte telomere length (LTL) is associated with cancer and several age-associated diseases. We report here a genome-wide meta-analysis of 37,684 individuals with replication of selected variants in an additional 10,739 individuals. We identified seven loci, including five new loci, associated with mean LTL (P < 5 x 10(-8)). Five of the loci contain candidate genes (TERC, TERT, NAF1, OBFC1 and RTEL1) that are known to be involved in telomere biology. Lead SNPs at two loci (TERC and TERT) associate with several cancers and other diseases, including idiopathic pulmonary fibrosis. Moreover, a genetic risk score analysis combining lead variants at all 7 loci in 22,233 coronary artery disease cases and 64,762 controls showed an association of the alleles associated with shorter LTL with increased risk of coronary artery disease (21% (95% confidence interval, 5-35%) per standard deviation in LTL, P = 0.014). Our findings support a causal role of telomere-length variation in some age-related diseases.
Smoking influences body weight such that smokers weigh less than non-smokers and smoking cessation often leads to weight increase. The relationship between body weight and smoking is partly explained by the effect of nicotine on appetite and metabolism. However, the brain reward system is involved in the control of the intake of both food and tobacco. We evaluated the effect of single-nucleotide polymorphisms (SNPs) affecting body mass index (BMI) on smoking behavior, and tested the 32 SNPs identified in a meta-analysis for association with two smoking phenotypes, smoking initiation (SI) and the number of cigarettes smoked per day (CPD) in an Icelandic sample (N = 34 216 smokers). Combined according to their effect on BMI, the SNPs correlate with both SI (r = 0.019, P = 0.00054) and CPD (r = 0.032, P = 8.0 x 10(-7)). These findings replicate in a second large data set (N = 127 274, thereof 76 242 smokers) for both SI (P = 1.2 x 10(-5)) and CPD (P = 9.3 x 10(-5)). Notably, the variant most strongly associated with BMI (rs1558902-A in FTO) did not associate with smoking behavior. The association with smoking behavior is not due to the effect of the SNPs on BMI. Our results strongly point to a common biological basis of the regulation of our appetite for tobacco and food, and thus the vulnerability to nicotine addiction and obesity.
The main challenge for gaining biological insights from genetic associations is identifying which genes and pathways explain the associations. Here we present DEPICT, an integrative tool that employs predicted gene functions to systematically prioritize the most likely causal genes at associated loci, highlight enriched pathways and identify tissues/cell types where genes from associated loci are highly expressed. DEPICT is not limited to genes with established functions and prioritizes relevant gene sets for many phenotypes.
Birth weight within the normal range is associated with a variety of adult-onset diseases, but the mechanisms behind these associations are poorly understood(1). Previous genome-wide association studies of birth weight identified a variant in the ADCY5 gene associated both with birth weight and type 2 diabetes and a second variant, near CCNL1, with no obvious link to adult traits(2). In an expanded genome-wide association meta-analysis and follow-up study of birth weight (of up to 69,308 individuals of European descent from 43 studies), we have now extended the number of loci associated at genome-wide significance to 7, accounting for a similar proportion of variance as maternal smoking. Five of the loci are known to be associated with other phenotypes: ADCY5 and CDKAL1 with type 2 diabetes, ADRB1 with adult blood pressure and HMGA2 and LCORL with adult height. Our findings highlight genetic links between fetal growth and postnatal growth and metabolism.
Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.
Identification of genetic risk factors for albuminuria may alter strategies for early prevention of CKD progression, particularly among patients with diabetes. Little is known about the influence of common genetic variants on albuminuria in both general and diabetic populations. We performed a meta-analysis of data from 63,153 individuals of European ancestry with genotype information from genome-wide association studies (CKDGen Consortium) and from a large candidate gene study (CARe Consortium) to identify susceptibility loci for the quantitative trait urinary albumin-to-creatinine ratio (UACR) and the clinical diagnosis microalbuminuria. We identified an association between a missense variant (I2984V) in the CUBN gene, which encodes cubilin, and both UACR (P = 1.1 × 10(-11)) and microalbuminuria (P = 0.001). We observed similar associations among 6981 African Americans in the CARe Consortium. The associations between this variant and both UACR and microalbuminuria were significant in individuals of European ancestry regardless of diabetes status. Finally, this variant associated with a 41% increased risk for the development of persistent microalbuminuria during 20 years of follow-up among 1304 participants with type 1 diabetes in the prospective DCCT/EDIC Study. In summary, we identified a missense CUBN variant that associates with levels of albuminuria in both the general population and in individuals with diabetes.
Glioma, the most common central nervous system cancer in adults, has poor prognosis. Here we identify a new SNP associated with glioma risk, rs1920116 (near TERC), that reached genome-wide significance (Pcombined = 8.3 × 10(-9)) in a meta-analysis of genome-wide association studies (GWAS) of high-grade glioma and replication data (1,644 cases and 7,736 controls). This region has previously been associated with mean leukocyte telomere length (LTL). We therefore examined the relationship between LTL and both this new risk locus and other previously established risk loci for glioma using data from a recent GWAS of LTL (n = 37,684 individuals). Alleles associated with glioma risk near TERC and TERT were strongly associated with longer LTL (P = 5.5 × 10(-20) and 4.4 × 10(-19), respectively). In contrast, risk-associated alleles near RTEL1 were inconsistently associated with LTL, suggesting the presence of distinct causal alleles. No other risk loci for glioma were associated with LTL. The identification of risk alleles for glioma near TERC and TERT that also associate with telomere length implicates telomerase in gliomagenesis.
OBJECTIVE-To investigate whether associations of common genetic variants recently identified for fasting glucose or insulin levels in nondiabetic adults are detectable in healthy children and adolescents. RESEARCH DESIGN AND METHODS-A total of 16 single nucleotide polymorphisms (SNPs) associated with fasting glucose were genotyped in six studies of children and adolescents of European origin, including over 6,000 boys and girls aged 9-16 years. We performed meta-analyses to test associations of individual SNPs and a weighted risk score of the 16 loci with fasting glucose. RESULTS-Nine loci were associated with glucose levels in healthy children and adolescents, with four of these associations reported in previous studies and five reported here for the first time (GLIS3, PROX1, SLC2A2, ADCY5, and CRY2). Effect sizes were similar to those in adults, suggesting age-independent effects of these fasting glucose loci. Children and adolescents carrying glucose-raising alleles of G6PC2, MTNR1B, GCK, and GLIS3 also showed reduced β-cell function, as indicated by homeostasis model assessment of β-cell function. Analysis using a weighted risk score showed an increase [beta (95% CI)] in fasting glucose level of 0.026 mmol/L (0.021-0.031) for each unit increase in the score. CONCLUSIONS-Novel fasting glucose loci identified in genome-wide association studies of adults are associated with altered fasting glucose levels in healthy children and adolescents with effect sizes comparable to adults. In nondiabetic adults, fasting glucose changes little over time, and our results suggest that age-independent effects of fasting glucose loci contribute to long-term interindividual differences in glucose levels from childhood onwards. Diabetes 60:1805-1812, 2011
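A weighted risk score of the kind mentioned in the abstract above is commonly built as a weighted sum of risk-allele counts. The Python sketch below shows one such construction under stated assumptions: the abstract does not give the weights or the scaling, so the weights in the example are hypothetical, as are the function names. Only the 0.026 mmol/L per-unit effect is taken from the abstract.

```python
def weighted_risk_score(allele_counts, weights):
    """Weighted sum of glucose-raising allele counts (0, 1 or 2 per SNP).

    allele_counts: one count per SNP for a single individual
    weights:       one per-allele weight per SNP (hypothetical values here;
                   in practice usually taken from published effect sizes)
    """
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must have the same length")
    if any(count not in (0, 1, 2) for count in allele_counts):
        raise ValueError("allele counts must be 0, 1 or 2")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


def expected_glucose_shift(score_difference, beta_per_unit=0.026):
    """Expected difference in fasting glucose (mmol/L) for a given difference
    in score, using the per-unit estimate reported in the abstract."""
    return score_difference * beta_per_unit


if __name__ == "__main__":
    # Three illustrative SNPs with made-up weights.
    score = weighted_risk_score([2, 1, 0], [0.5, 1.0, 2.0])
    print(score, expected_glucose_shift(score))
```

The linear form reflects the additive per-unit effect reported in the abstract; it does not model interactions between loci.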
Background: The association between adiposity and cardiometabolic traits is well known from epidemiological studies. Whilst the causal relationship is clear for some of these traits, for others it is not. We aimed to determine whether adiposity is causally related to various cardiometabolic traits using the Mendelian randomization approach. Methods and Findings: We used the adiposity-associated variant rs9939609 at the FTO locus as an instrumental variable (IV) for body mass index (BMI) in a Mendelian randomization design. Thirty-six population-based studies of individuals of European descent contributed to the analyses. Age-and sex-adjusted regression models were fitted to test for association between (i) rs9939609 and BMI (n = 198,502), (ii) rs9939609 and 24 traits, and (iii) BMI and 24 traits. The causal effect of BMI on the outcome measures was quantified by IV estimators. The estimators were compared to the BMI-trait associations derived from the same individuals. In the IV analysis, we demonstrated novel evidence for a causal relationship between adiposity and incident heart failure (hazard ratio, 1.19 per BMI-unit increase; 95% CI, 1.03-1.39) and replicated earlier reports of a causal association with type 2 diabetes, metabolic syndrome, dyslipidemia, and hypertension (odds ratio for IV estimator, 1.1-1.4; all p
Glycated hemoglobin (HbA1c) is used to diagnose type 2 diabetes (T2D) and assess glycemic control in patients with diabetes. Previous genome-wide association studies (GWAS) have identified 18 HbA1c-associated genetic variants. These variants proved to be classifiable by their likely biological action as erythrocytic (also associated with erythrocyte traits) or glycemic (associated with other glucose-related traits). In this study, we tested the hypotheses that, in a very large scale GWAS, we would identify more genetic variants associated with HbA1c and that HbA1c variants implicated in erythrocytic biology would affect the diagnostic accuracy of HbA1c. We therefore expanded the number of HbA1c-associated loci and tested the effect of genetic risk-scores comprised of erythrocytic or glycemic variants on incident diabetes prediction and on prevalent diabetes screening performance. Throughout this multiancestry study, we kept a focus on interancestry differences in HbA1c genetics performance that might influence race-ancestry differences in health outcomes. Using genome-wide association meta-analyses in up to 159,940 individuals from 82 cohorts of European, African, East Asian, and South Asian ancestry, we identified 60 common genetic variants associated with HbA1c. We classified variants as implicated in glycemic, erythrocytic, or unclassified biology and tested whether additive genetic scores of erythrocytic variants (GS-E) or glycemic variants (GS-G) were associated with higher T2D incidence in multiethnic longitudinal cohorts (N = 33,241). Nineteen glycemic and 22 erythrocytic variants were associated with HbA1c at genome-wide significance. GS-G was associated with higher T2D risk (incidence OR = 1.05, 95% CI 1.04-1.06, per HbA1c-raising allele, p = 3 × 10(-29)); whereas GS-E was not (OR = 1.00, 95% CI 0.99-1.01, p = 0.60). In Europeans and Asians, erythrocytic variants in aggregate had only modest effects on the diagnostic accuracy of HbA1c.
Yet, in African Americans, the X-linked G6PD G202A variant (T-allele frequency 11%) was associated with an absolute decrease in HbA1c of 0.81%-units (95% CI 0.66-0.96) per allele in hemizygous men, and 0.68%-units (95% CI 0.38-0.97) in homozygous women. The G6PD variant may cause approximately 2% (N = 0.65 million, 95% CI 0.55-0.74) of African American adults with T2D to remain undiagnosed when screened with HbA1c. Limitations include the smaller sample sizes for non-European ancestries and the inability to classify approximately one-third of the variants. Further studies in large multiethnic cohorts with HbA1c, glycemic, and erythrocytic traits are required to better determine the biological action of the unclassified variants. As G6PD deficiency can be clinically silent until illness strikes, we recommend investigation of the possible benefits of screening for the G6PD genotype along with using HbA1c to diagnose T2D in populations of African ancestry or groups where G6PD deficiency is common. Screening with direct glucose measurements, or genetically-informed HbA1c diagnostic thresholds in people with G6PD deficiency, may be required to avoid missed or delayed diagnoses.
Crohn's disease and ulcerative colitis, the two common forms of inflammatory bowel disease (IBD), affect over 2.5 million people of European ancestry, with rising prevalence in other populations(1). Genome-wide association studies and subsequent meta-analyses of these two diseases(2,3) as separate phenotypes have implicated previously unsuspected mechanisms, such as autophagy(4), in their pathogenesis and showed that some IBD loci are shared with other inflammatory diseases(5). Here we expand on the knowledge of relevant pathways by undertaking a meta-analysis of Crohn's disease and ulcerative colitis genome-wide association scans, followed by extensive validation of significant findings, with a combined total of more than 75,000 cases and controls. We identify 71 new associations, for a total of 163 IBD loci, that meet genome-wide significance thresholds. Most loci contribute to both phenotypes, and both directional (consistently favouring one allele over the course of human history) and balancing (favouring the retention of both alleles within populations) selection effects are evident. Many IBD loci are also implicated in other immune-mediated disorders, most notably with ankylosing spondylitis and psoriasis. We also observe considerable overlap between susceptibility loci for IBD and mycobacterial infection. Gene co-expression network analysis emphasizes this relationship, with pathways shared between host responses to mycobacteria and those predisposing to IBD.
Gene-lifestyle interactions have been suggested to contribute to the development of type 2 diabetes. Glucose levels 2 h after a standard 75-g glucose challenge are used to diagnose diabetes and are associated with both genetic and lifestyle factors. However, whether these factors interact to determine 2-h glucose levels is unknown. We meta-analyzed single nucleotide polymorphism (SNP) x BMI and SNP x physical activity (PA) interaction regression models for five SNPs previously associated with 2-h glucose levels from up to 22 studies comprising 54,884 individuals without diabetes. PA levels were dichotomized, with individuals below the first quintile classified as inactive (20%) and the remainder as active (80%). BMI was considered a continuous trait. Inactive individuals had higher 2-h glucose levels than active individuals (beta = 0.22 mmol/L [95% CI 0.13-0.31], P = 1.63 x 10(-6)). All SNPs were associated with 2-h glucose (beta = 0.06-0.12 mmol/allele, P 0.18) or BMI (P >= 0.04). In this large study of gene-lifestyle interaction, we observed no interactions between genetic and lifestyle factors, both of which were associated with 2-h glucose. It is perhaps unlikely that top loci from genome-wide association studies will exhibit strong subgroup-specific effects, and they may not, therefore, make the best candidates for the study of interactions. Diabetes 61:1291-1296, 2012
Background: Mitochondrial dysfunction has been implicated in the pathophysiology of Parkinson's disease (PD)-related pathologies. Objective: To investigate the role of Translocase of the Outer Mitochondrial Membrane 40 homolog (TOMM40) variants in PD without dementia (PDND), PD with dementia (PDD) and dementia with Lewy bodies (DLB). Methods: 248 individuals, including 92 PDND, 55 PDD, and 101 DLB, were included. The rs10524523 locus in the TOMM40 gene (TOMM40 poly-T repeat) is characterized by a variable number of T residues that were classified into three groups based on length: short (S), long (L), and very long (VL). We tested a log-additive genetic model of association with dementia and adjusted for age, sex, and APOE ε4 carrier status. We analyzed cerebrospinal fluid (CSF) levels of Aβ42 and Tau, biomarkers related to Alzheimer's disease (AD). Results: PDD/DLB status and abnormal CSF AD biomarkers (Aβ42 and Aβ42/Tau ratio) were both associated with the APOE-ε4 allele (p < 0.014) and the L allele of TOMM40 poly-T repeat (p < 0.008). The VL allele was less frequently observed in the PDD/DLB group (p = 0.013). In APOE-ε4 adjusted analyses, the relationships between the L and VL alleles and dementia status as well as CSF AD biomarkers were not significant. When adjusting for APOE-ε4, however, there were associations between S carrier status and PDD/DLB (p = 0.019) and abnormal CSF levels of Aβ42/Tau ratio (p = 0.037), although these were not significant after adjustment for multiple comparisons. Conclusion: Our results do not support the notion that TOMM40 poly-T repeat variants have independent effects on PDD and DLB pathology. This relationship seems to be driven by APOE-ε4.
Genome-wide association studies (GWAS) have identified more than 100 genetic variants contributing to BMI, a measure of body size, or waist-to-hip ratio (adjusted for BMI, WHRadjBMI), a measure of body shape. Body size and shape change as people grow older and these changes differ substantially between men and women. To systematically screen for age- and/or sex-specific effects of genetic variants on BMI and WHRadjBMI, we performed meta-analyses of 114 studies (up to 320,485 individuals of European descent) with genome-wide chip and/or Metabochip data by the Genetic Investigation of Anthropometric Traits (GIANT) Consortium. Each study tested the association of up to ~2.8M SNPs with BMI and WHRadjBMI in four strata (men ≤50y, men >50y, women ≤50y, women >50y) and summary statistics were combined in stratum-specific meta-analyses. We then screened for variants that showed age-specific effects (G x AGE), sex-specific effects (G x SEX) or age-specific effects that differed between men and women (G x AGE x SEX). For BMI, we identified 15 loci (11 previously established for main effects, four novel) that showed significant (FDR
Polycystic ovary syndrome (PCOS) is a common endocrine condition in women of reproductive age that is understudied in non-European populations. In India, PCOS affects the lives of up to 19.4 million women aged 14-25 years. Gut microbiome composition might contribute to PCOS susceptibility. We profiled the microbiome in DNA isolated from faecal samples by 16S rRNA sequencing in 19/20 women with/without PCOS from Kashmir, India. We assigned genera to sequenced species with an average read depth of 121k and included bacteria detected in at least 1/3 of the subjects or with average relative abundance ≥0.1%. We compared the relative abundances of 40/58 operational taxonomic units at family/genus level between cases and controls, and in relation to 33 hormonal and metabolic factors, by multivariate analyses adjusted for confounders and corrected for multiple testing. Seven genera were significantly enriched in PCOS cases: Sarcina, Alkalibacterium and Megasphaera, as well as Bifidobacterium, Collinsella, Paraprevotella and Lactobacillus, which were previously reported for PCOS. We identified significantly increased relative abundance of Bifidobacteriaceae (median 6.07% vs. 2.77%) and Aerococcaceae (0.03% vs. 0.004%), whereas we detected lower relative abundance of Peptococcaceae (0.16% vs. 0.25%) in PCOS cases. For the first time, we identified a significant direct association between butyrate-producing Eubacterium and follicle-stimulating hormone levels. We observed increased relative abundance of Collinsella and Paraprevotella with higher fasting blood glucose levels, and Paraprevotella and Alkalibacterium with larger hip and waist circumference, and weight. We show a relationship between gut microbiome composition and PCOS, linking it to specific reproductive health metabolic and hormonal predictors in Indian women.
Physical activity (PA) may modify the genetic effects that give rise to increased risk of obesity. To identify adiposity loci whose effects are modified by PA, we performed genome-wide interaction meta-analyses of BMI and BMI-adjusted waist circumference and waist-hip ratio from up to 200,452 adults of European (n = 180,423) or other ancestry (n = 20,029). We standardized PA by categorizing it into a dichotomous variable where, on average, 23% of participants were categorized as inactive and 77% as physically active. While we replicate the interaction with PA for the strongest known obesity-risk locus in the FTO gene, of which the effect is attenuated by ~30% in physically active individuals compared to inactive individuals, we do not identify additional loci that are sensitive to PA. In additional genome-wide meta-analyses adjusting for PA and interaction with PA, we identify 11 novel adiposity loci, suggesting that accounting for PA or other environmental factors that contribute to variation in adiposity may facilitate gene discovery.
Background: High birth weight is associated with adult body mass index (BMI). We hypothesized that birth weight and BMI may partly share a common genetic background. Objective: The objective was to examine the associations of 12 established BMI variants in or near the NEGR1, SEC16B, TMEM18, ETV5, GNPDA2, BDNF, MTCH2, BCDIN3D, SH2B1, FTO, MC4R, and KCTD15 genes and their additive score with birth weight. Design: A meta-analysis was conducted with the use of 1) the European Prospective Investigation into Cancer and Nutrition (EPIC)-Norfolk, Hertfordshire, Fenland, and European Youth Heart Study cohorts (n(max) = 14,060); 2) data extracted from the Early Growth Genetics Consortium meta-analysis of 6 genome-wide association studies for birth weight (n(max) = 10,623); and 3) all published data (n(max) = 14,837). Results: Only the MTCH2 and FTO loci showed a nominally significant association with birth weight. The BMI-increasing allele of the MTCH2 variant (rs10838738) was associated with a lower birth weight (beta +/- SE: -13 +/- 5 g/allele; P = 0.012; n = 23,680), and the BMI-increasing allele of the FTO variant (rs1121980) was associated with a higher birth weight (beta +/- SE: 11 +/- 4 g/allele; P = 0.013; n = 28,219). These results were not significant after correction for multiple testing. Conclusions: Obesity-susceptibility loci have a small or no effect on weight at birth. Some evidence of an association was found for the MTCH2 and FTO loci, i.e., lower and higher birth weight, respectively. These findings may provide new insights into the underlying mechanisms by which these loci confer an increased risk of obesity. Am J Clin Nutr 2011;93:851-60.
Nearly three-quarters of the 143 genetic signals associated with platelet and erythrocyte phenotypes identified by meta-analyses of genome-wide association (GWA) studies are located at non-protein-coding regions. Here, we assessed the role of candidate regulatory variants associated with cell type-restricted, closely related hematological quantitative traits in biologically relevant hematopoietic cell types. We used formaldehyde-assisted isolation of regulatory elements followed by next-generation sequencing (FAIRE-seq) to map regions of open chromatin in three primary human blood cells of the myeloid lineage. In the precursors of platelets and erythrocytes, as well as in monocytes, we found that open chromatin signatures reflect the corresponding hematopoietic lineages of the studied cell types and associate with the cell type-specific gene expression patterns. Dependent on their signal strength, open chromatin regions showed correlation with promoter and enhancer histone marks, distance to the transcription start site, and ontology classes of nearby genes. Cell type-restricted regions of open chromatin were enriched in sequence variants associated with hematological indices. The majority (63.6%) of such candidate functional variants at platelet quantitative trait loci (QTLs) coincided with binding sites of five transcription factors key in regulating megakaryopoiesis. We experimentally tested 13 candidate regulatory variants at 10 platelet QTLs and found that 10 (76.9%) affected protein binding, suggesting that this is a frequent mechanism by which regulatory variants influence quantitative trait levels. Our findings demonstrate that combining large-scale GWA data with open chromatin profiles of relevant cell types can be a powerful means of dissecting the genetic architecture of closely related quantitative traits.
Recent genome-wide association (GWA) studies described 95 loci controlling serum lipid levels. These common variants explain ∼25% of the heritability of the phenotypes. To date, no unbiased screen for gene–environment interactions for circulating lipids has been reported. We screened for variants that modify the relationship between known epidemiological risk factors and circulating lipid levels in a meta-analysis of GWA data from 18 population-based cohorts with European ancestry (maximum N = 32,225). We collected 8 further cohorts (N = 17,102) for replication, and rs6448771 on 4p15 demonstrated genome-wide significant interaction with waist-to-hip ratio (WHR) on total cholesterol (TC) with a combined P-value of 4.79×10(-9). There were two potential candidate genes in the region, PCDH7 and CCKAR, with differential expression levels for rs6448771 genotypes in adipose tissue. The effect of WHR on TC was strongest for individuals carrying two copies of the G allele, for whom a one standard deviation (sd) difference in WHR corresponds to a 0.19 sd difference in TC concentration, while for A allele homozygotes the difference was 0.12 sd. Our findings may open up possibilities for targeted intervention strategies for people characterized by specific genomic profiles. However, more refined measures of both body-fat distribution and metabolic measures are needed to understand how their joint dynamics are modified by the newly found locus. Circulating serum lipids contribute greatly to global health by affecting the risk for cardiovascular diseases. Serum lipid levels are partly inherited, and already 95 loci affecting high- and low-density lipoprotein cholesterol, total cholesterol, and triglycerides have been found. Serum lipids are also known to be affected by multiple epidemiological risk factors like body composition, lifestyle, and sex.
It has been hypothesized that there are loci modifying the effects between risk factors and serum lipids, but to date only candidate gene studies for interactions have been reported. We conducted a genome-wide screen with a meta-analysis approach to identify loci having interactions with epidemiological risk factors on serum lipids with over 30,000 population-based samples. When combining results from our initial datasets and 8 additional replication cohorts (maximum N = 17,102), we found a genome-wide significant locus on chromosome 4p15 with a joint P-value of 4.79×10(-9) modifying the effect of waist-to-hip ratio on total cholesterol. In the area surrounding this genetic variant, there were two genes having association between the genotypes and the gene expression in adipose tissue, and we also found enrichment of association in genes belonging to lipid metabolism-related functions.
Birth weight (BW) has been shown to be influenced by both fetal and maternal factors and in observational studies is reproducibly associated with future risk of adult metabolic diseases including type 2 diabetes (T2D) and cardiovascular disease. These life-course associations have often been attributed to the impact of an adverse early life environment. Here, we performed a multi-ancestry genome-wide association study (GWAS) meta-analysis of BW in 153,781 individuals, identifying 60 loci where fetal genotype was associated with BW (P
Adult height is a model polygenic trait, but there has been limited success in identifying the genes underlying its normal variation. To identify genetic variants influencing adult human height, we used genome-wide association data from 13,665 individuals and genotyped 39 variants in an additional 16,482 samples. We identified 20 variants associated with adult height (P < 5 x 10(-7), with 10 reaching P < 1 x 10(-10)). Combined, the 20 SNPs explain ~3% of height variation, with a ~5 cm difference between the 6.2% of people with 17 or fewer 'tall' alleles compared to the 5.5% with 27 or more 'tall' alleles. The loci we identified implicate genes in Hedgehog signaling (IHH, HHIP, PTCH1), extracellular matrix (EFEMP1, ADAMTSL3, ACAN) and cancer (CDK6, HMGA2, DLEU7) pathways, and provide new insights into human growth and developmental processes. Finally, our results provide insights into the genetic architecture of a classic quantitative trait.
The development of type 2 diabetes (T2D) is influenced both by environmental and by genetic determinants. Obesity is an important risk factor for T2D, mostly mediated by obesity-related insulin resistance. Obesity and insulin resistance are also modulated by the genetic milieu; thus, genes affecting risk of obesity and insulin resistance might also modulate risk of T2D. Recently, 32 loci have been associated with body mass index (BMI) by genome-wide studies, including one locus on chromosome 16p11 containing the SH2B1 gene. Animal studies have suggested that SH2B1 is a physiological enhancer of the insulin receptor and humans with rare deletions or mutations at SH2B1 are obese with a disproportionately high insulin resistance. Thus, the role of SH2B1 in both obesity and insulin resistance makes it a strong candidate for T2D. However, published data on the role of SH2B1 variability on the risk for T2D are conflicting, ranging from no effect at all to a robust association. The SH2B1 tag SNP rs4788102 (SNP, single nucleotide polymorphism) was genotyped in 6978 individuals from six studies for abnormal glucose homeostasis (AGH), including impaired fasting glucose, impaired glucose tolerance or T2D, from the GENetics of Type 2 Diabetes in Italy and the United States (GENIUS T2D) consortium. Data from these studies were then meta-analyzed, in a Bayesian fashion, with those from DIAGRAM+ (n = 47,117) and four other published studies (n = 39,448). Variability at the SH2B1 obesity locus was not associated with AGH either in the GENIUS consortium (overall odds ratio (OR) = 0.96; 0.89–1.04) or in the meta-analysis (OR = 1.01; 0.98–1.05). Our data exclude a role for the SH2B1 obesity locus in the modulation of AGH.
Genome-wide association studies have identified hundreds of loci for type 2 diabetes, coronary artery disease and myocardial infarction, as well as for related traits such as body mass index, glucose and insulin levels, lipid levels, and blood pressure. These studies also have pointed to thousands of loci with promising but not yet compelling association evidence. To establish association at additional loci and to characterize the genome-wide significant loci by fine-mapping, we designed the "Metabochip," a custom genotyping array that assays nearly 200,000 SNP markers. Here, we describe the Metabochip and its component SNP sets, evaluate its performance in capturing variation across the allele-frequency spectrum, describe solutions to methodological challenges commonly encountered in its analysis, and evaluate its performance as a platform for genotype imputation. The Metabochip achieves dramatic cost efficiencies compared to designing single-trait follow-up reagents, and provides the opportunity to compare results across a range of related traits. The Metabochip and similar custom genotyping arrays offer a powerful and cost-effective approach to follow up large-scale genotyping and sequencing studies and advance our understanding of the genetic basis of complex human diseases and traits.
We carried out a genome-wide association study of type-2 diabetes (T2D) in individuals of South Asian ancestry. Our discovery set included 5,561 individuals with T2D (cases) and 14,458 controls drawn from studies in London, Pakistan and Singapore. We identified 20 independent SNPs associated with T2D at P < 10(-4) for testing in a replication sample of 13,170 cases and 25,398 controls, also all of South Asian ancestry. In the combined analysis, we identified common genetic variants at six loci (GRB14, ST6GAL1, VPS26A, HMG20A, AP3S2 and HNF4A) newly associated with T2D (P = 4.1 x 10(-8) to P = 1.9 x 10(-11)). SNPs at GRB14 were also associated with insulin sensitivity (P = 5.0 x 10(-4)), and SNPs at ST6GAL1 and HNF4A were also associated with pancreatic beta-cell function (P = 0.02 and P = 0.001, respectively). Our findings provide additional insight into mechanisms underlying T2D and show the potential for new discovery from genetic association studies in South Asians, a population with increased susceptibility to T2D.
Improved sequencing technologies offer unprecedented opportunities for investigating the role of rare genetic variation in common disease. However, there are considerable challenges with respect to study design, data analysis and replication(1). Using pooled next-generation sequencing of 507 genes implicated in the repair of DNA in 1,150 samples, an analytical strategy focused on protein-truncating variants (PTVs) and a large-scale sequencing case-control replication experiment in 13,642 individuals, here we show that rare PTVs in the p53-inducible protein phosphatase PPM1D are associated with predisposition to breast cancer and ovarian cancer. PPM1D PTV mutations were present in 25 out of 7,781 cases versus 1 out of 5,861 controls (P = 1.12 x 10(-5)), including 18 mutations in 6,912 individuals with breast cancer (P = 2.42 x 10(-4)) and 12 mutations in 1,121 individuals with ovarian cancer (P = 3.10 x 10(-9)). Notably, all of the identified PPM1D PTVs were mosaic in lymphocyte DNA and clustered within a 370-base-pair region in the final exon of the gene, carboxy-terminal to the phosphatase catalytic domain. Functional studies demonstrate that the mutations result in enhanced suppression of p53 in response to ionizing radiation exposure, suggesting that the mutant alleles encode hyperactive PPM1D isoforms. Thus, although the mutations cause premature protein truncation, they do not result in the simple loss-of-function effect typically associated with this class of variant, but instead probably have a gain-of-function effect. Our results have implications for the detection and management of breast and ovarian cancer risk. More generally, these data provide new insights into the role of rare and of mosaic genetic variants in common conditions, and the use of sequencing in their identification.
Favorable associations between magnesium intake and glycemic traits, such as fasting glucose and insulin, are observed in observational and clinical studies, but whether genetic variation affects these associations is largely unknown. We hypothesized that single nucleotide polymorphisms (SNPs) associated with either glycemic traits or magnesium metabolism affect the association between magnesium intake and fasting glucose and insulin. Fifteen studies from the CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) Consortium provided data from up to 52,684 participants of European descent without known diabetes. In fixed-effects meta-analyses, we quantified 1) cross-sectional associations of dietary magnesium intake with fasting glucose (mmol/L) and insulin (ln-pmol/L) and 2) interactions between magnesium intake and SNPs related to fasting glucose (16 SNPs), insulin (2 SNPs), or magnesium (8 SNPs) on fasting glucose and insulin. After adjustment for age, sex, energy intake, BMI, and behavioral risk factors, magnesium (per 50-mg/d increment) was inversely associated with fasting glucose [beta = -0.009 mmol/L (95% CI: -0.013, -0.005), P < 0.0001] and insulin [-0.020 ln-pmol/L (95% CI: -0.024, -0.017), P < 0.0001]. No magnesium-related SNP or interaction between any SNP and magnesium reached significance after correction for multiple testing. However, rs2274924 in magnesium transporter-encoding TRPM6 showed a nominal association (uncorrected P = 0.03) with glucose, and rs11558471 in SLC30A8 and rs3740393 near CNNM2 showed a nominal interaction (uncorrected, both P = 0.02) with magnesium on glucose. Consistent with other studies, a higher magnesium intake was associated with lower fasting glucose and insulin. Nominal evidence of TRPM6 influence and magnesium interaction with select loci suggests that further investigation is warranted. J. Nutr. 143: 345-353, 2013.
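The fixed-effects meta-analysis named in the abstract above can be sketched as follows. This is a minimal illustration that assumes the common inverse-variance weighting, which the abstract does not specify. The function name and the input values are hypothetical and are not taken from the study.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Pool per-study estimates with inverse-variance weights.

    betas: per-study effect estimates (e.g. change in glucose per unit intake).
    standard_errors: per-study standard errors, all strictly positive.
    Returns (pooled_beta, pooled_se).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need equally sized, non-empty inputs")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Illustrative call with made-up estimates from two equally precise studies.
pooled_beta, pooled_se = fixed_effects_meta([1.0, 3.0], [1.0, 1.0])
```

Under this weighting, more precise studies (smaller standard errors) dominate the pooled estimate, and the pooled standard error shrinks as studies are added.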
Large consortia have revealed hundreds of genetic loci associated with anthropometric traits, one trait at a time. We examined whether genetic variants affect body shape as a composite phenotype that is represented by a combination of anthropometric traits. We developed an approach that calculates averaged PCs (AvPCs) representing body shape derived from six anthropometric traits (body mass index, height, weight, waist and hip circumference, waist-to-hip ratio). The first four AvPCs explain >99% of the variability, are heritable, and associate with cardiometabolic outcomes. We performed genome-wide association analyses for each body shape composite phenotype across 65 studies and meta-analysed summary statistics. We identify six novel loci: LEMD2 and CD47 for AvPC1, RPS6KA5/C14orf159 and GANAB for AvPC3, and ARL15 and ANP32 for AvPC4. Our findings highlight the value of using multiple traits to define complex phenotypes for discovery, which are not captured by single-trait analyses, and may shed light onto new pathways.
Telomere maintenance has emerged as an important molecular feature with impacts on adult glioma susceptibility and prognosis. Whether longer or shorter leukocyte telomere length (LTL) is associated with glioma risk remains elusive and is often confounded by the effects of age and patient treatment. We sought to determine if genotypically-estimated LTL is associated with glioma risk and if inherited single nucleotide polymorphisms (SNPs) that are associated with LTL are glioma risk factors. Using a Mendelian randomization approach, we assessed differences in genotypically-estimated relative LTL in two independent glioma case-control datasets from the UCSF Adult Glioma Study (652 patients and 3735 controls) and The Cancer Genome Atlas (478 non-overlapping patients and 2559 controls). LTL estimates were based on a weighted linear combination of subject genotype at eight SNPs, previously associated with LTL in the ENGAGE Consortium Telomere Project. Mean estimated LTL was 31bp (5.7%) longer in glioma patients than controls in discovery analyses (P = 7.82x10(-8)) and 27bp (5.0%) longer in glioma patients than controls in replication analyses (P = 1.48x10(-3)). Glioma risk increased monotonically with each increasing septile of LTL (O.R.=1.12; P = 3.83x10(-12)). Four LTL-associated SNPs were significantly associated with glioma risk in pooled analyses, including those in the telomerase component genes TERC (O.R.=1.14; 95% C.I.=1.03-1.28) and TERT (O.R.=1.39; 95% C.I.=1.27-1.52), and those in the CST complex genes OBFC1 (O.R.=1.18; 95% C.I.=1.05-1.33) and CTC1 (O.R.=1.14; 95% C.I.=1.02-1.28). Future work is needed to characterize the role of the CST complex in gliomagenesis and further elucidate the complex balance between ageing, telomere length, and molecular carcinogenesis.
OBJECTIVE: Whole-grain foods are touted for multiple health benefits, including enhancing insulin sensitivity and reducing type 2 diabetes risk. Recent genome-wide association studies (GWAS) have identified several single nucleotide polymorphisms (SNPs) associated with fasting glucose and insulin concentrations in individuals free of diabetes. We tested the hypothesis that whole-grain food intake and genetic variation interact to influence concentrations of fasting glucose and insulin. RESEARCH DESIGN AND METHODS: Via meta-analysis of data from 14 cohorts comprising ~48,000 participants of European descent, we studied interactions of whole-grain intake with loci previously associated in GWAS with fasting glucose (16 loci) and/or insulin (2 loci) concentrations. For tests of interaction we considered a P value
It is common practice in genome-wide association studies (GWAS) to focus on the relationship between disease risk and genetic variants one marker at a time. When relevant genes are identified it is often possible to implicate biological intermediates and pathways likely to be involved in disease aetiology. However, single genetic variants typically explain small amounts of disease risk. Our idea is to construct allelic scores that explain greater proportions of the variance in biological intermediates, and subsequently use these scores to data mine GWAS. To investigate the approach's properties, we indexed three biological intermediates where the results of large GWAS meta-analyses were available: body mass index, C-reactive protein and low density lipoprotein levels. We generated allelic scores in the Avon Longitudinal Study of Parents and Children, and in publicly available data from the first Wellcome Trust Case Control Consortium. We compared the explanatory ability of allelic scores in terms of their capacity to proxy for the intermediate of interest, and the extent to which they associated with disease. We found that allelic scores derived from known variants and allelic scores derived from hundreds of thousands of genetic markers explained significant portions of the variance in biological intermediates of interest, and many of these scores showed expected correlations with disease. Genome-wide allelic scores however tended to lack specificity suggesting that they should be used with caution and perhaps only to proxy biological intermediates for which there are no known individual variants. Power calculations confirm the feasibility of extending our strategy to the analysis of tens of thousands of molecular phenotypes in large genome-wide meta-analyses. 
We conclude that our method represents a simple way in which potentially tens of thousands of molecular phenotypes could be screened for causal relationships with disease without having to expensively measure these variables in individual disease collections.
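As a rough illustration of the allelic-score idea described above, the sketch below sums each individual's allele dosages, weighted by per-variant effect estimates, and reports the squared correlation between the scores and a measured intermediate. The function names, toy dosages and weights are illustrative assumptions and are not taken from the study.

```python
from math import sqrt


def allelic_score(dosages, weights):
    """Weighted allelic score: sum of (allele dosage x effect weight)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def variance_explained(scores, trait):
    """Squared Pearson correlation between scores and a measured trait."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_t = sum(trait) / n
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(scores, trait))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_t = sum((t - mean_t) ** 2 for t in trait)
    r = cov / sqrt(var_s * var_t)
    return r * r


# Hypothetical data: three variants, four individuals (dosages 0, 1 or 2).
weights = [0.10, 0.20, 0.30]
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 0, 0]]
scores = [allelic_score(g, weights) for g in genotypes]
```

In practice the weights would come from an external GWAS meta-analysis and the score would then be tested against disease status; that step is omitted here.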
Alcohol consumption is a moderately heritable trait, but the genetic basis in humans is largely unknown, despite its clinical and societal importance. We report a genome-wide association study meta-analysis of ~2.5 million directly genotyped or imputed SNPs with alcohol consumption (gram per day per kilogram body weight) among 12 population-based samples of European ancestry, comprising 26,316 individuals, with replication genotyping in an additional 21,185 individuals. SNP rs6943555 in autism susceptibility candidate 2 gene (AUTS2) was associated with alcohol consumption at genome-wide significance (P = 4 x 10(-8) to P = 4 x 10(-9)). We found a genotype-specific expression of AUTS2 in 96 human prefrontal cortex samples (P = 0.026) and significant (P < 0.017) differences in expression of AUTS2 in whole-brain extracts of mice selected for differences in voluntary alcohol consumption. Downregulation of an AUTS2 homolog caused reduced alcohol sensitivity in Drosophila (P < 0.001). Our finding of a regulator of alcohol consumption adds knowledge to our understanding of genetic mechanisms influencing alcohol drinking behavior.
Substantial progress has been made in identification of type 2 diabetes (T2D) risk loci in the past few years, but our understanding of the genetic basis of T2D in ethnically diverse populations remains limited. We performed a genome-wide association study and a replication study in Chinese Hans comprising 8,569 T2D case subjects and 8,923 control subjects in total, from which 10 single nucleotide polymorphisms were selected for further follow-up in a de novo replication sample of 3,410 T2D case and 3,412 control subjects and an in silico replication sample of 6,952 T2D case and 11,865 control subjects. Besides confirming seven established T2D loci (CDKAL1, CDKN2A/B, KCNQ1, CDC123, GLIS3, HNF1B, and DUSP9) at genome-wide significance, we identified two novel T2D loci, including G-protein–coupled receptor kinase 5 (GRK5) (rs10886471: P = 7.1 × 10(-9)) and RASGRP1 (rs7403531: P = 3.9 × 10(-9)), of which the association signal at GRK5 seems to be specific to East Asians. In nondiabetic individuals, the T2D risk-increasing allele of RASGRP1-rs7403531 was also associated with higher HbA1c and lower homeostasis model assessment of β-cell function (P = 0.03 and 0.0209, respectively), whereas the T2D risk-increasing allele of GRK5-rs10886471 was also associated with higher fasting insulin (P = 0.0169) but not with fasting glucose. Our findings not only provide new insights into the pathophysiology of T2D, but may also shed light on the ethnic differences in T2D susceptibility.
Body fat distribution is a heritable trait and a well-established predictor of adverse metabolic outcomes, independent of overall adiposity. To increase our understanding of the genetic basis of body fat distribution and its molecular links to cardiometabolic traits, here we conduct genome-wide association meta-analyses of traits related to waist and hip circumferences in up to 224,459 individuals. We identify 49 loci (33 new) associated with waist-to-hip ratio adjusted for body mass index (BMI), and an additional 19 loci newly associated with related waist and hip circumference measures (P < 5 x 10(-8)). In total, 20 of the 49 waist-to-hip ratio adjusted for BMI loci show significant sexual dimorphism, 19 of which display a stronger effect in women. The identified loci were enriched for genes expressed in adipose tissue and for putative regulatory elements in adipocytes. Pathway analyses implicated adipogenesis, angiogenesis, transcriptional regulation and insulin resistance as processes affecting fat distribution, providing insight into potential pathophysiological mechanisms.
Pulmonary arterial hypertension (PAH) is a rare disorder with a poor prognosis. Deleterious variation within components of the transforming growth factor-beta pathway, particularly the bone morphogenetic protein type 2 receptor (BMPR2), underlies most heritable forms of PAH. To identify the missing heritability we perform whole-genome sequencing in 1038 PAH index cases and 6385 PAH-negative control subjects. Case-control analyses reveal significant overrepresentation of rare variants in ATP13A3, AQP1 and SOX17, and provide independent validation of a critical role for GDF2 in PAH. We demonstrate familial segregation of mutations in SOX17 and AQP1 with PAH. Mutations in GDF2, encoding a BMPR2 ligand, lead to reduced secretion from transfected cells. In addition, we identify pathogenic mutations in the majority of previously reported PAH genes, and provide evidence for further putative genes. Taken together these findings contribute new insights into the molecular basis of PAH and indicate unexplored pathways for therapeutic intervention.
Chronic kidney disease (CKD) is a significant public health problem, and recent genetic studies have identified common CKD susceptibility variants. The CKDGen consortium performed a meta-analysis of genome-wide association data in 67,093 individuals of European ancestry from 20 predominantly population-based studies in order to identify new susceptibility loci for reduced renal function as estimated by serum creatinine (eGFRcrea), serum cystatin c (eGFRcys) and CKD (eGFRcrea < 60 ml/min/1.73 m(2); n = 5,807 individuals with CKD (cases)). Follow-up of the 23 new genome-wide-significant loci (P < 5 x 10(-8)) in 22,982 replication samples identified 13 new loci affecting renal function and CKD (in or near LASS2, GCKR, ALMS1, TFDP2, DAB2, SLC34A1, VEGFA, PRKAG2, PIP5K1B, ATXN2, DACH1, UBE2Q2 and SLC7A9) and 7 loci suspected to affect creatinine production and secretion (CPS1, SLC22A2, TMEM60, WDR37, SLC6A13, WDR72 and BCAS3). These results further our understanding of the biologic mechanisms of kidney function by identifying loci that potentially influence nephrogenesis, podocyte function, angiogenesis, solute transport and metabolic functions of the kidney.
The Genetic Investigation of Anthropometric Traits (GIANT) consortium identified 14 loci in European Ancestry (EA) individuals associated with waist-to-hip ratio (WHR) adjusted for body mass index. These loci are wide and narrowing the signals remains necessary. Twelve of 14 loci identified in GIANT EA samples retained strong associations with WHR in our joint EA/individuals of African Ancestry (AA) analysis (log-Bayes factor >6.1). Trans-ethnic analyses at five loci (TBX15-WARS2, LYPLAL1, ADAMTS9, LY86 and ITPR2-SSPN) substantially narrowed the signals to smaller sets of variants, some of which are in regions that have evidence of regulatory activity. By leveraging varying linkage disequilibrium structures across different populations, single-nucleotide polymorphisms (SNPs) with strong signals and narrower credible sets from trans-ethnic meta-analysis of central obesity provide more precise localizations of potential functional variants and suggest a possible regulatory role. Meta-analysis results for WHR were obtained from 77 167 EA participants from GIANT and 23 564 AA participants from the African Ancestry Anthropometry Genetics Consortium. For fine mapping we interrogated SNPs within ± 250 kb flanking regions of 14 previously reported index SNPs from loci discovered in EA populations by performing trans-ethnic meta-analysis of results from the EA and AA meta-analyses. We applied a Bayesian approach that leverages allelic heterogeneity across populations to combine meta-analysis results and aids in fine-mapping shared variants at these locations. We annotated variants using information from the ENCODE Consortium and Roadmap Epigenomics Project to prioritize variants for possible functionality.
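The credible-set step mentioned above can be sketched as follows. This is a minimal illustration, assuming one causal variant per locus, equal prior probability for every variant, and Bayes factors given on the natural-log scale; none of these details are stated in the abstract, and the variant names and values are made up.

```python
from math import exp, log


def credible_set(log_bfs, coverage=0.95):
    """Return the smallest set of variants whose posterior mass >= coverage.

    log_bfs maps a variant name to its natural-log Bayes factor.
    With equal priors, posterior probability is proportional to the
    Bayes factor itself.
    """
    if not log_bfs:
        raise ValueError("log_bfs must not be empty")
    top = max(log_bfs.values())
    # Subtract the maximum before exponentiating to avoid overflow.
    rel = {v: exp(lbf - top) for v, lbf in log_bfs.items()}
    total = sum(rel.values())
    ranked = sorted(rel, key=rel.get, reverse=True)
    chosen, cumulative = [], 0.0
    for variant in ranked:
        chosen.append(variant)
        cumulative += rel[variant] / total
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical locus with Bayes factors 90, 8 and 2
# (posterior probabilities 0.90, 0.08 and 0.02).
example = credible_set({"rsA": log(90.0), "rsB": log(8.0), "rsC": log(2.0)})
```

A narrower credible set after combining ancestries means fewer variants are needed to reach the same posterior coverage, which is the sense in which the trans-ethnic analysis sharpens localization.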
To further understanding of the genetic basis of type 2 diabetes (T2D) susceptibility, we aggregated published meta-analyses of genome-wide association studies (GWAS), including 26,488 cases and 83,964 controls of European, east Asian, south Asian and Mexican and Mexican American ancestry. We observed a significant excess in the directional consistency of T2D risk alleles across ancestry groups, even at SNPs demonstrating only weak evidence of association. By following up the strongest signals of association from the trans-ethnic meta-analysis in an additional 21,491 cases and 55,647 controls of European ancestry, we identified seven new T2D susceptibility loci. Furthermore, we observed considerable improvements in the fine-mapping resolution of common variant association signals at several T2D susceptibility loci. These observations highlight the benefits of trans-ethnic GWAS for the discovery and characterization of complex trait loci and emphasize an exciting opportunity to extend insight into the genetic architecture and pathogenesis of human diseases across populations of diverse ancestry.
Given the anthropometric differences between men and women and previous evidence of sex-difference in genetic effects, we conducted a genome-wide search for sexually dimorphic associations with height, weight, body mass index, waist circumference, hip circumference, and waist-to-hip-ratio (133,723 individuals) and took forward 348 SNPs into follow-up (additional 137,052 individuals) in a total of 94 studies. Seven loci displayed significant sex-difference (FDR
OBJECTIVE-Proinsulin is a precursor of mature insulin and C-peptide. Higher circulating proinsulin levels are associated with impaired beta-cell function, raised glucose levels, insulin resistance, and type 2 diabetes (T2D). Studies of the insulin processing pathway could provide new insights about T2D pathophysiology. RESEARCH DESIGN AND METHODS-We have conducted a meta-analysis of genome-wide association tests of ~2.5 million genotyped or imputed single nucleotide polymorphisms (SNPs) and fasting proinsulin levels in 10,701 nondiabetic adults of European ancestry, with follow-up of 23 loci in up to 16,378 individuals, using additive genetic models adjusted for age, sex, fasting insulin, and study-specific covariates. RESULTS-Nine SNPs at eight loci were associated with proinsulin levels (P < 5 x 10(-8)). Two loci (LARP6 and SGSM2) have not been previously related to metabolic traits, one (MADD) has been associated with fasting glucose, one (PCSK1) has been implicated in obesity, and four (TCF7L2, SLC30A8, VPS13C/C2CD4A/B, and ARAP1, formerly CENTD2) increase T2D risk. The proinsulin-raising allele of ARAP1 was associated with a lower fasting glucose (P = 1.7 x 10(-4)), improved beta-cell function (P = 1.1 x 10(-5)), and lower risk of T2D (odds ratio 0.88; P = 7.8 x 10(-6)). Notably, PCSK1 encodes the protein prohormone convertase 1/3, the first enzyme in the insulin processing pathway. A genotype score composed of the nine proinsulin-raising alleles was not associated with coronary disease in two large case-control datasets. CONCLUSIONS-We have identified nine genetic variants associated with fasting proinsulin. Our findings illuminate the biology underlying glucose homeostasis and T2D development in humans and argue against a direct role of proinsulin in coronary artery disease pathogenesis. Diabetes 60:2624-2634, 2011
Observational studies have reported different effects of adiposity on cardiovascular risk factors across age and sex. Since cardiovascular risk factors are enriched in obese individuals, it has not been easy to dissect the effects of adiposity from those of other risk factors. We used a Mendelian randomization approach, applying a set of 32 genetic markers to estimate the causal effect of adiposity on blood pressure, glycemic indices, circulating lipid levels, and markers of inflammation and liver disease in up to 67,553 individuals. All analyses were stratified by age (cutoff 55 years of age) and sex. The genetic score was associated with BMI in both nonstratified analysis (P = 2.8 × 10(-107)) and stratified analyses (all P < 3.3 × 10(-30)). We found evidence of a causal effect of adiposity on blood pressure, fasting levels of insulin, C-reactive protein, interleukin-6, HDL cholesterol, and triglycerides in a nonstratified analysis and in the
Multiple sclerosis (MS) is a chronic, inflammatory, disabling disease of the central nervous system, known for its complex interplay between genetic and environmental factors. We used life table techniques to calculate age-adjusted recurrence risks for different categories of relatives of MS patients from Central Sardinia (Italy), a genetically homogeneous, stable population with a high degree of consanguinity. We included 313 probands and a total of 12,717 relatives in the analysis. The overall age-adjusted recurrence risk for relatives of MS probands is 1.90% [95% confidence interval (CI): 1.57–2.30]. The age-adjusted recurrence risk in parents was 1.26% (95% CI 0.60–2.63), in children 2.33% (95% CI 0.09–5.56), in sibs 4.76% (95% CI 3.57–6.32), in second-degree relatives 0.72% (95% CI 0.42–1.22), and in third-degree relatives 1.79% (95% CI 1.27–2.51). The sex of the probands (male) and of the relatives (female), and the number of affected relatives in the family significantly increase the risk of MS in relatives.
Patients with established type 2 diabetes display both β-cell dysfunction and insulin resistance. To define fundamental processes leading to the diabetic state, we examined the relationship between type 2 diabetes risk variants at 37 established susceptibility loci, and indices of proinsulin processing, insulin secretion, and insulin sensitivity. We included data from up to 58,614 nondiabetic subjects with basal measures and 17,327 with dynamic measures. We used additive genetic models with adjustment for sex, age, and BMI, followed by fixed-effects, inverse-variance meta-analyses. Cluster analyses grouped risk loci into five major categories based on their relationship to these continuous glycemic phenotypes. The first cluster (PPARG, KLF14, IRS1, GCKR) was characterized by primary effects on insulin sensitivity. The second cluster (MTNR1B, GCK) featured risk alleles associated with reduced insulin secretion and fasting hyperglycemia. ARAP1 constituted a third cluster characterized by defects in insulin processing. A fourth cluster (TCF7L2, SLC30A8, HHEX/IDE, CDKAL1, CDKN2A/2B) was defined by loci influencing insulin processing and secretion without a detectable change in fasting glucose levels. The final group contained 20 risk loci with no clear-cut associations to continuous glycemic traits. By assembling extensive data on continuous glycemic traits, we have exposed the diverse mechanisms whereby type 2 diabetes risk variants impact disease predisposition.
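The fixed-effects, inverse-variance meta-analysis named above can be illustrated with a short sketch: each cohort's effect estimate is weighted by the inverse of its squared standard error, and the pooled standard error is the square root of the inverse of the summed weights. The function name and the numbers are illustrative assumptions, not values from the study.

```python
from math import sqrt


def fixed_effects_meta(betas, ses):
    """Pool per-cohort effect estimates with inverse-variance weights.

    betas: per-cohort effect estimates for one variant and one trait.
    ses:   matching standard errors (all must be positive).
    Returns (pooled_beta, pooled_se).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = sqrt(1.0 / total)
    return pooled_beta, pooled_se


# Hypothetical example: two cohorts with equal precision.
pooled_beta, pooled_se = fixed_effects_meta([0.10, 0.30], [0.10, 0.10])
```

With equal standard errors the pooled estimate is the simple average of the two effects, and the pooled standard error shrinks by a factor of the square root of two.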
Context: Serum estradiol (E2) and estrone (E1) levels exhibit substantial heritability. Objective: To investigate the genetic regulation of serum E2 and E1 in men. Design, Setting, and Participants: Genome-wide association study in 11,097 men of European origin from nine epidemiological cohorts. Main Outcome Measures: Genetic determinants of serum E2 and E1 levels. Results: Variants in/near CYP19A1 demonstrated the strongest evidence for association with E2, resolving to three independent signals. Two additional independent signals were found on the X chromosome; FAMily with sequence similarity 9, member B (FAM9B), rs5934505 (P = 3.4 x 10(-8)) and Xq27.3, rs5951794 (P = 3.1 x 10(-10)). E1 signals were found in CYP19A1 (rs2899472, P = 5.5 x 10(-23)), in Tripartite motif containing 4 (TRIM4; rs17277546, P = 5.8 x 10(-14)), and CYP11B1/B2 (rs10093796, P = 1.2 x 10(-8)). E2 signals in CYP19A1 and FAM9B were associated with bone mineral density (BMD). Mendelian randomization analysis suggested a causal effect of serum E2 on BMD in men. A 1 pg/mL genetically increased E2 was associated with a 0.048 standard deviation increase in lumbar spine BMD (P = 2.8 x 10(-12)). In men and women combined, CYP19A1 alleles associated with higher E2 levels were associated with lower degrees of insulin resistance. Conclusions: Our findings confirm that CYP19A1 is an important genetic regulator of E2 and E1 levels and strengthen the causal importance of E2 for bone health in men. We also report two independent loci on the X-chromosome for E2, and one locus each in TRIM4 and CYP11B1/B2, for E1.
Chronic kidney disease (CKD) is an important public health problem with a genetic component. We performed genome-wide association studies in up to 130,600 European ancestry participants overall, and stratified for key CKD risk factors. We uncovered 6 new loci in association with estimated glomerular filtration rate (eGFR), the primary clinical measure of CKD, in or near MPPED2, DDX1, SLC47A1, CDK12, CASP9, and INO80. Morpholino knockdown of mpped2 and casp9 in zebrafish embryos revealed podocyte and tubular abnormalities with altered dextran clearance, suggesting a role for these genes in renal function. By providing new insights into genes that regulate renal function, these results could further our understanding of the pathogenesis of CKD. Chronic kidney disease (CKD) is an important public health problem with a hereditary component. We performed a new genome-wide association study in up to 130,600 European ancestry individuals to identify genes that may influence kidney function, specifically genes that may influence kidney function differently depending on sex, age, hypertension, and diabetes status of individuals. We uncovered 6 new loci associated with estimated glomerular filtration rate (eGFR), the primary measure of renal function, in or near MPPED2, DDX1, SLC47A1, CDK12, CASP9, and INO80. CDK12 effect was stronger in younger and absent in older individuals. MPPED2, DDX1, SLC47A1, and CDK12 loci were associated with eGFR in African ancestry samples as well, highlighting the cross-ethnicity validity of our findings. Using the zebrafish model, we performed morpholino knockdown of mpped2 and casp9 in zebrafish embryos and revealed podocyte and tubular abnormalities with altered dextran clearance, suggesting a role for these genes in renal function. These results further our understanding of the pathogenesis of CKD and provide insights into potential novel mechanisms of disease.
Elevated levels of acute-phase serum amyloid A (A-SAA) cause amyloidosis and are a risk factor for atherosclerosis and its clinical complications, type 2 diabetes, as well as various malignancies. To investigate the genetic basis of A-SAA levels, we conducted the first genome-wide association study on baseline A-SAA concentrations in three population-based studies (KORA, TwinsUK, Sorbs) and one prospective case cohort study (LURIC), including a total of 4,212 participants of European descent, and identified two novel genetic susceptibility regions at 11p15.5-p13 and 1p31. The region at 11p15.5-p13 (rs4150642; p = 3.20x10(-111)) contains serum amyloid A1 (SAA1) and the adjacent general transcription factor 2 H1 (GTF2H1), Hermansky-Pudlak Syndrome 5 (HPS5), lactate dehydrogenase A (LDHA), and lactate dehydrogenase C (LDHC). This region explains 10.84% of the total variation of A-SAA levels in our data, which makes up 18.37% of the total estimated heritability. The second region encloses the leptin receptor (LEPR) gene at 1p31 (rs12753193; p = 1.22x10(-11)) and has been found to be associated with CRP and fibrinogen in previous studies. Our findings demonstrate a key role of the 11p15.5-p13 region in the regulation of baseline A-SAA levels and provide confirmative evidence of the importance of the 1p31 region for inflammatory processes and the close interplay between A-SAA, leptin, and other acute-phase proteins.
Over the past 8 years, the genetics of complex traits have benefited from an unprecedented advancement in the identification of common variant loci for diseases such as type 2 diabetes (T2D). The ability to undertake genome-wide association studies in large population-based samples for quantitative glycaemic traits has permitted us to explore the hypothesis that models arising from studies in non-diabetic individuals may reflect mechanisms involved in the pathogenesis of diabetes. Amongst 88 T2D risk and 72 glycaemic trait loci, only 29 are shared and show disproportionate magnitudes of phenotypic effects. Important mechanistic insights have been gained regarding the physiological role of T2D loci in disease predisposition through the elucidation of their contribution to glycaemic trait variability. Further investigation is warranted to define causal variants within these loci, including functional characterisation of associated variants, to dissect their role in disease mechanisms and to enable clinical translation.
Despite significant research efforts aimed at understanding the neurobiological underpinnings of psychiatric disorders, the diagnosis and the evaluation of treatment of these disorders are still based solely on relatively subjective assessment of symptoms. Therefore, biological markers which could improve the current classification of psychiatric disorders, and in the longer term stratify patients on a biological basis into more homogeneous, clinically distinct subgroups, are highly needed. In order to identify novel candidate biological markers for major depression and schizophrenia, we have applied a focused proteomic approach using plasma samples from a large case-control collection. Patients were diagnosed according to DSM criteria using structured interviews, and a number of additional clinical variables and demographic information were assessed. Plasma samples from 245 depressed patients, 229 schizophrenic patients and 254 controls were submitted to multi-analyte profiling allowing the evaluation of up to 79 proteins, including a series of cytokines, chemokines and neurotrophins previously suggested to be involved in the pathophysiology of depression and schizophrenia. Univariate data analysis showed more significant p-values than would be expected by chance and highlighted several proteins belonging to pathways or mechanisms previously suspected to be involved in the pathophysiology of major depression or schizophrenia, such as insulin and MMP-9 for depression, and BDNF, EGF and a number of chemokines for schizophrenia. Multivariate analysis was carried out to improve the differentiation of cases from controls and identify the most informative panel of markers. The results illustrate the potential of plasma biomarker profiling for psychiatric disorders, when conducted in large collections. The study highlighted a set of analytes as candidate biomarker signatures for depression and schizophrenia, warranting further investigation in independent collections.
The Sample avAILability system (SAIL) is a web-based application for searching, browsing and annotating biological sample collections or biobank entries. By providing individual-level information on the availability of specific data types (phenotypes, genetic or genomic data) and samples within a collection, rather than the actual measurement data, resource integration can be facilitated. A flexible data structure enables the collection owners to provide descriptive information on their samples using existing or custom vocabularies. Users can query for the available samples by various parameters, combining them via logical expressions. The system can be scaled to hold data from millions of samples with thousands of variables. Availability: SAIL is available under the Affero GPL open source license: https://github.com/sail. Contact: gostev@ebi.ac.uk, support@simbioms.org. Supplementary information: Supplementary data are available at Bioinformatics online and from http://www.simbioms.org.
Rare variants have gathered increasing attention as a possible alternative source of missing heritability. Since next generation sequencing technology is not yet cost-effective for large-scale genomic studies, a widely used alternative approach is imputation. However, the imputation approach may be limited by the low accuracy of the imputed rare variants. To improve imputation accuracy of rare variants, various approaches have been suggested, including increasing the sample size of the reference panel, using sequencing data from study-specific samples (i.e., specific populations), and using local reference panels by genotyping or sequencing a subset of study samples. While these approaches mainly utilize reference panels, imputation accuracy of rare variants can also be increased by using exome chips containing rare variants. The exome chip contains 250 K rare variants selected from the discovered variants of about 12,000 sequenced samples. If exome chip data are available for previously genotyped samples, the combined approach using a genotype panel of merged data, including exome chips and SNP chips, should increase the imputation accuracy of rare variants. In this study, we describe a combined imputation which uses both exome chip and SNP chip data simultaneously as a genotype panel. The effectiveness and performance of the combined approach was demonstrated using a reference panel of 848 samples constructed using exome sequencing data from the T2D-GENES consortium and 5,349 sample genotype panels consisting of an exome chip and SNP chip. As a result, the combined approach increased imputation quality up to 11 %, and genomic coverage for rare variants up to 117.7 % (MAF < 1 %), compared to imputation using the SNP chip alone. Also, we investigated the systematic effect of reference panels on imputation quality using five reference panels and three genotype panels. 
The best performing approach was the combination of the study specific reference panel and the genotype panel of combined data. Our study demonstrates that combined datasets, including SNP chips and exome chips, enhances both the imputation quality and genomic coverage of rare variants.
Meta-analyses of population-based genome-wide association studies (GWAS) in adults have recently led to the detection of new genetic loci for obesity. Here we aimed to discover additional obesity loci in extremely obese children and adolescents. We also investigated if these results generalize by estimating the effects of these obesity loci in adults and in population-based samples including both children and adults. We jointly analysed two GWAS of 2,258 individuals and followed up the best, according to lowest p-values, 44 single nucleotide polymorphisms (SNP) from 21 genomic regions in 3,141 individuals. After this DISCOVERY step, we explored if the findings derived from the extremely obese children and adolescents (10 SNPs from 5 genomic regions) generalized to (i) the population level and (ii) to adults by genotyping another 31,182 individuals (GENERALIZATION step). Apart from previously identified FTO, MC4R, and TMEM18, we detected two new loci for obesity: one in SDCCAG8 (serologically defined colon cancer antigen 8 gene; p = 1.85 x 10(-8) in the DISCOVERY step) and one between TNKS (tankyrase, TRF1-interacting ankyrin-related ADP-ribose polymerase gene) and MSRA (methionine sulfoxide reductase A gene; p = 4.84 x 10(-7)), the latter finding being limited to children and adolescents as demonstrated in the GENERALIZATION step. The odds ratios for early-onset obesity were estimated at ~1.10 per risk allele for both loci. Interestingly, the TNKS/MSRA locus has recently been found to be associated with adult waist circumference. In summary, we have completed a meta-analysis of two GWAS which both focus on extremely obese children and adolescents and replicated our findings in a large follow-up data set. We observed that genetic variants in or near FTO, MC4R, TMEM18, SDCCAG8, and TNKS/MSRA were robustly associated with early-onset obesity. 
We conclude that the currently known major common variants related to obesity overlap to a substantial degree between children and adults.
Circulating levels of adiponectin, a hormone produced predominantly by adipocytes, are highly heritable and are inversely associated with type 2 diabetes mellitus (T2D) and other metabolic traits. We conducted a meta-analysis of genome-wide association studies in 39,883 individuals of European ancestry to identify genes associated with metabolic disease. We identified 8 novel loci associated with adiponectin levels and confirmed 2 previously reported loci (P=4.5 x 10(-8)-1.2 x 10(-43)). Using a novel method to combine data across ethnicities (N = 4,232 African Americans, N = 1,776 Asians, and N = 29,347 Europeans), we identified two additional novel loci. Expression analyses of 436 human adipocyte samples revealed that mRNA levels of 18 genes at candidate regions were associated with adiponectin concentrations after accounting for multiple testing (p
Finnish samples have been extensively utilized in studying single-gene disorders, where the founder effect has clearly aided in discovery, and more recently in genome-wide association studies of complex traits, where the founder effect has had less obvious impacts. As the field starts to explore rare variants' contribution to polygenic traits, it is of great importance to characterize and confirm the Finnish founder effect in sequencing data and to assess its implications for rare-variant association studies. Here, we employ forward simulation, guided by empirical deep resequencing data, to model the genetic architecture of quantitative polygenic traits in both the general European and the Finnish populations simultaneously. We demonstrate that power of rare-variant association tests is higher in the Finnish population, especially when variants' phenotypic effects are tightly coupled with fitness effects and therefore reflect a greater contribution of rarer variants. SKAT-O, variable-threshold tests, and single-variant tests are more powerful than other rare-variant methods in the Finnish population across a range of genetic models. We also compare the relative power and efficiency of exome array genotyping to those of high-coverage exome sequencing. At a fixed cost, less expensive genotyping strategies have far greater power than sequencing; in a fixed number of samples, however, genotyping arrays miss a substantial portion of genetic signals detected in sequencing, even in the Finnish founder population. As genetic studies probe sequence variation at greater depth in more diverse populations, our simulation approach provides a framework for evaluating various study designs for gene discovery.
Through genome-wide association meta-analyses of up to 133,010 individuals of European ancestry without diabetes, including individuals newly genotyped using the Metabochip, we have increased the number of confirmed loci influencing glycemic traits to 53, of which 33 also increase type 2 diabetes risk (q < 0.05). Loci influencing fasting insulin concentration showed association with lipid levels and fat distribution, suggesting impact on insulin resistance. Gene-based analyses identified further biologically plausible loci, suggesting that additional loci beyond those reaching genome-wide significance are likely to represent real associations. This conclusion is supported by an excess of directionally consistent and nominally significant signals between discovery and follow-up studies. Functional analysis of these newly discovered loci will further improve our understanding of glycemic control.
Genome wide association studies (GWAS) for fasting glucose (FG) and insulin (FI) have identified common variant signals which explain 4.8% and 1.2% of trait variance, respectively. It is hypothesized that low-frequency and rare variants could contribute substantially to unexplained genetic variance. To test this, we analyzed exome-array data from up to 33,231 non-diabetic individuals of European ancestry. We found exome-wide significant (P
Platelets are the second most abundant cell type in blood and are essential for maintaining haemostasis. Their count and volume are tightly controlled within narrow physiological ranges, but there is only limited understanding of the molecular processes controlling both traits. Here we carried out a high-powered meta-analysis of genome-wide association studies (GWAS) in up to 66,867 individuals of European ancestry, followed by extensive biological and functional assessment. We identified 68 genomic loci reliably associated with platelet count and volume mapping to established and putative novel regulators of megakaryopoiesis and platelet formation. These genes show megakaryocyte-specific gene expression patterns and extensive network connectivity. Using gene silencing in Danio rerio and Drosophila melanogaster, we identified 11 of the genes as novel regulators of blood cell formation. Taken together, our findings advance understanding of novel gene functions controlling fate-determining events during megakaryopoiesis and platelet formation, providing a new example of successful translation of GWAS to function.
White blood cell (WBC) count is a common clinical measure used as a predictor of certain aspects of human health, including immunity and infection status. WBC count is also a complex trait that varies among individuals and ancestry groups. Differences in linkage disequilibrium structure and heterogeneity in allelic effects are expected to play a role in the associations observed between populations. Prior genome-wide association study (GWAS) meta-analyses have identified genomic loci associated with WBC and its subtypes, but much of the heritability of these phenotypes remains unexplained. Using GWAS summary statistics for over 50 000 individuals from three diverse populations (Japanese, African-American and European ancestry), a Bayesian model methodology was employed to account for heterogeneity between ancestry groups. This approach was used to perform a trans-ethnic meta-analysis of total WBC, neutrophil and monocyte counts. Ten previously known associations were replicated and six new loci were identified, including several regions harboring genes related to inflammation and immune cell function. Ninety-five percent credible interval regions were calculated to narrow the association signals and fine-map the putatively causal variants within loci. Finally, a conditional analysis was performed on the most significant SNPs identified by the trans-ethnic meta-analysis (MA), and nine secondary signals within loci previously associated with WBC or its subtypes were identified. This work illustrates the potential of trans-ethnic analysis and ascribes a critical role to multi-ethnic cohorts and consortia in exploring complex phenotypes with respect to variants that lie outside the European-biased GWAS pool.
Copy number variants (CNVs) account for a major proportion of human genetic polymorphism and have been predicted to have an important role in genetic susceptibility to common disease. To address this we undertook a large, direct genome-wide study of association between CNVs and eight common human diseases. Using a purpose-designed array we typed ~19,000 individuals into distinct copy-number classes at 3,432 polymorphic CNVs, including an estimated ~50% of all common CNVs larger than 500 base pairs. We identified several biological artefacts that lead to false-positive associations, including systematic CNV differences between DNAs derived from blood and cell lines. Association testing and follow-up replication analyses confirmed three loci where CNVs were associated with disease: IRGM for Crohn's disease, HLA for Crohn's disease, rheumatoid arthritis and type 1 diabetes, and TSPAN8 for type 2 diabetes. In each case, however, the locus had previously been identified in single nucleotide polymorphism (SNP)-based studies, reflecting our observation that most common CNVs that are well-typed on our array are well tagged by SNPs and so have been indirectly explored through SNP studies. We conclude that common CNVs that can be typed on existing platforms are unlikely to contribute greatly to the genetic basis of common human diseases.
Waist-hip ratio (WHR) is a measure of body fat distribution and a predictor of metabolic consequences independent of overall adiposity. WHR is heritable, but few genetic variants influencing this trait have been identified. We conducted a meta-analysis of 32 genome-wide association studies for WHR adjusted for body mass index (comprising up to 77,167 participants), following up 16 loci in an additional 29 studies (comprising up to 113,636 subjects). We identified 13 new loci in or near RSPO3, VEGFA, TBX15-WARS2, NFE2L3, GRB14, DNM3-PIGC, ITPR2-SSPN, LY86, HOXC13, ADAMTS9, ZNRF3-KREMEN1, NISCH-STAB1 and CPEB4 (P = 1.9 x 10(-9) to P = 1.8 x 10(-40)) and the known signal at LYPLAL1. Seven of these loci exhibited marked sexual dimorphism, all with a stronger effect on WHR in women than men (P for sex difference = 1.9 x 10(-3) to P = 1.2 x 10(-13)). These findings provide evidence for multiple loci that modulate body fat distribution independent of overall adiposity and reveal strong gene-by-sex interactions.
To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1-5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ∼82 K Europeans via the exome chip, and ∼90% of low-frequency non-coding variants in ∼44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D. © The Author(s) 2017.
We performed fine mapping of 39 established type 2 diabetes (T2D) loci in 27,206 cases and 57,574 controls of European ancestry. We identified 49 distinct association signals at these loci, including five mapping in or near KCNQ1. 'Credible sets' of the variants most likely to drive each distinct signal mapped predominantly to noncoding sequence, implying that association with T2D is mediated through gene regulation. Credible set variants were enriched for overlap with FOXA2 chromatin immunoprecipitation binding sites in human islet and liver cells, including at MTNR1B, where fine mapping implicated rs10830963 as driving T2D association. We confirmed that the T2D risk allele for this SNP increases FOXA2-bound enhancer activity in islet- and liver-derived cells. We observed allele-specific differences in NEUROD1 binding in islet-derived cells, consistent with evidence that the T2D risk allele increases islet MTNR1B expression. Our study demonstrates how integration of genetic and genomic information can define molecular mechanisms through which variants underlying association signals exert their effects on disease.
[This corrects the article DOI: 10.1371/journal.pgen.1006528.].
To identify previously unknown genetic loci associated with fasting glucose concentrations, we examined the leading association signals in ten genome-wide association scans involving a total of 36,610 individuals of European descent. Variants in the gene encoding melatonin receptor 1B (MTNR1B) were consistently associated with fasting glucose across all ten studies. The strongest signal was observed at rs10830963, where each G allele (frequency 0.30 in HapMap CEU) was associated with an increase of 0.07 (95% CI = 0.06-0.08) mmol/l in fasting glucose levels (P = 3.2 x 10(-50)) and reduced beta-cell function as measured by homeostasis model assessment (HOMA-B, P = 1.1 x 10(-15)). The same allele was associated with an increased risk of type 2 diabetes (odds ratio = 1.09 (1.05-1.12), per G allele P = 3.3 x 10(-7)) in a meta-analysis of 13 case-control studies totaling 18,236 cases and 64,453 controls. Our analyses also confirm previous associations of fasting glucose with variants at the G6PC2 (rs560887, P = 1.1 x 10(-57)) and GCK (rs4607517, P = 1.0 x 10(-25)) loci.
Genome-wide association studies (GWAS) have found few common variants that influence fasting measures of insulin sensitivity. We hypothesized that a GWAS of an integrated assessment of fasting and dynamic measures of insulin sensitivity would detect novel common variants. We performed a GWAS of the modified Stumvoll Insulin Sensitivity Index (ISI) within the Meta-Analyses of Glucose and Insulin-Related Traits Consortium. Discovery for genetic association was performed in 16,753 individuals, and replication was attempted for the 23 most significant novel loci in 13,354 independent individuals. Association with ISI was tested in models adjusted for age, sex, and BMI and in a model analyzing the combined influence of the genotype effect adjusted for BMI and the interaction effect between the genotype and BMI on ISI (model 3). In model 3, three variants reached genome-wide significance: rs13422522 (NYAP2; P = 8.87 x 10(-11)), rs12454712 (BCL2; P = 2.7 x 10(-8)), and rs10506418 (FAM19A2; P = 1.9 x 10(-8)). The association at NYAP2 was eliminated by conditioning on the known IRS1 insulin sensitivity locus; the BCL2 and FAM19A2 associations were independent of known cardiometabolic loci. In conclusion, we identified two novel loci and replicated known variants associated with insulin sensitivity. Further studies are needed to clarify the causal variant and function at the BCL2 and FAM19A2 loci.
Genome-wide association studies have revealed numerous risk loci associated with diverse diseases. However, identification of disease-causing variants within association loci remains a major challenge. Divergence in gene expression due to cis-regulatory variants in noncoding regions is central to disease susceptibility. We show that integrative computational analysis of phylogenetic conservation with a complexity assessment of co-occurring transcription factor binding sites (TFBS) can identify cis-regulatory variants and elucidate their mechanistic role in disease. Analysis of established type 2 diabetes risk loci revealed a striking clustering of distinct homeobox TFBS. We identified the PRRX1 homeobox factor as a repressor of PPARG2 expression in adipose cells and demonstrate its adverse effect on lipid metabolism and systemic insulin sensitivity, dependent on the rs4684847 risk allele that triggers PRRX1 binding. Thus, cross-species conservation analysis at the level of co-occurring TFBS provides a valuable contribution to the translation of genetic association signals to disease-related molecular mechanisms.
Obesity is globally prevalent and highly heritable, but its underlying genetic factors remain largely elusive. To identify genetic loci for obesity susceptibility, we examined associations between body mass index and ~2.8 million SNPs in up to 123,865 individuals with targeted follow-up of 42 SNPs in up to 125,931 additional individuals. We confirmed 14 known obesity susceptibility loci and identified 18 new loci associated with body mass index (P < 5 x 10(-8)), one of which includes a copy number variant near GPRC5B. Some loci (at MC4R, POMC, SH2B1 and BDNF) map near key hypothalamic regulators of energy balance, and one of these loci is near GIPR, an incretin receptor. Furthermore, genes in other newly associated loci may provide new insights into human body weight regulation.
In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 x 10(-9)) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication and overall P-values of 4.5 x 10(-4) to 2.2 x 10(-7). Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general.
To identify genetic loci influencing central obesity and fat distribution, we performed a meta-analysis of 16 genome-wide association studies (GWAS, N = 38,580) informative for adult waist circumference (WC) and waist-hip ratio (WHR). We selected 26 SNPs for follow-up, for which the evidence of association with measures of central adiposity (WC and/or WHR) was strong and disproportionate to that for overall adiposity or height. Follow-up studies in a maximum of 70,689 individuals identified two loci strongly associated with measures of central adiposity; these map near TFAP2B (WC, P = 1.9x10(-11)) and MSRA (WC, P = 8.9x10(-9)). A third locus, near LYPLAL1, was associated with WHR in women only (P = 2.6x10(-8)). The variants near TFAP2B appear to influence central adiposity through an effect on overall obesity/fat-mass, whereas LYPLAL1 displays a strong female-only association with fat distribution. By focusing on anthropometric measures of central obesity and fat distribution, we have identified three loci implicated in the regulation of human adiposity.
Over the past two years, there has been a spectacular change in the capacity to identify common genetic variants that contribute to predisposition to complex multifactorial phenotypes such as type 2 diabetes (T2D). The principal advance has been the ability to undertake surveys of genome-wide association in large study samples. Through these and related efforts, ~20 common variants are now robustly implicated in T2D susceptibility. Current developments, for example in high-throughput resequencing, should help to provide a more comprehensive view of T2D susceptibility in the near future. Although additional investigation is needed to define the causal variants within these novel T2D-susceptibility regions, to understand disease mechanisms and to effect clinical translation, these findings are already highlighting the predominant contribution of defects in pancreatic beta-cell function to the development of T2D.
Twin and family studies indicate that the timing of primary tooth eruption is highly heritable, with estimates typically exceeding 80%. To identify variants involved in primary tooth eruption, we performed a population-based genome-wide association study of age at first tooth and number of teeth using 5998 and 6609 individuals, respectively, from the Avon Longitudinal Study of Parents and Children (ALSPAC) and 5403 individuals from the 1966 Northern Finland Birth Cohort (NFBC1966). We tested 2 446 724 SNPs imputed in both studies. Analyses were controlled for the effect of gestational age, sex and age of measurement. Results from the two studies were combined using fixed effects inverse variance meta-analysis. We identified a total of 15 independent loci, with 10 loci reaching genome-wide significance (P < 5 x 10(-8)) for age at first tooth and 11 loci for number of teeth. Together, these associations explain 6.06% of the variation in age of first tooth and 4.76% of the variation in number of teeth. The identified loci included eight previously unidentified loci, some containing genes known to play a role in tooth and other developmental pathways, including an SNP in the protein-coding region of BMP4 (rs17563, P = 9.080 x 10(-17)). Three of these loci, containing the genes HMGA2, AJUBA and ADK, also showed evidence of association with craniofacial distances, particularly those indexing facial width. Our results suggest that the genome-wide association approach is a powerful strategy for detecting variants involved in tooth eruption, and potentially craniofacial growth and more generally organ development.
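The fixed effects inverse variance meta-analysis mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each study reports an effect estimate and its standard error for the same SNP and allele; the function name and the input numbers are hypothetical and are not taken from the study.

```python
import math

def fixed_effects_meta(betas, ses):
    """Pool per-study effects, weighting each by 1 / SE^2 (inverse variance)."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled_beta / pooled_se
    return pooled_beta, pooled_se, z_score

# Hypothetical estimates for one SNP in two cohorts (illustrative values only).
beta, se, z = fixed_effects_meta(betas=[0.10, 0.14], ses=[0.02, 0.04])
print(f"pooled beta = {beta:.3f}, SE = {se:.4f}, Z = {z:.2f}")
```

The more precise study (smaller standard error) dominates the pooled estimate, which is the defining property of this weighting scheme.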
Multiple genome screens have been performed to identify regions in linkage or association with Multiple Sclerosis (MS, OMIM 126200), but little overlap has been found among them. This may be, in part, due to a low statistical power to detect small genetic effects and to genetic heterogeneity within and among the studied populations. Motivated by these considerations, we studied a very special population, namely that of Nuoro, Sardinia, Italy. This is an isolated, old, and genetically homogeneous population with high prevalence of MS. Our study sample includes both nuclear families and unrelated cases and controls. A multi-stage study design was adopted. In the first stage, microsatellites were typed in the 17q11.2 region, previously independently found to be in linkage with MS. One significant association was found at microsatellite D17S798. Next, a bioinformatic screening of the region surrounding this marker highlighted an interesting candidate MS susceptibility gene: the Amiloride-sensitive Cation Channel Neuronal 1 (ACCN1) gene. In the second stage of the study, we resequenced the exons and the 3′ untranslated region (UTR) of ACCN1, and investigated the MS association of Single Nucleotide Polymorphisms (SNPs) identified in that region. For this purpose, we developed a method of analysis where complete, phase-solved, posterior-weighted haplotype assignments are imputed for each study individual from incomplete, multi-locus, genotyping data. The imputed assignments provide an input to a number of proposed procedures for testing association at a microsatellite level or of a sequence of SNPs. These include a Mantel-Haenszel type test based on expected frequencies of pseudocase/pseudocontrol haplotypes, as well as permutation based tests, including a combination of permutation and weighted logistic regression analysis.
Application of these methods allowed us to find a significant association between MS and the SNP rs28936 located in the 3′ UTR segment of ACCN1 with p = 0.0004 (p = 0.002 after adjusting for multiple testing). This result is in tune with several recent experimental findings which suggest that ACCN1 may play an important role in the pathogenesis of MS.
The association between common variants in the FTO gene with weight, adiposity and body mass index (BMI) has now been widely replicated. Although the causal variant has yet to be identified, it most likely maps within a 47 kb region of intron 1 of FTO. We performed a genome-wide association study in the Sorbian population and evaluated the relationships between FTO variants and BMI and fat mass in this isolate of Slavonic origin resident in Germany. In a sample of 948 Sorbs, we could replicate the earlier reported associations of intron 1 SNPs with BMI (eg, P-value=0.003, beta=0.02 for rs8050136). However, using genome-wide association data, we also detected a second independent signal mapping to a region in intron 2/3 about 40-60 kb away from the originally reported SNPs (eg, for rs17818902 association with BMI P-value=0.0006, beta=-0.03 and with fat mass P-value=0.0018, beta=-0.079). Both signals remain independently associated in the conditioned analyses. In conclusion, we extend the evidence that FTO variants are associated with BMI by putatively identifying a second susceptibility allele independent of that described earlier. Although further statistical analysis of these findings is hampered by the finite size of the Sorbian isolate, these findings should encourage other groups to seek alternative susceptibility variants within FTO (and other established susceptibility loci) using the opportunities afforded by analyses in populations with divergent mutational and/or demographic histories. European Journal of Human Genetics (2010) 18, 104-110; doi:10.1038/ejhg.2009.107; published online 8 July 2009
Type 2 diabetes (T2D) is a global health burden that will benefit from personalised risk prediction. We aimed to identify longitudinal predictors of glycaemic traits relevant for T2D by applying machine learning (ML) to multi-omics data from the Northern Finland Birth Cohort 1966 at 31 (T1) and 46 (T2) years old. We predicted fasting glucose/insulin (FG/FI), glycated haemoglobin (HbA1c) and 2-hour glucose/insulin from oral glucose tolerance test (2hGlu/2hIns) at T2 in 595 individuals from 1,010 variables at T1 and T2: body-mass-index (BMI), waist-hip-ratio, sex; nine blood plasma measurements; 454 NMR-based metabolites (228 at T1 and 226 at T2); 542 methylation probes established for BMI/FG/FI/HbA1c/T2D/2hGlu/2hIns (277 at T1 and 264 at T2). Metabolic and methylation data were used in their raw form (Mb-R, Mh-R) or in scores (Mb-S, Mh-S). We used six ML approaches: random forest (RF), boosted trees (BT) and support vector regression (SVR) with four kernels (linear, linear with L2 regularization, polynomial and radial-basis function). RF and BT showed consistent performance while most SVRs struggled with high-dimensional data. The predictions worked best for FG and FI (average R2 values of six ML models: 0.47 and 0.30 for Mb-S). With Mb-S/Mb-R data, sex, branched-chain and aromatic amino acids, HDL-cholesterol, VLDL, glycoprotein acetyls, glycerol, ketone bodies at T2 and measurements of obesity already at T1 were amongst the top predictors. Addition of methylation data did not improve the predictions (P>0.3, model comparison); however, 15/17 markers were amongst the top 25 predictors of FI/FG when using Mb-S+Mh-R data. With ML we could narrow down hundreds of variables into a clinically relevant set of predictors and demonstrate the importance of longitudinal changes in prediction.
Whether loci that influence fasting glucose (FG) and fasting insulin (FI) levels, as identified by genome-wide association studies, modify associations of diet with FG or FI is unknown. We utilized data from 15 US and European cohort studies comprising 51,289 persons without diabetes to test whether genotype and diet interact to influence FG or FI concentration. We constructed a diet score using study-specific quartile rankings for intakes of whole grains, fish, fruits, vegetables, and nuts/seeds (favorable) and red/processed meats, sweets, sugared beverages, and fried potatoes (unfavorable). We used linear regression within studies, followed by inverse-variance-weighted meta-analysis, to quantify 1) associations of diet score with FG and FI levels and 2) interactions of diet score with 16 FG-associated loci and 2 FI-associated loci. Diet score (per unit increase) was inversely associated with FG (-0.004 mmol/L, 95% confidence interval: -0.005, -0.003) and FI (-0.008 ln-pmol/L, 95% confidence interval: -0.009, -0.007) levels after adjustment for demographic factors, lifestyle, and body mass index. Genotype variation at the studied loci did not modify these associations. Healthier diets were associated with lower FG and FI concentrations regardless of genotype at previously replicated FG- and FI-associated loci. Studies focusing on genomic regions that do not yield highly statistically significant associations from main-effect genome-wide association studies may be more fruitful in identifying diet-gene interactions.
By combining genome-wide association data from 8,130 individuals with type 2 diabetes (T2D) and 38,987 controls of European descent and following up previously unidentified meta-analysis signals in a further 34,412 cases and 59,925 controls, we identified 12 new T2D association signals with combined P < 5 x 10(-8). These include a second independent signal at the KCNQ1 locus; the first report, to our knowledge, of an X-chromosomal association (near DUSP9); and a further instance of overlap between loci implicated in monogenic and multifactorial forms of diabetes (at HNF1A). The identified loci affect both beta-cell function and insulin action, and, overall, T2D association signals show evidence of enrichment for genes involved in cell cycle regulation. We also show that a high proportion of T2D susceptibility loci harbor independent association signals influencing apparently unrelated complex traits.
Background: Recent genome-wide association studies (GWAS) have identified novel loci associated with sudden cardiac death (SCD). Despite this progress, identified DNA variants account for a relatively small portion of overall SCD risk, suggesting that additional loci contributing to SCD susceptibility await discovery. The objective of this study was to identify novel DNA variation associated with SCD in the context of coronary artery disease (CAD). Methods and Findings: Using the MetaboChip custom array we conducted a case-control association analysis of 119,117 SNPs in 948 SCD cases (with underlying CAD) from the Oregon Sudden Unexpected Death Study (Oregon-SUDS) and 3,050 controls with CAD from the Wellcome Trust Case-Control Consortium (WTCCC). Two newly identified loci were significantly associated with increased risk of SCD after correction for multiple comparisons at: rs6730157 in the RAB3GAP1 gene on chromosome 2 (P = 4.93 x 10(-12), OR = 1.60) and rs2077316 in the ZNF365 gene on chromosome 10 (P = 3.64 x 10(-8), OR = 2.41). Conclusions: Our findings suggest that RAB3GAP1 and ZNF365 are relevant candidate genes for SCD and will contribute to the mechanistic understanding of SCD susceptibility.
To further investigate susceptibility loci identified by genome-wide association studies, we genotyped 5,500 SNPs across 14 associated regions in 8,000 samples from a control group and 3 diseases: type 2 diabetes (T2D), coronary artery disease (CAD) and Graves' disease. We defined, using Bayes theorem, credible sets of SNPs that were 95% likely, based on posterior probability, to contain the causal disease-associated SNPs. In 3 of the 14 regions, TCF7L2 (T2D), CTLA4 (Graves' disease) and CDKN2A-CDKN2B (T2D), much of the posterior probability rested on a single SNP, and, in 4 other regions (CDKN2A-CDKN2B (CAD) and CDKAL1, FTO and HHEX (T2D)), the 95% sets were small, thereby excluding most SNPs as potentially causal. Very few SNPs in our credible sets had annotated functions, illustrating the limitations in understanding the mechanisms underlying susceptibility to common diseases. Our results also show the value of more detailed mapping to target sequences for functional studies.
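The idea of a 95% credible set described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: exactly one causal SNP per region, equal prior probability for every SNP, and a Bayes factor already computed for each SNP. The function name, SNP labels and Bayes factor values are hypothetical and do not come from the study.

```python
def credible_set(bayes_factors, coverage=0.95):
    """Return the smallest set of SNPs whose posterior probabilities sum to >= coverage.

    With equal priors and a single causal SNP, Bayes' theorem gives each SNP a
    posterior probability proportional to its Bayes factor.
    """
    total = sum(bayes_factors.values())
    posteriors = {snp: bf / total for snp, bf in bayes_factors.items()}
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)

    selected, cumulative = [], 0.0
    for snp, prob in ranked:
        selected.append(snp)
        cumulative += prob
        if cumulative >= coverage:
            break
    return selected, posteriors

# Hypothetical Bayes factors for four SNPs in one region (illustrative only).
snps, post = credible_set({"snpA": 900.0, "snpB": 60.0, "snpC": 30.0, "snpD": 10.0})
print(snps)
```

When one SNP carries most of the posterior probability, the set is small, which is the situation the abstract reports for regions such as TCF7L2.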
We performed a genome-wide association study (GWAS) and a multistage meta-analysis of type 2 diabetes (T2D) in Punjabi Sikhs from India. Our discovery GWAS in 1,616 individuals (842 case subjects) was followed by in silico replication of the top 513 independent single nucleotide polymorphisms (SNPs) (P < 10(-3)) in Punjabi Sikhs (n = 2,819; 801 case subjects). We further replicated 66 SNPs (P < 10(-4)) through genotyping in a Punjabi Sikh sample (n = 2,894; 1,711 case subjects). On combined meta-analysis in Sikh populations (n = 7,329; 3,354 case subjects), we identified a novel locus in association with T2D at 13q12 represented by a directly genotyped intronic SNP (rs9552911, P = 1.82 x 10(-8)) in the SGCG gene. Next, we undertook in silico replication (stage 2b) of the top 513 signals (P < 10(-3)) in 29,157 non-Sikh South Asians (10,971 case subjects) and de novo genotyping of up to 31 top signals (P < 10(-4)) in 10,817 South Asians (5,157 case subjects) (stage 3b). In combined South Asian meta-analysis, we observed six suggestive associations (P < 10(-5) to < 10(-7)), including SNPs at HMG1L1/CTCFL, PLXNA4, SCAP, and chr5p11. Further evaluation of 31 top SNPs in 33,707 East Asians (16,746 case subjects) (stage 3c) and 47,117 Europeans (8,130 case subjects) (stage 3d), and joint meta-analysis of 128,127 individuals (44,358 case subjects) from 27 multiethnic studies, did not reveal any additional loci, nor was there any evidence of replication for the new variant. Our findings provide new evidence on the presence of a population-specific signal in relation to T2D, which may provide additional insights into T2D pathogenesis.
Background: Obesity is associated with vitamin D deficiency, and both are areas of active public health concern. We explored the causality and direction of the relationship between body mass index (BMI) and 25-hydroxyvitamin D [25(OH)D] using genetic markers as instrumental variables (IVs) in bi-directional Mendelian randomization (MR) analysis. Methods and Findings: We used information from 21 adult cohorts (up to 42,024 participants) with 12 BMI-related SNPs (combined in an allelic score) to produce an instrument for BMI and four SNPs associated with 25(OH)D (combined in two allelic scores, separately for genes encoding its synthesis or metabolism) as an instrument for vitamin D. Regression estimates for the IVs (allele scores) were generated within-study and pooled by meta-analysis to generate summary effects. Associations between vitamin D scores and BMI were confirmed in the Genetic Investigation of Anthropometric Traits (GIANT) consortium (n = 123,864). Each 1 kg/m(2) higher BMI was associated with 1.15% lower 25(OH)D (p = 6.52x10(-27)). The BMI allele score was associated both with BMI (p = 6.30x10(-62)) and 25(OH)D (-0.06% [95% CI -0.10 to -0.02], p = 0.004) in the cohorts that underwent meta-analysis. The two vitamin D allele scores were strongly associated with 25(OH)D (p = 0.57 for both vitamin D scores). Conclusions: On the basis of a bi-directional genetic approach that limits confounding, our study suggests that a higher BMI leads to lower 25(OH)D, while any effects of lower 25(OH)D increasing BMI are likely to be small. Population level interventions to reduce BMI are expected to decrease the prevalence of vitamin D deficiency.
Indians undergoing socioeconomic and lifestyle transitions will be maximally affected by the epidemic of type 2 diabetes (T2D). We conducted a two-stage genome-wide association study of T2D in 12,535 Indians, a less explored but high-risk group. We identified a new type 2 diabetes–associated locus at 2q21, with the lead signal being rs6723108 (odds ratio 1.31; P = 3.32 x 10(-9)). Imputation analysis refined the signal to rs998451 (odds ratio 1.56; P = 6.3 x 10(-12)) within TMEM163, which encodes a probable vesicular transporter in nerve terminals. TMEM163 variants also showed association with decreased fasting plasma insulin and homeostatic model assessment of insulin resistance, indicating a plausible effect through impaired insulin secretion. The 2q21 region also harbors RAB3GAP1 and ACMSD, which are involved in neurologic disorders. Forty-nine of 56 previously reported signals showed consistency in direction with similar effect sizes in Indians and previous studies, and 25 of them were also associated (P < 0.05). Known loci and the newly identified 2q21 locus altogether explained 7.65% of the variance in the risk of T2D in Indians. Our study suggests that common susceptibility variants for T2D are largely the same across populations, but also reveals a population-specific locus and provides further insights into genetic architecture and etiology of T2D.
Genome-wide association (GWA) studies have identified multiple loci at which common variants modestly but reproducibly influence risk of type 2 diabetes (T2D). Established associations to common and rare variants explain only a small proportion of the heritability of T2D. As previously published analyses had limited power to identify variants with modest effects, we carried out meta-analysis of three T2D GWA scans comprising 10,128 individuals of European descent and approximately 2.2 million SNPs (directly genotyped and imputed), followed by replication testing in an independent sample with an effective sample size of up to 53,975. We detected at least six previously unknown loci with robust evidence for association, including the JAZF1 (P = 5.0 x 10(-14)), CDC123-CAMK1D (P = 1.2 x 10(-10)), TSPAN8-LGR5 (P = 1.1 x 10(-9)), THADA (P = 1.1 x 10(-9)), ADAMTS9 (P = 1.2 x 10(-8)) and NOTCH2 (P = 4.1 x 10(-8)) gene regions. Our results illustrate the value of large discovery and follow-up samples for gaining further insights into the inherited basis of T2D.
Coffee is the most commonly used stimulant and caffeine is its main psychoactive ingredient. The heritability of coffee consumption has been estimated at around 50%. We performed a meta-analysis of four genome-wide association studies of coffee consumption among coffee drinkers from Iceland (n = 2680), the Netherlands (n = 2791), the Sorbs Slavonic population isolate in Germany (n = 771) and the USA (n = 369) using both directly genotyped and imputed single nucleotide polymorphisms (SNPs) (2.5 million SNPs). SNPs at the two most significant loci were also genotyped in a sample set from Iceland (n = 2430) and a Danish sample set consisting of pregnant women (n = 1620). Combining all data, two sequence variants were significantly associated with increased coffee consumption: rs2472297-T located between CYP1A1 and CYP1A2 at 15q24 (P = 5.4 x 10(-14)) and rs6968865-T near aryl hydrocarbon receptor (AHR) at 7p21 (P = 2.3 x 10(-11)). An effect of approximately 0.2 cups a day per allele was observed for both SNPs. CYP1A2 is the main caffeine metabolizing enzyme and is also involved in drug metabolism. AHR detects xenobiotics, such as polycyclic aryl hydrocarbons found in roasted coffee, and induces transcription of CYP1A1 and CYP1A2. The association of these SNPs with coffee consumption was present in both smokers and nonsmokers.
The adipocyte-derived protein adiponectin is highly heritable and inversely associated with risk of type 2 diabetes mellitus (T2D) and coronary heart disease (CHD). We meta-analyzed 3 genome-wide association studies for circulating adiponectin levels (n = 8,531) and sought validation of the lead single nucleotide polymorphisms (SNPs) in 5 additional cohorts (n = 6,202). Five SNPs were genome-wide significant in their relationship with adiponectin (P
Meta-analysis of genome-wide association studies (GWASs) has led to the discoveries of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges in the meta-analysis. First, different studies may adopt different gene-level tests, so the results are not compatible. Second, gene-level tests require multivariate statistics (i.e., components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of an insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available.
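The idea of building a gene-level test from single-variant results plus their correlation matrix can be sketched with a simple burden-type statistic. This is an illustrative simplification, not the authors' software: the function `burden_from_summary`, the use of z-scores as input, and the example values are assumptions. Under the null hypothesis the z-scores are jointly normal with correlation matrix R, so the weighted sum w'Z divided by sqrt(w'Rw) is standard normal.

```python
import math


def burden_from_summary(z_scores, corr, weights=None):
    """Burden-type gene-level test built from single-variant z-scores.

    z_scores: per-variant association z-scores within one gene
    corr:     correlation matrix of those z-scores (list of lists), e.g.
              estimated from one study or a reference panel
    weights:  optional per-variant weights on the z-score scale
    """
    m = len(z_scores)
    if weights is None:
        weights = [1.0] * m
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    variance = sum(
        weights[i] * corr[i][j] * weights[j] for i in range(m) for j in range(m)
    )
    if variance <= 0:
        raise ValueError("w'Rw must be positive")
    t = numerator / math.sqrt(variance)
    p = math.erfc(abs(t) / math.sqrt(2))  # two-sided normal p-value
    return t, p


# Hypothetical example: two variants with z = 2 each.
t_indep, p_indep = burden_from_summary([2.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
t_corr, p_corr = burden_from_summary([2.0, 2.0], [[1.0, 1.0], [1.0, 1.0]])
print(t_indep, p_indep)
print(t_corr, p_corr)
```

The two calls show why the correlation matrix matters: the same pair of z-scores gives a stronger gene-level signal when the variants are independent than when they are perfectly correlated and therefore carry the same information twice.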
Recent multi-dimensional approaches to the study of complex disease have revealed powerful insights into how genetic and epigenetic factors may underlie their aetiopathogenesis. We examined genotype-epigenotype interactions in the context of Type 2 Diabetes (T2D), focussing on known regions of genomic susceptibility. We assayed DNA methylation in 60 females, stratified according to disease susceptibility haplotype using previously identified association loci. CpG methylation was assessed using methylated DNA immunoprecipitation on a targeted array (MeDIP-chip) and absolute methylation values were estimated using a Bayesian algorithm (BATMAN). Absolute methylation levels were quantified across LD blocks, and we identified increased DNA methylation on the FTO obesity susceptibility haplotype, tagged by the rs8050136 risk allele A (p = 9.40 x 10(-4), permutation p = 1.0 x 10(-3)). Further analysis across the 46 kb LD block using sliding windows localised the most significant difference to be within a 7.7 kb region (p = 1.13 x 10(-7)). Sequence level analysis, followed by pyrosequencing validation, revealed that the methylation difference was driven by the co-ordinated phase of CpG-creating SNPs across the risk haplotype. This 7.7 kb region of haplotype-specific methylation (HSM), encapsulates a Highly Conserved Non-Coding Element (HCNE) that has previously been validated as a long-range enhancer, supported by the histone H3K4me1 enhancer signature. This study demonstrates that integration of Genome-Wide Association (GWA) SNP and epigenomic DNA methylation data can identify potential novel genotype-epigenotype interactions within disease-associated loci, thus providing a novel route to aid unravelling common complex diseases.
Many common genetic variants identified by genome-wide association studies for complex traits map to genes previously linked to rare inherited Mendelian disorders. A systematic analysis of common single-nucleotide polymorphisms (SNPs) in genes responsible for Mendelian diseases with kidney phenotypes has not been performed. We thus developed a comprehensive database of genes for Mendelian kidney conditions and evaluated the association between common genetic variants within these genes and kidney function in the general population. Using the Online Mendelian Inheritance in Man database, we identified 731 unique disease entries related to specific renal search terms and confirmed a kidney phenotype in 218 of these entries, corresponding to mutations in 258 genes. We interrogated common SNPs (minor allele frequency >5%) within these genes for association with the estimated GFR in 74,354 European-ancestry participants from the CKDGen Consortium. However, the top four candidate SNPs (rs6433115 at LRP2, rs1050700 at TSC1, rs249942 at PALB2, and rs9827843 at ROBO2) did not achieve significance in a stage 2 meta-analysis performed in 56,246 additional independent individuals, indicating that these common SNPs are not associated with estimated GFR. The effect of less common or rare variants in these genes on kidney function in the general population and disease-specific cohorts requires further research.
Glucose levels 2 h after an oral glucose challenge are a clinical measure of glucose tolerance used in the diagnosis of type 2 diabetes. We report a meta-analysis of nine genome-wide association studies (n = 15,234 nondiabetic individuals) and a follow-up of 29 independent loci (n = 6,958-30,620). We identify variants at the GIPR locus associated with 2-h glucose level (rs10423928, beta (s.e.m.) = 0.09 (0.01) mmol/l per A allele, P = 2.0 x 10(-15)). The GIPR A-allele carriers also showed decreased insulin secretion (n = 22,492; insulinogenic index, P = 1.0 x 10(-17); ratio of insulin to glucose area under the curve, P = 1.3 x 10(-16)) and diminished incretin effect (n = 804; P = 4.3 x 10(-4)). We also identified variants at ADCY5 (rs2877716, P = 4.2 x 10(-16)), VPS13C (rs17271305, P = 4.1 x 10(-8)), GCKR (rs1260326, P = 7.1 x 10(-11)) and TCF7L2 (rs7903146, P = 4.2 x 10(-10)) associated with 2-h glucose. Of the three newly implicated loci (GIPR, ADCY5 and VPS13C), only ADCY5 was found to be associated with type 2 diabetes in collaborating studies (n = 35,869 cases, 89,798 controls, OR = 1.12, 95% CI 1.09-1.15, P = 4.8 x 10(-18)).
Introduction The polygenic score (PGS) is a valuable method for assessing the estimated genetic liability to a given outcome or genetic variability contributing to a quantitative trait. While PGSs are widely used for complex traits, their application in uncovering shared genetic predisposition between phenotypes, i.e. when genetic variants influence more than one phenotype, remains limited. Methods We developed an R package, comorbidPGS, which facilitates a systematic evaluation of shared genetic effects among (cor)related phenotypes using PGSs. The comorbidPGS package takes as input a set of Single Nucleotide Polymorphisms (SNPs) along with their established effects on the original phenotype (Po), referred to as Po-PGS. It generates a comprehensive summary of effect(s) of Po-PGS on target phenotype(s) (Pt) with customisable graphical features. Results We applied comorbidPGS to investigate the shared genetic predisposition between phenotypes defining elevated blood pressure (Systolic Blood Pressure, SBP; Diastolic Blood Pressure, DBP; Pulse Pressure, PP) and several cancers (Breast Cancer, BrC; Pancreatic Cancer, PanC; Kidney Cancer, KidC; Prostate Cancer, PrC; Colorectal Cancer, CrC) using the European ancestry UK Biobank individuals and GWAS meta-analyses summary statistics from an independent set of European ancestry individuals. We report a significant association between elevated DBP and the genetic risk of PrC (β (SE)=0.066 (0.017), P-value=9.64×10^(-5)), as well as between CrC PGS and both lower SBP (β (SE)=-0.10 (0.029), P-value=3.83×10^(-4)) and lower DBP (β (SE)=-0.055 (0.017), P-value=1.05×10^(-3)). Our analysis highlights two nominally significant relationships for individuals with genetic predisposition to elevated SBP leading to higher risk of KidC (OR [95%CI]=1.04 [1.0039-1.087], P-value=2.82×10^(-2)) and PrC (OR [95%CI]=1.02 [1.003-1.041], P-value=2.22×10^(-2)).
Conclusion Using comorbidPGS, we underscore mechanistic relationships between blood pressure regulation and susceptibility to three comorbid malignancies. This package offers valuable means to evaluate shared genetic susceptibility between (cor)related phenotypes through polygenic scores.
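As a rough illustration of what a polygenic score is, the sketch below computes a weighted PGS as the sum of per-SNP effect sizes multiplied by effect-allele dosages, then standardises the scores across individuals. It is written in Python for illustration and does not reproduce the comorbidPGS R package or its interface; the function names, SNP identifiers, and effect sizes are all hypothetical.

```python
import statistics


def polygenic_score(dosages, effects):
    """Weighted PGS for one individual.

    dosages: dict SNP id -> effect-allele dosage (0, 1 or 2, or imputed value)
    effects: dict SNP id -> effect size on the original phenotype
    """
    missing = set(effects) - set(dosages)
    if missing:
        raise KeyError(f"missing dosages for: {sorted(missing)}")
    return sum(beta * dosages[snp] for snp, beta in effects.items())


def standardise(scores):
    """Centre and scale scores so effects can be read per standard deviation."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return [(s - mean) / sd for s in scores]


# Hypothetical effect sizes and genotypes.
effects = {"snp1": 0.10, "snp2": -0.20}
people = [
    {"snp1": 2, "snp2": 1},
    {"snp1": 1, "snp2": 0},
    {"snp1": 0, "snp2": 2},
]
raw = [polygenic_score(person, effects) for person in people]
z = standardise(raw)
print(raw, z)
```

The standardised score would then be used as a predictor in a regression on the target phenotype, which is the step that tests for shared genetic predisposition.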
Polygenic risk score analyses on embryos (PGT-P) are being marketed by some private testing companies to parents using in vitro fertilisation as being useful in selecting the embryos that carry the least risk of disease in later life. It appears that at least one child has been born after such a procedure. But the utility of a PRS in this respect is severely limited, and to date, no clinical research has been performed to assess its diagnostic effectiveness in embryos. Patients need to be properly informed on the limitations of this use of PRSs, and a societal debate, focused on what would be considered acceptable with regard to the selection of individual traits, should take place before any further implementation of the technique in this population.
Polycystic ovary syndrome (PCOS) is a very common endocrine condition in women in India. Gut microbiome alterations were shown to be involved in PCOS, yet it is remarkably understudied in Indian women who have a higher incidence of PCOS as compared to other ethnic populations. During the regional PCOS screening program among young women, we recruited 19 drug naive women with PCOS and 20 control women at the Sher-i-Kashmir Institute of Medical Sciences, Kashmir, North India. We profiled the gut microbiome in faecal samples by 16S rRNA sequencing and included 40/58 operational taxonomic units (OTUs) detected in at least 1/3 of the subjects with relative abundance (RA) ≥ 0.1%. We compared the RAs at a family/genus level in PCOS/non-PCOS groups and their correlation with 33 metabolic and hormonal factors, and corrected for multiple testing, while taking the variation in day of menstrual cycle at sample collection, age and BMI into account. Five genera were significantly enriched in PCOS cases: , , and previously reported for PCOS , and confirmed by different statistical models. At the family level, the relative abundance of was enriched, whereas was decreased among cases. We observed increased relative abundance of and with higher fasting blood glucose levels, and and with larger hip, waist circumference, weight, and with lower prolactin levels. We also detected a novel association between and follicle-stimulating hormone levels and between and alkaline phosphatase, independently of the BMI of the participants. Our report supports that there is a relationship between gut microbiome composition and PCOS with links to specific reproductive health metabolic and hormonal predictors in Indian women.
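The OTU inclusion rule described above (present in at least one third of subjects at a relative abundance of at least 0.1%) can be sketched as a simple prevalence filter. This is one reading of the criterion, labelled as an assumption: an OTU counts as detected in a subject when its relative abundance in that subject reaches the threshold. The function name and the toy data are hypothetical.

```python
def filter_otus(abundance, min_ra=0.001, min_prevalence=1 / 3):
    """Keep OTUs detected in a sufficient share of subjects.

    abundance:      dict OTU id -> list of relative abundances (fractions,
                    so 0.001 means 0.1%), one value per subject
    min_ra:         abundance at or above which an OTU counts as detected
    min_prevalence: minimum fraction of subjects in which it must be detected
    """
    kept = []
    for otu, values in abundance.items():
        detected = sum(1 for v in values if v >= min_ra)
        if detected / len(values) >= min_prevalence:
            kept.append(otu)
    return kept


# Toy data for three subjects.
toy = {
    "otu_a": [0.002, 0.0, 0.0],     # detected in 1 of 3 subjects: kept
    "otu_b": [0.0005, 0.0, 0.0],    # never reaches 0.1%: dropped
    "otu_c": [0.01, 0.02, 0.03],    # detected in all subjects: kept
}
kept = filter_otus(toy)
print(kept)
```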
Pancreatic cancer is a rare but fatal form of cancer, the fourth highest in absolute mortality. Known risk factors include obesity, diet, and type 2 diabetes; however, the low incidence rate and interconnection of these factors confound the isolation of individual effects. Here, we use epidemiological analysis of prospective human cohorts and parallel tracking of pancreatic cancer in mice to dissect the effects of obesity, diet, and diabetes on pancreatic cancer. Through longitudinal monitoring and multi-omics analysis in mice, we found distinct effects of protein, sugar, and fat dietary components, with dietary sugars increasing Mad2l1 expression and tumor proliferation. Using epidemiological approaches in humans, we find that dietary sugars give a MAD2L1 genotype-dependent increased susceptibility to pancreatic cancer. The translation of these results to a clinical setting could aid in the identification of the at-risk population for screening and potentially harness dietary modification as a therapeutic measure. • Distinct roles for dietary fat, protein, and sugar on murine pancreatic cancer • Dietary glucose triggers Mad2l1 upregulation and tumor cell proliferation in mice • Gene-diet interaction identifies sugar-MAD2L1 link in human pancreatic cancer • Dietary plant fats were protective in human pancreatic cancer susceptibility. Dooley et al. used parallel analysis of a murine pancreatic cancer model and a human prospective cohort to study the interaction of diet and pancreatic cancer. Both systems identify complex effects with different dietary components, converging on a link between dietary sugar and the cell-cycle checkpoint gene MAD2L1.
Body composition is often altered in psychiatric disorders. Using genome-wide common genetic variation data, we calculate sex-specific genetic correlations amongst body fat %, fat mass, fat-free mass, physical activity, glycemic traits and 17 psychiatric traits (up to N = 217,568). Two patterns emerge: (1) anorexia nervosa, schizophrenia, obsessive-compulsive disorder, and education years are negatively genetically correlated with body fat % and fat-free mass, whereas (2) attention-deficit/hyperactivity disorder (ADHD), alcohol dependence, insomnia, and heavy smoking are positively correlated. Anorexia nervosa shows a stronger genetic correlation with body fat % in females, whereas education years is more strongly correlated with fat mass in males. Education years and ADHD show genetic overlap with childhood obesity. Mendelian randomization identifies schizophrenia, anorexia nervosa, and higher education as causal for decreased fat mass, with higher body fat % possibly being a causal risk factor for ADHD and heavy smoking. These results suggest new possibilities for targeted preventive strategies.
Background: There are various maternal prenatal biopsychosocial (BPS) predictors of birth weight, making it difficult to quantify their cumulative relationship. Methods: We studied two birth cohorts: Northern Finland Birth Cohort 1986 (NFBC1986) born in 1985–1986 and the Generation R Study (from the Netherlands) born in 2002–2006. In NFBC1986, we selected variables depicting BPS exposure in association with birth weight and performed factor analysis to derive latent constructs representing the relationship between these variables. In Generation R, the same factors were generated weighted by loadings of NFBC1986. Factor scores from each factor were then allocated into tertiles and added together to calculate a cumulative BPS score. In all cases, we used regression analyses to explore the relationship with birth weight corrected for sex and gestational age and additionally adjusted for other factors. Results: Factor analysis supported a four-factor structure, labelled closely to represent their characteristics as ‘Factor1-BMI’ (body mass index), ‘Factor2-DBP’ (diastolic blood pressure), ‘Factor3-Socioeconomic-Obstetric-Profile’ and ‘Factor4-Parental-Lifestyle’. In both cohorts, ‘Factor1-BMI’ was positively associated with birth weight, whereas other factors showed negative association. ‘Factor3-Socioeconomic-Obstetric-Profile’ and ‘Factor4-Parental-Lifestyle’ had the greatest effect size, explaining 30% of the variation in birth weight. Associations of the factors with birth weight were largely driven by ‘Factor1-BMI’. Graded decrease in birth weight was observed with increasing cumulative BPS score, jointly evaluating four factors in both cohorts. Conclusion: Our study is a proof of concept for the maternal prenatal BPS hypothesis, highlighting the components' snowball effect on birth weight in two different European birth cohorts.
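The cumulative score described in the Methods (factor scores split into tertiles, then added together) can be sketched as follows. This is a minimal illustration under stated assumptions: tertiles are coded 1 to 3 using each factor's own sample cut points, every factor is scored in the same direction, and the factor names and values are hypothetical. The study's actual coding of direction per factor is not given in the abstract.

```python
import statistics


def tertile_ranks(values):
    """Map each value to tertile 1, 2 or 3 using the sample's own cut points."""
    low_cut, high_cut = statistics.quantiles(values, n=3)
    return [1 + (v > low_cut) + (v > high_cut) for v in values]


def cumulative_score(factor_scores):
    """Sum tertile ranks across factors for each individual.

    factor_scores: dict factor name -> list of factor scores, one per
                   individual, in the same individual order for every factor
    """
    ranks = [tertile_ranks(scores) for scores in factor_scores.values()]
    return [sum(person) for person in zip(*ranks)]


# Hypothetical factor scores for nine individuals.
example = {
    "factor_a": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "factor_b": [9, 8, 7, 6, 5, 4, 3, 2, 1],
}
ranks_a = tertile_ranks(example["factor_a"])
totals = cumulative_score(example)
print(ranks_a, totals)
```

With four factors coded this way the cumulative score ranges from 4 to 12, and birth weight can then be regressed on it to look for a graded relationship.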
Background Genome-wide association studies have captured a large proportion of genetic variation related to type 1 diabetes mellitus (T1D). However, most of these studies are performed in populations of European ancestry and therefore the disease risk estimations can be inaccurate when extrapolated to other world populations. Methods We conducted a case-control study in 1866 individuals from the three major populations of the Republic of Bashkortostan (Russians, Tatars, and Bashkirs) in the Russian Federation, using single-locus and multilocus approaches to identify genetic predictors of T1D. Results We found that LTA rs909253 and TNF rs1800629 polymorphisms were associated with T1D in the group of Tatars. Meta-analysis of the association study results in the three ethnic groups confirmed the association between T1D risk and the LTA rs909253 genetic variant. LTA rs909253 and TNF rs1800629 loci were also featured in combinations most significantly associated with T1D. Conclusion Our findings suggest that LTA rs909253 and TNF rs1800629 polymorphisms are associated with the risk of T1D both independently and in combination with polymorphic markers in other inflammatory genes, and the analysis of multi-allelic combinations provides valuable insight in the study of polygenic traits.
Expression quantitative trait loci (eQTL) can provide a link between disease susceptibility variants discovered by genetic association studies and biology. To date, eQTL mapping studies have been primarily conducted in healthy individuals from population-based cohorts. Genetic effects have been known to be context-specific and vary with changing environmental stimuli. We conducted a transcriptome- and genome-wide eQTL mapping study in a cohort of patients with idiopathic or heritable pulmonary arterial hypertension (PAH) using RNA sequencing (RNAseq) data from whole blood. We sought confirmation from three published population-based eQTL studies, including the GTEx Project, and followed up potentially novel eQTL not observed in the general population. In total, we identified 2314 eQTL of which 90% were cis-acting and 75% were confirmed by at least one of the published studies. While we observed a higher GWAS trait colocalization rate among confirmed eQTL, colocalisation rate of novel eQTL reported for lung-related phenotypes was twice as high as that of confirmed eQTL. Functional enrichment analysis of genes with novel eQTL in PAH highlighted immune-related processes, a suspected contributor to PAH. These potentially novel eQTL specific to or active in PAH could be useful in understanding genetic risk factors for other diseases that share common mechanisms with PAH.
Pulmonary arterial hypertension (PAH) is characterised by pulmonary vascular remodelling causing premature death from right heart failure. Established DNA variants influence PAH risk, but susceptibility from epigenetic changes is unknown. We addressed this through epigenome-wide association study (EWAS), testing 865,848 CpG sites for association with PAH in 429 individuals with PAH and 1226 controls. Three loci, at Cathepsin Z (CTSZ, cg04917472), Conserved oligomeric Golgi complex 6 (COG6, cg27396197), and Zinc Finger Protein 678 (ZNF678, cg03144189), reached epigenome-wide significance (p
Obesity and type 2 diabetes (T2D) are associated with increased risk of pancreatic cancer. Here we assessed the relationship between pancreatic cancer and two distinct measures of obesity, namely total adiposity, using BMI, versus abdominal adiposity, using BMI adjusted waist-to-hip ratio (WHRadjBMI) by utilising polygenic scores (PGS) and Mendelian randomisation (MR) analyses. We constructed z-score weighted PGS for BMI and WHRadjBMI using publicly available data and tested for their association with pancreatic cancer defined in UK Biobank (UKBB). Using publicly available summary statistics, we then performed bi-directional MR analyses between the two obesity traits and pancreatic cancer. PGS(BMI) was significantly (multiple testing-corrected) associated with pancreatic cancer (OR[95%CI] = 1.0804[1.025-1.14], P = 0.0037). The significance of association declined after T2D adjustment (OR[95%CI] = 1.073[1.018-1.13], P = 0.00904). PGS(WHRadjBMI) association with pancreatic cancer was at the margin of statistical significance (OR[95%CI] = 1.047[0.99-1.104], P = 0.086). After T2D adjustment, any suggestive association of PGS(WHRadjBMI) with pancreatic cancer was lost (OR[95%CI] = 1.039[0.99-1.097], P = 0.14). MR analyses showed a nominally significant causal effect of WHRadjBMI on pancreatic cancer (OR[95%CI] = 1.00095[1.00011-1.0018], P = 0.027) but not for BMI on pancreatic cancer. Overall, we show that abdominal adiposity measured using WHRadjBMI may be a more important causal risk factor for pancreatic cancer compared to total adiposity, with T2D being a potential driver of this relationship.
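The odds ratios with 95% confidence intervals reported above are the usual way of presenting a logistic-regression coefficient. As a small worked sketch, the helper below converts a log-odds coefficient and its standard error into an odds ratio and a Wald interval. The function name and the input values are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its SE to an OR with a Wald CI.

    beta: logistic-regression coefficient (log odds per unit of the predictor)
    se:   standard error of beta
    z:    normal quantile for the interval (1.96 gives roughly 95%)
    """
    if se < 0:
        raise ValueError("standard error must be non-negative")
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient: no effect, SE of 0.1 on the log-odds scale.
or_null, lower_null, upper_null = odds_ratio_ci(0.0, 0.1)
print(f"OR = {or_null:.3f} [{lower_null:.3f}-{upper_null:.3f}]")
```

An interval that contains 1, as in this hypothetical case, corresponds to an association that is not significant at the matching level.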
Background The prevalence of depression is higher among those with diabetes than in the general population. The Patient Health Questionnaire (PHQ-9) is commonly used to assess depression in people with diabetes, but measurement invariance of the PHQ-9 across groups of people with and without diabetes has not yet been investigated. Methods Data from three independent cohorts from the USA (n=1,886 with diabetes, n=4,153 without diabetes), Quebec, Canada (n= 800 with diabetes, n= 2,411 without diabetes), and the UK (n=4,981 with diabetes, n=145,570 without diabetes), were used to examine measurement invariance between adults with and without diabetes. A series of multiple group confirmatory factor analyses were performed, with increasingly stringent model constraints applied to assess configural, equal thresholds, and equal thresholds and loadings invariance, respectively. One-factor and two-factor (somatic and cognitive-affective items) models were examined. Results Results demonstrated that the most stringent models, testing equal loadings and thresholds, had satisfactory model fit in the three cohorts for one-factor models (RMSEA = .063 or below and CFI = .978 or above) and two-factor models (RMSEA = .042 or below and CFI = .989 or above). Limitations Data were from Western countries only and we could not distinguish between type of diabetes. Conclusions Results provide support for measurement invariance between groups of people with and without diabetes, using either a one-factor or a two-factor model. While the two-factor solution has a slightly better fit, the one-factor solution is more parsimonious. Depending on research or clinical needs, both factor structures can be used.
Conventional measurements of fasting and postprandial blood glucose levels investigated in genome-wide association studies (GWAS) cannot capture the effects of DNA variability on 'around the clock' glucoregulatory processes. Here we show that GWAS meta-analysis of glucose measurements under nonstandardized conditions (random glucose (RG)) in 476,326 individuals of diverse ancestries and without diabetes enables locus discovery and innovative pathophysiological observations. We discovered 120 RG loci represented by 150 distinct signals, including 13 with sex-dimorphic effects, two cross-ancestry and seven rare frequency signals. Of these, 44 loci are new for glycemic traits. Regulatory, glycosylation and metagenomic annotations highlight ileum and colon tissues, indicating an underappreciated role of the gastrointestinal tract in controlling blood glucose. Functional follow-up and molecular dynamics simulations of lower frequency coding variants in glucagon-like peptide-1 receptor (GLP1R), a type 2 diabetes treatment target, reveal that optimal selection of GLP-1R agonist therapy will benefit from tailored genetic stratification. We also provide evidence from Mendelian randomization that lung function is modulated by blood glucose and that pulmonary dysfunction is a diabetes complication. Our investigation yields new insights into the biology of glucose regulation, diabetes complications and pathways for treatment stratification.
Introduction The role of TOMM40-APOE 19q13.3 region variants is well documented in Alzheimer's disease (AD) but remains contentious in dementia with Lewy bodies (DLB) and Parkinson's disease dementia (PDD). Methods We dissected genetic profiles within the TOMM40-APOE region in 451 individuals from four European brain banks, including DLB and PDD cases with/without neuropathological evidence of AD-related pathology and healthy controls. Results TOMM40-L/APOE-ε4 alleles were associated with DLB (OR(TOMM40-L) = 3.61; P value = 3.23 × 10^(-9); OR(APOE-ε4) = 3.75; P value = 4.90 × 10^(-10)) and earlier age at onset of DLB (HR(TOMM40-L) = 1.33, P value = .031; HR(APOE-ε4) = 1.46, P value = .004), but not with PDD. The TOMM40-L/APOE-ε4 effect was most pronounced in DLB individuals with concomitant AD pathology (OR(TOMM40-L) = 4.40, P value = 1.15 × 10^(-6); OR(APOE-ε4) = 5.65, P value = 2.97 × 10^(-8)) but was not significant in DLB without AD. Meta-analyses combining all APOE-ε4 data in DLB confirmed our findings (OR(DLB) = 2.93, P value = 3.78 × 10^(-99); OR(DLB+AD) = 5.36, P value = 1.56 × 10^(-47)). Discussion APOE-ε4/TOMM40-L alleles increase susceptibility and risk of earlier DLB onset, an effect explained by concomitant AD-related pathology. These findings have important implications in future drug discovery and development efforts in DLB.
We assessed the predictive ability of a combined genetic variant panel for the risk of recurrent pregnancy loss (RPL) through a case-control study. Our study sample was from Ukraine and included 114 cases with idiopathic RPL and 106 controls without any pregnancy losses/complications and with at least one healthy child. We genotyped variants within 12 genetic loci reflecting the main biological pathways involved in pregnancy maintenance: blood coagulation (F2, F5, F7, GP1A), hormonal regulation (ESR1, ADRB2), endometrium and placental function (ENOS, ACE), folate metabolism (MTHFR) and inflammatory response (IL6, IL8, IL10). We showed that a genetic risk score (GRS) calculated from the 12 variants was associated with an increased risk of RPL (odds ratio 1.56, 95% CI: 1.21, 2.04, p = 8.7 × 10^(-4)). The receiver operator characteristic (ROC) analysis resulted in an area under the curve (AUC) of 0.64 (95% CI: 0.57, 0.72), indicating an improved ability of the GRS to classify women with and without RPL. Implementation of the GRS approach can help define women at higher risk of complex multifactorial conditions such as RPL. Future well-powered genome-wide association studies will help in dissecting biological pathways previously unknown for RPL and further improve the identification of women with RPL susceptibility.
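To make the GRS and AUC steps concrete, the sketch below builds an unweighted risk score by counting risk alleles and computes the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. Whether the study used a weighted or an unweighted score is not stated in the abstract, so the unweighted form, the function names, and the genotype data are assumptions for illustration only.

```python
def risk_allele_count(genotypes):
    """Unweighted genetic risk score: total risk alleles across variants.

    genotypes: list of risk-allele counts (0, 1 or 2), one per variant
    """
    return sum(genotypes)


def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise case-control comparison."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute half a concordant pair
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical genotypes at three variants.
cases = [risk_allele_count(g) for g in ([2, 1, 1], [1, 1, 1])]
controls = [risk_allele_count(g) for g in ([0, 1, 0], [1, 1, 0])]
auc_separated = auc(cases, controls)
auc_chance = auc([1, 2], [1, 2])
print(auc_separated, auc_chance)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which gives a scale for reading the reported value of 0.64.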
Pubertal growth patterns correlate with future health outcomes. However, the genetic mechanisms mediating growth trajectories remain largely unknown. Here, we modeled longitudinal height growth with Super-Imposition by Translation And Rotation (SITAR) growth curve analysis on ~ 56,000 trans-ancestry samples with repeated height measurements from age 5 years to adulthood. We performed genetic analysis on six phenotypes representing the magnitude, timing, and intensity of the pubertal growth spurt. To investigate the lifelong impact of genetic variants associated with pubertal growth trajectories, we performed genetic correlation analyses and phenome-wide association studies in the Penn Medicine BioBank and the UK Biobank. Large-scale growth modeling enables an unprecedented view of adolescent growth across contemporary and 20th-century pediatric cohorts. We identify 26 genome-wide significant loci and leverage trans-ancestry data to perform fine-mapping. Our data reveals genetic relationships between pediatric height growth and health across the life course, with different growth trajectories correlated with different outcomes. For instance, a faster tempo of pubertal growth correlates with higher bone mineral density, HOMA-IR, fasting insulin, type 2 diabetes, and lung cancer, whereas being taller at early puberty, taller across puberty, and having quicker pubertal growth were associated with higher risk for atrial fibrillation. We report novel genetic associations with the tempo of pubertal growth and find that genetic determinants of growth are correlated with reproductive, glycemic, respiratory, and cardiac traits in adulthood. These results aid in identifying specific growth trajectories impacting lifelong health and show that there may not be a single "optimal" pubertal growth pattern.
The risk of depression could be evaluated through its multifactorial nature using the polygenic score (PGS) approach. Assuming a "clinical continuum" hypothesis of mental diseases, a preliminary assessment of individuals with elevated risk for developing depression in a non-clinical group is of high relevance. In turn, epidemiological studies suggest including social/lifestyle factors together with PGS to address the "missing heritability" problem. We designed regression models, which included PGS using 27 SNPs and social/lifestyle factors to explain individual differences in depression levels in high-education students from the Volga-Ural region (VUR) of Eurasia. Since issues related to population stratification in PGS scores may lead to imprecise variant effect estimates, we aimed to examine the sensitivity of PGS calculated on summary statistics of depression and neuroticism GWAS from Western Europeans to assess individual proneness to depression levels in the examined sample of Eastern Europeans. A depression score was assessed using the revised version of the Beck Depression Inventory (BDI) in 1065 young adults (age 18-25 years, 79% women, Eastern European ancestry). The models based on weighted PGS demonstrated higher sensitivity to evaluate depression level in the full dataset, explaining up to 2.4% of the variance (p = 3.42 x 10(-7)); the addition of social parameters enhanced the strength of the model (adjusted r(2) = 15%, p < 2.2 x 10(-16)). A higher effect was observed in models based on weighted PGS in the women group, explaining up to 3.9% (p = 6.03 x 10(-9)) of variance in depression level assuming a combined SNPs effect and 17% (p < 2.2 x 10(-16)) with the addition of social factors in the model. We failed to estimate BDI-measured depression based on summary statistics from Western Europeans GWAS of clinical depression.
Although regression models based on PGS from neuroticism (depression-related trait) GWAS in Europeans were associated with a depression level in our sample (adjusted r(2) = 0.43%, p = 0.019-for unweighted model), the effect was mainly attributed to the inclusion of social/lifestyle factors as predictors in these models (adjusted r(2) = 15%, p < 2.2 x 10(-16)-for unweighted model). In conclusion, constructed PGS models contribute to a proportion of interindividual variability in BDI-measured depression in high-education students, especially women, from the VUR of Eurasia. External factors, including the specificity of rearing in childhood, used as predictors, improve the predictive ability of these models. Implementation of ethnicity-specific effect estimates in such modeling is important for individual risk assessment.
Polygenic scores (PGSs) are developed to help clinicians distinguish individuals at high risk of developing disease outcomes from the general population. Clear cell renal cell carcinoma (ccRCC) is a complex disorder that involves numerous biological pathways, one of the most important of which is responsible for the microRNA biogenesis machinery. Here, we defined the biological-pathway-specific PGS in a case-control study of ccRCC in the Volga-Ural region of the Eurasian continent. We evaluated 28 DNA SNP variants, located in microRNA biogenesis genes, in 464 individuals with clinically diagnosed ccRCC and 1042 individuals without the disease. Individual genetic risks were defined using the SNP-variant effects derived from the ccRCC association analysis. The final weighted and unweighted PGS models were based on 21 SNPs, and 7 SNPs were excluded due to high LD. In our dataset, microRNA-machinery-weighted PGS revealed 1.69-fold higher odds (95% CI [1.51-1.91]) for ccRCC risk in individuals with ccRCC compared with controls, with a p-value of 2.0 × 10⁻¹⁶. The microRNA biogenesis pathway weighted PGS predicted the risk of ccRCC with an area under the curve (AUC) = 0.642 (95% CI [0.61-0.67]). Our findings indicate that DNA variants of microRNA machinery genes modulate the risk of ccRCC in Volga-Ural populations. Moreover, larger, more powerful genome-wide association studies are needed to reveal a wider range of genetic variants affecting microRNA processing. Biological-pathway-based PGSs will advance the development of innovative screening systems for future stratified medicine approaches in ccRCC.
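The weighted and unweighted scores mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common definition of a PGS as a sum of risk-allele dosages (0, 1 or 2 per SNP), optionally multiplied by per-SNP effect sizes such as log odds ratios. The function name `polygenic_score` and the toy numbers are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence


def polygenic_score(dosages: Sequence[float],
                    weights: Optional[Sequence[float]] = None) -> float:
    """Return a polygenic score for one individual.

    dosages: risk-allele counts per SNP (0, 1 or 2; imputed values may be fractional).
    weights: per-SNP effect sizes (e.g. log odds ratios). If omitted, every SNP
             gets weight 1, which gives the unweighted score (a plain allele count).
    """
    if weights is None:
        weights = [1.0] * len(dosages)
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Toy example (hypothetical values): three SNPs for one individual.
example_dosages = [0, 1, 2]
example_log_or = [0.10, 0.25, -0.05]

unweighted = polygenic_score(example_dosages)
weighted = polygenic_score(example_dosages, example_log_or)
```

In practice the per-individual scores would then be entered into a regression model as a predictor; that step is not shown here.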
Depression is a common comorbidity of type 2 diabetes. We assessed the causal relationships and shared genetics between them. We applied two-sample, bidirectional Mendelian randomization (MR) to assess causality between type 2 diabetes and depression. We investigated potential mediation using two-step MR. To identify shared genetics, we performed 1) genome-wide association studies (GWAS) separately and 2) multiphenotype GWAS (MP-GWAS) of type 2 diabetes (19,344 case subjects, 463,641 control subjects) and depression using major depressive disorder (MDD) (5,262 case subjects, 86,275 control subjects) and self-reported depressive symptoms (n = 153,079) in the UK Biobank. We analyzed expression quantitative trait loci (eQTL) data from public databases to identify target genes in relevant tissues. MR demonstrated a significant causal effect of depression on type 2 diabetes (odds ratio 1.26 [95% CI 1.11-1.44], P = 5.46 × 10⁻⁴) but not in the reverse direction. Mediation analysis indicated that 36.5% (12.4-57.6%, P = 0.0499) of the effect of depression on type 2 diabetes was mediated by BMI. GWAS of type 2 diabetes and depressive symptoms did not identify shared loci. MP-GWAS identified seven shared loci mapped to TCF7L2, CDKAL1, IGF2BP2, SPRY2, CCND2-AS1, IRS1, and CDKN2B-AS1. MDD did not yield any significant association in either GWAS or MP-GWAS. Most MP-GWAS loci had an eQTL, including single nucleotide polymorphisms implicating the cell cycle gene CCND2 in pancreatic islets and brain and the insulin signaling gene IRS1 in adipose tissue, suggesting a multitissue and pleiotropic underlying mechanism. Our results highlight the importance of preventing type 2 diabetes at the onset of depressive symptoms and the need to maintain a healthy weight in the context of its effect on depression and type 2 diabetes comorbidity.
Over 1000 mutations are described in the androgen receptor (AR) gene. Of those, about 600 were found in androgen insensitivity syndrome (AIS) patients, among which 400 mutations affect the ligand-binding domain (LBD) of the AR protein. Recently, we reported a novel missense mutation c.2507T>G I836S (ClinVarID: 974911) in a patient with complete AIS (CAIS) phenotype. In the present study, we applied a set of computational approaches for the structural analysis of the ligand-binding domains in a wild-type and mutant AR to evaluate the functional impact of the novel I836S mutation. We revealed that the novel I836S substitution leads to a shorter existence time of the ligand’s gating tunnel and internal cavity, occurring only in the presence of S836 phosphorylation. Additionally, the analysis of phosphorylation of the 836 mutant residues explained the negative impact on AR homodimerization, since monomer surface changes indirectly impacted the binding site. Our analyses provide evidence that I836S causes disruptions of AR protein functionality and development of CAIS clinical features in patients.
We tested associations between 13 established genetic variants and type 2 diabetes (T2D) in 1371 study participants from the Volga-Ural region of the Eurasian continent, and evaluated the predictive ability of the model containing polygenic scores for the variants associated with T2D in our dataset, alone and in combination with other risk factors such as age and sex. Using logistic regression analysis, we found associations with T2D for the CCL20 rs6749704 (OR = 1.68, P-FDR = 3.40 × 10⁻⁵), CCR5 rs333 (OR = 1.99, P-FDR = 0.033), ADIPOQ rs17366743 (OR = 3.17, P-FDR = 2.64 × 10⁻⁴), TCF7L2 rs114758349 (OR = 1.77, P-FDR = 9.37 × 10⁻⁵), and CCL2 rs1024611 (OR = 1.38, P-FDR = 0.033) polymorphisms. We showed that the most informative prognostic model included weighted polygenic scores for these five loci and non-genetic factors such as age and sex (AUC 85.8%, 95% CI 83.7-87.8%). Compared to the model containing only non-genetic parameters, adding the polygenic score for the five T2D-associated loci showed improved net reclassification (NRI = 37.62%, p = 1.39 × 10⁻⁶). Inclusion of all 13 tested SNPs in the model with age and sex did not improve the predictive ability compared to the model containing five T2D-associated variants (NRI = -17.86, p = 0.093). The five variants associated with T2D in people from the Volga-Ural region are linked to inflammation (CCR5, CCL2, CCL20) and glucose metabolism regulation (TCF7L2, ADIPOQ). Further studies in independent groups of T2D patients should validate the prognostic value of the model and elucidate the molecular mechanisms of disease development.
The current epidemics of cardiovascular and metabolic noncommunicable diseases have emerged alongside dramatic modifications in lifestyle and living environments. These correspond to changes in our “modern” postwar societies globally characterized by rural-to-urban migration, modernization of agricultural practices, and transportation, climate change, and aging. Evidence suggests that these changes are related to each other, although the social and biological mechanisms as well as their interactions have yet to be uncovered. LongITools, as one of the 9 projects included in the European Human Exposome Network, will tackle this environmental health equation linking multidimensional environmental exposures to the occurrence of cardiovascular and metabolic noncommunicable diseases.
Despite the identification of several dozens of common genetic variants associated with Alzheimer’s disease (AD) and Parkinson’s disease (PD), most of the genetic risk remains uncharacterised. Therefore, it is important to understand the role of regulatory elements, such as miRNAs. Dysregulated miRNAs are implicated in AD and PD, with potential value in dissecting the shared pathophysiology between the two disorders. miRNAs relevant to both neurodegenerative diseases are related to axonal guidance, apoptosis, and inflammation; therefore, AD and PD likely arise from similar underlying biological pathway defects. Furthermore, pathways regulated by APP, L1CAM, and genes of the caspase family may represent promising therapeutic miRNA targets in AD and PD since they are targeted by dysregulated miRNAs in both disorders. Systematic reviews and meta-analyses clearly identify sets of miRNAs that are dysregulated in AD and postmortem brain samples from patients with PD. Given the central role of miRNAs in neuronal function and the close link between select miRNAs and key pathological processes in AD and PD, it was proposed that this information could be used to better understand the shared pathobiology of the two disorders. It was suggested that miRNA changes are cell type specific and the shifting balance between different cell populations as neurodegeneration advances may be important when miRNAs are considered as diagnostic or therapeutic targets. Similar evidence in other disease areas, such as cancer, has successfully been applied to develop more effective strategies for early detection and disease-modifying interventions.
Epidemic obesity is the most important risk factor for prediabetes and type 2 diabetes (T2D) in youth as it is in adults. Obesity shares pathophysiological mechanisms with T2D and is likely to share part of the genetic background. We aimed to test if weighted genetic risk scores (GRSs) for T2D, fasting glucose (FG) and fasting insulin (FI) predict glycaemic traits and if there is a causal relationship between obesity and impaired glucose metabolism in children and adolescents. Genotyping of 42 SNPs established by genome-wide association studies for T2D, FG and FI was performed in 1660 Italian youths aged between 2 and 19 years. We defined GRS for T2D, FG and FI and tested their effects on glycaemic traits, including FG, FI, indices of insulin resistance/beta cell function and body mass index (BMI). We evaluated causal relationships between obesity and FG/FI using one-sample Mendelian randomization analyses in both directions. GRS-FG was associated with FG (beta = 0.075 mmol/l, SE = 0.011, P = 1.58 × 10⁻¹¹) and beta cell function (beta = −0.041, SE = 0.0090, P = 5.13 × 10⁻⁶). GRS-T2D also demonstrated an association with beta cell function (beta = −0.020, SE = 0.021, P = 0.030). We detected a causal effect of increased BMI on levels of FI in Italian youths (beta = 0.31 ln (pmol/l), 95% CI [0.078, 0.54], P = 0.0085), while there was no effect of FG/FI levels on BMI. Our results demonstrate that the glycaemic and T2D risk genetic variants contribute to higher FG and FI levels and decreased beta cell function in children and adolescents. The causal effects of adiposity on increased insulin resistance are detectable from childhood age.
Early childhood growth patterns are associated with adult health, yet the genetic factors and the developmental stages involved are not fully understood. Here, we combine genome-wide association studies with modeling of longitudinal growth traits to study the genetics of infant and child growth, followed by functional, pathway, genetic correlation, risk score, and colocalization analyses to determine how developmental timings, molecular pathways, and genetic determinants of these traits overlap with those of adult health. We found a robust overlap between the genetics of child and adult body mass index (BMI), with variants associated with adult BMI acting as early as 4 to 6 years old. However, we demonstrated a completely distinct genetic makeup for peak BMI during infancy, influenced by variation at the LEPR/LEPROT locus. These findings suggest that different genetic factors control infant and child BMI. In light of the obesity epidemic, these findings are important to inform the timing and targets of prevention strategies.
Differences between sexes contribute to variation in the levels of fasting glucose and insulin. Epidemiological studies established a higher prevalence of impaired fasting glucose in men and impaired glucose tolerance in women; however, the genetic component underlying this phenomenon is not established. We assess sex-dimorphic (73,089/50,404 women and 67,506/47,806 men) and sex-combined (151,188/105,056 individuals) fasting glucose/fasting insulin genetic effects via genome-wide association study meta-analyses in individuals of European descent without diabetes. Here we report sex dimorphism in allelic effects on fasting insulin at IRS1 and ZNF12 loci, the latter showing higher RNA expression in whole blood in women compared to men. We also observe sex-homogeneous effects on fasting glucose at seven novel loci. Fasting insulin in women shows stronger genetic correlations than in men with waist-to-hip ratio and anorexia nervosa. Furthermore, waist-to-hip ratio is causally related to insulin resistance in women, but not in men. These results position dissection of metabolic and glycemic health sex dimorphism as a stepping stone for understanding differences in genetic effects between women and men in related phenotypes.
DNA methylation variations are prevalent in human obesity but evidence of a causative role in disease pathogenesis is limited. Here, we combine epigenome-wide association and integrative genomics to investigate the impact of adipocyte DNA methylation variations in human obesity. We discover extensive DNA methylation changes that are robustly associated with obesity (N = 190 samples, 691 loci in subcutaneous and 173 loci in visceral adipocytes, P < 1 × 10⁻⁷). We connect obesity-associated methylation variations to transcriptomic changes at >500 target genes, and identify putative methylation-transcription factor interactions. Through Mendelian Randomisation, we infer causal effects of methylation on obesity and obesity-induced metabolic disturbances at 59 independent loci. Targeted methylation sequencing, CRISPR-activation and gene silencing in adipocytes further identify regional methylation variations, underlying regulatory elements and novel cellular metabolic effects. Our results indicate DNA methylation is an important determinant of human obesity and its metabolic complications, and reveal mechanisms through which altered methylation may impact adipocyte functions. DNA methylation variation is associated with human obesity but whether it plays a causal role in disease pathogenesis is unclear. Here, the authors perform an integrative genomic study in human adipocytes to show that DNA methylation variations contribute to obesity and type 2 diabetes susceptibility, revealing underlying genomic and molecular mechanisms.
Rationale: Idiopathic and heritable pulmonary arterial hypertension (PAH) are rare but comprise a genetically heterogeneous patient group. RNA sequencing linked to the underlying genetic architecture can be used to better understand the underlying pathology by identifying key signaling pathways and stratify patients more robustly according to clinical risk. Objectives: To use a three-stage design of RNA discovery, RNA validation and model construction, and model validation to define a set of PAH-associated RNAs and a single summarizing RNA model score. To define genes most likely to be involved in disease development, we performed Mendelian randomization (MR) analysis. Methods: RNA sequencing was performed on whole-blood samples from 359 patients with idiopathic, heritable, and drug-induced PAH and 72 age- and sex-matched healthy volunteers. The score was evaluated against disease severity markers including survival analysis using all-cause mortality from diagnosis. MR used known expression quantitative trait loci and summary statistics from a PAH genome-wide association study. Measurements and Main Results: We identified 507 genes with differential RNA expression in patients with PAH compared with control subjects. A model of 25 RNAs distinguished PAH with 87% accuracy (area under the curve 95% confidence interval: 0.791–0.945) in model validation. The RNA model score was associated with disease severity and long-term survival (P = 4.66 × 10⁻⁶) in PAH. MR detected an association between SMAD5 levels and PAH disease susceptibility (odds ratio, 0.317; 95% confidence interval, 0.129–0.776; P = 0.012). Conclusions: A whole-blood RNA signature of PAH, which includes RNAs relevant to disease pathogenesis, associates with disease severity and identifies patients with poor clinical outcomes. Genetic variants associated with lower SMAD5 expression may increase susceptibility to PAH.
Genetic studies promise to provide insight into the molecular mechanisms underlying type 2 diabetes (T2D). Variants associated with T2D are often located in tissue-specific enhancer clusters or super-enhancers. So far, such domains have been defined through clustering of enhancers in linear genome maps rather than in three-dimensional (3D) space. Furthermore, their target genes are often unknown. We have created promoter capture Hi-C maps in human pancreatic islets. This linked diabetes-associated enhancers to their target genes, often located hundreds of kilobases away. It also revealed >1,300 groups of islet enhancers, super-enhancers and active promoters that form 3D hubs, some of which show coordinated glucose-dependent activity. We demonstrate that genetic variation in hubs impacts insulin secretion heritability, and show that hub annotations can be used for polygenic scores that predict T2D risk driven by islet regulatory variants. Human islet 3D chromatin architecture, therefore, provides a framework for interpretation of T2D genome-wide association study (GWAS) signals.
Primary immunodeficiency (PID) is characterized by recurrent and often life-threatening infections, autoimmunity and cancer, and it poses major diagnostic and therapeutic challenges. Although the most severe forms of PID are identified in early childhood, most patients present in adulthood, typically with no apparent family history and a variable clinical phenotype of widespread immune dysregulation: about 25% of patients have autoimmune disease, allergy is prevalent and up to 10% develop lymphoid malignancies. Consequently, in sporadic (or non-familial) PID genetic diagnosis is difficult and the role of genetics is not well defined. Here we address these challenges by performing whole-genome sequencing in a large PID cohort of 1,318 participants. An analysis of the coding regions of the genome in 886 index cases of PID found that disease-causing mutations in known genes that are implicated in monogenic PID occurred in 10.3% of these patients, and a Bayesian approach (BeviMed) identified multiple new candidate PID-associated genes, including IVNS1ABP. We also examined the noncoding genome, and found deletions in regulatory regions that contribute to disease causation. In addition, we used a genome-wide association study to identify loci that are associated with PID, and found evidence for the colocalization of, and interplay between, novel high-penetrance monogenic variants and common variants (at the PTPN2 and SOCS1 loci). This begins to explain the contribution of common variants to the variable penetrance and phenotypic complexity that are observed in PID. Thus, using a cohort-based whole-genome-sequencing approach in the diagnosis of PID can increase diagnostic yield and further our understanding of the key pathways that influence immune responsiveness in humans.
The impact of many unfavorable childhood traits or diseases, such as low birth weight and mental disorders, is not limited to childhood and adolescence, as they are also associated with poor outcomes in adulthood, such as cardiovascular disease. Insight into the genetic etiology of childhood and adolescent traits and disorders may therefore provide new perspectives, not only on how to improve wellbeing during childhood, but also how to prevent later adverse outcomes. To achieve the sample sizes required for genetic research, the Early Growth Genetics (EGG) and EArly Genetics and Lifecourse Epidemiology (EAGLE) consortia were established. The majority of the participating cohorts are longitudinal population-based samples, but other cohorts with data on early childhood phenotypes are also involved. Cohorts often have a broad focus and collect(ed) data on various somatic and psychiatric traits as well as environmental factors. Genetic variants have been successfully identified for multiple traits, for example, birth weight, atopic dermatitis, childhood BMI, allergic sensitization, and pubertal growth. Furthermore, the results have shown that genetic factors also partly underlie the association with adult traits. As sample sizes are still increasing, it is expected that future analyses will identify additional variants. This, in combination with the development of innovative statistical methods, will provide detailed insight on the mechanisms underlying the transition from childhood to adult disorders. Both consortia welcome new collaborations. Policies and contact details are available from the corresponding authors of this manuscript and/or the consortium websites.
Crohn's disease (CD) is a complex genetic disorder for which more than 140 genes have been identified using genome-wide association studies (GWAS). However, the genetic architecture of the trait remains largely unknown. The recent development of machine learning (ML) approaches prompted us to apply them to classify healthy and diseased people according to their genomic information. The Immunochip dataset containing 18,227 CD patients and 34,050 healthy controls enrolled and genotyped by the International Inflammatory Bowel Disease Genetics Consortium (IIBDGC) has been re-analyzed using a set of ML methods: penalized logistic regression (LR), gradient boosted trees (GBT) and artificial neural networks (NN). The main score used to compare the methods was the Area Under the ROC Curve (AUC) statistic. Assessing the impact of quality control (QC), imputation and coding methods on LR results showed that QC methods and imputation of missing genotypes may artificially increase the scores. By contrast, neither the patient/control ratio nor marker preselection or coding strategies significantly affected the results. LR methods, including Lasso, Ridge and ElasticNet, provided similar results with a maximum AUC of 0.80. GBT methods like XGBoost, LightGBM and CatBoost, together with dense NN with one or more hidden layers, provided similar AUC values, suggesting limited epistatic effects in the genetic architecture of the trait. ML methods detected nearly all the genetic variants previously identified by GWAS among the best predictors, plus additional predictors with lower effects. The robustness and complementarity of the different methods are also studied. Compared to LR, non-linear models such as GBT or NN may provide robust complementary approaches to identify and classify genetic markers.
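The AUC statistic used as the comparison score above can be read as the probability that a randomly chosen patient receives a higher model score than a randomly chosen control, with ties counted as one half. The following is a minimal sketch of that pairwise definition; the function name `auc_from_scores` and the toy scores are hypothetical, and the code is an illustration rather than the pipeline used in the study.

```python
from typing import Sequence


def auc_from_scores(case_scores: Sequence[float],
                    control_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of cases and controls.

    Counts the fraction of (case, control) pairs in which the case has the
    higher score; tied pairs contribute 0.5. This is equivalent to the
    normalised Mann-Whitney U statistic.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy example (hypothetical model outputs): 8 of the 9 pairs are ranked correctly.
toy_auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

The double loop is quadratic in sample size, so for datasets of the size described above a rank-based implementation would be preferable; the loop is kept here because it mirrors the definition directly.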
Birth weight variation is influenced by fetal and maternal genetic and non-genetic factors, and has been reproducibly associated with future cardio-metabolic health outcomes. In expanded genome-wide association analyses of own birth weight (n = 321,223) and offspring birth weight (n = 230,069 mothers), we identified 190 independent association signals (129 of which are novel). We used structural equation modeling to decompose the contributions of direct fetal and indirect maternal genetic effects, then applied Mendelian randomization to illuminate causal pathways. For example, both indirect maternal and direct fetal genetic effects drive the observational relationship between lower birth weight and higher later blood pressure: maternal blood pressure-raising alleles reduce offspring birth weight, but only direct fetal effects of these alleles, once inherited, increase later offspring blood pressure. Using maternal birth weight-lowering genotypes to proxy for an adverse intrauterine environment provided no evidence that it causally raises offspring blood pressure, indicating that the inverse birth weight-blood pressure association is attributable to genetic effects, and not to intrauterine programming.
Anorexia nervosa (AN) occurs nine times more often in females than in males. Although environmental factors likely play a role, the reasons for this imbalanced sex ratio remain unresolved. AN displays high genetic correlations with anthropometric and metabolic traits. Given sex differences in body composition, we investigated the possible metabolic underpinnings of female propensity for AN. We conducted sex-specific GWAS in a healthy and medication-free subsample of the UK Biobank (n = 155,961), identifying 77 genome-wide significant loci associated with body fat percentage (BF%) and 174 with fat-free mass (FFM). Partitioned heritability analysis showed an enrichment for central nervous tissue-associated genes for BF%, which was more prominent in females than males. Genetic correlations of BF% and FFM with the largest GWAS of AN by the Psychiatric Genomics Consortium were estimated to explore shared genomics. The genetic correlations of BF%male and BF%female with AN differed significantly from each other (p
The prevention of the risk of type 2 diabetes (T2D) is complicated by multidimensional interplays between biological and psychosocial factors acting at the individual level. To address the challenge we took a systematic approach, to explore the bio-psychosocial predictors of blood glucose in mid-age. Based on the 31-year and 46-year follow-ups (5,078 participants, 43% male) of Northern Finland Birth Cohort 1966, we used a systematic strategy to select bio-psychosocial variables at 31 years to enable a data-driven approach. As selection criteria, the variable must be (i) a component of the metabolic syndrome or an indicator of psychosocial health using WHO guidelines, (ii) easily obtainable in general health check-ups and (iii) associated with fasting blood glucose at 46 years (P
The article “The use of polygenic risk scores in pre-implantation genetic testing: an unproven, unethical practice”, written by Francesca Forzano et al., was originally published electronically on the publisher’s internet portal on 17 December 2021 without open access. With the authors’ decision to opt for Open Choice, the copyright of the article changed on 11 July 2022 to
Recently, rare heterozygous mutations in were identified in patients with pulmonary arterial hypertension (PAH). encodes the circulating BMP (bone morphogenetic protein) type 9, which is a ligand for the BMP2 receptor. Here we determined the functional impact of mutations and characterized plasma BMP9 and BMP10 levels in patients with idiopathic PAH. Missense BMP9 mutant proteins were expressed and the impact on BMP9 protein processing and secretion, endothelial signaling, and functional activity was assessed. Plasma BMP9 and BMP10 levels and activity were assayed in patients with PAH with variants and in control subjects. Levels were also measured in a larger cohort of control subjects (n = 120) and patients with idiopathic PAH (n = 260). We identified a novel rare variation at the and loci, including copy number variation. , BMP9 missense proteins demonstrated impaired cellular processing and secretion. Patients with PAH who carried these mutations exhibited reduced plasma levels of BMP9 and reduced BMP activity. Unexpectedly, plasma BMP10 levels were also markedly reduced in these individuals. Although overall BMP9 and BMP10 levels did not differ between patients with PAH and control subjects, BMP10 levels were lower in PAH females. A subset of patients with PAH had markedly reduced plasma levels of BMP9 and BMP10 in the absence of mutations. Our findings demonstrate that mutations result in BMP9 loss of function and are likely causal. These mutations lead to reduced circulating levels of both BMP9 and BMP10. These findings support therapeutic strategies to enhance BMP9 or BMP10 signaling in PAH.
Pulmonary arterial hypertension (PAH) is a rare disease that leads to premature death from right heart failure. It is strongly associated with elevated red cell distribution width (RDW), a correlate of several iron status biomarkers. High RDW values can signal early-stage iron deficiency or iron deficiency anaemia. This study investigated whether elevated RDW is causally associated with PAH. A two-sample Mendelian randomisation (MR) approach was applied to investigate whether genetic predisposition to higher levels of RDW increases the odds of developing PAH. Primary and secondary MR analyses were performed using all available genome-wide significant RDW variants (n=179) and five genome-wide significant RDW variants that act via systemic iron status, respectively. We confirmed the observed association between RDW and PAH (OR 1.90, 95% CI 1.80-2.01) in a multicentre case-control study (cases n=642, disease controls n=15889). The primary MR analysis was adequately powered to detect a causal effect (odds ratio) between 1.25 and 1.52 or greater based on estimates reported in the RDW genome-wide association study or from our own data. There was no evidence for a causal association between RDW and PAH in either the primary (ORcausal 1.07, 95% CI 0.92-1.24) or the secondary (ORcausal 1.09, 95% CI 0.77-1.54) MR analysis. The results suggest that at least some of the observed association of RDW with PAH is secondary to disease progression. Results of iron therapeutic trials in PAH should be interpreted with caution, as any improvements observed may not be mechanistically linked to the development of PAH.
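The abstract above does not say which two-sample MR estimator was used. As an illustration only, the sketch below shows the fixed-effect inverse-variance weighted (IVW) estimate, a common choice that combines per-variant associations with the exposure (here RDW) and the outcome (here PAH, on the log-odds scale). The function name `ivw_estimate` and all input values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(beta_exposure: Sequence[float],
                 beta_outcome: Sequence[float],
                 se_outcome: Sequence[float]) -> Tuple[float, float]:
    """Fixed-effect inverse-variance weighted MR estimate and its standard error.

    beta_exposure: per-variant effects on the exposure.
    beta_outcome:  per-variant effects on the outcome (log odds for a binary trait).
    se_outcome:    standard errors of the outcome effects.

    Equivalent to a weighted regression of outcome effects on exposure effects
    through the origin, with weights 1 / se_outcome**2.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    if denominator == 0:
        raise ValueError("exposure effects must not all be zero")
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Toy example (hypothetical summary statistics for two variants).
beta, se = ivw_estimate([0.1, 0.2], [0.02, 0.04], [0.01, 0.01])
causal_or = math.exp(beta)                      # odds ratio per unit of exposure
ci_95 = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
```

A confidence interval for the causal odds ratio that spans 1, as reported in the abstract, corresponds to an interval for `beta` that spans 0.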
The European Society of Human Genetics (ESHG) was founded in 1967 as a professional organisation for members working in genetics in clinical practice, research and education. The Society seeks the integration of scientific research and its implementation into clinical practice and the education of specialists and the public in all areas of medical and human genetics. The Society works to do this through many approaches, including educational sessions at the annual conference; training courses in general and specialist areas of genetics; an online resource of educational materials (EuroGEMS); and a mentorship scheme. The ESHG Education Committee is implementing new approaches to expand the reach of its educational activities and portfolio. With changes in technology, appreciation of the utility of genomics in healthcare and the public’s and patients’ increased awareness of the role of genomics, this review will summarise how the ESHG is adapting to deliver innovative educational activity.
Type 2 diabetes (T2D) affects the health of millions of people worldwide. The identification of genetic determinants associated with changes in glycemia over time might illuminate biological features that precede the development of T2D. Here we conducted a genome-wide association study of longitudinal fasting glucose changes in up to 13,807 non-diabetic individuals of European descent from nine cohorts. Fasting glucose change over time was defined as the slope of the line defined by multiple fasting glucose measurements obtained over up to 14 years of observation. We tested for associations of genetic variants with inverse-normal transformed fasting glucose change over time adjusting for age at baseline, sex, and principal components of genetic variation. We found no genome-wide significant association (P
Background: Maternal pre-pregnancy body mass index (BMI) is positively associated with offspring birth weight (BW) and BMI in childhood and adulthood. Each of these associations could be due to causal intrauterine effects, or confounding (genetic or environmental), or some combination of these. Here we estimate the extent to which the association between maternal BMI and offspring body size is explained by offspring genotype, as a first step towards establishing the importance of genetic confounding. Methods: We examined the associations of maternal pre-pregnancy BMI with offspring BW and BMI at 1, 5, 10 and 15 years, in three European birth cohorts (n
Handedness has been extensively studied because of its relationship with language and the over-representation of left-handers in some neurodevelopmental disorders. Using data from the UK Biobank, 23andMe and the International Handedness Consortium, we conducted a genome-wide association meta-analysis of handedness (N = 1,766,671). We found 41 loci associated (P < 5 × 10⁻⁸) with left-handedness and 7 associated with ambidexterity. Tissue-enrichment analysis implicated the CNS in the aetiology of handedness. Pathways including regulation of microtubules and brain morphology were also highlighted. We found suggestive positive genetic correlations between left-handedness and neuropsychiatric traits, including schizophrenia and bipolar disorder. Furthermore, the genetic correlation between left-handedness and ambidexterity is low (rG = 0.26), which implies that these traits are largely influenced by different genetic mechanisms. Our findings suggest that handedness is highly polygenic and that the genetic variants that predispose to left-handedness may underlie part of the association with some psychiatric disorders. A genome-wide association study of 1.7 million individuals identified 41 genetic variants associated with left-handedness and 7 associated with ambidexterity. The genetic correlation between the traits was low, thereby implying different aetiologies.