Rare Coding Variation and Risk of Intracerebral Hemorrhage
Background and Purpose—Intracerebral hemorrhage has a substantial genetic component. We performed a preliminary search for rare coding variants associated with intracerebral hemorrhage.
Methods—A total of 757 cases and 795 controls were genotyped using the Illumina HumanExome Beadchip (Illumina, Inc, San Diego, CA). Meta-analyses of single-variant and gene-based association were computed.
Results—No rare coding variants were associated with intracerebral hemorrhage. Three common variants on chromosome 19q13 at an established susceptibility locus, encompassing TOMM40, APOE, and APOC1, met genome-wide significance (P<5e−08). After adjusting for the APOE epsilon alleles, this locus was no longer convincingly associated with intracerebral hemorrhage. No gene reached genome-wide significance level in gene-based association testing.
Conclusions—Although no coding variants of large effect were detected, this study further underscores a major challenge for the study of genetic susceptibility loci; large sample sizes are required for sufficient power except for loci with large effects.
Genetic variation plays a substantial role in the risk of intracerebral hemorrhage (ICH).1 Genome-wide association studies have identified common variants associated with risk of ICH, both lobar and nonlobar subtypes.2 The degree to which rare genetic variants, those with minor allele frequencies far smaller than those of variants typically discovered through genome-wide association studies, contribute to this risk is unknown. Preliminary targeted sequencing studies have supported a possible role for rare variants in sporadic ICH.3 Recently, the exome array has emerged as an efficient, cost-effective tool to bridge array-based common variant association studies and whole-exome or whole-genome sequencing to identify coding variation underlying common conditions. The goal of this study was to explore the role of exonic variants in risk of ICH, using exome array.
Study subjects, genotyping, and quality control procedures are described in the Methods in the online-only Data Supplement.
Score statistics and minor allele frequency (MAF) for individual variants and a covariance matrix for each gene were computed, with age, sex, and the first 2 principal components included as covariates in the model. Inverse variance–weighted meta-analysis of score tests was computed for both common and rare variants.
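As an illustration of how score statistics from several studies can be combined for one variant, the sketch below sums per-study scores and their variances, which is algebraically the same as inverse variance weighting of the per-study effect estimates. It is a minimal sketch and not the seqMeta implementation; the function name and the input numbers are hypothetical.

```python
import math


def meta_analyze_scores(scores, variances):
    """Combine per-study score statistics for a single variant.

    scores[k] is the score statistic U_k from study k and variances[k] is its
    variance V_k. The per-study effect estimate is U_k / V_k with variance
    1 / V_k, so inverse variance weighting reduces to sum(U_k) / sum(V_k).
    """
    if not scores or len(scores) != len(variances):
        raise ValueError("need one variance per score")
    total_score = sum(scores)
    total_variance = sum(variances)
    beta = total_score / total_variance      # pooled log odds ratio estimate
    se = 1.0 / math.sqrt(total_variance)     # standard error of the pooled estimate
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return beta, se, z, p_value


# Hypothetical scores and variances from 2 studies (not data from this article).
beta, se, z, p_value = meta_analyze_scores([2.0, 4.0], [4.0, 12.0])
print(beta, se, z, p_value)
```

Pooling on the score scale avoids fitting a separate model for each variant in each study, which is one reason score-based meta-analysis is convenient for rare variants.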
As MAF decreases, single-variant analysis loses the power to reach genome-wide significance, even in the presence of a true association. Therefore, variants within each gene or region of interest are aggregated to increase the power to detect variants with small effects. We applied the sequence kernel association test (SKAT), SKAT-O, and T1 count tests for gene-based analysis.4 In analysis using SKAT, each single nucleotide polymorphism was weighted by the inverse of its SE and its MAF, where variants with lower MAF are relatively upweighted. In the T1 count test, each variant was weighted equally, irrespective of its MAF. The association models were adjusted for age, sex, and the first 2 principal components.
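The contrast between frequency-dependent weighting and equal weighting can be sketched as follows. Two assumptions are made here that the text does not confirm: the weight is the Beta(1, 25) density of the MAF, a common SKAT default that may differ from the weighting used in this study, and the T1 test counts minor alleles at variants with MAF below 1%. All function names and inputs are hypothetical.

```python
import math


def beta_density_weight(maf, a=1.0, b=25.0):
    """Beta(a, b) density at the MAF; larger for rarer variants when a=1, b=25.

    Assumption: this is a common SKAT default, not a weight stated in the text.
    """
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * maf ** (a - 1.0) * (1.0 - maf) ** (b - 1.0)


def skat_q(scores, mafs):
    """Weighted sum of squared per-variant score statistics (linear kernel)."""
    return sum((beta_density_weight(m) * u) ** 2 for u, m in zip(scores, mafs))


def t1_counts(genotypes, mafs, threshold=0.01):
    """Per-subject count of minor alleles at variants with MAF below threshold.

    genotypes[i][j] is the minor allele count (0, 1, or 2) of subject i at
    variant j. Every retained variant contributes equally to the count.
    """
    keep = [j for j, m in enumerate(mafs) if m < threshold]
    return [sum(row[j] for j in keep) for row in genotypes]


# Hypothetical gene with 3 variants genotyped in 3 subjects.
mafs = [0.005, 0.02, 0.001]
genotypes = [[0, 1, 2], [1, 0, 0], [0, 0, 1]]
print(t1_counts(genotypes, mafs))                  # the variant with MAF 0.02 is excluded
print([beta_density_weight(m) for m in mafs])      # the rarest variant gets the largest weight
```

The per-subject counts from the T1 test would then enter the regression model as a single predictor, whereas the kernel statistic is compared against a mixture of chi-square distributions; neither step is shown here.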
We performed association analysis in all subjects, as well as separately for lobar ICH. Analysis of nonlobar ICH was not performed because of small sample size. Quality control was performed using PLINK v1.07. All other analyses were performed using the seqMeta package in R version 3.0.2.5
After excluding subjects for quality (n=31) and genetic outliers (n=56), there were 1553 subjects for analysis (Table I in the online-only Data Supplement).
In single-variant analysis, we identified a susceptibility locus at chr19q13 (P<5e−08), including 3 common variants with MAF ranging from 13% to 19% (Table; Figure I in the online-only Data Supplement). The top variant at this locus was rs769449, an intronic single nucleotide polymorphism in APOE (P=1.94e−11; odds ratio, 1.97 [95% confidence interval, 1.62–2.40]). There was no evidence of heterogeneity across the 2 studies (Figure 1). These variants are in moderate linkage disequilibrium, with r2 estimates ranging from 0.4 to 0.6 (Figure 2).
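The reported odds ratio, confidence interval, and P value for rs769449 can be checked against each other with a Wald calculation on the log odds scale. The sketch below backs the SE out of the published interval, so its results are approximate because the published values are rounded; the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def se_from_ci(lower, upper):
    """Standard error of the log odds ratio implied by a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * Z_95)


def wald_summary(odds_ratio, se):
    """Return the z statistic, two-sided P value, and 95% CI for an odds ratio."""
    beta = math.log(odds_ratio)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ci = (math.exp(beta - Z_95 * se), math.exp(beta + Z_95 * se))
    return z, p_value, ci


# Values reported in the text for rs769449.
se = se_from_ci(1.62, 2.40)
z, p_value, ci = wald_summary(1.97, se)
print(se, z, p_value, ci)
```

The implied P value is on the order of 1e−11, in line with the reported P=1.94e−11 and well below the genome-wide threshold of 5e−08.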
The 19q13 locus encompasses TOMM40, APOE, and APOC1. Common variants in this locus have been associated with several traits, including lipid levels, Alzheimer disease, cerebral amyloid angiopathy, and ICH.6,7 Given the association of APOE ε2 and ε4 alleles with ICH, we adjusted for these alleles, which had been previously genotyped in the majority of study subjects (Table II in the online-only Data Supplement).8 This adjustment resulted in loss of the observed signal, suggesting that these associations arose from the effect of ε2 and ε4 alleles (Table).
No low-frequency variant or gene emerged as associated with ICH or the lobar subtype using SKAT, SKAT-O, or burden tests, either before or after adjustment for the ε2 and ε4 alleles. The strongest association for the gene-based analyses was observed for GADL1 in the T1 count test after adjustment for the epsilon alleles (P=6.37e−05; cumulative MAF=3.3%).
Common genetic variation seems to play a substantial role in ICH risk and key clinical features, including clinical outcome.1 Ongoing genome-wide association studies are designed to detect common variants, but they may miss rare variation. The contribution from rare variation is less substantiated.
The present analysis did not identify any rare coding variants associated with ICH. Our effort to identify coding variation with modest effect sizes was limited by inadequate statistical power. Gene-based tests can partially compensate for this limitation, but in samples of this size, too few copies of rare variants are observed to take full advantage of this approach. We estimated that our power was ≈7% for detection of a significant association at P<1e−06 (corrected for multiple tests in the gene-based analysis) at maximum odds ratio=5 when MAF=0.0001.9 Our data therefore suggest that most genetic risk for ICH resides within common and rare variants with modest effect size. Accurate estimation of the extent to which rare variants contribute to risk of ICH will require larger scale sequencing studies with coverage of both common and rare variants.
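A rough sense of why power is so low at this allele frequency comes from the expected number of minor alleles in a sample of this size. The sketch below is not the power calculation used in the study. It assumes the case and control counts given in the abstract (757 and 795), treats MAF=0.0001 as the control allele frequency, and applies the odds ratio of 5 per allele; the function names are hypothetical.

```python
def case_allele_frequency(control_maf, odds_ratio):
    """Allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_maf / (1.0 - control_maf)
    return odds / (1.0 + odds)


def expected_minor_alleles(n_subjects, maf):
    """Expected minor allele count among 2 * n_subjects chromosomes."""
    return 2 * n_subjects * maf


control_maf = 0.0001
case_maf = case_allele_frequency(control_maf, odds_ratio=5.0)

expected_in_controls = expected_minor_alleles(795, control_maf)
expected_in_cases = expected_minor_alleles(757, case_maf)
print(expected_in_controls, expected_in_cases)
```

Under these assumptions, fewer than 1 copy of such a variant is expected in either group, so even a large effect would rarely be observed often enough to reach significance.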
The development of international consortia has facilitated recruitment of hundreds of thousands of subjects with common diseases such as ischemic stroke and accelerated the rate of genetic discoveries for complex traits. With decreasing costs of sequencing studies and further expansion of consortia, genetic characterization of less common conditions such as ICH will become more feasible.
Sources of Funding
This study was supported by grants R01NS073344, R01NS059727, and 5K23NS059774 from the National Institutes of Health–National Institute of Neurological Disorders and Stroke (NIH-NINDS) and 0755984T from the American Heart Association. Computation support was provided, in part, by the Wake Forest School of Medicine Center for Public Health Genomics. Dr Anderson received research grants from the NIH-NINDS, American Brain Foundation, and Massachusetts General Hospital Institute for Heart, Vascular and Stroke Care. Dr Worrall received a research grant from the NIH. Dr Rosand received research grants from the NIH.
Dr Worrall is the Associate Editor for the journal Neurology. Dr Rosand is a consultant to Boehringer Ingelheim. The other authors report no conflicts.
Guest Editor for this article was Martin Dichgans, MD.
The online-only Data Supplement is available with this article at http://stroke.ahajournals.org/lookup/suppl/doi:10.1161/STROKEAHA.115.009838/-/DC1.
- Received April 23, 2015.
- Revision received April 23, 2015.
- Accepted May 15, 2015.
- © 2015 American Heart Association, Inc.
- Devan WJ, Falcone GJ, Anderson CD, Jagiella JM, Schmidt H, Hansen BM, et al
- Woo D, Falcone GJ, Devan WJ, Brown WM, Biffi A, Howard TD, et al
- Lee S, Emond MJ, Bamshad MJ, Barnes KC, Rieder MJ, Nickerson DA, et al
- 5. R: A Language and Environment for Statistical Computing. Version 3.0.2. Vienna, Austria: R Foundation for Statistical Computing; 2014.