Polygenic Overlap Between Kidney Function and Large Artery Atherosclerotic Stroke
Background and Purpose—Epidemiological studies show strong associations between kidney dysfunction and risk of ischemic stroke (IS), the mechanisms of which are incompletely understood. We investigated whether these associations may reflect shared heritability because of a common polygenic basis and whether this differed for IS subtypes.
Methods—Polygenic models were derived using genome-wide association studies meta-analysis results for 3 kidney traits: estimated glomerular filtration rate using serum creatinine (eGFRcrea: n=73 998), eGFR using cystatin C (eGFRcys: n=22 937), and urinary albumin to creatinine ratio (n=31 580). For each, single nucleotide polymorphisms passing 10 P value thresholds were used to form profile scores in 4561 IS cases and 7094 controls from the United Kingdom, Germany, and Australia. Scores were tested for association with IS and its 3 aetiological subtypes: large artery atherosclerosis, cardioembolism, and small vessel disease.
Results—Polygenic scores correlating with higher eGFRcrea were associated with reduced risk of large artery atherosclerosis, with 5 scores reaching P<0.05 (peak P=0.004) and all showing the epidemiologically expected direction of effect. A similar pattern was observed for polygenic scores reflecting higher urinary albumin to creatinine ratio, of which 3 associated with large artery atherosclerosis (peak P=0.01) and all showed the expected directional association. One urinary albumin to creatinine ratio–based score also associated with small vessel disease (P=0.03). The global pattern of results was unlikely to have occurred by chance (P=0.02).
Conclusions—This study suggests possible polygenic correlation between renal dysfunction and IS. The shared genetic components may be specific to stroke subtypes, particularly large artery atherosclerotic stroke. Further study of the genetic relationships between these disorders seems merited.
Epidemiological evidence supports an association between kidney dysfunction and risk of cardiovascular diseases, including stroke. In fact, the majority of individuals with chronic kidney disease (CKD) die of a cardiovascular cause before developing end-stage renal disease.1 In relation to stroke, kidney dysfunction is associated with multiple outcomes, including incident stroke,2 recurrent stroke,3 and mortality after stroke.4 These relationships seem more closely related to ischemic stroke (IS) than to hemorrhagic stroke.5,6
The mechanisms for these phenotypic correlations are not completely understood. Sharing of established cardiovascular risk factors, such as age, sex, blood pressure, cholesterol, smoking, and diabetes mellitus, can explain some but not all excess stroke risk in patients with CKD.3 Sharing of pathophysiological correlates of vascular disease, including carotid atherosclerosis,7 arterial stiffness,8 and cerebral white matter hyperintensity lesions,9 also seems to explain an additional component, but not all, of the excess risk.
Kidney traits and IS both have substantial heritable components, with ≈30% to 50% of observed variation in glomerular filtration rate (GFR),10 15% to 45% of variation in albuminuria,11 and 30% to 40% of variation in IS risk12 attributable to genetic effects. Thus, associations between kidney dysfunction and IS may partly reflect a shared genetic component.
In recent years, genome-wide association studies (GWAS) have identified several single nucleotide polymorphisms (SNPs) associated with kidney traits and aetiological subtypes of IS. However, these variants explain only a minority (1%–2%) of population variation in their respective traits. This missing heritability partly reflects a genetic architecture comprising numerous risk variants with effects too small to show significant association in available sample collections. However, numerous SNPs aggregated into polygenic scores may show stronger association and explain more trait variation.13
The polygenic basis of complex traits also hampers attempts to demonstrate pleiotropy for individual SNPs, because effect sizes are typically small for both traits. Using the largest available data sets, a recent study assessed individual SNPs for joint contribution to CKD and cardiovascular disease.14 Significant cross-trait association was observed for SNPs at only 1 locus (SH2B3), suggesting that if genetic overlap between CKD and cardiovascular disease exists, the effect sizes of pleiotropic variants are likely too small to permit their individual detection.
Building on this earlier work, we hypothesised that phenotypic correlations between kidney dysfunction and IS may result from polygenic overlap, that is, sharing of a genetic component consisting of numerous, small-effect SNPs. This hypothesis was tested by deriving polygenic scores using GWAS results for kidney traits and testing their performance in stroke GWAS data sets. These analyses used GWAS meta-analysis results from the CKDGen Consortium and individual-level genotype data from 3 IS case–control collections.
Data Sources and Study Samples
For the derivation stage, we used genome-wide association meta-analysis results for 3 established kidney function traits: (1) estimated glomerular filtration rate (eGFR) based on serum creatinine (eGFRcrea: n=73 998)15; (2) eGFR based on Cystatin C (eGFRcys: n=22 937)15; and (3) urinary albumin to creatinine ratio (UACR: n=31 580).16 Higher GFR describes better kidney function, whereas elevated UACR suggests kidney disease. Details of individual studies are provided in Tables I and II in the online-only Data Supplement.
Testing of polygenic scores was performed using individual-level genotype data for IS cases and controls from the Wellcome Trust Case Control Consortium 2 Ischemic Stroke Study17 and the Australian Stroke Genetics Collaborative.18 Three major aetiological subtypes of IS were determined: (1) large artery atherosclerosis (LAA); (2) cardioembolism (CE); and (3) small vessel disease (SVD). Stroke subtyping was performed using the Trial of Org 10172 in Acute Stroke Treatment (TOAST) system.19 All studies were approved by appropriate ethics committees, and participants provided written informed consent.
Genotyping, Imputation, and Quality Control
For IS studies, genotyping was performed using Illumina arrays followed by quality control and imputation to the HapMap Phase II CEU reference. Principal components analysis was performed using Eigenstrat,20 based on 95 016 approximately independent, directly genotyped SNPs. Principal component covariates of ancestry were calculated after 3 iterations of principal components analysis with outlier removal.
SNP Enrichment Assessment, Polygenic Scoring, and Association Analyses
Using GWAS meta-analysis results for the 3 kidney traits, sets of SNPs passing 10 graded P values (Pthreshold=0.0001, 0.001, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, and 1) were extracted. SNP sets were pruned by removing correlated SNPs (r2>0.2) within 1 Mb, preferentially retaining the most significantly associated SNP.21 Pruned SNP sets were used to form polygenic scores for IS cases and controls as the sum of reference alleles for each SNP weighted by the summary regression coefficient for the relevant kidney trait. Polygenic scores were tested for association with IS and its subtypes using mixed effects logistic regression adjusted for 3 ancestry principal components, incorporating study site as a random effect. The proportion of stroke case–control variation explained by polygenic scores (R2) was estimated as previously described for mixed effects logistic models.22 Polygenic score tests were 1-sided because a priori, we sought to identify effects consistent with established epidemiological evidence. For eGFRcrea and eGFRcys, the prespecified effect direction was negative because higher eGFR correlates with reduced stroke risk.2 For UACR, a positive effect was prespecified.23 The study-wise significance threshold was derived using the method proposed by Galwey24 (online-only Data Supplement). A flow-chart describing the polygenic analysis is shown in Figure I in the online-only Data Supplement.
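The score construction described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the SNP identifiers, coefficients, P values, pairwise r2 values, and function names are hypothetical, and treating "passing" a threshold as P less than or equal to that threshold is an assumption. The sketch selects SNPs at one P value threshold, greedily prunes correlated SNPs (r2>0.2 within 1 Mb) while retaining the most significant SNP, and forms a score as the weighted sum of reference-allele counts.

```python
# Hypothetical kidney-trait summary statistics:
# SNP id -> (chromosome, position in bp, regression coefficient, P value)
stats = {
    "rsA": (1, 1_000_000, 0.05, 0.00005),
    "rsB": (1, 1_200_000, 0.04, 0.0008),   # correlated with rsA (see r2 below)
    "rsC": (1, 5_000_000, -0.03, 0.004),
    "rsD": (2, 1_100_000, 0.02, 0.3),
}
# Hypothetical pairwise r2 values; pairs not listed are treated as r2 = 0.
r2 = {frozenset(("rsA", "rsB")): 0.6}
# Hypothetical reference-allele counts (0, 1, or 2) for one individual.
genotypes = {"rsA": 2, "rsB": 1, "rsC": 1, "rsD": 0}


def select_and_prune(stats, r2, p_threshold, r2_max=0.2, window_bp=1_000_000):
    """Keep SNPs with P <= p_threshold, then greedily prune correlated SNPs.

    SNPs are visited from most to least significant. A SNP is dropped if it
    lies within window_bp of an already retained SNP on the same chromosome
    and their pairwise r2 exceeds r2_max.
    """
    candidates = sorted(
        (snp for snp, record in stats.items() if record[3] <= p_threshold),
        key=lambda snp: stats[snp][3],
    )
    kept = []
    for snp in candidates:
        chrom, pos = stats[snp][0], stats[snp][1]
        clash = False
        for other in kept:
            o_chrom, o_pos = stats[other][0], stats[other][1]
            near = chrom == o_chrom and abs(pos - o_pos) <= window_bp
            if near and r2.get(frozenset((snp, other)), 0.0) > r2_max:
                clash = True
                break
        if not clash:
            kept.append(snp)
    return kept


def polygenic_score(genotypes, stats, snps):
    """Sum of reference-allele counts weighted by the kidney-trait coefficient."""
    return sum(genotypes[snp] * stats[snp][2] for snp in snps if snp in genotypes)


snps = select_and_prune(stats, r2, p_threshold=0.01)
print(snps)                                    # ['rsA', 'rsC']
print(polygenic_score(genotypes, stats, snps)) # approximately 0.07
```

Repeating the selection at each of the 10 thresholds yields 10 nested scores per individual, which is why results across thresholds are correlated rather than independent.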
At a significance threshold of 0.05, we had 98% power to detect polygenic scores explaining ≥0.2% of variance in case/control status for any validation subtype, 81% to 82% power to detect scores explaining ≥0.1% of variance (varying by subtype), and 51% to 53% power for scores explaining ≥0.05% of variance.13 At a significance threshold of 0.001, the corresponding power estimates were 76% to 78%, 32% to 34%, and 10% to 11%, respectively.
Lookup for Individual SNPs Associated With Kidney or Stroke Traits
As a secondary analysis, we conducted targeted, cross-trait analyses for individual SNPs previously associated with kidney traits or IS subtypes. These analyses are described in the online-only Data Supplement.
SNP set enrichment across P value thresholds for the 3 kidney traits is shown in the Figure. Enrichment was strongest for eGFR based on serum creatinine (eGFRcrea). Modest SNP enrichment was also observed for eGFR based on Cystatin C (eGFRcys). Results for UACR showed less marked evidence for enrichment.
Polygenic Scoring Results
A total of 4561 IS cases and 7094 controls were used for polygenic score association testing (Table 1). Based on observed correlation among traits (online-only Data Supplement), the adjusted study-wise significance threshold was α=0.001.
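As a rough illustration of how correlation among tests can relax a study-wise threshold, the sketch below computes an effective number of tests from the eigenvalues of a test correlation matrix, using the formula usually attributed to Galwey: the squared sum of square-rooted eigenvalues divided by the sum of eigenvalues. Several elements are assumptions and not taken from this study: the equicorrelated matrix (chosen because its eigenvalues have a closed form), the nominal alpha of 0.05, the example correlation of 0.5, and dividing alpha by the effective number of tests. The actual derivation is given in the online-only Data Supplement.

```python
import math


def effective_tests(eigenvalues):
    """Effective number of tests: (sum of sqrt(lambda))^2 / sum of lambda.

    Only positive eigenvalues contribute.
    """
    positive = [ev for ev in eigenvalues if ev > 0]
    return sum(math.sqrt(ev) for ev in positive) ** 2 / sum(positive)


def equicorrelated_eigenvalues(m, r):
    """Eigenvalues of an m x m correlation matrix with common off-diagonal r.

    Illustrative assumption: one eigenvalue 1 + (m - 1) r and m - 1
    eigenvalues equal to 1 - r.
    """
    return [1 + (m - 1) * r] + [1 - r] * (m - 1)


def adjusted_alpha(alpha, m_eff):
    """Bonferroni-type threshold using the effective number of tests."""
    return alpha / m_eff


# Ten hypothetical tests with pairwise correlation 0.5 (not a study value).
m_eff = effective_tests(equicorrelated_eigenvalues(10, 0.5))
print(m_eff, adjusted_alpha(0.05, m_eff))
```

With uncorrelated tests the effective number equals the actual number of tests, and with perfectly correlated tests it falls to 1, so the adjusted threshold always lies between the Bonferroni value and the nominal alpha.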
Two scores derived from eGFRcrea showed nominal association with broad IS, reaching P=0.02 and P=0.05 (for Pthreshold=0.001 and 0.01, respectively, see Table 2). Five eGFRcrea-based scores showed nominal association with LAA, with peak association (P=0.004) observed for Pthreshold=0.05. This score explained an estimated 0.26% of LAA case–control variation (Table III in the online-only Data Supplement). For both IS and LAA, scores for all 10 P value thresholds demonstrated the expected negative direction of effect, with genetic scores indicative of higher eGFRcrea correlating with reduced (negative) stroke risk. No eGFRcrea-based score showed association with either CE or SVD, and effect directions were also inconsistent within these subtypes.
Polygenic scores derived from eGFRcys showed no association with broad IS or any of its subtypes (all P>0.05; see Table IV in the online-only Data Supplement). Further, effect directions were inconsistent across tests within each stroke type.
For UACR, one score (Pthreshold=0.01) showed nominal association with broad IS (P=0.04; Table 3). Three scores were also nominally associated with LAA (peak P=0.01 at Pthreshold=0.001, R2=0.11%) and 1 score showed nominal association with SVD (P=0.03 at Pthreshold=0.0001, R2=0.14%; see Table III in the online-only Data Supplement). For the 3 traits (IS, LAA, and SVD), all effects were in the expected direction, with scores indicative of higher UACR (microalbuminuria) correlating with increased stroke risk.
None of the polygenic associations passed the study-wide significance threshold of 0.001. However, within each of the 5 trait combinations showing nominal association (eGFRcrea-IS, eGFRcrea-LAA, UACR-IS, UACR-LAA, and UACR-SVD), the direction of effect across all 10 tests was in accordance with prior evidence. We conducted simulations to empirically estimate the probability of this pattern occurring by chance for any given kidney-stroke trait combination (online-only Data Supplement). For eGFRcrea, the probability of a set of 10 tests showing a negative effect was P=0.080, based on 10 000 simulations. For eGFRcys and UACR, the corresponding 1-sided probabilities were 0.067 and 0.069, respectively. Thus, the observed pattern was unlikely to have arisen by chance in any given set of 10 tests.
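The reason the chance probability of 10 concordant signs is far larger than 0.5^10 (≈0.001) is that the 10 scores are nested, so their test statistics are positively correlated. The toy Monte Carlo below illustrates this under stated assumptions and is not the simulation described in the online-only Data Supplement: each SNP is assumed to contribute an independent standard normal amount to the score-phenotype association under the null, the statistic at each threshold is the running total over the SNPs included so far, and the SNP counts per threshold are hypothetical.

```python
import math
import random


def prob_all_negative(snp_counts, n_sims=20_000, seed=1):
    """Estimate P(all nested-score statistics are negative) under the null.

    snp_counts: increasing numbers of SNPs included at successive P value
    thresholds (hypothetical). The statistic at each threshold is a running
    sum of independent N(0, 1) per-SNP contributions.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        total = 0.0
        previous = 0
        all_negative = True
        for count in snp_counts:
            added = count - previous
            # Sum of `added` independent N(0, 1) terms is N(0, added).
            total += rng.gauss(0.0, math.sqrt(added))
            previous = count
            if total >= 0.0:
                all_negative = False
                break
        hits += all_negative
    return hits / n_sims


# Hypothetical SNP counts for 10 nested thresholds.
counts = [10, 100, 1000, 5000, 10000, 20000, 30000, 40000, 50000, 100000]
estimate = prob_all_negative(counts)
print(estimate, 0.5 ** 10)
```

In this toy setting the estimate falls well above 0.5^10, which is qualitatively in line with the empirical probabilities of 0.067 to 0.080 reported above.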
If we consider all kidney–stroke combinations where such consistency was observed, the results seem even less likely to have occurred by chance. Excluding results for broad IS, we conducted 3 sets of approximately independent stroke subtype tests for each of the 3 kidney traits. Among these 9 sets of tests, we observed 3 in which consistent effects in the expected direction were uniformly observed (eGFRcrea-LAA, UACR-LAA, and UACR-SVD). The probability of this occurring by chance is ≈0.023 (online-only Data Supplement), or ≈1 in every 43 studies such as ours. This suggests our nominal associations are likely to represent true polygenic correlations of small effect.
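The order of magnitude of this probability can be checked with a short calculation. The sketch below assumes the 9 sets of tests are fully independent and that each set has the per-trait chance probability reported above (0.080 for eGFRcrea, 0.067 for eGFRcys, 0.069 for UACR); it then computes the probability that at least 3 of the 9 sets show uniformly consistent effects. The exact method in the online-only Data Supplement may differ, so this is only a plausibility check.

```python
def prob_at_least(k, probs):
    """P(at least k successes) among independent trials with given probabilities."""
    dist = [1.0]  # dist[j] = probability of exactly j successes so far
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1 - p)      # this set is not uniformly consistent
            nxt[j + 1] += mass * p        # this set is uniformly consistent
        dist = nxt
    return sum(dist[k:])


# Three stroke subtypes for each of the three kidney traits.
per_set = [0.080] * 3 + [0.067] * 3 + [0.069] * 3
p_three_or_more = prob_at_least(3, per_set)
print(p_three_or_more, 1 / p_three_or_more)
```

Under these assumptions the result is close to the ≈0.023 reported, that is, roughly 1 chance occurrence in 40 to 45 comparable studies.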
Lookup for Individually Associated SNPs
Of 40 SNPs previously associated with kidney traits, 1 (rs653178) was significantly associated (1-sided P=2×10⁻⁴) with LAA25 after multiple testing adjustment. Of 7 stroke-associated SNPs, none were significantly associated with kidney traits (Tables V and VI in the online-only Data Supplement).
This study suggests that reported epidemiological associations between renal disease and stroke may be partly explained by shared genetic factors and that this association may differ for IS subtypes, in this study being most marked for large artery stroke. However, the causal genetic variants are likely of small individual effect, detectable in the current study only when aggregated into highly polygenic scores and, even then, achieving only nominal significance. Independent replication will be necessary to confirm the validity of these results.
An important factor affecting the significance of our results was power for individual stroke subtypes. For nominally significant associations, the kidney trait profile scores explained from 0.1 to 0.26% of case–control variation for different stroke subtypes. Power analyses indicated larger samples would be necessary to identify the observed effects at more stringent significance levels.
Although the proportion of stroke subtype variance explained by kidney-based scores was low, this does not mean the true genetic overlap is small. Profile scoring combines errors in effect estimates across all SNPs in the score, which usually produces estimates of explained variance markedly lower than true values.26
We observed the strongest polygenic correlations between eGFR defined based on serum creatinine and LAA. Using eGFRcrea as the derivation trait, nominal significance of polygenic scores with LAA was sustained across nearly the full range of P value derivation thresholds. If these results reflect a true genetic correlation, the pattern of results is consistent with a complex genetic model involving numerous small-effect variants influencing diverse biological processes.13
The primary pathophysiological mechanism for LAA is atherosclerosis of the large cerebral arteries,19 a surrogate marker of which is carotid intima media thickness. Various epidemiological studies have reported inverse associations between eGFR and carotid intima media thickness. The majority show that this association can be explained by traditional cardiovascular risk factors, including age, smoking, hypertension, obesity, diabetes mellitus, and dyslipidaemia.27,28 Thus, polygenic correlations between eGFR and LAA – if confirmed – may reflect genetic variants influencing atherosclerosis or its heritable risk factors.
We observed no cotrait association for eGFR scores derived from cystatin C at P<0.05. The lack of similar results between eGFRcrea and eGFRcys may reflect the smaller sample size for the latter; the larger discovery set for eGFRcrea will increase polygenic score precision.13 Greater discriminatory power of eGFRcrea-based scores was also supported by stronger SNP enrichment across P value thresholds (Figure). Previous GWAS meta-analyses have also identified considerably more SNPs associated with eGFRcrea than eGFRcys. Given that both are measures of GFR and have similar heritability, the different results may largely reflect differences in sample size and power to identify variants of modest effect.
Polygenic scores derived from microalbuminuria (UACR) showed nominal associations with LAA across various P value thresholds. This is consistent with epidemiological associations between UACR, carotid intima media thickness, and cardiovascular disease, although the pathophysiological basis of these relationships is less clear. In contrast to GFR, microalbuminuria seems not to reflect generalized atherosclerosis,23 but may represent another common pathophysiologic process, such as endothelial dysfunction or low-grade inflammation.29
We observed no evidence for polygenic overlap between renal function and cardioembolic stroke (CE), in spite of epidemiological associations between CKD and atrial fibrillation, the major CE risk factor. However, increased atrial fibrillation prevalence has mainly been shown in patients with advanced kidney disease.30,31 Furthermore, factors associated with atrial fibrillation in the general population seem not to be associated with atrial fibrillation in CKD,32 suggesting pathophysiological differences.
Our results suggested possible polygenic overlap between microalbuminuria and SVD. These results are consistent with epidemiological associations between renal function and cerebral SVD,9 which have been interpreted as suggesting a systemic generalized microvascular disease that underlies both pathologies.33 Given the epidemiological evidence, more significant genetic associations might have been expected. Our modest results may reflect 2 factors. First, accuracy of diagnosis of small vessel stroke is greatly improved by the routine use of MRI, which was only used for ≈50% of stroke patients in the current study. Second, pathological and imaging data suggest that SVD is phenotypically heterogeneous, incorporating 2 distinct subtypes. One is characterized by single larger lacunar infarcts and thought to primarily relate to atherosclerosis; the other is characterized by multiple small lacunar infarcts and leukoaraiosis and is related to a diffuse small vessel arteriopathy.34,35 The latter subtype has been particularly associated with microalbuminuria. If renal disease is genetically correlated with only 1 SVD subtype, the heterogeneity of broadly defined SVD will have reduced our ability to detect any genetic overlap. The study of SVD samples with finer-scale phenotyping will likely provide better insights into genetic pleiotropy for SVD.
We observed largely negative evidence for cross-trait association for individual SNPs strongly associated with either kidney function or stroke. Only 1 SNP previously associated with eGFR was associated with both large artery and small vessel stroke; this SNP has been recently associated with broad IS36 and was also the only variant showing cross-trait association in the previous CKDGen analysis.14
An important limitation of this study was modest sample sizes for stroke subtypes. Sample size is a challenge for genetic studies of stroke, reflecting the technical nature of case ascertainment and the presence of multiple aetiological types. These are among the largest current IS samples with GWAS, but our results should be validated in larger, well-phenotyped stroke GWAS data sets as they become available.
This study suggests a potential polygenic basis for epidemiological associations between renal dysfunction and IS. The effects of the putative shared genetic components seem small and potentially specific to distinct stroke types.
Sources of Funding
E.G. Holliday is supported by the Australian Heart Foundation and National Stroke Foundation (100071). M. Traylor is funded by a UK Stroke Association project grant. H.S. Markus is supported by a National Institute for Health Research (UK) Senior Investigator award. H.S. Markus and S. Bevan’s work in this area is supported by the Cambridge University Hospital Trust NIHR Biomedical Research Centre. P.M. Rothwell holds NIHR and Wellcome Trust Senior Investigator Awards and is funded by the NIHR Biomedical Research Centre, Oxford. C. Sudlow, H.S. Markus and S. Bevan have received research funding from the Wellcome Trust. C. Sudlow has received funding from the UK Binks Trust. J. Coresh, M. de Andrade, C.S. Fox, W.H.L. Kao, B.D. Mitchell, and S.T. Turner have received research funding from the US National Institutes of Health. P. Hamet and J. Tremblay have received research funding from Genome Quebec, Canada. J. Tremblay has received funding from the Canadian Institutes for Health Research. C. Levi and R.J. Scott have received research funding from the Australian National Health and Medical Research Council.
Guest Editor for this article was Ralph L. Sacco, MD.
The online-only Data Supplement is available with this article at http://stroke.ahajournals.org/lookup/suppl/doi:10.1161/STROKEAHA.114.006609/-/DC1.
- Received June 30, 2014.
- Revision received September 15, 2014.
- Accepted September 18, 2014.
- © 2014 American Heart Association, Inc.
- Lee M, Saver JL, Chang KH, Liao HW, Chang SC, Ovbiagele B
- Ovbiagele B, Bath PM, Cotton D, Sha N, Diener HC
- Kastarinen H, Ukkola O, Kesäniemi YA
- Elias MF, Davey A, Dore GA, Gillespie A, Abhayaratna WP, Robbins MA
- Ikram MA, Vernooij MW, Hofman A, Niessen WJ, van der Lugt A, Breteler MM
- Bevan S, Traylor M, Adib-Samii P, Malik R, Paul NL, Jackson C, et al
- Boger CA, Chen MH, Tin A, Olden M, Kottgen A, de Boer IH, et al
- Adams HP Jr, Bendixen BH, Kappelle LJ, Biller J, Love BB, Gordon DL, et al
- Snijders TAB, Bosker RJ
- Matsushita K, van der Velde M, Astor BC, Woodward M, Levey AS, de Jong PE, et al
- Stehouwer CD, Smulders YM
- Bansal N, Fan D, Hsu CY, Ordonez JD, Marcus GM, Go AS
- Knopman DS
- Boiten J, Lodder J, Kessels F