Assistant Professor, Department of Human Genetics, Emory University
Learning objectives
Discuss the assumptions of Mendelian randomization (MR)
Discuss a few notable MR methods
Close with a broad discussion of the MR literature
Inferring causal effects of \(\color{darkgreen}{X}\) on \(\color{darkorange}{Y}\)
Consider two phenotypes, measured in humans:
\(\color{darkgreen}{X}\), some “exposure”, e.g., low-density lipoprotein, C-reactive protein, etc.
\(\color{darkorange}{Y}\), some “outcome”, e.g., ischemic heart disease, systolic blood pressure, etc.
Inferring the causal effect of \(\color{darkgreen}{X}\) on \(\color{darkorange}{Y}\)
We are interested in inferring the causal effect of \(\color{darkgreen}{X}\) on \(\color{darkorange}{Y}\). Perhaps ideally, we could randomly assign values to \(\color{darkgreen}{X}\), i.e., \(do(\color{darkgreen}{X} = x)\), and then measure how \(\color{darkorange}{Y}\) changes as we change \(do(\color{darkgreen}{X} = x)\).
This works (e.g., randomized controlled trials, RCTs). For many pairs \((\color{darkgreen}{X}, \color{darkorange}{Y})\), however, an RCT isn’t tractable.
Transmission of alleles as a pseudo-RCT
Perhaps we can instead use natural variation, with the right kind of randomness, as a proxy for an RCT, a kind of “natural experiment.” Consider genetic inheritance:
In the autosomes, offspring randomly inherit one of the mother’s two alleles and one of the father’s two alleles.
These alleles are transmitted independently of one another (putting aside linkage disequilibrium for now).
With these principles in mind, perhaps we can view each trait-relevant variant as a kind of RCT.
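The randomness of transmission can be illustrated with a small simulation. This is a minimal sketch, assuming a single biallelic autosomal variant with alleles coded 0 and 1 and two heterozygous parents; the function names and the coding are illustrative choices, not part of the original material.

```python
import random

def transmit(parent, rng):
    """Pick one of a parent's two alleles with equal probability."""
    return rng.choice(parent)

def offspring_genotype(mother, father, rng):
    """Offspring allele count (0, 1, or 2 copies of allele 1) at one variant."""
    return transmit(mother, rng) + transmit(father, rng)

rng = random.Random(0)   # fixed seed so the sketch is reproducible
mother = (0, 1)          # hypothetical heterozygous parents
father = (0, 1)
n = 20_000               # number of simulated offspring

counts = {0: 0, 1: 0, 2: 0}
for _ in range(n):
    counts[offspring_genotype(mother, father, rng)] += 1

# Mendelian expectation for a het x het cross is 1/4, 1/2, 1/4.
freqs = {g: c / n for g, c in counts.items()}
print(freqs)
```

Which genotype an offspring receives is decided by a coin flip per parent, not by anything about the offspring's later environment, which is the sense in which the variant resembles a randomized assignment.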
Genetic variation as a parallel to an RCT
MR study design, Sanderson et al., 2022, Nature Reviews Methods Primers
Mendelian Randomization
To do this, we can use Mendelian Randomization (MR), which leverages genetic variants as instrumental variables (IVs) to estimate the causal effect of an exposure (\(\color{darkgreen}{X}\)) on an outcome (\(\color{darkorange}{Y}\)).
In the case of no confounding between \(\color{darkgreen}{X}\) and \(\color{darkorange}{Y}\):
Here, \(U_i\) represents the confounder, \(\gamma\) is the effect of the confounder on \(\color{darkgreen}{X}\), and \(\delta\) is the effect of the confounder on \(\color{darkorange}{Y}\).
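One linear structural model consistent with these definitions can be written as follows. This is a sketch under stated assumptions: additive effects, \(\beta\) denoting the effect of \(G\) on \(\color{darkgreen}{X}\), and noise terms \(\epsilon_{X,i}\) and \(\epsilon_{Y,i}\) that are independent of \(G_i\) and \(U_i\). The noise notation is introduced here for illustration.

\[
\begin{aligned}
\color{darkgreen}{X}_i &= \beta G_i + \gamma U_i + \epsilon_{X,i} \\
\color{darkorange}{Y}_i &= \color{darkblue}{\alpha} \color{darkgreen}{X}_i + \delta U_i + \epsilon_{Y,i}
\end{aligned}
\]

Setting \(\gamma = \delta = 0\) gives the case with no confounding.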
Implications of this model with a confounder
The presence of the confounder \(U\) introduces bias in the estimation of the causal effect \(\color{darkblue}{\alpha}\).
Note: The term \(\delta \cdot \operatorname{Cov}(U, \color{darkgreen}{X} \mid G)\) introduces bias due to the confounder \(U\).
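A simulation makes the size of this bias concrete. The sketch below assumes a linear model, \(X = \beta G + \gamma U + \epsilon_X\) and \(Y = \alpha X + \delta U + \epsilon_Y\), with parameter values chosen arbitrarily for illustration; all data are synthetic. It compares the naive regression slope of \(Y\) on \(X\) with the true \(\alpha\).

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

rng = random.Random(1)
n = 20_000
alpha, beta, gamma, delta = 0.3, 0.5, 1.0, 1.0   # illustrative values only

# Synthetic data: allele counts G, an unobserved confounder U, then X and Y.
G = [rng.choice((0, 1)) + rng.choice((0, 1)) for _ in range(n)]
U = [rng.gauss(0, 1) for _ in range(n)]
X = [beta * g + gamma * u + rng.gauss(0, 1) for g, u in zip(G, U)]
Y = [alpha * x + delta * u + rng.gauss(0, 1) for x, u in zip(X, U)]

naive = cov(X, Y) / cov(X, X)          # least-squares slope of Y on X
bias = delta * cov(U, X) / cov(X, X)   # confounding term, scaled by Var(X)

print(f"true alpha = {alpha}, naive slope = {naive:.3f}, predicted bias = {bias:.3f}")
```

In this setup the naive slope overshoots \(\alpha\) by roughly the confounding term, because \(U\) pushes \(X\) and \(Y\) in the same direction.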
Addressing confounding in MR
To address confounding, we can use genetic variants as instrumental variables (IVs) that are not associated with the confounder \(U\). This allows us to estimate the causal effect \(\color{darkblue}{\alpha}\) without bias from \(U\).
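Why this works can be seen in a short derivation. Assume a linear outcome model \(\color{darkorange}{Y} = \color{darkblue}{\alpha} \color{darkgreen}{X} + \delta U + \epsilon_Y\), where \(\epsilon_Y\) is a noise term introduced here for illustration, and take the covariance of both sides with \(G\):

\[
\begin{aligned}
\operatorname{Cov}(G, \color{darkorange}{Y})
  &= \color{darkblue}{\alpha} \operatorname{Cov}(G, \color{darkgreen}{X})
   + \delta \operatorname{Cov}(G, U)
   + \operatorname{Cov}(G, \epsilon_Y) \\
  &= \color{darkblue}{\alpha} \operatorname{Cov}(G, \color{darkgreen}{X})
  \quad \text{if } \operatorname{Cov}(G, U) = 0 \text{ and } \operatorname{Cov}(G, \epsilon_Y) = 0, \\
\color{darkblue}{\alpha}
  &= \frac{\operatorname{Cov}(G, \color{darkorange}{Y})}{\operatorname{Cov}(G, \color{darkgreen}{X})}
  \quad \text{if } \operatorname{Cov}(G, \color{darkgreen}{X}) \neq 0.
\end{aligned}
\]

The confounder drops out because \(G\) is uncorrelated with it, and the ratio is well defined only when \(G\) actually moves \(\color{darkgreen}{X}\).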
Example
Suppose we have identified a set of genetic variants that are associated with body mass index (BMI) and we want to estimate the causal effect of BMI on blood pressure. We can use these genetic variants as instruments in an MR analysis to infer the causal relationship between BMI and blood pressure.
Example
Sanderson et al., 2022, Nature Reviews Methods Primers
Example confounders
Age
Smoking
Diet
Alcohol consumption
Comorbid diseases
Many others…
Assumptions of MR
Relevance: (\(G\) is associated with \(\color{darkgreen}{X}\))
Independence: (\(G\) is not associated with \(U\))
Exclusion restriction: The effect of \(G\) on \(\color{darkorange}{Y}\) is mediated entirely through \(\color{darkgreen}{X}\)
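Of the three assumptions, relevance is the one that can be checked directly from data; independence and the exclusion restriction generally cannot be verified this way. A minimal sketch of a relevance check follows, assuming a single variant and using the first-stage F-statistic of \(X\) regressed on \(G\). The helper name `first_stage_f` and the simulated data are hypothetical; F > 10 is a widely used rule of thumb rather than a guarantee.

```python
import random

def first_stage_f(G, X):
    """F-statistic for the simple regression of X on one instrument G."""
    n = len(G)
    mg, mx = sum(G) / n, sum(X) / n
    sgx = sum((g - mg) * (x - mx) for g, x in zip(G, X))
    sgg = sum((g - mg) ** 2 for g in G)
    sxx = sum((x - mx) ** 2 for x in X)
    r2 = sgx ** 2 / (sgg * sxx)        # squared correlation between G and X
    return r2 * (n - 2) / (1 - r2)     # F with 1 and n - 2 degrees of freedom

rng = random.Random(2)
n = 5_000

# Synthetic data: G_strong affects X, G_null is unrelated to X.
G_strong = [rng.choice((0, 1)) + rng.choice((0, 1)) for _ in range(n)]
G_null = [rng.choice((0, 1)) + rng.choice((0, 1)) for _ in range(n)]
X = [0.5 * g + rng.gauss(0, 1) for g in G_strong]

f_strong = first_stage_f(G_strong, X)
f_weak = first_stage_f(G_null, X)
print(f"F (associated variant) = {f_strong:.1f}, F (unrelated variant) = {f_weak:.1f}")
```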
How to estimate \(\alpha\)
Consider the least squares estimate of the effect of \(X\) on \(Y\)
This is an estimator of \(\color{darkblue}{\alpha}\) when \(G\), \(\color{darkgreen}{X}\), and \(\color{darkorange}{Y}\) are all measured in the same individuals. Often, however, this is not the case.
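When all three are observed in one sample, one standard least-squares route is two-stage least squares (2SLS): regress \(X\) on \(G\), then regress \(Y\) on the fitted values. The sketch below assumes a single variant and the linear model with a confounder \(U\); the function names, parameter values, and synthetic data are illustrative assumptions, and this is not necessarily the exact estimator intended here.

```python
import random

def ols(x, y):
    """Intercept and slope of the simple least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage_least_squares(G, X, Y):
    """2SLS estimate of the effect of X on Y using one instrument G."""
    intercept, slope = ols(G, X)                 # stage 1: X on G
    X_hat = [intercept + slope * g for g in G]   # genetically predicted X
    _, alpha_hat = ols(X_hat, Y)                 # stage 2: Y on predicted X
    return alpha_hat

rng = random.Random(3)
n = 20_000
alpha = 0.3   # true causal effect in this synthetic example

G = [rng.choice((0, 1)) + rng.choice((0, 1)) for _ in range(n)]
U = [rng.gauss(0, 1) for _ in range(n)]          # unobserved confounder
X = [0.5 * g + u + rng.gauss(0, 1) for g, u in zip(G, U)]
Y = [alpha * x + u + rng.gauss(0, 1) for x, u in zip(X, U)]

alpha_hat = two_stage_least_squares(G, X, Y)
naive = ols(X, Y)[1]
print(f"2SLS estimate = {alpha_hat:.3f}, naive slope = {naive:.3f}, truth = {alpha}")
```

With a single instrument the second-stage slope equals the ratio of the \(G\)-\(Y\) covariance to the \(G\)-\(X\) covariance, so the estimate is not pulled toward the confounded naive slope.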
Two-sample MR
It’s quite natural in MR to ask whether some molecular phenotype mediates the association between genotype and phenotype:
Does gene expression mediate the association between \(G\) and \(\color{darkorange}{Y}\)? (e.g., TWAS)
Does protein abundance mediate the association? (e.g., PWAS)
Do metabolites mediate the association?
Often, interesting molecular phenotypes are measured in distinct cohorts (e.g., GTEx, Geuvadis) from the GWAS.
Inverse-variance weighted estimator (IVW)
Assume that summary statistics for the following have been generated:
Association between \(G\) and \(\color{darkgreen}{X}\): \(\hat{\beta}\), \(\hat{\sigma}^2_x\)
Association between \(G\) and \(\color{darkorange}{Y}\): \(\hat{\psi}\), \(\hat{\sigma}^2_y\)
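Given these summary statistics for several variants, the commonly used fixed-effect form of the IVW estimator is a weighted average of the per-variant ratios \(\hat{\psi}_j / \hat{\beta}_j\), with weights \(\hat{\beta}_j^2 / \hat{\sigma}^2_{y,j}\). The sketch below assumes independent variants and this first-order weighting, which ignores \(\hat{\sigma}^2_x\); the function name `ivw` and the input values are hypothetical and constructed only to illustrate the arithmetic.

```python
import math

def ivw(beta_hat, psi_hat, var_y):
    """Fixed-effect IVW estimate and standard error from summary statistics.

    beta_hat: per-variant G-X effect estimates
    psi_hat:  per-variant G-Y effect estimates
    var_y:    per-variant sampling variances of psi_hat
    """
    numerator = sum(b * p / v for b, p, v in zip(beta_hat, psi_hat, var_y))
    denominator = sum(b * b / v for b, v in zip(beta_hat, var_y))
    return numerator / denominator, math.sqrt(1.0 / denominator)

# Hypothetical summary statistics for three independent variants.
# Every ratio psi/beta equals 0.3 by construction.
beta_hat = [0.10, 0.20, 0.15]
psi_hat = [0.03, 0.06, 0.045]
var_y = [0.0004, 0.0004, 0.0009]

alpha_ivw, se_ivw = ivw(beta_hat, psi_hat, var_y)
print(f"IVW estimate = {alpha_ivw:.3f}, SE = {se_ivw:.3f}")
```

Variants with larger \(\hat{\beta}_j\) or smaller \(\hat{\sigma}^2_{y,j}\) carry more weight, since their ratio estimates are more precise.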