Mixed Effects Model to Handle Population Structure

R package build

2021-02-11

Categories: Lecture

GWAS: $\boldsymbol{Y} = \boldsymbol{X}\beta + \boldsymbol{\epsilon}$

Example with n=4 \[\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\cdot \beta + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \end{bmatrix}\]

$\epsilon_i \sim N(0, \sigma_\epsilon^2)$

\[\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \end{bmatrix} \sim N \left(\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \sigma_\epsilon^2\cdot \begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix}\right) \]

We typically estimate $\beta$ using linear regression. In fact, the estimate $\hat{\beta}$ from linear regression is the MLE (maximum likelihood estimate) under this model.
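As a minimal sketch of that estimate, the Python below computes the least-squares $\hat{\beta}$ for the no-intercept model written above, $\hat{\beta} = \sum_i x_i y_i / \sum_i x_i^2$. The function name `estimate_beta` and the genotype and phenotype values are made up for illustration.

```python
# Least-squares estimate of beta for y = x * beta + eps (no intercept).
# Under independent N(0, sigma^2) errors this is also the MLE.

def estimate_beta(x, y):
    """Return beta_hat = sum(x_i * y_i) / sum(x_i ** 2)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / sxx

# Hypothetical data for the n = 4 example (not from the lecture).
x = [0.0, 1.0, 2.0, 1.0]   # genotype dosages
y = [0.1, 0.9, 2.1, 1.0]   # phenotypes

beta_hat = estimate_beta(x, y)
print(beta_hat)
```

A real GWAS would also include an intercept and covariates; they are left out here to match the model as written.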

We can use a random effect to account for population structure: $\boldsymbol{Y} = \boldsymbol{X}\cdot\beta + u + \boldsymbol{\epsilon}$

In contrast to $\beta$, which is a fixed effect (not random), $u$ is a random effect. We describe random effects by their distribution, i.e. by the parameters of the distribution of the random variable. It’s common to use a normal distribution for that: $u_i \sim N(0, \sigma^2_g)$

The full vector of random effects (one per individual) in the $n=4$ example is:

\[\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix} \sim N \left(\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \sigma^2_g\cdot \begin{bmatrix} k_{11}&k_{12}&k_{13}&k_{14}\\ k_{21}&k_{22}&k_{23}&k_{24}\\ k_{31}&k_{32}&k_{33}&k_{34} \\ k_{41}&k_{42}&k_{43}&k_{44} \end{bmatrix} \right) \]
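The entries $k_{ij}$ encode how similar individuals $i$ and $j$ are. As a rough sketch of how such a matrix might be filled in from genotype data, the Python below computes a relatedness matrix as $ZZ'/M$, where $Z$ holds the genotypes standardized per SNP and $M$ is the number of SNPs used. This is one common estimator and not necessarily the one used by any particular tool; the function name and the toy genotypes are illustrative assumptions.

```python
def genetic_relatedness_matrix(genotypes):
    """Sketch of a relatedness matrix K = Z Z' / M.

    genotypes: list of n individuals, each a list of allele counts (0/1/2).
    Each SNP column is centered and scaled to unit variance across the
    sample; SNPs with no variation are skipped.
    """
    n = len(genotypes)
    n_snps = len(genotypes[0])
    z_cols = []
    for s in range(n_snps):
        col = [g[s] for g in genotypes]
        mean = sum(col) / n
        var = sum((c - mean) ** 2 for c in col) / n
        if var == 0:
            continue  # a monomorphic SNP carries no relatedness signal
        sd = var ** 0.5
        z_cols.append([(c - mean) / sd for c in col])
    if not z_cols:
        raise ValueError("no polymorphic SNPs to build the matrix from")
    used = len(z_cols)
    return [[sum(z[i] * z[j] for z in z_cols) / used for j in range(n)]
            for i in range(n)]

# Toy genotypes (made up): individuals 1-2 are identical, as are 3-4.
toy_genotypes = [[0, 2, 0], [0, 2, 0], [2, 0, 2], [2, 0, 2]]
K_grm = genetic_relatedness_matrix(toy_genotypes)
for row in K_grm:
    print(row)
```

With real data the standardization is often based on allele frequencies rather than the sample variance used here.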

The authors of EMMAX proposed that if we use the genetic relatedness matrix for this covariance structure, the model is well suited to accounting for population structure and relatedness.

Let’s look at a very simple example where the population structure is given by a random effect that depends only on population membership.

\[u = \begin{bmatrix} u_\text{AFR} \\ u_\text{AFR} \\ u_\text{EUR} \\ u_\text{EUR} \end{bmatrix} \]

$u_\text{AFR} \sim N(0, \sigma^2_g)$, $u_\text{EUR} \sim N(0, \sigma^2_g)$, $u_\text{AFR} \bot u_\text{EUR}$

We assume that the first two individuals have AFR ancestry and that the last two people have EUR ancestry; we use $u_\text{AFR}$ to represent AFR specific random effect and $u_\text{EUR}$ for the EUR specific random effect. Let’s assume that both have the same variance, $\sigma^2_g$ and that they are independent of each other. Therefore:

$E u_\text{AFR} = E u_\text{EUR} = 0$, $E u_\text{AFR}^2 = E u_\text{EUR}^2 = \sigma^2_g$, and $E u_\text{AFR} \cdot u_\text{EUR} = 0$

$\boldsymbol{Y} = \boldsymbol{X}\cdot\beta + u + \boldsymbol{\epsilon}$

\[\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\cdot \beta + \begin{bmatrix} u_\text{AFR} \\ u_\text{AFR} \\ u_\text{EUR} \\ u_\text{EUR} \end{bmatrix} + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \end{bmatrix}\]

where $\beta$ is a fixed effect and $u$ is a random effect.

\[ u \sim N \left (\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \sigma^2_g\cdot K \right) \]

\[ \epsilon \sim N \left (\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \sigma^2_\epsilon\cdot\begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix} \right) \]

Let’s calculate $K$ now. $K$ is a similarity matrix and is sometimes called a kernel.
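As a brief aside before that calculation: if we additionally assume that $u$ and $\epsilon$ are independent (an assumption not stated explicitly above), the two distributions combine into a single covariance for the phenotype, which shows where $K$ enters the model. Here $I$ denotes the $4 \times 4$ identity matrix.

\[ Var(\boldsymbol{Y}) = Var(u) + Var(\boldsymbol{\epsilon}) = \sigma^2_g \cdot K + \sigma^2_\epsilon \cdot I \]

\[ \boldsymbol{Y} \sim N \left( \boldsymbol{X}\cdot\beta, \; \sigma^2_g \cdot K + \sigma^2_\epsilon \cdot I \right) \]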

To calculate $K = \frac{1}{\sigma^2_g} Var(u)$, start from the covariance matrix of $u$:

\[ Var \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix} = Euu' = E \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix} \cdot \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix} \]

Here we use that $Var(u) = E(u-Eu)(u-Eu)' = Euu'$, since

\[Eu = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}\]

\[ \sigma^2_g \cdot \mathbf{K}= E \begin{bmatrix} u_1 u_1&u_1 u_2&u_1 u_3&u_1 u_4 \\ u_2 u_1&u_2 u_2&u_2 u_3&u_2 u_4 \\ u_3 u_1&u_3 u_2&u_3 u_3&u_3 u_4\\ u_4 u_1&u_4 u_2&u_4 u_3&u_4 u_4 \end{bmatrix} = \begin{bmatrix} E u_1 u_1&E u_1 u_2&E u_1 u_3&E u_1 u_4 \\ E u_2 u_1&E u_2 u_2&E u_2 u_3&E u_2 u_4 \\ E u_3 u_1&E u_3 u_2&E u_3 u_3&E u_3 u_4 \\ E u_4 u_1&E u_4 u_2&E u_4 u_3&E u_4 u_4 \end{bmatrix} \]

Individuals 1 and 2 share $u_\text{AFR}$ and individuals 3 and 4 share $u_\text{EUR}$, so each entry is either $E u_\text{AFR}^2 = E u_\text{EUR}^2 = \sigma^2_g$ (same population) or $E u_\text{AFR} \cdot u_\text{EUR} = 0$ (different populations):

\[ \sigma^2_{g}\cdot \mathbf{K} = \sigma^2_g\cdot\begin{bmatrix} 1&1&0&0 \\ 1&1&0&0 \\ 0&0&1&1 \\ 0&0&1&1 \end{bmatrix} \]
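The block structure of this matrix can be checked numerically. The sketch below, in plain Python, builds the membership kernel from population labels and compares $\sigma^2_g \cdot K$ with a Monte Carlo estimate of $Euu'$. The function names, the value $\sigma_g = 2$, and the number of draws are arbitrary choices for illustration.

```python
import random

def membership_kernel(labels):
    """K[i][j] = 1 if individuals i and j share a population label, else 0."""
    return [[1 if a == b else 0 for b in labels] for a in labels]

def simulate_u_covariance(labels, sigma_g, n_draws, seed=0):
    """Monte Carlo estimate of E[u u'] when each population gets one
    independent N(0, sigma_g^2) effect shared by all of its members."""
    rng = random.Random(seed)
    pops = sorted(set(labels))
    n = len(labels)
    acc = [[0.0] * n for _ in range(n)]
    for _ in range(n_draws):
        effect = {p: rng.gauss(0.0, sigma_g) for p in pops}
        u = [effect[lab] for lab in labels]
        for i in range(n):
            for j in range(n):
                acc[i][j] += u[i] * u[j]
    return [[v / n_draws for v in row] for row in acc]

labels = ["AFR", "AFR", "EUR", "EUR"]
K = membership_kernel(labels)
cov = simulate_u_covariance(labels, sigma_g=2.0, n_draws=20000)

for k_row, c_row in zip(K, cov):
    # Entries of cov should be close to sigma_g^2 * K = 4 * K.
    print(k_row, [round(c, 2) for c in c_row])
```

Within-population entries of the simulated covariance should land near $\sigma^2_g = 4$ and between-population entries near 0, up to sampling noise.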
