GWAS: \(\boldsymbol{Y} = \boldsymbol{X}\beta + \boldsymbol{\epsilon}\)
Example with n=4 \[\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\cdot \beta + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \end{bmatrix}\]
\(\epsilon_i \sim N(0, \sigma_\epsilon^2)\)
\[\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \end{bmatrix} \sim N \left(\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \sigma_\epsilon^2\cdot \begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix}\right) \]
We (typically) estimate \(\beta\) using linear regression. In fact, under the normal error model above, the least-squares estimate \(\hat{\beta}\) is also the MLE (maximum likelihood estimate).
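To make the estimation step concrete, here is a minimal sketch in Python with NumPy. It simulates data from \(\boldsymbol{Y} = \boldsymbol{X}\beta + \boldsymbol{\epsilon}\) and recovers \(\beta\) by least squares. The sample size, effect size, allele frequency and the helper name `ols_slope` are all assumptions made for illustration, and the model has no intercept, as in the equation above.

```python
import numpy as np


def ols_slope(x, y):
    """Least-squares estimate of beta in y = x * beta + eps (no intercept).

    With i.i.d. normal errors this is also the maximum likelihood estimate.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(x @ y / (x @ x))


# Simulated data; every value below is an assumption for illustration.
rng = np.random.default_rng(0)
n = 1000
true_beta = 0.5     # assumed effect size
sigma_eps = 1.0     # assumed standard deviation of the error term

# Genotype dosages 0/1/2, drawn with an assumed allele frequency of 0.3
x = rng.binomial(2, 0.3, size=n).astype(float)
y = x * true_beta + rng.normal(0.0, sigma_eps, size=n)

beta_hat = ols_slope(x, y)
print(beta_hat)
```

The printed estimate should land close to the assumed `true_beta`, with the gap shrinking as `n` grows.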
Using a random effect to account for population structure. \(\boldsymbol{Y} = \boldsymbol{X}\cdot\beta + u + \boldsymbol{\epsilon}\)
In contrast to \(\beta\), which is a fixed effect (not random), \(u\) is a random effect. We describe random effects by their distribution, i.e. by the parameters of the distribution of the random variable. It’s common to use a normal distribution for that: \(u_i \sim N(0, \sigma^2_g)\)
The full vector of random effects (one per individual) in the n=4 example is:
\[\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix} \sim N \left(\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \sigma^2_g\cdot \begin{bmatrix} k_{11}&k_{12}&k_{13}&k_{14}\\ k_{21}&k_{22}&k_{23}&k_{24}\\ k_{31}&k_{32}&k_{33}&k_{34} \\ k_{41}&k_{42}&k_{43}&k_{44} \end{bmatrix} \right) \]
The authors of EMMAX proposed using the genetic relatedness matrix as the covariance matrix of \(u\) (up to the scale \(\sigma^2_g\)), so that this model accounts for population structure and relatedness. Let’s look at a very simple example where the population structure is given by a random effect that depends only on population membership. \[u = \begin{bmatrix} u_\text{AFR} \\ u_\text{AFR} \\ u_\text{EUR} \\ u_\text{EUR} \end{bmatrix} \] \(u_\text{AFR} \sim N(0, \sigma^2_g)\) \(u_\text{EUR} \sim N(0, \sigma^2_g)\) \(u_\text{AFR} \perp u_\text{EUR}\)
We assume that the first two individuals have AFR ancestry and that the last two have EUR ancestry; we use \(u_\text{AFR}\) for the AFR-specific random effect and \(u_\text{EUR}\) for the EUR-specific random effect. Let’s assume that both have mean zero and the same variance, \(\sigma^2_g\), and that they are independent of each other. Therefore:
\(E u_\text{AFR} = E u_\text{EUR} = 0\) \(E u_\text{AFR}^2 = E u_\text{EUR}^2 = \sigma^2_g\) \(E u_\text{AFR} \cdot u_\text{EUR} = 0\)
\(\boldsymbol{Y} = \boldsymbol{X}\cdot\beta + u + \boldsymbol{\epsilon}\)
\[\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\cdot \beta + \begin{bmatrix} u_\text{AFR} \\ u_\text{AFR} \\ u_\text{EUR} \\ u_\text{EUR} \end{bmatrix} + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \end{bmatrix}\] where \(\beta\) is a fixed effect.
\[ u \sim N \left (\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \sigma^2_g\cdot K \right) \] \[ \epsilon \sim N \left (\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \sigma^2_\epsilon\cdot\begin{bmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix} \right) \] Let’s calculate K now. K is a similarity matrix and is sometimes called a kernel.
Calculate \(K = \frac{1}{\sigma^2_g} Var(u)\). Since \[Eu = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}\] we have \(Var(u) = E(u-Eu)(u-Eu)' = Euu'\), so \[ Var \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix} = Euu' = E \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{bmatrix} \cdot \begin{bmatrix} u_1 & u_2 & u_3 & u_4 \end{bmatrix} \]
\[ \sigma^2_g \cdot \mathbf{K}= E \begin{bmatrix} u_1 u_1&u_1 u_2&u_1 u_3&u_1 u_4 \\ u_2 u_1&u_2 u_2&u_2 u_3&u_2 u_4 \\ u_3 u_1&u_3 u_2&u_3 u_3&u_3 u_4 \\ u_4 u_1&u_4 u_2&u_4 u_3&u_4 u_4 \end{bmatrix} = \begin{bmatrix} E u_1 u_1&E u_1 u_2&E u_1 u_3&E u_1 u_4 \\ E u_2 u_1&E u_2 u_2&E u_2 u_3&E u_2 u_4 \\ E u_3 u_1&E u_3 u_2&E u_3 u_3&E u_3 u_4 \\ E u_4 u_1&E u_4 u_2&E u_4 u_3&E u_4 u_4 \end{bmatrix} \] Because \(u_1 = u_2 = u_\text{AFR}\) and \(u_3 = u_4 = u_\text{EUR}\), each entry is either \(E u_\text{AFR}^2 = E u_\text{EUR}^2 = \sigma^2_g\) (same population) or \(E u_\text{AFR} \cdot u_\text{EUR} = 0\) (different populations): \[ \sigma^2_{g}\cdot \mathbf{K} = \sigma^2_g\cdot\begin{bmatrix} 1&1&0&0 \\ 1&1&0&0 \\ 0&0&1&1 \\ 0&0&1&1 \end{bmatrix} \]
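The block structure of \(\mathbf{K}\) can be checked numerically. The sketch below (Python with NumPy) builds \(\mathbf{K}\) from population labels and compares it with a Monte Carlo estimate of \(Euu'/\sigma^2_g\). The helper name `membership_kernel`, the value of \(\sigma^2_g\) and the number of draws are assumptions chosen for illustration; this is not the EMMAX implementation, which works with a genetic relatedness matrix estimated from genotypes.

```python
import numpy as np


def membership_kernel(labels):
    """K[i, j] = 1 if individuals i and j share a population label, else 0."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)


# First two individuals AFR, last two EUR, as in the n=4 example.
labels = ["AFR", "AFR", "EUR", "EUR"]
K = membership_kernel(labels)

# Monte Carlo check; sigma2_g and n_draws are assumed values.
rng = np.random.default_rng(1)
sigma2_g = 2.0
n_draws = 200_000

# Independent population-level effects with mean 0 and variance sigma2_g
u_afr = rng.normal(0.0, np.sqrt(sigma2_g), size=n_draws)
u_eur = rng.normal(0.0, np.sqrt(sigma2_g), size=n_draws)

# Each column is one draw of the vector (u_1, u_2, u_3, u_4)
u = np.stack([u_afr, u_afr, u_eur, u_eur])

# Since E[u] = 0, the average of u u' over draws estimates Var(u)
empirical_cov = u @ u.T / n_draws

print(K)
print(np.round(empirical_cov / sigma2_g, 2))
```

Up to sampling noise, the rescaled empirical covariance should match \(\mathbf{K}\): ones within a population block and zeros between populations.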