vignette

GWAS Power Calculation

2021-01-24 Haky Im
We want to calculate the power to detect an association between a genetic marker and a quantitative phenotype. […] As a concrete example, we pre-specify the model for continuous trait \(Y\) and genotype \(X\) as \[Y = \beta X + \epsilon, ~~~~ \text{with } \epsilon \sim N(0, \sigma_\epsilon^2)\] where … Read more →

Calculate Z-score, P-value, Chi2 stat from GWAS

2022-04-13 Haky Im
\[Y = \beta \cdot X + \epsilon\] GWAS summary statistics will contain an estimate of the regression coefficient \(\hat\beta\) and its standard error \(\text{se}(\hat\beta)\) for each SNP in the GWAS. We distinguish the true \(\beta\) from the estimated value \(\hat\beta\) using a hat. […] \[ … Read more →
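As a rough illustration of the quantities named in this post's title, the sketch below derives a Z-score, a chi-square statistic, and a two-sided p-value from \(\hat\beta\) and \(\text{se}(\hat\beta)\). The numbers and the variable names (`beta_hat`, `se_beta`) are made up for illustration, and the sketch assumes the usual large-sample normal approximation for \(\hat\beta\); it is not taken from the post itself.

```r
# Hypothetical summary statistics for three SNPs (made-up numbers)
beta_hat <- c(0.12, -0.05, 0.30)
se_beta  <- c(0.04,  0.05, 0.06)

# Z-score: estimated effect divided by its standard error
z <- beta_hat / se_beta

# Chi-square statistic with 1 degree of freedom is the squared Z-score
chi2 <- z^2

# Two-sided p-value from the standard normal distribution
pval <- 2 * pnorm(-abs(z))

# The same p-value obtained from the chi-square distribution (df = 1)
pval_chi2 <- pchisq(chi2, df = 1, lower.tail = FALSE)

print(data.frame(beta_hat, se_beta, z, chi2, pval, pval_chi2))
```

The two p-value columns should agree up to floating-point error, since the square of a standard normal variable follows a chi-square distribution with one degree of freedom.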

Multiple Testing 2022

2022-04-06 Haky Im
xkcd image for significance […] We start by defining some parameters for the simulations. The need for these will become obvious later. ## set seed to make simulations reproducible ## set.seed(20210108) ## let's start with some parameter definitions nsamp = 100 beta = 2 h2 = 0.1 sig2X = h2 … Read more →

Multiple Testing

2021-04-07 Haky Im
Let’s sample one value Z from the standard normal distribution set.seed(8050151) z1 = rnorm(1, mean=0, sd=1) z1 ## [1] -0.6089123 Let’s now calculate the p-value of this observation. Recall that the p-value is the probability that we will draw a sample as extreme as the observed value under the null … Read more →
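One way the p-value for this draw could be computed is sketched below. It hard-codes the value printed in the excerpt rather than re-drawing it, and it assumes that "as extreme" means extreme in either direction (a two-sided test); the excerpt is truncated, so the post's own definition may differ. The names `p_two_sided` and `p_lower` are illustrative.

```r
# Observed value, copied from the output shown in the excerpt
z1 <- -0.6089123

# Two-sided p-value: probability of a draw at least as far from 0
# as z1, in either tail of the standard normal distribution
p_two_sided <- 2 * pnorm(-abs(z1))

# One-sided (lower-tail) p-value, for comparison
p_lower <- pnorm(z1)

print(c(two_sided = p_two_sided, lower_tail = p_lower))
```

Because `z1` is negative, the two-sided p-value here is exactly twice the lower-tail probability.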

Hardy Weinberg Equilibrium

2021-04-05 Haky Im
In a population with random mating and no migration, most common variants will be at Hardy Weinberg Equilibrium (HWE). The genotype of a variant in HWE with minor allele (\(a\)) with frequency maf will be distributed as follows prob(AA): \(p = (1 - \text{maf})^2\) prob(Aa): \(2 \cdot \text{maf} … Read more →
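The standard HWE genotype frequencies can be checked with a small simulation. In the sketch below, the minor allele frequency (`maf = 0.3`), the sample size, and the seed are arbitrary choices made for illustration; each genotype is simulated as the number of minor alleles across two independent draws, which is what random mating implies.

```r
set.seed(1)   # arbitrary seed, for reproducibility only
maf <- 0.3    # hypothetical minor allele frequency
n   <- 10000  # hypothetical number of individuals

# Expected genotype frequencies under Hardy Weinberg Equilibrium
hwe_probs <- c(AA = (1 - maf)^2,
               Aa = 2 * maf * (1 - maf),
               aa = maf^2)

# Simulate genotypes as counts of the minor allele (0, 1, or 2)
x <- rbinom(n, size = 2, prob = maf)

# Observed genotype frequencies, in the same order as hwe_probs
obs <- as.numeric(table(factor(x, levels = 0:2))) / n

print(rbind(expected = hwe_probs, observed = obs))
```

With a sample of this size, the observed frequencies should land close to the expected ones, and the three expected frequencies sum to 1.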

Logistic Regression

2021-04-01 Haky Im
Logistic regression is the most common way to model a binary outcome Y […] The log of the odds happens to be a very convenient quantity to be modeled as a linear function of covariates: The log of the odds is modeled as a linear function of covariates \[Y \sim \text{Bernoulli}(\pi)\] \[ … Read more →

Maximum Likelihood Estimation

2021-04-01 Haky Im
Maximum likelihood estimation is one of the most important tools for statistical inference and is used to find estimates for model parameters. […] Download the slides here […] Check out the FiveMinuteStats series here Read more →

Winner's curse

2021-04-01 Haky Im

Genotype Imputation

2021-03-30 Haky Im

Hypothesis Testing

2021-01-21 Haky Im
In draft folder Gentle introduction to hypothesis testing from Khan Academy A drug effect example of hypothesis testing and p-values from Khan Academy Read more →