Learning Objectives
- run example GWAS using plink
- recognize plink data formats
- perform QC with plink
- run association with plink
- interpret output of plink
Material
We will follow the tutorial by Marees et al. (2018), "A tutorial on conducting genome‐wide association studies: Quality control and statistical analysis" (full citation under References).
Lab
Log into Midway and load plink
If you want to work on the cluster, you will need to load the plink module:
module avail                 ## Step 1: list all available modules
module avail plink           ## Step 2: list the available plink versions
module load plink/1.90b6.9   ## Step 3: load a specific plink version
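After loading the module, it can help to confirm that plink is actually visible in your shell. A minimal sketch, assuming the executable is named `plink` (an assumption about your setup; the variable `plink_status` is only for illustration):

```shell
# Check whether a plink executable is on the PATH of the current shell.
# Assumption: the binary is called "plink"; adjust if your install differs.
if command -v plink >/dev/null 2>&1; then
  plink_status="found"
  plink --version   # prints the plink version string
else
  plink_status="missing"
  echo "plink not found: run 'module load plink/1.90b6.9' or add it to your PATH"
fi
echo "plink status: $plink_status"
```

The same check works on your own computer after you download plink, as long as the directory containing the executable is on your PATH.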
If you prefer to run on your own computer, download plink from the PLINK website.
We also need to load R
module load R
Download the tutorial files
An easy way to get the tutorial files is to clone the git repository:
git clone https://github.com/MareesAT/GWA_tutorial.git
You should see something like this
Data format
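The tutorial data come as a plink binary fileset: a .bed file (binary genotypes) plus two plain-text companions, .fam (one row per individual) and .bim (one row per variant). The sketch below writes a tiny made-up .fam and .bim (all IDs and positions are invented for illustration, not taken from the tutorial data) and counts rows to show how the two text files can be inspected.

```shell
# Toy files for illustration only; every value below is made up.
workdir=$(mktemp -d)

# .fam columns: family ID, individual ID, father ID, mother ID,
#               sex (1=male, 2=female), phenotype (1=control, 2=case, -9=missing)
cat > "$workdir/toy.fam" <<'EOF'
FAM1 IND1 0 0 1 2
FAM2 IND2 0 0 2 1
FAM3 IND3 0 0 2 -9
EOF

# .bim columns: chromosome, variant ID, genetic distance (cM),
#               base-pair position, allele 1, allele 2
printf '1\trs0001\t0\t10000\tA\tG\n1\trs0002\t0\t20000\tC\tT\n' > "$workdir/toy.bim"

# One line per individual in .fam, one line per variant in .bim.
n_individuals=$(wc -l < "$workdir/toy.fam" | tr -d ' ')
n_variants=$(wc -l < "$workdir/toy.bim" | tr -d ' ')
# Column 6 of .fam is the phenotype; 2 means case.
n_cases=$(awk '$6 == 2' "$workdir/toy.fam" | wc -l | tr -d ' ')

echo "individuals: $n_individuals, variants: $n_variants, cases: $n_cases"
```

Running the same `wc -l` and `awk` commands on the real .fam and .bim files is a quick way to check sample and variant counts before and after each QC step.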
Basic Plink command
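Most plink calls follow the same pattern: name the input fileset, request one or more operations, and give an output prefix. A hedged sketch, assuming you are in the tutorial's 1_QC_GWAS directory and that the input fileset is named HapMap_3_r3_1 (the prefix and the output name `plink_missing` are assumptions; substitute your own):

```shell
# Assumed input prefix (expects HapMap_3_r3_1.bed/.bim/.fam); change as needed.
prefix="HapMap_3_r3_1"
out="plink_missing"   # hypothetical output prefix

if command -v plink >/dev/null 2>&1 && [ -f "${prefix}.bed" ]; then
  # --bfile:   read a binary fileset (.bed/.bim/.fam) with this prefix
  # --missing: report missing-genotype rates per individual and per variant
  # --out:     prefix for the output files (.imiss, .lmiss, .log)
  plink --bfile "$prefix" --missing --out "$out"
  run_status="ran"
else
  echo "Skipping: plink or ${prefix}.bed not found in the current directory"
  run_status="skipped"
fi
```

The guard around the call is only there so the snippet fails gently when plink or the data are missing; in the lab you would normally run the `plink` line directly.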
For convenience, I’ve copied the tutorial’s scripts below
1. GWAS QC
- 1_Main_script_QC_GWAS.txt
- check_heterozygosity_rate.R
- Relatedness.R
- hist_miss.R
- pops_HapMap_3_r3
- hwe.R
- MAF_check.R
- gender_check.R
- heterozygosity_outliers_list.R
- inversion.txt
To look at the figures on your local machine, you can copy them with scp or rsync. With rsync:
rsync -avz haky@midway2.rcc.uchicago.edu:/project2/bios25328/haky/GWA_tutorial/1_QC_GWAS ~/Downloads/lab2/
This copies everything under the directory. To copy only selected files to your local machine, use scp (replace haky and the path with your own username and directory):
scp haky@midway2.rcc.uchicago.edu:/project2/bios25328/haky/GWA_tutorial/1_QC_GWAS/histimiss.pdf .
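If you want only the figures rather than the whole directory, rsync's `--include` and `--exclude` filters are one option. The sketch below demonstrates the filter on two local temporary directories with made-up files, so it can be tried without a cluster connection; for real use, replace the source with the remote path.

```shell
# Local stand-ins for the remote and local directories (illustration only).
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/histimiss.pdf" "$src/plink.log"

if command -v rsync >/dev/null 2>&1; then
  # Include top-level *.pdf files, then exclude everything else.
  # The order of the filter rules matters: the first matching rule wins.
  rsync -av --include='*.pdf' --exclude='*' "$src/" "$dst/"
  copied=$(ls "$dst")
else
  echo "rsync is not installed on this machine"
  copied=""
fi
echo "copied: $copied"
```

Here only histimiss.pdf ends up in the destination; plink.log is skipped by the `--exclude='*'` rule.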
2. Population stratification
3. Association
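plink's `--assoc` option writes a whitespace-delimited table with one row per variant, including a `P` column with the association p-value. One way to pull out the strongest signals is an awk filter. The sketch below runs on a tiny made-up table in the `.assoc` column layout (all values are invented for illustration, and the 1e-5 threshold is an arbitrary example, not a recommendation):

```shell
# Toy association results; every row below is made up.
assoc=$(mktemp)
cat > "$assoc" <<'EOF'
CHR SNP BP A1 F_A F_U A2 CHISQ P OR
1 rs0001 10000 A 0.40 0.20 G 25.1 5.4e-07 2.67
1 rs0002 20000 C 0.31 0.30 T 0.05 0.82 1.05
2 rs0003 15000 T 0.12 0.25 C 19.8 8.6e-06 0.41
EOF

# Locate the P column by its header name, then print the SNP ID (column 2)
# for every row whose p-value is below the threshold.
hits=$(awk -v thr=1e-5 '
  NR == 1 { for (i = 1; i <= NF; i++) if ($i == "P") p = i; next }
  ($p + 0) < (thr + 0) { print $2 }
' "$assoc")
echo "$hits"
```

On this toy table the filter prints rs0001 and rs0003. Looking up the column by name rather than by position keeps the same filter usable on other plink outputs that also have a `P` column.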
References
Marees, AT, de Kluiver, H, Stringer, S, et al. A tutorial on conducting genome‐wide association studies: Quality control and statistical analysis. Int J Methods Psychiatr Res. 2018; 27:e1608. https://doi.org/10.1002/mpr.1608