From Array Suite Wiki
This function performs variant association testing for either all variants in a Land (genome-wide association) or a pre-defined set of variants. The result is saved as a result set.
- 1) Define the SampleSet to be analyzed
- a. When defining the SampleSet, consider the availability of phenotype data and covariates so that the SampleSet will be representative of the final sample list to be analyzed
- b. Decide what data type you will be analyzing (array genotypes, sequence, imputed data, or combined)
- There are several ways to create SampleSets in Land. For example, use the filter pane to remove samples missing phenotype data and samples that have not been imputed. Then use Create SampleSet.
- 2) Run PCA if you would like to use PCs as covariates in your association model.
- 3) Run the Variant Association as outlined using the appropriate options
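The sample selection in step 1 can be sketched outside of Land as follows. This is a minimal illustration, not GeneticsLand code: the record layout, the field names (`bmi`, `age`, `imputed`), and the helper `build_sample_set` are all hypothetical.

```python
# Hypothetical sample records; the field names are illustrative only and do
# not reflect how GeneticsLand stores sample metadata.
samples = [
    {"id": "S1", "bmi": 24.1, "age": 52, "imputed": True},
    {"id": "S2", "bmi": None, "age": 47, "imputed": True},
    {"id": "S3", "bmi": 31.0, "age": None, "imputed": True},
    {"id": "S4", "bmi": 27.5, "age": 60, "imputed": False},
    {"id": "S5", "bmi": 22.8, "age": 39, "imputed": True},
]


def build_sample_set(records, phenotype, covariates, require_imputed=True):
    """Return the IDs of samples that have complete data for the model."""
    required = [phenotype, *covariates]
    keep = []
    for rec in records:
        # Drop samples missing the phenotype or any covariate.
        if any(rec.get(field) is None for field in required):
            continue
        # Optionally drop samples that have not been imputed.
        if require_imputed and not rec.get("imputed", False):
            continue
        keep.append(rec["id"])
    return keep


sample_set = build_sample_set(samples, phenotype="bmi", covariates=["age"])
print(sample_set)
```

Filtering on the phenotype and every planned covariate up front keeps the SampleSet equal to the list of samples that the association model will actually use.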
Perform Variant Association Analysis
Land – Select the instance of GeneticsLand that you wish to use in your analysis
Data type – Select the data type that you wish to use in your analysis. The exported variants will depend on how the data was published. For example, selecting “Genotyped Data” will export variants that were originally published as genotype data (PLINK or VCF (Genotyped)) and will exclude variants from sequencing and imputation (VCF (Sequenced), VCF (Imputed), and Impute2).
Sample set – A sample set is a collection of samples within GeneticsLand. To create a SampleSet, select View | Samples | Create SampleSet
Select variant list file – Restrict the analysis to the subset of variants provided in the list file.
- Select Continuous trait (linear model) for a quantitative outcome (phenotype) (e.g. BMI)
- Select Binary trait (logistic model) for a binary outcome (phenotype) (e.g. Case/Control)
- Select Survival trait (Cox model) for survival outcomes
- Check the EMMAX option to run the Efficient Mixed-Model Association eXpedited (EMMAX) method to adjust for population stratification. EMMAX is only available for continuous outcomes. Alternatively, PCA can be used for all outcomes.
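To illustrate what the continuous-trait option estimates, here is a minimal sketch of a single-variant linear model without covariates. It assumes an additive genotype coding (0, 1, or 2 copies of the tested allele), which this page does not specify, and the function name `linear_association` and the example values are hypothetical. It is not the implementation used by GeneticsLand.

```python
import math


def linear_association(genotypes, phenotypes):
    """Regress a quantitative phenotype on additive genotype codes (0/1/2).

    Returns (beta, standard_error, t_statistic) for the genotype term.
    Assumption: additive coding and no covariates.
    """
    n = len(genotypes)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("variant is monomorphic in these samples")
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    beta = sxy / sxx                      # effect per copy of the allele
    intercept = mean_y - beta * mean_g
    rss = sum((y - intercept - beta * g) ** 2
              for g, y in zip(genotypes, phenotypes))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of beta
    t_stat = beta / se if se > 0 else math.inf
    return beta, se, t_stat


# Illustrative data: six samples, BMI-like phenotype.
beta, se, t_stat = linear_association(
    [0, 0, 1, 1, 2, 2], [20.0, 22.0, 23.0, 25.0, 27.0, 27.0]
)
print(beta, se, t_stat)
```

The p-value would come from a t distribution with n - 2 degrees of freedom; it is omitted here to keep the sketch dependency-free. Binary and survival traits replace this model with logistic and Cox regression, respectively.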
Specify Model (Phenotype and Covariates)
- Select the phenotype to be tested using the dynamic search box, which shows clinical and sample metadata
- Include covariates in the association model by scrolling through the metadata variables and selecting "Add"
- Set Control – For binary traits only, select the referent group for the model
NOTE: Crossed (interaction) or nested models are currently in beta.
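The role of the Set Control option can be pictured as the coding of the binary outcome before a logistic model is fitted. The sketch below is an assumption about that coding (referent group as 0, the other group as 1); the helper `encode_binary_trait` and the labels are hypothetical and not part of GeneticsLand.

```python
def encode_binary_trait(labels, control):
    """Code a two-level phenotype as 0 (referent group) or 1 (other group)."""
    levels = set(labels)
    if control not in levels:
        raise ValueError(f"control level {control!r} not found in phenotype")
    if len(levels) != 2:
        raise ValueError("a binary trait must have exactly two levels")
    return [0 if label == control else 1 for label in labels]


# Illustrative phenotype values.
coded = encode_binary_trait(["Case", "Control", "Case", "Control"], control="Control")
print(coded)
```

Under this coding, the estimated effect of a variant is expressed relative to the group chosen as the control.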
- HWE p-value cutoff – Filter genetic markers by Hardy-Weinberg Equilibrium (HWE) p-value
- R2 cutoff (derived) – Imputation quality (R2) threshold, with R2 calculated using the selected samples (set to 0 to export everything)
- Allele count cutoff – Cutoff for the minor allele count in the selected SampleSet. Markers with very few instances of the minor allele (fewer than the cutoff) are excluded.
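The HWE and allele count filters can be sketched as below. Several points are assumptions rather than documented GeneticsLand behavior: the HWE test here is a 1-degree-of-freedom chi-square test (an exact test may be used instead), markers with a p-value below the cutoff are treated as failing, and the function names and default cutoffs are hypothetical.

```python
import math


def minor_allele_count(n_ref_hom, n_het, n_alt_hom):
    """Count copies of the less frequent allele from genotype counts."""
    ref_alleles = 2 * n_ref_hom + n_het
    alt_alleles = 2 * n_alt_hom + n_het
    return min(ref_alleles, alt_alleles)


def hwe_chi2_pvalue(n_ref_hom, n_het, n_alt_hom):
    """Chi-square (1 df) test of Hardy-Weinberg proportions."""
    n = n_ref_hom + n_het + n_alt_hom
    if n == 0:
        raise ValueError("no genotyped samples")
    p = (2 * n_ref_hom + n_het) / (2 * n)   # reference allele frequency
    q = 1.0 - p
    observed = (n_ref_hom, n_het, n_alt_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic markers).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_filters(counts, hwe_cutoff=1e-6, mac_cutoff=5):
    """Keep a marker if it meets both cutoffs (hypothetical defaults)."""
    return (hwe_chi2_pvalue(*counts) >= hwe_cutoff
            and minor_allele_count(*counts) >= mac_cutoff)


for counts in [(25, 50, 25), (50, 0, 50), (98, 2, 0)]:
    print(counts, passes_filters(counts))
```

Both statistics are computed from the genotype counts of the selected samples, so changing the SampleSet can change which markers pass.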
Result set name – Name to be used for output logs and the returned result set in Land.
Output folder – Designate the output location
Results are returned to Land. To view them, select Analytics | Open Result Set
The results will be labeled with the name given in the Result set name field.