Genetic Principal Component Analysis
This method runs PLINK2's implementation of GCTA's Principal Component Analysis (PCA) on genotypes from a GeneticsLand, using the phase 3 genotypes of the 1000 Genomes continental reference populations as a reference to infer the ancestry of each sample.
To run this module, open a GeneticsLand and click Analytics | Principal Component Analysis.
Input Data Requirements
This method requires the user to choose a Sample Set specifying which samples to analyze.
Land - Select the GeneticsLand in which you are working (the one containing the genotyped samples you wish to analyze).
Sample set - Choose the Sample Set listing the samples you wish to analyze.
Job number - Specify how many export jobs to submit simultaneously. If your Array Server has a cluster enabled, these jobs are submitted to the cluster. The export is currently split into 700 jobs, so a value of 700 submits all of them to the cluster queue immediately.
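As a rough guide for choosing the Job number, the sketch below estimates the minimum number of submission rounds needed to work through the 700 export jobs. It assumes that at most the chosen number of jobs run at once and that jobs are processed in rounds; the way Array Server actually schedules and refills its queue is not described here, so treat the function (a hypothetical helper, not part of Array Suite) as an illustration only.

```python
import math

TOTAL_EXPORT_JOBS = 700  # from the text above: the export is split into 700 jobs


def submission_rounds(job_number: int, total_jobs: int = TOTAL_EXPORT_JOBS) -> int:
    """Return the minimum number of rounds if at most `job_number` jobs run at once.

    Assumption: jobs are handled in rounds of up to `job_number` jobs each.
    """
    if job_number < 1:
        raise ValueError("job_number must be at least 1")
    # Asking for more than the total cannot submit more jobs than exist.
    concurrent = min(job_number, total_jobs)
    return math.ceil(total_jobs / concurrent)


for n in (50, 100, 700):
    print(n, submission_rounds(n))
```

Under this assumption, a Job number of 100 needs 7 rounds, while 700 places every export job in the queue in a single round.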
1000 Genomes anchored plots
After the job has completed, view the Result Set by clicking Analytics | Open Result Set.
Then select the Sample Set name under the Principal Component Analysis tag and click OK.
The Result Set contains three views. The first is the table of eigenvectors calculated from the PCA that includes the 1000 Genomes reference samples. The samples from your Sample Set appear at the bottom with a Data Source value of Study. This table also includes the Inferred Population, which is based on the first 5 eigenvectors.
The second view is a scatter plot of eigenvector 1 (X axis) against eigenvector 2 (Y axis), with samples colored by Inferred Population and shaped by Data Source, as indicated in the legend on the right.
The third view is the same as the second, except that it uses eigenvectors 3 and 4.
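This page does not state how the Inferred Population is derived from the first 5 eigenvectors. One common approach, shown here purely as an assumption, is nearest-centroid assignment: average the eigenvector values of each reference population, then give each study sample the label of the closest centroid. The function names and the toy eigenvector values below are hypothetical; AFR and EUR are 1000 Genomes continental population codes.

```python
import math
from collections import defaultdict

N_PCS = 5  # the text says the inference uses the first 5 eigenvectors


def population_centroids(reference):
    """Average the first N_PCS eigenvector values within each reference population.

    `reference` is a list of (population_label, eigenvector_values) pairs.
    """
    grouped = defaultdict(list)
    for label, pcs in reference:
        grouped[label].append(pcs[:N_PCS])
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def infer_population(sample_pcs, centroids):
    """Return the label whose centroid is nearest in Euclidean distance."""
    return min(
        centroids,
        key=lambda label: math.dist(sample_pcs[:N_PCS], centroids[label]),
    )


# Toy reference samples (made-up values, not real 1000 Genomes eigenvectors).
reference = [
    ("AFR", [0.10, 0.02, 0.00, 0.01, 0.00]),
    ("AFR", [0.12, 0.01, 0.01, 0.00, 0.00]),
    ("EUR", [-0.05, 0.08, 0.00, 0.00, 0.01]),
    ("EUR", [-0.04, 0.09, 0.01, 0.00, 0.00]),
]
centroids = population_centroids(reference)
print(infer_population([0.11, 0.02, 0.00, 0.00, 0.00], centroids))
```

With these toy values the study sample lies closest to the AFR centroid. A production method might instead use a trained classifier or distance thresholds that leave ambiguous samples unassigned.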
PCs for use as covariates in association analyses
The first 10 PCs are also calculated separately from the sample genotypes alone (without the 1000 Genomes samples) to give results appropriate for use as covariates in genetic association analyses that adjust for population structure. These are reported in a new Sample Set, which you can access by clicking Manage | Samples | Manage Sample Sets.
Then select Sample set_PCA under the Principal Component Analysis tag to see the table of PCs, with the InferredPopulation column joined from the 1000 Genomes-based analysis shown in the Result Set above.
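The join described above can be pictured as a lookup by sample identifier. The sketch below is a minimal illustration, not Array Suite's implementation: the SampleID and PC1/PC2 column names and the sample values are hypothetical, and only InferredPopulation is a column name taken from this page.

```python
def join_inferred_population(pc_rows, inferred_by_sample):
    """Attach InferredPopulation to each row of the study-only PC table.

    `pc_rows` is a list of dicts keyed by column name (hypothetical layout);
    `inferred_by_sample` maps a sample ID to its inferred population label.
    Samples without a label receive None rather than being dropped.
    """
    joined = []
    for row in pc_rows:
        merged = dict(row)  # copy so the input table is left unchanged
        merged["InferredPopulation"] = inferred_by_sample.get(row["SampleID"])
        joined.append(merged)
    return joined


# Toy inputs (made-up sample IDs and PC values).
study_pcs = [
    {"SampleID": "S1", "PC1": 0.031, "PC2": -0.012},
    {"SampleID": "S2", "PC1": -0.044, "PC2": 0.027},
]
inferred = {"S1": "EUR", "S2": "AFR"}

for row in join_inferred_population(study_pcs, inferred):
    print(row)
```

Keeping the study-only PCs and the 1000 Genomes-based label in one table lets an association analysis use the PCs as covariates and, if desired, stratify by InferredPopulation.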