A novel and fast approach for population structure inference using kernel-PCA and optimization (PSIKO)

Popescu, Andrei-Alin, Harper, Andrea, Trick, Martin, Bancroft, Ian and Huber, Katharina T. (2014) A novel and fast approach for population structure inference using kernel-PCA and optimization (PSIKO). Genetics, 198 (4). pp. 1421-31. ISSN 1943-2631

PDF (pca-2014-10-20) - Draft Version

Abstract

Population structure is a confounding factor in genome-wide association studies, increasing the rate of false-positive associations. To correct for it, several model-based algorithms such as ADMIXTURE and STRUCTURE have been proposed. These carry a considerable computational burden, which limits their applicability to large datasets such as those produced by next-generation sequencing (NGS) techniques. To address this, non-model-based approaches such as SNMF and EIGENSTRAT have been proposed, which scale better with larger data. Here we present a novel non-model-based approach, PSIKO, which is based on a unique combination of linear kernel-PCA and least-squares optimization and allows for the inference of admixture coefficients, principal components, and the number of founder populations of a dataset. PSIKO has been compared against existing leading methods on a variety of simulation scenarios, as well as on real biological data. We found that, in addition to producing results of the same quality as other tested methods, PSIKO scales extremely well with dataset size, being considerably (up to 30 times) faster for longer sequences than even state-of-the-art methods such as SNMF. PSIKO and the accompanying manual are freely available at https://www.uea.ac.uk/computing/psiko.
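
The abstract names two ingredients, linear kernel-PCA and least-squares optimization. As a rough illustration of how such a combination can work in principle, the following Python sketch projects a toy genotype matrix onto its leading principal components through the linear kernel (Gram) matrix, then recovers admixture-like coefficients by solving a least-squares system with a sum-to-one constraint. This is not PSIKO's implementation: the function names, the toy data, and the choice to use the rows of known "pure" individuals as founder vertices are all assumptions made for illustration.

```python
import numpy as np


def linear_kernel_pca(G, n_components):
    """PCA scores computed from the linear kernel (Gram) matrix of G.

    G is an (individuals x markers) genotype matrix. Centering the columns
    first makes X @ X.T the centered linear kernel.
    """
    X = G - G.mean(axis=0)
    K = X @ X.T                                  # n x n linear kernel
    vals, vecs = np.linalg.eigh(K)               # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_components]
    vals = np.clip(vals[order], 0.0, None)       # guard against tiny negatives
    return vecs[:, order] * np.sqrt(vals)        # n x n_components scores


def admixture_from_vertices(scores, vertices):
    """Least-squares mixing coefficients of each individual w.r.t. the vertices.

    Solves [V^T; 1^T] q = [y; 1] for every individual y, then clips negative
    values and renormalizes so each row sums to one.
    """
    k = vertices.shape[0]
    A = np.vstack([vertices.T, np.ones((1, k))])
    B = np.vstack([scores.T, np.ones((1, scores.shape[0]))])
    Q, *_ = np.linalg.lstsq(A, B, rcond=None)    # k x n solution
    Q = np.clip(Q.T, 0.0, None)
    return Q / Q.sum(axis=1, keepdims=True)


# Toy data (hypothetical): three "pure" individuals and one 50/50 mix of the
# first two, with genotypes coded as allele counts.
G = np.array([
    [0, 0, 0, 0, 2, 2],   # founder A
    [2, 2, 0, 0, 0, 0],   # founder B
    [0, 0, 2, 2, 0, 0],   # founder C
    [1, 1, 0, 0, 1, 1],   # equal mix of A and B
], dtype=float)

scores = linear_kernel_pca(G, n_components=2)    # K - 1 components for K = 3
Q = admixture_from_vertices(scores, vertices=scores[:3])
print(np.round(Q, 3))
```

Because the projection is linear, an individual whose genotype is an average of two founders lands midway between them in principal-component space, which is why the constrained least-squares step can read off its mixing proportions. In real data the founder vertices are not known in advance and would have to be estimated.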

Item Type: Article
Additional Information: Open access
Uncontrolled Keywords: q-matrix,genome-wide association studies,admixture inference,kernel-pca,population structure
Faculty \ School: Faculty of Science; Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Computational Biology > Phylogenetics (former - to 2018); Faculty of Science > Research Groups > Computational Biology
Depositing User: Pure Connector
Date Deposited: 04 Nov 2014 12:22
Last Modified: 13 Jun 2023 08:26
URI: https://ueaeprints.uea.ac.uk/id/eprint/50646
DOI: 10.1534/genetics.114.171314
