University of California, Los Angeles
- • Research Internship under Prof. Sriram Sankararaman and Prof. Eran Halperin
- • Worked on a scalable and flexible probabilistic model for PCA on large-scale genetic variation data.
- • Derived an iterative EM algorithm that computes K principal components on N individuals and M SNPs per iteration.
- • Extended the EM algorithm to accurately compute PCs in the presence of missing genotypes while retaining its computational efficiency.
- • Accelerated the EM algorithm using Globally Convergent Squared Iterative Methods (SQUAREM) to reduce the number of iterations required for convergence.
- • Benchmarked accuracy and run times on simulated datasets, achieving a speedup of about 4 times over FastPCA, a scalable randomised algorithm for PCA (Galinsky et al. 2016).