Analytics Portfolio

PhD Related

Thesis Project: Drug Unbinding Transition-State Plasticity

My main thesis project investigates the plasticity of the transition states that ligands pass through when unbinding from receptor binding sites.

This is of interest to drug designers who want to tailor the kinetic properties of drugs. Such kinetics-oriented design can improve drug efficacy and bring practical benefits, such as decreasing the frequency of dosing and avoiding side effects by improving windows of efficacy.

The project pipeline has multiple parts.

First, we perform molecular dynamics (MD) simulations of drug molecules unbinding from their protein targets using supercomputer clusters of graphics processing units (GPUs).

Because the unbinding process is slow relative to the timescale of the MD simulations, novel enhanced sampling algorithms must be used to speed up the simulations without disturbing the physical correctness of the structures we observe.

Primarily, this is the WExplore algorithm, which was implemented in my weighted ensemble simulation library wepy.

These methods are similar to Markov chain Monte Carlo (MCMC), particle methods, and other importance sampling techniques, and are applicable outside of biophysics for estimating the probabilities of rare events.
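
As a rough illustration of the resampling idea shared by weighted ensemble and particle methods, here is a minimal toy sketch in Python. It is not the WExplore algorithm and not the wepy API: the names clone, merge, and resample, the scalar progress function, and the example walkers are all hypothetical and chosen only for illustration. The point it shows is that walkers can be cloned and merged while the total probability weight stays constant.

```python
import random

# Toy sketch only: a walker is a (state, weight) tuple. This is an
# assumption for illustration, not how wepy represents walkers.

def clone(walker):
    """Split one walker into two copies, each carrying half the weight."""
    state, weight = walker
    return [(state, weight / 2), (state, weight / 2)]

def merge(walker_a, walker_b, rng):
    """Merge two walkers into one that carries their summed weight.

    The surviving state is picked with probability proportional to weight.
    """
    (state_a, w_a), (state_b, w_b) = walker_a, walker_b
    total = w_a + w_b
    keep = state_a if rng.random() < w_a / total else state_b
    return (keep, total)

def resample(walkers, progress, rng):
    """One toy resampling step that keeps the number of walkers fixed.

    Merges the two walkers with the least progress and clones the walker
    with the most progress. Requires at least three walkers.
    """
    ranked = sorted(walkers, key=lambda w: progress(w[0]))
    merged = merge(ranked[0], ranked[1], rng)
    cloned = clone(ranked[-1])
    return [merged] + ranked[2:-1] + cloned

# Hypothetical example: the state is a single number standing in for
# how far a ligand has moved from its binding site.
rng = random.Random(0)
walkers = [(0.1, 0.25), (0.5, 0.25), (0.9, 0.25), (1.4, 0.25)]
new_walkers = resample(walkers, progress=lambda x: x, rng=rng)
total_weight = sum(weight for _, weight in new_walkers)
```

Because cloning halves a weight and merging sums two weights, total_weight is unchanged by the step, which is what lets the weights be read as probabilities of the rare event.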

Second, a number of machine learning and custom data mining methods are applied to extract meaning from the "raw" simulation data in a distributed fashion.

This includes geometric constraint queries using the mastic and geomm libraries, as well as custom analyses with numpy and scipy.
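
To give a sense of what a geometric constraint query can look like, here is a minimal sketch of a distance-cutoff contact search. It does not use the mastic or geomm APIs. The function name contacts_within, the coordinates, and the 3.5 cutoff (assumed here to be in angstroms) are hypothetical.

```python
import math

def contacts_within(ligand_coords, protein_coords, cutoff):
    """Return (ligand_index, protein_index, distance) for every pair of
    atoms that are closer together than `cutoff`."""
    hits = []
    for i, lig in enumerate(ligand_coords):
        for j, prot in enumerate(protein_coords):
            d = math.dist(lig, prot)  # Euclidean distance, Python 3.8+
            if d < cutoff:
                hits.append((i, j, d))
    return hits

# Made-up coordinates, for illustration only.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein = [(0.0, 0.0, 3.0), (1.5, 4.0, 0.0), (10.0, 10.0, 10.0)]
hits = contacts_within(ligand, protein, cutoff=3.5)
pairs = [(i, j) for i, j, _ in hits]
```

A brute-force double loop like this is only practical for small selections. For full trajectories, a vectorized or spatially indexed search would be the more likely choice.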

I also perform machine learning analyses using scikit-learn for clustering and PCA, as well as a number of libraries specific to the field for building Markov State Models and calculating stochastic network properties via Transition Path Theory.
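
The core idea of a Markov State Model can be sketched in a few lines: once each simulation frame has been assigned to a cluster, transitions between clusters are counted at a chosen lag time and each row of counts is normalized into probabilities. The sketch below is a naive estimator in plain Python, not the method of any particular library. The function names and the example trajectory are hypothetical, and field-specific libraries typically provide more careful estimators.

```python
def count_transitions(dtraj, n_states, lag=1):
    """Count transitions i -> j between frames separated by `lag` steps.

    `dtraj` is a sequence of integer cluster labels; `lag` must be >= 1.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a][b] += 1
    return counts

def row_normalize(counts):
    """Turn each row of counts into transition probabilities.

    Rows with no observed transitions are left as all zeros.
    """
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Made-up cluster labels for ten consecutive frames.
dtraj = [0, 0, 1, 1, 0, 2, 2, 2, 1, 0]
counts = count_transitions(dtraj, n_states=3)
transition_matrix = row_normalize(counts)
```

Each row of transition_matrix sums to one and gives the estimated probability of moving from that state to every other state after one lag time.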

I have leveraged the distributed computing framework dask to scale the analysis of terabytes of simulation data.

Some results of this research have been published, and others are still in progress.