wepy is a pure-Python framework and library I wrote to support my
PhD thesis work on weighted ensemble simulations of drug binding,
using OpenMM as the simulation engine.
The main feature of wepy is support for quickly prototyping new
resampling algorithms that are substantially more flexible and
complex than other libraries allow for.
It also supports a general-purpose, random-access, single-file database format built on HDF5. This drastically simplifies the organization of simulation data and makes it cross-platform, avoiding bugs such as those arising from differences in lexical sorting of file names between operating systems.
wepy simulations are assembled and configured in Python, avoiding
the complexities of dealing with various static configuration files
(which are really only necessary for supporting untrusted users).
wepy is highly customizable while still isolating each component,
making it very simple to extend only the things you need.
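The kind of resampler wepy makes easy to prototype can be sketched generically. The following is a minimal split/merge pass over weighted walkers; it is a hypothetical illustration of the weighted ensemble idea in plain Python, not wepy's actual resampler API (the `Walker` class, thresholds, and `resample` function are my own names):

```python
import random
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Walker:
    state: float   # stand-in for a full simulation state
    weight: float

def resample(walkers, w_max=0.5, w_min=0.05):
    """One hypothetical split/merge pass: clone heavy walkers,
    merge light ones, keeping the total weight constant."""
    out = []
    light = []
    for w in walkers:
        if w.weight > w_max:
            # split: two clones, each carrying half the weight
            half = w.weight / 2
            out += [replace(w, weight=half), replace(w, weight=half)]
        elif w.weight < w_min:
            light.append(w)
        else:
            out.append(w)
    # merge light walkers pairwise; the survivor is chosen with
    # probability proportional to its weight
    while len(light) >= 2:
        a, b = light.pop(), light.pop()
        total = a.weight + b.weight
        keep = a if random.random() < a.weight / total else b
        out.append(replace(keep, weight=total))
    out += light
    return out

walkers = [Walker(0.0, 0.8), Walker(1.0, 0.02), Walker(2.0, 0.03), Walker(3.0, 0.15)]
resampled = resample(walkers)
# total weight is conserved by construction
print(round(sum(w.weight for w in resampled), 10))  # 1.0
```

The point is that the whole algorithm is an ordinary function over plain objects, which is what makes swapping in more complex resampling schemes straightforward.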
This is a library for profiling arbitrary inter-molecular interactions in molecular systems.
It ships with a library of common functional groups for automatic
detection and profiling, but also allows you to define your own
functional group definitions; the built-in library is extensible as well.
Results come as pure-Python objects as well as pandas tables, which can then be exported to any format or database.
geomm is a python library that provides pure-function
implementations for common computations in biophysics.
It is mainly a response to the fact that most libraries in the field of biophysics have mutually incompatible in-memory object representations, requiring conversions between all of them whenever you compose them.
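In that spirit, here is what pure-function geometry helpers look like: functions that take plain coordinate data and return new values without mutating their inputs. This is illustrative stdlib code in the style of the library, not geomm's actual API:

```python
import math

def centroid(coords):
    """Geometric center of a list of (x, y, z) points; a pure function."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def recenter(coords, origin=(0.0, 0.0, 0.0)):
    """Return new coordinates translated so the centroid sits at `origin`."""
    cx, cy, cz = centroid(coords)
    ox, oy, oz = origin
    return [(x - cx + ox, y - cy + oy, z - cz + oz) for x, y, z in coords]

def distance(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

points = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(centroid(points))            # (1.0, 0.0, 0.0)
print(centroid(recenter(points)))  # (0.0, 0.0, 0.0)
```

Because everything is plain tuples and lists in, plain tuples and lists out, these compose freely regardless of which simulation package produced the coordinates.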
General Purpose Utilities
- Simple command line interface to start a dask scheduling server.
- Command line interface to update a local bibtex file from bibsonomy.org for paper and book citations.
- CLI to turn a multi-layer Inkscape SVG file into a multi-page PDF. I use this for making slide decks.
- Python library with nice data structures for representing,
parsing, and writing fstab files. The idea is to specify
fstab information in a nicer, more forgiving format.
- Python library for generating `rsync` command strings from Python dataclasses. Used in refugue.
- Generates submission scripts for the SLURM job scheduler from plain shell scripts, with configurable options.
- Simple utility to quickly benchmark the speed of various python serialization tools on files of your choice.
- Generate org-mode TODO hierarchies from python modules. I used this to generate a TODO list for writing/auditing docstrings for my large projects.
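As an example of the dataclass-driven approach several of these utilities take, here is a hypothetical sketch of rendering an fstab record from a dataclass. The class and field names here are my own invention for illustration, not the actual library's API:

```python
from dataclasses import dataclass

@dataclass
class FstabEntry:
    """One fstab record; fields mirror the fstab column order.
    (Hypothetical sketch, not the actual library's API.)"""
    device: str
    mount_point: str
    fs_type: str
    options: str = "defaults"
    dump: int = 0
    fsck_pass: int = 0

    def render(self) -> str:
        # fstab columns are whitespace-separated
        return (f"{self.device}\t{self.mount_point}\t{self.fs_type}\t"
                f"{self.options}\t{self.dump}\t{self.fsck_pass}")

entry = FstabEntry("UUID=abcd-1234", "/data", "ext4", options="noatime")
print(entry.render())
```

The win over editing fstab by hand is that defaults, validation, and serialization live in one typed structure instead of positional whitespace columns.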
Configuration for users in Unix-like environments (like Linux) is a major pain point for beginners and advanced users alike.
I have been slowly developing a set of tools to regain some sanity and add some essential features.
They are (in order of maturity):
- A layer of indirection over shell configuration files (like `.bashrc`) that is semantically meaningful and allows for componentization and several distinct user profiles.
- `$HOME` is a warzone. A toolset and directory schema for your "dotfiles" and for bootstrapping user configurations onto new computers and environments.
- Tool and flows for managing credentials and secrets across different computers and environments.
- Tools, schemas, & flows for managing files and projects across different domains (e.g. work, personal, hobby groups).
refugue is a tool for managing data synchronizations between a
personal network of computers and drives.
It allows you to perform synchronizations from any computer (or, more precisely, any replica) by using meaningful pet names instead of network addresses.
Synchronizations are specified using a small vocabulary of
well-documented behaviors that are then "compiled" to the underlying
tool used to perform transfers (i.e. `rsync`).
It also simplifies and unifies the process of defining the working sets that should be present on different machines: for instance, having different sets of files on your laptop vs. your servers.
Here is an example:
refugue --sync='' computerA/tree computerB/backup
Here computerA/tree and computerB/backup are file subtrees on a
specific host or disk drive.
Working sets for each are defined in a local, versionable configuration file, and the command need not be run on either of the two computers involved (as long as they are reachable via ssh).
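To make the "compiled to an underlying tool" idea concrete, here is a toy version of mapping pet names and a small sync vocabulary onto an `rsync` command string. The names, paths, and `SyncSpec` class are hypothetical illustrations, not refugue's actual configuration or code:

```python
from dataclasses import dataclass

# Hypothetical pet-name registry mapping replica names to ssh locations.
REPLICAS = {
    "computerA/tree": "alice@computer-a:/home/alice/tree/",
    "computerB/backup": "bob@computer-b:/mnt/backup/tree/",
}

@dataclass
class SyncSpec:
    """A tiny sync vocabulary 'compiled' to an rsync command string."""
    src: str
    dest: str
    delete: bool = False   # mirror deletions on the destination
    dry_run: bool = True   # default to a safe preview

    def compile(self) -> str:
        flags = ["-av"]
        if self.delete:
            flags.append("--delete")
        if self.dry_run:
            flags.append("--dry-run")
        return " ".join(["rsync", *flags, REPLICAS[self.src], REPLICAS[self.dest]])

# prints: rsync -av --dry-run alice@computer-a:/home/alice/tree/ bob@computer-b:/mnt/backup/tree/
print(SyncSpec("computerA/tree", "computerB/backup").compile())
```

The user only ever deals with the pet names and the small vocabulary; the gnarly tool-specific flags and network addresses are generated.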
jubeo: Meta-Project Protocol
The name is stolen from object-based systems like Smalltalk and Common Lisp's Meta-Object Protocol, which is a way to update "living" code objects.
This is a tool for updating and maintaining tooling for different types of projects (software dev, analytics, website design, etc.).
The overarching goal is to regain some of the original unix-philosophy of writing small tools that do one thing, and work well together. I.e. developing polyrepos (as opposed to monorepos).
The problem is that in modern dev environments there are so many things to set up and manage:
- building documentation
- running regression tests
- code formatting
- type checking
- managing virtual environments
All of this can get tedious very quickly if you have more than a few projects to manage.
Historically, this was done through makefiles, a practice almost forgotten by Python devs. As a result, a dizzying plethora of repository-management tools have sprung up that try to do all of this in one package.
jubeo allows you to configure simple tools in one place (a
repository and component modules) and then distribute them (through
simple file copying) to many different projects, while letting you
name tasks semantically rather than after specific tools (e.g.
`build` instead of `python setup.py sdist bdist_wheel`).
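The semantic-naming idea can be sketched as a small dispatch table mapping stable task names to the tool-specific commands behind them. This is a hypothetical illustration of the concept, not jubeo's implementation (the `TASKS` table and `run_task` helper are invented for this example):

```python
# Semantic task names mapped to the tool-specific commands they stand for.
# The right-hand side can be swapped per project without the names changing.
TASKS = {
    "build": "python -m build",
    "test": "pytest -x",
    "docs": "sphinx-build docs docs/_build",
    "format": "black src tests",
}

def run_task(name: str, dry_run: bool = True) -> str:
    """Look up a semantic task and return (or actually run) its command."""
    cmd = TASKS[name]
    if not dry_run:
        import subprocess
        subprocess.run(cmd, shell=True, check=True)
    return cmd

print(run_task("build"))  # python -m build
```

When the ecosystem moves on (say, from setup.py to a PEP 517 builder), only the table changes; every project still just runs `build`.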
Furthermore, once tools are copied they belong to the code base and
are versioned along with it. You aren't adding a dependency on jubeo
to get this stuff. All jubeo does is make it simple to update
or fix tooling (such as `publish`) that is the same across many
different projects.
This makes it much lower friction to just make a new tool (i.e. a different package to `pip install`) rather than adding a feature to an existing CLI you are familiar with since you won't have to manually perform all the boring stuff maintainers do.
To get started on a new Python package in an existing project, you would run something similar to:

jubeo init --upstream=git+https://github.com/salotz/jubeo.git#repos/python .
pip install -r .jubeo/requirements.txt
Then you should be able to see all the tasks that are available to you:
inv -l
inv py.build
Then just commit them like you would any other helper script.
When you want to update your tools just run:
jubeo update .
git commit -m "updated jubeo tools"
If you don't like the new changes, just roll back that commit! No more figuring out dependency hell for your tooling. Just fix the problem and get back to work.
It also allows you to add custom tasks and targets for your project,
which will always be necessary. Just write new `invoke` task files in
the tasks/plugins folder and add them to the task list. Support for
`doit` is a work in progress, to give a uniform command-line
interface across all tools.
Since a specific directory layout is usually expected (hey, we can't fix everything!), it is supported by a collection of cookiecutters for bootstrapping projects as well:
- For generating new cookiecutter projects.
- For generating new jubeo repos.
- For generating python projects in the way I think is best.
- For generating data science projects with pipelines, data management, modules, packaging, and deployment.