Presentation at Silicon Therapeutics

I was lucky enough to present my work this week at Silicon Therapeutics. Thanks for having me!

I've made the slides available in this repository on gitlab at this location as a PDF.

The presentation roughly covers these topics:

  • Kinetics-oriented drug design
  • Description of the Weighted Ensemble (WE) class of enhanced sampling algorithms
  • Motivation and design of the wepy framework for performing and analyzing WE simulations as well as prototyping new methods.
  • Results from my PhD research into the ligand unbinding transition state structure for the clinically relevant target soluble epoxide hydrolase.

Comments

Comments are implemented as a mailing list hosted on sourcehut: https://lists.sr.ht/~salotz/salotz.info_comments-section

Comments are for the whole website so start your subject line with: [sitx_wepy_talk] for this post.

Cron alternative with runit and snooze

I have tried many times to use cron effectively, but ultimately this ends in failure each time. Let's look at why I find cron onerous and a solution I came up with 1.

Skip to this section if you just want to see the solution.

Non-Issues

Conceptually specifying when to run things with a special syntax is not a bad idea and not why I find cron hard to use.

Just as a reminder, this is what a cron job looks like. I found this one on my system for the dma mail agent:

*/5 *	* * *	root	[ -x /usr/sbin/dma ] && /usr/sbin/dma -q1

The meaning and examples for the timing specs are well known and not really an issue here. There are lots of alternatives that try to improve on this; this is not one of them (although it is different in this regard).
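For anyone who wants the fields spelled out anyway, here is a minimal sketch that splits the dma line above into its parts using only the shell. It assumes the system crontab layout (five time fields, then a user, then the command), and the variable names are just labels chosen for illustration.

```shell
#!/bin/sh
# Split a system-crontab line into its fields. Assumes the /etc/crontab
# layout: minute hour day-of-month month day-of-week user command.
line='*/5 * * * * root [ -x /usr/sbin/dma ] && /usr/sbin/dma -q1'

# read splits on whitespace; the last variable receives the rest of the line
read -r minute hour dom month dow user command <<EOF
$line
EOF

echo "minute:       $minute"
echo "hour:         $hour"
echo "day of month: $dom"
echo "month:        $month"
echo "day of week:  $dow"
echo "run as user:  $user"
echo "command:      $command"
```

So this entry fires every five minutes, as root, and only runs dma if the binary is present and executable.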

Sys-Admin Orientation

My first issue with cron is that it is inseparable (or at least practically so) from the base *nix system. That means you can't run it as a user as it is meant to be the scheduled thing runner for a multi-user system. So while you can have the system run things on behalf of you as a user you can't really control it.

In this day and age, multi-user systems are becoming pretty rare outside HPCC systems in academia and gov't labs. Most people want to run simple recurring tasks on their desktop, laptop, VPS, or maybe even a homelab server. In all these cases users are usually simply mechanisms for implementing access control and not distinct human entities.

This kind of sysadmin oriented baggage is pretty pervasive in *nix systems and cron is not alone here.

Because cron is so deeply embedded into the lizard brain of *nix it is necessary to have a properly running and configured cron for the stability of your system.

As a reasonably competent user of linux desktops, the thought of inadvertently messing up /etc/crontab and its ilk is enough to keep me away from it.

For instance where exactly am I supposed to put a cron job? On Ubuntu 20.04 (KDE Neon actually) I see this in my /etc dir:

--> ls /etc/ | grep cron
anacrontab
cron.d
cron.daily
cron.hourly
cron.monthly
crontab
cron.weekly

Additionally this is highly variable distro to distro.

For instance here is part of the man page for cron on my system:

DEBIAN SPECIFIC
       Debian introduces some changes to cron that were not originally available  upstream.   The  most
       significant changes introduced are:

       —      Support for /etc/cron.{hourly,daily,weekly,monthly} via /etc/crontab,

       —      Support for /etc/cron.d (drop-in dir for package crontabs),

       —      PAM support,

       —      SELinux support,

       —      auditlog support,

       —      DST and other time-related changes/fixes,

       —      SGID crontab(1) instead of SUID root,

       —      Debian-specific file locations and commands,

       —      Debian-specific configuration (/etc/default/cron),

       —      numerous other smaller features and fixes.

       Support  for  /etc/cron.hourly,  /etc/cron.daily, /etc/cron.weekly and /etc/cron.monthly is pro‐
       vided in Debian through the default setting of the /etc/crontab file (see the system-wide  exam‐
       ple  in crontab(5)).  The default system-wide crontab contains four tasks: run every hour, every
       day, every week and every month.  Each of these tasks will execute run-parts providing each  one
       of the directories as an argument.  These tasks are disabled if anacron is installed (except for
       the hourly task) to prevent conflicts between both daemons.

This goes on for several more paragraphs of neckbeard-speak that I don't pretend to understand. Am I running anacron? Why? What is run-parts? It's obvious a lot of this is for security purposes that are important in enterprise environments that a professional sysadmin is paid a hefty salary to comprehend. So there is probably a simple answer to this mess, except I don't have direct access to the people who made this mess to ask them.

This is a scenario where reading the manual leaves me more confused than I started. I'll probably get chastised for overcomplicating things. In pre-emptive response I will leave this quote from James Clear's mailing list I got today:

To simplify before you understand the details is ignorance.

To simplify after you understand the details is genius.

I'm in the former category.

I would like to schedule backups to happen every night, but I would rather not upset the fragile balance distro maintainers have made to keep my desktop running "smoothly".

Cron is simply not meant for a simple user like me.

Control & Supervision

The second major problem I had with cron was the real lack of control you have over it. You write text into the file and cron goes off and does its thing, just like Ritchie and Thompson intended.

Everyone with a modicum of understanding of why databases exist sees the obvious issues with inconsistency that this has. So no longer can you just edit /etc/crontab; rather, you are supposed to go through helper tools (like crontab) which do all manner of locking and gatekeeping to desperately pretend it's a database.

Using tools that automatically open up editors is another pet peeve of mine, since you are now bringing in a lot of other assumptions about the configuration of your system. I hope you know how to safely close files in nano and vi! Further, I like to keep all my configurations in a central configuration directory where I can use normal version control etc. How am I supposed to "load" these into the cron jobs without manually copy-pasting, at least if you're going by the "manual"? Aren't these systems supposed to be scriptable? 2

As we'll get to in the actual solution I'm presenting here, the loading, unloading, restarting, pausing etc. of jobs is eerily similar to the feature sets of PID 1 programs (init systems) like systemd.

Indeed systemd has a similar timer system, which has much better control, but again suffers from being sysadmin oriented. Also writing unit files blows. I do it, but I try not to.

Back to cron though, the only solution to this I have found is to install a program called cronitor, which is like a freemium tool that really isn't that scriptable either due to the (very nice) terminal UI. This is useful, but I don't really see this as something I can expect to have going into the future on all my systems.

Observability & Logging Utilities

I was able to get through all the issues above through sheer force of will, but ultimately I was done in by the necessary interaction of cron with two even more obtuse systems in *nix: email and logging.

Email is sort of baked into cron and from what I can tell is the main way of getting notifications that running things had issues.

For instance if you're trying to run a script and you got the path wrong, you'll need a working email server for cron to send a message. And if you give a mouse a cookie…

Then you'll need to make sure your mail boxes for the users on your system are working, you know where they are and you have a client for reading them.

This seems like a huge dependency to have to just get a message about a job having an error in it?

Shouldn't there just be a log file I can pop open in a text editor (like Ritchie and Thompson intended goddamit)?

For me and I'm sure many others, I've never had a need to interact with intra-system email. Even on the HPCC systems I worked in for my PhD this is basically an unused feature.

When you do get jobs running without error (and thus no reason to email you), you'll want to see their logs too.

In cron, it's completely up to you to create and manage the lifecycle of logs in your system. While in theory this is a good thing since it allows you to not be locked into something you hate, in practice for non-neckbeards it involves you having detailed knowledge of yet another complex subsystem.

I simply don't know where to even start here. The cron docs say to use rsyslogd, which I don't know how to use. Furthermore, my system is using systemd, which has nice commands that show you the latest in the logs. Is this subsystem disjoint? It wouldn't surprise me if they jump through different calcified hoops to keep 40 year old things running.

Again there is probably a simple answer to all this, but it's one that reading the man page can't get you. I don't log a lot of stuff, but mistakes happen and logs can fill up a / partition scarily easily.

Runit tutorial

I've contemplated and even tried a few alternatives to cron. This one was discovered while throwing stones at other birds. I was pleasantly surprised.

Basically it boils down to using a process supervisor called runit and a fancy big brother to the unix sleep command called snooze.

Runit was originally meant to be a cross-platform replacement for older init systems like sysvinit, one that is very simple and doesn't require many libraries. This was to suit it as part of the core system that bootstraps everything else (think cron and sshd etc.).

While these are nice characteristics, they aren't killer features for us here. However, nowadays it's pretty popular with the anti-systemd crowd. It's an option for init systems in Gentoo and others and is the default in Void linux. Besides its simplicity, it is pretty easy to get up and configured: you just run a few shell scripts to get everything going.

I have a few complaints:

  1. the documentation is kind of nonlinear and doesn't give you a walkthrough of how to use the whole system 3.
  2. commands are disjointed and spread between a number of executables and use of standard unix commands like ln and rm.

The second point is actually a feature where each of the little components can be used standalone. However, this makes it a little more confusing to wrap your head around and I found myself constantly reviewing my notes to know which command to use.

I solved this with a few shell functions, but I would like to see a wrapper CLI to make it a little centralized conceptually (and to add a few convenience features) for those that would want it.

Reading these articles also helped in understanding it:

Even after reading these I had to muck around and figure a bunch of little details out so I thought I would throw my own little tutorial on the pile to hopefully save some people's time and make runit a little more approachable.

Luckily on Ubuntu 20.04 it's really easy to get runit installed and running as a systemd service. Just install using apt:

sudo apt install runit

This even starts the service:

sudo systemctl status runit
sudo journalctl -u runit

Normally there are 3 stages (i.e. states) runit has:

  1. Single run tasks on startup
  2. Process supervision: starting, stopping, restarting services
  3. Shutting services down as the system goes down

Because we aren't using runit as a PID 1 init system, we only care about stages 2 & 3. The apt installation takes care of this for us thankfully.

So you should see the following directories appear:

/etc/runit
stages and runlevel stuff, ignore this (for now).
/etc/sv
The services directory, this is where you author things.
/etc/service
This is where "enabled" services are put.

I'll refer to these directories by the environment variables I use for them. I put this in my ~/.profile:

## runit known env variables

# the active service directory to query for state; this is already the
# default, but I like to set it explicitly so it's easier for me to disable services
export SVDIR="/etc/service"

# SNIPPET: the normal wait time
# export SVWAIT=7

## my vars, not recognized by any runit tools

# this is the standard directory of where services are put for the
# system. The SerVice LIBrary
export SVLIB="/etc/sv"

# log dir
export SVLOG_SYS_DIR="/var/local/log/runit"

To define a service you make a directory in SVLIB with some specially named shell scripts. Mine has these directories in it:

--> ls $SVLIB
backup  hello  printer_live  recollindex  test_env

Each one is a specific service. Lets first look at hello to get a simple picture of what these are:

--> ls $SVLIB/hello
finish  log  run  supervise

The most important one is run which is a shell script:

#!/bin/sh

# run the service
while :
do
    echo "Hello"
    sleep 2
done

This just prints "Hello" to stdout and then waits 2 seconds.

This service isn't being run yet. For that you need to put it into the SVDIR:

sudo ln -s -f "$SVLIB/hello" "$SVDIR/"

You can check the status of the service with the sv command:

--> sudo sv status $SVDIR/hello
run: /etc/service/hello: (pid 664634) 193s

You can check the status of all services similarly:

sudo sv status $SVDIR/*

If you see this:

--> sudo sv status $SVDIR/hello
down: /etc/service/hello: 1s, normally up, want up

There is something wrong with your run script.

Looking at sudo systemctl status runit and sudo journalctl -u runit could usually help me figure the issue out (no email!!!).

Once it's working you should see the "Hello"s in the log for runit if you aren't logging this service:

--> sudo journalctl -u runit | tail
Oct 23 16:49:30 ostrich 2[664634]: Hello
Oct 23 16:49:32 ostrich 2[664634]: Hello
Oct 23 16:49:34 ostrich 2[664634]: Hello
Oct 23 16:49:36 ostrich 2[664634]: Hello
Oct 23 16:49:38 ostrich 2[664634]: Hello
Oct 23 16:49:40 ostrich 2[664634]: Hello
Oct 23 16:49:42 ostrich 2[664634]: Hello
Oct 23 16:49:44 ostrich 2[664634]: Hello
Oct 23 16:49:46 ostrich 2[664634]: Hello
Oct 23 16:49:48 ostrich 2[664634]: Hello

Next you'll have the finish script, which is just what should be run after the run script exits:

#!/bin/sh

echo "Shutting Down"

We don't have anything to do really so we just write a message. But you could do cleanup stuff here if you want.

Last is the logging spec. This is a subdirectory called log:

--> tree $SVLIB/hello/log
/etc/sv/hello/log
├── run
└── supervise [error opening dir]

1 directory, 1 file

Where again the run is a shell script:

#!/bin/sh

exec svlogd -tt /var/local/log/runit/hello

To keep things simple this is what you want. In general you could swap out different logging daemons other than svlogd (which comes with runit), but I don't see a reason to and this Just Works™. Basically runit will create this as a separate service, but just knows how to pipe around outputs now.
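Putting the three scripts together, a service directory could be scaffolded with something like the function below. This is only a sketch: sv_scaffold is a hypothetical name, SVLIB and SVLOG_SYS_DIR are my own variables from the ~/.profile snippet above, and the generated run script is a placeholder. It also makes the scripts executable and creates the log directory up front, since as far as I can tell svlogd does not create it for you.

```shell
#!/bin/sh
# Hypothetical helper: scaffold a service directory with run, finish and
# log/run scripts. SVLIB and SVLOG_SYS_DIR are not known to runit itself.
sv_scaffold() {
    name="$1"
    svc="${SVLIB:-/etc/sv}/$name"
    logdir="${SVLOG_SYS_DIR:-/var/local/log/runit}/$name"

    # create the service tree and the directory svlogd will write into
    mkdir -p "$svc/log" "$logdir"

    # placeholder run script; replace the echo with the real command
    printf '%s\n' '#!/bin/sh' 'echo "Hello"' 'sleep 2' > "$svc/run"
    printf '%s\n' '#!/bin/sh' 'echo "Shutting Down"' > "$svc/finish"
    printf '%s\n' '#!/bin/sh' "exec svlogd -tt \"$logdir\"" > "$svc/log/run"

    # runsv executes these files directly, so they must be executable
    chmod +x "$svc/run" "$svc/finish" "$svc/log/run"
}
```

Since /etc/sv is root-owned, the function needs to run in a root shell for the real directories.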

If you add these and then reload the services:

sudo sv reload $SVDIR/hello

You'll stop seeing "Hello" in the runit system log, and start seeing it in the log file we configured:

sudo less +F "$SVLOG_SYS_DIR/hello/current"
# or
sudo tail -f "$SVLOG_SYS_DIR/hello/current"

Before we go over configuring the logging daemon (that svlogd thing we ran in log/run) I should mention all those supervise dirs that were lying around.

These basically are the locks and other control data that runit uses to manage the services. Don't mess with them. They are owned by root anyways. One thing you can do if you think you messed things up is to disable the service and remove them all to start fresh:

sudo rm -rf "$SVDIR/hello"
sudo rm -rf "$SVLIB/hello/supervise"
sudo rm -rf "$SVLIB/hello/log/supervise"
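The enable, disable and reset steps shown so far are the kind of thing that is easy to wrap in small shell functions. The sketch below is one way to do it; the function names are hypothetical and not part of runit, and SVLIB is my own variable. There is no sudo inside the functions, so they would need to be run from a root shell for the real /etc directories.

```shell
#!/bin/sh
# Hypothetical wrappers around the ln/rm commands used in this tutorial.
# SVDIR is understood by sv; SVLIB is only my own convention.
sv_enable() {
    # link the service definition into the active service directory
    ln -s -f "${SVLIB:-/etc/sv}/$1" "${SVDIR:-/etc/service}/"
}

sv_disable() {
    # removes only the symlink, not the service definition itself
    rm -f "${SVDIR:-/etc/service}/$1"
}

sv_reset() {
    # disable, then clear the control data so it gets recreated fresh
    sv_disable "$1"
    rm -rf "${SVLIB:-/etc/sv}/$1/supervise" \
           "${SVLIB:-/etc/sv}/$1/log/supervise"
}
```

With these loaded, `sv_enable hello` followed by `sv status "$SVDIR/hello"` covers the common case without having to remember which of ln, rm or sv to reach for.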

Now one last thing is to configure the log. This file doesn't go in the service directory (SVLIB) but the directory where the logs are. So make this file SVLOG_SYS_DIR/hello/config and it should have something like this:

# max size in bytes
s100000

# keep max of 10 files
n10

# minimum of 5 files
N5

# rotate every number of seconds
t86400

# prepend each log message with the characters
pHELLO::

This lets you rotate logs and control file sizes. It's really not a nice file format, but I will forgive them considering they aren't using any libraries for TOML or YAML parsing or such things. Again, something I would improve on for non-PID 1 usage.
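To see what these numbers mean for disk usage, here is a rough back-of-the-envelope calculation done in the shell. It assumes that n counts the rotated files and that the current file comes on top of that, so treat the result as an approximate ceiling rather than an exact guarantee.

```shell
#!/bin/sh
# Rough upper bound on disk use implied by the svlogd config above.
# Assumption: n counts rotated files, and "current" is one extra file.
config='s100000
n10
N5
t86400'

# pull the numeric values out of the s, n and t lines
size=$(printf '%s\n' "$config" | sed -n 's/^s\([0-9][0-9]*\)$/\1/p')
keep=$(printf '%s\n' "$config" | sed -n 's/^n\([0-9][0-9]*\)$/\1/p')
secs=$(printf '%s\n' "$config" | sed -n 's/^t\([0-9][0-9]*\)$/\1/p')

max_bytes=$(( size * (keep + 1) ))
rotate_hours=$(( secs / 3600 ))

echo "roughly $max_bytes bytes on disk at most"
echo "a rotation at least every $rotate_hours hours"
```

That works out to about 1.1 MB per service and a rotation at least once a day, which is small enough that a runaway job shouldn't be able to fill up a / partition.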

With this all in place you'll see something like this in SVLOG_SYS_DIR/hello:

--> sudo tree $SVLOG_SYS_DIR/hello
/var/local/log/runit/hello
├── @400000005f929f2c12d2041c.s
├── @400000005f92b3043a84c42c.s
├── @400000005f92c6dd22d71b04.s
├── @400000005f92dab60d0ce77c.s
├── @400000005f92ee8f37dec1dc.s
├── @400000005f93026921e78ae4.s
├── @400000005f931643127f5bf4.s
├── @400000005f932a1d27cdd3b4.s
├── @400000005f933df70f77ffc4.s
├── @400000005f933e0a239542b4.s
├── config
├── current
└── lock

Where those ID named files are the rotated logs.
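As far as I can tell those names are TAI64N timestamps: an @, then 16 hex digits of seconds, then 8 hex digits of nanoseconds. Under that assumption (and assuming the upper half of the seconds field is the usual 40000000), the low 8 hex digits of the seconds can be read as an ordinary epoch-style number. A minimal sketch:

```shell
#!/bin/sh
# Decode the seconds part of a rotated log name. Assumes the TAI64N layout:
# '@', 16 hex digits of seconds, 8 hex digits of nanoseconds.
name='@400000005f929f2c12d2041c.s'

# characters 10-17 are the low 32 bits of the seconds counter
hex=$(printf '%s' "$name" | cut -c10-17)
secs=$(( 0x$hex ))

echo "$secs"
```

The result is within a few tens of seconds of Unix time (TAI and UTC differ by leap seconds), so with GNU date something like `date -d "@$secs"` gives a good enough idea of when the file was rotated.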

Now that we're done with the runit tutorial, let's make a timer service that acts like a cron job.

Timer Services With Runit and Snooze

snooze was also available in the Ubuntu package index so we just install with apt:

sudo apt install snooze

Otherwise it's very basic C, so it should be trivial to compile and install.

First let's just write a timer script that runs a command every 40 seconds with sleep, and then we can just replace sleep with snooze. Our run script is then:

#!/bin/sh

echo "Sleeping until time to run"
sleep 40

/home/user/.local/bin/run_thing

echo "Finished with task"

Because we are running this script as a service and the job of a service manager is to restart failing services, this script will just run over and over. There really wasn't a need for the while loop in our hello example.

Now to make this "cron-like" we replace sleep with snooze:

#!/bin/sh

echo "Sleeping until time to run"
snooze

/home/user/.local/bin/run_thing

echo "Finished with task"

Where the default of snooze is just to block until midnight every night (that is hour zero). So basically it's just calculating how long it is until midnight and sleeping until then. Pretty simple, right? Once snoozing is over the command will run, the script will terminate and get restarted, and then it will snooze again until next time.
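To make the "how long until midnight" idea concrete, here is the arithmetic as a small shell function. This is not snooze's actual code, just a sketch of the calculation; the function name is made up and it ignores DST changes.

```shell
#!/bin/sh
# Not snooze's implementation, only the idea: seconds left until the next
# local midnight. Takes hour, minute, second as printed by date +%H etc.
secs_until_midnight() {
    # strip one leading zero so "08" is not read as an invalid octal number
    h=${1#0}; m=${2#0}; s=${3#0}
    echo $(( 86400 - (${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0}) ))
}

# word splitting is intended here: date prints three separate fields
secs_until_midnight $(date '+%H %M %S')
```

Replacing snooze in the run script with `sleep "$(secs_until_midnight $(date '+%H %M %S'))"` would give roughly the same default behaviour, which is a good hint at how little magic is involved.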

You can see all the options for configuring when snooze will sleep until in its docs and man page (this one is actually readable). But for instance you can set it to sleep until 3 AM on Mondays and Thursdays:

snooze -w1,4 -H3 
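Before putting a spec like this into a run script it can be checked with snooze's dry-run flag (-n), which prints the next few times it would fire instead of sleeping. The wrapper below is just a convenience sketch; preview_snooze is a hypothetical name, and the guard is only there so it fails cleanly on a machine without snooze.

```shell
#!/bin/sh
# Hypothetical helper: preview when a snooze spec would fire, without sleeping.
preview_snooze() {
    if command -v snooze >/dev/null 2>&1; then
        # -n is snooze's dry-run mode
        snooze -n "$@"
    else
        echo "snooze is not installed" >&2
        return 127
    fi
}
```

For the example above that would be `preview_snooze -w1,4 -H3`.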

Something else I like to do is have it run a number of jobs in serial as a cycle. Where you would have to have multiple cron jobs to achieve this you can do it in one script with this method.

I have two print jobs every week to keep the heads from drying out: black and white on Monday and color on Thursday, both at noon.

#!/bin/sh

echo "Snoozing until time to print"
snooze -w1 -H12

echo "Printing black and white test page"

# black and white
lp /home/salotz/testpage_bw.pdf


echo "Snoozing until next print job"
snooze -w4 -H12

echo "Printing RBG color test page"

# color: RBG
lp /home/salotz/testpage_color_rbg.pdf

echo "End of run script job cycle"

With cron you'd have something like 4:

0 12 * * 1 salotz lp /home/salotz/testpage_bw.pdf
0 12 * * 4 salotz lp /home/salotz/testpage_color_rbg.pdf

So the syntax for snooze is actually pretty similar and perhaps a little cleaner and semantic.

So just to review: By avoiding cron we've avoided some of the most difficult and finicky sysadmin tasks in linux, namely email and logging, and traded them for runit. I think it's a fair trade and as we've seen setting up runit is trivial in Ubuntu (and likely other distros).

I can definitely see now why runit has fierce supporters. I'll definitely be using it for situations where I would normally use systemd as well. While I'm not going to be running Void linux on my desktop any time soon, it's a good candidate for inside of containers.

Final Notes

As I mentioned I am not a grey-neckbeard and so if I've made any gross oversights in my claims, please let me know in the comments. I would love to like cron more.

Comments

Comments are implemented as a mailing list hosted on sourcehut: https://lists.sr.ht/~salotz/salotz.info_comments-section

Comments are for the whole website so start your subject line with: [runit_snooze] for this post.

Footnotes:

1

Well, the idea was hinted at in the snooze README but I put it together.

2

Every shell script is a scar:

master_tab="$(cat << EOF
$(cat $CONFIGS_LIB_DIR/cron/header.crontab)
$(cat $CONFIGS_LIB_DIR/cron/misc.crontab)
$(cat $CONFIGS_LIB_DIR/cron/my.crontab)
EOF
)"

# then write to the crontab file after cleaning it out
(crontab -r && crontab -l 2>/dev/null; echo "$master_tab") | crontab -

3

It only has How-To Guides and Reference according to this classification: https://documentation.divio.com/

4

No I didn't do that myself, of course I used this https://crontab.guru/.

Blog Launch

Here is my obligatory blog launch post where I tell you about how I created my blog and so that there is at least one post here ;)

This Blog

I ended up choosing Nikola mainly because:

  • It's written and extended in Python (my default language)
  • It has pretty good support for Emacs Org-Mode markup via a plugin
  • Simple while still supporting basic blog & general purpose web pages

Nikola is basically a wrapper around doit with special functions for web pages.

doit is essentially a python programmable replacement for make, and just about every experience report with static site generators is that they are just fancy makefiles. QED

Future Posts

In future posts I hope to discuss some practical issues around:

  • Tools and tips for data science projects as applications and pipelines.
  • Managing dependencies, virtualenvs, repository structure, and release management for python projects.
  • Setting up and managing personal configuration and infrastructure in a linux environment.
  • Software architecture and design.

The goal of my posts will not be tutorials around how to use specific tools but rather how they all can fit together and integrate.

Potential Collaborations

Check out my list of projects. I am looking for others to test out and critique their design.

I also maintain a personal listing of: Request For Comments (RFCs) and Proposals For Software (PFS) in this repository (web page to come).

That's it!