Hey!

Bashing Bash

I recently realized that my bash scripts tend to rot with every single line I add. The snippet below explains it well:

#!/usr/bin/env bash
# this script is to be run every 30 minutes

for f in $(find -name "*.json"); do
    lockfile="$(basename $f).lock"
    if [ -f "$lockfile" ]; then
        continue
    fi

    touch "$lockfile"
    # exposes $USERNAME and $PASSWORD
    ./json_to_envvars.py "$f"
    # backgrounding so we can do many actions in parallel
    ./do_action.sh "$USERNAME:$PASSWORD" &
    rm -f "$lockfile"
done

There are multiple pain points here.

The single biggest mistake here, however, is using the file system as the state store. I see this tendency all the time in my scripts. Imagine trying to debug this mess when the whole file system is your state machine. I understand why containers are so hot; they restrict persistent state to volumes.

I thought I was clever when I got my college degree. Yet here I am, writing scripts like these for a living. And $lockfile might not even be writable by the current process. Shit.

Also, if the parameter to ./do_action.sh is supposed to be confidential, you're screwed by lurking ps aux cowboys. Such a silly thing, imagine using command line parameters for passing secrets. When people say C has footguns, they forget to mention that those footguns are still present, in other shapes, in bash.
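One way out of that, sketched in Python since that's where this post is headed anyway: hand the secret to the child over stdin instead of argv, so it never shows up in the process list. This assumes a hypothetical do_action.sh that reads "username:password" from standard input rather than from $1, which the original script does not do.

#!/usr/bin/env python3
# Sketch: pass the secret over stdin instead of argv, so it is not
# visible in ps output. Assumes a hypothetical ./do_action.sh that
# reads "username:password" from standard input instead of $1.
import subprocess

def do_action(username, password):
    proc = subprocess.Popen(["./do_action.sh"], stdin=subprocess.PIPE, text=True)
    proc.stdin.write(f"{username}:{password}\n")
    proc.stdin.close()   # child sees EOF once it has read the secret
    return proc          # keeps running in the background, like the & in bash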

What's the remedy? Just move to a real language. That way, you avoid most of these footguns.

The same logic, implemented in Python:

#!/usr/bin/env python3

import glob
import json

with open('locks.json') as f:
    locks = json.load(f)

def lock(file):
    locks[file] = True

def unlock(file):
    locks[file] = False

for file in glob.glob("**/*.json", recursive=True):
    if file == 'locks.json':
        continue  # don't treat our own state file as a job

    with open(file) as f:
        envvars = json.load(f)

    if locks.get(file):
        continue

    lock(file)
    arg = f'{envvars["username"]}:{envvars["password"]}'
    do_action(arg, background=True)  # stand-in for ./do_action.sh ... &
    unlock(file)

with open('locks.json', 'w') as f:
    json.dump(locks, f)

Now that I look at it, it might actually be the case that I am not a great system designer quite yet. The Python code above has race conditions of its own if multiple instances of the script are invoked: each instance reads locks.json at startup and only writes it back at the end, so they happily overwrite each other's locks. BUT! My talking points still stand rock solid.

That leads me to thinking: how do you design a system to be resilient to deadlocks, whilst allowing parallel processing? The file system is inherently parallelizable (just write to another .lock file) and has a single namespace.
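For what it's worth, the file system can give you both if you lean on the kernel instead of on marker files: a lock taken with flock(2) is released automatically when the holding process dies, so a crash can't leave a stale lock behind, and different files can still be locked and processed in parallel. A minimal sketch, with the actual work stubbed out:

#!/usr/bin/env python3
# Sketch: per-file advisory locks via flock(2). The kernel releases the
# lock when the owning process exits, so a crash never leaves a stale lock.
import fcntl
import glob

def do_action(path):
    print(f"would process {path}")   # stand-in for the real work

for path in glob.glob("**/*.json", recursive=True):
    lockfile = open(path + ".lock", "w")
    try:
        # Non-blocking: if another instance already holds this lock, skip the file.
        fcntl.flock(lockfile, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        lockfile.close()
        continue
    try:
        do_action(path)
    finally:
        lockfile.close()   # closing the file descriptor releases the lock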

Unless we daemonize the Python script?

#!/usr/bin/env python3

import json
from datetime import datetime, timedelta

import pause  # third-party sleep-until library: pip install pause

# do_the_for_loop() is the glob/lock/do_action loop from above;
# save_locks() writes the locks dict back to locks.json.

with open('locks.json') as f:
    locks = json.load(f)

while True:
    next_run = datetime.now() + timedelta(minutes=30)

    try:
        do_the_for_loop(locks)
    except Exception:
        save_locks(locks)  # persist state even when a run blows up
        raise

    save_locks(locks)
    pause.until(next_run)

That's hot. I like this architecture a lot more. The state is written to disk on each run, successful or not. The script can be service-ified by systemd, so that it is restarted when it dies.
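For reference, a minimal unit-file sketch of that service-ification (the name and paths here are made up):

# /etc/systemd/system/do-actions.service  (hypothetical name and path)
[Unit]
Description=Process JSON action files every 30 minutes

[Service]
ExecStart=/usr/bin/python3 /opt/scripts/do_actions.py
Restart=always
RestartSec=30

[Install]
WantedBy=multi-user.target

Then systemctl enable --now do-actions.service, and the loop comes back whenever it dies.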

Is this how a design session usually goes? Cause I need more of those at $MY_COMPANY then. Also, remember that I've only worked with corporate software since September last year; I'm still learning to become a code monkey.