TIL: git push --force-with-lease

Don't ever type git push --force. Yes, there are times we have to hold our nose and do a force push. Maybe the project requires contributions to be rebased or squashed. Maybe we pushed the nuclear launch codes. But there are failure modes:

  • You might be accidentally pushing to or from the wrong branch, and hence are about to blow away valuable work at the remote. Yes, it's unlikely, and can be fixed after the fact, but who knows how much embarrassing disruption and confusion you'll cause the team before you realize what you did.
  • Do you always remember to check the state of the remote, to make sure there aren't unexpected extra commits on the remote that you'll unknowingly blow away when you push? Do you enjoy always having to type those extra commands to pull and check the remote commits?
  • No matter how conscientious we are about checking the above, there is a race condition. We might check the remote, then someone else pushes valuable changes, then we force push and blow them away.

Although there are conventions that can help with all the above (e.g. only ever force pushing to your own fork, to which nobody else ever pushes), they aren't generally watertight. (e.g. you might have pushed something yourself, before vacation, and forgotten about it.)

So the generally agreed method to avoid the above failure modes is "be more careful", which sounds to me like the common prelude to failure. What we need are push's newer command-line options:

--force-with-lease
Like --force, but refuses to push if the remote ref doesn't point at the same commit that our local remote-tracking branch 'origin/mybranch' thinks it should. So if someone else pushes something to the remote's 'mybranch' just before we try to force push, our push will fail until we pull (and, in theory, inspect) those commits that we were about to blow away.

It turns out that this is inadequate. We might have fetched an up-to-date remote branch, but somehow or other ended up with our local HEAD on a divergent branch anyway:

C origin/mybranch
|
B¹   B² HEAD mybranch
 \ /
  A
  |

In this situation, --force-with-lease will allow us to push, not only blowing away the original commit B¹, as we intended, but also C, which was maybe pushed by someone else before we fetched.

To guard against this, we can use the even newer option:

--force-if-includes
This makes --force-with-lease even more strict about rejecting pushes, using clever heuristics on your local reflog, to check that the remote ref being updated doesn't include commits which have never been part of your local branch.
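As a rough mental model of the two checks, here is a toy sketch in Python. It is not how git implements them: the commit graph, the reflog contents and the helper names (ancestors, lease_ok, includes_ok) are all made up for illustration, using the commits from the diagram above.

```python
# Toy model of the two safety checks. Commits are strings, and PARENTS maps
# each commit to its parent, mirroring the diagram: A <- B1 <- C and A <- B2.
PARENTS = {"A": None, "B1": "A", "C": "B1", "B2": "A"}


def ancestors(commit):
    """Return the set containing commit and everything reachable from it."""
    seen = set()
    while commit is not None:
        seen.add(commit)
        commit = PARENTS[commit]
    return seen


def lease_ok(remote_tip, tracking_tip):
    """--force-with-lease: the remote must be where origin/mybranch says it is."""
    return remote_tip == tracking_tip


def includes_ok(tracking_tip, local_reflog):
    """--force-if-includes: the tracking tip must be reachable from some commit
    that our local branch has pointed at in the past (its reflog)."""
    return any(tracking_tip in ancestors(entry) for entry in local_reflog)


# We fetched, so origin/mybranch == remote == C, but mybranch only ever
# pointed at B1 and then B2 (a hypothetical reflog).
print(lease_ok("C", "C"))               # True: the lease alone lets the push through
print(includes_ok("C", ["B2", "B1"]))   # False: C was never integrated locally
print(includes_ok("B1", ["B2", "B1"]))  # True: rewriting our own B1 is fine
```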

Upshot is, I plan to default to always replacing uses of --force with:

git push --force-with-lease --force-if-includes ...

That's a lot to type, the options don't have short versions, and it's easy to forget to do. Hence, shadow git to enforce it, and make it easy. In .bashrc or similar:

# Shadow git to warn against the use of 'git push -f'
git() {
    local arg is_push=false is_force=false
    for arg in "$@"; do
        [ "$arg" = "push" ] && is_push=true
        [ "$arg" = "-f" ] && is_force=true
        [ "$arg" = "--force" ] && is_force=true
    done
    if [ "$is_push" = true ] && [ "$is_force" = true ]; then
        # Suggest alternative commands.
        echo "git push -f: Consider 'git push --force-with-lease --force-if-includes' instead, which is aliased to 'gpf'"
        return 1
    fi
    # Run the given command, using the git executable instead of this function.
    command git "$@"
}

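# git push force: using the new, safer alternatives to --force
gpf() {
    # Older versions of git don't have --force-if-includes. Fall back to omitting it.
    # Check for support up front: retrying after any failure would also retry,
    # without the option, pushes that --force-if-includes had correctly rejected.
    if git push -h 2>&1 | grep -q -- '--force-if-includes'; then
      git push --quiet --force-with-lease --force-if-includes "$@"
    else
      git push --quiet --force-with-lease "$@"
    fi
}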

Then trying to do it wrong tells you how to easily do it right:

$ git push -f
git push -f: Consider 'git push --force-with-lease --force-if-includes' instead, which is aliased to 'gpf'
[1]
$ gpf
$

(The [1] is my prompt telling me that the last command had an error exit value.)

Structural Pattern Matching in Python

I read through descriptions of structural pattern matching when it was added in Python 3.10 a couple of years ago, and have studiously avoided it ever since. It seemed like a language feature that's amazingly useful in one or two places, like writing a parser, say, and is a horrifically over-complicated misstep just about everywhere else.

Update: A day after writing this I see that Guido van Rossum wrote exactly that, a parser, to showcase the feature. I'm guessing he writes a lot of parsers. I definitely don't write enough of them to think this language feature is worth the extra complexity it brings.

Regardless, I really ought to remember how it works, so this is my attempt to make the details stick, by writing about it.

If you're not me, you really ought to be reading about it from the source instead:

Basic structure

match EXPRESSION:
    case PATTERN1:
        ...
    case PATTERN2:
        ...
    case _:
        ...

This evaluates the match EXPRESSION, then tries to match it against each case PATTERN, executing the body of the first case that matches, falling back to the optional final _ default case. (match and case are not keywords, except in the context of a match...case block, so you can continue using them as variable names elsewhere.)

But what are PATTERNs, and how are they tested for a match?

Patterns

Patterns can be any of the following. As becomes increasingly obvious down the list, the real power of this feature comes from composing each of these patterns with the others. For complicated patterns, parentheses can be used to indicate order of operations.

Literals

Like other languages' traditional switch statement:

match mycommand:
    case 'start':
        ...
    case 'stop':
        ...
    case _:
        raise CommandNotFoundError(mycommand)

Such literal case patterns may be strings (including raw and byte-strings, but not f-strings), numbers, booleans or None.

Such cases are compared with equality:

match 123:
    case 123.0:
        # matches!
        ...

except for booleans and None, which are compared using is:

class Any:
    def __eq__(self, _):
        return True

myfalse = Any()

match myfalse:
    case False:
        # Doesn't match, even though myfalse == False
        assert False

Variable names

We can replace a literal with a variable name, to capture the value of the match expression.

match command:
    case 'start':
        ...
    case 'stop':
        ...
    case unknown:
        # New variable 'unknown' is assigned the value of command
        ...

The 'default' case pattern _ is a special-case variable name, which matches anything but binds no name.

Beware the common error of using "constants" as the case pattern:

NOT_FOUND = 404

match error:
    case NOT_FOUND: # bad
        handle_404()

The above case is intended to test for error == NOT_FOUND, but instead assigns the variable NOT_FOUND = error. The best defense is to always include a default catch-all case at the end, which causes the above NOT_FOUND case to produce a SyntaxError:

NOT_FOUND = 404

match error:
    case NOT_FOUND:
        handle_404()
    case _:
        pass
SyntaxError: name capture 'NOT_FOUND' makes remaining patterns unreachable

To use a 'constant' in a case pattern like this, qualify it with a dotted name, such as by using an enum.Enum:

match error:
    case errors.NOT_FOUND:
        # correctly matches
        ...

Sequences

Using a list-like or tuple-like syntax. To match, the value must have the right number of items, like Python's existing iterable unpacking feature. Use * to match the rest of a sequence. Any variable names in the pattern are set only if the case matches by all other criteria.

match command:
    case ('start', name):
        # New variable name=command[1]
        ...
    case ('stop', name):
        # New variable name=command[1]
        ...
    case ('stop', name, delay):
        # New variables name=command[1], delay=command[2]
        ...
    case ('stop', name, delay, *extra):
        # New variables name=command[1], delay=command[2] & extra=command[3:]
        ...
    case _:
        raise BadCommand(command)

Mappings

Using a dict-like syntax. The match expression must be a mapping that contains the given keys, and can contain other keys, too. Use ** to capture the rest of a mapping.

match config:
    case {'host': hostname}:
        # 'config' must contain key 'host'. New variable hostname=config['host']
        ...
    case {'port': portnumber}:
        # 'config' must contain key 'port'. New variable portnumber=config['port']
        # Remember we only use the first matching case.
        # If 'config' contains 'host', then this 'port' case is never reached.
        ...
    case {'scheme': scheme, **extras}:
        # new variables 'scheme' and 'extras' are assigned.
        ...

Case patterns may contain more than one key-value pair. The match expression must contain all of them to match.

    case {
        'host': hostname,
        'port': portnumber,
    }:
        ...

Objects and their attributes

Using class syntax, the value must match an isinstance check with the given class:

match event:
    case Click():
        # handle click
        ...
    case KeyPress():
        # handle key press
        ...

Beware the common error of omitting the parentheses:

match myval:
    case Click: # bad
        # handle clicks
        ...

The above case is intended to test for isinstance(myval, Click), but instead creates a new variable, Click = myval. The best defense against this error is to always include a default catch-all at the end, which makes the Click catch-all produce an error by making subsequent patterns unreachable.

Attribute values for the class can be given, which must also match.

match event:
    case KeyPress(key_name='q', release=False):
        game.quit()
    case KeyPress():
        handle_keypress(event)

Values can also be passed as positional args to the class-like case syntax:

    case KeyPress('q', True):
        ...

If the class is a namedtuple or dataclass, then positional args to a class-like case pattern can automatically be handled using the unambiguous ordering of its attributes:

@dataclass
class Dog:
    name: str
    color: str

d = Dog('dash', 'golden')

match d:
    case Dog('dash', 'golden'):
        # matches
        ...

But for regular classes, the ordering of the class attributes is ambiguous. To fix this, add a __match_args__ attribute on the class, a tuple which specifies which class attributes, in which order, can be specified in a case pattern:

class KeyPress:
    __match_args__ = ('key_name', 'release')

    def __init__(self, key_name, release):
        self.key_name = key_name
        self.release = release

event = KeyPress(key_name='q', release=False)

match event:
    case KeyPress('q', False):
        # matches!
        ...

As you might expect, the literal positional args can be replaced with variable names to capture attribute values instead:

match event:
    case KeyPress(k, r): # names unimportant, order matters
        handle_keypress(k, r)

Positional sub-patterns behave slightly differently for builtins bool, bytearray, bytes, dict, float, frozenset, int, list, set, str, and tuple. A positional value is matched by equality against the match expression itself, rather than an attribute on it:

match 123:
    case int(123):
        # matches
        ...
    case int(123.0):
        # would also match if it wasn't shadowed
        ...

Similarly, a positional variable is assigned the value of the match expression itself, not an attribute on that value:

match 123:
    case int(value):
        ...

assert value == 123

The values passed as keyword or positional args to class-like case patterns can be more than just literals or variable names. In fact they can use any of the listed pattern types. For example, they could be a nested instance of this class-like syntax:

class Location:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Car:
    def __init__(self, location):
        self.location = location

mycar = Car(Location(11, 22))

match mycar:
    case Car(location=(Location(x=x, y=y))):
        # matches, and captures 'x' and 'y'
        ...

assert x == 11
assert y == 22

Combine patterns using |

To match either one pattern or another:

    case 1 | True | 'true' | 'on' | 'yes':
        # matches any of those values

Capture sub-patterns using as

We've seen how we can either match against a value, or capture the value using a variable name. We can do both using as:

    case 'a' | 'b' as ab:
        # matches either value, captures what the value actually was

This might not be much use when capturing the whole match expression like that. If the match expression is just a variable, then we could instead simply refer to that variable. But using as can be useful when the match expression is lengthy or has side-effects:

match events.get_next():
    case KeyDown() as key_event:
        ...

or to capture just a component of the whole expression. Contrived example:

    case ('a' | 'b' as ab, 'c'):
        # matches ['a', 'c'] or ['b', 'c'], and captures the first letter in 'ab'

An if guard clause

Add arbitrary conditions to the match:

    case int(i) if i < 100:
        # matches integers less than 100

Or, alternatively:

    case int() as i if i < 100:
        # matches integers less than 100

Complications

This feature seems rife with complexity. The flexible syntax of case patterns forms a new mini-language, embedded within Python. It has many similarities to Python, but also many initially unintuitive differences.

Take, for example, a class-like case pattern such as case Click():. Anywhere else in the language, an expression like Click(...) would create an instance of the Click class. In a case clause, it instead does things like isinstance and hasattr checks.

Similarly, including a variable name doesn't evaluate to the variable's value as in ordinary Python. Instead it binds a value to that name. This is the source of the annoying gotcha described above, that bare "constants" like NOT_FOUND behave very unexpectedly when used as case patterns.

There are a few places in real-world code where structural pattern matching will produce nicer code than the equivalent using nested elifs. But equally, there are a lot of places where the elifs are a more natural match. Developers now get to choose which they're going to use, and then later disagree with each other about it, or simply change their mind, and end up converting code from one to the other.

If this was a simple feature, with low overheads, then I'd forgive its inclusion in the language, accepting the costs in return for the marginal and unevenly distributed benefits.

But it's really not simple. In addition to Python programmers all having to do an exercise like this post just to add it to their mental toolbox, it needs maintenance effort, not just in CPython but in other implementations too, and needs handling by tools such as syntax highlighters and type checkers. It really doesn't seem like a net win to me, unless you're writing way more parsers than the average programmer, which no doubt the champions of this feature are.

Ur-Fascism

by Umberto Eco, 1995.

Eco's prose has left me in the dust on occasion in the past. I missed so many references, or failed to keep up with the relentlessly nested layers of meaning, that I was simply holding on for the ride. This essay matches Eco's characteristically dense, intellectual prose, studded with foreign phrases, and references to contemporary thinkers, historical movements and causes and revolutions and dictatorships, and their antecedents. But it is short, and perhaps in contrast to the flourishes of brilliance that comprise his fiction, this is a factual piece, and is written to be understood rather than to dazzle. It begins by establishing Eco's credentials to speak on this topic, with the first of several entrancing first-hand indications of what it was like to grow up as an intellectual young child in Italy under Mussolini, and the revelations that followed at the opening up of his world at the end of WWII.

He differentiates the truly totalitarian fascism of, say, Nazism, which subordinated all of life to the state, from the looser, less coherent Italian fascism, noting that this looseness did not derive from any increment of tolerance, merely the absence of a sufficiently encompassing underlying philosophy. Despite this, it is the Italian mode, which was the first right-wing dictatorship in modern history, from which subsequent dictators seem to have drawn most stylistic inspiration, and from which our generic term of "fascism" derives.

The last half of the essay describes how fascism means different things in different contexts, but the various incarnations through history have exhibited sufficiently overlapping sets of symptoms as to share a family resemblance. Eco enumerates 14 identifying traits, noting that the presence of even one of them can be sufficient to allow fascism to coagulate around it. Mostly for my own benefit (with my own parenthesised observations), they are, briefly:

  1. Fascism incorporates a cult of tradition. This can be deployed as an automatic refutation of any undesirable new ideas, enshrining in their place the immutable wisdom of a mythical past. In addition, traditionalism undermines the perceived value of learning in itself - pre-emptively thwarting troublesome intellectuals. This anti-intellectual received wisdom, in order to provide whichever justifications are required of it, requires the syncretistic combination of various ancient beliefs. As a result it must tolerate contradictions. Indeed, the more stark they are, the better they serve the purpose of selecting followers who will obediently think whatever they are told.

  2. The rejection of modernism. This is a powerful recruiting tool, enabling the fascist to leverage any dissatisfaction of the populace, laying claim to the emotionally appealing universal solution of a regression to simpler, happier times, while simultaneously rewinding societal progress in equality or liberty. While Nazis and fascists love their technology, this is a superficial tool, used in support of a deeply regressive project, namely the restoration of power to those with the strength and the will to take it. This irrationality goes hand in hand with anti-intellectualism.

  3. Value vigor and action above reflection. Thinking is emasculation. Culture is suspect insofar as it aligns with any sort of critical theory or values. Regard centers of analysis or learning such as libraries or universities with suspicion for harboring - or even indoctrinating - people of opposing political viewpoints. Again, this is deeply intertwined with (1) & (2), and its prevalence pre-emptively defuses any kind of mainstream understanding or critique.

  4. All of the above make it inevitable that any given fascist regime will be rife with internal contradictions. While modernity achieves its intellectual prowess through the nurturing of diverse thought, fascism cannot possibly withstand any critical analysis. Hence, disagreement is treason.

  5. The fascist appeal to popularity exploits and exacerbates the natural fear of the other, and hence is always inherently racist. Expect demonization of immigrants, foreigners, other nation states, as well as other marginalized groups, taking advantage of whatever local contemporary biases and fears might be present.

  6. The above exploitation takes the form of an appeal to the frustrations of the middle class - or whichever class can be most useful and readily mobilized by persuasion that their problems are caused by some identifiable other.

  7. Modernity genuinely does disintegrate traditional social bonds, along with sources of identity and meaning. Fascism's solution to this is to unify the disaffected under the only remaining banner common to them all, that of patriotism and nationalism. This unity is strengthened by emphasis on the country's enemies, and especially by conspiracy theories of secret international plots against the nation. Followers must feel besieged (as Trump advised the January 6th crowd that "America is under siege"). Eco makes special mention, in the U.S., of Pat Robertson's The New World Order, but potential sources of conspiracy are innumerable.

  8. Followers can be riled into a frenzy of humiliation at their enemies' wealth or power. Jews control the world and its money. Instead of coastal liberals, refer to coastal elites. But at the same time, the instinct to action requires that enemies can easily be defeated. Hence enemies are simultaneously too weak and too strong. Herein lies one of fascism's greatest weaknesses, responsible for several lost wars, in that it is constitutionally incapable of objectively assessing an enemy's strength.

  9. Goad followers into violent action with rhetoric not just of a struggle for survival, but by declaration that life is struggle, and hence pacifism is conspiring with the enemy.

  10. While fascism appeals for the participation of the population by promising empowerment for the majority, its naked power lust is a fundamentally aristocratic endeavor. The leader takes power from those too weak to oppose him, disdaining both conquered rivals and the subjugated population. Power struggles within the Party are vicious, and the party rides roughshod over the citizens, who likewise are leagues above the disenfranchised. The hierarchy is strict, steep, and ruthless. Elitism abounds, as does fear of losing one's status.

  11. The redress of modernity's threadbare social fabric, by emphasizing nationalism and strength, further erodes interpersonal solidarity. Each individual must become their own hero. Strong, independent, and utterly without recourse in times of need. The cult of the hero is intimately entwined with a cult of death. Having only the narrow causes of the nation and the Party to live for, the hero yearns for a heroic death - or, better, to demonstrate their power and status by sending others to their death.

  12. The preeminent will to power, so often frustrated in an aristocratic, dog-eat-dog social order, manifests alternately in things like machismo, disdain for women, and phallic fetishization of weapons. Repressed insecurity breeds an outward contempt for unconventional sexuality, including chastity.

  13. Under fascism, the people have no innate rights, and hence no material preferences or expression. Instead, the leader pretends to interpret the Will of the People. This charade requires the party apparatus to select and amplify some emotional response of the people, and present it as representative, so that the party can be empowered to act on behalf of that supposed majority. Consider the amplifications of manufactured outrage about culture war issues, so that elected representatives are empowered to act decisively on their own preferences, against the majority of the population's wishes. This leads directly to confrontation with institutions such as a parliament. Fascism will therefore cast aspersions on any properly functioning parliament's legitimacy.

  14. All fascisms make use of their own varieties of Newspeak, using an impoverished vocabulary and syntax, in order to limit the instruments for critical reasoning. This may appear in apparently innocent forms, from schoolbooks to popular talk shows.

TIL: Makefiles that are self-documenting, and process all extant files.

Self-documenting Makefiles

A trick from years ago, but I copy it around between projects enough that it deserves calling out. Add a target:

help: ## Show this help.
    @# Optionally add 'sort' before 'awk'
    @grep -E '^[a-zA-Z_\.%-]+:.*?## .*$$' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-10s\033[0m %s\n", $$1, $$2}'
.PHONY: help

Decorate other targets with a descriptive '##' comment, like "Show this help" above. Now calling the 'help' target will summarize all the things the Makefile can do, e.g.:

$ make help
help       Show this help.
setup      Install required system packages using 'apt install'.
%.pdf      Generate given PDF from corresponding .tex file.
all        Convert all .tex files to PDF.

You might choose to make 'help' the first target in the Makefile, so that it gets triggered when the user runs make without arguments.
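To see what the grep and awk pipeline is doing, here is a rough Python equivalent. It is only a sketch: the function names and the example Makefile text are invented, and it leaves out the color escape codes.

```python
import re

# Same shape as the grep pattern: a target name, a colon, then a '## ' comment.
HELP_LINE = re.compile(r"^([a-zA-Z_.%-]+):.*?## (.*)$")


def extract_help(makefile_text):
    """Return (target, description) pairs for every '##'-annotated target."""
    pairs = []
    for line in makefile_text.splitlines():
        found = HELP_LINE.match(line)
        if found:
            pairs.append((found.group(1), found.group(2)))
    return pairs


def format_help(pairs):
    """Left-align target names in a 10-character column, as the awk printf does."""
    return "\n".join(f"{target:<10} {description}" for target, description in pairs)


example = """\
help: ## Show this help.
\t@grep ...
%.pdf: %.tex ## Generate given PDF from corresponding .tex file.
\tpdflatex $<
clean:
\trm -f *.pdf
"""

print(format_help(extract_help(example)))
```

Recipe lines start with a tab, and the undecorated 'clean' target has no '##' comment, so neither shows up in the output.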

Process all extant files

Make's canonical paradigm is that you tell it the name of the file to generate, and it uses the tree of dependencies specified in the Makefile to figure out how to build it. Typically you'll use automatic variables like "$<" to represent the wildcarded source filename:

%.pdf: %.tex ## Generate given PDF from corresponding .tex file.
    pdflatex $<

The pitfall is that when invoking this, you have to name all the PDF files you want to generate. If the names are a fixed part of your build, they can be embedded in the Makefile itself:

all: one.pdf two.pdf three.pdf

But if their names are dynamic, you have to specify them on the command line, which is a pain:

$ make one.pdf two.pdf three.pdf

This is easy enough when re-generating all the PDFs that already exist:

$ make *.pdf

but is no help when you just have a bunch of .tex files and you just want Make to build all of them. This is going the opposite way to canonical make usage. We want to specify the existing source files (*.tex, in this case), and have Make build the resulting products.

To do it, we need our Makefile to enumerate the existing source files:

TEX_FILES = $(wildcard *.tex)

Using the 'wildcard' function here behaves better than a bare wildcard expansion, e.g. it produces no output when there are no matches, rather than outputting the unmatched wildcard expression.

Then use a substitution to generate the list of .pdf filenames:

all: $(TEX_FILES:%.tex=%.pdf) ## Convert all .tex files to PDF.

Now make all will generate one .pdf file for each extant .tex file, regardless of whether the corresponding .pdf files already exist or not.
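For comparison, the same two steps (enumerate the sources, then rewrite the suffix) look like this in Python. This is only an analogy to illustrate the idea, not something Make runs, and pdf_targets is a made-up name:

```python
from pathlib import Path
import tempfile


def pdf_targets(directory):
    """Mimic $(wildcard *.tex) followed by the %.tex=%.pdf substitution."""
    tex_files = sorted(Path(directory).glob("*.tex"))   # [] when nothing matches
    return [tex.with_suffix(".pdf").name for tex in tex_files]


with tempfile.TemporaryDirectory() as tmp:
    for name in ("one.tex", "two.tex", "notes.txt"):
        (Path(tmp) / name).touch()
    print(pdf_targets(tmp))   # ['one.pdf', 'two.pdf']
```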