Jenkins "fatal: reference is not a tree" on git checkout

(Not really sure where to report issues with Django’s Jenkins builds.)

Some Jenkins PR builds seem to be randomly failing with a “reference is not a tree” error in git checkout. Example:

> git checkout -f cc5f6d59e04c23681458424e3768d31cb0e2828f # timeout=10
hudson.plugins.git.GitException: Command "git checkout -f cc5f6d59e04c23681458424e3768d31cb0e2828f" returned status code 128:
stdout: 
stderr: fatal: reference is not a tree: cc5f6d59e04c23681458424e3768d31cb0e2828f

The failures look intermittent rather than tied to any particular job. (That is the correct SHA for the PR commit.)

This may be a transient GitHub caching issue or something like that. But I noticed that GitHub’s actions/checkout solved a similar issue a few years back: their action “fetches a specific SHA and retries with a few delays between before failing.” If the problem continues, perhaps the Jenkins builds should adopt a similar strategy?

[FWIW, the ReadTheDocs preview hook failed with that same git error on an earlier run, which seems to confirm this is a GitHub glitch of some sort.]

I normally force push again after a failure like this.

There are a few issues on the Jenkins git plugin that mention this error: https://plugins.jenkins.io/git/issues/
I can’t find an obvious fix from a Google search :thinking:

Yeah, that’s what I did. Force pushing Whack-a-Moled the “reference is not a tree” failure over to a different job in the next run.

Best guess is that there’s a cached version of the repo in some GitHub endpoint. After a force push, the workflow kicks off immediately with the SHA of the new head, but the repo cache may not be updated when the Jenkins runner tries to fetch that SHA.

GitHub’s own workaround for this (in their actions/checkout) is to retry up to three times, randomly delaying 10-20 seconds after each failure, for git fetch and git ls-remote.
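
For anyone who wants to try that strategy, here is a rough Python sketch of the retry-with-random-delay idea described above. It is not the actions/checkout code and not a Jenkins git plugin setting. The names `retry` and `fetch_commit` are made up for illustration, and it assumes the remote allows fetching a commit by its SHA.

```python
import random
import subprocess
import time


def retry(operation, attempts=3, min_delay=10.0, max_delay=20.0, sleep=time.sleep):
    """Call operation() up to `attempts` times.

    After each failed attempt except the last, wait a random
    min_delay..max_delay seconds before trying again.
    (Hypothetical helper; defaults mirror the numbers quoted above.)
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except subprocess.CalledProcessError:
            if attempt == attempts:
                raise  # out of retries: surface the last failure
            sleep(random.uniform(min_delay, max_delay))


def fetch_commit(sha, remote="origin", cwd="."):
    """Fetch one specific commit, retrying if git exits with an error.

    Assumes the server permits fetching by SHA.
    """
    def run_fetch():
        # check=True raises CalledProcessError on a non-zero exit status
        subprocess.run(["git", "fetch", remote, sha], cwd=cwd, check=True)

    retry(run_fetch)
```

A build step could call `fetch_commit("cc5f6d59e04c23681458424e3768d31cb0e2828f")` before running `git checkout -f` on that SHA. The `sleep` parameter only exists so the delay can be swapped out when testing.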

Again, probably no action necessary, unless these failures become frequent. (Just wanted to report the issue somewhere, in case others are seeing it.)