Description
I feel like there are ways to better optimize the process of updating to a new Gleam version.
Right now, the patches applied to the Gleam codebase are hard to tell apart and have to be documented manually by me, which is quite error-prone. This is because I'm rebasing Gleam on top of the codebase, and not the opposite.
I wanted to follow a rebase-only model from the start, but I didn't want to restart `main` from a clean slate each time. Of course, the full history wouldn't be lost, since it could be kept in separate branches or tags. However, that would be very inconvenient, especially considering Glistix's target audience, Nix users: Nix flakes are very "git-dependent", and I didn't want to cause problems for them.
However, it's possible we could get the best of both worlds - rebase on each update and keep patches separate, while also keeping the full history available on `main` - by following a variation of the advice provided in a 2016 post by Die Antwort [1]: we could keep a separate branch which is constantly rebased (let's say, "unstable" or "dev"), and its state is then fully applied to `main` as a single "super merge" commit (that is, entirely replacing `main`'s contents) on each Gleam version bump.
That way, we can more easily keep track of changes that are still relevant and know what to do with them as Gleam updates come by (those changes would be my own commits in the rebased / unstable branch), while also keeping track of the full history of changes to Glistix (`main` would basically collect merges over time).
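One way to build such a "super merge" with plain Git is sketched below. This is an illustration under assumptions, not necessarily the procedure Glistix ends up using: the function name `super_merge` is hypothetical, the branch names are passed in as arguments, and the sketch assumes a clean working tree. It creates a merge commit on the long-lived branch whose tree is copied verbatim from the rebased branch, so the contents are replaced while both histories stay reachable.

```shell
set -eu

# Hypothetical helper: record the state of a rebased branch on a long-lived
# branch as a single merge commit.
# Usage: super_merge <long-lived-branch> <rebased-branch> <message>
# The body runs in a subshell so its variables do not leak into the caller.
super_merge() (
  target="$1"
  src="$2"
  message="$3"

  # Build a commit that reuses the rebased branch's tree as-is and lists both
  # branches as parents. No content merge is attempted, so no conflicts arise.
  commit=$(git commit-tree "${src}^{tree}" -p "$target" -p "$src" -m "$message")

  # The new commit descends from the long-lived branch, so a fast-forward
  # moves the branch and updates the working tree in one step.
  git checkout -q "$target"
  git merge -q --ff-only "$commit"
)
```

A call might look like `super_merge main dev "Update to new Gleam version"`. Afterwards, `git diff main dev` is empty, and `git log --first-parent main` shows one commit per update.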
Another relevant case study is Zen browser [2], a Firefox fork, which keeps its source code as pure `.patch` files. However, I like being able to easily browse the current state of the repository on GitHub, and a patch-only layout would make that difficult, so I am (at least in principle) opposed to it.
Related: #41
Update: I've taken the initiative to try a slight variation of this approach. I've created a `patchset` branch (https://github.com/Glistix/glistix/commits/patchset/) whose goal is to hold all changes to upstream, as well as one single squashed commit with all non-upstream changes (changes to `compiler-core/src/nix/`, `nix/`, `flake.nix`, `flake.lock`, and anything exclusive to Glistix).
This is because updating Gleam won't ever change those files unless I explicitly change them myself, since those files do not exist in upstream. In addition, keeping all of my own commits separate would lead to "death by a thousand cuts" and only make me waste more time: any conflict in an early commit would cascade through every later commit that touches the same code, whereas it could be fixed once at the tip of the branch.
Of course, this cascading effect is intended for patches to upstream code, since the upstream base continually changes. It is, however, not intended for changes to non-upstream code, at least in principle, since I have full control and history available for this code.
As a result, PRs will continue to target `main` (although PRs patching upstream should ideally be fully separate from PRs targeting non-upstream code, adding stub non-upstream code if needed), and updating Gleam will consist of:
- Applying any missing commits from `main` to `patchset`;
- Rebasing the `patchset` branch on top of the new Gleam version;
- Fixing conflicts within individual patch commits;
- If necessary, updating the non-upstream code;
- Merging the diff back to `main` (ideally, neatly separated into different commits, one for each changed patch).
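The rebase and merge-back steps above could look roughly like this with plain Git. This is a sketch under assumptions: both helper names are hypothetical, the upstream refs are placeholders for whatever tags or commits mark the old and new Gleam versions, and conflict resolution stays manual. Applying missing commits from `main` beforehand would typically be a `git cherry-pick` of the relevant commits and is left out.

```shell
set -eu

# Hypothetical helper: replay only the fork's own commits (old base..branch)
# onto the new upstream base. Exits non-zero if a patch commit conflicts;
# resolve the conflict, then run `git rebase --continue`.
# Usage: rebase_patchset <old-upstream-ref> <new-upstream-ref> <branch>
rebase_patchset() {
  git rebase -q --onto "$2" "$1" "$3"
}

# Hypothetical helper: make the index and working tree of <target> match
# <source> exactly, including deletions, and leave the result staged so it
# can be reviewed before committing.
# Run from the repository root with a clean working tree (needs Git 2.23+).
# Usage: stage_branch_state <target-branch> <source-branch>
stage_branch_state() {
  git checkout -q "$1"
  git restore --source="$2" --staged --worktree -- .
}
```

After `stage_branch_state main patchset`, the staged changes are the same change set that `git diff main..patchset` reported before, ready to be committed on `main`.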
Some remaining questions:
- Should we follow through with this change?
  Temporary answer: Let's try it for the next Gleam version and see how it goes.
  Answer: Some initial local tests indicate this approach is pretty robust, and I was able to quickly update Glistix to the latest Gleam main. Let's test again once Gleam 1.8.0 is out.
- What should the process of bumping the Gleam version look like in the end? I was thinking of pushing PRs directly to the rebase branch, and rebasing it only when there is a new Gleam version. We'd push it back to main on each Glistix version (or regularly, if needed).
  Answer: See above.
- In particular, when merging back from the rebase branch to the main branch, should we squash the commits or do a regular merge? I mostly prefer to squash since GitHub (and Git, for that matter) shows all commits from both merged branches, which is particularly inconvenient and annoying since our own history gets mixed up with upstream's history.
  Answer: Any changes to the patchset branch while updating Gleam are recorded (can be visualized with `git diff main..patchset` or `jj diff --from main --to patchset`) and become commits in an update PR.
- How to deal with test snapshot names? Right now, they're the most frequent source of false-positive conflicts when bumping Gleam, since Gleam commits update tests prefixed with `gleam_core__` (for example) while we rename our tests to be `glistix_core__` due to the crate name. One potentially simple alternative is to have the first commit post-rebase always be "rename all the crates and tests". However, it'd also be nice if we could force `cargo insta` to use the `gleam_core__` prefix since it really doesn't matter. (I'd like to keep the crate name, if possible, though.)
  Answer: There is a single auto-generated commit at the very beginning of the `patchset` branch, right after the Gleam commits. This commit can be regenerated by running `bin/glistix-fix-test-names.sh` again.
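The snapshot renaming mentioned in the last answer can be illustrated with a small shell function. This is not the contents of `bin/glistix-fix-test-names.sh`, which is not shown here; it is only a guess at the core idea, assuming snapshot files end in `.snap` and start with the crate prefix. The function name and its arguments are hypothetical.

```shell
set -eu

# Hypothetical helper: rename snapshot files under <dir> so that the
# <old-prefix> at the start of each file name becomes <new-prefix>.
# Files that do not start with <old-prefix> are left untouched.
# Usage: rename_snapshots <dir> <old-prefix> <new-prefix>
# The body runs in a subshell so its variables do not leak into the caller.
rename_snapshots() (
  dir="$1"
  from="$2"
  to="$3"

  find "$dir" -type f -name "${from}*.snap" | while IFS= read -r path; do
    base=$(basename "$path")
    # Strip the old prefix from the file name and prepend the new one.
    mv "$path" "$(dirname "$path")/${to}${base#"$from"}"
  done
)
```

For instance, `rename_snapshots compiler-core gleam_core__ glistix_core__` would turn freshly rebased upstream snapshot names into the fork's names, so that committing the result right after the Gleam commits keeps later patch commits free of name-only conflicts.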
Footnotes
1. "Git Tricks for Maintaining a Long-Lived Fork" by Die Antwort (2016): https://die-antwort.eu/techblog/2016-08-git-tricks-for-maintaining-a-long-lived-fork/
2. Zen browser's repository: https://github.com/zen-browser/desktop