Darcs Commit Coalescing



Overview

One useful workflow with a DVCS (distributed version control system) is to keep multiple levels of repo (repository): make many small commits to a local repo, then push those commits up through one or more levels of published repos.

For example:

local development repo -> local testing repo -> shared team repo -> public global repo.

It is often desirable to coalesce (combine) the many small commits in the development repo(s) into one or a small number of larger commits which correspond more directly to the implemented features.

This coalescing would typically occur as the commits are migrated to one of the more shared repos.

Most DVCSs have features that either address this need directly or can be combined to support it, albeit through a somewhat involved sequence of commands. For example, Git provides several features that support this, including “add -u”, “rebase --interactive”, and “merge --squash”. Mercurial supports a number of possible command sequences for “concatenating” commits, usually involving one or more Mercurial extensions.

Darcs currently has one feature that can be used for this, but it is quite limited: with “amend-record”, a new record (commit) action can be folded into an existing patch (commit) instead of creating a new one. In addition, Darcs supports a command sequence that would coalesce commits.

Current Solutions

General Solutions

The crux of the feature is how to define the bounds that specify which commits to coalesce.

There are two general solutions:

  1. associate them automatically using a pre-created containing structure such as a branch;

  2. associate them manually (usually interactively) after they have been created.

Git supports both approaches. “merge --squash” relies on the bounds of a discrete branch to identify the commits to coalesce, and can in addition retain both the original commits and their coalesced counterpart. Alternatively, “rebase --interactive” lets the user interactively identify the existing commits to coalesce, but in the process it loses the original small commits and leaves room for unintentional errors, including lost commits.

Mercurial has a number of approaches, but most rely on the user interactively identifying those existing commits to coalesce, and most rely on the user correctly managing a sequence of interrelated commands.

Suggested New Solution in Darcs

Darcs already has most or all of the base features to build a good and consistent implementation of this. The main new feature needed is the logic to actually coalesce patches as they are pushed to another repo.

E.g.: “push --coalesce”.

The “--coalesce” option would cause push to create a single new patch in the target repo. This coalesced patch would contain the result of combining all the original patches (commits), and would also explicitly reference those original patches.
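To make the idea concrete, here is a minimal sketch in Python of what such a coalesced patch might record. It is a toy model only, not Darcs's real patch representation: the names Patch, CoalescedPatch and coalesce are hypothetical, and a patch is simplified to "set this file to this content", so combining patches just means keeping the last content written to each file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Toy patch (hypothetical): an id plus (path, new_content) pairs."""
    patch_id: str
    changes: tuple


@dataclass(frozen=True)
class CoalescedPatch:
    """Hypothetical coalesced patch: net changes plus references to originals."""
    patch_id: str
    originals: tuple  # ids of the small patches this patch stands in for
    changes: tuple    # net effect of applying the originals in order


def coalesce(patch_id, patches):
    """Combine patches (in order) into one patch that references them."""
    net = {}
    for patch in patches:
        for path, content in patch.changes:
            # Later patches override earlier ones in this simplified model.
            net[path] = content
    return CoalescedPatch(
        patch_id=patch_id,
        originals=tuple(p.patch_id for p in patches),
        changes=tuple(sorted(net.items())),
    )
```

In this model the target repo would store only the value returned by coalesce; the list of original ids is kept purely as a reference.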

I had considered that this could be engineered using the existing patch-dependency features (the single coalesced patch depending on the multiple original patches), but I suspect that it would be cleaner to create a new patch-type: “coalesced patch”.

I would recommend that the original patches not be copied to the target repo. The references to them in the coalesced patch would be used primarily to prevent subsequent push or pull operations from duplicating the patches in any repo, whether by pulling or pushing the original patches into a repository that contains the coalesced patch, or by pulling or pushing the coalesced patch into a repository that contains any of the original patches.
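The duplicate check described above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: a repo is modelled as a plain list of patches, a coalesced patch is an ordinary patch with a non-empty set of original ids, and the names Patch, covered_ids and would_duplicate are hypothetical rather than part of Darcs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Toy patch (hypothetical); 'originals' is non-empty only when coalesced."""
    patch_id: str
    originals: frozenset = frozenset()


def covered_ids(repo):
    """Ids whose changes the repo already holds, directly or via a coalesced patch."""
    ids = set()
    for patch in repo:
        ids.add(patch.patch_id)
        ids |= patch.originals
    return ids


def would_duplicate(patch, repo):
    """True if pushing or pulling 'patch' into 'repo' would repeat changes."""
    covered = covered_ids(repo)
    # Case 1: an original patch arrives where its coalesced patch already is
    # (or the very same patch is already present).
    if patch.patch_id in covered:
        return True
    # Case 2: a coalesced patch arrives where any of its originals already is.
    return bool(patch.originals & covered)
```

A push or pull would call would_duplicate for each candidate patch and skip or refuse those for which it returns True; which of the two is the better behaviour is left open here.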

Using this new feature, there would now be at least three simple and robust workflows in Darcs for coalescing patches:

Conclusion

By building on the existing features and processing “style” of Darcs, a solution could be engineered that is simultaneously better, simpler and more robust than those of the other popular DVCSs considered (Git and Mercurial).