One useful workflow with a DVCS is to keep multiple levels of repo (repository): make many small commits to a local repo, then push those commits up through one or more levels of published repos, for example:
local development repo -> local testing repo -> shared team repo -> public global repo.
It is often desirable to coalesce (combine) the many small commits in the development repo(s) into one or a small number of larger commits which correspond more directly to the implemented features.
This coalescing would typically occur as the commits are migrated to one of the more shared repos.
Most DVCSs have features that either directly address this need, or that can be used to support this, albeit with a somewhat involved sequence of commands. For example, Git provides various features that support this ability including: “add -u”, “rebase --interactive”, and “merge --squash”; and Mercurial supports a number of possible command sequences for “concatenating” commits, usually involving one or more Mercurial extensions.
Darcs currently has one feature that can be used for this, but it is quite limited: with “amend-record”, a new record (commit) action can be folded into an existing patch (commit). In addition, Darcs supports a command sequence that would coalesce commits.
Git’s “merge --squash” coalesces (squashes) all the commits on a branch as part of merging that branch into another branch.
Given that Git branches are relatively lightweight, and the way they can be used to isolate distinct development activity, this is a good solution.
does not necessarily lose historical information
whatever history duplication occurs is controlled
single simple command
relies on all commits being on (the same) distinct branch
branch must have been set up in advance
branch must not have been “polluted” with other commits
Git’s “rebase --interactive” can be used to coalesce (squash) selected commits from anywhere to anywhere else.
Given that rebasing is not an unusual activity in Git, this is not a bad solution.
single simple command
interactive to allow flexibility
it rewrites history, so the relationship between the original multiple commits and the resulting coalesced commits is not recorded in the rewritten history;
user error can result in commits being lost.
Use regular Mercurial commands to create a temporary branch that is missing all the commits to be coalesced, and has all those changes in the working set. A commit at this point will create a coalesced commit.
uses standard commands
does not rewrite history (caveat: see Cons below)
involves a sequence of commands – can result in errors;
if applied to the working repo, then history is destroyed;
if applied to a temporary repo, then history is increased;
if the original commits are pushed from the local repo by mistake, then there are multiple commits in the receiving repo for the same changes;
Use of Mercurial Queues
The queues can be used to perform a myriad of reordering (and concatenation) tasks. The normal workflow using queues could replicate Git’s “rebase --interactive”.
Use regular Darcs commands (mainly unrecord) to create a temporary branch that is missing all the patches (commits) to be coalesced, and has all the corresponding changes in the working files. A record at this point will create a coalesced patch (commit) that can be pushed to a more global repo.
Pros and Cons are the same as those for the same operation under Mercurial (or Git).
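The unrecord-then-record sequence can be pictured with a toy model. This is only a sketch under simplifying assumptions, not how Darcs or Mercurial store anything: a patch is reduced to a dict of file name to new contents, deletions are ignored, and `state_of`, `unrecord` and `record` are hypothetical helper names.

```python
def state_of(patches):
    """Replay the patches in order to get the tree they describe."""
    tree = {}
    for patch in patches:
        tree.update(patch)
    return tree


def unrecord(patches, count):
    """Drop the last `count` patches from history; working files are untouched."""
    return list(patches[:len(patches) - count])


def record(patches, working):
    """Record one new patch holding every difference between history and working files."""
    base = state_of(patches)
    diff = {name: text for name, text in working.items() if base.get(name) != text}
    return patches + [diff]


# Four small patches in the local repo; the working files match the latest state.
history = [{"a.txt": "1"}, {"a.txt": "2"}, {"b.txt": "x"}, {"a.txt": "3"}]
working = state_of(history)

# Remove the last three patches from history, then record once.
trimmed = unrecord(history, 3)
coalesced = record(trimmed, working)
```

After this, `coalesced` holds the untouched first patch plus one patch carrying the combined effect of the three that were unrecorded. The model also shows the main con: nothing in `coalesced` records that the three small patches ever existed.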
With “amend-record”, a subsequent record in the local repo is added (coalesced immediately) to an existing patch (commit).
does not rewrite history
does not duplicate history
no history of the smaller individual records (commits) is kept
The crux of the feature is: how are the bounds defined that specify which commits to coalesce?
There are two general solutions:
associate them automatically using a pre-created containing structure such as a branch;
associate them manually (usually interactively) after they have been created.
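The two ways of defining the bounds can be illustrated with a small model. It is a sketch only: `Commit`, `select_by_branch`, `select_by_ids` and `coalesce` are hypothetical names, and a commit is simplified to a set of (file name, new contents) pairs rather than real diffs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Toy commit: an id, the branch it was made on, and the files it sets."""
    cid: str
    branch: str
    changes: tuple  # tuple of (filename, new_contents) pairs


def select_by_branch(history, branch):
    """Automatic bounds: every commit that was made on the given branch."""
    return [c for c in history if c.branch == branch]


def select_by_ids(history, wanted):
    """Manual bounds: the user names the commits after they were created."""
    wanted = set(wanted)
    return [c for c in history if c.cid in wanted]


def coalesce(commits, new_id):
    """Fold the selected commits, in order, into one commit (later changes win)."""
    merged = {}
    for commit in commits:
        merged.update(dict(commit.changes))
    return Commit(new_id, "coalesced", tuple(sorted(merged.items())))


history = [
    Commit("a1", "feature", (("parser.py", "v1"),)),
    Commit("b1", "main", (("README", "typo fix"),)),
    Commit("a2", "feature", (("parser.py", "v2"), ("tests.py", "v1"))),
]

by_branch = coalesce(select_by_branch(history, "feature"), "F")
by_hand = coalesce(select_by_ids(history, ["a1", "a2"]), "F")
```

Both selections produce the same coalesced commit here. The difference is where the risk sits: the branch-based selection depends on the branch having been set up and kept clean, while the manual selection depends on the user naming exactly the right commits.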
Git supports both approaches: “merge --squash” relies on the bounds of a discrete branch to identify the commits to coalesce, and in addition can retain both the original commits and their coalesced counterpart. Alternatively, “rebase --interactive” allows the user to interactively identify those existing commits to coalesce, but in the process loses the original small commits and can allow unintentional errors including losing commits.
Mercurial has a number of approaches, but most rely on the user interactively identifying those existing commits to coalesce, and most rely on the user correctly managing a sequence of interrelated commands.
Darcs already has most or all of the base features to build a good and consistent implementation of this. The main new feature needed is the logic to actually coalesce patches as they are pushed to another repo.
For example: “push --coalesce”.
The “--coalesce” option would cause push to create a single, new coalesced patch in the target repo. That patch would contain the result of combining all the original patches (commits), and would explicitly reference each of them.
I had considered that this could be engineered using the existing patch-dependency features (the single coalesced patch depending on the multiple original patches), but I suspect that it would be cleaner to create a new patch-type: “coalesced patch”.
I would recommend that the original patches are not copied to the target repo. The references to them in the coalesced patch would primarily be used to prevent subsequent push or pull operations from duplicating the patches in any repo: either by pulling or pushing the original patches into a repository containing the coalesced patch, or by pulling or pushing the coalesced patch into a repository containing any of the original patches.
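A rough model of the duplicate check may make the proposal more concrete. Everything here is hypothetical, since “push --coalesce” does not exist: `Patch.replaces`, `Repo.receive`, `push_coalesced` and `DuplicateChange` are invented for illustration, and rejecting a duplicate with an error is just one possible policy (silently skipping it would be another).

```python
from dataclasses import dataclass, field


class DuplicateChange(Exception):
    """Raised when a transfer would put the same change into a repo twice."""


@dataclass(frozen=True)
class Patch:
    pid: str
    # Ids of the original patches this patch stands in for (empty for ordinary patches).
    replaces: frozenset = frozenset()


@dataclass
class Repo:
    patches: dict = field(default_factory=dict)  # pid -> Patch

    def represented(self):
        """Every patch id present here, directly or via a coalesced patch."""
        ids = set(self.patches)
        for patch in self.patches.values():
            ids |= patch.replaces
        return ids

    def receive(self, patch):
        """Accept a pushed or pulled patch unless its changes are already here."""
        if patch.pid in self.patches:
            return  # the very same patch again: nothing to do
        clash = ({patch.pid} | patch.replaces) & self.represented()
        if clash:
            raise DuplicateChange(f"{patch.pid} duplicates {sorted(clash)}")
        self.patches[patch.pid] = patch


def push_coalesced(source, target, pids, new_id):
    """Send one patch that references the originals; the originals stay behind."""
    originals = frozenset(pids)
    missing = originals - set(source.patches)
    if missing:
        raise KeyError(f"not in source repo: {sorted(missing)}")
    target.receive(Patch(new_id, originals))


local = Repo()
for pid in ("p1", "p2", "p3"):
    local.receive(Patch(pid))

shared = Repo()
push_coalesced(local, shared, ["p1", "p2", "p3"], "feature-x")
```

In this model, a later `shared.receive(Patch("p2"))` is refused because `p2` is already represented by `feature-x`, and pulling `feature-x` back into `local` is refused because `local` still holds the originals. Those are the two directions of duplication described above.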
Using this new feature, there would now be at least three simple and robust workflows in Darcs for coalescing patches:
The user creates a new branch for each separate task. The patches from a branch can then be coalesced as they are pushed to a higher-level repo.
history is not rewritten
history is retained: the original repo has the entire (fine-grained) history, while the target of the coalesced push contains the simplified (coarse-grained) history.
the discrete branch must have been created in advance
it is difficult to push the discrete patches from this repo, and then coalesce them on a subsequent push to yet another repo. This is because once the patches have been pushed from the branch into a different repo, they are difficult to distinguish from other patches in that repo.
The user creates a “spontaneous” branch for a task, and then selects the patches in that spontaneous branch to be coalesced during a push into a higher level repo.
history is not rewritten
history is retained (as per discrete branches)
the spontaneous branch can be pushed without being coalesced into a higher level repo, from whence it can be coalesced to further repos.
a spontaneous branch must normally be created at the time the first (or at latest the second) patch of the spontaneous branch is created.
Retro-actively creating a spontaneous branch requires amending the member patches, which can create conflicts if those patches have been pushed/pulled to any other repo.
Using the “--ask-deps” option of “record”, the user can create a new “group-patch” that depends on any arbitrary set of existing patches. Pushing this group-patch with the --coalesce option would result in a single coalesced patch being pushed that referred back to this group-patch, which in turn depends on (refers to) the specified patches.
history is not rewritten
history is not lost
the group-patch represents the knowledge that these patches are related
the group-patch can be pushed without coalescing into a higher-level repo, from whence it can be further pushed with coalescing.
There is no way that creating a group-patch can result in commits (patches) being lost (cf: Git).
the user must correctly identify the patches to include in the coalesce operation. This is no different to the other interactive options of Git and Mercurial.
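The group-patch idea can also be sketched as a toy model. The names are assumptions made for illustration (`Patch.depends`, `Patch.coalesces`, `record_group`, `coalesced_for_push`), and the repo is simplified to a dict of patch id to patch; this is not Darcs's actual patch format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    pid: str
    depends: tuple = ()  # explicit dependencies chosen when the patch is recorded
    coalesces: str = ""  # hypothetical: id of the group-patch this patch summarises


def record_group(repo, group_id, member_ids):
    """Add an otherwise empty patch that depends on the chosen patches."""
    unknown = [m for m in member_ids if m not in repo]
    if unknown:
        raise KeyError(f"unknown patches: {unknown}")
    updated = dict(repo)  # existing patches are carried over unchanged
    updated[group_id] = Patch(group_id, depends=tuple(member_ids))
    return updated


def coalesced_for_push(repo, group_id, new_id):
    """Build the single patch a hypothetical coalescing push would send."""
    group = repo[group_id]
    # The coalesced patch refers to the group-patch, which lists the originals.
    return Patch(new_id, coalesces=group.pid), list(group.depends)


repo = {"p1": Patch("p1"), "p2": Patch("p2"), "p3": Patch("p3")}
with_group = record_group(repo, "g-login", ["p1", "p3"])
summary, originals = coalesced_for_push(with_group, "g-login", "login-feature")
```

Recording the group-patch only adds an entry; every existing patch is still present and unmodified, which is the property that distinguishes this approach from an interactive rebase.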
By building on the existing features and processing “style” of Darcs, a solution could be engineered that is simultaneously better, simpler and more robust than those of the other popular DVCSs considered (Git and Mercurial).