Using Monotone for Pidgin
Monotone is a distributed version control system (DVCS), and as such has some user-visible differences compared to, say, CVS or SVN. In addition, each of the existing DVCS solutions seems to have its own idiosyncrasies, and monotone is no exception. For these reasons, we'll try to grow some monotone howtos and best practices on this page.
External Documentation
- The official manual
- The monotone wiki has a lot of good information, as well
Best Practices
We are currently drafting a best practices page at MonotoneBestPractices, which will at some point be merged into this documentation.
Getting Started with Pidgin monotone
There is a monotone server running on pidgin.im. Its key fingerprint is 42a055d77e641de411e118d121cfc598ec0e7725.
The initial revision history retrieval is quite taxing on the server. In order to make the initial pull more efficient, we provide a snapshot of the mtn database to bootstrap the process. This database requires monotone 0.33 or newer. (Note that if you use monotone 0.34 or higher, you will need to migrate the database after extracting it.)
To fetch the revision history from this server, and check out a working copy, do:
$ DATABASE=/home/user/monotone_databases/pidgin.mtn
$ WORKINGDIR=/home/user/code/pidgin-mtn
Download the bootstrap database from http://developer.pidgin.im/static/pidgin.mtn.bz2
Extract the bootstrap database and move it to $DATABASE
$ mtn -d $DATABASE pull pidgin.im "im.pidgin.*"
$ mtn -d $DATABASE co -b im.pidgin.pidgin $WORKINGDIR
(The variables here are just to help you understand which parts of the process are up to your personal choice.)
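The download, extract, and migrate steps can also be scripted. The following is a minimal sketch, not an official procedure: it assumes curl and bunzip2 are installed, and the function name bootstrap_pidgin_db is made up for illustration.

```shell
# Hypothetical helper: fetch the bootstrap snapshot and place it at $1.
# Assumes curl and bunzip2 are available; adjust to taste.
bootstrap_pidgin_db() (
    database=$1
    mkdir -p "$(dirname "$database")" || exit 1
    # Download the compressed snapshot next to the target path.
    curl -o "$database.bz2" http://developer.pidgin.im/static/pidgin.mtn.bz2 || exit 1
    # bunzip2 strips the .bz2 suffix, leaving the database at $database.
    bunzip2 "$database.bz2" || exit 1
    # Needed with monotone 0.34 or newer; skip this step on 0.33.
    mtn -d "$database" db migrate
)

# Usage: bootstrap_pidgin_db /home/user/monotone_databases/pidgin.mtn
```

The function body runs in a subshell, so a failed step stops the sequence without affecting the calling shell.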
This will create a database for storing your development work, fetch the entire revision history available from pidgin.im into that database, and then check out a working copy of the newest revision of Pidgin in that database. To update the database from the server in the future, you can either go to $WORKINGDIR and execute mtn pull, or execute mtn -d $DATABASE pull from anywhere. Note that this will pull the new revision history from the server, but will not update your working directory to reflect the newest available revision. For this, you need to run mtn up in $WORKINGDIR.
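Since pulling and updating are separate steps, it can be handy to wrap them together. A small sketch, assuming the database already remembers the pidgin.im server from the first pull; the name pidgin_sync is hypothetical:

```shell
# Hypothetical helper: refresh the database, then the working copy.
# $1 = path to the database, $2 = path to the working copy.
pidgin_sync() (
    database=$1
    workingdir=$2
    # Fetch new revisions from the server used for the first pull.
    mtn -d "$database" pull || exit 1
    # Pulling does not touch the workspace; update it explicitly.
    cd "$workingdir" && mtn update
)

# Usage: pidgin_sync "$DATABASE" "$WORKINGDIR"
```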
Consult the monotone documentation, and particularly its CVS phrasebook, to see what you can now do with your database and working copy. You should find that most of the actions at this point feel pretty familiar.
Keys and Key Management
Monotone uses asymmetric keypairs for various trust and identification tasks. Every certificate created by a developer is signed with a keypair unique to that developer. In practice, this means that every commit to a branch is signed by the developer who made the commit (because the piece of data tying a particular revision to a branch is a certificate).
Therefore, in order to commit revisions, push a revision to most netsync servers, create a certificate, or perform a number of other activities, you will need a monotone keypair. To generate a keypair, use mtn genkey $KEYID. Key IDs are normally email addresses, and at this point there is no way to use two keys with the same key ID on the same project (keys are addressed by their ID, not by their fingerprint). For experimenting, you might want to generate a throwaway key ID just in case; I recommend that developers' normal pidgin keys be of the form username@pidgin.im. There is nothing which says this must be the case, however, and there is certainly something to be said for using a different key for each physical workstation or administrative domain that one uses.
Once you have created a key, you can generate a public key which can be shared with other developers (for the purpose of establishing trust, giving netsync permission, etc.) with the command mtn pubkey $KEYID. The output of this command can be imported into a monotone database with mtn read < $FILE, and then synced to a remote server (even if it has not been used to sign any certificates) with mtn push --key-to-push $KEYID. Note that if a key has been used to sign certificates which are communicated in a netsync transaction, it will be automatically synced along with the revisions; this means that if third-party developers use monotone (which we should encourage!) and we retrieve changes from them via mtn pull, their keys will be automatically installed in the pidgin.im repository at the next developer push of those revisions.
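The key commands above fit together roughly as follows. This is a sketch only: the function names and the file and database arguments are placeholders, and the second helper assumes the database already has a default server to push to.

```shell
# Hypothetical helper: create a key and export its public half to a file.
# $1 = key ID (for example username@pidgin.im), $2 = output file.
export_new_key() {
    mtn genkey "$1" || return 1      # prompts for a passphrase
    mtn pubkey "$1" > "$2"
}

# Hypothetical helper: import a public key file into a database, then
# send that key to the database's default server.
# $1 = database, $2 = public key file, $3 = key ID.
publish_key() {
    mtn -d "$1" read < "$2" || return 1
    mtn -d "$1" push --key-to-push "$3"
}
```

Typically the developer who owns the key would run the first helper and mail the resulting file, and someone with push access would run the second.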
Branching im.pidgin.pidgin
There are two kinds of branches in monotone, which I will call macro- and micro-branches. We will deal with each in turn.
macro-branches
A macro-branch is a set of monotone revisions which have a particular certificate associated with them, identifying them as belonging to the same branch. In our case, the "main" branch of development is im.pidgin.pidgin. All revisions in the monotone database which carry a cert of type branch with the value im.pidgin.pidgin are on this branch. Note that, technically, revisions on such a branch don't have to have any relation to one another -- however, it probably makes sense that they are all descended from some ultimate ancestor revision, and that they are logically related in some fashion. In the case of im.pidgin.pidgin, they form a (presently) linear history taken from the Gaim svn repository.
Branch certificates are a little bit "magic", in that monotone knows about them and changes its behavior based on them. For example, a committed revision will inherit the branch certificate of its parent. An update on a workspace will update to the "newest" (DAG-wise; more on this later) revision bearing the same branch tag. The set of revisions to synchronize via netsync is chosen by a branch specification pattern.
Creating a new branch
Creating a new branch is as easy as committing a revision with a new branch name, or adding a new branch certificate to an existing revision.
To create a new branch from a set of changes in your workspace (that is, to commit a revision with a new branch name), supply the -b or --branch argument to mtn commit. In other words:
mtn ci -b <new-branch-name>
Creating a new branch from an existing revision is accomplished with one of the commands:
mtn approve -b <new-branch-name> <revision>
mtn cert <revision> branch <new-branch-name>
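As a concrete illustration of the workspace case, the sketch below commits pending changes onto a new branch and then lists the branches to confirm it exists. The function name and the example branch name are invented for this page; run it from inside a workspace.

```shell
# Hypothetical helper: commit the current workspace onto a new branch,
# then list all branches so the new one can be checked.
# $1 = new branch name, $2 = commit message.
commit_to_new_branch() {
    mtn commit -b "$1" -m "$2" || return 1
    mtn ls branches
}

# Usage (the branch name is made up):
#   commit_to_new_branch im.pidgin.example.feature "Start example feature"
```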
Branch naming
Branch names are not structured (that is to say, their structure is not enforced), but good practice suggests that related branches have similar names. The best current practice seems to be Java-style inverted-domain naming. We are using this practice, with all pidgin.im-related branches living under the im.pidgin namespace, with further hierarchy below this. For example, the 2007 Summer of Code projects are all beneath im.pidgin.soc.2007.<projectname>. Naming like this tells us something about the branches immediately upon executing mtn ls branches.
Merging branches
Merging two branches is accomplished with the command mtn propagate <from-branch> <to-branch>. There are other ways to merge (e.g., approve or cert a revision onto the destination branch, and then handle as a micro-branch below), but this is the most straightforward and will normally serve your purposes. If required, mtn explicit_merge provides more control.
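A short sketch of a propagate, preceded by a pull so that both branches are current before merging. The function name is hypothetical, and it is meant to be run from inside a workspace so that mtn can find the database.

```shell
# Hypothetical helper: bring the local database up to date, then merge
# one branch into another. $1 = from-branch, $2 = to-branch.
propagate_after_pull() {
    mtn pull || return 1
    mtn propagate "$1" "$2"
}

# Usage (the source branch name is made up):
#   propagate_after_pull im.pidgin.example.feature im.pidgin.pidgin
```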
micro-branches
Every history in monotone is represented as a directed acyclic graph (DAG). This means that every revision checked into the database has an explicit list of parent revisions, which are fixed at the moment that the revision is committed and are thereafter immutable. The DAG structure is general; this means that a revision can have more than one parent (currently, I believe it is only possible to have zero, one, or two, due to the implementation of monotone), and more than one revision can have the same parent (that is to say, a revision can have more than one child). Because ancestors are immutable, and an ancestor must exist at creation time, a revision can never be its own ancestor -- thus the acyclic part.
Due to the distributed nature of monotone, a little bit of thought will lead to the conclusion that it is possible to have a DAG which has more than one "head" revision. Consider the case where two developers pull from the pidgin.im repository at the same time, and thus receive the same head; let us call it 0123abcd. Each developer goes on to make a change, and commits that change to their local database. Say, a1b2c3d4 for devA and 9876fedc for devB. The two developers then push their local changes to pidgin.im, and lo and behold, we have the graph:
             ,--a1b2c3d4
            /
... 0123abcd
            \
             '--9876fedc
We call a1b2c3d4 and 9876fedc the heads of the branch im.pidgin.pidgin, and the heads of a branch can be viewed with the command mtn heads. I call this divergence (within the same logical branch im.pidgin.pidgin) a micro-branch.
Such a micro-branch obviously cannot be resolved with mtn propagate, as both revisions are on the same logical branch. To resolve such a branch, the command mtn merge is used. Either devA or devB can merge these two revisions, say yielding a fourth revision deadbeef. The resulting graph then looks like:
             ,--a1b2c3d4,
            /            \
... 0123abcd              deadbeef
            \            /
             '--9876fedc'
Given that such structure exists and is possible, it can be exploited intentionally as well as created inadvertently. Monotone:DaggyFixes discusses just this, and its usage in identifying and fixing bugs. Additionally, this means that it is not necessary (as it often is with svn and CVS) to update your working directory before committing pending changes; simply commit them at the point where you created them, and then merge this commit with the branch as it currently stands.
Note that mtn update will not update a branch which has multiple heads; you will either have to explicitly mtn update -r <revision> to select a particular head (or other revision), or merge the divergent heads before updating.
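The check-then-merge routine can be sketched as a small helper. It relies on mtn automate heads printing one revision ID per line; the function name merge_if_diverged is made up, and it should be run inside a workspace that is on the branch in question.

```shell
# Hypothetical helper: merge a branch's heads if it has diverged, then
# update the workspace. $1 = branch name.
merge_if_diverged() {
    # Count the heads: "mtn automate heads" prints one ID per line.
    heads=$(mtn automate heads "$1" | grep -c .)
    if [ "$heads" -gt 1 ]; then
        echo "$1 has $heads heads; merging"
        mtn merge -b "$1" || return 1
    fi
    # With a single head, a plain update is unambiguous.
    mtn update
}

# Usage: merge_if_diverged im.pidgin.pidgin
```

If the merge hits conflicts, monotone asks you to resolve them before the helper reaches the update step.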
Branch Complexity
While all of this seems somewhat complex and difficult compared to the linear-history model of CVS or svn, it is really quite unavoidable in the context of a distributed VCS (as the above example shows). Different systems handle it differently (darcs in particular using quite a different model), but the problem will exist in any such system. Once you get your head wrapped around it, it's actually quite intuitive and powerful. For more information, see the Branching and Merging chapter of the monotone documentation.
Merging and Conflicts
As we mentioned above, merges are generally handled with mtn propagate and mtn merge, depending on whether the merge in question is of a macro- or micro-branch. An important, related question is, "What happens when there are conflicts?". The answer happens to be a bit suboptimal at the moment, but the monotone folks are working on that.
Content Conflicts
A content conflict occurs when two revisions have edited the same file in close enough proximity that their diffs interfere with each other. These cause <<<<< ===== >>>>> blocks in CVS and Subversion. In monotone, merges via the working directory are not yet supported; all conflicts must be resolved at merge time, and the resulting resolution will be directly committed to the revision database with no chance to edit anything else (for example, to resolve non-content-conflict logical consistency problems).

These content conflicts will be presented to you, one at a time, in a 3-way merge application. This means that you'll have to have such an application installed; I have found that xxdiff and meld are more intuitive than some of the other options which monotone understands (such as emacs and vim merge modes). These applications will show you the two sides of the merge conflict and the least common ancestor (the "closest" revision to the conflicting revisions, by revision hop count, computed via some clever algorithm) in separate panes, and you simply choose which stanzas you wish to include and which you wish to leave out, and make any necessary edits to make that happen. This is a little bit tricky, but once you've done it a couple of times it becomes easier.
Non-Content Conflicts
A non-content conflict is a conflict not between the contents of files in two revisions, but between the files themselves. This happens, for example, if you create a file named foo.c in two different revisions, and then try to merge those revisions. It can also happen if you have files that monotone doesn't know about (like .o files) scattered around your build tree, and you try to merge a revision which rearranges things. Monotone will resist removing these unknown files, which can cause conflicts if, say, the directory they are in needs to be removed or replaced.

The former type of non-content conflict can normally be resolved simply by removing or renaming the like-named files or directories in one of the revisions you wish to merge before merging. The latter can be a bit trickier to solve (particularly because the error messages are currently rather poor and unclear), but you might find mtn revert --missing, mtn rm --missing, and mtn ls unknown | xargs rm helpful. Note that all three of these remove data, and the mtn rm command actually changes what's in the repository. You probably want to look at the output of mtn ls missing and mtn ls unknown before executing any of these commands.
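One way to make that inspection a habit is a read-only preview helper like the sketch below. The function name is hypothetical; it only lists files and deletes nothing. Run it from inside the workspace.

```shell
# Hypothetical helper: show what the cleanup commands would affect.
preview_cleanup() {
    echo "== missing (tracked by monotone, but gone from disk) =="
    mtn ls missing
    echo "== unknown (on disk, but not tracked by monotone) =="
    mtn ls unknown
}

# Usage: preview_cleanup
```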
Important Practices
I recommend reading Monotone:DaggyFixes in particular. The other material at Monotone:BestPractices may be useful as well.
It will make your life easier (as it will with svn, or CVS, or any other VCS, for that matter) if you pull from the "official" repository and update your workspace before starting in on large changes, if possible. This will prevent gratuitous merge conflicts stemming from simply having an old version of the repository on your local machine. What is less important with monotone than with other VCSs is updating after making such a change and before committing. You can use micro-branches to your advantage, in this case; simply commit your workspace, then pull and merge if necessary, before pushing. Likewise, it is important that you pull before merging, to help prevent having to merge multiple times.
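The commit-first workflow described here might look like the following sketch. The function name is invented, and it assumes that mtn merge does nothing harmful when the branch has only one head; run it inside the workspace.

```shell
# Hypothetical helper: commit first, then reconcile with the server.
# $1 = commit message.
commit_and_publish() {
    mtn commit -m "$1" || return 1   # commit against your current base
    mtn pull || return 1             # fetch what others have pushed
    mtn merge || return 1            # join heads if the branch diverged
    mtn update || return 1           # move the workspace to the merged head
    mtn push
}

# Usage: commit_and_publish "Fix buddy list redraw"
```

Pulling before merging keeps the merge to a single pass, which is the point of the ordering.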
Comments
Please leave any comments you have about monotone usage that you think should be documented for our use, or should become part of our own best practices, here.
lschiere: I like the idea of fixing a bug at the original introduction point and then merging. That is likely more work than is worth-while for some bugs, but other bugs we have identified the bad revision as part of figuring out what is causing the bug. Along the same lines, the pluck command would be useful for backporting fixes when we are unable to fix the bug in the "daggy" manner.