Tags with Mercurial and Forests
June 6, 2008
When you keep a project in Mercurial and use the Forest extension (like I do), you will most likely want to tag the forest occasionally, be it before a release or (as in my case) when an autobuilder succeeds and wants to mark a specific version as OK. With a forest this is a problem, because there is not a single version to tag but several: each tree needs to be tagged. My specific requirements for tagging are these:
- A tag must identify a specific state of the source tree (forest).
- It must be reliable. I don’t want to recover from arbitrary errors.
- Setting and removing a tag must be atomic. No other operation should interfere with it.
- It must be possible for multiple processes to set and remove tags on a repository.
- The tags should be available through the VCS (I could simply store the stuff in a database myself, but that would partly defeat the purpose).
Here are my solutions, in the order that I tried them:
Using hg tag
This is the first thing that comes to mind, simply because there is such a command. Frankly, in hindsight, this seems to be the worst solution to the problem, at least with forests. I tried to write scripts that walk the forest and set tags in all the repositories. I even started an HG extension for that. Not only is this very complicated, it also turns out to be almost impossible to push the tags back to a master repository reliably, especially when you want to do this automatically. Also, this stores the tag in each of the repositories and introduces a changeset in every one of them, cluttering all the logs. Ugly. Stay away from that.
The next solution is based on hg fsnap. This is an interesting command. With it you can create a snapshot of a forest, and later you can use fseed or fclone to re-create the forest in exactly that state. That seemed like a reasonable candidate to implement tagging with, so I did. My script created snapshots of a specific state, put them in their own repository inside the forest, and pushed them to the master. Since the tags live in their own repository, it is much easier to push them back. However, it is still not 100% fault-proof, especially when several autobuild processes plus developers are involved. Much better than the first solution, but still not recommended.
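For illustration, here is a rough sketch of what such a snapshot script could look like. It is not my original script: the names snapshot_tag and run_hg and the ".snap" file suffix are made up for this example, and it assumes that hg fsnap prints the snapshot to standard output. It also shows where the fragility comes from, since the final push fails whenever someone else pushed to the tags repository first.

```python
import subprocess
from pathlib import Path


def run_hg(args, cwd):
    """Run an hg command in `cwd` and return what it printed."""
    result = subprocess.run(["hg", *args], cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout


def snapshot_tag(forest: Path, tags_repo: Path, name: str, hg=run_hg) -> Path:
    """Record the forest's current state as `<name>.snap` in `tags_repo`."""
    snap_file = tags_repo / f"{name}.snap"
    # Assumption: `hg fsnap` writes the snapshot to standard output.
    snap_file.write_text(hg(["fsnap"], forest))
    hg(["add", snap_file.name], tags_repo)
    hg(["commit", "-m", f"Snapshot tag {name}", snap_file.name], tags_repo)
    # This push is the weak spot: it is rejected if another process
    # pushed a new head to the master in the meantime.
    hg(["push"], tags_repo)
    return snap_file
```

The hg parameter only exists so that the command runner can be swapped out, for example to try the function without a real forest.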
Funnily, the third idea comes from the Subversion development model. Instead of using hg tag or hg fsnap, it is very straightforward to simply create a clone as a tag, just like you do with branching. (Yes, I consider in-tree branches somewhat broken, or at least confusing.) Cloning inside the same filesystem has almost zero overhead, because HG uses hardlinks internally. This makes it easy to set up a structure similar to the one that is common in Subversion:
trunk/
branches/
tags/
Trunk is where all the main development goes. Branches holds all the clones that are considered branches, be they separate release branches or some kind of development branch. You can do just the same with tags: simply clone the forest from trunk (or some branch) into the tags directory and don’t touch it afterwards. This is very straightforward to implement and is the only solution (AFAICS) that fulfills all the requirements above. I’d even go so far as to recommend this over hg tag for non-forest repositories too. I wish I had had this idea earlier. This is why having all kinds of commands in the core of HG, named like their counterparts in other RCSs, is probably not a good idea: it misleads you into thinking that (for example) hg tag is just the same as cvs tag. It cost me a lot of time.
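To make the clone-as-tag idea concrete, here is a minimal sketch in Python. The function names create_tag and hg_fclone and the clone parameter are invented for this example. It assumes that the tags directory sits on a single filesystem, so that the final rename happens in one step, and it clones into a hidden scratch directory first so that a half-finished clone never shows up under the tag's name.

```python
import os
import shutil
import subprocess
import tempfile
from pathlib import Path


def hg_fclone(source: Path, dest: Path) -> None:
    """Clone a whole forest with the Forest extension's `hg fclone`."""
    subprocess.run(["hg", "fclone", str(source), str(dest)], check=True)


def create_tag(source: Path, tags_dir: Path, name: str, clone=hg_fclone) -> Path:
    """Clone `source` to `tags_dir/name`; the tag appears in a single rename."""
    tags_dir.mkdir(parents=True, exist_ok=True)
    final = tags_dir / name
    if final.exists():
        raise FileExistsError(f"tag already exists: {final}")
    # Work in a scratch directory next to the final location, so the
    # rename below stays on one filesystem.
    scratch = Path(tempfile.mkdtemp(prefix=f".{name}.", dir=tags_dir))
    staged = scratch / name
    try:
        clone(source, staged)
        # If another process created the same tag in the meantime, the
        # target is a non-empty directory and this rename raises OSError.
        os.rename(staged, final)
    finally:
        # Remove the scratch directory (and a failed clone, if any).
        shutil.rmtree(scratch, ignore_errors=True)
    return final
```

Something like create_tag(Path("trunk"), Path("tags"), "build-ok-1") would then be all an autobuilder has to call. Removing a tag could work the same way in reverse: rename the clone out of the tags directory first, then delete it.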