Distributed Version Control Systems (DVCS)


Analysis of Git and Mercurial


Note: this analysis was done in summer 2008, when we first began scoping work for DVCS support in Google Code.


This document summarizes the initial research for adding distributed version control as an option for Google Code. Based on popularity, two distributed version control systems were considered: Git and Mercurial. This document describes the features of the two systems, and provides an overview of the work required to integrate them with Google Code.

Distributed Version Control

In traditional version control systems, there is a central repository that maintains all history. Clients must interact with this repository to examine file history, look at other branches, or commit changes. Typically, clients have a local copy of the versions of files they are working on, but no local storage of previous versions or alternate branches.

Distributed Version Control Systems (DVCS) use a different structure. With DVCS, every user has their own local repository, complete with project history, branches, etc. Switching to an alternate branch, examining file history, and even committing changes are all local operations. Individual repositories can then exchange information via push and pull operations. A push transfers some local information to a remote repository, and a pull copies remote information to the local repository. Note that neither repository is necessarily “authoritative” with respect to the other. Both repositories may have some local history that the other does not have yet. One key feature of any DVCS is to make it easy for repositories to unambiguously describe the history they have (and the history they are requesting). Both Git and Mercurial do this by using SHA-1 hashes to identify data (files, trees, changesets, etc.).
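The idea of describing history by hash can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how either tool stores data: the `Repo` class and its method names are hypothetical, and real Git and Mercurial hash more than the raw bytes. It only shows why two repositories can work out what the other is missing by comparing identifiers.

```python
import hashlib


def object_id(content: bytes) -> str:
    """Name a piece of data by the SHA-1 hash of its bytes (simplified)."""
    return hashlib.sha1(content).hexdigest()


class Repo:
    """Toy repository (hypothetical): a mapping from object id to content."""

    def __init__(self):
        self.objects = {}

    def add(self, content: bytes) -> str:
        oid = object_id(content)
        self.objects[oid] = content
        return oid

    def missing_from(self, other: "Repo") -> set:
        """Ids that `other` has and this repository lacks."""
        return set(other.objects) - set(self.objects)

    def pull(self, other: "Repo") -> None:
        """Copy only the objects this repository does not have yet."""
        for oid in self.missing_from(other):
            self.objects[oid] = other.objects[oid]


alice, bob = Repo(), Repo()
shared = alice.add(b"release 1.0\n")
bob.add(b"release 1.0\n")            # same bytes, so the same id
alice_only = alice.add(b"alice's feature\n")
bob_only = bob.add(b"bob's fix\n")

bob.pull(alice)                      # bob now also has alice's feature
```

After the pull, neither repository is a superset of the other by definition: `alice` still lacks `bob_only`, which matches the point that neither side is necessarily authoritative.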

DVCSs provide a lot of flexibility in developer workflows. They can be used in a manner similar to traditional VCSs, with a central “authoritative” repository with which each developer synchronizes. For larger projects, it is also possible to have a hierarchy of server repositories, with maintainers for each repository accepting changes from downstream developers and then forwarding them upstream. DVCSs also allow developers to share work with each other directly. For example, two developers working on a new feature could work on a common branch and share work with each other independent of an “authoritative” server. Once their work is stable, it can then be pushed to a public repository for a larger audience.

Because there is no central repository, the terms client and server don’t necessarily apply. When talking about two repositories, they are typically referred to as local and remote rather than client and server. However, in the context of implementing a DVCS for Google Code, the repository hosted at Google will be considered the server, and a user’s repository will be called the client.

Feature Comparison

There is actually a great deal of similarity between Git and Mercurial. Instead of providing a long list of features that are equivalently supported in both systems, this section attempts to highlight areas of significant difference between the systems.

Git Advantages

  • Client Storage Management. Both Mercurial and Git allow users to selectively pull branches from other repositories. This provides an upfront mechanism for narrowing the amount of history stored locally. In addition, Git allows previously pulled branches to be discarded. Git also allows old revision data to be pruned from the local repository (while still keeping recent revision data on those branches). With Mercurial, if a branch is in the local repository, then all of its revisions (back to the very initial commit) must also be present, and there is no way to prune branches other than by creating a new repository and selectively pulling branches into it. There has been some work addressing this in Mercurial, but nothing official yet.
  • Number of Parents. Git supports an unlimited number of parent revisions during a merge. Mercurial only allows two parents. In order to achieve an N-way merge in Mercurial, the user will have to perform N-1 two-way merges. Although in many cases this is also the preferred way to merge N parents regardless of DVCS, with Git the user can perform an N-way merge in one step if so desired.
  • Rebasing. Git has a rebase command which allows you to take a local branch and change its branch point to a more recent revision. For example, if one is working on a new feature for a product, a local branch may have been started from the 1.0 release. If while still working on the feature the main product line has been updated to 1.1, it may be desirable to pull in the 1.0->1.1 changes onto the local feature branch and treat it as a branch from 1.1 instead of 1.0. In most other systems, this is done by merging the 1.1 changes onto the branch. Merging is the right thing to do from an SCM perspective, where the focus is on ‘reproducibility of past states’. However, when the focus is ‘authoring a clean software revision history’, rebasing is sometimes a superior method. git-rebase allows you to make a previously non-linear history linear, keeping the history cleaner. To be fair, the same deltas are produced by simply merging every commit from the 1.0 version to the 1.1 version individually, and committing them individually. Rebasing is about safely doing this and then deleting the old 1.0 based versions so they don’t clutter the tree.

Note: Mercurial has added rebase support since this analysis was conducted.
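The rebasing description above (replaying the same deltas on a newer starting point) can be illustrated with a small Python sketch. It is a simplified model, assuming a commit is just a mapping of file changes and that no conflicts occur; the helper names `apply` and `replay` are hypothetical and are not part of Git or Mercurial.

```python
def apply(state: dict, delta: dict) -> dict:
    """Return a new path -> content mapping with `delta` applied.

    A value of None in the delta deletes the path.
    """
    new_state = dict(state)
    for path, content in delta.items():
        if content is None:
            new_state.pop(path, None)
        else:
            new_state[path] = content
    return new_state


def replay(base: dict, deltas: list) -> list:
    """Apply each delta in order on top of `base`; return every resulting state."""
    history = []
    state = base
    for delta in deltas:
        state = apply(state, delta)
        history.append(state)
    return history


release_1_0 = {"core.py": "v1.0", "README": "v1.0"}
release_1_1 = apply(release_1_0, {"core.py": "v1.1"})

# Two feature commits originally made on top of 1.0.
feature_deltas = [{"feature.py": "draft"}, {"feature.py": "done"}]

old_branch = replay(release_1_0, feature_deltas)   # history before the rebase
rebased = replay(release_1_1, feature_deltas)      # same deltas, new branch point
```

The rebased history is linear on top of 1.1 and contains the same feature changes; what the sketch leaves out is conflict handling and the deletion of the old 1.0-based revisions.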

Mercurial Advantages

  • Learning Curve. Git has a steeper learning curve than Mercurial due to a number of factors. Git has more commands and options, the volume of which can be intimidating to new users. Mercurial’s documentation tends to be more complete and easier for novices to read. Mercurial’s terminology and commands are also closer to Subversion and CVS, making it familiar to people migrating from those systems.
  • Windows Support. Git has a strong Linux heritage, and the official way to run it under Windows is to use cygwin, which is far from ideal from the perspective of a Windows user. A MinGW based port of Git is gaining popularity, but Windows still remains a “second class citizen” in the world of Git. Based on limited testing, the MinGW port appeared to be completely functional, but a little sluggish. Operations that normally felt instantaneous on Linux or Mac OS X took several tenths of a second on Windows. Mercurial is Python based, and the official distribution runs cleanly under Windows (as well as Linux, Mac OS X, etc).
  • Maintenance. Git requires periodic maintenance of repositories (i.e. git-gc); Mercurial does not require such maintenance. Note, however, that Mercurial is also a lot less sophisticated with respect to managing the client’s disk space (see Client Storage Management above).
  • History is Sacred. Git is extremely powerful, and will do almost anything you ask it to. Unfortunately, this also means that Git is perfectly happy to lose history. For example, git-push --force can result in revisions becoming lost in the remote repository. Mercurial’s repository is structured more as an ever-growing collection of immutable objects. Although in some cases (such as rebase), rewriting history can be an advantage, it can also be dangerous and lead to unexpected results. It should be noted, however, that a custom Git server could be written to disallow the loss of data, so this advantage is minimal.
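How a forced push can lose revisions is easier to see with a toy commit graph. The sketch below is an illustration under assumptions: revisions are single letters, `parents` and `refs` are hypothetical dictionaries, and "lost" simply means no longer reachable from any branch name. It does not model what either tool actually does with unreachable data afterwards.

```python
def reachable(parents: dict, heads) -> set:
    """Return every revision reachable from the given heads by following parents."""
    seen = set()
    stack = list(heads)
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        stack.extend(parents[rev])
    return seen


# a <- b <- c is the published history; x is a different child of a.
parents = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
refs = {"master": "c"}

before = reachable(parents, refs.values())

# A forced update moves the branch to x, which is not a descendant of c.
refs["master"] = "x"
after = reachable(parents, refs.values())

lost = before - after   # revisions no branch name leads to any more
```

A server that refuses any update where `lost` is non-empty would be one way to realize the "custom Git server" mentioned above; that policy is an assumption of this sketch, not something the analysis specifies.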

Other Differences

  • Rename/Copy Tracking. Git does not explicitly track file renaming or copying. Instead, commands such as git-log look for cases where an identical file appears earlier in the repository history and infer the rename or copy. Mercurial takes the more familiar approach of providing explicit rename and copy commands and storing this as part of the history of a file. Each approach has both advantages and disadvantages, and it isn’t clear that either approach is universally “better” than the other.
  • Architecture. Git was originally a large number of shell scripts and Unix commands implemented in C. Over time, a common library shared between commands has been developed, and many of the commands have been built into the main git executable. Mercurial is implemented mostly in Python (with a small amount of C), with an extension API that allows third parties to enhance Mercurial via custom Python modules.
  • Private History. In Git, the default mode of operation is for developers to have their own local (and private) tags/branches/revisions, and exercise a lot of control over what becomes public. With Mercurial the emphasis is the other way around – default push/pull behavior shares all information and extra steps need to be taken to share a subset. This is not listed as an advantage for either system, because both systems are generally capable of supporting either kind of operation.
  • Branch Namespace. In Git, each repository has its own branch namespace, and users set up a mapping between local branchnames and remote ones. With Mercurial, there is a single branch namespace shared by all repositories.
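The rename inference described under Rename/Copy Tracking can be sketched as follows. This is a minimal illustration that handles exact content matches only, as in the description above; the function `infer_renames` and the sample trees are hypothetical and do not reproduce Git's actual algorithm.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Identify file content by its SHA-1 hash."""
    return hashlib.sha1(content).hexdigest()


def infer_renames(old_tree: dict, new_tree: dict) -> dict:
    """Map old path -> new path when a removed file's exact content
    reappears under a path that did not exist before."""
    removed = {p: blob_id(c) for p, c in old_tree.items() if p not in new_tree}
    added = {p: blob_id(c) for p, c in new_tree.items() if p not in old_tree}

    # Index the added files by content id (first path wins, in sorted order).
    path_by_id = {}
    for path, oid in sorted(added.items()):
        path_by_id.setdefault(oid, path)

    renames = {}
    for path, oid in sorted(removed.items()):
        if oid in path_by_id:
            renames[path] = path_by_id[oid]
    return renames


old = {"util.py": b"def helper(): pass\n", "main.py": b"print('hi')\n"}
new = {
    "lib/util.py": b"def helper(): pass\n",   # same bytes under a new path
    "main.py": b"print('hi')\n",
    "NEWS": b"moved util\n",
}

renames = infer_renames(old, new)
```

Nothing about the rename is stored in either tree; it is recomputed from content whenever history is inspected, which is the trade-off against Mercurial's explicit rename records.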


Data Storage

Both Git and Mercurial internally work with very similar data: revisions of files along with a small amount of meta information (parents, author, etc). They both have objects that represent a project-wide commit, and these are also versioned. They both have objects that associate a commit with a set of file versions. In Git, this is a tree object (a tree structure with tree objects for directories and references to file revisions as the leaves). In Mercurial, there is a manifest (a flat list mapping pathnames to file revision objects). Aside from the manifest/tree difference, both are very similar in terms of how objects are searched and walked.
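The tree/manifest difference is mostly one of shape. The following sketch, with made-up revision identifiers and a hypothetical `flatten` helper, converts a nested Git-style tree into a flat Mercurial-style manifest to show that both carry the same path-to-revision information.

```python
def flatten(tree: dict, prefix: str = "") -> dict:
    """Turn a nested tree (directories are dicts, files are revision ids)
    into a flat mapping from full pathname to revision id."""
    manifest = {}
    for name, entry in sorted(tree.items()):
        path = prefix + name
        if isinstance(entry, dict):
            # A directory: recurse and prefix its entries with the directory name.
            manifest.update(flatten(entry, path + "/"))
        else:
            manifest[path] = entry
    return manifest


# Git-style: tree objects for directories, file revisions at the leaves.
tree = {
    "README": "rev-a1",
    "src": {
        "main.c": "rev-b2",
        "util": {"str.c": "rev-c3"},
    },
}

# Mercurial-style: one flat list of pathname -> file revision.
manifest = flatten(tree)
```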

Git uses a combination of storing objects directly in the file system (indexed by SHA1 hash) and packing multiple objects into larger compressed files, while Mercurial uses a revlog structure (basically a concatenation of revision diffs with periodic snapshots of a complete revision). In both cases, the native storage will not be used and the objects will be stored in Bigtable instead. Due to the similarity of the basic Git and Mercurial data objects, the effort to solve such problems should be the same regardless of which DVCS is being used.
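The revlog idea (diffs with periodic full snapshots) can be sketched as below. This is a toy under explicit assumptions: the fixed snapshot interval `SNAPSHOT_EVERY`, the `ToyRevlog` class, and the delta format built on Python's `difflib` are all inventions for illustration and say nothing about Mercurial's real on-disk format or snapshot policy.

```python
import difflib

SNAPSHOT_EVERY = 4  # assumption: store a full copy every 4th revision


def make_delta(old: list, new: list) -> list:
    """Describe `new` as copies of ranges of `old` plus inserted lines."""
    ops = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        else:
            # replace / insert / delete: keep only the new lines (may be empty)
            ops.append(("insert", new[j1:j2]))
    return ops


def apply_delta(old: list, ops: list) -> list:
    """Rebuild the newer revision from the older one and a delta."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class ToyRevlog:
    def __init__(self):
        self.entries = []  # ("snapshot", lines) or ("delta", ops)

    def append(self, lines: list) -> None:
        if len(self.entries) % SNAPSHOT_EVERY == 0:
            self.entries.append(("snapshot", list(lines)))
        else:
            previous = self.read(len(self.entries) - 1)
            self.entries.append(("delta", make_delta(previous, list(lines))))

    def read(self, rev: int) -> list:
        start = rev - rev % SNAPSHOT_EVERY      # nearest snapshot at or before rev
        lines = self.entries[start][1]
        for _kind, ops in self.entries[start + 1: rev + 1]:
            lines = apply_delta(lines, ops)
        return list(lines)


versions = [["a"], ["a", "b"], ["a", "b", "c"], ["a", "c"], ["a", "c", "d"], ["x", "c", "d"]]
log = ToyRevlog()
for version in versions:
    log.append(version)
```

Reading any revision needs at most `SNAPSHOT_EVERY - 1` delta applications, which is the reason for interleaving snapshots instead of storing one long chain of diffs.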

The only major difference for the data storage layer is the implementation language. If a significant amount of Git/Mercurial code is to be reused, then a Git implementation would be in C, and a Mercurial one would be in Python (or perhaps C++ with SWIG bindings).

Mercurial Integration

Mercurial has very good support for HTTP based stateless pushing and pulling of remote repositories. A reasonable amount of effort has been made to reduce the number of round trips between client and server in determining what data needs to be exchanged, and once this determination has been made all of the relevant information is bundled into a single large transfer. This is a good match for Google’s infrastructure, so no modifications will be required on the client side.

Git Integration

Git includes support for HTTP pulls (and WebDAV pushes), but the implementation assumes that the server knows nothing about Git. It is designed such that you can have an Apache server simply serve the Git repository as static content. This method requires numerous synchronous round trip requests, and is unsuitable for use in Google Code (1).

Git also has a custom stateful protocol that supports much faster exchanges of information, but this is not a good match for Google infrastructure. Specifically, it is very desirable to use a stateless HTTP protocol since there is already significant infrastructure in place to make such transactions reliable and performant.

Note: There has been some discussion about improving HTTP support in the Git community since this analysis was done.


In terms of implementation effort, Mercurial has a clear advantage due to its efficient HTTP transport protocol.

In terms of features, Git is more powerful, but this tends to be offset by it being more complicated to use.

1 As a benchmark, Git and Mercurial repositories were seeded with approximately 1500 files totaling 35 MB of data. The servers were running in Chicago and the clients in Mountain View (51 ms ping time). The operation of cloning the remote repository (similar to an initial checkout in traditional version control systems) averaged 8.1 seconds for Mercurial and 178 seconds for Git (22 times slower). A single file in the repository was then changed 50 times and the clients pulled the updates. In this case, Mercurial took 1.5 seconds and Git required 18 seconds (12 times slower). When the Git protocol was used instead of HTTP, Git’s performance was similar to Mercurial (8.7 seconds for cloning, 2.8 seconds for the pull).
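The ratios quoted in the benchmark can be checked with a few lines of arithmetic. The sketch below also derives a rough upper bound on sequential round trips; that bound is an assumption of this sketch (it treats the 51 ms ping as a full round trip and pretends all elapsed time was latency) and is not a figure from the measurements.

```python
RTT_SECONDS = 0.051  # the 51 ms ping time, assumed to be one full round trip

# Timings in seconds, as reported in the benchmark above.
timings = {
    "clone": {"mercurial_http": 8.1, "git_http": 178.0, "git_native": 8.7},
    "pull": {"mercurial_http": 1.5, "git_http": 18.0, "git_native": 2.8},
}


def slowdown(operation: str) -> float:
    """How many times slower Git over HTTP was than Mercurial over HTTP."""
    row = timings[operation]
    return row["git_http"] / row["mercurial_http"]


def max_sequential_round_trips(seconds: float) -> int:
    """Upper bound on strictly sequential round trips that fit in `seconds`,
    if every moment were spent waiting on the network."""
    return int(seconds / RTT_SECONDS)


clone_ratio = slowdown("clone")   # about 22
pull_ratio = slowdown("pull")     # 12
clone_round_trip_bound = max_sequential_round_trips(timings["clone"]["git_http"])
```

Under those assumptions the 178-second HTTP clone could contain at most a few thousand sequential requests, which gives a sense of scale for "numerous synchronous round trip requests" against roughly 1500 files.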


By 胡凱


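On the surface, the main cause of the speed bottleneck was that the version control server was not local; the underlying problem was that the version control tool could not scale well with the team's size and structure. For us, the ideal solution would be servers in both China and the United States to speed up every operation, with automatic synchronization between the servers to balance the two sites' needs for speed and code integration. However, using Subversion as the version control tool meant the server could be set up in only one place. SVK can solve part of the problem, but it has too many flaws and is very inconvenient to operate. The backup problem we faced arises because, in Subversion's design, all metadata is kept only on the server; once the server has an accident, the valuable information in that metadata cannot be recovered. Past lessons taught us that if we use Subversion, we cannot simply assume optimistically that the server will never fail; we must have a detailed, workable backup plan and mitigate the risk through continual backups.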



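The first thing Mercurial gave the team was speed. The reason is simple: because a DVCS working directory is no different from the central repository and likewise holds all the metadata, operations that Subversion must complete over the network (such as committing, tracing history, and updating) can be completed offline in Mercurial by operating on the local repository (Figure 1).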


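By reducing communication with the central repository, Mercurial sped up operations and reduced the impact of network conditions on the team, which suited our needs very well. This gain in speed and reliability makes for a very pleasant working experience for developers who deal with the version control tool all the time. In addition, a working directory that contains all the metadata can become a backup repository (Figure 2-c) when the central repository has a problem (Figure 2-b), and the whole process takes only a single command.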




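In our daily work we often use Mercurial's flexible branching and merging to share changes and collaborate. A few months ago, while releasing a product in India, I needed to set up a development environment on a new workstation. Because the code base was huge and the network slow, cloning the central repository would have taken hours (Figure 3-a). Mercurial's flexibility let me point the workstation at a laptop that already held the code and clone from there (Figure 3-b). The workstation finished the whole clone within minutes; I then pointed it at the central repository (Figure 3-c) and could commit and update code as usual, which saved a great deal of time.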


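In another scenario, my team uses Linux as its development environment. When we urgently need to verify that some feature works on Windows, we commit the code locally on the Linux workstation, then point the Windows workstation's working directory at the Linux workstation and fetch the update (Figure 4-b). We then verify the feature on Windows; if there is a problem, we can fix it and push the revision from the Windows workstation to the Linux workstation (Figure 4-c). Finally the Linux workstation runs the tests and synchronizes all the updates to the central repository (Figure 4-d). Mercurial's distributed nature lets the development team share revisions nimbly and develop more efficiently.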



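The local repository makes Mercurial friendlier to working in small steps. Working in small steps means that a developer changes a small amount of code at a time and commits it, without breaking any existing functionality. These two requirements often leave developers who use a centralized version control tool in a dilemma: "not breaking existing functionality" together with "changing a small amount of code at a time and committing" presupposes fine-grained requirements that are easy to analyze, and developers who have mastered incremental object modeling, refactoring, database design, migration, and similar techniques. Difficulty in working in small steps reflects gaps in team members' experience and skill, but those gaps cannot be closed overnight. The local repository gives developers more freedom: they can commit frequently without worrying whether every commit leaves existing functionality intact, and synchronize with the central repository once the code has reached a stable state after a number of commits. Using Mercurial allowed the practice of small steps to take hold in the team; once everyone has felt its benefits, the team can go on to pursue high-quality small steps.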




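There is no doubt that Subversion is an excellent version control tool, but it has its own range of application and is not a silver bullet. We abandoned Subversion not because we are zealots for new technology, but because it could not scale to fit changes in the team's structure. For teams that want to try a DVCS, I have a few suggestions. First, decision makers should identify the team's pain points and understand the problem domain clearly, rather than merely chasing a technology trend. Second, use it and gradually come to accept it; a team that stops at studying theory and debating options will not master a DVCS or use it to become more efficient. Finally, the whole team needs to keep learning the design ideas behind DVCS, its abstraction of the problem domain, and how to use its rich set of plugins. This knowledge will directly or indirectly help the team improve its ability to manage code versions and manage code more effectively.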


by 雲風

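The first SCM tool I used was vss, but within a few days (after changing jobs to NetEase) I moved to cvs. After I had used it for about a year, the whole company migrated from cvs to svn. The person who led this great migration said that svn is a better cvs. (Is that really so? I hear it is disputed, but at least I feel svn is more convenient than cvs for versioning multiple files, because cvs cannot guarantee that a commit of several files is atomic.)

A few years ago someone argued with me over whether the locking model of vss or the merging model of cvs is better. I think the answer is self-evident and was too lazy to argue. Although in certain special situations we occasionally need the locking model to cooperate, for collaboration among programmers we undoubtedly need to work in parallel.

Some years ago we retired the locking style of collaborative coding, and today it is time for another change. Perhaps distributed version control tools are the way forward. I think that one day centralized version control tools such as cvs and svn will be retired.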


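  1. We forbid committing code that does not compile, and try not to commit code that does not pass its tests. As a result, for very complex modules, some people went almost a month without a single commit. They always felt the program was not mature enough yet, but the code, revised many times over, was in fact never under version control.
  2. Some modules are written jointly by two people who work very closely together. Sometimes they need to exchange code, and for convenience they relay it through the code repository, which leaves many unfinished versions in the repository.
  3. Code gets taken home on a laptop, and the part finished at home has nowhere to be committed. (For security, our code repository cannot be reached from the external network.)
  4. Someone writes a module that still has unfixed bugs and does not dare commit it. Meanwhile, another person who wants to help track down the problem has no suitable way to share that half-finished module. Walk over and do some XP-style pairing? Good grief, why does everyone here use a different editor, each with a highly personal color scheme?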

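We tried some workarounds, such as creating a repository locally for oneself. Thanks to TortoiseSVN this is quite easy to do on Windows. But in the end it means managing multiple repositories; people found it a nuisance, and the practice did not stick.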


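Centralized version control tools always require a central server that provides a project repository. Everyone must stay strictly consistent with the repository's contents. Once a project has two or more people, some rules are usually laid down, such as not committing half-written code to the repository. Those rules lead to the problems I described above.

Even someone working alone sometimes runs into problems. Once I came home and saw my father (a veteran programmer) working on a small project of his own. We have two computers at home, and I saw several different versions on the two machines, so I recommended svn to him. Because neither machine is powered on 24 hours a day, I chose to create the repository on a USB drive. As you can imagine, the USB drive gets a different drive letter on each machine, which was a real disaster.

In most cases all we actually want is version management; we do not need such a tool to coordinate many people modifying the same code. At our company, modifying someone else's code requires notifying the file's creator, and everyone tries to write in their own working directory. I do not need a distributed version control tool to solve the problem of developers being spread across different places. What I need is simply an easier way to create private (or small-group) branches and to commit locally. A repository-merge feature is all that takes.

I am rather ill-informed; I first learned that distributed version control tools existed from the news of git's release. With Linus's great name attached, it ought to be a good thing. I expect there are others like me who shut out the world once they are inside a project; if you do not know what git is, you might take a look at the Git Chinese tutorial.

Unfortunately, git does not support Windows well. At least half of our projects still run on Windows, and more than half of our developers work on the Windows platform. I have heard the reason is that git relies heavily on certain file system characteristics that perform well on Linux and very badly on Windows. Whether that or something else is the actual reason, I have little confidence in promoting git at the company.

Another option is Monotone, but I hear it has the same problem as git (Windows support). After all, git itself was heavily influenced by monotone (my guess). Interestingly, like Git, Monotone is also written in C. Of course that sentence should really be turned around, because Monotone began in 2003, well before Git.

On the claim that Git and monotone do not support Windows well, see this article: Mozilla: Version Control System Shootout Redux Redux. The Mozilla folks' assessment: Git is inappropriate for cross-platform projects due to its UNIX-centric nature; same goes for Monotone.

Well, since Mercurial is (Mozilla's) current favorite (but not the winner), we can keep an eye on it too. By the way, Mercurial's command name is amusing: hg. It took me a few seconds to realize that Hg is mercury 😀.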

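Now let us look at a few more candidates. Bazaar's website has a comparison of it with several other tools. Although some say its performance is poor, I think that was said with a giant project like Mozilla in mind, and I do not expect it to matter for small things like ours. It looks very good in other respects. In particular, the smart rename it advertises is really lovely.

Renaming a directory under svn is an absolute disaster. If you carelessly fail to do the rename directly in the repository, it means a del / add of every file under the directory. And even if you do operate directly on the repository, every client will perform a large number of del / add operations. Every time this happens, my heart aches for my hard disk.

darcs looks good too. I have an inexplicable fondness for Haskell itself, and to me software written in Haskell implies stability. Although I do not play with Haskell much myself, nor use Python much, if I had to spend time picking a language to play with, I would try Haskell first.

As a long-time svn user, perhaps I should pay more attention to svk, which adds some distributed management on top of svn. But I do not much like this kind of patch-style solution, because designs always change along with requirements. Carrying too much historical baggage gives me an uneasy feeling.

Finally, there is GNU Arch. I browsed the WhyArch page of the arch wiki, and what attracted me were the last two items: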

  1. Arch is lightweight
  2. Arch has a clean and transparent design

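Judging from Google search results, though, I do not feel GNU Arch is a project with a future (compared with the ones above).

For a programmer like me who still spends part of his time lingering on in a Windows environment, there is good news: thanks to open source, the lovable little tortoise is everywhere.

  1. Tortoise edition of Mercurial: TortoiseHg
  2. Tortoise edition of Bazaar: TortoiseBZR
  3. Tortoise edition of Darcs: TortoiseDarcs

In my past experience, though, only TortoiseSVN is the genuine tortoise and the best to use. Do not expect too much from the feel of the other tortoises.

Addendum, January 23:

Many readers below have pointed out that sensible use of branches would solve most of the problems I ran into.

True, that is indeed the case. But with the svn we use now, opening a branch is a lot of trouble for various reasons. The trouble is not in the operation but in the administration. We have no dedicated repository administrator, and everyone is fairly loosely organized. Also, after a security incident, the company required strict control of read and write permissions on every branch of the code tree. The upshot is that opening a branch costs too much, and few people do it day to day.

The convenient repository merging that distributed version control tools provide, mentioned earlier, is really just branch merging. It is not that svn lacks it, but that it is inconvenient. This is just like an old problem in cvs: how to conveniently pin down the versions of a group of files. We can solve it with tags, but in the end that is not as convenient as svn, where every multi-file commit is a single atomic operation.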



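Although I have used traditional SCM software (software configuration management, also called RCS, revision control systems) for many years, from the garbage that is VSS to the powerful CVS and SVN, each for a while, as little as half a year (VSS) and as long as several years (SVN), I have only recently come into contact with distributed revision control systems (DRCS). I have found them to be very good indeed; compared with traditional SCM, they are a qualitative change.

DRCS is defined in contrast to traditional centralized SCM. With traditional SCM, the repository is concentrated in one single place, and nearly all operations, such as commit and update, require users to be able to connect directly to that repository. This causes some rather troublesome problems. In my own experience, the common annoyances are as follows (taking only CVS and SVN as examples; as for VSS, I will not waste time on rubbish):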

















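Bazaar's strengths are that it is powerful and very easy to install and use. Because it is written in Python, once Python is installed you just install Bazaar and it is ready to use. Note that its SFTP support uses two packages, pycrypto and paramiko, which must be installed separately. When I installed Bazaar, the official sites of both packages happened to be misbehaving (just my bad luck?), and in the end I had to find another mirror through Google to download them.

Bazaar's weakness is that it is too slow. Its slowness is not, as 雲風 says, a matter of large projects; rather, because it is a pure Python program, on Windows every run has to start the Python environment, so every command waits a moment before executing. For someone like me who checks with the status command all the time, or for people who develop in small XP-style iterations, this is hard to bear.

I hear there is also a TortoiseBZR, similar to TortoiseSVN. But judging from my experience with TSVN, Windows Explorer is already quite unstable, and adding this on top makes it unbearably so. So I do without Tortoise; besides, once you are used to it, the command line is often more convenient than Tortoise.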



bzr push s


bzr branch s

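In my trial, however, I hit a problem. Pushing over SFTP works fine, but a checkout or branch with commands like the one above downloads only a copy of the repository (the .bzr directory) and does not create the working directory and files. Even a subsequent update does not help. I do not know whether it is my bad luck or whether the downloaded .bzr itself is faulty. Switching to HTTP works, though: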

bzr branch

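But this is more trouble. One issue is security: HTTP authentication can of course be added, but it is still not as secure as SFTP. Another is that the web server must be configured on the server side to allow access to the remote repository over HTTP. The third is that HTTP is read-only, so uploading and downloading must use different URLs.

The web server configuration is not complicated; a standard configuration for serving static pages is enough. For example (including HTTP authentication):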

  Alias /bzr /home/bzr
  <Directory /home/bzr>
    Options FollowSymLinks
    AllowOverride FileInfo Indexes Limit
    Order allow,deny
    Allow from all

    AuthType Basic
    AuthName "Bazaar Repository Files"
    AuthUserFile /home/svn/svn-auth-dev
    Require valid-user
  </Directory>


bzr ignored






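Mercurial's strengths are that it is powerful and extremely fast (compared with Bazaar; it also seems somewhat faster than SVN). From the source code, Mercurial is also written in Python, but the release looks as if it was compiled into an EXE with a tool such as py2exe; I do not know why it is so fast. Its weakness is that working with a remote repository requires some installation and configuration on the server side, where Bazaar is more convenient.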






ssh = D:\tools\plink.exe -ssh -pw password
username = username<>


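The ssh entry is the SSH-related setting. According to the reference documentation, Mercurial can prompt interactively for the login password when operating on a remote repository over ssh, but my attempt on Windows failed, possibly because of plink. So the first time you clone from the remote repository to a local repository, you need to supply the login password here with the -pw option. (You can remove it afterwards. Because different projects may use different remote repositories, and therefore different users and passwords, you can then put the setting in the project's .hgrc instead; see the explanation later.)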







hg init




syntax: glob




default = ssh://username@remotehost//home/username/projname
default-push = ssh://username@remotehost//home/username/projname
ssh = D:\tools\plink.exe -ssh -pw password
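For readers unsure where these three settings belong, the Python sketch below parses an hgrc-style fragment. The section names `[paths]` and `[ui]` come from Mercurial's configuration format and do not appear in the text above; Python's `configparser` is used only as a stand-in reader and is not Mercurial's own parser. The host, user, path, and password are the placeholders from the article.

```python
import configparser

# hgrc-style fragment; section headers are supplied here as an assumption
# about where the settings shown above normally live.
HGRC_TEXT = r"""
[paths]
default = ssh://username@remotehost//home/username/projname
default-push = ssh://username@remotehost//home/username/projname

[ui]
ssh = D:\tools\plink.exe -ssh -pw password
"""

# interpolation=None keeps characters such as % or \ literal.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(HGRC_TEXT)

pull_url = parser["paths"]["default"]
# If no separate push target is configured, fall back to the pull URL.
push_url = parser["paths"].get("default-push", pull_url)
ssh_command = parser["ui"]["ssh"]
```

Keeping the password on the `ssh` line is the workaround the author describes for plink on Windows; it stores the password in plain text, so removing it after the first clone, as the author suggests, is the safer habit.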




hg status








hg add



hg commit -m "description of this commit"



sudo apt-get install mercurial



cd /home/username/projname
hg init


hg push ssh://username@remotehostname//home/username/projname


hg push


hg push --debug



hg clone ssh://username@remotehostname//home/username/projname







hg remove path/filename


hg remove -I path/wildcard .


hg pull ssh://username@remotehostname//home/username/projname


hg pull


hg update






hg head
hg heads
hg log


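Mercurial also has a powerful facility for bundling and unbundling changes: a developer can pack the change records in their local repository into a bundle and send it to someone else, and the other developer can unpack the bundle into their own repository, so that not even a shared remote repository is needed. That is "distributed" taken all the way.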




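I once optimistically believed that DRCS would replace traditional SCM, but that is only my own experience. I can easily swap SVN for Mercurial, but that does not mean it suits everyone. 令狐 pointed out that at his company, because a whole set of in-house management tools and conventions is built on VSS, it is unlikely to be replaced even though better choices are known to exist.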




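Bazaar's advantage is smart renaming, which helps when renaming directories in a large project, but the feature is not used often. Mercurial's renaming is the same as in traditional SCM: delete, then add again. In operating performance Mercurial beats Bazaar outright, and it also wins on ease of installation: to use SSH, Bazaar additionally requires you to install extra dependency packages yourself.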


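Bazaar's HTTP mode is simple: just configure a Directory entry in the web server that allows HTTP access to the repository's .bzr directory. However, Bazaar's HTTP mode offers read operations only, which is its shortcoming.

For read and write operations on a remote repository you still need SFTP (the SSH File Transfer Protocol). This access method is also simple to set up: it works as long as the server supports SFTP. Bazaar does not even need to be installed on the server; all operations on the remote repository (including its initial creation) are carried out from the client.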


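First, the HTTP mode. This requires running the serve command on the server to provide an HTTP service on a particular port, which the actual web server then proxies via mod_proxy or similar. The cost is extra resource consumption on the server, but in return it offers more powerful functionality, rather than read-only access as with Bazaar.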





Managing Your Project on Google Code with Mercurial Instead of Subversion

By Leeiio

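Until now I have always used SVN as my everyday version control tool, for code, documents, and the like. As for a tool as retro as CVS, I never had the chance to try it. Among hosting providers for SVN, Google Code is one of the better ones, and I have been using it all along. Recently, because of some projects, I learned about another version control tool, Hg. Hg is of course not its original name; it is called Mercurial, and since both refer to mercury (quicksilver), it is usually called Hg.

風雲, "Distributed Version Control Tools"; 猛禽, "Distributed Version Control (1)" and "Distributed Version Control (2)"; Sparkle, the "Mercurial and Me" series; and others. If you only want to learn about Mercurial (Hg), the official Mercurial wiki already has very thorough material and help documentation.

Now back to the subject of this article. What prompted it is that Google Code, which previously hosted code with SVN, now also supports the distributed version control system Mercurial (Hg) for managing projects hosted on Google Code. As for why Google chose Mercurial rather than Git among so many distributed version control tools, there is an article I recommend reading, "Git 與 Mercurial 的分析" (original: "Analysis of Git and Mercurial").

Below I show how to have Mercurial replace Subversion for managing your project on Google Code. (Original article.)

Setting Up Mercurial to Manage a Project in Google Code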

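  1. Visit the page of your existing Google Code project, select the "Administer" tab, then select the "Source" sub-tab.
  2. Change the first item, Repository type, to Mercurial.
  3. Following "How to Convert Your Google Code Subversion History to Mercurial" below, import your code into the Hg code repository.
  4. Import your wiki into the Hg wiki repository in the same way. Confirm the wiki path of your Subversion repository (for example ) as well as the wiki path of your Hg repository ( ).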


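How to Convert Your Google Code Subversion History to Mercurial

If you do not care about your project's history, you can simply export the latest code from the Subversion trunk or wiki and put it into Mercurial. Assuming the Mercurial repository on Google Code is empty, proceed as follows: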

$ hg clone hg-client

$ cd hg-client

$ svn export --force .

$ hg add .

$ hg commit -m "Initial import of source."

$ hg push


$ hg clone hg-client-wiki

$ cd hg-client-wiki

$ svn export --force .

$ hg add .

$ hg commit -m "Initial import of wiki."

$ hg push


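  • The 'hg convert' extension. Recent versions of Mercurial already include this module; make sure your hg version is 1.1, 1.2, or later (check with "hg --version"). Then add the following to your .hgrc to enable the extension:
    [extensions]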


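  • The Subversion swig-python bindings. Make sure you have a recent Subversion installation (1.5 or 1.6). Most Subversion distributions include the Python bindings or provide them as an additional binary package. You can run the following to check that your svn Python bindings work:
    $ python -c "import svn.core; print(svn.core.SVN_VER_MINOR)"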



$ svn checkout svn

$ cd svn

$ ./ && ./configure

$ make

$ sudo make install

$ make swig-py  # make sure you have swig 1.3 installed already

$ make check-swig-py

$ sudo make install-swig-py

This may also require installing the libsvn1, subversion, mercurial-common, and mercurial packages from backports.

Now we start the conversion, including branches, tags, and everything else:

$ mkdir hg-client

$ hg convert hg-client

Once the conversion is complete, you can push your new history to your Google Code project (provided you have an empty Mercurial repository there):

$ cd hg-client

$ hg push