Understanding Darcs/Print Version
- 1 Getting Started
- 2 Making Changes
- 3 Working With Others
- 4 Reviewing Your Work
- 5 Undoing Mistakes
- 6 Dealing With Conflicts
- 7 Introduction to Patch Theory
- 7.1 Math and computer science nerds only
- 7.2 What is the theory of patches?
- 7.3 Why all this math?
- 7.4 Context, patches and changes
- 7.5 Sequential patches
- 7.6 Inverses
- 7.7 Commutation
- 7.8 Merging
- 7.9 Definitions and properties
- 8 Intermediary Patch Theory
- 9 Patch Theory and Conflicts
- 9.1 Conflicts
- 9.2 Forced commutation
- 9.3 Forced commutation in merging
- 9.4 Darcs 2
- 9.5 Current research
Anatomy of a darcs repository
Understanding how to use the darcs commands can be a lot easier if you have a rough idea of how things work. We're not asking you to learn about patch commutation algebra (yet), but one thing you should at least be comfortable with is the anatomy of a darcs repository.
The idea of the diagram is as follows: the stuff on the left in yellow is you. That's what darcs calls the working directory. The stuff in grey, on the other hand, is part of that mysterious _darcs directory you may have noticed if you've played around with darcs before. This can be broken down some more. The pristine tree (middle) is exactly like your working directory, except that it only represents the last saved state. It's not essential to darcs's operations, but it makes things run more efficiently, and is perhaps useful for understanding how things work. Finally, the right-most portion is the set of patches. Patches are what makes darcs... well... darcs. Darcs thinks in patches. Almost every darcs operation somehow involves darcs juggling some patches around behind the scenes. Enthusiastic darcs users find that this makes life easier in a number of ways. For example, accepting patches to your code becomes an extremely natural thing to do: you just let darcs apply them and in they go. But before we get to that, let's tackle a small set of essential commands.
Say you have a directory with some files in it. When you run darcs init to initialise the repository, you get an empty new darcs repository. Your working directory might contain files, but darcs does not know about them yet.
darcs add tells darcs to keep track of a file or directory that was only in your working directory. It creates or adds to a special temporary patch which we call the pending patch (represented in blue). Note that it does not affect your pristine tree whatsoever! The idea is that we haven't saved your work (which is what the pristine tree is for). We've only told darcs that we might conceivably want to save it later on.
Note that the pending patch is different from all the other patches. It is really a representation of (some) things you have not yet converted into a real darcs patch.
darcs whatsnew compares the working directory against the pristine tree (theoretically, against the set of patches) and displays what has changed between them. If there is anything in the pending patch, it also displays the contents of that.
The darcs record command is how you save your work. It copies your chosen changes from the working directory to the pristine tree, and more importantly, creates a new patch representing those changes. Changes can also come from the pending patch, and these changes will also be propagated into the pristine tree.
record makes new stuff old
The record command takes the changes which only exist in your working directory (or the pending patch) and updates the pristine tree. The result of a record operation is a new patch.
replace for renaming tokens
The replace command is useful for explicitly telling darcs to replace one word with another (for example, a variable name, if you are a programmer).
Note that because of the underlying patch theory, replace only works if the new word doesn't already exist in the file. Darcs will helpfully let you know if you try to replace something you cannot. Also, there is a switch for forcing the replacement, but the resulting patch is not a clean darcs-replace patch, but a combination of that, and what you would have gotten if you had edited the file in a text editor. In short, forcing darcs to replace when it really doesn't want to can lead to counter-intuitive results.
Playing with files
We saw add in the last chapter. What about the other file-related commands?
add tells darcs to pay attention
add, unsurprisingly, adds a file or directory to the list of files that darcs is paying attention to.
mv for moving or renaming
mv lets you rename a file or put it in a different directory. This command updates the pending patch with a move command.
you don't need remove
You might think that remove gets rid of a file, but actually all it does is to remove it from the list of files that darcs is paying attention to. You probably just want to delete the file instead (eg: with rm). Darcs will notice and record the change next time you darcs record that file. So what's the remove command good for? It might be handy if you want to just remove the file from darcs, without actually getting rid of your physical copy. This is most useful when you've accidentally used darcs add on a file you don't want darcs to pay attention to.
Working With Others
darcs get makes a copy of an entire darcs repository. Note that we only get the recorded patches (and pristine tree), not any of the pending stuff.
darcs pull copies from some other repository patches that you do not have. The patches are applied to your pristine tree and working directory. This may cause changes to be merged. Note that darcs push does the same thing, but in the other direction.
darcs send is sort of like push, only it doesn't actually apply the patches anywhere. Instead it generates a handy email to the person or people who own the repository. If it can't figure out who owns the repository, it will let you send to any email address you want. Note that you can also pass the -o option to write the patches into a file rather than an email.
Apply is what you do to a patch that somebody darcs sends to you. You can also use it for the files you generated via send -o. Note: push is actually just apply in disguise, but with all the boring work of copying files over being done for you.
Put lets you copy a local repository to a remote location (for instance, via ssh). Think of put as the opposite of get.
Dealing with conflicts
So you pulled a patch and you got a conflict. What do you do? See the chapter Dealing with conflicts
Reviewing Your Work
The whatsnew command allows you to get an overview of what unrecorded changes you have made on your working copy.
This way you can get an idea of what needs to be saved.
Changes gives a changelog-style summary of the repo history.
Using the darcs diff command, you can get output like that produced by the diff program, which lets you save changes in a plain old (yet very common) format.
Note that darcs does not depend on the diff binary.
Undoing Mistakes
There are many ways to get rid of things: remove, rollback, revert, obliterate, unpull, unrecord... One would almost think too many. Only three of these are very important. In order of gravity, they are revert, unrecord, obliterate. You can also see them as being part of this table of symmetries:
|changes||do||undo|
|unrecorded||(i.e. add, remove, mv, editing a file)||revert|
|recent||record||unrecord|
|less recent||pull||unpull (aka obliterate)|
Don't worry too much about this table. Surely it will make sense later on.
Anyway, here is a brief comparison of all the different ways you can get rid of stuff. Each one has its place, even the weird ones, like rollback.
The big three
revert only removes what's new
The simplest of these commands is revert. All reverting does is to get rid of stuff you have not recorded yet. You can think of revert as being the "opposite" of whatsnew. Revert gets rid of stuff that is new.
unrecord makes things new again
Unrecord does something quite different from revert. Whereas revert gets rid of stuff that is new, unrecord removes a patch but (and here's the important part) makes the stuff in it new again, so that you can choose to re-record or revert it at your leisure.
Unrecord should only be used if you recorded something, realised you made a mistake, and want to record it differently (note also amend-record). Note: only use unrecord if you are sure that your repository is the only one that has that patch in it!
Note that the picture above gives a somewhat more accurate depiction of what unrecord does - it removes a patch and the corresponding modifications from the pristine tree. The fact that something is new again is just a natural consequence of this fact.
obliterate is unrecord + revert
Obliterate was deliberately named to be scary. Obliterate can be seen as unrecording a patch (thus making its stuff new again) and then reverting it. In other words, obliterate totally wipes a patch out! Typically, you would use obliterate to go really far back in time. Say, "hmm, all that stuff I've been working on for the past three months was pretty stupid". Obliterate is the answer there.
||It is probably a good idea to darcs get a backup copy of your repository before obliterating things. This way, if you remove something by mistake, you can pull it back in.|
Other ways to get rid of things
unpull is obliterate
Unpull and obliterate are exactly the same command; unpull is only named this way to reflect its usefulness for undoing a pull.
Rollback is not a command that you can expect to use very often. The situation is this. You've got a patch you want to get rid of, but people have been telling you that you shouldn't obliterate or unrecord patches that are already in other people's repositories. So what do you do? One solution is to fire up your text editor and make the exact opposite changes to the ones in the patch (and then record, etc). Another solution is to generate a rollback patch, which does the same thing. It creates a patch that does exactly the opposite of another patch.
Some users just find it easier to go the text-editor route.
remove doesn't belong here
Despite its getting-rid-of-things style name, remove is not really an undo kind of command. Its job is the opposite of add's: it tells darcs not to pay attention to a file any longer. But as we mentioned in the previous chapter, most of the time you don't even need darcs remove. Simply telling your computer to get rid of the file in your working directory is good enough for darcs to notice it is gone.
Questions and objections
- But... but... I just want to go back to the state of my repository from two weeks ago!
- obliterate is probably what you want. See above.
Dealing With Conflicts
Handle your conflicts at home
For the moment, the best place to go for dealing with conflicts is the conflicts FAQ on the darcs wiki.
Introduction to Patch Theory
Math and computer science nerds only
(The occasional physicist will be tolerated)
Casual users be warned, the stuff you're about to read is not for the faint of heart! If you're a day-to-day darcs user, you probably do not need to read anything from this page on. However, if you are interested in learning how darcs really works, we invite you to roll up your sleeves, and follow us in this guided tour of the growing Theory of Patches.
What is the theory of patches?
The darcs patch formalism is the underlying "math" which helps us understand how darcs should behave when exchanging patches between repositories. It is implemented in the darcs engine as data structures for representing sequences of patches and Haskell functions equivalent to the operations in the formalism. This section is addressed at two audiences: curious onlookers and people wanting to participate in the development of the darcs core. My aim is to help you understand the intuitions behind all this math, so that you can get up to speed with current conflictors research as fast as possible and start making contributions. You should note that I myself am only starting to learn about patch theory and conflictors, so there may be mistakes ahead.
Why all this math?
One difference between centralized and distributed version control systems is that in a distributed system, "merging" is something that we do practically all the time, so it is doubly important that we get merging right. Turning the problem of version control into a math problem has two effects: it lets us abstract all of the irrelevant implementation details away, and it forces us to make sure that whatever techniques we come up with are fundamentally sound, that they do not fall apart when things get increasingly complicated. Unfortunately, math can be difficult for people who do not make use of it on a regular basis, so what we attempt to do in this manual is to ease you into the math through the use of concrete, illustrated examples.
A word of caution though: "getting merging right" does not necessarily consist of having clever behaviour with respect to conflicts. We will begin by focusing on successful, non-conflicting merges and move on to the darcs approach to handling conflicts.
|Note that the notation we use follows that from FOSDEM 2006, and not the darcs patch theory appendix. Namely, patch composition is written left to right: $AB$ means that B is applied after A.|
Context, patches and changes
Let us begin with a little shopping. Arjan is working to build a shopping list for the upcoming darcs hackathon. As we speak, his repository contains a single file s_list with the contents
1 apples
2 bananas
3 cookies
4 rice
Note: the numbers you see are just line numbers; they are not part of the file contents.
As we will see in this and other examples in this book, we will often need to assign a name to the state of the repository. We call this name a context. For example, we can say that Arjan's repository is in a context $o$, defined by there being a file s_list with the contents mentioned above.
Arjan makes a modification which consists of adding a line in s_list. His new file looks like this:
1 apples
2 bananas
3 beer
4 cookies
5 rice
When Arjan records this change (adding beer), we produce a patch which tells us not only what contents Arjan added ("beer") but where he added them, namely to line 3 of s_list. We can say that in his repository, we have moved from context $o$ to context $a$ via a patch $A$. We can write this using a compact notation like ${}^oA^a$ or using the graphical representation below:
Starting from this context, Arjan might decide to make further changes. His new changes would be patches that apply to the context of the previous patches. So if Arjan makes a new patch $B$ on top of this, it would take us from context $a$ to some new context $b$. The next patch $C$ would take us from this context to yet another new context $c$, and so on and so forth. Patches which apply on top of each other like this are called sequential patches. We write them in left to right order as in the table below, either representing the contexts explicitly or leaving them out for brevity:
|with context||sans context (shorthand)|
|${}^oA^aB^bC^c$||$ABC$|
All darcs repositories are simply sequences of patches as above; however, when performing a complex operation such as an undo or exchanging patches with another user, it becomes absolutely essential that we have some mechanism for rearranging patches and putting them in different orders. Darcs patch theory is essentially about giving a precise definition to the ways in which patches and patch-trees can be manipulated and transformed while maintaining the coherency of the repository.
Let's return to the example from the beginning of this module. Arjan has just added beer to our hackathon shopping list, but in a sudden fit of indecisiveness, he reconsiders that thought and wants to undo his change. In our example, this might consist of firing up his text editor and removing the offending line from the shopping list. But what if his changes were complex and hard to keep track of? The better thing to do would be to let darcs figure it out by itself. Darcs does this by computing an inverse patch, that is, a patch which makes the exact opposite change of some other patch:
||The inverse of patch $A$ is $A^{-1}$, which is the patch for which the composition $AA^{-1}$ makes no changes to the context and for which the inverse of the inverse is the original patch.|
So above, we said that Arjan has created a patch $A$ which adds beer to the shopping list, passing from context $o$ to $a$, or more compactly, ${}^oA^a$. Now we are going to create the inverse patch $A^{-1}$, which removes beer from the shopping list and brings us back to context $o$. In the compact context-patch notation, we would write this as ${}^oA^a{A^{-1}}^o$. Graphically, we would represent the situation like this:
Patch inverses may seem trivial, but as we will see later on in this module, they are a fundamental operation and absolutely crucial to make some of the fancier stuff -- like merging -- work correctly. One of the rules we impose in darcs is that every patch must have an inverse. These rules are what we call patch properties. A patch property tells us things which must be true about a patch in order for darcs to work. People often like to dream up new kinds of patches to extend darcs's functionality, and defining these patch properties is how we know that their new patch types will behave properly under darcs. The first of these properties is dead simple:
||Patch property: Every patch must have an inverse|
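The inverse can be made concrete with a toy model. The following is a minimal Haskell sketch, not darcs's actual representation: the names `Patch`, `apply`, `invert`, `contextO` and `patchA` are hypothetical, and a patch here can only insert or delete a single line at a 1-based line number.

```haskell
-- Toy model (an assumption of this sketch, not darcs's real patch type):
-- a patch inserts or deletes exactly one line at a 1-based line number.
data Patch = Insert Int String | Delete Int String
  deriving (Eq, Show)

-- Apply a patch to a file given as a list of lines.
-- Nothing means the patch does not fit this context.
apply :: Patch -> [String] -> Maybe [String]
apply (Insert n s) ls
  | n >= 1 && n <= length ls + 1 = Just (before ++ [s] ++ after)
  | otherwise                    = Nothing
  where (before, after) = splitAt (n - 1) ls
apply (Delete n s) ls
  | n >= 1 && n <= length ls && ls !! (n - 1) == s =
      Just (take (n - 1) ls ++ drop n ls)
  | otherwise = Nothing

-- The inverse undoes exactly what the patch did.
invert :: Patch -> Patch
invert (Insert n s) = Delete n s
invert (Delete n s) = Insert n s

-- Context o: Arjan's original shopping list.
contextO :: [String]
contextO = ["apples", "bananas", "cookies", "rice"]

-- Patch A: add "beer" on line 3.
patchA :: Patch
patchA = Insert 3 "beer"
```

In this model, `apply patchA contextO` yields the list with beer on line 3, and applying `invert patchA` to that result gives back `contextO`. Because `Delete` records the text it removes, every patch in the model has an inverse, which is the property stated above.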
Arjan was lucky to realise that he wanted to undo his change as quickly as he did. But what happens if he was a little slower to realise his mistake? What if he makes some other changes before realising that he wants to undo the first change? Is it possible to undo his first change without undoing all the subsequent changes? It sometimes is, but to do so, we need to define an operation called commutation.
Consider a variant of the example above. As usual, Arjan adds beer to the shopping list. Next, he decides to add some pasta on line 5 of the file:
The question is how darcs should behave if Arjan now decides that he does not want beer on the shopping list after all. Arjan simply wants to remove the patch that adds the beer, without touching the one which adds pasta. The problem is that darcs repositories are simple, stupid sequences of patches. We can't just remove the beer patch, because then there would no longer be a context for the pasta patch! Arjan's first patch $A$ takes us to context $a$ like so: ${}^oA^a$, and his second patch $B$ takes us to context $b$, notably starting from context $a$: ${}^aB^b$. Removing patch $A$ would be pulling the rug out from under patch $B$. The trick behind this is to somehow change the order of patches $A$ and $B$. This is precisely what commutation is for:
||The commutation of patches $A$ and $B$ is represented by $AB \leftrightarrow B_1A_1$. $A_1$ and $B_1$ are intended to perform the same change as $A$ and $B$.|
Why not keep our old patches?
To understand commutation, you should understand why we cannot keep our original patches, but are forced to rely on evil step sisters instead. It helps to work with a concrete example such as the beer and pasta one above. While we could write the sequence $AB$ to represent adding beer and then pasta, simply writing $BA$ for pasta and then beer would be a very foolish thing to do.
Put it this way: what would happen if we applied $B$ before $A$? We add pasta to line 5 of the file:
1 apples
2 bananas
3 cookies
4 rice
5 pasta
Does something seem amiss to you? We continue by adding beer to line 3. If you pay attention to the contents of the end result, you might notice that the order of our list is subtly wrong. Compare the two lists to see why:
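Applying the original patches in the wrong order (pasta, then beer):
1 apples
2 bananas
3 beer
4 cookies
5 rice
6 pasta
What we actually wanted (beer, then pasta):
1 apples
2 bananas
3 beer
4 cookies
5 pasta
6 rice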
It might not matter here because it is only a shopping list, but imagine that it was your PhD thesis, or your computer program to end world hunger. The error is all the more alarming because it is subtle and hard to pick out with the human eye.
The problem is one of context, specifically speaking, the context between $A$ and $B$. In order for instructions like "add pasta to line 5 of s_list" to make any sense, they have to be in the correct context. Fortunately, commutation is easy to do: it produces two new patches $B_1$ and $A_1$ which perform the same change as $B$ and $A$ but with a different context in between.
|Patch $A_1$ is identical to $A$. It adds "beer" to line 3 of the shopping list. But what should patch $B_1$ do?|
One more important detail to note though! We said earlier that getting the context right is the motivation behind commutation -- we can't simply apply patches in a different order, because that would get the context all wrong. But context does not have any effect on whether A and B can commute (or how they should commute). This is strictly a local affair. Conversely, the commutation of A and B does not have any effect either on the global context: the sequences $AB$ and $B_1A_1$ (where the latter is the commutation of the former) start from the same context and end in the same context.
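To see what $B_1$ has to do in the beer and pasta example, here is a small Haskell sketch of commutation for single-line insertions. It is an illustration under stated assumptions rather than darcs's own code: the `Patch` type and the names `apply`, `commute`, `patchA` and `patchB` are hypothetical, and the sketch conservatively refuses to commute insertions that touch the same or adjacent lines.

```haskell
-- Toy model (an assumption of this sketch): a patch inserts one line
-- at a 1-based line number.
data Patch = Insert Int String
  deriving (Eq, Show)

apply :: Patch -> [String] -> [String]
apply (Insert n s) ls = before ++ [s] ++ after
  where (before, after) = splitAt (n - 1) ls

-- commute (a, b): a is applied first, then b.
-- Returns (b1, a1) such that applying b1 and then a1 gives the same file.
-- Insertions at the same or adjacent positions are refused, which is a
-- conservative choice made for this sketch.
commute :: (Patch, Patch) -> Maybe (Patch, Patch)
commute (Insert n s, Insert m t)
  | m > n + 1 = Just (Insert (m - 1) t, Insert n s)  -- b lands below a's line
  | m < n     = Just (Insert m t, Insert (n + 1) s)  -- b lands above a's line
  | otherwise = Nothing

contextO :: [String]
contextO = ["apples", "bananas", "cookies", "rice"]

patchA, patchB :: Patch
patchA = Insert 3 "beer"   -- A: beer on line 3
patchB = Insert 5 "pasta"  -- B: pasta on line 5, in the context after A
```

Here `commute (patchA, patchB)` gives `Just (Insert 4 "pasta", Insert 3 "beer")`: $A_1$ is unchanged, while $B_1$ inserts pasta on line 4 because the beer line is not there yet. Applying $A$ then $B$, or $B_1$ then $A_1$, to `contextO` produces the same six-line list.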
The complex undo revisited
Now that we know what the commutation operation does, let's see how we can use it to undo a patch that is buried under some other patch. The first thing we do is commute Arjan's beer and pasta patches. This gives us an alternate route to the same context. But notice the small difference between $B$ and $B_1$!
The purpose of commuting the patches is essentially to push patch $A$ on to the end of the list, so that we could simply apply its inverse. Only here, it is not the inverse of $A$ that we want, but the inverse of its evil step sister $A_1$. This is what applying that inverse does: it walks us back to the context just after $B_1$, as if we had only applied the pasta patch, but not the beer one.
And now the undo is complete. To sum up, when the patch we want to undo is buried under some other patch, we use commutation to squeeze it to the end of the patch sequence, and then compute the inverse of the commuted patch. For the more sequentially minded, this is what the general scheme looks like:
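The scheme can be sketched in Haskell as follows. This is a toy model under stated assumptions, not darcs's implementation: all names (`Patch`, `apply`, `applyAll`, `invert`, `commute`, `undoBuried`) are hypothetical, patches touch a single line, and `commute` only handles pairs of non-adjacent insertions.

```haskell
import Control.Monad (foldM)

-- Toy model (an assumption of this sketch, not darcs's real patch type):
-- a patch inserts or deletes one line at a 1-based line number.
data Patch = Insert Int String | Delete Int String
  deriving (Eq, Show)

-- Apply one patch; Nothing means it does not fit this context.
apply :: Patch -> [String] -> Maybe [String]
apply (Insert n s) ls
  | n >= 1 && n <= length ls + 1 = Just (take (n - 1) ls ++ [s] ++ drop (n - 1) ls)
  | otherwise                    = Nothing
apply (Delete n s) ls
  | n >= 1 && n <= length ls && ls !! (n - 1) == s =
      Just (take (n - 1) ls ++ drop n ls)
  | otherwise = Nothing

-- Apply a sequence of patches from left to right.
applyAll :: [Patch] -> [String] -> Maybe [String]
applyAll ps ls = foldM (flip apply) ls ps

invert :: Patch -> Patch
invert (Insert n s) = Delete n s
invert (Delete n s) = Insert n s

-- commute (a, b) with a applied first; returns (b1, a1).
-- Only pairs of non-adjacent insertions are handled in this sketch.
commute :: (Patch, Patch) -> Maybe (Patch, Patch)
commute (Insert n s, Insert m t)
  | m > n + 1 = Just (Insert (m - 1) t, Insert n s)
  | m < n     = Just (Insert m t, Insert (n + 1) s)
commute _ = Nothing

-- Undo a while keeping b: commute a to the end of the sequence,
-- then append the inverse of the commuted patch a1.
undoBuried :: (Patch, Patch) -> Maybe [Patch]
undoBuried (a, b) = do
  (_, a1) <- commute (a, b)
  return [a, b, invert a1]

contextO :: [String]
contextO = ["apples", "bananas", "cookies", "rice"]
```

For the beer patch `Insert 3 "beer"` followed by the pasta patch `Insert 5 "pasta"`, `undoBuried` returns the original two patches followed by `Delete 3 "beer"`. Applying that sequence to `contextO` leaves a list with pasta but no beer, the same file that the commuted pasta patch `Insert 4 "pasta"` produces on its own.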
Imagine the opposite scenario: Arjan had started by adding pasta to the list, and then followed up with the beer.
Commutation and patches
Every time we define a type of patch, we have to define how it commutes with other patches. Most of the time, it is very straightforward. When commuting two hunk patches, for instance, we simply adjust their line offset. For instance, say patch $A$ puts something on line 3 of the file, but if we use patch $B$ to insert a single line before that, what used to be line 3 now becomes line 4! So patch $A_1$ inserts the line "x" into line 4, much like $A$ inserts it into line 3.
Some patches cannot be commuted. For example, you can't commute the addition of a file with adding contents to it. But for now, we focus on patches which can commute.
- Note: this might be a good place to take a break. We are moving on to a new topic and new (but similar) examples
We have presented two fundamental darcs operations: patch inverse and patch commutation. It turns out these two operations are almost all that we need to perform a darcs merge.
Arjan and Ganesh are working together to build a shopping list for the upcoming darcs hackathon. Arjan initialises the repository and adds a file s_list with the contents
1 apples
2 bananas
3 cookies
4 rice
He then records his changes, and Ganesh performs a darcs get to obtain an identical copy of his repository. Notice that Arjan and Ganesh are starting from the same context $o$.
Arjan makes a modification which consists of adding a line in s_list. His new file looks like this:
1 apples
2 bananas
3 beer
4 cookies
5 rice
Arjan's patch $A$ brings him to a new context $a$:
Now, in his repository, Ganesh also makes a modification; he decides that s_list is a little hard to decipher and renames the file to shopping. Remember, at this point, Ganesh has not seen Arjan's modifications. He's still starting from the original context $o$, and has moved to a new context $b$, via his patch $B$:
At this point in time, Ganesh decides that it would be useful if he got a copy of Arjan's changes. Roughly speaking we would like to pull Arjan's patch $A$ into Ganesh's repository, which contains $B$. But, there is a major problem! Namely, Arjan's patch takes us from context $o$ to context $a$. Pulling it into Ganesh's repository would involve trying to apply it to context $b$, which we simply do not know how to do. Put another way: Arjan's patch tells us to add a line to file s_list; however, in Ganesh's repository, s_list no longer exists, as it has been moved to shopping. How are we supposed to know that Arjan's change (adding the line "beer") is supposed to apply to the new file shopping instead?
Arjan and Ganesh's patches start from the same context o and diverge to different contexts a and b. We say that their patches are parallel to each other, and write it as $A \vee B$. In trying to pull patches from Arjan's repository, we are trying to merge these two patches. The basic approach is to convert the parallel patches into the sequential patches $BA_1$, such that $A_1$ does essentially the same change as $A$ does, but within the context of b. We want to produce the situation ${}^oB^b{A_1}^c$, where $c$ is the merged context.
Performing the merge
Converting Arjan and Ganesh's parallel patches into sequential ones requires little more than the inverse and commutation operations that we described earlier in this module:
- So we're starting out with just Ganesh's patch. In context notation, we are at ${}^oB^b$
- We calculate the inverse patch $B^{-1}$. The sequence $BB^{-1}$ consists of moving s_list to shopping and then back again. We've walked our way back to the original context: ${}^oB^b{B^{-1}}^o$
- Now we can apply Arjan's patch without worries: ${}^oB^b{B^{-1}}^oA^a$, but the result does not look very interesting, because we've basically got the same thing Arjan has now, not a merge.
- All we need to do is commute the last two patches, $B^{-1}A$, to get a new pair of patches $A_1B_1^{-1}$. Still, the end result doesn't seem to look very interesting since it results in exactly the same state as the last step: ${}^oB^b{A_1}^c{B_1^{-1}}^a$
- However, one crucial difference is that the second to last patch produces just the state we're looking for! All we now have to do to get at it is to ditch the $B_1^{-1}$ patch, which is only serving to undo Ganesh's precious work anyway. That is to say, by simply commuting $B^{-1}A$ to produce $A_1$, we have determined the version of $A$ which will update Ganesh's repository.
The end result of all this is that we have the patch we're looking for, and a successful merge.
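The steps above can be sketched in Haskell with a toy model. Everything here is an assumption made for illustration, not darcs's implementation: the names (`Patch`, `commute`, `merge` and the helpers) are hypothetical, patches insert or delete a single line, `commute` refuses patches that touch the same or adjacent lines, and Ganesh's rename is replaced by a simpler non-conflicting change (appending "pasta" on line 5) because this model has no rename patch.

```haskell
import Control.Monad (foldM)

-- Toy model (an assumption of this sketch): single-line patches.
data Patch = Insert Int String | Delete Int String
  deriving (Eq, Show)

apply :: Patch -> [String] -> Maybe [String]
apply (Insert n s) ls
  | n >= 1 && n <= length ls + 1 = Just (take (n - 1) ls ++ [s] ++ drop (n - 1) ls)
  | otherwise                    = Nothing
apply (Delete n s) ls
  | n >= 1 && n <= length ls && ls !! (n - 1) == s =
      Just (take (n - 1) ls ++ drop n ls)
  | otherwise = Nothing

applyAll :: [Patch] -> [String] -> Maybe [String]
applyAll ps ls = foldM (flip apply) ls ps

invert :: Patch -> Patch
invert (Insert n s) = Delete n s
invert (Delete n s) = Insert n s

lineOf, added, removed :: Patch -> Int
lineOf (Insert n _) = n
lineOf (Delete n _) = n
added (Insert _ _) = 1    -- lines the patch adds
added (Delete _ _) = 0
removed (Insert _ _) = 0  -- lines the patch removes
removed (Delete _ _) = 1

moveTo :: Int -> Patch -> Patch
moveTo n (Insert _ s) = Insert n s
moveTo n (Delete _ s) = Delete n s

-- commute (p, q): p is applied first, then q. Returns (q1, p1).
commute :: (Patch, Patch) -> Maybe (Patch, Patch)
commute (p, q)
  | lineOf p + added p < lineOf q =
      Just (moveTo (lineOf q - added p + removed p) q, p)
  | lineOf q + removed q < lineOf p =
      Just (q, moveTo (lineOf p + added q - removed q) p)
  | otherwise = Nothing

-- merge (a, b): a and b are parallel patches from the same context.
-- Commute the inverse of b with a, keep a1 and drop the other patch.
-- The merged sequence is then b followed by a1.
merge :: (Patch, Patch) -> Maybe Patch
merge (a, b) = do
  (a1, _) <- commute (invert b, a)
  return a1

contextO :: [String]
contextO = ["apples", "bananas", "cookies", "rice"]

patchA, patchB :: Patch
patchA = Insert 3 "beer"   -- Arjan's patch
patchB = Insert 5 "pasta"  -- stand-in for Ganesh's patch (an assumption)
```

With these definitions, `merge (patchA, patchB)` yields $A_1$ = `Insert 3 "beer"` for Ganesh's repository, and `merge (patchB, patchA)` yields $B_1$ = `Insert 6 "pasta"` for Arjan's. Both `[patchB, a1]` and `[patchA, b1]` applied to `contextO` give the same merged list.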
Merging is symmetric
Concretely, we've talked about Ganesh pulling Arjan's patch into his repository, so what about the other way around? Arjan pulling Ganesh's patch into his repository would work the same exact way, only that he is looking for a commuted version of Ganesh's patch that would apply to his repository. If Ganesh can pull Arjan's patch in, then Arjan can pull Ganesh's one too, and the result would be exactly the same:
||The result of a merge of two patches $A$ and $B$ is one of two patches $A_1$ and $B_1$, which satisfy the relationship $A \vee B \Longrightarrow AB_1 \leftrightarrow BA_1$.|
The merge definition describes what should happen when you combine two parallel patches into a patch sequence. The built-in symmetry is essential for darcs because a darcs repository is defined entirely by its patches. Put another way,
- To be written
The commutation with inverse property
The definition of a merge tells us what we want merging to look like. How did we know how to actually perform that merge? The answer comes out of the following property of commutation and inverse: if you can commute the inverse of a patch $B^{-1}$ with some other patch $A$, then you can also commute the patch itself $B$ against $A_1$.
||$AB_1 \leftrightarrow BA_1$ if and only if $B^{-1}A \leftrightarrow A_1B_1^{-1}$, provided both commutations succeed.|
Note how the left hand side of this property exactly matches the relationship demanded by the definition of a merge. To see why this all works,
- To be written
Definitions and properties
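|definition of inverse||$AA^{-1}$ has no effect|
|inverse of an inverse||$(A^{-1})^{-1} = A$|
|inverse composition property||$(AB)^{-1} = B^{-1}A^{-1}$|
|definition of commutation||$AB \leftrightarrow B_1A_1$|
|definition of a merge||$A \vee B \Longrightarrow AB_1 \leftrightarrow BA_1$|
|commutation with inverse property||$AB_1 \leftrightarrow BA_1$ if and only if $B^{-1}A \leftrightarrow A_1B_1^{-1}$|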
Intermediary Patch Theory
It's all patches
Before moving on, we would like to make one very minor point clear: in the day to day operation of darcs, we talk about pulling and pushing patches, and of recording and reverting changes. This is just a user interface convention. In patch theory terms, all of these are just patches. The patches which you pull and push are named patches, patches which contain a name and a list of unnamed patches. So in fact, when you pull a single named patch from somebody else's repository, you are pulling a sequence of potentially many primitive patches. What does this mean for merging?
Merging a sequence of patches
In the last chapter, we saw that dealing with simple, non-conflicting merges consists mainly of making an inverse patch and commuting that inverse with the other side's patches. Let us now explore a slightly more complicated scenario, where we have to merge against a non-conflicting sequence of patches. We do this with a variant of the darcs hackathon shopping list. As usual, Arjan and Ganesh are working together to write the shopping list. They both start from a common file shplst containing
apples
bananas
cookies
As before, Arjan inserts "beer" in line 3 of shplst and records the change. He then decides to add another item on the end of the list, this time, "pasta" and records his second change. In darcs notation, Arjan has brought us from an initial context $o$, to a new context $a$ with beer in it, and then to yet another context $b$ with pasta as well.
FIXME: will be fleshed out: i want to show what happens when Ganesh pulls two patches in
Sequences of patches
Patch Theory and Conflicts
||This guide uses notation and terminology developed during FOSDEM 2006, so it will be out of synch with the older notation/terminology from the darcs-conflicts mailing list archive|
Up to now, we have only dealt with merging patches that do not conflict with each other. The next question of interest is how darcs should behave when they do.
Consider the previous darcs hackathon example, where as usual, Arjan decides that the shopping list needs some beer. In this scenario, Ganesh decides that you can't live on apples and cookies alone and records a patch adding "pasta" to the s_list file. Now he wants to know what Arjan is up to, and so pulls the beer patch into his repository, but oh no! Arjan and Ganesh's patches conflict! How should darcs behave here?
The darcs answer is that both patches cancel each other out so that neither of them has any effect. The resulting shopping list has neither beer nor pasta. This might sound alarming, but it's not as bad as you might think. Darcs does not silently delete your code. After canceling the two patches out, it adds a third patch into your working directory which indicates both sides of the conflict so that you can select the one that you want. So any resolution you apply is a third patch which depends on the two conflicting ones. If you did darcs whatsnew on Ganesh's repository at this point, what you would get is something like this:
v v v v v v
beer
-----------
pasta
^ ^ ^ ^ ^ ^
How do we know we have a conflict?
It is intuitively obvious that Arjan's patch conflicts with Ganesh's, but intuition is useless if it does not translate into actual Haskell code. The first issue is thus that of knowing that we have a conflict in the first place.
All of this boils down to commutation. We have a conflict if commutation is not defined for the two patches. Let us briefly revisit the merge process described in the previous chapter. When Ganesh tries to pull Arjan's patch in, he tries to adapt the patch to his context by performing the following sequence: invert his own patch, apply Arjan's patch $A$, commute the inverted patch with Arjan's patch, and discard the evil step sister of his inverted patch. As we know, inverting patches is easy. Ganesh's patch is inverted into something which removes 'pasta' from line 3 of the s_list file. On the other hand, when we try to commute that against Arjan's patch, we have a failure.
Why? Simply because it is how we define commutation between the two types of patches. For instance, both Ganesh's and Arjan's patches are hunk patches. The commutation of two hunk patches of the same file is defined in darcs using Haskell code very similar to the following (simplified from PatchCommute.lhs):
commuteHunk :: FileName -> (FilePatchType, FilePatchType) -> Maybe (Patch, Patch)
commuteHunk f (p1@(Hunk line2 old2 new2), p2@(Hunk line1 old1 new1))
  | line1 + lengthnew1 < line2             = Just ...
  | line1 + lengthnew1 == line2 && nonZero = Just ...
  | line2 + lengthold2 < line1             = Just ...
  | line2 + lengthold2 == line1 && nonZero = Just ...
  | otherwise                              = Nothing
  where nonZero = lengthold2 /= 0 && lengthold1 /= 0 &&
                  lengthnew2 /= 0 && lengthnew1 /= 0
        lengthnew1 = length new1
        lengthnew2 = length new2
        lengthold1 = length old1
        lengthold2 = length old2
Only four cases are defined. The first two cases cover the situation where p2 (the hunk at line1) occurs in an earlier part of the file than p1 (the hunk at line2), even bumping up against it as in the second case. The latter two cases cover the reverse situation (p1 is in an earlier part of the file than p2). However, the case where p1 and p2 overlap simply does not fall into one of these possibilities. Thus we have a conflict on our hands.
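To see the four guards at work, here is a self-contained sketch that can be loaded into GHCi. It rests on several assumptions: the `Hunk` type is a simplified stand-in, the hunk at `line1` is taken to be the one applied first, and the renumbered results in the `Just` branches are our own guess at what the elided results might look like, not code copied from darcs. The names `commuteHunkSketch`, `beerVsPasta` and `farApart` are invented for the example.

```haskell
-- Simplified hunk: line number, removed lines, added lines.
data Hunk = Hunk Int [String] [String]
  deriving (Eq, Show)

-- Assumption: in the pair (later, earlier) the hunk at line1 is applied
-- first, so line2 is counted in the file as it looks after that hunk.
-- The result pair uses the same (later, earlier) order.
commuteHunkSketch :: (Hunk, Hunk) -> Maybe (Hunk, Hunk)
commuteHunkSketch (Hunk line2 old2 new2, Hunk line1 old1 new1)
  | line1 + lengthnew1 < line2             = Just hunk1Above
  | line1 + lengthnew1 == line2 && nonZero = Just hunk1Above
  | line2 + lengthold2 < line1             = Just hunk2Above
  | line2 + lengthold2 == line1 && nonZero = Just hunk2Above
  | otherwise                              = Nothing
  where
    -- the line1 hunk keeps its place; the line2 hunk is renumbered
    hunk1Above = ( Hunk line1 old1 new1
                 , Hunk (line2 - lengthnew1 + lengthold1) old2 new2 )
    -- the line2 hunk keeps its place; the line1 hunk is renumbered
    hunk2Above = ( Hunk (line1 + lengthnew2 - lengthold2) old1 new1
                 , Hunk line2 old2 new2 )
    nonZero = lengthold2 /= 0 && lengthold1 /= 0 &&
              lengthnew2 /= 0 && lengthnew1 /= 0
    lengthnew1 = length new1
    lengthnew2 = length new2
    lengthold1 = length old1
    lengthold2 = length old2

-- Arjan's "beer" against Ganesh's inverted "pasta" patch, both at line 3.
beerVsPasta :: Maybe (Hunk, Hunk)
beerVsPasta = commuteHunkSketch (Hunk 3 [] ["beer"], Hunk 3 ["pasta"] [])

-- Two hunks that are far apart in the file commute without trouble.
farApart :: Maybe (Hunk, Hunk)
farApart = commuteHunkSketch (Hunk 5 [] ["beer"], Hunk 1 [] ["milk"])
```

In this sketch `beerVsPasta` evaluates to `Nothing`: the two hunks touch at line 3, but each has an empty side, so `nonZero` is false and no guard matches. `farApart` falls into the first case and commutes.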
Now that we know we have a conflict, we need to deal with it in a sane manner. We want not only to deal with the conflict at hand, but to deal with it in a way which allows the conflict to propagate cleanly across an entire sequence of patches. Well, darcs is based on commutation, so in order to keep things running smoothly, we need to make sure that things continue to commute. So, we're going to define a secondary forced commutation operation that we only use when there is a conflict.
Recall the definition of commutation from the previous chapter:
||The commutation of patches $A$ and $B$ is represented by $AB \leftrightarrow B_1 A_1$. $A_1$ and $B_1$ are intended to perform the same change as $A$ and $B$.|
The forced commutation is going to do something similar, but with a very odd twist. Instead of patches $A_1$ and $B_1$ performing the same change as their respective ancestors $A$ and $B$, forced commutation is going to give us patches, each of which makes the change that the other patch does. That is, normal commutation wants $A_1$ to do roughly the same thing as $A$, but forced commutation makes it do the same thing as $B$.
|operation||effect of $A_1$||effect of $B_1$|
|normal commutation||$A$||$B$|
|forced commutation||$B$||$A$|
As a side note, we're going to need a little terminology to keep ourselves from tripping over our tongues. It's not very convenient to always talk about one patch making the same change as another patch, which is something we will be referring to a lot. So let us compress things a little bit. Instead of saying that patch $A_1$ makes the same change as $B$, let us simply say that the effect of $A_1$ is $B$. It is the same idea, but with slightly smoother terminology.
Forced commutation in merging
Let us see what the implications of this are for Ganesh and Arjan. We want to commute the inverse of Ganesh's patch ($G^{-1}$) against Arjan's patch ($A$). Since the two patches conflict, we have to resort to forced commutation, which produces two patches $A_1$ and $G^{-1}_1$ with the following bizarre properties:
- the effect of $A_1$ is $G^{-1}$; it removes Ganesh's "pasta" from the shopping list.
- likewise, the effect of $G^{-1}_1$ is $A$; it adds Arjan's "beer" to the shopping list.
This is all very convenient, because if I may remind you, what we're really after is cancelling out the patches. If we do the standard merging technique of simply removing $G^{-1}_1$ (so we don't add the beer after all), we are left with Ganesh's patch $G$ followed by $A_1$, whose effect is $G^{-1}$. We will have successfully undone Ganesh's pasta patch. The merge is complete!
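The cancelling-out can be played through with a toy model in Haskell. Everything here is an assumption made for illustration: a patch is reduced to a label plus its effect on the shopping list, and `forcedCommute`, `applyAll` and `mergedList` are hypothetical names, not darcs functions.

```haskell
import Data.List (delete)

type ShoppingList = [String]

-- Toy model: a patch is nothing more than a label and its effect.
data Patch = Patch
  { label  :: String
  , effect :: ShoppingList -> ShoppingList
  }

ganesh, ganeshInv, arjan :: Patch
ganesh    = Patch "G"    (++ ["pasta"])    -- adds pasta
ganeshInv = Patch "G^-1" (delete "pasta")  -- removes pasta again
arjan     = Patch "A"    (++ ["beer"])     -- adds beer

-- Forced commutation of the sequence "p, then q": the labels trade places,
-- but each new patch takes on the effect of the *other* patch.
forcedCommute :: (Patch, Patch) -> (Patch, Patch)
forcedCommute (p, q) =
  ( Patch (label q ++ "_1") (effect p)
  , Patch (label p ++ "_1") (effect q)
  )

-- Apply a sequence of patches from left to right.
applyAll :: [Patch] -> ShoppingList -> ShoppingList
applyAll patches start = foldl (\list p -> effect p list) start patches

-- Ganesh's repository after the merge: his own patch, then A_1.
-- The second result of the forced commutation is discarded.
mergedList :: ShoppingList
mergedList = applyAll [ganesh, arjan1] ["apples", "cookies"]
  where
    (arjan1, _discarded) = forcedCommute (ganeshInv, arjan)
```

In this model `mergedList` contains neither pasta nor beer, which is the cancelling-out described above.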
Marking the conflict
But wait! We can't just leave things undone. How is the poor developer supposed to know if there is a conflict, if darcs handles them by undoing things? The answer is that we're not going to stop here. Undoing the conflict is a very important first step, as we will see in further detail below. Look at it this way. We know there was a conflict, because of the way commutation was defined, and we know which patches were involved in the conflict. So whenever this happens, we first undo everything, and then inspect the contents of the conflicting patches, and use that to create a new conflict-marking patch.
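As a rough sketch of that last step, the contents of the two conflicting patches can be laid out between marker lines. The helper `markConflict` is hypothetical, and the marker lines are simply copied from the darcs whatsnew output shown earlier in this chapter; the exact markers that darcs writes may differ.

```haskell
-- Hypothetical helper: lay out both sides of a conflict between marker lines.
markConflict :: [String] -> [String] -> [String]
markConflict sideA sideB =
  ["v v v v v v"] ++ sideA ++ ["-----------"] ++ sideB ++ ["^ ^ ^ ^ ^ ^"]

-- The lines added for the beer/pasta conflict.
shoppingConflict :: [String]
shoppingConflict = markConflict ["beer"] ["pasta"]
```

Printing `shoppingConflict` with `mapM_ putStrLn` reproduces the layout from the whatsnew example.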
FIXME:insert image here showing the conflict-marking patch
:TODO: introduce this section
The exponential merge problem
Unfortunately, the darcs 1 merge algorithm has the property that certain merges (merges that people have experienced in real life) are exponential in time with respect to the size of the conflict, measured in the number of conflicting patches. This leads to the problem some users have experienced, where they would do a darcs pull and, inexplicably, darcs would just sit there and hang...
So how does the new darcs 2 fix this problem? What's going on under the hood?
The notion of conflictors is essentially that we would have special patches that contain a list of the patches they conflict with.
Use of Generalised Algebraic Datatypes to improve code safety