Understanding Darcs/Patch theory and conflicts
|This guide uses notation and terminology developed during FOSDEM 2006, so it will be out of synch with the older notation/terminology from the darcs-conflicts mailing list archive|
Up to now, we have only dealt with merging patches that do not conflict with each other. The next question of interest is how darcs should behave when they do.
Consider the previous darcs hackathon example, where as usual, Arjan decides that the shopping list needs some beer. In this scenario, Ganesh decides that you can't live on apples and cookies alone and records a patch adding "pasta" to the s_list file. Now he wants to know what Arjan is up to, and so pulls the beer patch into his repository, but oh no! Arjan and Ganesh's patches conflict! How should darcs behave here?
The darcs answer is that both patches cancel each other out so that neither of them has any effect. The resulting shopping list has neither beer nor pasta. This might sound alarming, but it's not as bad as you might think. Darcs does not silently delete your code. After canceling the two patches out, it adds a third patch into your working directory which indicates both sides of the conflict so that you can select the one that you want. So any resolution you apply is a third patch which depends on the two conflicting ones. If you did darcs whatsnew on Ganesh's repository at this point, what you would get is something like this:
    v v v v v v
    beer
    -----------
    pasta
    ^ ^ ^ ^ ^ ^
How do we know we have a conflict?
It is intuitively obvious that Arjan's patch conflicts with Ganesh's, but intuition is useless if it does not translate into actual Haskell code. The first issue is thus that of knowing that we have a conflict in the first place.
All of this boils down to commutation. We have a conflict if commutation is not defined for the two patches. Let us briefly revisit the merge process described in the previous chapter. When Ganesh tries to pull Arjan's patch in, he tries to adapt the patch to his context by performing the following sequence: invert his own patch, apply Arjan's patch, commute the inverted patch with Arjan's patch, and discard the evil step sister of his inverted patch. As we know, inverting patches is easy. Ganesh's patch is inverted into something which removes 'pasta' from line 3 of the s_list file. On the other hand, when we try to commute that inverted patch against Arjan's patch, the commutation fails.
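The inversion step can be sketched in a few lines of Haskell. This is only an illustration: the `Hunk` type here is a hypothetical, simplified stand-in (a line number plus the lines removed and the lines added), not the type darcs actually uses.

```haskell
-- Hypothetical, simplified hunk: at a 1-based line number, replace the
-- `old` lines with the `new` lines. This is NOT darcs's real patch type.
data Hunk = Hunk Int [String] [String]
  deriving (Eq, Show)

-- Inverting a hunk swaps what it removes with what it adds.
invert :: Hunk -> Hunk
invert (Hunk line old new) = Hunk line new old

-- Ganesh's patch: add "pasta" at line 3 of s_list.
ganesh :: Hunk
ganesh = Hunk 3 [] ["pasta"]

-- Its inverse removes "pasta" from line 3 again.
ganeshInverse :: Hunk
ganeshInverse = invert ganesh
```

Inverting twice gives back the original hunk, which is what makes this step the easy part of the merge.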
Why? Simply because it is how we define commutation between the two types of patches. For instance, both Ganesh's and Arjan's patches are hunk patches. The commutation of two hunk patches of the same file is defined in darcs using Haskell code very similar to the following (simplified from PatchCommute.lhs):
    commuteHunk :: FileName -> (FilePatchType, FilePatchType) -> Maybe (Patch, Patch)
    commuteHunk f (p1@(Hunk line2 old2 new2), p2@(Hunk line1 old1 new1))
      | line1 + lengthnew1 <  line2            = Just ...
      | line1 + lengthnew1 == line2 && nonZero = Just ...
      | line2 + lengthold2 <  line1            = Just ...
      | line2 + lengthold2 == line1 && nonZero = Just ...
      | otherwise                              = Nothing
      where nonZero = lengthold2 /= 0 && lengthold1 /= 0
                   && lengthnew2 /= 0 && lengthnew1 /= 0
            lengthnew1 = length new1
            lengthnew2 = length new2
            lengthold1 = length old1
            lengthold2 = length old2
Only four cases are defined. The first two cases cover the situation where the hunk at line1 lies in an earlier part of the file than the hunk at line2 (even bumping up against it, as in the second case). The latter two cases cover the reverse situation (the hunk at line2 lies in an earlier part of the file than the hunk at line1). However, the case where the two hunks overlap simply does not fall into any of these possibilities, so the function returns Nothing. Thus we have a conflict on our hands.
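To see the guards at work on the shopping list, here is a minimal sketch that keeps only the yes/no part of the decision. The `Hunk` type and the `commutes` function are assumptions made for this illustration (the real code returns the commuted patches rather than a Bool); the guards themselves are copied from the excerpt, with the first argument playing the role of `Hunk line2 old2 new2` and the second that of `Hunk line1 old1 new1`.

```haskell
-- Hypothetical, simplified hunk type (not darcs's own):
-- line number, lines removed, lines added.
data Hunk = Hunk Int [String] [String]
  deriving (Eq, Show)

-- True exactly when one of the four guards from the excerpt fires.
commutes :: Hunk -> Hunk -> Bool
commutes (Hunk line2 old2 new2) (Hunk line1 old1 new1)
  | line1 + lengthNew1 <  line2            = True
  | line1 + lengthNew1 == line2 && nonZero = True
  | line2 + lengthOld2 <  line1            = True
  | line2 + lengthOld2 == line1 && nonZero = True
  | otherwise                              = False
  where
    nonZero = lengthOld2 /= 0 && lengthOld1 /= 0
           && lengthNew2 /= 0 && lengthNew1 /= 0
    lengthNew1 = length new1
    lengthNew2 = length new2
    lengthOld1 = length old1
    lengthOld2 = length old2

-- The inverse of Ganesh's patch and Arjan's patch both touch line 3.
removePasta, addBeer :: Hunk
removePasta = Hunk 3 ["pasta"] []
addBeer     = Hunk 3 [] ["beer"]

-- Two made-up hunks that are far apart in the file.
addMilk, addTea :: Hunk
addMilk = Hunk 1 [] ["milk"]
addTea  = Hunk 10 [] ["tea"]
```

With these definitions, `commutes removePasta addBeer` and `commutes addBeer removePasta` are both False (the conflict), while `commutes addTea addMilk` is True because the first guard fires.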
Now that we know we have a conflict, we need to deal with it in a sane manner. We not only want to resolve the conflict at hand, but to do so in a way which allows the conflict to propagate cleanly across an entire sequence of patches. Darcs is based on commutation, so in order to keep things running smoothly, we need to make sure that things continue to commute. So, we're going to define a secondary forced commutation operation that we only use when there is a conflict.
Recall the definition of commutation from the previous chapter:
|The commutation of patches $P$ and $Q$ is represented by $PQ \leftrightarrow Q'P'$. $P'$ and $Q'$ are intended to perform the same change as $P$ and $Q$|
The forced commutation is going to do something similar, but with a very odd twist. Instead of patches $P'$ and $Q'$ performing the same change as their respective ancestors $P$ and $Q$, forced commutation is going to give us patches, each of which makes the change that the other patch does. That is, normal commutation wants $P'$ to do roughly the same thing as $P$, but forced commutation makes it do the same thing as $Q$.
|operation|effect of $P'$|effect of $Q'$|
|normal commutation|$P$|$Q$|
|forced commutation|$Q$|$P$|
As a side note, we're going to need a little terminology to keep ourselves from tripping over our tongues. It's not very convenient to always talk about one patch making the same change as another patch, which is something we will be referring to a lot. So let us compress things a little bit. Instead of saying that patch $P'$ makes the same change as $P$, let us simply say that the effect of $P'$ is $P$. It is the same idea, but with slightly smoother terminology.
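The difference between the two operations can be made concrete with a toy model. Everything here is an assumption made for illustration: a patch is reduced to a label plus the name of the patch whose change it makes (its effect), the two inputs are called P and Q, and the results are marked with a prime. Real patches carry actual changes, and normal commutation can fail, which this sketch ignores.

```haskell
-- Toy patch: a label and the name of the patch whose change it makes.
data Patch = Patch { label :: String, effect :: String }
  deriving (Eq, Show)

-- A freshly recorded patch has itself as its effect.
original :: String -> Patch
original n = Patch n n

-- Mark a patch as the result of a commutation.
prime :: Patch -> Patch
prime x = x { label = label x ++ "'" }

-- Normal commutation: the order swaps, each patch keeps its own effect.
normalCommute :: (Patch, Patch) -> (Patch, Patch)
normalCommute (p, q) = (prime q, prime p)

-- Forced commutation: the order swaps AND the effects are exchanged.
forcedCommute :: (Patch, Patch) -> (Patch, Patch)
forcedCommute (p, q) =
  ( (prime q) { effect = effect p }
  , (prime p) { effect = effect q } )
```

Applied to `(original "P", original "Q")`, `normalCommute` yields Q' with effect Q followed by P' with effect P, whereas `forcedCommute` yields Q' with effect P followed by P' with effect Q.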
Forced commutation in merging
Let us see what the implications of this are for Ganesh and Arjan. We want to commute the inverse of Ganesh's patch ($G^{-1}$) against Arjan's patch ($A$). Since the two patches conflict, we have to resort to forced commutation, which produces two patches $A'$ and $(G^{-1})'$ with the following bizarre properties:
- the effect of $A'$ is $G^{-1}$; it removes Ganesh's "pasta" from the shopping list.
- likewise, the effect of $(G^{-1})'$ is $A$; it adds Arjan's "beer" to the shopping list.
This is all very convenient, because if I may remind you, what we're really after is cancelling out the patches. If we do the standard merging technique of simply removing $(G^{-1})'$ (so we don't add the beer after all), we will have successfully undone Ganesh's pasta patch. The merge is complete!
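The outcome of this merge can be traced on an actual list. The sketch below is a deliberately crude model, with assumed names throughout: patches are plain functions on a list of items, the starting list is the "apples and cookies" from the example, and the two results of forced commutation are written down directly by their effects rather than computed.

```haskell
import Data.List (delete)

type ShoppingList = [String]

-- Effects modelled as functions on the list (an assumption of this sketch).
addItem :: String -> ShoppingList -> ShoppingList
addItem item list = list ++ [item]

removeItem :: String -> ShoppingList -> ShoppingList
removeItem = delete

-- The list before either patch was recorded.
base :: ShoppingList
base = ["apples", "cookies"]

-- Ganesh's recorded patch.
ganeshPatch :: ShoppingList -> ShoppingList
ganeshPatch = addItem "pasta"

-- Results of forced commutation, described by their effects:
-- the kept patch removes "pasta", the discarded one would add "beer".
kept, discarded :: ShoppingList -> ShoppingList
kept      = removeItem "pasta"
discarded = addItem "beer"

-- Ganesh's repository after the merge: his patch followed by the kept patch.
merged :: ShoppingList
merged = kept (ganeshPatch base)

-- What the list would look like had the second patch not been dropped.
notDiscarded :: ShoppingList
notDiscarded = discarded merged
```

Here `merged` is `["apples", "cookies"]`: neither beer nor pasta, as described above. Keeping the discarded patch would instead give `["apples", "cookies", "beer"]`, letting Arjan's side win silently.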
Marking the conflict
But wait! We can't just leave things undone. How is the poor developer supposed to know if there is a conflict, if darcs handles them by undoing things? The answer is that we're not going to stop here. Undoing the conflict is a very important first step, as we will see in further detail below. Look at it this way. We know there was a conflict, because of the way commutation was defined, and we know which patches were involved in the conflict. So whenever this happens, we first undo everything, and then inspect the contents of the conflicting patches, and use that to create a new conflict-marking patch.
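As a rough sketch of that last step, the function below builds the marked-up lines from the two sides of a conflict. The name `markConflict` and the exact marker strings are assumptions taken from the whatsnew output shown earlier; the markers darcs really writes may differ.

```haskell
-- Build the lines of a conflict marker from the two conflicting sides.
-- Marker strings are copied from the example output, not from darcs itself.
markConflict :: [String] -> [String] -> [String]
markConflict sideA sideB =
  ["v v v v v v"] ++ sideA ++ ["-----------"] ++ sideB ++ ["^ ^ ^ ^ ^ ^"]

-- The beer/pasta conflict from the running example.
beerVersusPasta :: [String]
beerVersusPasta = markConflict ["beer"] ["pasta"]
```

Printing the result with `putStr (unlines beerVersusPasta)` reproduces the five-line block from the example, which the developer then edits down to the version they want.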
FIXME:insert image here showing the conflict-marking patch
:TODO: introduce this section
The exponential merge problem
Unfortunately, the darcs 1 merge algorithm has the property that certain merges -- merges that people have experienced in real life -- take time exponential in the size of the conflict (measured in number of conflicting patches). This leads to a problem that some users have experienced: they would do a darcs pull and, inexplicably, darcs would just sit there and hang...
So how does the new darcs 2 fix this problem? What's going on under the hood?
The notion of conflictors is essentially that we would have special patches that contain a list of the patches they conflict with.
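One way to picture this idea is as a data type with two kinds of patch. The sketch below is purely illustrative: the names `Prim`, `Normal`, `Conflictor` and `conflictsWith` are chosen for this example, and this is not the representation darcs 2 actually uses.

```haskell
-- Hypothetical primitive changes: add or remove one line at a position.
data Prim = AddLine Int String | RemoveLine Int String
  deriving (Eq, Show)

-- A patch is either an ordinary change, or a conflictor that remembers
-- the list of changes it conflicts with alongside its own change.
data Patch = Normal Prim | Conflictor [Prim] Prim
  deriving (Eq, Show)

-- The changes a patch is in conflict with (none for an ordinary patch).
conflictsWith :: Patch -> [Prim]
conflictsWith (Normal _)        = []
conflictsWith (Conflictor ps _) = ps

isConflicted :: Patch -> Bool
isConflicted = not . null . conflictsWith

-- Arjan's beer patch as seen in Ganesh's repository: it conflicts with pasta.
beerInGaneshRepo :: Patch
beerInGaneshRepo = Conflictor [AddLine 3 "pasta"] (AddLine 3 "beer")
```

Because the conflict information travels with the patch itself, it can be read off directly with `conflictsWith` rather than rediscovered by commuting.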
Use of Generalised Algebraic Datatypes to improve code safety