One of the under-appreciated aspects to software development is the ability to produce a good patch. Not just the coding–making sure the actual patch itself is of good quality. The backporting theme I’ve been discussing recently here all starts with the assumption that you can move changes from a newer version of software into an older one. That invariably involves some conflicts, but one of the best ways to avoid them is to make sure each of the individual commits–and accordingly the patches they represent–are as clean as possible. Figuring out just how to extract a good backport from a change to a newer feature is an art form, and it’s easier if the patches you start with are well constructed.
Periodically newcomers to PostgreSQL submission get roasted for not following the project’s coding guidelines. Most of these situations could have been resolved by the submitter, without that embarrassment/harassment, had they looked at the patch at all with the intent of cleaning it up before submission. The last time this popped up in our community, I got lumped in as one of the “formatting curmudgeons”, a title I was proud to wear. It took me a while to really appreciate how important patch diff elegance is to building large pieces of software reliably, and to develop my own good techniques for doing so.
I’m not a fan of making things more difficult than they have to be for submitters to projects I work on though, so the one part I realized I could do better is helping others learn this material. While there are plenty of tutorials on how to use tools like git, I don’t know that I’ve ever seen one that talks about the minutia of using diff to make good patches. We curmudgeons are quick to say things like “blend in with the surrounding code” or “remove spurious whitespace”. But no one even talks about how you do that, exactly, in a way I’d expect newcomers to understand. That’s why I just wrote a new tutorial called Creating Clean Patches, intended as supplemental reading to Submitting a Patch. Once you see what a bad patch looks like, compared to a clean one that would be reviewed only on its code quality merits, some of these fuzzy guidelines are much easier to understand. I think that article might even be useful outside of the PostgreSQL community. Plenty of other open-source projects struggle with the same issue.
Very good wiki page. I would also show an example that I’ve seen a lot: diff doesn’t choose the most natural context to show a patch. For example, suppose I change this snippet:
if ( foo ) {
bar;
}
to this:
if ( foo ) {
bar;
}
else {
baz;
}
The patch sometimes comes out like this:
+ }
+ else {
+ baz;
which is not helpful, especially in more complex cases. Such a patch might have to be edited manually to show only the new block, from its beginning to its end.
@xaprb
The problem you describe is addressed by patience diff, which is available through git diff – you need only give the –patience flag. Sending a patience diff to -hackers is acceptable, and I would argue produces a better diff (few would disagree, but there is a lack of awareness of the better alternative).
There’s a pretty good explanation of it here:
http://bramcohen.livejournal.com/73318.html