If you’ve ever “watched” a busy GitHub repository, your email inbox has discovered what it feels like to step in front of a firehose. If the project in question has active code reviewers, the problem is often worse by an order of magnitude: every comment yields another email to all watchers. The CockroachDB repository averages 81 pull requests and 440 notification-generating comments per week.
Most of us who once paid close attention to incoming changes have since lost the ability to do so; these days, monitoring the stream requires a superhuman effort. The mere mortals among us can only pay attention to the pull requests we’ve authored or are tasked with reviewing. What’s surprising is that the watching functionality provided by GitHub is so coarse-grained. The dial apparently only has settings for “0” and “11”.
A search for GitHub digests yields some choices. Diffmatic is neat, as is this open source Ruby digest from the folks at Heroku.
Taking a cue from these, I decided to spend a Flex Friday repurposing my previous efforts to analyze our GitHub stargazers to build repo-digest, a GitHub pull request digester which provides a daily appraisal of PRs in a concise and parsable format.
repo-digest provides daily digesting of one or more GitHub repos. By default, the digest includes all pull requests which were opened or closed within the past 24 hours, but this is something you can tinker with to suit your preferences via the --since command line flag. The digest sorts all pull requests – both open and closed – in descending order, by total changes, to highlight consequential PRs.
repo-digest --since=2016-02-24T19:00:00-05:00 --repos=cockroachdb/cockroach,cockroachdb/docs --token=f87456b1112dadb2d831a5792bf2ca9a6afca7bc
How does it work? It’s a straightforward usage of the GitHub API. Pull requests are queried in reverse date order from the comma-separated list of repositories specified via the --repos flag. Each pull request is queried in turn to get more details and its list of changed files. Changed files are again queried in turn to get addition and deletion counts for each.
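To make the last step concrete, here is a rough sketch of fetching the changed files for one pull request. It relies on GitHub's REST endpoint GET /repos/{owner}/{repo}/pulls/{number}/files, which returns a JSON array whose entries include filename, additions, and deletions. Everything else is an assumption for illustration: the fetchFiles helper, the ChangedFile type, and the local stand-in server (with the made-up repository example/repo) that lets the sketch run without network access or a token. It is not how repo-digest itself is written.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ChangedFile mirrors a few fields of the objects returned by GitHub's
// "list pull request files" endpoint.
type ChangedFile struct {
	Filename  string `json:"filename"`
	Additions int    `json:"additions"`
	Deletions int    `json:"deletions"`
}

// fetchFiles is a hypothetical helper: it GETs
// {baseURL}/repos/{repo}/pulls/{number}/files and decodes the JSON array.
// Against the real API, baseURL would be https://api.github.com.
func fetchFiles(client *http.Client, baseURL, repo string, number int, token string) ([]ChangedFile, error) {
	url := fmt.Sprintf("%s/repos/%s/pulls/%d/files", baseURL, repo, number)
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if token != "" {
		req.Header.Set("Authorization", "token "+token)
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s: %s", url, resp.Status)
	}
	var files []ChangedFile
	if err := json.NewDecoder(resp.Body).Decode(&files); err != nil {
		return nil, err
	}
	return files, nil
}

// newFakeGitHub starts a local stand-in server with canned data so the
// sketch runs offline. The repository and file names are made up.
func newFakeGitHub() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/repos/example/repo/pulls/1/files", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `[{"filename":"sql/parser.go","additions":120,"deletions":30},
			{"filename":"README.md","additions":2,"deletions":1}]`)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newFakeGitHub()
	defer srv.Close()

	files, err := fetchFiles(srv.Client(), srv.URL, "example/repo", 1, "")
	if err != nil {
		panic(err)
	}
	adds, dels := 0, 0
	for _, f := range files {
		adds += f.Additions
		dels += f.Deletions
	}
	fmt.Printf("+%d -%d across %d files\n", adds, dels, len(files))
}
```

Passing the base URL and client as parameters keeps the request logic testable; a real run would also need to follow pagination for pull requests that touch many files.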
The digest provides additional insight into the focus of a pull request by listing the most important subdirectories. File changes are tallied for each subdirectory, and those which comprise the top 80th percentile are deemed representative of the pull request and listed immediately underneath the additions and deletions totals.
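One plausible reading of that rule, sketched in Go: sum the changes per subdirectory, sort the subdirectories from most to least changed, and keep taking them until the running total covers 80% of all changes. The exact cutoff repo-digest uses may differ; the FileChange and DirTally types, the topSubdirs helper, and the sample file names are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// FileChange is a hypothetical record of one changed file.
type FileChange struct {
	Filename string
	Changes  int // additions plus deletions
}

// DirTally is the summed change count for one subdirectory.
type DirTally struct {
	Dir     string
	Changes int
}

// topSubdirs returns the most-changed subdirectories whose combined changes
// first reach the given fraction (for example 0.8) of all changes.
func topSubdirs(files []FileChange, fraction float64) []DirTally {
	byDir := map[string]int{}
	total := 0
	for _, f := range files {
		byDir[path.Dir(f.Filename)] += f.Changes
		total += f.Changes
	}

	tallies := make([]DirTally, 0, len(byDir))
	for dir, changes := range byDir {
		tallies = append(tallies, DirTally{Dir: dir, Changes: changes})
	}
	// Largest first; break ties by name so the output is deterministic.
	sort.Slice(tallies, func(i, j int) bool {
		if tallies[i].Changes != tallies[j].Changes {
			return tallies[i].Changes > tallies[j].Changes
		}
		return tallies[i].Dir < tallies[j].Dir
	})

	var top []DirTally
	covered := 0
	for _, t := range tallies {
		if float64(covered) >= fraction*float64(total) {
			break
		}
		top = append(top, t)
		covered += t.Changes
	}
	return top
}

func main() {
	files := []FileChange{
		{Filename: "sql/parser/parse.go", Changes: 250},
		{Filename: "sql/parser/lex.go", Changes: 100},
		{Filename: "storage/engine.go", Changes: 100},
		{Filename: "docs/README.md", Changes: 40},
		{Filename: "Makefile", Changes: 10},
	}
	for _, t := range topSubdirs(files, 0.8) {
		fmt.Printf("%s (%d changes)\n", t.Dir, t.Changes)
	}
}
```

With this sample data, sql/parser and storage together account for 450 of 500 changes, so they are listed and the small docs and top-level edits are left out.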
The images shown here have been styled to match Cockroach Labs branding, but the template is flexible and very easy to customize. I used golang’s nifty templating language.
The default template is generically styled. You can create and specify your own template using --template=. repo-digest automatically inlines CSS styles to make the output suitable for sending via email.