Seb Jacobs, a Developer at FutureLearn, discusses how we maintain a useful Git history and outlines five principles we live by when making code commits.
Development started on the FutureLearn platform on 2 April 2013 and since that time, our codebase has gone through rapid changes, covering over 10,000 commits.
With a rapidly changing codebase and a growing development team, being able to communicate how and why your code evolves over time is crucial.
In order to manage the complex process of changing code, we believe not only in having well-factored code and tests, but also in a simple Git history, which allows us to make informed decisions about future code changes.
Telling stories with your Git history
Our Git history is a living, ever-changing, searchable record that tells the story of how and why our code is the way it is. The ability to document code effectively using Git (or another version control system) is just as important as being able to ship a feature, write clean code or readable tests.
Although your code should be self-documenting, it doesn’t tell the story of why the code is the way it is or how it came to be.
You may be used to more conventional methods of code documentation such as a Readme or a Wiki; however, these forms of documentation can often become out of date.
The other method of code documentation that comes to mind is code comments. These can be noisy, are not always relevant, and also tend to become out of date. I believe you should apply the same separation of concerns that you apply to code design: use Git for the documentation and the code for the code.
Here are five principles we live by at FutureLearn when it comes to Git:
1. Atomic commits
Large commits can be difficult to read, especially when they contain changes which are unrelated. I’ll admit that it is often difficult to think about how you might split up a large commit, but I find it helps to think about the purpose of each commit.
Think of atomic commits as the smallest amount of code changed which delivers value – whether it’s tidying up existing code or introducing a new (small) feature.
    commit: [REDACTED]
    Date: [REDACTED]

        Allow educators to invite users onto courses.

    61 files changed, 937 insertions(+), 81 deletions(-)

In this example, a single commit contains a lot of changes. On the surface it might look like all these changes are related. However, if you were to break down the changes, you would have a better sense of what’s going on, for example:

    a6455f8 Record when enrolment is created via an invitation.
    b529f6d Allow invited users to enrol on courses.
    b5bb6e4 Allow invited users to see the course description page.
    c829cbc Send enrolment invitation emails in batches of 1000.
    5feaccf Allow educators to invite users onto courses.
Now that this example has been split up, it starts to become more useful; not only is each commit smaller, it also gives you a better story of what has happened.
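As a rough illustration of the idea, the sketch below makes two unrelated changes at once and then commits them separately. The throwaway repository, the file names and the committer identity are all made up for the example; only the commit titles echo the ones above.

```shell
#!/bin/sh
set -eu

# Throwaway repository so the sketch never touches real work (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Starting point: two files committed together.
printf 'invite users\n' > invitations.txt
printf 'enrol users\n' > enrolments.txt
git add invitations.txt enrolments.txt
git commit -q -m "Add invitation and enrolment notes"

# Two unrelated changes land in the working tree at the same time.
printf 'send emails in batches\n' >> invitations.txt
printf 'record invitation source\n' >> enrolments.txt

# Stage and commit each change on its own, so every commit has one purpose.
git add invitations.txt
git commit -q -m "Send enrolment invitation emails in batches"
git add enrolments.txt
git commit -q -m "Record when enrolment is created via an invitation"

commit_count=$(git rev-list --count HEAD)
last_commit_files=$(git diff-tree --no-commit-id --name-only -r HEAD)
git log --oneline
```

When unrelated changes sit in the same file, `git add --patch` lets you stage them hunk by hunk instead of file by file.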
2. Useful commit messages
Writing commit messages can be difficult, but it helps to have a purpose in mind. We’re already breaking the changes down into atomic commits, meaning we should have a good idea of the value of each commit.
If there’s one thing to remember, it is to explain why you’ve made the change in the first place. This is the perfect opportunity to reflect on what you’re doing and to provide context – whether it’s to satisfy a user requirement, to fix a bug or to make another change easier to make in the future.
I find it helps to look at the commit from the perspective of another developer. What questions might they be asking when looking at your code changes? What might not be immediately obvious?
In terms of good practice, the following template is a good start. However, bear in mind that every commit is different.
    Short one line title.

    An explanation of the problem, providing context (this may be as
    simple as a reference to the user story).

    Longer description of what the change does.

    An explanation of why the change is being made.

    Perhaps a discussion of alternatives that were considered.
The first line should be used to explain the value of the changes, rather than focussing on the implementation details. By keeping this concise, we allow the reader to easily scan over the commit and find the code changes they are interested in.
The rest of the Git commit message depends on the change, but it’s always useful to explain what you’ve changed and why you’ve changed it.
In some cases, I find it also helps to provide further context, such as explaining alternative solutions you’ve ruled out or providing external references.

    Correct the colour of FAQ link in course notice footer

    PT: https://www.pivotaltracker.com/story/show/84753832

    In some email clients the colour of the FAQ link in the course notice
    footer was being displayed as blue instead of white. The examples
    given in PT are all different versions of Outlook. Outlook won't
    implement CSS changes that include `!important` inline. Therefore,
    since we were using it to define the colour of that link, Outlook
    wasn't applying that style and thus simply set its default style
    (blue, like in most browsers). Removing that `!important` should fix
    the problem.

    https://www.campaignmonitor.com/blog/post/3143/outlook-2007-and-the-inline-important-declaration/
This example has a clear headline, outlines the problem and the developer’s intent, and provides context around the change.
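One way to write a multi-paragraph message like this without opening an editor is to feed it to `git commit` on standard input. The sketch below is only an illustration: the repository, the `footer.css` file and the shortened message are invented, and the blank line after the first line is what separates the title from the body.

```shell
#!/bin/sh
set -eu

# Throwaway repository with a made-up stylesheet (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'a.faq { color: white; }\n' > footer.css
git add footer.css

# -F - reads the whole commit message from standard input.
git commit -q -F - <<'EOF'
Correct the colour of FAQ link in course notice footer

Some email clients displayed the link as blue instead of white
because they ignore inline `!important` declarations.

Removing `!important` lets those clients apply our colour.
EOF

# %s is the title line, %b is everything after the first blank line.
subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
printf 'Title: %s\n' "$subject"
printf 'Body:\n%s\n' "$body"
```

Because the title stands on its own line, `git log --oneline` keeps the history easy to scan while the full explanation stays one `git show` away.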
3. Revise history before sharing
When developing your code, you are bound to change direction or even make mistakes (most commonly introducing typos or bugs).
    324d079 Fix typo in enrolment flash message
    3a85f77 Only display enrol button for users who can enrol
    4cc4778 Allow users to enrol on courses
In this example, the developer has introduced a typo in their first commit, which they have fixed in a later commit.
Before you share your commit history, it’s important to think about what is useful information for someone else to read. You shouldn’t think of your Git history as a “truthful” log of what you worked on step-by-step. Just as we refactor code, we should refactor our commits before sharing them with others.
The power of Git makes it simple to re-order, reword and refactor your commits until they tell the clearest story possible.
Git’s interactive rebasing functionality allows us to tell a clearer story:
    $ git rebase --interactive

    3a85f77 Only display enrol button for users who can enrol
    773e345 Allow users to enrol on courses

By reducing the amount of noise in your commit history, you save yourself and your team time.
If you’re struggling with this process, I find that using Pull Requests is a great way to collaborate on shaping the commit history of your feature branch.
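The typo example above can be reproduced end to end with `git commit --fixup` and `git rebase --interactive --autosquash`. This is a sketch under assumptions rather than a description of our exact workflow: the repository and file names are hypothetical, and `GIT_SEQUENCE_EDITOR=true` is used only so the script accepts the generated todo list without opening an editor.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'courses\n' > README
git add README
git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)

# First feature commit, with a typo in the flash message.
printf 'You are now enroled\n' > flash.txt
git add flash.txt
git commit -q -m "Allow users to enrol on courses"
typo_commit=$(git rev-parse HEAD)

printf 'show button if user can enrol\n' > button.txt
git add button.txt
git commit -q -m "Only display enrol button for users who can enrol"

# Fix the typo and mark the fix as belonging to the earlier commit.
printf 'You are now enrolled\n' > flash.txt
git add flash.txt
git commit -q --fixup="$typo_commit"

# --autosquash moves the fixup next to its target and squashes it in;
# GIT_SEQUENCE_EDITOR=true accepts the todo list unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -q --interactive --autosquash "$base"

commit_count=$(git rev-list --count HEAD)
fixed_message=$(git show HEAD~1:flash.txt)
git log --oneline
```

Run interactively, without the `GIT_SEQUENCE_EDITOR` override, the same rebase opens the todo list so you can also reorder or reword commits by hand.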
4. Single purpose branches
Long-lived feature branches can often be difficult to keep up to date with master. I find that it is important to think about the purpose and scope of your feature branch.
Although you might be working on a single feature (user story), this doesn’t necessarily mean that you can’t deliver value in stages.
    5ce95fb Notify educators when an invitation has been accepted.
    5ce95fb Refactor specs around enrolment invitations.
    ee95245 Extend enrolment invitation to educators.
    cfb2fb4 Tidy up whitespace in enrolment invitations spec.

As in this example, you may find that you are having to change/refactor key areas of the codebase in order to implement the feature you are developing. These changes can often provide a clear benefit that isn’t directly related to the feature branch, and can be landed on master separately (and earlier).

    $ git cherry-pick cfb2fb4 5ce95fb

    * 0564508 Merge branch 'educator-enrolment-invitations'
    |\
    | * 5ce95fb Notify educators when an invitation has been accepted.
    | * ee95245 Extend enrolment invitation to educators.
    | |
    |/
    * 5ce95fb Refactor specs around enrolment invitations.
    * cfb2fb4 Tidy up whitespace in enrolment invitations spec.
By splitting up your feature branches, you not only reduce the pain of merging each branch, you also deliver value sooner and make your Git history more readable.
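To make the cherry-pick step concrete, here is a small sketch that lands a refactoring commit on master ahead of the feature that needed it. The repository and file names are hypothetical; the branch name and commit titles are borrowed from the example above.

```shell
#!/bin/sh
set -eu

# Throwaway repository with an explicit master branch (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'spec\n' > invitations_spec.txt
git add invitations_spec.txt
git commit -q -m "Add enrolment invitations spec"

# Feature branch: one refactoring commit, then the feature itself.
git checkout -q -b educator-enrolment-invitations
printf 'spec tidied\n' > invitations_spec.txt
git commit -q -am "Refactor specs around enrolment invitations."
refactor=$(git rev-parse HEAD)
printf 'notify educators\n' > notifications.txt
git add notifications.txt
git commit -q -m "Notify educators when an invitation has been accepted."

# Land only the refactoring on master, ahead of the feature.
git checkout -q master
git cherry-pick "$refactor" >/dev/null

# Rebase the feature branch; the change already on master is not replayed.
git checkout -q educator-enrolment-invitations
git rebase -q master

remaining=$(git rev-list --count master..educator-enrolment-invitations)
master_subject=$(git log -1 --format=%s master)
git log --oneline --graph --all
```

After the rebase, the feature branch carries only the commit that is still waiting to be merged, which keeps the eventual merge small.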
5. Keep your history linear
Often, merging changes into master can result in your history becoming tangled and difficult to read.
    * ce91a05 Merge branch 'reprint-statements'
    |\
    | * ae43ad0 Disable reprint link for refunded purchases.
    | * 0b1abb0 Allow admins to flag purchases for re-printing.
    * | 35d0357 Put dates formats in the pattern library
    * | 275206c Merge branch 'fulfilment-attempt'
    |\ \
    | * | 7aae45b Populate `fulfilled?` for existing purchases.
    | * | 8e461b1 Display purchase fulfilment attempts to admins.
    * | | 1adc0a9 Reduce padding around the course run date
    | |/
    |/|
This becomes even more of an issue when you have several feature branches being developed in parallel.
    * ce91a05 Merge branch 'reprint-statements'
    |\
    | * ae43ad0 Disable reprint link for refunded purchases.
    | * 0b1abb0 Allow admins to flag purchases for re-printing.
    |/
    * 35d0357 Put dates formats in the pattern library
    * 275206c Merge branch 'fulfilment-attempt'
    |\
    | * 7aae45b Populate `fulfilled?` for existing purchases.
    | * 8e461b1 Display purchase fulfilment attempts to admins.
    | * 44cbfd0 Introduce Fulfilment attempts
    |/
    * 1adc0a9 Reduce padding around the course run date

When it comes to merging, we try to preserve our merge commits and to rebase our feature branches before merging.

    $ git checkout reprint-statements
    $ git rebase master
    $ git checkout master
    $ git merge --no-ff reprint-statements
This allows us to group related commits together while keeping our merges clean, making it a lot easier to identify when a particular change was introduced.
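The sketch below runs the same rebase-then-merge sequence in a throwaway repository and then reads the result back with `git log --first-parent`, which follows only the first parent of each merge and so shows one line per merged branch or direct commit. The repository and file names are hypothetical, and the `-m` flag is there only so the merge does not open an editor.

```shell
#!/bin/sh
set -eu

# Throwaway repository with an explicit master branch (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'dates\n' > dates.txt
git add dates.txt
git commit -q -m "Put dates formats in the pattern library"

# Feature branch with two commits.
git checkout -q -b reprint-statements
printf 'flag for re-printing\n' > reprint.txt
git add reprint.txt
git commit -q -m "Allow admins to flag purchases for re-printing."
printf 'disable link when refunded\n' >> reprint.txt
git commit -q -am "Disable reprint link for refunded purchases."

# Meanwhile, master moves on.
git checkout -q master
printf 'padding\n' > padding.txt
git add padding.txt
git commit -q -m "Reduce padding around the course run date"

# Rebase the branch onto master, then merge with an explicit merge commit.
git checkout -q reprint-statements
git rebase -q master
git checkout -q master
git merge -q --no-ff -m "Merge branch 'reprint-statements'" reprint-statements

total_commits=$(git rev-list --count master)
first_parent_commits=$(git rev-list --count --first-parent master)
git log --graph --oneline
git log --first-parent --oneline master
```

The full graph keeps the two feature commits grouped under their merge, while the first-parent view gives a short, linear summary of what landed on master and in which order.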
We also find it often helps to commit smaller changes directly onto master.
If you spend as much time ensuring your commits are well-factored as you do refactoring your code and tests, it will save you and your team time and pain in the future.
How do you use Git? What sort of problems do you think it solves? Let us know in the comments below.
Want to know more about how we use Git? Watch a talk from our CTO, Joel Chippindale.