In this post, James Coglan, a Developer at FutureLearn, explains null objects and the Law of Demeter.
In this post, James Coglan, a Developer at FutureLearn, explains null objects and the Law of Demeter.
Last week we had a bug to fix that prompted a useful refactoring. In order to display a page with partial information, we wanted to introduce a null object. But, the code as it stood required the planned null object to provide a complicated interface, and we used the Law of Demeter to remove the complexity.
But first, some context. When a learner takes one of our courses, they can buy a Statement of Participation (henceforth referred to simply as Statement). The Statement carries various pieces of information about the learner, the course they took, the organisation that runs the course, and so on. One of these pieces of information consists of either one or two signatories, who are usually the lead educators on the course. They are represented on the Statement by their name, title, and an image of their signature.
As part of the course authoring process, the educators must upload this information to us, and we provide a preview of the Statement containing the information they’ve provided. Sometimes, users request a preview when they’ve not filled in all the required information, and this was causing an error in production: the preview would fail if there wasn’t a signatory for the Statement.
The source of the bug
Here’s a snippet of the class that was generating the error; there is much more code in the class than this but this is what’s relevant to the current discussion:
class PrintedStatementPresenter def initialize(statement) @statement_presenter = StatementPresenter.new(statement) end def primary_signatory_signature @statement_presenter.primary_signatory_signature.vector.read.html_safe end end
PrintedStatementPresenter wraps another presenter
StatementPresenter. We generate both printed and digital versions of the Statement and
StatementPresenter provides shared functionality for both these uses.
StatementPresenter wraps an ActiveRecord model called
statement in the above code) that relates all the data about the purchased Statement: who bought it, which course they were taking, who the signatories are, and so on. It is receiving a call to
primary_signatory_signature in the above code, and that method is defined as follows, again omitting much unrelated code:
class StatementPresenter def initialize(statement) @statement = statement @statement_template = statement.purchasable end delegate :name, :title, :organisation_title, :signature, to: :primary_signatory, prefix: :primary_signatory def primary_signatory SignatoryPresenter.new(@statement_template.primary_signatory, @statement_template.course.organisation) end end
delegate call generates several methods, one of which is
primary_signatory_signature, whose implementation is to return the result of
primary_signatory defined via a method as a
SignatoryPresenter wrapping a signatory and organisation from the statement template. If
statement is the complete Statement containing the learner and course details, then you can think of the template as the line item the learner has effectively bought (hence
purchasable) – the Statement, minus their own personal information, but still with all the course and signatory information.
SignatoryPresenter itself is a tiny little class extracted from
StatementPresenter when we started having multiple signatories, and it looks like this:
class SignatoryPresenter def initialize(signatory, organisation) @signatory = signatory @organisation = organisation end delegate :name, :title, :signature, to: :@signatory def organisation_title if @signatory.organisation_title.present? @signatory.organisation_title else @organisation.title end end end
Mostly this class transparently delegates to a
Signatory model, but it does define one method itself:
organisation_title. This method encapsulates the decision about what to display as the signatory’s affiliated organisation: if they have one, we’ll show that, otherwise we’ll show the course’s organisation, on the assumption that the signatory belongs to the same organisation as is running the course. This method will make for an interesting comparison later.
Now we have got to the source of the bug: remember that
PrintedStatementPresenter was calling:
We’ve found out that
primary_signatory_signature is delegated by
SignatoryPresenter, which delegates
signature to its
@signatory field, which was passed in by
StatementPresenter by calling
@statement_template.primary_signatory. If this final value is
nil, that is, if the educator has not yet filled in the signatory information, then
SignatoryPresenter will get a
NoMethodError when calling
@signatory.signature. This error will bubble back through
PrintedStatementPresenter to the HTML template that’s doing the actual rendering.
The trouble with nil
In the Rails world, a common solution to this problem is to use
Object#try and replace the offending delegate in
class SignatoryPresenter def signature @signatory.try(:signature) end end
But this could still leave a
nil bubbling back to
PrintedStatementPresenter so we’d better protect everything after the signature too:
While this might silence the
NoMethodError, it doesn’t lead to a cohesive design or explain the intent of this code, nor does it allow you to do something other than return
nil. It also produces bad error reports: rather than the exception being reported when the
nil is first produced – when
@statement_template.primary_signatory – it is first produced one layer downwards – when
@signatory.signature – and then is allowed to bubble three layers upwards – back through
PrintedStatementPresenter to the template that’s driving the whole call. Stack traces like that are terrible for finding out where the real problem is, and using
Object#try only lets the
nil travel further through the system before something explodes in a galaxy far, far away.
A common solution to this problem is the Null Object pattern. In
StatementPresenter where the
nil is first produced, we can supply a different object instead:
class StatementPresenter def primary_signatory signatory = @statement_template.primary_signatory || NullSignatory.new SignatoryPresenter.new(signatory, @statement_template.course.organisation) end end
SignatoryPresenter will always have a value to work with. But what methods does
NullSignatory need to implement? Well, we can see from
SignatoryPresenter that it must need at least
organisation_title (which is allowed to be
SignatoryPresenter wraps it). These are just strings and can quite easily be implemented to return placeholder text:
class NullSignatory def name '[Signatory name not yet given]' end def title '[Signatory title not yet given]' end def organisation_title nil end end
We can also see from
SignatoryPresenter that it must implement
signature, but that is a more complicated value. In
PrintedStatementPresenter, we end up calling
html_safe part suggests that what comes before it must be a string, but what about the
.vector.read bit? Even if we don’t know what those are yet, we know that we’re going to need to implement something that makes those calls work sufficiently, that is, something like this:
class NullSignatory def signature NullSignature.new end class NullSignature def vector NullVector.new end end class NullVector def read 'some placeholder SVG data?!' end end end
All these classes just to make rendering a signature safe from
nil? This code’s starting to smell a bit.
A clue from Demeter
The complexity of the last snippet of code could indicate one of two things: either we’re trying to implement the wrong type for our null object, or we’ve picked the right type but its interface is too complex. In this case, I think we’ve picked the right type: the signatory is the thing that can be null; either it exists with a
signature, or it does not. Those fields cannot individually be null, as we can see if we look at the class’s validations:
class Signatory < ActiveRecord::Base mount_uploader :signature, SignatureUploader validates :signature, :name, presence: true end
In this case, this is the entire class; I have not elided anything. This is another advantage: small classes have small interfaces, which means fewer methods for us to implement in our null class.
mount_uploader call means that
signature is an uploaded file, managed by CarrierWave, and its uploader looks like this:
class SignatureUploader < CarrierWave::Uploader::Base version :vector do # config for this file version end # more methods (elided) end
So now we know what
primary_signatory_signature.vector.read was all about:
primary_signatory_signature is a
vector gets us a specific version of the uploaded file (we generate vector images from bitmap formats that users upload); and
read, well, reads it.
But this also gives us a clue about the nature of our design problem. Let’s return to the
PrintedStatementPresenter where this all began:
The Law of Demeter is also known as the “two dots” rule: a rule of thumb is that you shouldn’t have two or more “dots” (i.e. method calls) chained together in a single expression. The truth is more subtle than that but it’s a good rule of thumb for spotting problematic code. Let’s examine the problem with this line.
The Law of Demeter is really about types: an object should only talk to (that is, call methods on) its “immediate neighbours”, that is, objects that either:
- are provided to the class via its constructor;
- are passed into the class’s methods as an argument;
- or, are directly created by the class by calling
newor a factory function.
This is intended to stop a class becoming directly coupled to too many other classes in the system; the fewer classes depend directly on one another, the less code you need to change if you change any single one of them. It also drastically limits how far an error can travel before being detected, among other benefits. (A more detailed explanation of this appears in The Law of Demeter: Some archaeological notes.)
Let’s look at our line of code through this lens, considering that our overall chain of delegation looks like this:
PrintedStatementPresenter | V StatementPresenter | V SignatoryPresenter | V Signatory | V SignatureUploader | V FileVersion
The upper three classes are “view-level” classes responsible for rendering things, and the lower three are “model-level”, responsible for storing and retrieving persisted data.
Here’s that line from
We’re allowed to call
@statement_presenter is a
PrintedStatementPresenter constructs itself. That call is delegated to
SignatoryPresenter and then to
Signatory and returns the
SignatureUploader. However, we then call
vector on that to get a
FileVersion, which we call
read on to get a
String. (We also call
html_safe on that
String and get a
SafeBuffer but I’ll discuss that separately.)
PrintedStatementPresenter is coupled to at least two types accidentally:
FileVersion. Yes, in Ruby-land we prefer to think in terms of “duck typing” and we’re only talking about two method calls, but this analysis explains why our
NullSignatory is too complicated.
Fixing the problem
Now we’ve identified the problem, how should we fix it? Well, what’s our original problem? The interface required for
NullSignatory is too complicated. Which “real” object is
NullSignatory standing in for? The
Signatory class, i.e. the value that can be null. What we’d like to do, therefore, is push the complexity in
PrintedStatementPresenter far enough down the stack that this interface becomes simpler.
html_safe part for a second, we can make the top-level call Demeter-compliant by replacing it with:
primary_signatory_signature.vector.read has been replaced with a single method call, returning a
String. This removes the coupling to
Now the question is, how should
primary_signatory_signature_svg be implemented? To answer that we need to look at what we’re trying to hide: coupling to classes related to CarrierWave, that is, classes lower down our chart than
Signatory. This is no coincidence: the entire problem with the
Signatory interface is that it leaks complicated values! We can fix the problem by hiding these details and having
Signatory return the thing we’re actually interested in: a string of SVG.
class Signatory < ActiveRecord::Base mount_uploader :signature, SignatureUploader def signature_svg signature.vector.read end end
This hides all interaction with CarrierWave-derived objects inside the
Signatory model, where the file upload functionality is defined. Our design is now more cohesive, because this concern is not spread out over the system. Why should a view-level class know it needs to read a file to get its data? What if S3 is down and the read fails? We probably don’t want to handle that error so high up the stack.
It also simplifies the interface required by callers of this class, so
NullSignatory is now much simpler. We just need it to implement
signature_svg to return a string.
class NullSignatory def signature_svg # return placeholder string or default SVG end end
PrintedStatementPresenter can just delegate to this method rather than bubbling the whole signature object up the stack.
What about html_safe?
When we simplified the method chain in
PrintedStatementPresenter, we removed
.vector.read but left in a call to
String#html_safe. Isn’t that a Demeter violation?
Well, it might be, but remember that part of the point of Demeter, and certainly the part we’re using here, is to limit the extent of changes to your system. The built-in data types and standard library are less prone to change than your application code, so I’m less concerned about isolating myself from them. This is still a matter of judgement; you probably don’t want your rendering code talking directly to, say,
String? I’m fine with that. Strictly speaking,
String#html_safe is provided by Rails, but it’s existed for so long that I’m also not worried about it changing.
There’s also the issue of what
html_safe is for. It marks a string as safe for concatenating with HTML without needing to be escaped. It’s my opinion that that assertion should happen as close as possible to where that concatenation is happening, i.e. close to the rendering layer rather than deep in the models. The model might know the SVG is safe, but putting the assertion down there effectively spreads an encoding function (converting text to HTML) out over the system and makes it harder to reason about. Validation and encoding should happen at the edge of the system, where parsing and serializing take place.
A little reflection
Along the way, we saw a good example of the pattern we ended up implementing to solve our problem.
SignatoryPresenter#organisation_title exists to encapsulate a conditional that would otherwise end up in the top-level template. Indeed, without this method,
SignatoryPresenter would just be a transparent proxy around
Signatory and would be largely pointless. Its existence means the template has a simple, flat interface to talk to, removing complexity from the rendering layer.
That method exists in
SignatoryPresenter whereas we’ve added our
signature_svg method directly to
Signatory. Shouldn’t we have put it in the presenter? To me, this is a question of looking at which types we’re bringing into contact when we add methods.
organisation_title requires knowledge of
Organisation, which is several hops round the ActiveRecord association graph from
Signatory. So adding the method there would serve to introduce two classes that are currently distant from one another. In contrast,
signature_svg requires knowledge of CarrierWave files, which is already present in
Signatory and not anywhere else, since
Signatory owns that uploaded file. Adding the method to
Signatory therefore doesn’t create new couplings.
Finally, if you look through the code in this article you’ll see plenty more Demeter violations. For example, take another look at
class StatementPresenter def initialize(statement) @statement = statement @statement_template = statement.purchasable end delegate :name, :title, :organisation_title, :signature, to: :primary_signatory, prefix: :primary_signatory def primary_signatory signatory = @statement_template.primary_signatory || NullSignatory.new SignatoryPresenter.new(signatory, @statement_template.course.organisation) end end
The class is given its
@statement field on construction, but then acquires
@statement_template from that object, and reaches deeply into that, for example
@statement_template.course.organisation. Should we worry about that?
All the objects in these calls are
ActiveRecord models, so in a sense you can regard them all as one concept: they’re an object graph representing the database structure. And developers tend to think of the database in those terms, and think about getting things out of it by walking around that structure. We could implement methods on each model to hide this deep structure but it would mostly end up mirroring the structure rather than genuinely encapsulating it. It would produce a lot of tiny delegating methods that developers would find hard to follow.
Rather than every model trying to obey the Law, I find presenters like this – which provide a flat interface onto a slice of this graph, and encapsulate its structure – a good balance between hiding the complexity of the storage model from the view, and being able to see how something actually works all in one place. Such a presenter might end up calling into a lot of classes, but those classes tend to change together, when you change your schema. We think of schemas as sets of related tables, not single tables in isolation, so a presenter like this that groups together a section of related things actually limits the number of classes you need to change when migrations are done.
signature.vector.read is a violation, but again I consider this easier to understand than strictly splitting this call up, especially since it’s all related to one concept: reading an uploaded file.
The point being that, rather than dogmatically following every law and principle in the OOP canon, I prefer to use them to question code, to detect certain problematic patterns – the things we frequently call smells – when I’m finding it hard to change a part of my program. Demeter is one such tool that’s really good at detecting excessive coupling, and it hints at solutions to that problem. But it will never spell the solution out; it’s up to you to decice where to apply the tool for the greatest payoff.
Want to know more about the way we do things? Read more Making FutureLearn posts.