Null objects and the Law of Demeter
In this post, James Coglan, a Developer at FutureLearn, explains null objects and the Law of Demeter.
In this post, James Coglan, a Developer at FutureLearn, explains null objects and the Law of Demeter.
Last week we had a bug to fix that prompted a useful refactoring. In order to display a page with partial information, we wanted to introduce a null object. But, the code as it stood required the planned null object to provide a complicated interface, and we used the Law of Demeter to remove the complexity.
But first, some context. When a learner takes one of our courses, they can buy a Statement of Participation (henceforth referred to simply as Statement). The Statement carries various pieces of information about the learner, the course they took, the organisation that runs the course, and so on. One of these pieces of information consists of either one or two signatories, who are usually the lead educators on the course. They are represented on the Statement by their name, title, and an image of their signature.
As part of the course authoring process, the educators must upload this information to us, and we provide a preview of the Statement containing the information they’ve provided. Sometimes, users request a preview when they’ve not filled in all the required information, and this was causing an error in production: the preview would fail if there wasn’t a signatory for the Statement.
The source of the bug
Here’s a snippet of the class that was generating the error; there is much more code in the class than this but this is what’s relevant to the current discussion:
class PrintedStatementPresenter
def initialize(statement)
@statement_presenter = StatementPresenter.new(statement)
end
def primary_signatory_signature
@statement_presenter.primary_signatory_signature.vector.read.html_safe
end
end
PrintedStatementPresenter
wraps another presenter StatementPresenter
. We generate both printed and digital versions of the Statement and StatementPresenter
provides shared functionality for both these uses.
StatementPresenter
wraps an ActiveRecord model called Statement
(statement
in the above code) that relates all the data about the purchased Statement: who bought it, which course they were taking, who the signatories are, and so on. It is receiving a call to primary_signatory_signature
in the above code, and that method is defined as follows, again omitting much unrelated code:
class StatementPresenter
def initialize(statement)
@statement = statement
@statement_template = statement.purchasable
end
delegate :name, :title, :organisation_title, :signature,
to: :primary_signatory, prefix: :primary_signatory
def primary_signatory
SignatoryPresenter.new(@statement_template.primary_signatory,
@statement_template.course.organisation)
end
end
The delegate
call generates several methods, one of which is primary_signatory_signature
, whose implementation is to return the result of primary_signatory.signature
, with primary_signatory
defined via a method as a SignatoryPresenter
wrapping a signatory and organisation from the statement template. If statement
is the complete Statement containing the learner and course details, then you can think of the template as the line item the learner has effectively bought (hence purchasable
) – the Statement, minus their own personal information, but still with all the course and signatory information.
SignatoryPresenter
itself is a tiny little class extracted from StatementPresenter
when we started having multiple signatories, and it looks like this:
class SignatoryPresenter
def initialize(signatory, organisation)
@signatory = signatory
@organisation = organisation
end
delegate :name, :title, :signature, to: :@signatory
def organisation_title
if @signatory.organisation_title.present?
@signatory.organisation_title
else
@organisation.title
end
end
end
Mostly this class transparently delegates to a Signatory
model, but it does define one method itself: organisation_title
. This method encapsulates the decision about what to display as the signatory’s affiliated organisation: if they have one, we’ll show that, otherwise we’ll show the course’s organisation, on the assumption that the signatory belongs to the same organisation as is running the course. This method will make for an interesting comparison later.
Now we have got to the source of the bug: remember that PrintedStatementPresenter
was calling:
@statement_presenter.primary_signatory_signature.vector.read.html_safe
We’ve found out that primary_signatory_signature
is delegated by StatementPresenter
to SignatoryPresenter
, which delegates signature
to its @signatory
field, which was passed in by StatementPresenter
by calling @statement_template.primary_signatory
. If this final value is nil
, that is, if the educator has not yet filled in the signatory information, then SignatoryPresenter
will get a NoMethodError
when calling @signatory.signature
. This error will bubble back through StatementPresenter
and PrintedStatementPresenter
to the HTML template that’s doing the actual rendering.
The trouble with nil
In the Rails world, a common solution to this problem is to use Object#try
and replace the offending delegate in SignatoryPresenter
with:
class SignatoryPresenter
def signature
@signatory.try(:signature)
end
end
But this could still leave a nil
bubbling back to PrintedStatementPresenter
so we’d better protect everything after the signature too:
@statement_presenter.primary_signatory_signature.try(:vector).try(:read).try(:html_safe)
While this might silence the NoMethodError
, it doesn’t lead to a cohesive design or explain the intent of this code, nor does it allow you to do something other than return nil
. It also produces bad error reports: rather than the exception being reported when the nil
is first produced – when StatementPresenter
calls @statement_template.primary_signatory
– it is first produced one layer downwards – when SignatoryPresenter
calls @signatory.signature
– and then is allowed to bubble three layers upwards – back through StatementPresenter
and PrintedStatementPresenter
to the template that’s driving the whole call. Stack traces like that are terrible for finding out where the real problem is, and using Object#try
only lets the nil
travel further through the system before something explodes in a galaxy far, far away.
Null objects
A common solution to this problem is the Null Object pattern. In StatementPresenter
where the nil
is first produced, we can supply a different object instead:
class StatementPresenter
def primary_signatory
signatory = @statement_template.primary_signatory || NullSignatory.new
SignatoryPresenter.new(signatory, @statement_template.course.organisation)
end
end
Now the SignatoryPresenter
will always have a value to work with. But what methods does NullSignatory
need to implement? Well, we can see from SignatoryPresenter
that it must need at least name
, title
and organisation_title
(which is allowed to be nil
since SignatoryPresenter
wraps it). These are just strings and can quite easily be implemented to return placeholder text:
class NullSignatory
def name
'[Signatory name not yet given]'
end
def title
'[Signatory title not yet given]'
end
def organisation_title
nil
end
end
We can also see from SignatoryPresenter
that it must implement signature
, but that is a more complicated value. In PrintedStatementPresenter
, we end up calling primary_signatory_signature.vector.read.html_safe
; the html_safe
part suggests that what comes before it must be a string, but what about the .vector.read
bit? Even if we don’t know what those are yet, we know that we’re going to need to implement something that makes those calls work sufficiently, that is, something like this:
class NullSignatory
def signature
NullSignature.new
end
class NullSignature
def vector
NullVector.new
end
end
class NullVector
def read
'some placeholder SVG data?!'
end
end
end
All these classes just to make rendering a signature safe from nil
? This code’s starting to smell a bit.
A clue from Demeter
The complexity of the last snippet of code could indicate one of two things: either we’re trying to implement the wrong type for our null object, or we’ve picked the right type but its interface is too complex. In this case, I think we’ve picked the right type: the signatory is the thing that can be null; either it exists with a name
, title
and signature
, or it does not. Those fields cannot individually be null, as we can see if we look at the class’s validations:
class Signatory < ActiveRecord::Base
mount_uploader :signature, SignatureUploader
validates :signature, :name, presence: true
end
In this case, this is the entire class; I have not elided anything. This is another advantage: small classes have small interfaces, which means fewer methods for us to implement in our null class.
The mount_uploader
call means that signature
is an uploaded file, managed by CarrierWave, and its uploader looks like this:
class SignatureUploader < CarrierWave::Uploader::Base
version :vector do
# config for this file version
end
# more methods (elided)
end
So now we know what primary_signatory_signature.vector.read
was all about: primary_signatory_signature
is a SignatureUploader
; vector
gets us a specific version of the uploaded file (we generate vector images from bitmap formats that users upload); and read
, well, reads it.
But this also gives us a clue about the nature of our design problem. Let’s return to the PrintedStatementPresenter
where this all began:
@statement_presenter.primary_signatory_signature.vector.read.html_safe
The Law of Demeter is also known as the “two dots” rule: a rule of thumb is that you shouldn’t have two or more “dots” (i.e. method calls) chained together in a single expression. The truth is more subtle than that but it’s a good rule of thumb for spotting problematic code. Let’s examine the problem with this line.
The Law of Demeter is really about types: an object should only talk to (that is, call methods on) its “immediate neighbours”, that is, objects that either:
- are provided to the class via its constructor;
- are passed into the class’s methods as an argument;
- or, are directly created by the class by calling
new
or a factory function.
This is intended to stop a class becoming directly coupled to too many other classes in the system; the fewer classes depend directly on one another, the less code you need to change if you change any single one of them. It also drastically limits how far an error can travel before being detected, among other benefits. (A more detailed explanation of this appears in The Law of Demeter: Some archaeological notes.)
Let’s look at our line of code through this lens, considering that our overall chain of delegation looks like this:
PrintedStatementPresenter
|
V
StatementPresenter
|
V
SignatoryPresenter
|
V
Signatory
|
V
SignatureUploader
|
V
FileVersion
The upper three classes are “view-level” classes responsible for rendering things, and the lower three are “model-level”, responsible for storing and retrieving persisted data.
Here’s that line from PrintedStatementPresenter
again:
@statement_presenter.primary_signatory_signature.vector.read.html_safe
We’re allowed to call @statement_presenter.primary_signatory_signature
, because @statement_presenter
is a StatementPresenter
that PrintedStatementPresenter
constructs itself. That call is delegated to SignatoryPresenter
and then to Signatory
and returns the SignatureUploader
. However, we then call vector
on that to get a FileVersion
, which we call read
on to get a String
. (We also call html_safe
on that String
and get a SafeBuffer
but I’ll discuss that separately.)
So PrintedStatementPresenter
is coupled to at least two types accidentally: SignatureUploader
and FileVersion
. Yes, in Ruby-land we prefer to think in terms of “duck typing” and we’re only talking about two method calls, but this analysis explains why our NullSignatory
is too complicated.
Fixing the problem
Now we’ve identified the problem, how should we fix it? Well, what’s our original problem? The interface required for NullSignatory
is too complicated. Which “real” object is NullSignatory
standing in for? The Signatory
class, i.e. the value that can be null. What we’d like to do, therefore, is push the complexity in PrintedStatementPresenter
far enough down the stack that this interface becomes simpler.
Ignoring the html_safe
part for a second, we can make the top-level call Demeter-compliant by replacing it with:
@statement_presenter.primary_signatory_signature_svg.html_safe
That is, primary_signatory_signature.vector.read
has been replaced with a single method call, returning a String
. This removes the coupling to SignatureUploader
and FileVersion
.
Now the question is, how should primary_signatory_signature_svg
be implemented? To answer that we need to look at what we’re trying to hide: coupling to classes related to CarrierWave, that is, classes lower down our chart than Signatory
. This is no coincidence: the entire problem with the Signatory
interface is that it leaks complicated values! We can fix the problem by hiding these details and having Signatory
return the thing we’re actually interested in: a string of SVG.
class Signatory < ActiveRecord::Base
mount_uploader :signature, SignatureUploader
def signature_svg
signature.vector.read
end
end
This hides all interaction with CarrierWave-derived objects inside the Signatory
model, where the file upload functionality is defined. Our design is now more cohesive, because this concern is not spread out over the system. Why should a view-level class know it needs to read a file to get its data? What if S3 is down and the read fails? We probably don’t want to handle that error so high up the stack.
It also simplifies the interface required by callers of this class, so NullSignatory
is now much simpler. We just need it to implement signature_svg
to return a string.
class NullSignatory
def signature_svg
# return placeholder string or default SVG
end
end
SignatoryPresenter
, StatementPresenter
and PrintedStatementPresenter
can just delegate to this method rather than bubbling the whole signature object up the stack.
What about html_safe?
When we simplified the method chain in PrintedStatementPresenter
, we removed .vector.read
but left in a call to String#html_safe
. Isn’t that a Demeter violation?
Well, it might be, but remember that part of the point of Demeter, and certainly the part we’re using here, is to limit the extent of changes to your system. The built-in data types and standard library are less prone to change than your application code, so I’m less concerned about isolating myself from them. This is still a matter of judgement; you probably don’t want your rendering code talking directly to, say, Net::HTTP
, but String
? I’m fine with that. Strictly speaking, String#html_safe
is provided by Rails, but it’s existed for so long that I’m also not worried about it changing.
There’s also the issue of what html_safe
is for. It marks a string as safe for concatenating with HTML without needing to be escaped. It’s my opinion that that assertion should happen as close as possible to where that concatenation is happening, i.e. close to the rendering layer rather than deep in the models. The model might know the SVG is safe, but putting the assertion down there effectively spreads an encoding function (converting text to HTML) out over the system and makes it harder to reason about. Validation and encoding should happen at the edge of the system, where parsing and serializing take place.
A little reflection
Along the way, we saw a good example of the pattern we ended up implementing to solve our problem. SignatoryPresenter#organisation_title
exists to encapsulate a conditional that would otherwise end up in the top-level template. Indeed, without this method, SignatoryPresenter
would just be a transparent proxy around Signatory
and would be largely pointless. Its existence means the template has a simple, flat interface to talk to, removing complexity from the rendering layer.
That method exists in SignatoryPresenter
whereas we’ve added our signature_svg
method directly to Signatory
. Shouldn’t we have put it in the presenter? To me, this is a question of looking at which types we’re bringing into contact when we add methods. organisation_title
requires knowledge of Organisation
, which is several hops round the ActiveRecord association graph from Signatory
. So adding the method there would serve to introduce two classes that are currently distant from one another. In contrast, signature_svg
requires knowledge of CarrierWave files, which is already present in Signatory
and not anywhere else, since Signatory
owns that uploaded file. Adding the method to Signatory
therefore doesn’t create new couplings.
Finally, if you look through the code in this article you’ll see plenty more Demeter violations. For example, take another look at StatementPresenter
:
class StatementPresenter
def initialize(statement)
@statement = statement
@statement_template = statement.purchasable
end
delegate :name, :title, :organisation_title, :signature,
to: :primary_signatory, prefix: :primary_signatory
def primary_signatory
signatory = @statement_template.primary_signatory || NullSignatory.new
SignatoryPresenter.new(signatory, @statement_template.course.organisation)
end
end
The class is given its @statement
field on construction, but then acquires @statement_template
from that object, and reaches deeply into that, for example @statement_template.course.organisation
. Should we worry about that?
All the objects in these calls are ActiveRecord
models, so in a sense you can regard them all as one concept: they’re an object graph representing the database structure. And developers tend to think of the database in those terms, and think about getting things out of it by walking around that structure. We could implement methods on each model to hide this deep structure but it would mostly end up mirroring the structure rather than genuinely encapsulating it. It would produce a lot of tiny delegating methods that developers would find hard to follow.
Rather than every model trying to obey the Law, I find presenters like this – which provide a flat interface onto a slice of this graph, and encapsulate its structure – a good balance between hiding the complexity of the storage model from the view, and being able to see how something actually works all in one place. Such a presenter might end up calling into a lot of classes, but those classes tend to change together, when you change your schema. We think of schemas as sets of related tables, not single tables in isolation, so a presenter like this that groups together a section of related things actually limits the number of classes you need to change when migrations are done.
Strictly speaking, Signatory#signature_svg
calling signature.vector.read
is a violation, but again I consider this easier to understand than strictly splitting this call up, especially since it’s all related to one concept: reading an uploaded file.
The point being that, rather than dogmatically following every law and principle in the OOP canon, I prefer to use them to question code, to detect certain problematic patterns – the things we frequently call smells – when I’m finding it hard to change a part of my program. Demeter is one such tool that’s really good at detecting excessive coupling, and it hints at solutions to that problem. But it will never spell the solution out; it’s up to you to decice where to apply the tool for the greatest payoff.
Want to know more about the way we do things? Read more Making FutureLearn posts.