Too DRY - The Grep Test
I’ll occasionally see code that attempts to use clever mechanisms to reduce code duplication or be terser by using metaprogramming or string concatenation or any number of other mechanisms to DRY up code.
But sometimes it impedes maintainability in ways that are not immediately
obvious. When working in any language, but particularly dynamic languages such
as Python, JavaScript, or Ruby, grep
or one of its variants (ack
, git
grep
, etc.) is an essential tool for figuring out where and how a class,
function, module, or variable is used. Here I propose a simple test, that if
failed, is sufficient to reject a piece of code under review. (You are
performing code reviews, right?)
The Grep Test: If any code declares or makes use of a function, class, module, or variable that cannot be located by grepping for its full identifying token, it fails the Grep Test.
By “full identifying token”, I mean that if I search for inverseTransform
, I
expect to find every declaration or invocation of any method named
inverseTransform
. I would not expect to be able to find every use of the
inverseTransform
method of Node
by grepping for Node.inverseTransform
,
since that isn’t a single token.
EDIT: Added a section on Compromises in response to some of the discussion on HN.
Counterexamples
Dynamic Declaration
Here are a few examples of code that would fail the Grep Test because of dynamically declared functions.
JavaScript: dynamic function declarations
var Ray = function(position, direction) {
this.position = position;
this.direction = direction;
};
["position", "direction"].forEach(function(property) {
var getterName = "get" + property.charAt(0).toUpperCase() + property.substr(1);
Ray.prototype[getterName] = function() {
return this[property];
};
});
var ray = new Ray([0, 0], [1, 1]);
console.log(ray.getPosition());
console.log(ray.getDirection());
Fails because grepping for getPosition
won’t find me the function definition.
Ruby: method_missing
class Ray
attr_accessor :position, :direction
def initialize(position, direction)
@position = position
@direction = direction
end
def method_missing(m, *args, &block)
if m.to_s =~ /^negative_(\w+)$/
case $1
when "position"
[ -@position[0], -@position[1] ]
when "direction"
[ -@direction[0], -@direction[1] ]
end
else
super
end
end
end
ray = Ray.new([1, 2], [3, 4])
puts ray.position.inspect
puts ray.direction.inspect
puts ray.negative_position.inspect
puts ray.negative_direction.inspect
Fails the test because grepping for negative_position
won’t find me the
function definition.
Python: __getattr__
import re
class Ray(object):
def __init__(self, position, direction):
self.position = position
self.direction = direction
def __getattr__(self, name):
pat = re.compile('^(\w+)_is_zero$')
if pat.match(name):
prop = pat.findall(name)[0]
if prop == 'position':
return self.position[0] == 0 and self.position[1] == 0
elif prop == 'direction':
return self.direction[0] == 0 and self.direction[1] == 0
raise AttributeError
ray = Ray([0, 0], [1, 1])
print ray.position
print ray.direction
print ray.position_is_zero
print ray.direction_is_zero
Fails the test because grepping for position_is_zero
won’t find me the
function definition.
Dynamic Invocation
Here’s an example that would fail the Grep Test due to dynamic invocation:
Let’s say we have a form:
<form class="login">
Username: <input name="username" />
Password: <input name="password" />
<input type="submit" />
</form>
And we want to update a model based on the values of these fields when the form is submitted:
var User = function() {};
User.prototype.getUsername = function() {
return this._username;
};
User.prototype.setUsername = function(username) {
this._username = username;
};
User.prototype.getPassword = function() {
return this._password;
}
User.prototype.setPassword = function(password) {
this._password = password;
};
$("form.login").submit(function(ev) {
ev.preventDefault();
var fields = $(this).serializeArray();
var user = new User();
$(fields).each(function(i, field) {
var capName = field.name.charAt(0).toUpperCase() + field.name.substr(1);
var value = field.value;
user["set" + capName](value);
console.log(user["get" + capName]());
});
console.log(user);
});
Fails the test because grepping for getUsername
won’t find the invocation in
the submit callback.
Compromises
So should we throw out dynamic code generation wholesale? No. We just have to be careful to preserve the whole identifiers our fellow developers are likely to be looking for.
My preferred solution to all of the above counterexamples is to just avoid dynamic declaration and invocation completely, but there is certainly a scale at which this becomes a pain and metaprogramming becomes a pragmatic option.
Avoiding String Manipulation
One option is to provide slightly more information while still dynamically generating functions.
[
{attr: "position", fn: "getPosition"},
{attr: "direction", fn: "getDirection"}
].forEach(function(property) {
Ray.prototype[property.fn] = function() {
return this[property.attr];
};
});
Now both position
and getPosition
remain greppable while still avoiding
duplicating any common functionality you may have for these functions.
Using Comments
Another similar option for retaining dynamic methods while maintaining greppability is to simple add the tokens in the comments.
// Generates the following methods for Ray:
// - getPosition
// - getDirection
["position", "direction"].forEach(function(property) {
var getterName = "get" + property.charAt(0).toUpperCase() + property.substr(1);
Ray.prototype[getterName] = function() {
return this[property];
};
});
While I frankly still find this code more difficult to follow than the first compromise or avoiding dynamic declaration altogether, I’d still appreciate these comments when navigating a codebase.
Why is this Important?
When working with dynamic code, it’s an incredible boon to productivity to be
able to quickly locate the definition of functions so you can build a complete
mental context about what’s going on. Two common tools for doing this are grep
and ctags
, both of which require the entire identifier be written out at the
declaration site. (For the vim users reading this, you should really know about
using ctags in vim.) Dynamic declaration ruins this.
When modifying existing code, especially without good test coverage,
especially when refactoring code without good test coverage, it’s
essential to be able to find all the usages of a given class or function. Here,
grep
is my tool of choice. Dynamic invocation ruins this. If you have code
breaking the Grep Test, refactoring becomes a nightmare. I’ve deleted code I
thought was unused far too many times because of this.
As a broad generalization, I would say dynamic declaration is occasionally worth the tradeoff, but dynamic invocation is almost never worthwhile.
A Comment on Frameworks
I’m sure by this point, a few of you have thought “Hey, but Rails fails the Grep
Test!“. This is absolutely true, most notable due to the dynamic find_by_*
finders, and the dynamic *_path
URL path generators.
I personally remember tripping over trying to find these in the documentation when I was learning Rails, so I’m not a huge fan of these. That said, there is an argument for the beautiful sugar this provides. If you’re going to do stuff like this, use it sparingly and make sure it’s a well known convention used everywhere.