(Originally posted 2003.Apr.18 Fri; links may have expired.)
The Law of Demeter (LoD) is a heuristic for good object encapsulation. Ignore the Law of Demeter (and other advice on encapsulation), and you'll find yourself in debugging and refactoring hell. Applied too strictly, LoD forbids container objects and external iterators (there may be a loophole for that, which I'll get back to later).
Yesterday, I wanted to eliminate a "reset" function in one object, and instead have its owning object delete and re-create that object. That should be straightforward, but it didn't work because encapsulation and the Law of Demeter were being violated.
The problem here was that a pointer to a member variable of this "reset-able" object was being passed around and retained by other objects. Those objects live longer than the member variable would if I implemented the delete and re-create strategy.
I've trained myself to never give out a member variable's address, just like a TV network never gives out a TV star's address. It's dangerous. It causes the program to be more brittle. It violates encapsulation. You never know what those obsessive fans or programmers might do. Other objects become too dependent on the internal state of another object -- changing the internals of an object becomes difficult. Polymorphism is overly restricted because any "replacement" class must also give out a member variable's address, and that variable's type must be compatible. As I said, I trained myself, but sometimes I forget, and I'm working with other people who sometimes also forget about this danger.
In C++, an alternative to returning a pointer to a member object is returning a reference to a member object, but that turns out to be just as bad. I'll illustrate with some code.
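    // Case 1: a pointer to a member outlives the owning object.
    Pointer* ptr = obj->GetMemberAddress();
    delete obj; // ptr is now "dangling" - pointing to deleted memory.

    // Case 2: a reference to a member outlives the owning object.
    Reference& ref = obj->GetMemberReference();
    delete obj; // ref is now "dangling" - it also points to deleted memory.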
No one would write code that obviously bad, but put in an event-loop, lots of intervening functions and other objects, and maybe some threading, and the same thing can happen without anyone realizing it until the crash occurs.
You could make a rule to never retain such a pointer or reference... that is an improvement (as long as no one breaks the rule), but it is awkward. It also requires that you never pass the pointer or reference into functions. It's too easy to forget where the pointer came from and create a persistent object holding that pointer. And that dangerous "persistence" includes threads as well as objects - a thread's lifetime is even less predictable than an object's, and dangling-pointer problems are much harder to diagnose in multi-threaded programs.
So LoD forbids this:
    memberPtr = obj.GetMemberAddress();
    memberPtr->DoSomething(); // potentially changing obj's member state.
It also forbids this:
obj.GetMemberAddress()->DoSomething();
What you should do is either return a copy of the member, so that the copy's lifetime is no longer under the control of "obj", or incorporate DoSomething() into the API of "obj".
So we can write:
    memberValue = obj.GetMemberValue(); // returns copy
    memberValue.DoSomething(); // doesn't affect obj's original member state.
We can even write:
obj.GetMemberValue().DoSomething();
because DoSomething is operating on a copy. NOTE: if you write this sequence of calls more than once, XP (Extreme Programming) requires that you remove this duplication, most likely by incorporating DoSomething into the API of "obj".
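Here is a minimal sketch of those two options side by side. The class names Label and Owner, and the helper DemoCopyVersusDelegate, are made up for illustration; only GetMemberValue and DoSomething come from the snippets above.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical member type with one mutating operation.
class Label {
public:
    explicit Label(std::string text) : text_(std::move(text)) {}
    void DoSomething() { text_ += "!"; }          // mutates this Label only
    std::string Text() const { return text_; }    // returns a copy of the text
private:
    std::string text_;
};

// Hypothetical owner. It never hands out the address of label_.
class Owner {
public:
    explicit Owner(std::string text) : label_(std::move(text)) {}

    // Option 1: return a copy. The caller controls the copy's lifetime.
    Label GetMemberValue() const { return label_; }

    // Option 2: fold the operation into the owner's own API.
    void DoSomething() { label_.DoSomething(); }

    std::string LabelText() const { return label_.Text(); }
private:
    Label label_;
};

// Exercises both options and returns the owner's final label text.
std::string DemoCopyVersusDelegate() {
    Owner obj("draft");

    Label memberValue = obj.GetMemberValue();
    memberValue.DoSomething();                 // changes only the copy
    assert(memberValue.Text() == "draft!");
    assert(obj.LabelText() == "draft");        // the owner's member is untouched

    obj.DoSomething();                         // the owner changes its own member
    return obj.LabelText();
}
```

In both cases nothing outside Owner ever holds a pointer or reference into it, so Owner is free to delete, replace, or re-create label_ whenever it likes.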
Returning a copy is particularly useful for "basic" types like String, Date, and so on. The safe programmer will return a copy of a string or date member variable, so that callers cannot change the state of the member variable "behind the owner's back".
Some of the more rabid fans of the Law of Demeter say that even this operating on a copy is too fragile, and you really should do this:
obj.DoSomething();
The danger of over-applying this idea is that your object interfaces get really fat. You really don't want to re-implement all of the member functions of String for each of the String members in your EmployeeData class just because you think LoD tells you to. Because of this, I think of the "Law" as more of a "Recommendation".
I assert that immutable objects are an exception to the Law of Demeter.
Java's String class is immutable (once the object is created, it can't be changed), so Java programmers don't have to make copies of String member variables in their accessor functions.
Some people have recommended declaring mutable and immutable interfaces, declaring the mutable object to implement both of those interfaces, and declaring this "accessor" function's return type to be just the immutable interface, so that you can return a mutable object through that immutable interface. Of course, a programmer could "down-cast" back to the mutable type, but then all sorts of bad things can be done if you work at it. It's probably better to create a copy of the object to avoid the down-casting trick. And in languages like Smalltalk and Python, you don't have variable and function type declarations to make this immutable interface idea work (though you could create and return an Immutable Adapter to enclose your mutable member).
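To make the interface idea and its down-cast loophole concrete, here is a rough C++ sketch. All the names (ReadableName, MutableName, Employee, DemoDownCastLoophole) are invented for this example, and using std::shared_ptr to hand out the member is my assumption, chosen so the sketch doesn't dangle; it is not something the recommendation itself prescribes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// The "immutable" interface: getters only.
class ReadableName {
public:
    virtual ~ReadableName() = default;
    virtual std::string Get() const = 0;
};

// The mutable type implements the read-only interface and adds a setter.
class MutableName : public ReadableName {
public:
    explicit MutableName(std::string v) : value_(std::move(v)) {}
    std::string Get() const override { return value_; }
    void Set(std::string v) { value_ = std::move(v); }
private:
    std::string value_;
};

class Employee {
public:
    explicit Employee(std::string name)
        : name_(std::make_shared<MutableName>(std::move(name))) {}

    // The accessor's declared return type is only the read-only interface.
    std::shared_ptr<ReadableName> Name() const { return name_; }

    // The safer alternative: hand back an independent copy.
    MutableName NameCopy() const { return *name_; }
private:
    std::shared_ptr<MutableName> name_;
};

// Returns the employee's name after a caller abuses the down-cast.
std::string DemoDownCastLoophole() {
    Employee e("Ann");
    std::shared_ptr<ReadableName> view = e.Name();
    // view->Set("Bob");  // would not compile: Set is not part of ReadableName

    // The loophole: a determined caller casts back to the mutable type.
    std::shared_ptr<MutableName> sneaky =
        std::dynamic_pointer_cast<MutableName>(view);
    sneaky->Set("Bob");                        // changes the employee's member

    // A copy is immune: changing it leaves the employee alone.
    MutableName copy = e.NameCopy();
    copy.Set("Cid");

    return e.Name()->Get();
}
```

The read-only interface stops honest mistakes at compile time, but only the copy actually protects the member.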
And what about that loophole for containers and external iterators?
LoD says you can't return references or pointers to your object's own member data, but containers are given data to hold, and so they can return that data.
External Iterators are new objects, not member objects, created when you call a function that returns the iterator.
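A small sketch of that loophole, with a hypothetical Roster container and RosterIterator (both names are mine). The container only returns data it was given, and each call to Iterate() builds a brand-new iterator object rather than exposing a member.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Roster;

// External iterator: a new object created on request, not a member of Roster.
class RosterIterator {
public:
    explicit RosterIterator(const Roster& roster) : roster_(&roster) {}
    bool HasNext() const;
    std::string Next();            // returns a copy of the next item
private:
    const Roster* roster_;         // only valid while the Roster is alive
    std::size_t index_ = 0;
};

// Container: it holds data that callers gave it, and gives that data back.
class Roster {
public:
    void Add(std::string name) { names_.push_back(std::move(name)); }
    std::size_t Size() const { return names_.size(); }
    std::string At(std::size_t i) const { return names_.at(i); }

    // Each call creates and returns a fresh iterator by value.
    RosterIterator Iterate() const { return RosterIterator(*this); }
private:
    std::vector<std::string> names_;
};

inline bool RosterIterator::HasNext() const { return index_ < roster_->Size(); }
inline std::string RosterIterator::Next() { return roster_->At(index_++); }

// Walks the roster through an external iterator and joins the names.
std::string JoinAll(const Roster& roster) {
    std::string out;
    for (RosterIterator it = roster.Iterate(); it.HasNext();) {
        if (!out.empty()) out += ",";
        out += it.Next();
    }
    return out;
}
```

The iterator still depends on the container staying alive, so the lifetime caution from earlier applies to it too; what it avoids is handing out the address of the container's internal vector.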
So what am I going to do about my hard-to-modify program?
Well, changing it to conform to LoD is going to be at least a day's worth of work. And we violated LoD on purpose, though now I regret that decision. We have an MFC Document object that owns a platform-independent "document" object. We pass the MFC Document object to other MFC classes, and pass the platform-independent "document" object around to the rest of our code. But we're not consistent about that.
To make this conform to LoD, the platform-independent object must never be passed around to other objects at all -- everywhere we currently do this, we should be passing around that MFC Document object instead. That means that the complete interface of the platform-independent "document" object must be implemented in the MFC Document object, delegating to the member object. However, the "type" we pass into various parts of our program doesn't always have to be the MFC Document type. It can be a base-class type -- the platform-independent "document" interface -- as long as we declare the MFC Document type as a subclass of that interface.
Then, and only then, could I have the MFC Document object have full control over the lifetime of its member objects (and even then, I have to be careful about threads - I can't delete an object if another thread is still using it.)
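Roughly, the shape I have in mind looks like the sketch below. It is not our real code: DocumentInterface, CoreDocument, AppDocument, StartOver, and AddSignature are all hypothetical names, AppDocument merely stands in for the MFC Document class (no MFC headers are used), and it assumes C++14 for std::make_unique. It also ignores the threading caveat entirely.

```cpp
#include <cassert>
#include <memory>
#include <string>

// The platform-independent "document" interface.
class DocumentInterface {
public:
    virtual ~DocumentInterface() = default;
    virtual void Append(const std::string& text) = 0;
    virtual std::string Text() const = 0;
};

// The platform-independent implementation. Never passed around directly.
class CoreDocument : public DocumentInterface {
public:
    void Append(const std::string& text) override { text_ += text; }
    std::string Text() const override { return text_; }
private:
    std::string text_;
};

// Stand-in for the MFC Document class: implements the same interface
// and delegates every call to the member it owns.
class AppDocument : public DocumentInterface {
public:
    AppDocument() : core_(std::make_unique<CoreDocument>()) {}

    void Append(const std::string& text) override { core_->Append(text); }
    std::string Text() const override { return core_->Text(); }

    // Replaces a "reset" function: delete the member and re-create it.
    void StartOver() { core_ = std::make_unique<CoreDocument>(); }
private:
    std::unique_ptr<CoreDocument> core_;   // sole owner, full lifetime control
};

// The rest of the program sees only the interface type.
void AddSignature(DocumentInterface& doc) { doc.Append("new"); }

// Shows that outside code keeps working across a delete/re-create.
std::string DemoStartOver() {
    AppDocument doc;
    DocumentInterface& seenByOthers = doc;   // what other objects hold
    seenByOthers.Append("old");

    doc.StartOver();                         // old CoreDocument is destroyed

    AddSignature(seenByOthers);              // still valid: it refers to doc
    return seenByOthers.Text();
}
```

Because nobody outside AppDocument ever sees the CoreDocument, replacing it cannot leave a dangling pointer behind, and no stale state survives the way it might with a hand-written reset.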
And why would I want to eliminate a "reset" member function and instead delete/re-create the object? Because it's too easy for a reset member function to forget to clean up all of its state. I object to "reset" functions generally, for both small objects and large objects like documents. For small objects, I prefer immutable objects that I can easily recreate on demand, because then copies don't have to be made in accessor methods, and immutable objects are more easily made multi-thread safe.
For future thinking: how can LoD work with threads? Can we think of the thread as an object? Some platforms do.
Here's a "formal" version of LoD: A method "M" of an object "O" should invoke only the methods of the following kinds of objects:
Here's the informal version attributed to Peter Van Rooijen:
See http://c2.com/cgi/wiki?LawOfDemeter for more discussion on this topic.
See http://www.ccs.neu.edu/home/lieber/LoD.html for a list of LoD links.
Keith Ray is developing new applications for iOS® and Macintosh®. Go to Sizeography and Upstart Technology to join our mailing lists and see more about our products.