One Key Abstraction

By Zach Dennis on 01 10 2012

Recently, I've been spending time with Arthur Riel's 1996 book: Object Oriented Design Heuristics. One of the 62 heuristics offered in his work is short and sweet, yet powerful and essential in building quality, understandable, and maintainable software:

"A class should capture one and only one key abstraction."

This notion of "one and only one" is often under-utilized and sometimes even neglected. It's one of those sayings that should be on a thoughtful Hallmark card for developers. A reminder that something simpler is not merely an ideal to shoot for, but that it's achievable. Because it's not immediately visible to us doesn't mean it's not there. It may only be hidden from view, which may be a sign we haven't simplified our thought enough to recognize it.

If you're familiar with the Single Responsibility Principle from Uncle Bob then this will be very familiar, as both SRP and this heuristic seem to revolve around the same underlying principles. Fun fact: Uncle Bob was a reviewer of this book way back in '96!

In theory, both of these concepts (one key abstraction and SRP) are easy to grasp, but they can be challenging to put into practice. One of the challenges can be identifying the boundaries of the abstraction you've identified. For example, you may need to add a user, update a user, and delete a user from your system. Couldn't we model each operation as its own responsibility and therefore end up with three different objects (AddUser, UpdateUser, RemoveUser)? Yes, we certainly could. But couldn't we find a simpler abstraction which helps us reduce the number of classes we need to keep not only in our code, but also in our heads? I think we can.

With an object-oriented mindset we'd look not only to find the operations that needed to be performed, but we'd also try to find a unifying concept which ties all of those operations together. In the current add/update/remove user example, it's easy since they all have to do with the notion of a user.

Having a higher level concept allows us to anchor the operations (aka behaviors) that belong to it. If any one particular behavior turns out to be complex we can keep our user intact and with a simple interface and hide the complexity behind that. This helps us avoid what Riel refers to as action-oriented software.
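To make that concrete, here is a minimal sketch in Python of a single User abstraction anchoring all three operations. Everything in it is an assumption made for illustration: the method names (add, find, update, remove), the email and name fields, and the in-memory dictionary standing in for whatever persistence a real system would use.

```python
from typing import ClassVar


class User:
    """Represents a person known to the system."""

    # Hypothetical in-memory store, standing in for real persistence.
    _registry: ClassVar[dict] = {}

    def __init__(self, email: str, name: str) -> None:
        self.email = email
        self.name = name

    @classmethod
    def add(cls, email: str, name: str) -> "User":
        """Create a user and remember it, rejecting duplicate emails."""
        if email in cls._registry:
            raise ValueError(f"user already exists: {email}")
        user = cls(email, name)
        cls._registry[email] = user
        return user

    @classmethod
    def find(cls, email: str) -> "User | None":
        """Return the stored user for this email, or None if there is none."""
        return cls._registry.get(email)

    def update(self, *, name: str) -> None:
        """Change this user's details."""
        self.name = name

    def remove(self) -> None:
        """Forget this user; removing an unknown user is a no-op."""
        User._registry.pop(self.email, None)
```

If one of these behaviors grew complicated (say, removal had to cascade through other records), that complexity could move into a collaborator hidden behind remove, and callers would still only need to know about User.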

From Object Oriented Design Heuristics: "While action-oriented software development is involved with functional decomposition through a very centralized control mechanism, the object-oriented paradigm focuses more on the decomposition of data with its corresponding functionality in a very decentralized setting."

The action-oriented mindset can lead to a proliferation of top-level classes and modules, and to complexity and intelligence being centralized in God-like objects. That in turn means responsibilities and concepts aren't being distributed in the ways that help keep a system maintainable, modifiable, and understandable. While I'm sure the term action-oriented can be substituted for any number of similar terms, it's not the term that's so important, it's the effect. This effect (which is alive and well in web applications today) is outside the scope of this post and is ripe to become a post of its own.

Finding a key abstraction helps distribute the complexity in a more organic way since it naturally decentralizes the responsibilities throughout the system. But finding a key abstraction can be hard.

When we identify one and only one key abstraction for a class or module, we can reduce the cognitive effort required.

Writing down the reason to exist

At Path to Agility in Columbus, OH earlier this year Brandon Keepers shared a slide of a convention they were doing at GitHub. The convention was simple: document each class you write to communicate its responsibility. If you couldn't write it in a sentence or two, or you had to use a lot of conjunctive terms like "and"/"or", then there's a good chance your class was doing too much.
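As a rough illustration of that convention, the sketch below keeps the responsibility in a class docstring and flags the two warning signs mentioned above. This is only my own crude approximation, not anything GitHub used: the helper name responsibility_warnings, the two example classes, and the naive sentence and conjunction checks are all assumptions.

```python
import re


def responsibility_warnings(cls: type) -> list:
    """Return warning strings for a class's documented responsibility."""
    doc = (cls.__doc__ or "").strip()
    if not doc:
        return ["no responsibility documented"]

    warnings = []
    # Naive sentence split on terminal punctuation; good enough for a sketch.
    sentences = [s for s in re.split(r"[.!?]+", doc) if s.strip()]
    if len(sentences) > 2:
        warnings.append("takes more than two sentences to describe")
    # Whole-word match, so the "or" inside "report" does not count.
    if re.search(r"\b(and|or)\b", doc, flags=re.IGNORECASE):
        warnings.append('uses a conjunction ("and"/"or")')
    return warnings


class MailingAddress:
    """Represents where a customer receives physical mail."""


class ReportManager:
    """Builds the monthly report and emails it to subscribers or archives it."""
```

A check like this can only prompt the question. Plenty of honest single-responsibility descriptions contain the word "and", so deciding whether the class is really doing too much is still up to you.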

Following the conference I made an effort to put this into practice in my code. It was like a crossword puzzle where you had clues, but you needed to connect those to a more wholesome concept or abstraction. You had to find the central reason for why this class/module existed and then you had to articulate it. It was incredibly hard at times, but also incredibly fun.

You'd say, "Aha, this class exists to be in charge of X," only to read through more of the code (if the class already existed) and find subtle nuances which described other dominant responsibilities. When this happened it was an opportunity to question and possibly simplify.

Identifying the dominant responsibility or concept is hard

While documenting the responsibility of a class is a great activity and can be very helpful (I highly recommend it), it can sometimes be challenging to pick a responsibility for a class. One reason is that there are often many little behaviors with several different ways of being logically grouped. That means you have to pick one, and picking one can be hard. It can also cause anxiety around picking the "right" one. Usually none of them are wrong, but depending on what the application is for, there are typically reasons why one choice would be considered better than the others. Every now and then a few choices are equivalent (the trade-offs end up evening out) and a flip of the coin will suffice.

Another reason recognizing the abstraction can be difficult is that the unifying concept can go unnoticed when it hasn't been verbalized or doesn't exist in the system yet. It's hard to see things that aren't there.

A real world example of identifying multiple responsibilities and a missing abstraction

Here's a concrete example: I was working on an online learning application which had users, courses, lesson plans, steps, and a user step record (which was a record that a user took a step in a lesson). As a user progressed through the courses, all of the responsibility for recording, tracking, and inquiring about a user's progress was given to the step. The step managed the relationship between the user and their step record. On top of that, the lesson consulted step to determine how far along a user had progressed. And likewise, course consulted lesson to see where the user had progressed in the course. What do you think about this?

There is more than one way to model this. Can you think of an alternative approach? Take a quick minute and jot down an idea.

Jotted? Okay, just checking, read on!

A colleague on the project suggested that the modeling was awkward. The dual responsibility that the step had taken on eventually caused other code in the application to get more complicated than it should be, and it had a ripple effect where lesson and even course gained more responsibility. He challenged us to find a refactoring that could help simplify this code.
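To picture that ripple effect, here is a hypothetical reconstruction of the awkward model in Python. It is not the application's actual code: the class shapes, the user_id parameter, and method names such as record_completion and steps_completed_by are all invented for illustration.

```python
class UserStepRecord:
    """Records that a user took a particular step."""

    def __init__(self, user_id: str, step: "Step") -> None:
        self.user_id = user_id
        self.step = step


class Step:
    """Defines a unit of material... and also tracks who has completed it."""

    def __init__(self, title: str) -> None:
        self.title = title
        self.records = []  # progress data living inside the material itself

    def record_completion(self, user_id: str) -> None:
        self.records.append(UserStepRecord(user_id, self))

    def completed_by(self, user_id: str) -> bool:
        return any(r.user_id == user_id for r in self.records)


class Lesson:
    def __init__(self, title: str, steps: list) -> None:
        self.title = title
        self.steps = steps

    # Progress logic leaks upward: the lesson has to consult each step.
    def steps_completed_by(self, user_id: str) -> int:
        return sum(1 for step in self.steps if step.completed_by(user_id))


class Course:
    def __init__(self, title: str, lessons: list) -> None:
        self.title = title
        self.lessons = lessons

    # ...and the course has to consult each lesson.
    def steps_completed_by(self, user_id: str) -> int:
        return sum(lesson.steps_completed_by(user_id) for lesson in self.lessons)
```

Every class here has two reasons to change: how the material is organized, and how progress is tracked.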

Later that day, while thinking about different ways this code interacted, several refactoring opportunities presented themselves but it was essentially just pushing around code between course, lesson, step, and the user step record. None of them felt like the kind of simplification that we were looking for. Then we started to think less about the code and more about the concepts. People take courses and complete lessons all of the time. This exists in the real world -- why not model it around real world concepts? Now, it felt like we were thinking with an object-oriented mindset.

There were two responsibilities at play: defining the training material and its structure; and recording the progression of a user through that material. Elementary school teachers solved this problem a long time ago. Do you remember the progress sheets where you could achieve a gold star or some other kind of sticker for completing a lesson or exercise in the class? Oftentimes, they'd hang on the wall in the classroom or they'd be inside a book that the teacher kept at their desk.

This notion of a progress sheet was a concept missing from our application. So we decided to move the responsibility of tracking a user's progress through the courses, lessons, and steps to an actual ProgressSheet class.

This was a nice win because it simplified the code and the concepts expressed through the code. Previously, there had been individual methods on course, lesson, and step to record progress and inquire about progress. But these methods had nothing to do with the primary reason why course, lesson, and step existed, which was the definition and organization of material to be taught/learned. Now that we had a ProgressSheet we could move all of the functionality related to tracking progression into one spot. It was a unifying concept which gave individual behaviors a home.
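A minimal sketch of what that separation might look like in Python follows. Only the ProgressSheet, Course, Lesson, and Step names come from the project. The fields, the method names, and the one-sheet-per-user design are assumptions I'm making for illustration, not the code we actually wrote.

```python
from dataclasses import dataclass


# The material classes only define and organize what is taught.
# Frozen dataclasses are hashable, so steps can be kept in a set.
# Note that two steps with the same title compare equal in this sketch.
@dataclass(frozen=True)
class Step:
    title: str


@dataclass(frozen=True)
class Lesson:
    title: str
    steps: tuple


@dataclass(frozen=True)
class Course:
    title: str
    lessons: tuple


class ProgressSheet:
    """Tracks one user's progress through the training material."""

    def __init__(self, user_id: str) -> None:
        self.user_id = user_id
        self._completed = set()  # the steps this user has finished

    def record(self, step: Step) -> None:
        self._completed.add(step)

    def has_completed(self, step: Step) -> bool:
        return step in self._completed

    def steps_completed_in_lesson(self, lesson: Lesson) -> int:
        return sum(1 for step in lesson.steps if step in self._completed)

    def steps_completed_in_course(self, course: Course) -> int:
        return sum(self.steps_completed_in_lesson(l) for l in course.lessons)
```

In this version Course, Lesson, and Step know nothing about users, and every question about progress is asked of the sheet, for example sheet.steps_completed_in_course(course).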

Relating this back to Uncle Bob's Single Responsibility Principle: each of these classes now had a single responsibility and a single reason to change. If we changed how the training material was constructed or organized we'd work with the Course, Lesson, or Step appropriately. If we changed how progression was tracked we'd update the ProgressSheet.

In closing

Modeling and designing is hard. Identifying abstractions or recognizing missing concepts is difficult. I routinely fail at it, but I keep trying and every now and then I have these micro breakthroughs where I feel not only is my code well-factored and as simple as I can make it, but the concepts I am expressing through that code are also as simple as I can make them. In my experience, finding ways to simplify the concepts expressed through our code often leads to simpler code.

In Riel's book, he used the statement "a class should capture one and only one key abstraction" and in more recent years Uncle Bob has used the term SRP to help us become better software craftspeople by employing this concept. This post was actually going to cover another of Riel's heuristics, but then, this post would be violating both rules, and that just wouldn't feel right.