Archive for August, 2008
Extract Class & the Single Responsibility Principle
Last week we talked about Extract Method and why it is so important to know and love. This week, we are going to discuss another very important principle in object-oriented design and how Extract Class can help us build and maintain better code.
If you are familiar with Robert Martin’s book Agile Software Development: Principles, Patterns and Practices, you will recall he recast a very powerful concept (the Single-Responsiblity Principle or SRP) know from the “old days”, aka 1979,: a class should have one, and only one, reason to change. Or as I like to add my own spin to it: a class should only do one thing. The reasoning behind the SRP is if your class has two responsibilities, invariably they become coupled (coupled code == bad) and changes in one will impair or inhibit the ability of the class to meet the other responsibilities – and this is only in the case where you have just two responsibilities. Imagine the confusion when we have three or four responsibilities in a single class!!!
If SRP is so important, then why are talking about Extract Class refactoring? This is the refactoring you use when you want to apply the SRP and applying the SRP is key to writing good code, i.e. code that is testable, decoupled, maintainable, etc., etc.
How Do I Find It?: You have one class doing the work that should be done by two.
What Do I Do Once I Find It?: Create a new class and move the relevant fields and methods from the old class into the new class.
Any guidelines you care to share? I am glad that you asked because I do have a few tell tale signs that a new class is waiting to bust out of your old one:
- Find your big classes -> any class with more than a 1000 lines is probably doing more than one thing.
- Classes that have “too many” methods or fields\properties -> if class has more than 10 to 15 methods or fields\properties, you are probably missing an abstraction.
- Methods that don’t belong -> every class has a reason for existence, or a theme, that unifies the method and data. Methods or data that don’t “fit in” with the theme are likely candidates for a new class.
In an application I am working on, we have a real simple Bus class that shepherds messages between simulated hardware and simulated devices (in our case we are talking about drawers, buttons, LED, etc.). When you examine the class you see it is about 250 lines, has three fields, a number of events around connecting and disconnecting devices and some private methods. Looking further, you can find two methods which stick out and don’t match the theme of the class:
45 private IBusMessage BytesToBusMessage(byte[] bytes)
46 {
47 // convert to string
48 string msg = Encoding.ASCII.GetString(bytes, 0, bytes.Length);
49 // split the msg into address and actual command
50 string[] splits = msg.Split(new char[] { '|' });
51 BusMessage message = new BusMessage();
52 message.Address = long.Parse(splits[0]);
53 message.Message = splits[1];
54 if (splits.Length > 2)
55 {
56 message.SequenceNumber = int.Parse(splits[2]);
57 }
58 return message;
59 }
60
61 public static byte[] BusMessageToBytes(IBusMessage message)
62 {
63 // form the message
64 string msg = message.Address + "|" + message.Message + "|" + message.SequenceNumber;
65 return Encoding.ASCII.GetBytes(msg);
66 }
67
So we have a class which is responsible for event registration AND parsing messages. My question is this: why in the world does a class which mostly deals with event registration have anything to do with parsing? Why would this class even care how we parse messages back-and-forth? It sounds like a class doing more than one thing – that is a violation of the SRP – and we will use Extract Class to apply the SRP to Bus.
Next week, we will talk about the Move Method refactoring, which is how you extract your class from the old one.
Extract Method – The Most Important Refactoring EVER
Coming off my last rant, I thought it might be valuable to the community if I were to blog on the various refactorings I felt software developers should be using if they consider themselves professionals. To that end, I am going to borrow the refactoring patterns from Martin Fowler’s book, but provide up-to-date examples in C# code. To start the series off, I will start with the one refactoring ever developer should know and love – Extract Method.
How Do I Find It?: You have a code fragment that can be grouped together.
What Do I Do Once I Found It?: Turn the fragment into a method whose name explains the purpose of the method.
Fear not if you have never heard of this refactoring. It is so fundamental that many developers execute this pattern as a matter-of-course, but did not know the name for it; well, now you do – it is called Extract Method. IMO, mastering this refactoring is the key to understanding the later refactorings, like Extract Class, which we will discuss next week.
So, how do you located code that is a candidate for this refactoring? Here are some guidelines I use:
- Long methods -> methods that exceed 20 lines are a good candidate for this refactoring since they are normally doing more than one thing.
- Code comments -> comments are normally a visual marker that the next lines are conceptual similar and this refactoring is about grouping conceptual similar things in methods.
I randomly searched through our source code repository at work and in a mere 24.4 seconds I found this sample code from a utility class (Util.cs). It is a nice method that does a useful thing (removes extra decimal points from a string – who knew you needed to do that, but why not?), and it needs some help based on my guidelines above:
44 static public string RemoveExtraDecimalPoint(string text)
45 {
46 string returnString = "";
47 int countDecimal = 0;
48
49 foreach (char character in text)
50 {
51 if (character == '.')
52 {
53 countDecimal++;
54 }
55 }
56
57 if (countDecimal > 1)
58 {
59 bool first = true;
60 foreach (char character in text)
61 {
62 returnString += character.ToString();
63 int length = returnString.Length;
64 if (character == '.')
65 {
66 if (first == true)
67 {
68 first = false;
69 }
70 else
71 {
72 returnString = returnString.Remove(returnString.Length - 1, 1);
73 }
74 }
75 }
76 }
77 else
78 {
79 returnString = text;
80 }
81
82 return returnString;
83 }
What makes this a good candidate for Extract Method is this method us 39 lines long – WAY too long for a method if our target is to create method from 5 to 10 lines long (which is the target of this refactoring). Remember, one of the main goals of Extract Method is to create small methods with intention revealing method names that read like comments. Our main focus will be to turn this long method in to a series of method calls which tell the reader what happens inside RemoveExtraDecimalPoint.
The first step after identifying a candidate method for Extract Method is to check the unit tests for this method and make sure they run. Surprise, surprise – this method has no tests!! Just a polite reminder – never, ever refactor code without having either an automated unit test or integration test confirming you did not break anything. That is called hacking. Yes, there will be times you have to “hack” something in, but while learning how to refactor is not it. Writing a unit test after-the-fact is called a characterization test. Here is one of my unit tests (I actually wrote five tests) that will serve as my safety net during the refactoring:
22 [TestMethod]
23 public void RemoveExtraDecimalPointTest()
24 {
25 // setup
26
27 // execute
28 string test = Utils.RemoveExtraDecimalPoint("23..45.0.");
29
30 // verify
31 Assert.AreEqual("23.450", test, "Extra decimal points not removed.");
32
33 // teardown
34 }
My very first Extract Method refactoring will focus on line 47 in the original sample code. The code below shows the new method call I made for this refactoring:
44 static public string RemoveExtraDecimalPoint(string text)
45 {
46 string returnString = "";
47 int countDecimal = Utils.CountDecimalsInString(text);
48
49 if (countDecimal > 1)
50 {
The Extract Method refactoring product a method 10 lines long and has an intention revealing name.
77 private static int CountDecimalsInString(string text)
78 {
79 int countDecimal = 0;
80 foreach (char character in text)
81 {
82 if (character == '.')
83 {
84 countDecimal++;
85 }
86 }
86 return countDecimal;
87 }
We run all our tests, everything passes, we are fine! Time to move on to find another chunk of code to extract. Next we take a look at line 59 from the original code and use Extract Method to make the following new method.
89 private static string ParseExtraDecimalPoints(string text)
90 {
91 string returnString = "";
92 bool first = true;
93
94 foreach (char character in text)
95 {
96 returnString += character.ToString();
97 if (character == '.')
98 {
99 if (first == true)
100 {
101 first = false;
101 }
102 else
103 {
104 returnString = returnString.Remove(returnString.Length - 1, 1);
105 }
106 }
107 }
108 return returnString;
109 }
Again: run all the test, everything passes, we are fine! Notice that our extracted method is a little long (20 lines), but conceptually all the code in ParseExtraDecimalPoints is related, so we are OK. Remember, we are more interested in finding related code than following a rule of “10 lines or less”. Even if all you can pull out is a 200 line method, then that is cool – it is a first start once you pull one chunk out, you can begin to see the smaller chunks embedded in there.
But what does the refactored method look like? I highlighted the addition of the two static methods – CountDecimalsInString and ParseExtraDecimalPoints – to make it easier to see our changes.
44 static public string RemoveExtraDecimalPoint(string text)
45 {
46 string returnString = "";
47 int countDecimal = Utils.CountDecimalsInString(text);
48
49 if (countDecimal > 1)
50 {
51 returnString = Utils.ParseExtraDecimalPoints(text);
52 }
53 else
54 {
55 returnString = text;
56 }
57
58 return returnString;
59 }
Upon examination, we see the refactored version of RemoveExtraDecimalPoint is only 15 lines long and does just two things: counts decimal points and parses code to remove extra decimal points. Oh, that is real easy to understand as opposed to trying to read the original 39 lines of code (take a look if you don’t believe me) with all the loops, conditionals and white space and trying to figure things out.
And that is the point of the Extract Method refactoring.
A Rewrite is NOT Refactoring!
Just the other day I was listening to, i.e. eavesdropping, a conversation between peers where they were trying to decide on what version of Visual Studio.NET their team should migrate to. This team was still using Visual Studio 6 (!!) and wanted to know if they should go to 2005 or 2008. During the conversation the motivation for the upgrade was discussed:
“We want to refactor some of the old [C++] code in C#.”
[OK, now imagine the sound of metal scraping on your fillings. It is that bad.]
Refactoring has been in print since 1999 and has a really clear definition. I also tend to think it is one of the fundamental skills every software developer should be proficient in. Frankly, I feel if you are not refactoring your code, you are not doing your job. So, why do people in this day and age of our profession still consider a rewrite of old code in a new language refactoring? Just another example of sloppy language and, unfortunately, sloppy thinking. So, to be precise, here are the definitions of refactoring.
refactoring (n.): a change to made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior.
refactoring (v.): to restructure software by applying a series of refactorings without changing its observable behavior.
For those of you who have stumbled upon my rant, here is a short list of things I do NOT consider refactoring:
- adding new functionality to the code.
- changing the design in the absence of automated tests – either unit, functional or acceptance.
- a separate “stage” in the project you do either after, or before, the “real” work.
- a replacement, or substitution, for a genuine software design.
- “fixing things up while you are in there.”
Not There Yet
Cory Foy made an interesting comment today on the XP mailing list. I’ll quote the part which I found interesting:
” My theory is that while we supposedly “Crossed the Chasm” from Early Adopters to Early Majority in the Agile world (referencing emails from about a year ago), we are now trying to cross a second chasm – that between Early Majority and Late Majority.”
While I respect Cory immensely, my experience with moderating and hosting XPSD suggests we have a long way to go as a community before we approach the Late Majority. The demographic that comes to XPSD are in their mid-30’s to mid-40’s (of course, we have younger and older present) who generally work in small teams or at small companies. We mostly have technical people come to our meetings each month with the majority of folks being people who write code for a living.
If we were approaching the Late Majority, then I would expect to be seeing the crowd at XPSD getting a little grayer (we all are getting a little grayer these days, but that is beside the point) and coming from the more conservative organizations and companies – defense contractors, the big biotechs and financial companies. I would also expect to be seeing more of the manager types and supplementary roles associated with software development – the testers, SCM and tech writers. I’d also expect to see a spike, or an increasing trend, in the number of people signing up on our mailing list.
While I wish it were true all these things were happening, I am not seeing it in San Diego. I think as a community, we are still just beginning to penetrate the Early Majority.
[Update - Aug 13] Seems like I am not the only one who questions Cory’s (and Scott Ambler’s) assertions that Agile has crossed into the Late Majority - Dave Nicolete is skeptical as well (here and here). I agree with his choice of words that Agile is still seen as an “experimental” or “alternative” way to develop software.
Fake Objects
I was writing some tests today around some legacy code (its about 2 weeks old and not written with TDD) and wanted to make sure that some method was called when the parent method was invoked. Unfortunately, I was having some problems with getting RhinoMocks working. Since I was getting frustrated with some “stupid” tool and not making any progress, I decided to go back to first principles to look for a simpler way to write my test.
This was the code I was working with and I wanted to write a test that proved line 28 was called:
22 public virtual void AcceptMessage(IBusMessage message)
23 {
24 if (!Power)
25 {
26 return;
27 }
28 ProcessMessage(message);
29 }
If you look at the rest of the code, this method comes from an abstract base class and the method in question (line 28) is abstract, protected AND void, so just calling it and looking at the return value is not a strategy I can use. To write the other tests I had written earlier in the day I had used the Fake Object concept just so I can get the object in the testing harness. Keep in mind, I am not trying to fake any collaborator, just test the base class since it does real things I want to make sure happen (or get updated) as the design evolves (you might be thinking this is a dumb place to start with a test, but you have to start somewhere).
Since I have complete control over my fake object and the method I want to make sure is abstract, I can insert a sensing variable in my fake object and set its state when the method (ProcessMethod) is called, like so:
10 public bool MessageProcessed = false;
11
12 protected override void ProcessMessage(IBusMessage message)
13 {
14 MessageProcessed = true;
15 }
Here is my test:
63 [TestMethod]
64 public void AcceptMessageWithPowerTest()
65 {
66 // setup
67 FakeDevice test = new FakeDevice();
68 test.Power = true;
69
70 IBusMessage message = new BusMessage();
71
72 // exercise
73 test.AcceptMessage(message);
74
75 // verify
76 Assert.IsTrue(test.MessageProcessed, "Process message not called.");
77
78 // clean-up
79 }
OK, kinda like using a sledgehammer to hit a carpet nail, but it got me unstuck and back to a green bar. Now, maybe I can spend sometime with RhinoMocks and figure out the easier way to do this.

