Extract Method – The Most Important Refactoring EVER

August 23, 20085 min

Coming off my last rant, I thought it might be valuable to the community if I were to blog on the various refactorings I felt software developers should be using if they consider themselves professionals. To that end, I am going to borrow the refactoring patterns from Martin Fowler’s book, but provide up-to-date examples in C# code. To start the series off, I will start with the one refactoring ever developer should know and love – Extract Method.

How Do I Find It?: You have a code fragment that can be grouped together.

What Do I Do Once I Found It?: Turn the fragment into a method whose name explains the purpose of the method.

Fear not if you have never heard of this refactoring. It is so fundamental that many developers execute this pattern as a matter-of-course, but did not know the name for it; well, now you do – it is called Extract Method. IMO, mastering this refactoring is the key to understanding the later refactorings, like Extract Class, buy youtube shares which we will discuss next week.

So, how do you located code that is a candidate for this refactoring? Here are some guidelines I use:

  1. Long methods -> methods that exceed 20 lines are a good candidate for this refactoring since they are normally doing more than one thing.
  2. Code comments -> comments are normally a visual marker that the next lines are conceptual similar and this refactoring is about grouping conceptual similar things in methods.

I randomly searched through our source code repository at work and in a mere 24.4 seconds I found this sample code from a utility class (Util.cs). It is a nice method that does a useful thing (removes extra decimal points from a string – who knew you needed to do that, but why not?), and it needs some help based on my guidelines above:

44  static public string RemoveExtraDecimalPoint(string text)
45  {
46      string returnString = "";
47      int countDecimal = 0;
48
49      foreach (char character in text)
50      {
51          if (character == '.')
52          {
53              countDecimal++;
54          }
55      }
56
57      if (countDecimal > 1)
58      {
59          bool first = true;
60          foreach (char character in text)
61          {
62              returnString += character.ToString();
63              int length = returnString.Length;
64              if (character == '.')
65              {
66                  if (first == true)
67                  {
68                      first = false;
69                  }
70                  else
71                  {
72                      returnString = returnString.Remove(returnString.Length - 1, 1);
73                  }
74              }
75          }
76      }
77      else
78      {
79          returnString = text;
80      }
81
82      return returnString;
83  }

What makes this a good candidate for Extract Method is this method us 39 lines long – WAY too long for a method if our target is to create method from 5 to 10 lines long (which is the target of this refactoring). Remember, one of the main goals of Extract Method is to create small methods with intention revealing method names that read like comments. Our main focus will be to turn this long method in to a series of method calls which tell the reader what happens inside RemoveExtraDecimalPoint.

The first step after identifying a candidate method for Extract Method is to check the unit tests for this method and make sure they run.  Surprise, surprise – this method has no tests!!  Just a polite reminder – never, ever refactor code without having either an automated unit test or integration test confirming you did not break anything.  That is called hacking.  Yes, there will be times you have to “hack” something in, but while learning how to refactor is not it.  Writing a unit test after-the-fact is called a characterization test.  Here is one of my unit tests (I actually wrote five tests) that will serve as my safety net during the refactoring:

22  [TestMethod]
23  public void RemoveExtraDecimalPointTest()
24  {
25      // setup
26
27      // execute
28      string test = Utils.RemoveExtraDecimalPoint("23..45.0.");
29
30      // verify
31      Assert.AreEqual("23.450", test, "Extra decimal points not removed.");
32
33      // teardown
34  }

My very first Extract Method refactoring will focus on line 47 in the original sample code.  The code below shows the new method call I made for this refactoring:

44  static public string RemoveExtraDecimalPoint(string text)
45  {
46      string returnString = "";
47      int countDecimal = Utils.CountDecimalsInString(text);
48
49      if (countDecimal > 1)
50      {

The Extract Method refactoring product a method 10 lines long and has an intention revealing name.

77  private static int CountDecimalsInString(string text)
78  {
79      int countDecimal = 0;
80      foreach (char character in text)
81      {
82          if (character == '.')
83          {
84              countDecimal++;
85          }
86      }
86      return countDecimal;
87  }

We run all our tests, everything passes, we are fine! Time to move on to find another chunk of code to extract.  Next we take a look at line 59 from the original code and use Extract Method to make the following new method.

89  private static string ParseExtraDecimalPoints(string text)
90  {
91      string returnString = "";
92      bool first = true;
93
94      foreach (char character in text)
95      {
96          returnString += character.ToString();
97          if (character == '.')
98          {
99              if (first == true)
100             {
101                 first = false;
101             }
102             else
103             {
104                 returnString = returnString.Remove(returnString.Length - 1, 1);
105             }
106         }
107     }
108     return returnString;
109 }

Again: run all the test, everything passes, we are fine!  Notice that our extracted method is a little long (20 lines), but conceptually all the code in ParseExtraDecimalPoints is related, so we are OK.  Remember, we are more interested in finding related code than following a rule of “10 lines or less”.  Even if all you can pull out is a 200 line method, then that is cool – it is a first start once you pull one chunk out, you can begin to see the smaller chunks embedded in there.

But what does the refactored method look like?  I highlighted the addition of the two static methods – CountDecimalsInString and ParseExtraDecimalPoints – to make it easier to see our changes.

44  static public string RemoveExtraDecimalPoint(string text)
45  {
46      string returnString = "";
47      int countDecimal = Utils.CountDecimalsInString(text);
48    
49      if (countDecimal > 1)
50      {          
51          returnString = Utils.ParseExtraDecimalPoints(text);
52      }
53      else
54      {
55          returnString = text;
56      }
57
58      return returnString;
59  }

Upon examination, we see the refactored version of RemoveExtraDecimalPoint is only 15 lines long and does just two things: counts decimal points and parses code to remove extra decimal points.  Oh, that is real easy to understand as opposed to trying to read the original 39 lines of code (take a look if you don’t believe me) with all the loops, conditionals and white space and trying to figure things out.

And that is the point of the Extract Method refactoring.