Rules for Implementing TDD in Old Project

The article “Sliding Responsibility of the Repository Pattern” raised several questions, which are very difficult to answer. Do we need a repository if the complete disregard of technical details is impossible? How complex must the repository be so that its addition can be regarded worth-while? The answer to these questions varies depending on the emphasis placed in the development of systems. Probably the most difficult question is the following: do you even need a repository? The problem of “flowing abstraction” and the growing complexity of coding with an increase in the level of abstraction do not allow to find a solution that would satisfy both sides of the fence. For example, in reporting, intention design leads to the creation of a large number of methods for each filter and sorting, and a generic solution creates a large coding overhead.

To have a full picture, I looked at the problem of abstractions in terms of their application in a legacy code. A repository, in this case, is of interest to us only as a tool for obtaining quality and bugless code. Of course, this pattern is not the only thing necessary for the application of the TDD practices. Having eaten a bushel of salt during the development of several large projects and watching what works and what does not, I developed a few rules for myself that help me to follow the TDD practices. I’m open for a constructive criticism and other methods of implementing TDD.

Foreword

Some may notice that it is not possible to apply TDD in an old project. There is an opinion that different types of integration tests (UI-tests, end-to-end) are more suitable for them because it is too difficult to understand the old code. Also, you can hear that writing tests before the actual coding leads only to a loss of time, because we may not know how the code will work. I had to work on several projects, where I was limited only to integration tests, believing that unit tests are not indicative. At the same time, a lot of tests were written, they ran a lot of services, etc. As a result, only one person could understand them, who, in fact, wrote them.

During my practice, I managed to work on several very large projects, where there was a lot of legacy code. Some of them featured tests, and the others did not (there was only an intention to implement them). I participated in two large projects, in which I somehow tried to apply the TDD approach. At the initial stage, TDD was perceived as a Test First development. Eventually, the differences between this simplified understanding and the present perception, shortly called BDD, became clearer. Whichever language is used, the main points, I call them rules, remain similar. Someone can find parallels between the rules and other principles of writing good code.

Rule 1: Using Bottom-Up (Inside-Out)

This rule refers rather to the method of analysis and software design when embedding new pieces of code into a working project.

When you are designing a new project, it is absolutely natural to picture an entire system. At this stage, you control both the set of components and the future flexibility of the architecture. Therefore, you can write modules that can be easily and intuitively integrated with each other. Such a Top-Down approach allows you to perform a good upfront design of the future architecture, describe the necessary guiding lines and have a complete picture of what, in the end, you want. After a while, the project turns into what is called the legacy code. And then the fun begins.

At the stage when it is necessary to embed a new functionality into an existing project with a bunch of modules and dependencies between them, it can be very difficult to put them all in your head to make the proper design. The other side of this problem is the amount of work required to accomplish this task. Therefore, the bottom-up approach will be more effective in this case. In other words, first you create a complete module that solves the necessary task, and then you build it into the existing system, making only the necessary changes. In this case, you can guarantee the quality of this module, as it is a complete unit of the functional.

It should be noted that it’s not all that simple with the approaches. For example, when designing a new functionality in an old system, you will, like it or not, use both approaches. During the initial analysis, you still need to evaluate the system, then lower it to the module level, implement it and then go back to the level of the entire system. In my opinion, the main thing here is not to forget that the new module should be a complete functionality and be independent, as a separate tool. The more strictly you will adhere to this approach, the fewer changes will be made to the old code.

Rule 2: Test only the modified code

When working with an old project, there is absolutely no need to write tests for all possible scenarios of the method/class. Moreover, you may not be aware of some scenarios at all, as there may be plenty of them. The project is already in production, the customer is satisfied, so you can relax. In general, only your changes cause problems in this system. Therefore, only they should be tested.

Example

There is an online-store module, which creates a cart of selected items and stores it in a database. We do not care about the specific implementation. Done as done – this is the legacy code. Now we need to introduce a new behavior here: send a notification to the accounting department in case the cart cost exceeds $1000. Here is the code we see. How to introduce the change?

According to the first rule, the changes must be minimal and atomic. We are not interested in data loading, we do not care about the tax calculation and saving to the database. But we are interested in the calculated cart. If there was a module that does what is required, then it would perform the necessary task. That’s why we do this.

Such a notifier operates on its own, can be tested, and the changes made to the old code are minimal. This is exactly what the second rule says.

Rule 3: We only test requirements

To relieve yourself from the number of scenarios that require testing with unit tests, think of what you actually need from a module. Write first for the minimum set of conditions that you can imagine as requirements for the module. The minimum set is the set, which when supplemented with a new one, the behavior of the module does not change much, and when removed, the module does not work. The BDD approach helps a lot in this case.

Also, imagine how other classes that are clients of your module will interact with it. Do you need to write 10 lines of code to configure your module? The simpler the communication between the parts of the system, the better. Therefore, it is better to select modules responsible for something specific from the old code. SOLID will come to aid in this case.

Example

Now let’s see how everything described above will help us with the code. First, select all the modules that are only indirectly associated with the creation of the cart. This is how the responsibility for the modules is distributed.

This way they can be distinguished. Of course, such changes cannot be made at once in a large system, but they can be made gradually. For example, when changes relate to a tax module, you can simplify how other parts of the system depend on it. This can help get rid of high dependencies and use it in the future as a self-contained tool.

As for the tests, these scenarios will be sufficient. So far, their implementation does not interest us.

Rule 4: Add only tested code

As I wrote earlier, you should minimize changes to the old code. To do this, the old and new/modified code can be split up. The new code can be placed in methods which can be checked using unit tests. This approach will help reduce the associated risks. There are two techniques that have been described in the book “Working Effectively with Legacy Code” (link to the book below).

Sprout method/class – this technique allows you to embed a very secure new code into an old one. The way I added the notifier is an example of this approach.

Wrap method – a bit more complicated, but the essence is the same. It does not always work, but only in cases where a new code is called before/after an old one. When assigning responsibilities, two calls of the ApplyTaxes method were replaced by one call. For this, it was necessary to change the second method so that the logic does not break greatly and it could be checked. That’s how the class looked like before the changes.

And here how it looks after. The logic of working with the elements of the cart changed a little, but in general, everything remained the same. In this case, the old method calls first a new ApplyToItems, and then its previous version. This is the essence of this technique.

Rule 5: “Break” hidden dependencies

This is the rule about the biggest evil in an old code: the use of the new operator inside the method of one object to create other objects, repositories, or other complex objects. Why is that bad? The simplest explanation is that this makes the parts of the system highly connected and helps to reduce their coherence. Even shorter: leads to the violation of the “low coupling, high cohesion” principle. If you look at the other side, then this code is too difficult to extract into a separate, independent tool. Getting rid of such hidden dependencies at once is very laborious. But this can be done gradually.

First, you must transfer the initialization of all dependencies to the constructor. In particular, this applies to the new operators and the creation of classes. If you have ServiceLocator to get instances of classes, you should also remove it to the constructor, where you can pull out all the necessary interfaces from it.

Secondly, variables that store the instance of an external object/repository must have an abstract type, and better an interface. The interface is better because it provides more capabilities to a developer. As a result, this will allow making an atomic tool out of a module.

Thirdly, do not leave large method sheets. This clearly shows that the method does more than it is specified in its name. It is also indicative of a possible violation of SOLID, the Law of Demeter.

Example

Now let’s see how the code that creates the cart has been changed. Only the code block that creates the cart remained unchanged. The rest was placed into external classes and can be substituted by any implementation. Now the EuropeShop class takes the form of an atomic tool that needs certain things that are explicitly represented in the constructor. The code becomes easier to perceive.

Rule 6: The fewer big tests, the better

Big tests are different integration tests that try to test user scripts. Undoubtedly, they are important, but to check the logic of some IF in the depth of the code is very expensive. Writing this test takes the same amount of time, if not more, like writing the functionality itself. Supporting them is like another legacy code, which is difficult to change. But these are just tests!

It is necessary to understand which tests are needed and clearly adhere to this understanding. If you need an integration check, write a minimum set of tests, including positive and negative interaction scenarios. If you need to test the algorithm, write a minimal set of unit tests.

Rule 7: Do not test private methods

A private method can be too complex or contain code that is not called from public methods. I’m sure that any other reason you can think of will prove to be a characteristic of a “bad” code or design. Most likely, a part of the code from the private method should be made a separate method/class. Check if the first principle of SOLID is violated. This is the first reason why it is not worth doing so. The second is that in this way you check not the behavior of the entire module, but how the module implements it. The internal implementation can change regardless of the behavior of the module. Therefore, in this case, you get fragile tests, and it takes more time than necessary to support them.

To avoid the need to test private methods, present your classes as a set of atomic tools and you don’t know how they are implemented. You expect some behavior that you are testing. This attitude also applies to classes in the context of the assembly. Classes that are available to clients (from other assemblies) will be public, and those that perform internal work – private. Although, there is a difference from methods. Internal classes can be complex, so they can be transformed into internal ones and also tested.

Example

For example, to test one condition in the private method of the EuropeTaxes class, I will not write a test for this method. I will expect that taxes will be applied in a certain way, so the test will reflect this very behavior. In the test, I manually counted what should be the result, took it as a standard, and expect the same result from the class.

Rule 8: Do not test the algorithm of methods

Some people check the number of calls of certain methods, verify the call itself, etc., in other words, check the internal work of methods. It’s just as bad as testing of the private ones. The difference is only in the application layer of such a check. This approach again gives a lot of fragile tests, thus some people do not take TDD properly.

Read more…

Rule 9: Do not modify legacy code without tests

This is the most important rule because it reflects a team desire to follow this path. Without the desire to move in this direction, everything that has been said above has no special meaning. Because if a developer does not want to use TDD (does not understand its meaning, does not see the benefits, etc.), then its real benefit will be blurred by constant discussion how difficult and inefficient it is.

If you are going to use TDD, discuss this with your team, add it to Definition of Done, and apply it. At first, it will be hard, like with everything new. Like any art, TDD requires constant practice, and pleasure comes as you learn. Gradually, there will be more written unit tests, you will begin to feel the “health” of your system and start to appreciate the simplicity of writing code, describing the requirements in the first stage. There are TDD studies conducted on real big projects in Microsoft and IBM, showing a reduction in bugs in production systems from 40% to 80% (see the links below).

Further Reading

  1. Book “Working Effectively with Legacy Code” by Michael Feathers
  2. TDD when up to your neck in Legacy Code
  3. Breaking Hidden Dependencies
  4. The Legacy Code Lifecycle
  5. Should you unit test private methods on a class?
  6. Unit testing internals
  7. 5 Common Misconceptions About TDD & Unit Tests
  8. Law of Demeter