In this article, we'll look at what to test in a CI/CD pipeline to deliver better, more reliable software.
Overall hierarchy of tests
In 2009, Mike Cohn, an agile evangelist and practitioner, came up with the term “testing pyramid”. From the bottom to the top, it consists of three layers: unit tests, service tests, and UI tests. The naming might seem a bit vague, and the pyramid itself is outdated to a certain extent, but the overarching idea behind it is quite simple. There should be a good number of small, fast, lower-level tests, typically called unit tests. The higher up the pyramid you go, that is, the coarser-grained your tests are, the fewer of them there should be. The reasoning is simple too: fine-grained unit tests are cheap and easy to write, and they typically run extremely fast.
Every new layer is a leap to a higher level of abstraction. For backend code at least, the most practical test levels are the ones that correspond to abstraction layers. Typically these are, from the highest to the lowest:
- End-to-end tests, at the customer journey level. They typically involve a browser and imitate a real user’s behaviour.
- Endpoint tests that include transport. They typically correspond to the controller level of your application and are usually called integration tests.
- Unit tests for the classes used within a user story. This is your domain level.
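This list aligns with the ports-and-adapters (hexagonal) architecture introduced by Alistair Cockburn, a close relative of the Clean Architecture that Robert Martin popularized in his book of the same name.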
There isn’t a single definition of a unit test. For me, a unit test is one that relies only on clear and specific abstractions and operates in a small scope. A litmus test is the following: when your tests are “unit” enough, you don’t need to debug. If you do, the problem is probably not with your tests but with your design: create finer-grained classes and cover them with tests. If you are able to spot a bug in a matter of seconds, congratulations: you’ve come up with a decent unit test suite. Speed is crucial, since there should be no friction in running the tests every five seconds or so.
Why not have only unit tests, if they are so great? Because unit tests are tied to the classes they cover: when you refactor a user story and reshape those classes, only controller-level tests still tell you that the story as a whole works. If there are none, you’ll be in big trouble.
The concept of a unit test has a rather vague definition; that of an integration test even more so.
In practice, most of the time the term means testing either controllers (or Application Services, in DDD lingo) or endpoints. Typically, this implies that you mock third-party service calls but don’t mock the database, using a real one instead. Since this is a higher-level test, you don’t test implementation details, that is, the behaviour of internal classes. Instead, you test that the HTTP response is correct and the desired side effects are in place.
Be very careful to stay at the same abstraction level. For example, if you test an order registration endpoint, don’t check that there is a record in the database. That’s an implementation detail: today you use Postgres, and tomorrow you may end up with Mongo. Instead, call the order details API endpoint to make sure that it returns all the required data. What’s the point of doing so? Integration tests are your safety net. When implementation details change, sometimes drastically, you want to make sure that everything still works as expected.
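To make the "same abstraction level" rule concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: OrderApp is a hypothetical in-memory stand-in for an application reachable over HTTP, and the /orders routes are invented. The point is only that the test registers an order and then verifies it through the order details endpoint, never by looking into the storage.

```python
class OrderApp:
    """Hypothetical stand-in for an application reachable over HTTP."""

    def __init__(self):
        # Storage is an implementation detail the test never touches.
        self._orders = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Dispatch a request and return (status_code, response_body)."""
        if method == "POST" and path == "/orders":
            order_id = self._next_id
            self._next_id += 1
            self._orders[order_id] = dict(body)
            return 201, {"id": order_id}
        if method == "GET" and path.startswith("/orders/"):
            tail = path.rsplit("/", 1)[1]
            order = self._orders.get(int(tail)) if tail.isdigit() else None
            if order is None:
                return 404, {"error": "order not found"}
            return 200, {"id": int(tail), **order}
        return 404, {"error": "unknown route"}


def test_registered_order_is_visible_through_details_endpoint():
    app = OrderApp()

    status, created = app.handle("POST", "/orders", {"item": "book", "quantity": 2})
    assert status == 201

    # Verify through the public API, not by querying the storage directly.
    status, details = app.handle("GET", f"/orders/{created['id']}")
    assert status == 200
    assert details == {"id": created["id"], "item": "book", "quantity": 2}
```

If the dictionary inside OrderApp were swapped for a real database, the test would not need to change, which is exactly what you want from a safety net.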
Writing integration tests often involves discussions about what to mock and what not to mock. For me, the guideline is simple: your code should be the most fragile and error-prone part of the whole testing process. Say, what is more reliable, your database or your application code? I bet it’s the former. And what about the network? The network is notoriously fragile and slow, which is why network calls rarely take part in tests. Another example is when your application deals with external devices that are out of your control. For example, devices that print checks are pretty fragile. You don’t want your test to fail because the printer has run out of paper or ink. Otherwise, you end up testing those devices instead of testing your code.
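As an illustration of replacing a fragile device with a test double, the sketch below uses Mock from Python's standard unittest.mock module. The Checkout class and the print_check method are hypothetical names invented for this example; a real printer driver would have its own interface.

```python
from unittest.mock import Mock


class Checkout:
    """Hypothetical application code that prints a check after payment."""

    def __init__(self, printer):
        self._printer = printer

    def pay(self, order_total):
        check = f"TOTAL: {order_total:.2f}"
        self._printer.print_check(check)
        return check


def test_payment_prints_a_check():
    # The real device may be out of paper; a test double never is.
    printer = Mock()

    check = Checkout(printer).pay(12.5)

    assert check == "TOTAL: 12.50"
    printer.print_check.assert_called_once_with("TOTAL: 12.50")
```

The test now fails only when Checkout itself misbehaves, so your own code remains the most fragile part of the setup.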
End-to-end tests are the last line of defense. They test a complete customer journey, from the UI to the backend, back to the UI, and so forth. I always keep the Pareto principle in mind when writing them: roughly 20% of the effort gives 80% of the result. Identify your key business processes and automate them. Usually, these are the ones that bring your business most of its money: for example, order registration and further processing, resulting in delivery. With such tests in your suite, you are far less likely to ship a devastating bug in the flows that matter most.
The flip side is speed and reliability, both of which e2e tests lack. Tests running in a console are about an order of magnitude faster than tests with a browser involved. So although things might look bright in the beginning, a year later you could end up with a test suite that takes an hour to run.
Besides, end-to-end tests are often flaky. And the more complicated they are, the flakier they get, because of the growing number of unforeseen factors. Some rare browser quirk, some unexpected layout behaviour: it all adds up to overall flakiness. There is a common workaround, though. Don’t spin up a real browser; work only through the DOM. Your tests will be way faster, though no longer 100% end-to-end.
Having end-to-end tests helps you reveal unexpected coupling, particularly the kind that is hard to spot, such as accidental coupling between user stories. Remember the last time you modified some user story and another one broke unexpectedly? That’s precisely the case for end-to-end tests covering a set of user stories that form a particular customer journey.
People often conflate e2e tests with UI tests, which is a broader concept.
UI tests probably deserve their own pyramid; the principles are the same, after all. In UI development, most tests are typically small, fast, and narrow in scope, which generally puts them in the “unit” category. In other words, the term “UI test” is often used as a frontend equivalent of the unit test, which is typically associated with backend development.
Generally, there are two things to test in a UI. The first one is behaviour. There is a myriad of functionality that can be covered with UI unit tests: an order’s total is calculated correctly, a link leads to the correct URL, an alert notification is displayed in red, and so on. Besides, you don’t have to test most of the behaviour in an end-to-end fashion, even if you need a response from a backend. Write a stub, and you’re good to go: the behaviour can very well be tested in isolation. Jasmine and Mocha are good places to start with UI unit tests.
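The stub idea is independent of the framework. The sketch below is written in Python rather than in Jasmine or Mocha, and all of its names (StubPriceApi, order_total, the cart layout) are hypothetical. It shows the order total being tested in isolation, with canned prices standing in for the backend response.

```python
class StubPriceApi:
    """Hypothetical stand-in for the backend client; returns canned prices."""

    def __init__(self, prices):
        self._prices = prices

    def price_of(self, sku):
        return self._prices[sku]


def order_total(cart, price_api):
    """Behaviour under test: sum of price times quantity, in cents."""
    return sum(price_api.price_of(sku) * quantity for sku, quantity in cart.items())


def test_order_total_uses_backend_prices():
    api = StubPriceApi({"book": 1500, "pen": 250})
    assert order_total({"book": 2, "pen": 4}, api) == 4000


def test_empty_cart_costs_nothing():
    assert order_total({}, StubPriceApi({})) == 0
```

No server and no browser are involved, so tests like these stay fast enough to run on every change.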
Testing UI layout is typically a more intricate endeavour. The basic idea is simple enough, though. With every build, your tests automatically take browser snapshots and compare them with the ones from previous builds. If they differ unexpectedly, something is wrong. Cypress, for example, can be used for this kind of testing.
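The compare-with-the-previous-build idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any particular tool works: matches_snapshot is a hypothetical helper, and it compares exact bytes by hash, whereas real visual testing tools usually compare pixels with some tolerance.

```python
import hashlib
import tempfile
from pathlib import Path


def matches_snapshot(name, rendered, snapshot_dir):
    """Compare rendered bytes with the stored baseline; record it on first run."""
    baseline = Path(snapshot_dir) / f"{name}.sha256"
    digest = hashlib.sha256(rendered).hexdigest()
    if not baseline.exists():
        baseline.write_text(digest)
        return True
    return baseline.read_text() == digest


with tempfile.TemporaryDirectory() as snapshots:
    # First build: nothing to compare with, so the baseline is recorded.
    assert matches_snapshot("checkout", b"<button>Pay</button>", snapshots)
    # Next build, same output: the snapshot matches.
    assert matches_snapshot("checkout", b"<button>Pay</button>", snapshots)
    # A later build changes the output: the mismatch is reported.
    assert not matches_snapshot("checkout", b"<button>Buy</button>", snapshots)
```

When a difference is intentional, the usual move is to review it and record the new output as the baseline.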
Tests within Continuous delivery
If you want to deploy your software frequently and reliably, having a test portfolio is not enough. The key is automating it. You should be able to deploy your code at any time of day or night. Do you want to rely on human beings, who can forget to run tests before pushing code? Do you want to deploy your code only to find out that it doesn’t compile? I bet you don’t. So the best advice you can get is to use a CI/CD pipeline, because it takes the human factor out of the equation.
Besides, infrastructure provisioning and deployment are tedious and repetitive tasks, which we humans are particularly bad at. CI/CD pipelines, on the contrary, are extremely good at automating them. Manual tasks don’t scale: if you need a single manual tester now, you’ll need a growing number of them as your system grows. So instead of spending an ever-increasing amount of time carrying those tasks out by hand, automate your processes once and let the pipeline repeat them.
The absence of test automation is a perfect recipe for abandoning your team’s testing activity. A couple of occasions when a team member forgets to run the tests before pushing code can be enough for the team to become disillusioned with the whole testing endeavour. How do you avoid that? Use a proper CI/CD pipeline that never forgets to run your tests before deployment.
A flaky test is one that fails once in a while without any change in the code. There are several problems with this kind of test.
First, when a new build is tested and fails, you can’t say for sure why: because of a flaky test or because a new bug was introduced.
Second, the credibility of the test suite decreases, and the team stops writing tests, considering it a useless activity. A CI/CD pipeline that raises an alert on a failed test and refuses to ship forces you to fix that flaky test, restoring the credibility of the test suite and the team’s morale.
Ice cream cone
A small number of cheap and fast tests and a vast number of slow, expensive (in terms of the time required to write them) UI tests: this is how a slim test pyramid degenerates into a clumsy ice cream cone.
An ice cream cone looks tasty indeed, but don’t allow your test suite to end up looking like one.
This cone appears when you start with more high-level tests than you should. Your whole test suite becomes slow, and bugs become hard to spot. Instead of testing, you end up in gloomy debug-land. Besides, tests of this kind are not particularly reliable. If you run the same browser-based e2e test a hundred times, chances are that at least one of those runs fails because of some browser quirk.
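A quick back-of-the-envelope calculation shows why this matters for a whole suite. The sketch below assumes, purely for illustration, that each browser test flakes independently with a probability of 1% per run; real flake rates vary and are rarely independent.

```python
def chance_of_red_build(flake_rate, test_count):
    """Probability that at least one of `test_count` independent tests flakes."""
    return 1 - (1 - flake_rate) ** test_count


# One test that fails once in a hundred runs is a minor nuisance...
single_test = chance_of_red_build(0.01, 1)      # about 0.01

# ...but a hundred such tests turn most builds red for no real reason.
hundred_tests = chance_of_red_build(0.01, 100)  # roughly 0.63
```

Under these assumptions, nearly two builds out of three fail without any bug being introduced, which is the credibility problem in numbers.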
It’s not real code — it’s just a test!
Test code is by no means different from the rest of your code. Indeed, it doesn’t run in production, but you still have to maintain it, and the cost of maintaining code is much higher than the cost of writing it. So the same best practices that you use in production code apply to test code as well.
Desire to test a private method
This is a clear indication that the class the private method belongs to breaks the Single Responsibility Principle. It’s a God Object that knows too much and does too much. The cure is simple: extract the private method into its own class and cover it with unit tests.
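Here is a minimal sketch of what the extraction might look like in Python. The domain is made up: imagine an Invoice class that used to hide its VAT arithmetic in a private method. After the extraction, VatCalculator (a hypothetical name) is a small public class that can be tested directly, and Invoice simply uses it.

```python
class VatCalculator:
    """Formerly a private method buried inside Invoice (hypothetical example)."""

    def __init__(self, rate_percent):
        self._rate_percent = rate_percent

    def vat_for(self, net_cents):
        # Integer arithmetic in cents, rounding half up.
        return (net_cents * self._rate_percent + 50) // 100


class Invoice:
    def __init__(self, net_cents, vat_calculator):
        self._net_cents = net_cents
        self._vat_calculator = vat_calculator

    def total(self):
        return self._net_cents + self._vat_calculator.vat_for(self._net_cents)


def test_vat_is_rounded_to_the_nearest_cent():
    assert VatCalculator(rate_percent=20).vat_for(1000) == 200
    assert VatCalculator(rate_percent=20).vat_for(1003) == 201


def test_invoice_total_includes_vat():
    assert Invoice(1000, VatCalculator(rate_percent=20)).total() == 1200
```

The rounding rule, which was awkward to reach through Invoice, now has its own focused tests, and nothing private had to be exposed.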
Testing the wrong thing
Do you test the stuff that brings you most of the money? Very often, developers test for the sake of testing itself, without realizing the true value of testing: reducing risks and saving money. So before writing any tests, especially higher-level ones, talk with your product owner, who can certainly give you some clues about what matters most.
Test structure exactly mirrors class structure
This seems logical at first sight, but look at it this way. Some of your tests should serve as a safety net which ensures that your system’s behaviour doesn’t change when you modify its internal implementation. More often than not, most of our lower-level classes represent that internal implementation, so any modification to such a class would cause the corresponding test to fail. Fixing it each time is a daunting task. What Robert Martin recommends is “test contra-variance,” which means testing only the observable behaviour of a class. So if your class A, providing some domain-specific behaviour, uses a bunch of implementation-specific and stable helpers, you don’t need to test them separately: ATest would be just fine. There is no need to test all the internal implementation; such tests are very fragile.
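The following Python sketch illustrates the idea under made-up names. Discount plays the role of the domain-level class, while _normalize and _percent_off are its implementation-specific helpers. Only Discount has tests, and they assert on nothing but the price it returns.

```python
def _normalize(code):
    # Internal helper: free to be renamed, inlined, or replaced.
    return code.strip().upper()


def _percent_off(code):
    # Internal helper: this lookup could become a database call tomorrow.
    return {"WELCOME10": 10, "VIP20": 20}.get(code, 0)


class Discount:
    """Domain-level class whose observable behaviour is the price it returns."""

    def apply(self, price_cents, code):
        percent = _percent_off(_normalize(code))
        return price_cents * (100 - percent) // 100


def test_known_code_reduces_the_price():
    assert Discount().apply(2000, " welcome10 ") == 1800


def test_unknown_code_leaves_the_price_unchanged():
    assert Discount().apply(2000, "nope") == 2000
```

Both helpers are still exercised, just indirectly, so merging or rewriting them breaks no test as long as the prices stay the same.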
This topic is huge indeed, and I’ve barely scratched the surface here. I suggest that you read some more on it; Growing Object-Oriented Software, Guided by Tests and The Art of Software Testing are great sources of further inspiration.