Categorie: Testing

Running experiments in production

28-01-2020 door Henry Been·Geen reacties

There is no place like home, and also: there is no place like production. Production is the only location where your code is really put to the test by your users. This insight has helped developers to accept that we have to go to production fast and often. But we also want to do so in a responsible way. Common tactics for safely going to production often are for example blue-green deployments and the use of feature flags.

While helping to limit the impact of mistakes, the downside of these two approaches is that they both run only one version of your code. Either by using a different binary or by using a different code path, the one or the other implementation is executed. But what if we could execute both the old and the new code in parallel and compare the results? In this post I will show you how to run two algorithms in parallel and compare the results, using a library called scientist.net, which is available on NuGet.

The use case

To experiment a bit met experimentation I have created a straight forward site for calculation the n^th position in the Fibonacci sequence as you can see below.

I started out with the most simple implementation that seemed to meet the requirements of the implementation. In other words, the implementation behind this calculation is done using recursion, like shown below.

public class RecursiveFibonacciCalculator : IRecursiveFibonacciCalculator
{
  public Task<int> CalculateAsync(int position)
  {
    return Task.FromResult(InnerCalculate(position));
  }

  private int InnerCalculate(int position)
  {
    if (position < 1)
    {
      return 0;
    }
    if (position == 1)
    {
      return 1;
    }

    return InnerCalculate(position - 1) + InnerCalculate(position - 2);
  }
}

While this is a very clean and to-the-point implementation code wise, the performance is -to say the least- up for improvement. So I took a few more minutes and I came up with an implementation which I believe is also correct, but also much more performant, namely the following:

public class LinearFibonacciCalculator : ILinearFibonacciCalculator
{
  public Task<int> CalculateAsync(int position)
  {
    var results = new int[position + 1];

    results[0] = 0;
    results[1] = 1;

    for (var i = 2; i <= position; i++)
    {
      results[i] = results[i - 1] + results[i - 2];
    }

    return Task.FromResult(results[position]);
  }
}

However, just swapping implementations and releasing didn’t feel good to me, so I thought: how about running an experiment on this?

The experiment

With the experiment that I am going to run, I want to achieve the following goals:

On every user request, run both my recursive and my linear implementation.
To the user I want to return the recursive implementation, which I know to be correct
While doing this, I want to record:
- If the linear implementation yields the same results
- Any performance differences.

To do this, I installed the Scientist NuGet package and added the code shown below as my new implementation.

public async Task OnPost()
{
  HasResult = true;
  Position = FibonacciInput.Position;

  Result = await Scientist.ScienceAsync<int>("fibonacci-implementation", experiment =>
  {
    experiment.Use(async () => await _recursiveFibonacciCalculator.CalculateAsync(Position));
    experiment.Try(async () => await _linearFibonacciCalculator.CalculateAsync(Position));

    experiment.AddContext("Position", Position);
  });
}

This code calls into the Scientist functionality and sets up an experiment with the name fibonacci-implementation that should return an int. The configuration of the experiment is done using the calls to Use(..) and Try(..)

Use(..): The use method is called with a lambda that should execute the known, trusted implementation of the code that you are experimenting with.

Try(..): The try method is called with another lambda, but now the one with the new, not yet verified implementation of the algorithm.

Both the Use(..) and Try(..) method accept a sync-lambda as well, but I do use async/await here on purpose. The advantage of using an async-lambda is that both implementations will be executed in parallel, thus reducing the duration of the web server call. The final thing I do with the call to the AddContext(..) method is adding a named value to the experiment. I can use this context property-bag later on to interpret the results and to pin down scenarios in which the new implementation is lacking.

Processing the runs

While the code above takes care of running two implementations of the Fibonacci sequence in parallel, I am not working with the results yet – so let’s change that. Results can be redirected to an implementation of the IResultPublisher interface that ships with Scientist by assigning an instance to the static ResultPublisher property as I do in my StartUp class.

var resultsPublisher = new ExperimentResultPublisher();
Scientist.ResultPublisher = resultsPublisher;

In the ExperimentResultPublisher class, I have added the code below.

public class ExperimentResultPublisher : IResultPublisher, IExperimentResultsGetter
{
  public LastResults LastResults { get; private set; }
  public OverallResults OverallResults { get; } = new OverallResults();

  public Task Publish<T, TClean>(Result<T, TClean> result)
  {
    if (result.ExperimentName == "fibonacci-implementation")
    {
      LastResults = new LastResults(! result.Mismatched, result.Control.Duration, result.Candidates.Single().Duration);
      OverallResults.Accumulate(LastResults);
    }

    return Task.CompletedTask;
  }
}

For all instances of the fibonacci-implementation experiment, I am saving the results of the last observation. Observation is the Scientist term for a single execution of the experiment. Once I have moved the results over to my own class LastResults, I am adding these last results to another class of my own OverallResults that calculated the minimum, maximum and average for each algorithm.

The LastResults and OverallResults properties are part of the IExperimentResultsGetter interface, which I later on inject in my Razor page.

Results

All of the above, combined with some HTML, will then gave me the following results after a number of experiments.

I hope you can see here how you can take this forward and extract more meaningful information from this type of experimentation. One thing that I would highly recommend is finding all observations where the existing and new implementation do not match and logging a critical error from your application.

Just imagine how you can have your users iterate and verify all your test cases, without them ever knowing.

Is an acceptance environment needed?

30-03-2019 door Henry Been·Geen reacties

This is a subject that I have been wanting to write about for a while now and I am happy to say that I finally found the time while in an airplane*. I think I first discussed this question a few months ago with Wouter de Kort at the start of this year (2019.) Since then he has written an interesting blog post on how to structure your DTAP streets in Azure DevOps. His advice is to structure your environments as a pipeline that allows your code only to flow from source control to production, via other environments. Very sound advice, but next to that I wonder, do you need all those environments?

Traditionally a lot of us have been creating a number of environments, and for reasons. One for developers to deploy to and test their own work. One for testers to execute automated tests and do exploratory testing. And only when a release is of sufficient quality, it is promoted to the next environment: acceptance. Here the release is validated one final time in a production-like-environment, before it is pushed to production. The value that an acceptance environment is supposed to add, is that it is as production-like as possible. Often connected to other live systems to test integrations, whereas a test environment might be connected to stubs or not connected at all.

Drawbacks of multiple environments

However, creating and maintaining all these environments also comes with drawbacks:

It’s easy to see that your costs increase with the number of environments. Not necessarily in setting the environment up and maintaining it (since that is all done through IaC, right?) But still, we have to pay for the resource we use, and that can be serious money
Since our code and the value it delivers can only flow through production after it has gone through all the other environments, the number of environments has impact on how quickly we can deliver our code to production.

All these drawbacks are a plea for limiting the number of environments to as few as possible. Now with that in mind, do you really need an acceptance environment? I am going to argue that you might not. Especially when I hear things like: “Let’s go to acceptance quickly, so we do not have to wait another two days before we can go to production,” I die a little on the inside.

Why you might not need acceptance

So let’s go over some reasons for having an acceptance environment and seeing if we can make these redundant.

No wait, we do need an acceptance environment where our customer can explore the new features and accept their working, before we release them to all users.

While I hope that you involve your customer and other stakeholders before you have a finalized product, there can be value in customers having approve the release of features to users explicitly. However, is it really necessary to do this from an acceptance environment? Have you considered using feature toggles? This way you can release your code to the production environment and allow only your customer access to this new feature. Only after he approves, you open the feature up to more users. In other words, if we can ensure that shipping the new binaries to production, does not automatically entail the release of new features, we do not need an acceptance environment for final feature acceptance by the client. More information on feature flags (also called feature toggles), can be found here.

We need a production-like environment to do final tests

Trust me, there is no place like home. And for your code, production is home. The only way to truly validate your code and the value it should bring, is by running in it in production. An acceptance environment, even if more integrated with other systems and with more realistic data than production, does not compare to production. You cannot fully predict what your users will do with your features, estimate the impact of real world usage or foresee all deviating scenario’s. Here again, if you are using feature flags, that would be a much better approach to progressively open up a new feature to more and more users. And if issues show up, just stop the roll out for a bit or even reverse it, while you are shipping a fix.

Now, do you think you can go without an acceptance environment? And if not, please let me know why not and I might just add a counter argument.

Now, while I do realize that the above does not hold for every organization and every development team, I would definitely recommend to keep challenging yourself on the number of environments you need and if you can reduce that number.

(*) An airplane, where there is absolutly nothing else to do, is an environment I bet we all find inspiring!

Unit testing in PHP

11-05-2018 door Henry Been·Geen reacties

Over the last couple of months I have been coaching a number of PHP teams to help them improve their software engineering practices. The main goals were to improve the quality of the product, ease of delivery and the overall maintainability of the code. If there is one thing that defines maintainable code, in my opinion, it is the existence of unit tests. However, one of the things that proved more difficult than one might expect is to start writing proper unit tests in an existing PHP solution.

In this instance, the teams were using the Laravel framework. However, standard Laravel practices limited the testability of the code created by the teams. I have worked with these teams to make their code more testable to two ends:

Improve overall testability by introducing new class design patterns
Reduce the duration of tests. Prior to this approach, a lot of tests were implemented as end-to-end, interface based tests. And boy, are they slow!

After a number of weeks, we saw the first results coming in, so all of this worked out nicely.

The goal of this post is to share the issues found that were preventing the team from proper unit testing and how we got around them.

Issue 1: instantiating a class in a unit test

The first thing we ran into was the fact that it was impossible to instantiate any class from a unit test. There were two reasons for this. The first was that there was actual work done in the constructor of almost every class: calling a method on another class and/or hitting the database.

Next to this, dependencies for any class were not passed in via the constructor, but were created in the constructor using a standard Laravel pattern. The good news here is that Laravel actually provides you with a dependency container. The bad news is, that it was often used like this:

class TestSubject {
    public function __construct() {
        $this->someDependency = app()->make(SomeDependency::class);
    }
}

This calls a global, static method app() to get the dependency container and then instantiates a class by type. Having this code in the constructor makes it completely impossible to new the class up from a unit test.

In short, we couldn’t instantiate classes in a unit test due to:

Doing work in a constructor
Instantiating dependencies ourselves

Solution: Let the constructor only gather dependencies

First of all, calling methods or the database was quite easy to refactoring out of the constructors. Also, this is a thing that can easily be avoided when creating new classes.

The best way to not instantiate dependencies yourselves, is to leave that to the framework. Instead of hitting the global app() method to obtain the container, we added the needed type as a parameter to the constructor, leaving it up to the Laravel container to provide an instance at runtime (constructor dependency injection.)

public function __construct(SomeDependency $someDependency) {
    $this->someDependency = $someDependency;
}

Now there is still one issue here and that is we are depending upon a concrete class, not an abstraction. This means, we are violation the Dependency Inversion principle. To fix this, we need to depend on an interface. However, now the Laravel dependency container no longer knows which type to provide to our class when instantiating it, since it cannot instantiate a class. Therefore, we have to configure a binding that maps the interface to the class.

app()->bind(SomeDependencyInterface::class, SomeDependency::class);

Having done this, we can now change our constructor to look as follows.

public function __construct(SomeDependency $someDependency) {
    $this->someDependency = $someDependency;
}

At this point we have changed the following:

No work in constructors
Getting dependencies provided instead of instantiating them ourselves.
Depending upon abstractions

Mission accomplished! These things combined now allows to instantiate our test subject in a unit test as follows:

$this->dependencyMock = $this->createMock(SomeDependencyInterface::class);

$this->subject = new TestSubject($this-> dependencyMock);

Issue 2: Global static helper methods

Now we can instantiate a TestSubject in a test and start testing it. The second we got to this state, we ran into another problem that was all over the code base: global, static, helper methods. These methods have different sources. They are built-in PHP methods, Laravel helper methods or convenience methods from 3^rd parties. However, they all present us with the same problems when it comes to testability:

We cannot mock calls to global, static methods. Which means we cannot remove their behavior at runtime and thus cannot isolate our TestSubject and start pulling in real dependencies, dependencies of dependencies, etc…

From here on, I will share (roughly in order of preference) a number of approaches to get around this limitation.

Solution 1: Finding a constructor injection replacement

When starting to investigate these static methods, especially those provided by Laravel, we saw that a lot of them were just short wrapper methods around the Dependency Container. For example, the implementation of a much used view method was this:

public function view($name = null, $data = [], $mergeData = [])
{
    $factory = app(ViewFactory::class);

    if (func_num_args() === 0) {
        return $factory;
    }
    return $factory->make($view, $data, $mergeData);
}

For all these convenience methods, it is straightforward to see that we can easily refactor the calling code from this:

public function __construct() { }

public function index() {
    // … more code
    return view(“blade.name”, $params);
}

To this:

public function __construct(ViewFactory $viewFactory) {
    $this->viewFactory = $viewFactory;
}

public function index() {
    // … more code
    return $this->viewFactory->make(“blade.name”, $params);
}

A quick and easy way to remove a decent portion of calls to global functions.

Solution 2: Software engineering tricks

If there is no interface readily available for constructor injection, we can create one ourselves. A common engineering trick is to move unmockable code to a new class. We then inject this to our subject at runtime. At test time however, we can then mock this wrapper and test our subject as much as possible.

As an example, let’s take the following code:

class Subject {
    function isFileChanged($fileName, $originalHash) {
        $newHash = sha1($fileName);
        return $originalHash == $newHash;
    }
}

Of course we can test this class by letting it operate on a temporary file, but another approach would be to do this:

class Subject {
    private $sha1Hasher;

    public function __construct(ISha1Hasher $sha1Hasher) {
        $this->sha1Hasher = $sha1Hasher;
    }

    function isFileChanged($fileName, $originalHash) {
        $newHash = $this->sha1Hasher->hash ($fileName);
        return $originalHash == $newHash;
    }
}

Maybe not a thing you would do in this specific instance. But if you have code that is more complex and is executing a single call to a global method, this way you can move that call behind an interface and mock it out while testing:

public function testSubject() {
    $hasherMock = $this->createMock(ISha1Hasher::class);
    $hasherMock->method(“hash”)->with(“n/a”)->willReturn(“123”);
    $subject = new Subject($hasherMock);

    $result =$subject->isFileChanged(“n/a”, “123”);

    $this->assertFalse($result);
}

In my opinion, solution 2 is by far a better approach to take than solutions 3 and 4. However, if you are afraid that adding to much types might clutter your codebase or reduce the performance of your application, there are two more approaches available. Both have drawbacks, so I would only use them if you see no other way.

Solution 3: Leveraging PHP namespace precedence

If refactoring global static calls in you code to a new class is not an option and your code is organized into namespaces, there is another way we can mock calls to built-in PHP methods. In the file with our TestClass, we can add a new method with the same name in a namespace that is closer to the caller.

For example, the following call to file_exists() cannot be mocked out:

namespace demo;

class Subject {
    public function hasFile() {
        return file_exists("d:\bier");
    }
}

As you can see, the class containing the hasFile() method is in a namespace called demo. We can create a new method, also called file_exists() in that same namespace, just before our TestClass. When executing, the methods in the namespace that is the closed to the caller will take precedence.

This means, we mock the call to file_exists() to always return true, as follows:
namespace demo;

function file_exists($fileName) {
    return true;
}

class TestClass {
    public function testWhenFileExists_thenReturnTrue() {
        $result = $this->subject->hasFile();

        $this->assertTrue($result);
    }
}

The main drawback of this approach is that it reduces the readability of your code. Also, relying on method hiding for testing purposes might make your code harder to understand for those that do not grasp all the language details.

Solution 4: Leveraging your frameworks and libraries

Finally, your framework might provide its own means for mocking certain calls. In Laravel for example, there is a construct of Facades that you can also use for mocking purposes. Another example is the Carbon datetime convenience library that provides a global static Carbon::setTestNow() method.

I for one would discourage this, as it would mean that you are writing logic that will become dependent on your framework and will not ever be able to switch to another framework without redoing everything. (However… who has done that even once?)

My other argument is one of taste: I simply do not like adding methods to production code, only to make it testable. And I have seen misuse of methods intended for tests only in production code as well…

However, if you do not share these feelings, the approach is quite nicely detailed here: https://laravel.com/docs/5.6/facades or here: http://laraveldaily.com/carbon-trick-set-now-time-to-whatever-you-want/

Conclusion

I hope that this blog gives you a number of approaches to make your PHP code more (unit)testable. Because we all know that only code that is continued tested in a pipeline, can quickly and easily be shipped fast and often to customers.

Enjoy!

With thanks for proofreading: Wouter de Kort, Alex Lisenkov