A retrofit of the GeoServer testing infrastructure mean to make tests more efficient.
Under Discussion, In Progress, Completed, Rejected, Deferred
Over the years GeoServer has accumulated a large number of test cases. The large majority of these test cases are strictly speaking "system" tests (sometimes referred to as integration tests) that require a full GeoServer to be constructed. While this has proven a very effective form of testing it also has proven to be very expensive for a number of reasons:
- A lot of resources such as data and configuration must be copied out to the file system for every system test run
- A full spring context is setup for every test run.
To mitigate this cost test writers have the option of using a "one time setup" that ensures that the above tasks are only performed once per test class. While this helps greatly to reduce the cost it still has some limitations:
- The one time setup is still expensive, and with the many test classes in GeoServer this cost still adds up
- Often the one time setup can't be utilized if a test must change data and/or configuration as part of its operation
At the time of this writing a build with tests of all the GeoServer core modules takes approximately
48 minutes. And this did not include any of the extension modules, some of which (like wps and app-schema) also have a significant number of system tests.
This was measured on a machine with a 2.3GHz CPU and 7200RPM hard drive.
The goal of this proposal is to improve this situation through a number of tasks/techniques.
- Making the system test setup more efficient
- Making better use of mock testing
- Looking at newer test frameworks and tools
In order to provide clarity throughout the proposal a few terms are defined up front. Most of this terminology is based on definitions from Wikipedia. It is recognized that these are by no means official definitions but provide a sufficient base for this proposal.
The term system test refers to the form of testing that is employed by the majority of GeoServer test cases. That is classes that extend from the GeoServerTestSupport class. This base test classes uses what is called a "mock setup", which is not to be confused with what is traditionally known as "mocking" or "mock testing".
The term mock test refers to what is more commonly known as mocking, as described here. Some test cases in GeoServer utilize this form of testing but they are a small minority compared to the system tests.
This section provides a high level overview of the tasks to be performed to remedy the situation. The next section gets into low level implementation details.
Currently GeoServer system tests perform roughly the following tasks when setting up a test case:
- Create a data directory complete with configuration and data on the file system
- Create a spring application context loaded with all application context files on the classpath of the test class
- Provide a number of utility methods for simulating HTTP requests without requiting an actual web server
Currently the first step creates a pre GeoServer 2.0 data directory. Upon startup of the spring context this data directory is converted to the current GeoServer version. This conversion is expensive. This has become much more noticeable due to the recent overhaul of the GeoServer security subsystem which implements a completely different on disk configuration structure.
To remedy this the idea is to update the test setup to create a more recent data directory, forgoing any conversion that must now take place for every setup.
Currently the task of creating the data directory on disk is delegates to the MockData class. Changing this will require major modifications to this class, more or less rewriting it. In addition this class provides hooks to test writers that need to create their own custom resources in the data directory for the specific test. This api is written in a way that exposes many of the implementation details of MockData and the underlying data directory structure so it will have to be updated as well, along with tests that make use of it.
Many system test classes ensure that the test setup occur only once for the execution of all test methods of the class. However as mentioned before this can only be utilized for "read-only" tests. If any of the individual test methods modify data or configuration it leaves the setup in an inconsistent state for other tests and hence invalidates the test class in its entirety as a candidate for a one time setup.
That said, when most tests modify data or configuration it is constrained to a single configuration object. Be it a service, a layer, etc... An alternative to dealing with this by reconstructing the entire test setup would be to simply re-initialize those resources that were affected by a test.
The GeoServer modules that suffer the most from this issue are the wfs and restconfig modules. Not surprisingly they are among the modules that take the longest to build with tests, with times of approximately 5.5 minutes and 8.5 minutes respecitvely.
As a proof of concept of this approach the restconfig module was ported to this setup. The test time for the module was reduced to approximately 1.25 minutes. Nearly a 700% improvement.
Often a test case only needs to interact with a single object. The object may be complex to create so its often simplest to just write the test as a system test and get the depending component as needed. The problem is that the price of the full system test setup is paid just to get at the object. The most common case is when testing a class that requires the GeoServer Catalog to function.
A common and effective remedy to this is to use mocking to mock up whatever component we need and pass that into the object we actually want to test. And indeed a number of test cases in GeoServer do this today, and they naturally run much faster than the system test equivalent. While mocking is great often it is tedious and requires a great deal of work just to set up the mock that we need.
The idea here is to make mock tests easier to write by creating a reusable "mock setup" (not to be confused with the current MockData that provides mocked up instances of common objects like the Catalog and GeoServer instances.
As a proof of concept of this approach a few tests in main were changed from system tests to mock tests. CatalogBuidlerTest that utilizes a one-time setup, and CascadeDeleteVisitorTest that repeats the test setup for every test case.
- CatalogBuilderTest went from 9.061 seconds to 2.032 seconds, a 450% improvement.
- CascadeDeleteVisitorTest went from 9.023 to 0.266 seconds, a 3400% improvement.
Currently GeoServer tests are written in JUnit3 style. Newer test frameworks provide advanced options that test writers can use to control execution of tests. Namely annotations. The obvious choices for a new test framework are JUnit4 and TestNG.
It should be noted that changing the test framework is somewhat orthogonal to this proposal. However since there is some benefit, and it makes sense to do it while we are doing a major overhaul here we list some of the pros of cons of a potential switch to each.
Google "junit4 vs testng" and you will find many resources comparing the two frameworks. For the most part they offer the same functionality. Both are based on annotating test classes and methods.
Both allow for "test wrapping" that allows one to implement certain customizations around how to run tests. Useful for writing code that affects how tests run without having to worry about the details in the test case itself, sort of how we use the OneTimeTestSetup today.
Both have seamless maven integration that don't require much configuration at the maven level.
JUnit4 support is built into eclipse, TestNG requires installing a separate plugin.
Migrating to JUnit4 is no easier than migrating to TestNG. Not surprisingly the TestNG author wanted a good JUnit migration story which is described here.
Both require that each test method be annotated. And both require a static import to handle assertions.
An interesting feature that distinguishes TestNG from JUnit4 are test groups which allow you to easily classify tests into certain groups. This could potentially be a quite useful feature for a number of reasons.
The first being it allows for running tests in only a certain group. For instance it would allow for things like running only non-system tests as a quick sanity check on the build rather than running the full system tests and figuring out mid build that something is broken.
More specifically to GeoServer it could potentially allow for better utilization of state, across test classes, not only test methods in a single test class. For instance, let's say we grouped our system tests into two groups.
- Read only tests that require only a one-time setup
- Other tests that require a repeated setup
Potentially we could run all the read-only tests in sequence and if the vm is not forked per test class then they could potentially all share the same setup state.
TestNG does require to maintain a separate testing.xml file that describes how to run tests, which tests to run, etc... In some cases this file is not required and some sensible defaults are applied.
This is a potential pro and a con. On the pro side the file allows for some advanced test configuration like running certain tests in parallel with multiple threads, etc... On the con side it is a separate file to keep up to date and maintain.
In the end JUnit4 was chosen, for a couple of reasons. First, it was a more conservative upgrade especially given GeoTools already is using JUnit4. Second, as it turns out, JUnit4 is catching up with features of TestNG, namely test groups which was added in version 4.10.
This section proposes an implementation plan for the topics covered in the previous section. The code samples in this section will assume JUnit4 the general design should hold regardless if we choose JUnit4, TestNG, or to remain at JUnit3.
For this proposal a new set of base test classes are proposed rather than modifying the existing ones. This is done for a number of reasons.
- This proposal will require a lot of work porting over test cases from the old framework and not something that can simply slot in. This will require a lot of manpower to do in one shot so a gradual approach will be needed. By using a new set of base classes we ensure that tests that have yet to be ported remain unaffected.
- The new hierarchy will better model the different kinds of tests developers have available, namely system tests and mock tests.
- The new hierarchy will make a one-time setup more or less transparent.
All that said, the new hierarchy will be designed to slot in as seamlessly as possible in order to reduce the work in porting over system tests. For instance, the base classes will make the same test utility methods available to test classes that extend them.
With that, the following class diagram illustrates the proposed class hierarchy.
The base class for all test classes, playing the same role as the current GeoServerAbtractTestSupport. The primary responsibly of this class is to control the basic life cycle (setup and teardown) of a test case. This includes abstracting away the details of a one time or repeated setup via support for a custom annotation. More to come on this below.
The base class for all system tests which is the equivalent of the current GeoServerTestSupport. This class creates the spring context and performs all the same tasks as its predecessor.
The base class for all mock tests. This class utilizes the MockTestData to create mock objects and as well provides utility methods to provide some convenience for creating mock objects.
The same interface we have today, unmodified.
Plays the same role as the current MockData. This test sets up the system test data directory the same way was MockData however does so with a structure that reflects the current data directory on disk format.
Test data that provides a relatively complete "mock setup". This includes mocking up the Catalog and GeoServer interfaces with mock objects that mirror the data and configuration used by the SystemTestData class.
The equivalent of what is now MockData, essentially just remained. Kept around to actually test migration from a pre 2.0 data directory format.
This section shows a number of examples of porting tests to the new proposed testing infrastructure.
This examples illustrates porting a system test from one that requires a repeated test setup to one that can be run with a one time setup.
The test as written today requires a repeated setup because it makes modifications during its life. In particular:
- It modifies the sf:PrimitiveGeoFeature type
- It creates a new datastore in many of the test methods
The approach with this class is to add pre test method setup that erases all state changes:
The removePdsStore method makes use of a utility method for deleting a store if it exists. This method is defined in GeoServerSystemTestSupport. Similar methods exists to remove workspaces, layers, styles, etc...
Similarly The addPrimitiveGeoFeature makes use of a utility method for reverting a layer (and its resource) back to the original state.
This examples illustrates porting a system test that employs a one time test setup to a mock test. One time setup system tests are good candidates for rewriting as a mock tests since they make no modifications to the configuration.
The first setup involves extending from the new base class. Also since mock tests by default setup once there is no need to define the static suite method.
This test requires that raster layers be included in the test setup. The MockTestData provides api similar to the existing MockData class for this.
Note: Ideally since mock objects are cheap to create we could just include the raster data layer by default. However the current implementation creates the mocked coverages by actually copying over the actual raster data, reading it and getting properties from it. It would be nice to improve on this, or perhaps do it lazily so that tests don't explicitly have to include the raster layers, but don't pay the price of setup if they are not used. There are some ideas for doing this but nothing implemented as of yet.
This examples illustrates porting a system test that employs a repeated test setup to a mock test.
Like the last example the first step involves extending the GeoServerMockTestSupport class. And because one time setup is the default with this base class a repeated test setup must be explicitly set via annotation.
The approach with this test class is create a different mock specific to each test case. For example taking the testCascadeStore method as an example.
The setMockCreator method is used to supply the mock setup with a customize object for creating the mock catalog, one specific to the test case. The remaining test classes make a similar call as the first line of the test.
As this a proposal restricted to GeoServer unit tests there are no typical backwards compatibility concerns. As described in the proposal this is not a drop in replacement type proposal and there is a large migration task involved.