Overcoming the Challenges of Creating High Fidelity Software Test Environments
The overall trend in the software industry is to decentralize product quality practices. In this model, the distance between quality assurance, engineering, and DevOps continues to shrink, eliminating the old barriers that often posed an obstacle to quickly shipping high-quality software products. Quality focused resources, especially those in cloud native environments, are increasingly decentralized into smaller self-contained pods alongside product, engineering and DevOps resources. These decentralized organizational units are employing agile methodologies to ship product much faster than in older, more traditional setups. The smaller releases that are characteristic of high velocity, agile organizations help businesses to more quickly realize value from their software investments. But while velocity is increasing, application infrastructures are also decentralizing and getting much more complicated.
Microservices are creating highly distributed architectures with more complex interaction patterns and inter-dependencies. Decentralized teams can evolve each microservice independently of other services so long as its interfaces are maintained. However, maintaining high quality services in a faster, more complicated environment is challenging. As an industry, we must consider how to overcome these challenges to create high fidelity test environments.
Test automation has come a long way and provides the ability to simultaneously achieve velocity and quality, but issues still occur. In my experience, the root cause of many of these issues is the fidelity of the test environments. Each service may pass its functional tests, but problems only come to light once all the services are wired together along with a representative production data set. Thus, it’s increasingly difficult to create high quality test environments that accurately represent the real-life complexity of production environments.
The common answer to these challenges is to focus on DevOps automation, and to use it to build multiple test environments. The availability of test environments is no longer the bottleneck that it once was. However, these automated test environments often fall short of accurately representing production, which results in critical oversights by both automated and manual tests.
At the cloud provider where I work, we have hundreds of different core services that power our platform, including a diverse set of end-user-facing web, desktop, and mobile applications. For an engineer working on one of these services, what does a test environment look like? Is it just the microservice running on their laptop? Is it that service plus the other services it depends on, running in a larger environment? Or is it a full test environment that contains all the microservices on our platform? To answer this question, we must draw a line and decide which subset of platform components is best suited to represent the production environment and produce strong test results for any particular scenario.
Containers, while incredibly helpful in moving code and configuration to different environments, do not fully solve the test environment fidelity problem, as containers don’t capture the overall service architecture of an application. When dynamically creating test environments, it is useful to give engineers and quality-focused resources the ability to decide which services are real and which are mocked in a dev or test environment. Mock services expose the service interfaces required to run the application, but may only return responses from a predefined list of possibilities and are thus lighter weight than their full counterparts. Providing this level of flexibility allows engineers to create full test components and mock test components, and to combine them into a full working environment in an automated fashion. Achieving this requires investment in service discovery mechanisms, adding mock interfaces to all services, and adding state reset functions to services so data stores can be reset between tests. This kind of environment can be used to automate the creation of much more architecturally representative and higher fidelity test environments, which will act as a buffer to catch issues before code is promoted to production.
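To make the idea of mixing real and mocked services concrete, here is a minimal Python sketch. Everything in it is hypothetical and exists only for illustration: the `Environment` class stands in for a service discovery mechanism, `RealAccounts`, `MockAccounts`, and `RealBilling` are invented services, and `reset_state` is an assumed name for the state reset function described above. A real platform would resolve services over the network rather than in process.

```python
from typing import Callable, Dict, List, Tuple


class Environment:
    """Hypothetical stand-in for service discovery: maps names to services."""

    def __init__(self) -> None:
        self._services: Dict[str, object] = {}

    def register(self, name: str, service: object) -> None:
        self._services[name] = service

    def lookup(self, name: str):
        return self._services[name]

    def reset(self) -> None:
        # Ask every service to clear its state so tests start clean.
        for service in self._services.values():
            service.reset_state()


class RealAccounts:
    """Stand-in for a full service that owns a data store."""

    def __init__(self, env: Environment) -> None:
        self._plans: Dict[str, str] = {}

    def set_plan(self, account_id: str, plan: str) -> None:
        self._plans[account_id] = plan

    def get_plan(self, account_id: str) -> str:
        return self._plans.get(account_id, "free")

    def reset_state(self) -> None:
        self._plans.clear()


class MockAccounts:
    """Same interface, but answers only from a predefined list."""

    CANNED = {"acct-1": "pro"}

    def __init__(self, env: Environment) -> None:
        pass

    def set_plan(self, account_id: str, plan: str) -> None:
        pass  # a mock keeps no state, so writes are dropped

    def get_plan(self, account_id: str) -> str:
        return self.CANNED.get(account_id, "free")

    def reset_state(self) -> None:
        pass  # nothing to reset


class RealBilling:
    """The service under test; it depends on whichever 'accounts' is registered."""

    PRICES = {"free": 0, "pro": 20}

    def __init__(self, env: Environment) -> None:
        self._env = env
        self._invoices: List[Tuple[str, int]] = []

    def invoice(self, account_id: str) -> int:
        # Resolve the dependency at call time, so real and mock are interchangeable.
        plan = self._env.lookup("accounts").get_plan(account_id)
        amount = self.PRICES[plan]
        self._invoices.append((account_id, amount))
        return amount

    def reset_state(self) -> None:
        self._invoices.clear()


FACTORIES: Dict[str, Dict[str, Callable[[Environment], object]]] = {
    "accounts": {"real": RealAccounts, "mock": MockAccounts},
    "billing": {"real": RealBilling},
}


def build_environment(spec: Dict[str, str]) -> Environment:
    """Build an environment from a spec such as {'accounts': 'mock'}."""
    env = Environment()
    for name, mode in spec.items():
        env.register(name, FACTORIES[name][mode](env))
    return env


env = build_environment({"accounts": "mock", "billing": "real"})
print(env.lookup("billing").invoice("acct-1"))  # prints 20
```

The spec passed to `build_environment` is where an engineer draws the line: changing `"mock"` to `"real"` for a dependency raises the fidelity of the environment without touching the service under test.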
Another challenge in creating high fidelity test environments revolves around good quality test data. Without strong and representative test data, the likelihood of surprise problems increases when code moves to production. Test data is often not maintained as part of the development process. One common approach to solving this problem is to have a static data set that directly loads into the data stores in the test environment as part of the automated test environment build. Yet, this static data set often becomes stale over time as the application or service evolves with new features and capabilities, thus breaking the fidelity of the test environment.
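One way to notice a stale static data set early is to compare it against the current schema at load time. The sketch below assumes a SQLite data store and uses a hypothetical `customers` table and `SEED_ROWS` fixture; neither comes from a real system. It refuses to load seed rows whose fields no longer match the table's columns.

```python
import sqlite3
from typing import Dict, List, Set

# Hypothetical schema and static seed data, for illustration only.
SCHEMA = (
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, email TEXT NOT NULL, plan TEXT NOT NULL)"
)
SEED_ROWS: List[Dict[str, object]] = [
    {"id": 1, "email": "a@example.test", "plan": "free"},
    {"id": 2, "email": "b@example.test", "plan": "pro"},
]


def table_columns(conn: sqlite3.Connection, table: str) -> Set[str]:
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


def load_seed(conn: sqlite3.Connection, table: str,
              rows: List[Dict[str, object]]) -> int:
    """Load static rows, failing loudly if they have drifted from the schema."""
    expected = table_columns(conn, table)
    for row in rows:
        if set(row) != expected:
            raise ValueError(
                f"stale seed row for {table}: has {sorted(row)}, "
                f"schema expects {sorted(expected)}"
            )
    cols = sorted(expected)
    placeholders = ", ".join("?" for _ in cols)
    conn.executemany(
        f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})",
        [[row[c] for c in cols] for row in rows],
    )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
print(load_seed(conn, "customers", SEED_ROWS))  # prints 2
```

A check like this only catches structural drift. It cannot tell you that the data no longer exercises newer features, which still requires maintaining the data set alongside the code.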
While it’s tempting to take a slice of the data in your production environment to seed the test environment, this approach raises many security and compliance concerns. It’s not a best practice to have developers walking around with highly sensitive customer information on their laptops. While there are approaches that anonymize the PII in the production data and use this anonymized data to seed test environments, this is a non-starter for most businesses from a security and compliance perspective.
An alternative approach is to have test automation generate the seed data set in your test environment: a regression test creates fictitious customer data, which can then be used for more advanced scenario testing. This approach avoids the potential concerns with directly loading the data stores, but it is much slower and can make the automated tests harder to maintain. The correct approach will depend on the application, data model, and complexity of your scenarios.
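As a rough sketch of generating seed data through automation, the Python example below creates fictitious customers by calling an application's public operations instead of writing to its data stores. The `App` class, its `sign_up` and `place_order` methods, and the `seed_via_api` helper are all hypothetical; in practice these calls would go through your real API client or UI automation. A fixed random seed keeps the generated data reproducible between runs.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class App:
    """Hypothetical application under test, with its own validation rules."""

    customers: Dict[str, dict] = field(default_factory=dict)
    orders: List[dict] = field(default_factory=list)

    def sign_up(self, email: str, name: str) -> str:
        if "@" not in email:
            raise ValueError("invalid email")
        if email in self.customers:
            raise ValueError("duplicate customer")
        self.customers[email] = {"name": name}
        return email

    def place_order(self, email: str, amount_cents: int) -> None:
        if email not in self.customers:
            raise KeyError(email)
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        self.orders.append({"email": email, "amount_cents": amount_cents})


def seed_via_api(app: App, customer_count: int, seed: int = 0) -> None:
    """Create fictitious customers and orders through the app's own operations."""
    rng = random.Random(seed)  # fixed seed keeps the data set reproducible
    for i in range(customer_count):
        email = f"customer{i}@example.test"
        app.sign_up(email, name=f"Test Customer {i}")
        # Each customer gets between zero and three orders.
        for _ in range(rng.randint(0, 3)):
            app.place_order(email, rng.randint(100, 50_000))


app = App()
seed_via_api(app, customer_count=5, seed=42)
print(len(app.customers))  # prints 5
```

Because every record passes through the same validation as real traffic, the seed data cannot drift out of step with the application, which is the benefit this approach trades against its slower setup time.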
As we use automation to achieve higher fidelity test environments that better represent production from an architecture and data perspective, we will continue to see corresponding improvements to overall application quality. This won’t happen on its own, though. Investing in environment and test data automation must be a priority item in your development backlog, with a seat at the table equal to that of any feature-driven project. And once it is in place, it must be properly maintained as part of the development process. Once you are able to dynamically create high fidelity test environments, you will substantially raise the quality of your software.