
A 5-Stage Process for Automated Testing and Delivery of Complex Software Systems

Caden Milne and Lyndsi Hughes

Managing and maintaining deployments of complex software presents engineers with a multitude of challenges: security vulnerabilities, outdated dependencies, and unpredictable and asynchronous vendor release cadences, to name a few.

We describe here an approach to automating key activities in the software operations process, with a focus on the setup and testing of updates to third-party code. A key benefit is that engineers can more quickly and confidently deploy the latest versions of software. This allows a team to more easily and safely stay up to date on software releases, both to support client needs and to stay current on security patches.

We illustrate this approach with a software engineering process platform managed by our team of researchers in the Applied Systems Group of the SEI’s CERT Division. This platform is designed to be compliant with the requirements of the Cybersecurity Maturity Model Certification (CMMC) and NIST SP 800-171. Each of the challenges above presents risks to the stability and security compliance of the platform, and addressing these issues demands time and effort.

When system deployment is done without automation, system administrators must spend time manually downloading, verifying, installing, and configuring each new release of any particular software tool. Additionally, this process must first be done in a test environment to ensure that the software and all its dependencies integrate successfully and that the upgraded system is fully functional. Then the process must be repeated in the production environment.

When an engineer’s time is freed up by automation, more effort can be allocated to delivering new capabilities to the warfighter, with more efficiency, higher quality, and less risk of security vulnerabilities. Continuous deployment of capability describes a set of principles and practices that provide faster delivery of secure software capabilities by improving the collaboration and communication that links software development teams with IT operations and security staff, as well as with acquirers, suppliers, and other system stakeholders.

While this approach benefits software development generally, we suggest that it is especially important in high-stakes software for national security missions.

In this post, we describe our approach to using DevSecOps tools to automate the delivery of third-party software to development teams using CI/CD pipelines. This approach is targeted at software systems that are container-compatible.

Building an Automated Configuration Testing Pipeline

Not every team in a software-oriented organization is focused specifically on the engineering of the software product. Our team bears responsibility for two sometimes competing tasks:

  • Delivering valuable technology, such as tools for automated testing, that enables software engineers to perform product development, and
  • Deploying security updates to the technology.

In other words, delivery of value in the continuous deployment of capability may often not be directly focused on the development of any specific product. Other dimensions of value include “the people, processes, and technology necessary to build, deploy, and operate the enterprise’s products. In general, this business concern consists of the software factory and product operational environments; however, it does not consist of the products.”

To improve our ability to complete these tasks, we designed and implemented a custom pipeline that is a variation of the continuous integration/continuous deployment (CI/CD) pipeline found in many traditional DevSecOps workflows, as shown below.

Figure 1: The DevSecOps Infinity diagram, which represents the continuous integration/continuous deployment (CI/CD) pipeline found in many traditional DevSecOps workflows.

The main difference between our pipeline and a traditional CI/CD pipeline is that we are not developing the application that is being deployed; the software is typically provided by a third-party vendor. Our focus is on delivering it to our environment, deploying it onto our information systems, operating it, and monitoring it for proper functionality.

Automation can yield terrific benefits in productivity, efficiency, and security throughout an organization. Engineers can address vulnerabilities more quickly and with less human intervention, so systems are more readily kept compliant, stable, and secure. In other words, automation of the relevant pipeline processes can increase our team’s productivity, enforce security compliance, and improve the user experience for our software engineers.

There are, however, potential negative outcomes when automation is done incorrectly. It is important to recognize that because automation allows many actions to be performed in quick succession, there is always the possibility that those actions lead to unwanted results. Unwanted results may be unintentionally introduced via buggy process-support code that doesn’t perform the correct checks before taking an action, or via an unconsidered edge case in a complex system.

It is therefore important to take precautions when automating a process: put guardrails in place so that a failed automated process cannot affect production applications, services, or data. This can include, for example, writing tests that validate each stage of the automated process, performing validity checks before each action, and halting safely and non-destructively when operations fail.
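For illustration, a minimal sketch of this guardrail pattern in Python might look like the following; the validation, deployment, and teardown helpers are placeholders for whatever checks and cleanup an actual pipeline would need:

```python
import sys

def validate_release(image_ref: str) -> bool:
    """Placeholder validity check; a real pipeline would verify signatures, checksums, etc."""
    return bool(image_ref)

def deploy_to_test_environment(image_ref: str) -> None:
    """Placeholder deployment into a disposable test environment."""
    print(f"Deploying {image_ref} to the test environment...")

def tear_down_test_environment() -> None:
    """Placeholder cleanup so a failed run leaves nothing behind."""
    print("Tearing down the test environment.")

def guarded_upgrade(image_ref: str) -> None:
    # Validity check before taking any action.
    if not validate_release(image_ref):
        # Non-destructive halt: nothing has changed yet, so simply stop.
        sys.exit("Pre-flight validation failed; current deployment left untouched.")
    try:
        deploy_to_test_environment(image_ref)
    except Exception as err:
        # Safe halt: the failure stays confined to the disposable test environment.
        tear_down_test_environment()
        sys.exit(f"Upgrade halted safely after a test deployment failure: {err}")

if __name__ == "__main__":
    guarded_upgrade("vendor/app:1.5.0")
```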

Developing meaningful tests may be challenging, requiring careful and creative consideration of the many ways a process could fail, as well as how to return the system to a working state should failures occur.

Our approach to addressing this challenge revolves around integration, regression, and functional tests that run automatically in the pipeline. These tests are required to ensure that the functionality of the third-party application is not affected by changes in the configuration of the system, and also that new releases of the application still interact as expected with older versions' configurations and setups.

Automating Containerized Deployments Using a CI/CD Pipeline

A Case Study: Implementing a Custom Continuous Delivery Pipeline

Teams at the SEI have extensive experience building DevSecOps pipelines. One team in particular defined the concept of creating a minimum viable process to frame a pipeline’s structure before diving into development. This allows all of the groups working on the same pipeline to collaborate more efficiently.

In our pipeline, we started with the first half of the traditional CI/CD pipeline structure, which was already in place to support third-party software released by the vendor. This gave us an opportunity to dive deeper into the later stages of the pipeline: delivery, testing, deployment, and operation. The end result was a five-stage pipeline that automates testing and delivery for all of the software components in the tool suite in the event of configuration changes or new version releases.

To avoid the many complexities involved with delivering and deploying third-party software natively on hosts in our environment, we opted for a container-based approach. We developed the container build specifications, deployment specifications, and pipeline job specifications in our Git repository. This enabled us to vet any desired changes to the configurations using code reviews before they could be deployed in a production environment.

A Five-Stage Pipeline for Automating Testing and Delivery in the Tool Suite

Stage 1: Automated Version Detection

When the pipeline runs, it searches the vendor site for either the user-specified release or the latest release of the application as a container image. If a new release is found, the pipeline uses established communication channels to notify engineers of the discovery. Then the pipeline automatically attempts to safely download the container image directly from the vendor. If the container image cannot be retrieved from the vendor, the pipeline fails and alerts engineers to the issue.
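As a rough sketch (not our actual pipeline code), a version-detection step could look like the following, assuming the vendor exposes a standard Docker Registry v2 tags endpoint; the registry URL, image name, and current version below are placeholders:

```python
import requests

REGISTRY = "https://registry.vendor.example"   # hypothetical vendor registry
IMAGE = "vendor/app"                           # hypothetical image name
CURRENT_TAG = "1.4.2"                          # version currently deployed

def latest_release_tag() -> str:
    """Ask the registry for available tags (assumes the standard Docker Registry v2 API)."""
    resp = requests.get(f"{REGISTRY}/v2/{IMAGE}/tags/list", timeout=30)
    resp.raise_for_status()
    numeric_tags = [t for t in resp.json().get("tags", []) if t[0].isdigit()]
    # Naive "highest version wins" comparison for dotted numeric tags.
    return max(numeric_tags, key=lambda t: tuple(int(x) for x in t.split(".")))

if __name__ == "__main__":
    latest = latest_release_tag()
    if latest != CURRENT_TAG:
        # A real pipeline would notify engineers and then attempt a safe download here.
        print(f"New release detected: {latest}")
    else:
        print("No new release found; nothing to do.")
```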

Stage 2: Automated Vulnerability Scanning

After downloading the container image from the vendor site, it is best practice to run a vulnerability scanner to make sure that no obvious issues missed by the vendor in its release end up in the production deployment. The pipeline implements this extra layer of security by utilizing common container scanning tools. If vulnerabilities are found in the container image, the pipeline fails.
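A minimal sketch of this stage is shown below, assuming a scanner such as Trivy that returns a nonzero exit code when findings meet a severity threshold; substitute whatever scanner and flags your organization uses:

```python
import subprocess
import sys

IMAGE = "registry.vendor.example/vendor/app:1.5.0"   # hypothetical image reference

# Invoke a container scanner as a subprocess. The command assumes Trivy-style
# behavior: exit nonzero when HIGH or CRITICAL findings are present.
result = subprocess.run(
    ["trivy", "image", "--exit-code", "1", "--severity", "HIGH,CRITICAL", IMAGE],
)

if result.returncode != 0:
    # Failing the job here keeps the vulnerable image out of production deliveries.
    sys.exit(f"Vulnerability scan failed for {IMAGE}; stopping the pipeline.")

print("Scan passed; continuing to the deployment stage.")
```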

Stage 3: Automated Application Deployment

At this point in the pipeline, the new container image has been successfully downloaded and scanned. The next step is to set up the pipeline’s environment so that it resembles our production deployment’s environment as closely as possible. To achieve this, we created a testing system within a Docker-in-Docker (DIND) pipeline container that simulates the process of upgrading applications in a real deployment environment. The process keeps track of our configuration files for the software and loads test data into the application to ensure that everything works as expected.

To differentiate between these environments, we used an environment-based DevSecOps workflow (see Figure 2) that offers more fine-grained control over configuration setups in each deployment environment. This workflow enables us to develop and test on feature branches, engage in code reviews when merging feature branches into the main branch, automate testing on the main branch, and account for environmental differences between the test and production code (e.g., different sets of credentials are required in each environment).

Figure 2: The Git Branch Diagram

Because we are using containers, it does not matter that the container runs in two completely different environments in the pipeline and in production deployments. The outcome of the testing is expected to be the same in both environments.

Now, the application is up and running inside the pipeline. To better simulate a real deployment, we load test data into the application which will serve as a basis for a later testing stage in the pipeline.
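The sketch below illustrates what this deploy-and-seed step could look like using the Docker SDK for Python inside the DIND service; the image reference, configuration paths, health endpoint, and fixture data are hypothetical placeholders:

```python
import time
import docker    # docker-py; the pipeline job runs against a Docker-in-Docker service
import requests

client = docker.from_env()

# Start the newly vetted image with the same configuration files we use in production.
container = client.containers.run(
    "registry.vendor.example/vendor/app:1.5.0",
    detach=True,
    name="app-under-test",
    ports={"8080/tcp": 8080},
    volumes={"/builds/config/app.conf": {"bind": "/etc/app/app.conf", "mode": "ro"}},
    environment={"APP_ENV": "test"},   # test credentials differ from production
)

# Wait until the application answers its (assumed) health endpoint.
for _ in range(30):
    try:
        if requests.get("http://localhost:8080/health", timeout=5).ok:
            break
    except requests.ConnectionError:
        time.sleep(5)

# Load test data through the application's API so later test stages have a known state.
requests.post(
    "http://localhost:8080/api/v1/projects",
    json={"name": "pipeline-fixture-project"},
    timeout=10,
).raise_for_status()
```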

Stage 4: Automated Testing

Automated tests in this stage of the pipeline fall into several categories. For this specific application, the most relevant testing strategies are regression testing, smoke testing, and functional testing.

After the application has been successfully deployed inside the pipeline, we run a series of tests on the software to ensure that it is functioning and that there are no issues with the configuration files that we provided. One way this can be accomplished is by using the application’s APIs to access the data that was loaded in during Stage 3. It can be helpful to read through the third-party software's documentation and look for API references or endpoints that might simplify this process. This ensures that you test not only the basic functionality of the application but also that the system functions in practical use and that the API usage is sound.
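For example, a small pytest-style suite along these lines could serve as the smoke and functional tests; the base URL, endpoints, and fixture name are hypothetical and would come from the vendor's API documentation:

```python
import requests

BASE_URL = "http://localhost:8080/api/v1"   # hypothetical API root of the app under test

def test_application_is_up():
    # Smoke test: the upgraded service responds at all.
    assert requests.get(f"{BASE_URL}/health", timeout=10).status_code == 200

def test_loaded_data_is_queryable():
    # Functional/regression test: the data loaded in Stage 3 is retrievable, which
    # exercises both the new release and our existing configuration files.
    resp = requests.get(f"{BASE_URL}/projects", timeout=10)
    assert resp.status_code == 200
    names = {p["name"] for p in resp.json()}
    assert "pipeline-fixture-project" in names   # fixture name is a placeholder
```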

Stage 5: Automated Delivery

Finally, after all of the previous stages complete successfully, the pipeline makes the fully tested container image available for use in production deployments. After the container has been thoroughly tested in the pipeline and becomes available, engineers can choose to use it in whichever environment they want (e.g., test, quality assurance, staging, or production).
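A delivery step can be as simple as retagging the vetted image and pushing it to an internal registry that downstream environments pull from; the sketch below assumes a hypothetical internal registry and uses the Docker CLI:

```python
import subprocess

VENDOR_IMAGE = "registry.vendor.example/vendor/app:1.5.0"        # scanned and tested above
INTERNAL_IMAGE = "registry.internal.example/approved/app:1.5.0"  # hypothetical internal registry

# Retag the vetted image and push it to the internal registry that production and
# other environments pull from. Pushing here is delivery, not deployment.
subprocess.run(["docker", "tag", VENDOR_IMAGE, INTERNAL_IMAGE], check=True)
subprocess.run(["docker", "push", INTERNAL_IMAGE], check=True)

print(f"Delivered {INTERNAL_IMAGE}; environments may now upgrade when ready.")
```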

An important aspect of delivery is the communication channels that the pipeline uses to convey the information it has collected. This SEI blog post explains the benefits of communicating directly with developers and DevSecOps engineers through channels that are already a part of their respective workflows.
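For instance, a short notification helper that posts to an existing chat webhook (the URL below is a placeholder) keeps engineers informed without adding a new tool to their workflow:

```python
import requests

WEBHOOK_URL = "https://chat.internal.example/hooks/pipeline-updates"  # hypothetical webhook

def notify(message: str) -> None:
    """Post a short status message to the team's existing chat channel."""
    requests.post(WEBHOOK_URL, json={"text": message}, timeout=10).raise_for_status()

notify("app 1.5.0 passed scanning and testing and is available in the internal registry.")
```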

It is important here to make the distinction between delivery and deployment. Delivery refers to the process of making software available to the systems where it will end up being installed. In contrast, the term deployment refers to the process of automatically pushing the software out to the system, making it available to the end users. In our pipeline, we focus on delivery instead of deployment because the services for which we are automating upgrades require a high degree of reliability and uptime. A future goal of this work is to eventually implement automated deployments.

Handling Pipeline Failures

With this model for a custom pipeline, failure modes are designed into the process. When the pipeline fails, diagnosis of the failure should identify remedial actions for the engineers to undertake. These problems could be issues with the configuration files, software versions, test data, file permissions, environment setup, or some other unforeseen error. By running an exhaustive series of tests, engineers come into the situation equipped with a greater understanding of potential problems with the setup. This ensures that they can make the needed adjustments as effectively as possible and avoid running into incompatibility issues in a production deployment.

Implementation Challenges

We faced some particular challenges in our experimentation, and we share them here, since they may be instructive.

The first challenge was deciding how the pipeline would be designed. Because the pipeline is still evolving, team members needed to remain flexible to maintain a consistent picture of the pipeline's status and future goals. We also needed the team to stay committed to continuously improving the pipeline. We found it helpful to sync up regularly with progress updates so that everyone stayed on the same page throughout the pipeline design and development processes.

The next challenge appeared during the pipeline implementation process. While we were migrating our data to a container-based platform, we discovered that many of the containerized releases of different software needed in our pipeline lacked documentation. To ensure that all the knowledge we gained throughout the design, development, and implementation processes was shared by the entire team, we found it necessary to write a large amount of our own documentation to serve as a reference throughout the process.

A final challenge was to overcome the tendency to stick with a working process that is minimally feasible but fails to benefit from modern process approaches and tooling. It can be easy to settle into the mindset of “this works for us” and “we’ve always done it this way” and fail to make the implementation of proven principles and practices a priority. Complexity and the cost of initial setup can be a major barrier to change. Initially, we had to work through the hassle of creating our own custom container images that had the same functionality as the existing, working systems. At the time, we questioned whether this extra effort was even necessary. However, it became clear that switching to containers significantly reduced the complexity of automatically deploying the software in our environment, and that reduction in complexity allowed the time and cognitive space for the addition of extensive automated testing of the upgrade process and the functionality of the upgraded system.

Now, instead of manually performing all the tests required to ensure the upgraded system functions correctly, engineers are alerted only when an automated test fails and requires intervention. It is important to consider the various organizational barriers that teams might run into while implementing complex pipelines.

Managing Technical Debt and Other Decisions When Automating Your Software Delivery Workflow

When making the decision to automate a major part of your software delivery workflow, it is important to develop metrics that demonstrate benefits to the organization and justify the upfront investment of time and effort in crafting and implementing all the required tests, learning the new workflow, and configuring the pipeline. In our experimentation, we judged that it was a highly worthwhile investment to make the change.

Modern CI/CD tools and practices are some of the best ways to combat technical debt. The automation pipelines that we implemented have saved countless hours for engineers, and we expect they will continue to do so over years of operation. By automating the setup and testing stages for updates, engineers can deploy the latest versions of software more quickly and with more confidence. This allows our team to stay up to date on software releases to better support our customers’ needs and help them stay current on security patches. Our team is able to use the newly freed-up time to work on other research and projects that improve the capabilities of the DoD warfighter.

Additional Resources

The DevSecOps Capability Maturity Model by Timothy A. Chick, Brent Frye, and Aaron Reffett.

Extending Agile and DevSecOps to Improve Efforts Tangential to Software Product Development by David Sweeney and Lyndsi A. Hughes.
