Making it hum

  • Katherine Robins, Partner

Most large organizations have dabbled in DevOps. Many of them have embraced it and use it for an application’s full deployment lifecycle, along with infrastructure as code. For every organization that’s doing DevOps at scale, there are as many that are earlier in their journey. The current market trend in DevOps is to do it faster, better and more securely. How do we make that happen? How do we make it hum?

SecDevOps (or Secure DevOps, depending on one’s preference), when done well, accelerates development and deployment by adhering to pre-defined guardrails and running automatic checks during these stages of the lifecycle. Many organizations are coming to a tipping point where the DevOps piece is humming along, but things grind to a halt when security activities such as post-development architectural review, vulnerability scanning and penetration testing are performed manually before go-live.

From a technical perspective, SecDevOps embeds security checks into the CI/CD pipeline so that security is baked into the design and applied at scale. Security architecture patterns and guardrails, such as secure controls for container images, are built into the code so that infrastructure is stood up with security in mind. Secure code scanning is built into the CI/CD pipeline to verify secure coding practices before the code can be executed and deployed. Vulnerability scanning still happens before go-live, but it is automated within the CI/CD pipeline so that it runs before anything rolls into Production. Failures are either auto-remediated by applying the appropriate patches or sent back to the deployment team for remediation.
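As a rough illustration of the gating described above, the Python sketch below models a pipeline in which each security gate must pass before the next stage runs. The `Stage` type, the stage names and the dictionary keys are hypothetical placeholders, not the API of any particular CI/CD product; in a real pipeline each check would call out to a scanner or policy engine.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    check: Callable[[dict], bool]  # returns True when the gate passes


def run_pipeline(stages: List[Stage], artifact: dict) -> Tuple[bool, List[str]]:
    """Run the gates in order and stop at the first failure."""
    log: List[str] = []
    for stage in stages:
        if not stage.check(artifact):
            log.append(f"{stage.name}: FAIL")
            return False, log
        log.append(f"{stage.name}: pass")
    return True, log


# Hypothetical gates mirroring the three checks in the text:
# guardrails on the image, secure code scanning, vulnerability scanning.
STAGES = [
    Stage("guardrails", lambda a: a.get("base_image_approved", False)),
    Stage("secure-code-scan", lambda a: a.get("code_findings", 0) == 0),
    Stage("vulnerability-scan", lambda a: a.get("critical_vulns", 0) == 0),
]

if __name__ == "__main__":
    candidate = {"base_image_approved": True, "code_findings": 0, "critical_vulns": 2}
    ok, log = run_pipeline(STAGES, candidate)
    print(ok, log)
```

The point of the sketch is only the ordering: nothing reaches Production unless every earlier gate has passed, and a failure stops the run at the gate that caught it.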

Sounds easy, right?  

The technology exists to create security at scale and build security into every design. The hard bit is marrying the risk appetite, the security architect, the security engineer, the developer and the operations team into one long, seamless pipeline.  

The first step in enabling this is overcoming the problem of people: breaking down silos and communicating effectively between the teams. The second step is determining the risk appetite for your organization. Are you in an organization that balks at the thought of performing security checks in a fully automated way? Is there a middle ground so that you can go faster before you truly go fast? The third step is the coalition of the willing: a coalition of programs and teams that want to truly go fast and at scale, and that know how to leverage processes and technology to integrate secure design methodologies into their products.

So let’s break down an example of an organization that does it well. This group has a small engineering team and a small operations team, but they run in a large organization. The teams sit in different silos but work much like a single team, almost like a production line. They’ve established a “coalition of the willing”: they need to get their jobs done smarter because they can’t afford to do it manually. They’re also in a heavily regulated industry with a low risk appetite and a high need for regulatory compliance. They must be secure, compliant and able to show where controls are applied. To meet these requirements, they have decided to use SecDevOps for their deployments on-premises and in the cloud. Everything they do is done “as code.”

The security architectural patterns for infrastructure are used to write scripts that build the architecture with security guardrails built in. The code is scanned for secure coding practices before it goes from Build to Testing. The infrastructure is vulnerability scanned before it goes from Test into Production. If there is a failure at any point in the pipeline, there are steps to flag it for redevelopment. For a code failure, an automated ticket describing the failure is sent to the development team, who review and remediate it manually. If the vulnerability scan fails, an automated ticket is raised and the scan output is reviewed manually. Their organizational risk appetite doesn’t allow for full auto-remediation just yet. Still, they are discussing it for areas that are deemed “low risk.” The security operations and network operations teams are beginning to use artificial intelligence and machine learning for anomaly detection on level 1 and level 2 incidents, with the plan to auto-generate cases and tickets for events. They’re on a continuous improvement journey, never resting on their laurels.
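The failure handling described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the group's actual tooling: the `Gate` and `Ticket` types, the queue names `development-team` and `security-review`, and the `allow_low_risk_auto` switch are all hypothetical. The switch defaults to off to reflect a risk appetite that does not yet permit auto-remediation.

```python
from dataclasses import dataclass
from enum import Enum


class Gate(Enum):
    CODE_SCAN = "code_scan"  # runs between Build and Testing
    VULN_SCAN = "vuln_scan"  # runs between Test and Production


@dataclass
class Ticket:
    assignee: str
    summary: str
    needs_manual_review: bool


def handle_failure(gate: Gate, detail: str, risk: str = "high",
                   allow_low_risk_auto: bool = False) -> Ticket:
    """Raise an automated ticket for a failed gate.

    Manual review is required unless the organization has opted in to
    auto-remediation AND the affected area is rated low risk.
    """
    auto = allow_low_risk_auto and risk == "low"
    # Assumed routing: code failures go to developers, scan failures to a review queue.
    assignee = "development-team" if gate is Gate.CODE_SCAN else "security-review"
    return Ticket(
        assignee=assignee,
        summary=f"{gate.value} failed: {detail}",
        needs_manual_review=not auto,
    )


if __name__ == "__main__":
    print(handle_failure(Gate.CODE_SCAN, "hardcoded secret"))
    print(handle_failure(Gate.VULN_SCAN, "outdated package", risk="low",
                         allow_low_risk_auto=True))
```

Keeping the risk decision in a single flag makes the "middle ground" explicit: the organization can widen automation one low-risk area at a time without changing how tickets are raised.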

This group uses role-based access and network segmentation to ensure that least privilege is applied as a practice across the board. Any manual reviews are done by more than one person to ensure there isn’t a single point of failure. Segregation of duties is also applied, so a single compromised person cannot both alter the CI/CD pipeline and deploy compromised infrastructure.
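A minimal sketch of these two controls, assuming a simple mapping of users to roles: the role names `pipeline_admin` and `deployer`, and the two-reviewer minimum, are illustrative assumptions rather than details from the organization described.

```python
from typing import Dict, List, Set, Tuple

# Assumed conflict: nobody should be able to change the pipeline AND deploy through it.
CONFLICTING_ROLES: List[Tuple[str, str]] = [("pipeline_admin", "deployer")]


def sod_violations(assignments: Dict[str, Set[str]]) -> List[str]:
    """Return users who hold both roles of any conflicting pair."""
    return sorted(
        user
        for user, roles in assignments.items()
        if any(a in roles and b in roles for a, b in CONFLICTING_ROLES)
    )


def review_is_valid(author: str, reviewers: Set[str], minimum: int = 2) -> bool:
    """A manual review counts only with enough reviewers other than the author."""
    return len(reviewers - {author}) >= minimum


if __name__ == "__main__":
    roles = {
        "alice": {"pipeline_admin"},
        "bob": {"deployer"},
        "carol": {"pipeline_admin", "deployer"},
    }
    print(sod_violations(roles))
    print(review_is_valid("alice", {"alice", "bob"}))
```

A check like this can itself run as a pipeline gate, so that a role change that breaks segregation of duties is flagged before it takes effect.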

The journey they embarked upon to get to this point took two years. They started small and worked their way up, becoming more confident as they went. They relied heavily on KPMG and their vendors to enable code-driven integration into their physical infrastructure, such as API integration with firewall and routing infrastructure. In some cases, they lifecycle-managed older infrastructure, replacing it with newer versions that enabled API integration and orchestration.

This group failed often and fast, but they weren’t afraid to fail. Each failure taught them what to do next and how to improve. As they became successful, they built on their success by adding more system integration and scale.

Their recipe for success was simple: Automate everything and be secure by design. Try. Fail. Learn. The task was monumental: to eat an elephant by picking up a spoon and taking it piece by piece.