DevSecOps Engineering
Building Security Into the Foundations
At the start of the DevSecOps transformation at a Tier-1 bank, the state of affairs was what one would expect from a large, regulated institution that had grown through decades of acquisition: hundreds of applications, dozens of delivery teams, and a security review process that averaged six weeks per release. Developers treated security as a gate they had to survive, not a discipline they owned. The security team, outnumbered fifty-to-one by engineers, was drowning in manual reviews and rubber-stamping findings they did not have time to validate. Compliance audits consumed entire quarters.
A critical early decision changed everything: the team would not bolt security onto existing pipelines. Instead, the pipelines would be rebuilt so that security was the path of least resistance. These were called golden paths -- opinionated, paved roads through the delivery lifecycle where security scanning, policy enforcement, and compliance evidence generation happened automatically, invisibly, and without a single developer filing a ticket. Within eighteen months, the mean time for a security review dropped from six weeks to under four hours. Change failure rates tied to security defects fell by over sixty percent. The lesson was clear: engineers do not resist security when security does not resist engineering. The practices documented here are not theoretical. They are the same patterns used to move a 40,000-person technology organization from quarterly releases to continuous delivery without increasing risk -- and in many cases, materially reducing it.
Overview
DevSecOps engineering is the practical implementation of DevSecOps principles: integrating security into the software development lifecycle through automation, collaboration, and continuous improvement.
DevSecOps Engineering is a multidisciplinary field that combines principles from software engineering, security, and operations to create secure, scalable, and efficient systems. It requires a deep understanding of security best practices, automation tools, and continuous integration/continuous deployment (CI/CD) pipelines.
In regulated industries such as banking and financial services, DevSecOps engineering is not optional -- it is a survival requirement. Regulators expect demonstrable, auditable evidence that security controls are embedded throughout the delivery lifecycle, not applied as an afterthought. The Australian Signals Directorate's Information Security Manual (ISM) codifies this expectation: organizations must implement security controls across governance, physical, personnel, and ICT domains as a continuous process, with specific guidance for software development and secure configuration management.
Key Engineering Practices
- Infrastructure as Code (IaC): Managing infrastructure using code to ensure consistency and repeatability.
- Configuration Management: Automating the management of configuration settings across environments.
- Continuous Monitoring: Implementing monitoring solutions to detect and respond to security incidents in real time.
- Golden Path Engineering: Building opinionated, secure-by-default delivery pipelines that make the right thing the easy thing.
- Policy as Code: Encoding security and compliance policies into machine-readable, version-controlled artifacts that are enforced automatically at every stage of the pipeline.
Infrastructure as Code (IaC)
IaC involves managing and provisioning infrastructure through code. Key practices include:
- Version Control: Storing infrastructure code in version control systems like Git.
- Automated Provisioning: Using tools like Terraform and Ansible to automate infrastructure provisioning.
- Environment Consistency: Ensuring that development, testing, and production environments are consistent.
In banking environments, IaC is the foundation of audit compliance. When every infrastructure change is a pull request with a review trail, you have eliminated an entire class of audit findings related to undocumented changes. At the bank, all infrastructure modifications were required to pass through Terraform plans reviewed by both the owning team and a rotating infrastructure security reviewer -- not as a gate, but as a peer review integrated into the normal development workflow.
Version Control
Version control for infrastructure means storing infrastructure code in a system such as Git. It lets multiple developers collaborate on infrastructure changes, keeps a history of every change, and makes it possible to revert to a previous version when needed. Examples of version control systems include:
- Git: A distributed version control system widely used in the software industry.
- Subversion (SVN): A centralized version control system that tracks changes to files and directories.
- Mercurial: A distributed version control system similar to Git.
Example: Using Git for Version Control
For instance, using Git allows developers to create branches for new infrastructure changes, merge changes seamlessly, and revert to previous versions if issues arise. This ensures a smooth and collaborative development process.
In a Tier-1 bank context, Git-based version control of infrastructure code serves a dual purpose: it is both the engineering workflow and the compliance evidence. Every commit is a timestamped, attributed record of who changed what, when, and why. When regulators ask for change management documentation, you point them at the Git log and the associated pull request reviews. This replaced a manual change advisory board (CAB) process that consumed over 200 person-hours per month.
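To make the "Git log as compliance evidence" idea concrete, the following is a minimal sketch of extracting who-changed-what-when records from a repository. It assumes the standard `git log` pretty-format placeholders (`%H`, `%an`, `%aI`, `%s`, `%x1f`); the `ChangeRecord` type and the function names are hypothetical, and a real evidence pipeline would also pull the associated pull request reviews from the hosting platform.

```python
import subprocess
from dataclasses import dataclass

# %x1f makes git emit an ASCII unit separator between fields, so commit
# subjects containing "|" or "," cannot break the parsing.
FIELD_SEP = "\x1f"
GIT_FORMAT = "%H%x1f%an%x1f%aI%x1f%s"


@dataclass(frozen=True)
class ChangeRecord:  # hypothetical evidence record
    commit: str
    author: str
    authored_at: str  # ISO 8601 timestamp as emitted by %aI
    summary: str


def parse_git_log(raw: str) -> list[ChangeRecord]:
    """Parse output of `git log --pretty=format:<GIT_FORMAT>` into records."""
    records = []
    for line in raw.split("\n"):
        if not line.strip():
            continue
        commit, author, authored_at, summary = line.split(FIELD_SEP, 3)
        records.append(ChangeRecord(commit, author, authored_at, summary))
    return records


def collect_change_evidence(repo_path: str, since: str) -> list[ChangeRecord]:
    """Run git in repo_path and return the changes authored since `since`."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}",
         f"--pretty=format:{GIT_FORMAT}"],
        check=True, capture_output=True, text=True,
    )
    return parse_git_log(result.stdout)
```

Calling `collect_change_evidence("/path/to/infra-repo", "2024-01-01")` would return one record per commit since that date, which can then be serialized for an audit request.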
Automated Provisioning
Automated Provisioning involves using tools like Terraform and Ansible to automate infrastructure provisioning. Key practices include:
- Infrastructure as Code (IaC): Writing code to define and manage infrastructure.
- Automated Deployment: Using automation tools to deploy infrastructure consistently across environments.
- Configuration Management: Managing configuration settings using tools like Ansible, Chef, and Puppet.
Example: Using Terraform for Automated Provisioning
For example, using Terraform to define and manage infrastructure as code ensures that infrastructure is provisioned consistently across different environments. This reduces the risk of configuration drift and improves the reliability of deployments.
At the bank, the team built Terraform modules that encoded CIS Benchmark configurations by default. When a team provisioned a new AWS account or Azure subscription, the module automatically applied hardened security group rules, enabled logging to the central SIEM, configured encryption at rest and in transit, and registered the resources in the asset inventory. Teams could override defaults, but overrides triggered an automatic security review. The result: ninety-five percent of newly provisioned infrastructure was compliant from the moment it was created.
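The secure-by-default pattern described above can be sketched in a few lines. This is an illustration in Python rather than the bank's Terraform modules; the setting names in `SECURE_DEFAULTS` and the `build_plan` function are assumptions made for the example, not values taken from any CIS Benchmark.

```python
from dataclasses import dataclass, field

# Hypothetical hardened defaults. A real module would derive these from the
# relevant benchmark rather than hard-coding four keys.
SECURE_DEFAULTS = {
    "encryption_at_rest": True,
    "encryption_in_transit": True,
    "central_logging": True,
    "public_ingress": False,
}


@dataclass
class ProvisioningPlan:
    settings: dict
    overrides: dict = field(default_factory=dict)

    @property
    def needs_security_review(self) -> bool:
        # Any departure from the hardened defaults triggers a review.
        return bool(self.overrides)


def build_plan(requested: dict) -> ProvisioningPlan:
    """Merge a team's requested settings over the secure defaults."""
    unknown = set(requested) - set(SECURE_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown settings: {sorted(unknown)}")
    settings = {**SECURE_DEFAULTS, **requested}
    overrides = {k: v for k, v in requested.items() if v != SECURE_DEFAULTS[k]}
    return ProvisioningPlan(settings, overrides)
```

A team that requests nothing gets a fully hardened plan with no review; a team that sets `public_ingress` to `True` still gets its infrastructure, but the plan is flagged for security review.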
Environment Consistency
Environment Consistency ensures that development, testing, and production environments are consistent. Key practices include:
- Immutable Infrastructure: Deploying infrastructure that cannot be modified after it is created.
- Configuration Management: Using tools like Ansible, Chef, and Puppet to manage configuration settings.
- Automated Testing: Running tests automatically to ensure that environments are consistent.
Example: Using Ansible for Configuration Management
For instance, using Ansible to manage configuration settings ensures that development, testing, and production environments are consistent. This reduces the risk of configuration drift and improves the reliability of deployments.
In financial services, environment consistency is not a convenience -- it is a regulatory requirement. APRA examiners expect that what you test is what you deploy. At the bank, this was achieved by building container images in CI that were promoted immutably through environments. The same image hash that passed security scanning in the build stage was the exact artifact deployed to production. No rebuilds, no manual patches, no drift.
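The "same image hash" rule can be expressed as a promotion check: an artifact may move to the next environment only if its content digest matches one that already passed scanning. The sketch below assumes artifacts are available as bytes and that scanned digests are kept in a simple set; `promote` and `PromotionError` are hypothetical names, and a real registry would supply the digest itself.

```python
import hashlib


class PromotionError(Exception):
    """Raised when an artifact was not the one that passed scanning."""


def digest(artifact: bytes) -> str:
    """Content-addressed identifier in the familiar sha256:<hex> form."""
    return "sha256:" + hashlib.sha256(artifact).hexdigest()


def promote(artifact: bytes, scanned_digests: set[str], target_env: str) -> str:
    """Allow promotion only for byte-identical, previously scanned artifacts."""
    artifact_digest = digest(artifact)
    if artifact_digest not in scanned_digests:
        raise PromotionError(
            f"{artifact_digest} has no passing scan; refusing {target_env}"
        )
    return f"{target_env}:{artifact_digest}"
```

Because the check is on content rather than on a tag or version label, a rebuilt or manually patched image fails promotion even if it carries the same name.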
Configuration Management
Configuration Management involves automating the management of configuration settings. Key practices include:
- Configuration Drift Detection: Identifying and correcting configuration drift using tools like Chef and Puppet.
- Immutable Infrastructure: Deploying infrastructure that cannot be modified after it is created.
- Secret Management: Securely managing sensitive information like passwords and API keys.
Configuration Drift Detection
Configuration Drift Detection involves identifying and correcting configuration drift using tools like Chef and Puppet. Key practices include:
- Automated Scanning: Using tools to scan for configuration drift.
- Drift Remediation: Automatically correcting configuration drift to ensure consistency.
- Version Control: Storing configuration settings in version control systems to track changes.
Example: Using Chef for Configuration Drift Detection
For example, using Chef to scan for configuration drift and automatically correct it ensures that configuration settings remain consistent across environments. This reduces the risk of configuration issues and improves the reliability of deployments.
In a banking environment, configuration drift is not merely an operational inconvenience -- it is a potential compliance violation and a security exposure. The team implemented continuous drift detection using Open Policy Agent (OPA) policies that ran every fifteen minutes against the live state of every production environment. Drift was reported to a central dashboard and, for critical controls (encryption settings, network ACLs, IAM policies), automatically remediated. The security team shifted from manually auditing configurations to reviewing exception reports.
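The detect, report, and selectively remediate loop described above might look like the following. This is a plain Python sketch rather than OPA policy; the control names in `CRITICAL_CONTROLS` and the flat key-value state model are simplifying assumptions.

```python
from dataclasses import dataclass

# Hypothetical names for the controls that are remediated automatically.
CRITICAL_CONTROLS = frozenset({"encryption", "network_acl", "iam_policy"})


@dataclass(frozen=True)
class Drift:
    key: str
    expected: object
    actual: object

    @property
    def critical(self) -> bool:
        return self.key in CRITICAL_CONTROLS


def detect_drift(desired: dict, live: dict) -> list[Drift]:
    """Compare declared state with live state and list every mismatch."""
    return [
        Drift(key, expected, live.get(key))
        for key, expected in sorted(desired.items())
        if live.get(key) != expected
    ]


def plan_response(drifts: list[Drift]) -> tuple[dict, list[Drift]]:
    """Split drift into values to restore automatically and items to report."""
    remediate = {d.key: d.expected for d in drifts if d.critical}
    return remediate, list(drifts)
```

Run on a schedule, `detect_drift` feeds the dashboard with every mismatch, while only the critical subset returned by `plan_response` is pushed back to its declared value without human intervention.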
Immutable Infrastructure
Immutable Infrastructure involves deploying infrastructure that cannot be modified after it is created. Key practices include:
- Golden Images: Creating and deploying pre-configured images that cannot be modified.
- Automated Provisioning: Using automation tools to deploy immutable infrastructure.
- Configuration Management: Managing configuration settings using tools like Ansible, Chef, and Puppet.
Example: Using Golden Images for Immutable Infrastructure
For instance, creating and deploying golden images ensures that infrastructure is immutable and cannot be modified after it is created. This reduces the risk of configuration drift and improves the reliability of deployments.
The golden image pattern is where golden paths begin. At the bank, the team maintained a library of hardened base images -- one for each approved operating system and runtime -- that were rebuilt weekly with the latest security patches, scanned against CIS Benchmarks Level 2, and signed with a cryptographic attestation. Teams could only deploy containers or VMs derived from these signed base images. If a vulnerability was discovered in a base image, the team rebuilt and re-signed it, and every downstream deployment picked up the fix on its next release cycle without any team taking manual action.
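The deploy-time rule "only images derived from signed base images" reduces to a signature check on the base image digest. The sketch below uses an HMAC with a shared key purely as a stand-in so the example stays within the standard library; this is an assumption for illustration, since a production attestation would normally use asymmetric signatures, and the function names are hypothetical.

```python
import hashlib
import hmac


def sign_image(image_digest: str, signing_key: bytes) -> str:
    """Produce a signature over a base image digest (HMAC stand-in)."""
    return hmac.new(signing_key, image_digest.encode(), hashlib.sha256).hexdigest()


def is_approved_base(image_digest: str, signature: str, signing_key: bytes) -> bool:
    """Return True only if the signature matches this exact digest."""
    expected = sign_image(image_digest, signing_key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)
```

When a base image is rebuilt with new patches its digest changes, so the old signature no longer verifies and downstream builds must pick up the re-signed image.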
Secret Management
Secret Management involves securely managing sensitive information like passwords and API keys. Key practices include:
- Encryption: Encrypting sensitive information to protect it from unauthorized access.
- Access Control: Implementing access control mechanisms to restrict access to sensitive information.
- Secret Rotation: Regularly rotating secrets to minimize the risk of exposure.
Example: Using HashiCorp Vault for Secret Management
For example, using HashiCorp Vault to manage secrets ensures that sensitive information is encrypted and access is controlled. This reduces the risk of unauthorized access and improves the security of the system.
At the bank, the team deployed HashiCorp Vault with dynamic secrets for database credentials. Instead of storing long-lived credentials in configuration files -- a practice that had contributed to multiple audit findings -- applications requested short-lived credentials at runtime that expired after the session ended. Database credentials were rotated automatically every twenty-four hours. When combined with Vault's audit logging, this provided complete visibility into every secret access event, which satisfied both internal audit requirements and APRA examination evidence requests.
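The behavior of dynamic, short-lived credentials can be illustrated with an in-memory model. This sketch does not call Vault or reproduce its API; `CredentialBroker` and `Lease` are hypothetical types that only show the two properties that matter here: every credential carries an expiry, and every issuance leaves an audit record.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Lease:
    username: str
    password: str
    expires_at: datetime

    def is_valid(self, now: datetime) -> bool:
        return now < self.expires_at


class CredentialBroker:
    """Issues unique, expiring credentials and records each request."""

    def __init__(self, ttl: timedelta):
        self.ttl = ttl
        self.audit_log: list[tuple[datetime, str, str]] = []

    def issue(self, app: str, now: datetime) -> Lease:
        lease = Lease(
            username=f"{app}-{secrets.token_hex(4)}",
            password=secrets.token_urlsafe(24),
            expires_at=now + self.ttl,
        )
        # Log who asked and which identity was issued, never the password.
        self.audit_log.append((now, app, lease.username))
        return lease


broker = CredentialBroker(ttl=timedelta(hours=1))
issued_at = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
lease = broker.issue("payments-api", issued_at)
```

Because each application instance receives its own username, a leaked credential can be traced to a single request in the audit log and expires on its own.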
Continuous Monitoring
Continuous Monitoring involves implementing solutions to detect and respond to security incidents in real time. Key practices include:
- Log Management: Collecting and analyzing logs to identify security incidents.
- Intrusion Detection Systems (IDS): Detecting unauthorized access to systems.
- Security Information and Event Management (SIEM): Aggregating and analyzing security data from multiple sources.
Log Management
Log Management involves collecting and analyzing logs to identify security incidents. Key practices include:
- Centralized Logging: Collecting logs from multiple sources in a centralized location.
- Log Analysis: Analyzing logs to identify security incidents and trends.
- Alerting: Setting up alerts to notify stakeholders of security incidents.
Example: Using ELK Stack for Log Management
For instance, using the ELK Stack (Elasticsearch, Logstash, and Kibana) to collect and analyze logs provides a centralized view of security incidents. This helps in identifying and responding to security incidents quickly.
In the bank's environment, centralized logging was not negotiable -- APRA's CPS 234 requires that regulated entities maintain comprehensive audit trails for information security. The logging pipeline ingested over two terabytes of data daily from application logs, infrastructure events, authentication systems, and network devices into a single central platform. Correlation rules were tuned specifically for financial services threat patterns: credential stuffing against online banking, anomalous wire transfer approvals, after-hours privileged access to core banking systems.
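One of the correlation rules named above, after-hours privileged access to core banking systems, can be sketched as a filter over normalized log events. The event fields, the `core-banking` system label, and the 07:00 to 19:00 business window are assumptions chosen for the example.

```python
from datetime import datetime, time

# Hypothetical business hours; a real rule would also handle time zones,
# weekends, and approved change windows.
BUSINESS_START = time(7, 0)
BUSINESS_END = time(19, 0)


def is_after_hours(timestamp: datetime) -> bool:
    return not (BUSINESS_START <= timestamp.time() < BUSINESS_END)


def after_hours_privileged_access(events: list[dict]) -> list[dict]:
    """Return privileged core-banking events outside business hours."""
    return [
        event for event in events
        if event["privileged"]
        and event["system"] == "core-banking"
        and is_after_hours(event["timestamp"])
    ]


events = [
    {"user": "a", "system": "core-banking", "privileged": True,
     "timestamp": datetime(2024, 5, 1, 2, 30)},
    {"user": "b", "system": "core-banking", "privileged": True,
     "timestamp": datetime(2024, 5, 1, 10, 0)},
    {"user": "c", "system": "wiki", "privileged": True,
     "timestamp": datetime(2024, 5, 1, 23, 0)},
]
flagged = after_hours_privileged_access(events)
```

Only the first event is flagged here: the second falls inside business hours and the third targets a system outside the rule's scope.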
Intrusion Detection Systems (IDS)
Intrusion Detection Systems (IDS) involve detecting unauthorized access to systems. Key practices include:
- Network-Based IDS: Monitoring network traffic for signs of unauthorized access.
- Host-Based IDS: Monitoring individual systems for signs of unauthorized access.
- Anomaly Detection: Using machine learning algorithms to detect anomalies in network traffic and system behavior.
Example: Using Snort for Network-Based IDS
For example, using Snort to monitor network traffic for signs of unauthorized access helps in detecting and responding to security incidents quickly. This improves the security of the system and reduces the risk of unauthorized access.
Security Information and Event Management (SIEM)
Security Information and Event Management (SIEM) involves aggregating and analyzing security data from multiple sources. Key practices include:
- Data Aggregation: Collecting security data from multiple sources in a centralized location.
- Correlation Analysis: Analyzing security data to identify patterns and correlations.
- Incident Response: Responding to security incidents based on the analysis of security data.
Example: Using Splunk for SIEM
For instance, using Splunk to aggregate and analyze security data provides a centralized view of security incidents. This helps in identifying and responding to security incidents quickly and improves the overall security posture of the system.
AI-Driven DevSecOps Practices
AI-driven DevSecOps practices leverage AI technologies to enhance the DevSecOps lifecycle. Key practices include:
- Automated Threat Detection: Using AI to identify and respond to security threats in real time.
- Intelligent Incident Response: Leveraging AI to automate and optimize incident response processes.
- Predictive Analytics for Security: Using AI to predict potential security vulnerabilities and proactively address them.
- AI-Enhanced Compliance Monitoring: Implementing AI-driven tools to ensure compliance with security policies and regulations.
Automated Threat Detection
Automated Threat Detection involves using AI to identify and respond to security threats in real time. Key practices include:
- Machine Learning Algorithms: Using machine learning algorithms to detect anomalies and identify security threats.
- Behavioral Analysis: Analyzing user and system behavior to identify potential security threats.
- Threat Intelligence: Leveraging threat intelligence to identify and respond to emerging security threats.
Example: Using Darktrace for Automated Threat Detection
For example, using Darktrace to detect anomalies and identify security threats in real time helps teams respond to security incidents quickly. This improves the security of the system and reduces the risk of unauthorized access.
In financial services, AI-driven threat detection addresses a fundamental scaling problem: the volume of security telemetry in a large bank exceeds what any human team can review. At the bank, the team deployed behavioral analytics models that baselined normal transaction patterns and flagged deviations -- not just network anomalies, but business-logic anomalies such as unusual approval chains, out-of-pattern batch processing, or API call sequences that did not match known application workflows. This approach caught threats that signature-based systems missed entirely.
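The baselining idea can be shown with the simplest possible model: learn the mean and spread of a metric from history, then flag values that sit too many standard deviations away. This z-score sketch is a deliberately basic stand-in for the behavioral analytics models described above, and the threshold of 3.0 is an assumed default rather than a tuned value.

```python
from statistics import mean, stdev


def build_baseline(history: list[float]) -> tuple[float, float]:
    """Summarize normal behavior as (mean, sample standard deviation)."""
    if len(history) < 2:
        raise ValueError("need at least two observations to build a baseline")
    return mean(history), stdev(history)


def is_anomalous(value: float, baseline: tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly constant history makes any change a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical metric: approvals per hour for one batch-processing workflow.
baseline = build_baseline([100, 102, 98, 101, 99])
```

The same function applies to business-logic signals such as approval counts or API call volumes, provided each is baselined separately.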
Intelligent Incident Response
Intelligent Incident Response involves leveraging AI to automate and optimize incident response processes. Key practices include:
- Automated Incident Triage: Using AI to automatically triage security incidents and prioritize response efforts.
- Incident Response Playbooks: Developing and implementing incident response playbooks to guide response efforts.
- Continuous Improvement: Continuously improving incident response processes based on lessons learned from previous incidents.
Example: Using IBM QRadar for Intelligent Incident Response
For instance, using IBM QRadar to automate incident triage and prioritize response efforts helps teams respond to security incidents quickly. This directs analyst attention to the highest-priority incidents first and shortens overall response time.
Predictive Analytics for Security
Predictive Analytics for Security involves using AI to predict potential security vulnerabilities and proactively address them. Key practices include:
- Risk Assessment: Using AI to assess the risk of potential security vulnerabilities.
- Proactive Mitigation: Implementing measures to proactively mitigate potential security vulnerabilities.
- Continuous Monitoring: Continuously monitoring for potential security vulnerabilities and addressing them before they can be exploited.
Example: Using Vectra AI for Predictive Analytics
For example, using Vectra AI to predict potential security vulnerabilities and proactively address them helps in improving the security of the system. This reduces the risk of security incidents and improves the overall security posture of the system.
AI-Enhanced Compliance Monitoring
AI-Enhanced Compliance Monitoring involves implementing AI-driven tools to ensure compliance with security policies and regulations. Key practices include:
- Automated Compliance Checks: Using AI to automatically check for compliance with security policies and regulations.
- Policy Enforcement: Enforcing security policies using AI-driven tools.
- Continuous Auditing: Continuously auditing security practices to ensure compliance with security policies and regulations.
Example: Using Splunk for AI-Enhanced Compliance Monitoring
For instance, using Splunk to automatically check for compliance with security policies and regulations helps in ensuring that the system remains compliant. This reduces the risk of non-compliance and improves the overall security posture of the system.
In banking, AI-enhanced compliance monitoring transforms the audit cycle from a periodic, labor-intensive exercise into a continuous, automated assurance function. At the bank, the team built compliance dashboards that mapped every APRA CPS 234 obligation and ASD ISM control to specific pipeline stages, infrastructure configurations, and runtime checks. Auditors could pull evidence for any control at any time without requesting it from engineering teams. This reduced audit preparation effort by approximately seventy percent and eliminated the adversarial dynamic between engineering and audit teams.
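The mapping from obligations to automated checks can be sketched as a small registry that auditors query on demand. The control identifiers below are placeholders, not real CPS 234 or ISM control numbers, and the `Control` type, state keys, and `collect_evidence` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class Control:
    control_id: str       # placeholder identifier, not a real control number
    description: str
    stage: str            # where the check runs: pipeline, infrastructure, runtime
    check: Callable[[dict], bool]


CONTROLS = [
    Control("EXAMPLE-01", "Data is encrypted at rest", "infrastructure",
            lambda state: state.get("encryption_at_rest") is True),
    Control("EXAMPLE-02", "Images are scanned before registry push", "pipeline",
            lambda state: state.get("image_scan_passed") is True),
]


def collect_evidence(controls: list[Control], state: dict,
                     now: datetime) -> list[dict]:
    """Evaluate every control and return timestamped evidence rows."""
    return [
        {
            "control_id": control.control_id,
            "stage": control.stage,
            "passed": control.check(state),
            "checked_at": now.isoformat(),
        }
        for control in controls
    ]


evidence = collect_evidence(
    CONTROLS,
    {"encryption_at_rest": True, "image_scan_passed": False},
    datetime(2024, 6, 1, tzinfo=timezone.utc),
)
```

Because each row names the control, the stage, and the time of the check, an auditor can retrieve current evidence without raising a request with the engineering team.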
Recent research quantifies the scale and trajectory of AI-driven security integration. Cheenepalli et al. (2025) surveyed 405 SME professionals and found that while 68% have adopted DevSecOps, only 12% perform security scans per commit -- a gap that golden path engineering directly addresses by making per-commit scanning the default rather than the exception. API security tool adoption reached 63% and software composition analysis 62%, but container security lagged at 34%, mirroring the maturity patterns observed at the bank where container security was consistently the last capability teams adopted. The survey also confirmed what the transformation at the bank demonstrated empirically: leadership emphasis on security (73% of respondents) is necessary but insufficient without automation that removes friction from the developer experience.
The emerging field of agentic AI cybersecurity, surveyed by Lazer et al. (2026), describes systems capable of reasoning, planning, acting, and adapting over long-lasting security tasks -- extending beyond traditional alert-driven detection toward dynamic threat intelligence, adversarial reasoning, and autonomous defence. The survey identifies critical gaps in governance frameworks for autonomous security agents, a challenge directly relevant to banking environments where every automated action must be attributable and auditable. The research on agent collusion and memory poisoning highlights risks that must be addressed as security operations increasingly incorporate autonomous decision-making.
Examples of AI-Driven DevSecOps Tools and Technologies
- Darktrace: An AI-powered cybersecurity platform that detects and responds to threats in real time.
- Splunk: A platform that uses AI to analyze and visualize machine-generated data for security insights.
- IBM QRadar: An AI-driven security information and event management (SIEM) tool that identifies and prioritizes security threats.
- Cortex XDR: An AI-powered extended detection and response (XDR) platform that integrates data from multiple sources to detect and respond to threats.
- Vectra AI: A cybersecurity platform that uses AI to detect and respond to cyberattacks in real time.
- Snyk: A security tool that uses AI to identify and fix vulnerabilities in code and dependencies.
- SonarQube: A code quality tool that uses AI to analyze code and provide actionable insights.
Golden Path Engineering in Regulated Environments
Golden paths are opinionated, paved roads through the software delivery lifecycle that encode security, compliance, and operational best practices into the default developer experience. The concept is simple: if you make the secure path the easiest path, adoption follows naturally.
At the bank, the golden paths were built as composable pipeline templates that teams could adopt with a single configuration file in their repository. Each golden path included:
- Pre-commit hooks for secret scanning and linting
- SAST scanning integrated into the build stage using tools aligned with OWASP Top 10 coverage
- Software Composition Analysis (SCA) for dependency vulnerability detection
- Container image scanning against CIS Benchmarks before registry push
- Dynamic security testing in staging environments
- Automated compliance evidence generation mapped to specific regulatory controls
- Deployment gates with automatic rollback on policy violations
Teams that adopted the golden path received faster pipeline execution (because the path was optimized), automatic compliance evidence generation (reducing their audit burden to near zero), and priority support from the platform engineering team. Teams that chose to build their own pipelines were free to do so, but they owned the compliance evidence burden. Within six months, voluntary adoption exceeded ninety percent.
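The control flow of a golden path, ordered security stages followed by a deployment gate that rolls back on violations, can be sketched as below. The stage names echo the list above, but the stage functions, the `context` dictionary, and the result shape are assumptions; real stages would invoke actual scanners.

```python
from typing import Callable

# A stage inspects the build context and returns a list of violations.
Stage = Callable[[dict], list[str]]


def secret_scan(context: dict) -> list[str]:
    return ["hard-coded secret found"] if context.get("contains_secret") else []


def dependency_scan(context: dict) -> list[str]:
    return [f"vulnerable dependency: {name}"
            for name in context.get("vulnerable_dependencies", [])]


GOLDEN_PATH: list[tuple[str, Stage]] = [
    ("secret-scan", secret_scan),
    ("dependency-scan", dependency_scan),
]


def run_golden_path(stages: list[tuple[str, Stage]], context: dict) -> dict:
    """Run stages in order; stop and roll back at the first violation."""
    results = []
    for name, stage in stages:
        violations = stage(context)
        results.append((name, violations))
        if violations:
            return {"status": "rolled_back", "failed_stage": name,
                    "results": results}
    return {"status": "deployed", "failed_stage": None, "results": results}
```

The `results` list doubles as compliance evidence: it records which stages ran and what each one found, whether or not the deployment went ahead.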
Importance of DORA Metrics in DevSecOps Engineering
DORA (DevOps Research and Assessment) metrics are crucial for measuring the performance and effectiveness of DevSecOps engineering practices. The four key DORA metrics are:
- Deployment Frequency: How often new code is deployed to production.
- Lead Time for Changes: The time it takes for a code change to go from commit to production.
- Change Failure Rate: The percentage of changes that result in a failure in production.
- Mean Time to Restore (MTTR): The average time it takes to restore service after a failure.
Examples of Applying DORA Metrics in DevSecOps Engineering Projects
- Deployment Frequency: By increasing the frequency of deployments, teams can quickly iterate on improvements and deliver new features to users more rapidly. For example, a team might aim to deploy new versions weekly instead of monthly. At the bank, teams on the golden path increased deployment frequency from monthly to multiple times per week, with security scanning adding less than four minutes to the pipeline.
- Lead Time for Changes: Reducing the lead time for changes allows teams to respond faster to new requirements and issues. For instance, automating the CI/CD pipeline can significantly reduce the time it takes to deploy updates. Lead time for security-sensitive changes dropped from six weeks (due to manual security review) to under four hours by embedding automated scanning and policy-as-code enforcement.
- Change Failure Rate: Monitoring and reducing the change failure rate helps ensure that updates do not negatively impact production systems. Implementing robust testing and validation processes can help catch issues before they reach production. Security-related change failures dropped by over sixty percent after golden path adoption because common vulnerability classes were caught during the build stage rather than in production.
- Mean Time to Restore (MTTR): Minimizing MTTR ensures that any issues in production are resolved quickly, reducing downtime and maintaining service reliability. For example, setting up automated rollback mechanisms can help restore service quickly in case of a failure. The automated rollback capability, triggered by both functional and security anomaly detection, reduced MTTR for security incidents from hours to minutes.
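The four metrics can be computed directly from deployment records. The sketch below assumes each record carries a commit time, a deployment time, a failure flag, and an optional restore time; the `Deployment` type and `dora_metrics` function are hypothetical, and lead time is simplified to one commit per deployment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def _mean(deltas: list[timedelta]) -> Optional[timedelta]:
    return sum(deltas, timedelta()) / len(deltas) if deltas else None


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics over an observation period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    failures = [d for d in deployments if d.failed]
    return {
        "deployments_per_week": len(deployments) * 7 / period_days,
        "mean_lead_time": _mean(
            [d.deployed_at - d.committed_at for d in deployments]),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_restore": _mean(
            [d.restored_at - d.deployed_at for d in failures if d.restored_at]),
    }


start = datetime(2024, 1, 1, 9, 0)
metrics = dora_metrics(
    [
        Deployment(start, start + timedelta(hours=2)),
        Deployment(start, start + timedelta(hours=4), failed=True,
                   restored_at=start + timedelta(hours=4, minutes=30)),
    ],
    period_days=7,
)
```

Tagging each record with whether the failure was security-related would allow the same function to report the security-specific figures quoted above alongside the overall ones.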
References
- Australian Government Information Security Manual (ISM). Australian Signals Directorate. Available at: https://www.cyber.gov.au/resources-business-and-government/essential-cyber-security/ism
- OWASP Top 10 -- 2021. The Open Worldwide Application Security Project. Available at: https://owasp.org/Top10/
- CIS Benchmarks. Center for Internet Security. Available at: https://www.cisecurity.org/cis-benchmarks
- Mohan, V. and Ottenheimer, D. DevSecOps: A leader's guide to producing secure software without compromising flow. O'Reilly Media, 2020.
- ACSC Essential Eight Maturity Model. Australian Cyber Security Centre, 2023. Available at: https://www.cyber.gov.au/resources-business-and-government/essential-cyber-security/essential-eight
- APRA Prudential Standard CPS 234 -- Information Security. Australian Prudential Regulation Authority, July 2019. Available at: https://www.apra.gov.au/sites/default/files/cps_234_july_2019_for_public_release.pdf
- Kim, G., Humble, J., Debois, P., and Willis, J. The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations. IT Revolution Press, 2016.
- Cheenepalli, J. et al. (2025). "Advancing DevSecOps in SMEs: Challenges and Best Practices for Secure CI/CD Pipelines." arXiv:2503.22612. https://arxiv.org/abs/2503.22612
- Lazer, S.J. et al. (2026). "A Survey of Agentic AI and Cybersecurity: Challenges, Opportunities and Use-case Prototypes." arXiv:2601.05293. https://arxiv.org/abs/2601.05293