
How CISOs Can Prepare the Enterprise for AI Coding Assistants

Secure Code Warrior’s Pieter Danhieux offers insights on how CISOs can prepare for enterprise AI coding assistants. This article originally appeared on Solutions Review’s Insight Jam, an enterprise IT community enabling the human conversation on AI.

LLMs struggle to produce consistently secure code. Security-skilled developers can help ensure secure AI-generated code while optimizing performance.

Development teams need to get better control of the large language models used in writing software code before those AI models, which have provided undeniable benefits, become a runaway train in terms of lax security protocols.

Software developers were quick to see AI’s advantages. A little more than a half-year after ChatGPT made its initial splash in November 2022, a GitHub survey found that 92 percent of U.S.-based developers were using AI coding tools—a number that has likely gone up since then. GitHub has said that its Copilot coding assistant wrote 82 billion lines of code in its first year.

The downside of enlisting LLMs to write code is the risk they pose to security. Flaws and vulnerabilities have always been present in software. However, the fast-moving evolution of cloud services has increased the demand for code. And if LLMs are used to meet that demand without having the security and quality of the code carefully checked, the consequences could be significant.

In our own experiments using LLMs to complete secure coding challenges, we routinely see error rates from 10 percent up to 60 percent, with the biggest models averaging around 20 to 25 percent. It’s important to note that this is a controlled situation in which we’re specifically prompting the models about security problems. If your prompt is not security-centric and doesn’t ask the right questions, your chances of success will be worse.

In terms of vulnerability classes, some are definitely easier for LLMs to navigate than others. They tend to score well on superficial, well-documented patterns such as SQL injection and other injection vulnerabilities, but struggle with more subjective, context-dependent issues like resource release, insufficient logging, and misconfigured permissions.
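To illustrate the kind of well-documented pattern an assistant usually gets right, here is a minimal Python sketch (the table and function names are illustrative, not taken from our experiments) contrasting a SQL injection flaw with the parameterized fix:

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # Vulnerable: attacker-controlled input is concatenated into the SQL string,
    # so a username like "x' OR '1'='1" changes the query's logic.
    query = f"SELECT id, email FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # Parameterized query: the driver treats the input purely as data.
    # This is the well-known fix that LLMs tend to reproduce reliably.
    query = "SELECT id, email FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()
```

Subtler problems, such as forgetting to release a resource or logging too little, leave no obvious syntactic signature like the string concatenation above, which is part of why models handle them less consistently.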

This current state of affairs poses a crisis for the cybersecurity industry, but it also creates an opportunity. On one hand, CISOs and security leaders need a comprehensive plan for implementing AI coding tools safely in order to protect their systems and data, avoid the consequences of a major breach, and stay in compliance with a growing number of regulations. On the other hand, such a plan will allow companies to benefit from the considerable advantages AI coding tools offer while establishing a reliable process for fast, productive and secure software development.

Reaping those benefits starts with a focus on risk reduction at the developer level.

Be Aware of AI’s Shortcomings

It’s not like code was pristine before AI showed up. Human software engineers make their share of mistakes, too. A study by Coralogix found that developers create, on average, 70 bugs per 1,000 lines of code, with 15 of those bugs ending up in production systems. As a result, 75 percent of developers’ time is spent on debugging, since fixing bugs takes 30 times longer than writing a line of code.

At a glance, AI models actually improve those numbers. Nearly 76 percent of respondents to a Snyk survey said that, overall, AI code is more secure than code created by humans. But it’s far from perfect: 56.4 percent said AI introduces coding issues either sometimes or frequently. And considering the sheer volume of code AI creates, and Snyk’s finding that 80 percent of developers using AI bypass AI code security policies, that threatens to put a lot of buggy code into the software ecosystem.

Overreliance on AI coding tools as they are currently used is risky because models can struggle to produce reliable results, particularly at the enterprise level.

An example of LLMs’ shortcomings can be found in how ill-equipped the models are to stay current with changes in functionality. As Andrea Valenzuela, a software developer and data scientist at CERN, points out, LLMs are trained on a snapshot of code and documentation taken at a specific point in time. But APIs and other interfaces, for example, change often. Because LLMs aren’t updated in real time, this leaves them blind to new security risks, which could result in vulnerable code.
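As a hypothetical illustration of that staleness (the specific APIs below are our own example, not one cited by Valenzuela), a model trained on an older snapshot of Python code might still suggest a deprecated TLS setup rather than the currently recommended call:

```python
import socket
import ssl

def connect_stale(host: str):
    # What an older training snapshot might suggest: an explicitly pinned,
    # now-deprecated protocol version with weaker guarantees.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLSv1)  # deprecated, outdated protocol
    sock = socket.create_connection((host, 443))
    return ctx.wrap_socket(sock, server_hostname=host)

def connect_current(host: str):
    # Current guidance: let the library choose secure defaults
    # (certificate verification, hostname checking, modern TLS versions).
    ctx = ssl.create_default_context()
    sock = socket.create_connection((host, 443))
    return ctx.wrap_socket(sock, server_hostname=host)
```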

Although LLMs can write code, they are fundamentally trained to predict the next line of code based on what has come before. Training them to write code that is optimized for specific business functions, or for particular hardware and software environments, remains a very difficult challenge.

Other potential vulnerabilities associated with LLM-generated code include data poisoning used to manipulate machine learning models; theft of LLM models, which can enable the creation of counterfeit models; adversarial inputs that trick LLMs into producing faulty output; and biases present in training data that manifest in model output. Cross-site scripting is another potential vulnerability in AI-generated code. In fact, LLM flaws are common enough that the OWASP Foundation has created a web page dedicated to the top 10 most critical LLM vulnerabilities.
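Of those weaknesses, cross-site scripting is the easiest to show in a few lines. The sketch below is a hypothetical Flask handler (not code drawn from any of the studies above) in which user input is interpolated straight into HTML instead of being escaped:

```python
from flask import Flask, request
from markupsafe import escape

app = Flask(__name__)

@app.route("/greet-unsafe")
def greet_unsafe():
    # Vulnerable: reflected XSS. A name like "<script>...</script>"
    # is returned to the browser unescaped and will execute.
    name = request.args.get("name", "")
    return f"<h1>Hello, {name}!</h1>"

@app.route("/greet-safe")
def greet_safe():
    # Escaping the input (or using a templating engine that auto-escapes)
    # neutralizes any injected markup.
    name = request.args.get("name", "")
    return f"<h1>Hello, {escape(name)}!</h1>"
```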

Setting the Stage for Secure AI Coding

Organizations aren’t going to abandon AI over these concerns. In fact, the trend is strongly in favor of using it more. However, they need to acknowledge that AI models can’t be trusted to produce consistently secure and optimally functional code. CISOs need to lay the organizational groundwork for applying security and oversight to LLMs, so that they get maximum benefit from AI-generated code while keeping strict security controls on the process.

Another consideration is the decision-making process: Who will determine which AI agent should be used? As we have seen, there are various LLMs in the market, each with different strengths and shortcomings, and in terms of coding, one may be more accurate than another. Ultimately, highly regulated enterprise environments like the financial services sector will likely operate with a central decision, but more liberal environments, such as the tech sector, may leave this up to individual developers, which will vastly increase the risk and governance variables in the SDLC.

Among the steps they can take:

  • AI governance: Adopt a framework to establish safe and ethical practices and policies for using AI and machine learning. A governance team can include stakeholders from across the enterprise, including IT, data science, legal, compliance and business.

  • Regulatory legislation: Companies should be aware of governmental efforts to regulate AI use. The EU AI Act is the first comprehensive regulatory framework for AI. The United States doesn’t yet have legislation directly addressing AI, but the October 2023 White House Executive Order does set standards for safety, transparency, and security.

  • Upskilling and reskilling: Secure code is at the core of cybersecurity, and improving the security and quality of AI code begins with developers. Organizations need to make sure developers have the training they need to apply secure coding best practices to the code they write and to check code generated by LLMs.

Teams need precision skills development as part of a thorough learning program designed to apply security when code is being created and throughout the software development lifecycle (SDLC). A developer-driven security program can increase productivity, improve the SDLC workflow and spur innovation, while also making software more secure and reliable.

Part of that program is ensuring that the training is taking hold with developers. A platform that provides data-driven measurement of a security-learning program’s effectiveness can identify top performers and those who need extra help. It can also provide benchmarks that highlight areas the learning program needs to address, as well as a measure of how an organization’s program is performing relative to the rest of the industry.

Conclusion

Top leaders need to buy into the importance of secure code and the importance of training developers to be thoroughly versed in safe coding practices. An environment that allows LLMs to create code while ensuring security under the guidance of security-aware developers can allow organizations to improve productivity while minimizing risk.
