
AI Related Threats and How to Protect Against Them
According to the OWASP AI Exchange, AI threats fall into three major categories, which are divided by attack surfaces.
Development-time Threats
These attacks occur during the development of the AI system, specifically attacking the data within the engineering pipeline. This includes poisoning or stealing data during data collection, preparation, and/or storage.
Runtime Security Threats
These attacks occur due to security weaknesses that are not AI-specific. AI systems are also IT systems, so common IT vulnerabilities can occur with the software that is wrapped around the model for public use as well as the IT infrastructure that hosts the AI solution. Specifically, attacks will occur on the model itself rather than at the data level. This attack surface focuses on attacking the entire model via poisoning or theft.
Threats through Use
These attacks occur during normal activities associated with the use of an AI model, particularly around input and output activities.
Training Data Leakage: Occurs during the development of the model, data can be leaked or stolen from the training environment.
Data Poisoning: Occurs during the development of the model and can occur anywhere from the source of the data, the preprocessing of the data, and during the training of the model.
Model Poisoning: A runtime security threat that involves modifying the model, typically at the source of the model.
Model Theft: Can be a runtime security threat or a threat through use of the model itself. The model is used to learn how the model behaves and develop further adversarial attacks.
Prompt Injections: Typically occur with GenAI models, with a goal to send nefarious prompts through the model and change the way the model is trained to behave and rather follow the attackers instructions.
Model Evasion: Has a goal to cause errors in predictions by modifying the data.
Model Inversion: Has a goal to extract sensitive data to infer the model’s inner workings.
Denial of Model Service: Similar to a DoS attack, where the attacker sends large volumes of interference requests to cause degradation of the AI solution.
What can be done to stop these attacks from happening?
Protecting Against AI-Related Threats
Mitigating the probability of AI attacks from happening is a similar approach to how you would mitigate any attack to an IT system, by implementing internal controls. Access controls and application level monitoring are critical in both respects, including data validation for training sets and monitoring throughout the entire lifecycle.
AI specific controls relate to attacks like prompt injections and denial of model service attacks. Companies can implement templates to enforce structure and sanitization of inputs and outputs to prevent against prompt injections. Companies can also perform adversarial testing, which is similar to Red Team / Ethical hacking activities. This involves intentionally feeding malicious data inputs into the model to ensure the model is robust enough not to be affected by unexpected data input, abnormal conditions, or cyberattacks.
It’s crucial to highlight that the purpose of internal controls are to help safeguard an organization, minimize risks, and protect its assets. For these controls to work effectively, they must be periodically monitored, tested, and adjusted as changes in the environment and organization occur throughout time.
Can you think of any other AI related threats I might have missed?