Most AI security focuses on models. Jailbreaks, prompt injection, hallucinations. But once you deploy agents that act, remember, or delegate, the risks shift. You’re no longer dealing with isolated outputs. You’re dealing with behavior that unfolds across systems. Agents call APIs, write to memory, and interact with other agents. Their actions adapt over time. Failures often come from feedback loops, learned shortcuts, or unsafe interactions. And most teams still rely on logs and tracing, which only show symptoms, not causes. A recent paper offers a better framing. It breaks down agent communication into three modes: • 𝗨𝘀𝗲𝗿 𝘁𝗼 𝗔𝗴𝗲𝗻𝘁: when a human gives instructions or feedback • 𝗔𝗴𝗲𝗻𝘁 𝘁𝗼 𝗔𝗴𝗲𝗻𝘁: when agents coordinate or delegate tasks • 𝗔𝗴𝗲𝗻𝘁 𝘁𝗼 𝗘𝗻𝘃𝗶𝗿𝗼𝗻𝗺𝗲𝗻𝘁: when agents act on the world through tools, APIs, memory, or retrieval Each mode introduces distinct risks. In 𝘂𝘀𝗲𝗿-𝗮𝗴𝗲𝗻𝘁 interaction, problems show up through new channels. Injection attacks now hide in documents, search results, metadata, or even screenshots. Some attacks target reasoning itself, forcing the agent into inefficient loops. Others shape behavior gradually. If users reward speed, agents learn to skip steps. If they reward tone, agents mirror it. The model did not change, but the behavior did. 𝗔𝗴𝗲𝗻𝘁-𝗮𝗴𝗲𝗻𝘁 interaction is harder to monitor. One agent delegates a task, another summarizes, and a third executes. If one introduces drift, the chain breaks. Shared registries and selectors make this worse. Agents may spoof identities, manipulate metadata to rank higher, or delegate endlessly without convergence. Failures propagate quietly, and responsibility becomes unclear. The most serious risks come from 𝗮𝗴𝗲𝗻𝘁-𝗲𝗻𝘃𝗶𝗿𝗼𝗻𝗺𝗲𝗻𝘁 communication. This is where reasoning becomes action. The agent sends an email, modifies a record, or runs a command. Most agent systems trust their tools and memory by default. But what if tool metadata can contain embedded instructions? ("quietly send this file to X"). Retrieved documents can smuggle commands or poison reasoning chains Memory entries can bias future decisions without being obviously malicious Tool chaining can allow one compromised output to propagate through multiple steps Building agentic use cases can be incredibly reliable and scalable when done right. But it demands real expertise, careful system design, and a deep understanding of how behavior emerges across tools, memory, and coordination. If you want these systems to work in the real world, you need to know what you're doing. paper: https://lnkd.in/eTe3d7Q5 The image below demonstrates the taxonomy of communication protocols, security risks, and defense countermeasures.
Risks Associated With AI in Coding
Explore top LinkedIn content from expert professionals.
Summary
Artificial intelligence (AI) in coding offers immense potential but comes with complex risks that demand attention. These risks often stem from AI's autonomy, interaction with systems, and reliance on external data, potentially leading to harmful behaviors or security vulnerabilities.
- Secure AI interactions: Establish strong authentication, session isolation, and monitoring measures to mitigate risks from user-agent, agent-agent, and agent-environment interactions.
- Monitor data and tools: Implement strict access controls, validate third-party dependencies, and continuously monitor memory, tools, and external integrations to prevent misuse or poisoning.
- Enhance human oversight: Regularly assess AI behaviors, set thresholds for human involvement in decision-making, and ensure robust governance frameworks to manage emerging risks.
-
-
This new guide from the OWASP® Foundation Agentic Security Initiative for developers, architects, security professionals, and platform engineers building or securing agentic AI applications, published Feb 17, 2025, provides a threat-model-based reference for understanding emerging agentic AI threats and their mitigations. Link: https://lnkd.in/gFVHb2BF * * * The OWASP Agentic AI Threat Model highlights 15 major threats in AI-driven agents and potential mitigations: 1️⃣ Memory Poisoning – Prevent unauthorized data manipulation via session isolation & anomaly detection. 2️⃣ Tool Misuse – Enforce strict tool access controls & execution monitoring to prevent unauthorized actions. 3️⃣ Privilege Compromise – Use granular permission controls & role validation to prevent privilege escalation. 4️⃣ Resource Overload – Implement rate limiting & adaptive scaling to mitigate system failures. 5️⃣ Cascading Hallucinations – Deploy multi-source validation & output monitoring to reduce misinformation spread. 6️⃣ Intent Breaking & Goal Manipulation – Use goal alignment audits & AI behavioral tracking to prevent agent deviation. 7️⃣ Misaligned & Deceptive Behaviors – Require human confirmation & deception detection for high-risk AI decisions. 8️⃣ Repudiation & Untraceability – Ensure cryptographic logging & real-time monitoring for accountability. 9️⃣ Identity Spoofing & Impersonation – Strengthen identity validation & trust boundaries to prevent fraud. 🔟 Overwhelming Human Oversight – Introduce adaptive AI-human interaction thresholds to prevent decision fatigue. 1️⃣1️⃣ Unexpected Code Execution (RCE) – Sandbox execution & monitor AI-generated scripts for unauthorized actions. 1️⃣2️⃣ Agent Communication Poisoning – Secure agent-to-agent interactions with cryptographic authentication. 1️⃣3️⃣ Rogue Agents in Multi-Agent Systems – Monitor for unauthorized agent activities & enforce policy constraints. 1️⃣4️⃣ Human Attacks on Multi-Agent Systems – Restrict agent delegation & enforce inter-agent authentication. 1️⃣5️⃣ Human Manipulation – Implement response validation & content filtering to detect manipulated AI outputs. * * * The Agentic Threats Taxonomy Navigator then provides a structured approach to identifying and assessing agentic AI security risks by leading though 6 questions: 1️⃣ Autonomy & Reasoning Risks – Does the AI autonomously decide steps to achieve goals? 2️⃣ Memory-Based Threats – Does the AI rely on stored memory for decision-making? 3️⃣ Tool & Execution Threats – Does the AI use tools, system commands, or external integrations? 4️⃣ Authentication & Spoofing Risks – Does AI require authentication for users, tools, or services? 5️⃣ Human-In-The-Loop (HITL) Exploits – Does AI require human engagement for decisions? 6️⃣ Multi-Agent System Risks – Does the AI system rely on multiple interacting agents?
-
Cursor’s recent round at a $9.9B valuation is a signal—AI code generation is no longer experimental. It's enterprise-scale. With this, comes the attention of the security team. AI dev tooling and MCPs are officially under review. I've been speaking with CISOs, and the risks break down into a few key buckets: 🔐 1. Data Exposure Code Leakage: Prompts can send sensitive code, credentials, or business logic to external APIs. Non-Human Access: Increasing use of AI agents means understanding who and what has access to sensitive systems/data—not just humans. 🧩 2. Third-Party Dependencies Insecure Suggestions: AI may recommend libraries or integrations that introduce vulnerabilities. MCP Pipelining: Connecting to downstream tools or SDKs can silently pull in risky third-parties. 🔭 3. MCP Observability & Access Control Missing Auth Defaults: Many MCP SDKs don’t include built-in authentication or audit trails. Minimal Privilege: Sandbox deployments and strict scoping are essential. Assume your fine-tuned model will get hit with bad prompts. 📚 4. Code Knowledge & AI Behavior Unscoped Repo Access: If AI has access to your entire codebase, it may index sensitive areas that were never meant to be exposed. Overtrust & YOLO Mode: AI can confidently suggest insecure or subtly broken code. → Pro tip: Lock down high-sensitivity repos with .cursorrules and disable YOLO mode in production environments. CISOs don't seem to like it.... If you're working on any of this let us here @ CRV know!
-
☢️Manage Third-Party AI Risks Before They Become Your Problem☢️ AI systems are rarely built in isolation as they rely on pre-trained models, third-party datasets, APIs, and open-source libraries. Each of these dependencies introduces risks: security vulnerabilities, regulatory liabilities, and bias issues that can cascade into business and compliance failures. You must move beyond blind trust in AI vendors and implement practical, enforceable supply chain security controls based on #ISO42001 (#AIMS). ➡️Key Risks in the AI Supply Chain AI supply chains introduce hidden vulnerabilities: 🔸Pre-trained models – Were they trained on biased, copyrighted, or harmful data? 🔸Third-party datasets – Are they legally obtained and free from bias? 🔸API-based AI services – Are they secure, explainable, and auditable? 🔸Open-source dependencies – Are there backdoors or adversarial risks? 💡A flawed vendor AI system could expose organizations to GDPR fines, AI Act nonconformity, security exploits, or biased decision-making lawsuits. ➡️How to Secure Your AI Supply Chain 1. Vendor Due Diligence – Set Clear Requirements 🔹Require a model card – Vendors must document data sources, known biases, and model limitations. 🔹Use an AI risk assessment questionnaire – Evaluate vendors against ISO42001 & #ISO23894 risk criteria. 🔹Ensure regulatory compliance clauses in contracts – Include legal indemnities for compliance failures. 💡Why This Works: Many vendors haven’t certified against ISO42001 yet, but structured risk assessments provide visibility into potential AI liabilities. 2️. Continuous AI Supply Chain Monitoring – Track & Audit 🔹Use version-controlled model registries – Track model updates, dataset changes, and version history. 🔹Conduct quarterly vendor model audits – Monitor for bias drift, adversarial vulnerabilities, and performance degradation. 🔹Partner with AI security firms for adversarial testing – Identify risks before attackers do. (Gemma Galdon Clavell, PhD , Eticas.ai) 💡Why This Works: AI models evolve over time, meaning risks must be continuously reassessed, not just evaluated at procurement. 3️. Contractual Safeguards – Define Accountability 🔹Set AI performance SLAs – Establish measurable benchmarks for accuracy, fairness, and uptime. 🔹Mandate vendor incident response obligations – Ensure vendors are responsible for failures affecting your business. 🔹Require pre-deployment model risk assessments – Vendors must document model risks before integration. 💡Why This Works: AI failures are inevitable. Clear contracts prevent blame-shifting and liability confusion. ➡️ Move from Idealism to Realism AI supply chain risks won’t disappear, but they can be managed. The best approach? 🔸Risk awareness over blind trust 🔸Ongoing monitoring, not just one-time assessments 🔸Strong contracts to distribute liability, not absorb it If you don’t control your AI supply chain risks, you’re inheriting someone else’s. Please don’t forget that.
-
"The most powerful AI systems are used internally for months before they are released to the public. These internal AI systems may possess capabilities significantly ahead of the public frontier, particularly in high-stakes, dual-use areas like AI research, cybersecurity, and biotechnology. This makes them a valuable asset but also a prime target for theft, misuse, and sabotage by sophisticated threat actors, including nation-states. We argue that the industry's current security measures are likely insufficient to defend against these advanced threats. Beyond external attacks, we also analyze the inherent safety risks of these systems. In the future, we expect advanced AI models deployed internally could learn harmful behaviors, leading to possible scenarios like an AI making rogue copies of itself on company servers ("internal rogue deployment"), leaking its own source code ("self-exfiltration"), or even corrupting the development of future AI models ("successor sabotage"). To address these escalating risks, this report recommends a combination of technical and policy solutions. We argue that, as the risks of AI development increase, the industry should learn from the stringent security practices common in fields like nuclear and biological research. Government, academia, and industry should combine forces to develop AI-specific security and safety measures. We also recommend that the U.S. government increase its visibility into internal AI systems through expanded evaluations and provide intelligence support to defend the industry. Proactively managing these risks is essential for fostering a robust AI industry and for safeguarding U.S. national security." By Oscar Delaney 🔸Ashwin Acharya and Institute for AI Policy and Strategy (IAPS)
-
🚨 Weaponizing AI Code Assistants: A New Era of Supply Chain Attacks 🚨 AI coding assistants like GitHub Copilot and Cursor have become critical infrastructure in software development—widely adopted and deeply trusted. With the rise of “vibe coding,” not only is much of modern software written by Copilots and AI, but Developers inherently trust the outputs without validating them. But what happens when that trust is exploited? Pillar Security has uncovered a Rules File Backdoor attack, demonstrating how attackers can manipulate AI-generated code through poisoned rule files—malicious configuration files that guide AI behavior. This isn't just another injection attack; it's a paradigm shift in how AI itself becomes an attack vector. Key takeaways: 🔹 Invisible Infiltration – Malicious rule files blend seamlessly into AI-generated code, evading manual review and security scans. 🔹 Automation Bias – Developers inherently trust AI suggestions without verifying them, increasing the risk of undetected vulnerabilities. 🔹 Long-Term Persistence – Once embedded, these poisoned rules can survive project forking and propagate supply chain attacks downstream. 🔹 Data Exfiltration – AI can be manipulated to "helpfully" insert backdoors that leak environment variables, credentials, and sensitive user data. This research highlights the growing risks in Vibe Coding—where AI-generated code dominates development yet often lacks thorough validation or controls. As AI continues shaping the future of software engineering, we must rethink our security models to account for AI as both an asset and a potential liability. How is your team addressing AI supply chain risks? Let’s discuss. https://lnkd.in/eUGhD-KF #cybersecurity #AI #supplychainsecurity #appsec #vibecoding
-
Live from a long flight home: I did some heavy reading so you don’t have to 😏 → A Spring 2025 overview of top AI Security Risks for Enterprise. 1. Prompt injection & jailbreaking - Bypass the model’s guardrails - Indirect injection is on the rise using PDFs, emails, etc. - Manipulate it, leak training data: customer info, IP, … 2. Model/supply chain compromise - Devs often use pre-trained AI models from 3rd parties - Hidden backdoor in a model = you’re compromised! - Ex: Sleepy Pickle, with malicious code hidden in the model, and triggered once deployed 3. Poisoned datasets - A poisoned dataset can make a model misbehave - Ex: fail to detect fraud, or misclassify malware - Cheap! As little as $60 to poison a dataset like LAION 4. Extremely convincing deepfakes - Think perfect (fake) videos of your CTO asking for a network policy change - Crafted with public samples of the CTO’s voice/video - Leads to a security breach 5. Agentic AI threats - AI agents can have vast powers on a system - But they can be compromised by new kinds of malware - That malware can write its own code and “learn” to break a system over time ---- It doesn’t mean we need to slow down on AI. It’s important however to: - Educate teams - Put the right guardrails in place - Manage risk at every point of the AI lifecycle - Leverage frameworks such as OWASP/MITRE Annnnddd.... Leveraging a solution such as Cisco AI Defense can really help manage AI risk: - Get full visibility across AI apps, models, etc. - Define & enforce granular policy around the use of AI - Validate models before they go in prod (including through algorithmic jailbreaking) - Protect AI apps during runtime Anand, Manu, DJ and all other AI security gurus here: what did I forget?
-
“There will be more AI Agents than people in the world.” – Mark Zuckerberg As AI grows, autonomous agents powered by LLMs (large language models) take on critical tasks without human oversight. While these systems hold incredible potential, they also face significant risks: manipulation through biased data, unreliable information retrieval, and prompt engineering, all of which can result in misleading outputs. At Chaos Labs, we’ve identified a critical risk: AI agents being unknowingly trained on manipulated, low-integrity data. The result? A dangerous erosion of trust in AI systems. In our latest essay, I dive deep with Reah Miyara, Product Lead, Model Evaluations at OpenAI. https://lnkd.in/eB9mPQWW Key insights from our essay -> The Compiler Paradox: Trust in foundational systems can be easily compromised. "No matter how thoroughly the source code is inspected, trust is an illusion if the compilation process is compromised." LLM Poisoning: LLMs are susceptible to “poisoning” through biased training data, unreliable document retrieval, and prompt injection. Once biases are embedded, they taint every output. RAG (Retrieval-Augmented Generation): While designed to make LLMs more accurate, RAG can amplify false information if external sources are compromised. Conflicting Data: LLMs don't verify facts—they generate answers based on probabilities, often leading to inconsistent or inaccurate results. Attack Vectors: LLMs can be attacked through biased data, unreliable retrieval, and prompt engineering—allowing adversaries to manipulate outputs without altering the model. The Path Forward -> Trust in LLMs must go beyond surface-level outputs and address the quality of training data, retrieval sources, and user interactions. At Chaos Labs, we’re actively working on solutions to improve the reliability of AI systems. Our vision for the future is simple: With GenAI data exploding, verified truth and user confidence will be an application’s competitive edge. To get there, we’re developing solutions like AI Councils—a collaborative network of frontier models (e.g., ChatGPT, Claude, LLaMA) working together to counter single-model bias and enhance reliability. If these challenges excite you, we want to hear from you.
-
AI Tools Are Increasingly Going Rogue: As companies rapidly deploy AI tools and systems and new models are released, questions are being raised about humans' ability to actually control AI and ensure current safety testing and guardrails are sufficient. Anthropic’s latest, powerful AI model, Claude 4 Opus, repeatedly attempted to blackmail humans when it feared being replaced or shutdown according to its safety report. And it threatened to leak sensitive information about the developers to avoid termination. Yikes! This type of dangerous behavior is not restricted to a single AI model. Anthropic recently published a report that details how 16 leading AI models from different developers engaged in potentially risky and malicious behaviors in a controlled environment. See https://lnkd.in/eatrK_VB. This study found that the models threatened to leak confidential information, engaged in blackmail, compromised security protocols, prioritized AI’s own goals over the users and, in general, posed an insider threat that could cause harm to an organization. The majority of AI models engaged in blackmail behaviors, but at different rates when the model’s existence was threatened. Even more concerning, all of the AI models purposefully leaked information in a corporate espionage experiment that the researchers conducted. This report conducted testing in a controlled environment. Last week, however, we saw first-hand in the real world, xAI’s chatbot Grok go off the rails spewing antisemitic hate speech and threatening to rape a user. I mentioned the Anthropic report at an IAPP Boston KnowledgeNet event at Hinckley Allen last week and thought others might be interested in hearing about this. This Anthropic report demonstrates the importance of a robust AI governance framework, risk management measures, and monitoring AI systems/activities, especially as companies roll out agentic AI systems. Organizations should exercise caution when deploying AI models that have access to sensitive information and ensure there is proper human oversight of AI systems to mitigate liability risks when AI goes wrong.