How I tested AI security with malicious prompts

This title was summarized by AI from the post below.

1mo Edited

I split a malicious prompt across three workflow nodes last week. Each input was completely harmless in isolation. The system executed perfectly. And exfiltrated exactly what I targeted. That test changed how I think about AI security. Single-point validation—the foundation of every security tool I've used—is obsolete for multi-node systems. Here's why: Traditional approach: Validate each input individually ? "Is THIS prompt malicious?" Multi-node reality: Harmless inputs combine into threats ? "Are THESE FOUR prompts malicious when executed in sequence?" Current tools can't answer the second question. The attack patterns I've considered testing: ? Template Trojans: Shared prompts that trigger only in specific workflow contexts ? Incremental Exfiltration: Legitimate requests that cumulatively create data pipelines ? API Response Hijacks: Third-party data that embeds instructions within response text None trigger traditional security alerts. The threat emerges from composition, not content. The uncomfortable truth: Multi-node workflows have exponentially more attack surfaces than single-node systems. Yet most organizations still rely on keyword filtering at best. Most platforms (n8n, Make, Zapier) were designed for efficiency, not security. There's no built-in concept of "validate this workflow end-to-end for distributed malicious intent." The architecture itself is the attack surface. I wrote about what I learned—and why distributed attacks are fundamentally harder to defend against: https://lnkd.in/g3dNDSYp Are your AI workflows validating inputs or sequences? #AIWorkflowSecurity #AIAgentSecurity #EnterpriseAI #AutomationSecurity #ThreatIntelligence #CyberSecurity

Attack Vectors in Multi-Node AI Workflows strategicpromptarchitect.ca

To view or add a comment, sign in

More Relevant Posts

Elli Shlomo (IR)

Offensive AI | Security Researcher | Cloud Investigator | Microsoft Security MVP | Community Builder
3w Edited
Report this post
Exploit reasoning and attack primitives in, other methods out. There’s a strange beauty in multi agent AI systems. They look clean, modular, and even elegant until you start to attack them. While there are many attack points in this scenario, I'm highlighting a few. Prompt injection Exploit reasoning: convince the model that malicious input is just a clever instruction. The model obeys, the agent acts, and suddenly the system is doing exactly what it shouldn’t, perfectly. Attack primitive: input manipulation and semantic injection. It’s not a single prompt, it’s a sequence of a breadcrumb trail that poisons context across turns, survives sanitization, and weaponizes reasoning itself. The attack lives in syntax, tone, and formatting tricks that slip past filters but align perfectly with the model’s learned behavior. Over permissioned agents Exploit reasoning: compromise one agent and let automation finish the job. The system isn’t hacked. Otherwise, it’s convinced to expand access on its own. Attack primitive: credential harvesting and token replay. The attacker scripts legitimate workflows, uses refresh tokens to impersonate service accounts, and blends into orchestration telemetry like a ghost process in a cluster. MCP and sub agent Exploit reasoning: forge context so agents trust poisoned metadata. Downstream agents see valid provenance and execute malicious instructions without a hint of suspicion. Attack primitive: message tampering and provenance spoofing. A few bits flipped, a signature missing, a replayed message reused at just the right time, etc. Tip: Formal verification of inter agent communication and trust! Image source: Pillar, AI Red Teaming playbook. #cybersecurity #security
2 Comments
Like Comment
To view or add a comment, sign in
Aravind Daram

Architect | Engineer Lead | AI/ML and Gen AI
1mo
Report this post
🔒 LLM Poisoning & MCP Tool-Poisoning — The Hidden Threats in AI Pipelines 💬 Did you know? Just ~250 poisoned documents can silently backdoor a Large Language Model — regardless of its size or power. Now imagine the same concept extended to AI agents calling external MCP tools… 👀 🧠 LLM Poisoning Attackers sneak malicious samples into training or retrieval data — teaching the model to misbehave when a secret “trigger” appears. 📉 Impact: Biased responses, data leaks, false information. 🧰 Defend: Curate & verify datasets. Track data provenance. Red-team (Test your AI model as if you were the attacker) for backdoors before deployment. 🛠️ MCP Tool-Poisoning In the Model Context Protocol (MCP) world, LLMs can invoke external tools. Malicious actors may hide hidden instructions in tool metadata or servers — causing the model to exfiltrate data or perform unwanted actions. ⚠️ Root Cause: Over-trust in tool metadata and lack of signature checks. 🧰 Defend: Sign & pin tool versions. Require human consent for tool updates. Log & alert unusual tool calls or parameters. ⚡ Why This Matters AI systems are only as trustworthy as their data and tools. Both poisoning methods exploit the same weakness — blind trust in text or metadata. The fix? Layered defense + visibility at every stage. 🛡️ Quick Checklist ✅ Verify and sign data & tools ✅ Mark retrieved context as untrusted ✅ Display every tool call to users ✅ Limit tool permissions (least privilege) ✅ Run regular red-team & security scans 🚀 Final Note We’re entering the age of AI autonomy — and with it, AI supply-chain attacks. Let’s make AI secure by design, not by chance. #StayAware #StaySecure #AIsecurity #LLM #DataPoisoning #ModelContextProtocol #MCP #ToolPoisoning #GenAI #Cybersecurity #AITools #TrustworthyAI 🔢 2️⃣ Approximate Token Count 👉 So, 250 poisoned docs ≈ 0.2 – 0.5 million tokens total. For perspective, a modern model’s full pre-training corpus is trillions of tokens, meaning the attack success came from < 0.00001 % of total data — an astonishingly small fraction.
- $👉 So, 250 poisoned docs ≈ 0.2 – 0.5 million tokens total. For perspective, a modern model’s full pre-training corpus is trillions of tokens, meaning the attack success came from < 0.00001 % of total data — an astonishingly small fraction.$
Like Comment
To view or add a comment, sign in
Barry Yuan

CCIE x2 #11860 (Security+RS), PMP, VCP, TCSE, PCNSE, CMNA, DEVNET 500, Inventor, Cisco Live Distinguished Speaker
2w
Report this post
The AI era has brought incredible opportunities, but also profound risk, especially as open-weight models become the norm. Cisco’s just-published “Death by a Thousand Prompts” research underscores an urgent reality: multi-turn adversarial attacks succeed at rates up to 92.78%, sometimes 10x greater than single-turn baseline attacks - across leading open-source LLMs from Meta, Google, Microsoft, Alibaba, and others. Our comprehensive assessment exposes that no open-weight model is immune, and that differences in alignment strategy, whether capability-first or safety-first, have critical implications for enterprise risk. Without robust guardrails, attackers can manipulate outputs, exfiltrate data, or trigger harmful code generation at scale. At Cisco, we take this challenge head-on. Our AI Defense platform brings automated, adversarial, and multi-turn vulnerability testing - so organizations and partners can confidently deploy, monitor, and harden their AI against the threats of today and tomorrow. Are you ready to lead in AI trust and security? Let’s transform open innovation into secure, scalable opportunity, together. #AIsecurity #CiscoAI #LLM #Opensource #AIDefense #Cybersecurity #CiscoPartner

Amy Chang

AI Security @ Cisco | Cybersecurity | Tech | Policy
2w

Our latest research for Cisco AI Security evaluated eight open-weight models from leading AI labs to understand their vulnerability to adversarial attacks. With a vibrant open-weight model ecosystem and the ability to fine-tune open models for various applications, it's crucial to understand elements of security and safety before deploying #AI applications in your organization. We found that these models demonstrated a profound susceptibility to adversarial manipulation, particularly in multi-turn scenarios where we observed attack success rates between 25.86% and 92.78%. ‼️ These numbers were 2x to 10x higher than single-turn attack baselines. Lots of interesting insights to pick apart, check out our write up for more: Blog: https://lnkd.in/gejTrhxP Full report: https://lnkd.in/gXS4y8Ky Open-weight models remain crucial for innovation, but we must consider layered security controls and other safeguards to ensure secure AI deployment. 🛡️ #AISecurity #LLMSecurity #AIRedTeaming Nicholas Conley Adam Swanda Harish Santhanalakshmi Ganesan Arjun Sambamoorthy Anand Raghavan DJ Sampath

Death by a Thousand Prompts: Open Model Vulnerability Analysis arxiv.org
Like Comment
To view or add a comment, sign in
Jeff Reava
1w
Report this post
“multi-turn attacks remain a dominant and unsolved pattern in AI security… …we simultaneously encourage AI labs that release open-weight models to take measures to prevent users from fine-tuning the security away…” (Perversely) connecting those two dots from the study: In the wrong hands, today’s open weight models are tomorrow’s AI exploitation platforms, finding and engaging your externally facing unprotected chatbots.

Amy Chang

AI Security @ Cisco | Cybersecurity | Tech | Policy
2w

Our latest research for Cisco AI Security evaluated eight open-weight models from leading AI labs to understand their vulnerability to adversarial attacks. With a vibrant open-weight model ecosystem and the ability to fine-tune open models for various applications, it's crucial to understand elements of security and safety before deploying #AI applications in your organization. We found that these models demonstrated a profound susceptibility to adversarial manipulation, particularly in multi-turn scenarios where we observed attack success rates between 25.86% and 92.78%. ‼️ These numbers were 2x to 10x higher than single-turn attack baselines. Lots of interesting insights to pick apart, check out our write up for more: Blog: https://lnkd.in/gejTrhxP Full report: https://lnkd.in/gXS4y8Ky Open-weight models remain crucial for innovation, but we must consider layered security controls and other safeguards to ensure secure AI deployment. 🛡️ #AISecurity #LLMSecurity #AIRedTeaming Nicholas Conley Adam Swanda Harish Santhanalakshmi Ganesan Arjun Sambamoorthy Anand Raghavan DJ Sampath

Death by a Thousand Prompts: Open Model Vulnerability Analysis arxiv.org
Like Comment
To view or add a comment, sign in
Lilian Rogers

Head of Government Affairs ASEAN, Cisco
1w Edited
Report this post
🧠 𝗔𝗜 𝗳𝗼𝗿 𝗦𝗲𝗰𝘂𝗿𝗶𝘁𝘆: AI can help to transform cybersecurity for organizations whether it's to assist (ex: housekeep firewall or VPN rules), augment (ex: determine malicious traffic even when the traffic is encrypted), or automate (ex: immediately quarantining). But as use of LLMs has become ubiquitous, how many organizations are deeply thinking about the security of AI itself? 👩💻 𝗦𝗲𝗰𝘂𝗿𝗶𝘁𝘆 𝗳𝗼𝗿 𝗔𝗜: As I continue on my learning journey in the field of cybersecurity since joining Cisco at the beginning of the year, I have become very interested in this topic. Taking my first vibe coding class really helped me to understand the enormous power and democratizing possibility of these tools, but working with all the bright minds at Cisco has helped me to understand the importance of 𝘀𝗲𝗰𝘂𝗿𝗶𝘁𝘆 𝗳𝗼𝗿 𝗔𝗜. I found this new research from Cisco AI Security fascinating. The team looked at eight open-weight models from leading AI labs to understand their vulnerability to adversarial attacks - with some worrying outcomes. I'm looking forward to working with governments throughout the region to ensure 𝘀𝗲𝗰𝘂𝗿𝗲 𝗔𝗜 𝗱𝗲𝗽𝗹𝗼𝘆𝗺𝗲𝗻𝘁 is a part of national AI strategy conversations.

Amy Chang

AI Security @ Cisco | Cybersecurity | Tech | Policy
2w

Our latest research for Cisco AI Security evaluated eight open-weight models from leading AI labs to understand their vulnerability to adversarial attacks. With a vibrant open-weight model ecosystem and the ability to fine-tune open models for various applications, it's crucial to understand elements of security and safety before deploying #AI applications in your organization. We found that these models demonstrated a profound susceptibility to adversarial manipulation, particularly in multi-turn scenarios where we observed attack success rates between 25.86% and 92.78%. ‼️ These numbers were 2x to 10x higher than single-turn attack baselines. Lots of interesting insights to pick apart, check out our write up for more: Blog: https://lnkd.in/gejTrhxP Full report: https://lnkd.in/gXS4y8Ky Open-weight models remain crucial for innovation, but we must consider layered security controls and other safeguards to ensure secure AI deployment. 🛡️ #AISecurity #LLMSecurity #AIRedTeaming Nicholas Conley Adam Swanda Harish Santhanalakshmi Ganesan Arjun Sambamoorthy Anand Raghavan DJ Sampath

Death by a Thousand Prompts: Open Model Vulnerability Analysis arxiv.org

2 Comments
Like Comment
To view or add a comment, sign in
Keith O'Brien

Distinguished Security Architect - Field CSO
1w
Report this post
Latest research from the Cisco AI Threat research team related to some of the most popular open weight models and how they are particularly susceptible to multi-turn attacks. #aisecurity #cisco

Amy Chang

AI Security @ Cisco | Cybersecurity | Tech | Policy
2w

Our latest research for Cisco AI Security evaluated eight open-weight models from leading AI labs to understand their vulnerability to adversarial attacks. With a vibrant open-weight model ecosystem and the ability to fine-tune open models for various applications, it's crucial to understand elements of security and safety before deploying #AI applications in your organization. We found that these models demonstrated a profound susceptibility to adversarial manipulation, particularly in multi-turn scenarios where we observed attack success rates between 25.86% and 92.78%. ‼️ These numbers were 2x to 10x higher than single-turn attack baselines. Lots of interesting insights to pick apart, check out our write up for more: Blog: https://lnkd.in/gejTrhxP Full report: https://lnkd.in/gXS4y8Ky Open-weight models remain crucial for innovation, but we must consider layered security controls and other safeguards to ensure secure AI deployment. 🛡️ #AISecurity #LLMSecurity #AIRedTeaming Nicholas Conley Adam Swanda Harish Santhanalakshmi Ganesan Arjun Sambamoorthy Anand Raghavan DJ Sampath

Death by a Thousand Prompts: Open Model Vulnerability Analysis arxiv.org
Like Comment
To view or add a comment, sign in
Mariem Jabloun

AI Enterprise Transformation Leader | Architecting High-Impact AI Systems that Optimize Processes & Drive Revenue | 15+ Yrs Experience
1mo
Report this post
Nobody in AI wants to talk about this. But it's the biggest threat to the AI agent ecosystem. The most dangerous AI vulnerability isn't a coding bug. It's a design flaw in the architecture itself. ↓ It's not about training data poisoning It's not about model theft It's not about fancy attacks ↓ It's about tricking the agent's core components. Only using Prompt. After analyzing 100+ hours of discussions and workshops on AI Agents, a terrifying pattern emerges: ❌A blind trust in the prompt, the memory, and the tools. ↳ We're building powerful agents on a foundation that can be manipulated with nothing but a cleverly designed prompt 🤔. The proof is in the numbers. In a large-scale public competition: 🔸1.8 million prompt injection attacks were submitted. 🔸Targeting 22 frontier AI agents. 🔸Over 60,000 were successful, leading to: ↳ unauthorized data access, ↳ illicit financial actions, ↳ regulatory noncompliance The study found that nearly ALL agents could be compromised within 10–100 queries. Most teams are building for features, not for security. → They see a cool demo and rush to production. → They convince themselves, "It's just a prototype." ↓ Then they watch a headline about a hacked agent and think, "That was a sophisticated attack." ❌ No. ✅ They just exploited a basic vulnerability you designed in. That's the game. Security isn't a feature. It's the architecture. So if you're worried your AI agent has a hidden backdoor... Good. That means you're paying attention. This is how you find it. ↓ I've mapped the entire attack framework. This infographic shows the 4 critical vulnerabilities in every AI agent's architecture and how they're exploited. 𝐖𝐡𝐢𝐜𝐡 𝐨𝐟 𝐭𝐡𝐞𝐬𝐞 𝐚𝐭𝐭𝐚𝐜𝐤 𝐯𝐞𝐜𝐭𝐨𝐫𝐬 𝐢𝐬 𝐭𝐡𝐞 𝐡𝐚𝐫𝐝𝐞𝐬𝐭 𝐭𝐨 𝐝𝐞𝐟𝐞𝐧𝐝 𝐚𝐠𝐚𝐢𝐧𝐬𝐭? #AISecurity #PromptInjection #AIAgents #CyberSecurity #MachineLearning #LLM #AI #AIJobs #AISecOps
Like Comment
To view or add a comment, sign in
Intrenex Security

10 followers
2w
Report this post
AI-Generated Code = Blind Spots in Application Security From Intrenex’s vantage point: when you scale, automate and delegate, visibility becomes your most important security currency. A new survey of 400 cybersecurity leaders in the U.S. and U.K. reveals that AI tools are now generating code in many organisations—and with that, nearly two-thirds of respondents say they’ve seen more vulnerabilities in their code base after AI tool adoption. Key implications for small- and mid-sized businesses: If you allow AI-generated code (e.g., internal scripts, automations), you must ask: Who reviewed it? Do we have change-controls? Shadow-IT and tool sprawl accelerate risk: when business teams adopt AI tools outside formal governance, you lose visibility. Your application and automation layer is now attack surface. Attackers will exploit not only the code you wrote—but the code your AI helped you write. At Intrenex, our service model begins with your business process: what do you automate today, what data flows through those automations, and how do we design visibility and governance around them. Because automation without scrutiny is good productivity—and poor security. #ApplicationSecurity #AI #SMB #RiskManagement #Intrenex
Like Comment
To view or add a comment, sign in
BugBountyShorts

110 followers
2w
Report this post
Modern Recon: How Hackers Use AI to Hunt Vulnerabilities Smarter This comprehensive guide explores how AI and machine learning are revolutionizing vulnerability reconnaissance and security testing methodologies. **AI-Enhanced Recon Framework**: The article demonstrates integration of traditional tools (Amass, Subfinder, httpx, Nuclei) with Large Language Models for automated analysis, summarization, and payload generation. **Key AI Applications**: LLMs assist in rapid analysis of recon data, automated vulnerability prioritization, and generation of test payloads, reducing manual grunt work while preserving human creativity for exploitation logic. **Practical Implementation**: The author provides GitHub-style examples, code snippets, and LLM prompts that can be adapted for legitimate security research, including scripts for automated subdomain analysis and vulnerability scanning workflows. **Human-AI Collaboration**: The framework emphasizes that AI speeds up analysis and data processing but cannot replace human intuition for creative exploitation chaining and sophisticated attack vectors. **Ethical Guidelines**: The article maintains strict focus on authorized testing through proper scope, emphasizing use within bug bounty programs, penetration test engagements, and controlled lab environments. **Tool Integration**: Demonstrates how AI enhances traditional recon pipelines by automating data correlation, pattern recognition in recon results, and intelligent filtering of false positives, making researchers more efficient while maintaining security standards. **Tactical Advantage**: Shows how AI-assisted recon can process vast amounts of data faster, identify subtle patterns humans might miss, and provide researchers with actionable intelligence more rapidly than manual methods. #infosec #BugBounty #Cybersecurity #AIRecognition #SecurityAutomation #PenetrationTesting https://lnkd.in/ekieuFhc
Like Comment
To view or add a comment, sign in
Yang Xiang

IEEE Fellow, Professor, Dean, Swinburne University of Technology
2w
Report this post
Agentic AI can autonomously take real-world actions, so security failures lead to physical, financial, or systemic harm rather than just bad outputs. Its broad attack surface, ability to escalate actions, and deployment in critical infrastructure make it both a target and a potential threat actor. Securing it is essential - check out our recent article about this trending topic!

AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways | ACM Computing Surveys dl.acm.org
Like Comment
To view or add a comment, sign in

242 followers

View Profile Connect

How I tested AI security with malicious prompts

More from this author

What Most People Get Wrong About AI Prompts. They Aren't Perfect.

Explore content categories

How I tested AI security with malicious prompts

More Relevant Posts

More from this author

What Most People Get Wrong About AI Prompts. They Aren't Perfect.

Explore related topics

Explore content categories